how to best "characterize" a thread - Unix
-
how to best "characterize" a thread
This is an unusual thread question because it's not about thread
programming per se but rather thread _auditing_. I'm working on a
process auditing tool (think strace but put to a different purpose).
Imagine that a threaded program is run twice under control of this
auditor and the results are left in two output files "trace1" and
"trace2". I need to be able to "match up" the threads between trace1 and
trace2 for comparison purposes. And BTW, for now I only want to consider
pthreads; if the answer can be adapted to other implementations I can
probably handle that myself later on.
The obvious answer is to compare thread ids but this breaks down pretty
quickly. Some platforms issue random thread ids while others increment
from 1, but even on the latter type, due to the basic asynchronous
nature of threads, thread #3 in one run might be #4 in another.
Right now the best I can think of is a combination of (1) the address
from which pthread_create() was called, (2) the address of the
start_routine, and (3) the address of the argument to the start_routine.
It seems to me that, though it's possible for that same combination to
occur multiple times in the same process, even if it did, the threads
would be logically interchangeable (i.e. they'd be doing the same thing
so it would be ok to treat them the same).
I can see at least one flaw in the above scheme: the data that the 'arg'
pointer addresses could change even if the pointer itself didn't. I don't
see an answer to that.
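
A minimal sketch of the three-part key described above, not a definitive implementation. The names thread_key, thread_key_equal and audited_pthread_create are hypothetical, and the call site is obtained with __builtin_return_address(0), which is a GCC/Clang extension rather than part of pthreads. It also assumes that addresses are comparable between the two runs (for example a non-PIE binary, or address space randomization disabled); otherwise they would have to be stored relative to the load address of the module. Note that only the value of the arg pointer is recorded, so the flaw mentioned above remains.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical record the auditor might keep for each thread creation. */
struct thread_key {
    uintptr_t call_site;      /* return address of the wrapper call */
    uintptr_t start_routine;  /* address of the thread's entry function */
    uintptr_t arg;            /* value of the arg pointer, not its target */
};

/* Two creations count as "the same thread" when all three fields agree. */
static int thread_key_equal(const struct thread_key *a,
                            const struct thread_key *b)
{
    return a->call_site == b->call_site
        && a->start_routine == b->start_routine
        && a->arg == b->arg;
}

/* Trivial start routine used only to exercise the wrapper. */
static void *worker(void *arg)
{
    return arg;
}

/*
 * Hypothetical wrapper: fills in the key, then forwards to pthread_create().
 * noinline (GCC/Clang extension) keeps the return address meaningful as
 * "the place this wrapper was called from".
 */
__attribute__((noinline))
static int audited_pthread_create(struct thread_key *key, pthread_t *tid,
                                  const pthread_attr_t *attr,
                                  void *(*start_routine)(void *), void *arg)
{
    assert(key != NULL);
    key->call_site = (uintptr_t)__builtin_return_address(0);
    /* Function pointer to integer: fine on POSIX systems, not strict ISO C. */
    key->start_routine = (uintptr_t)start_routine;
    key->arg = (uintptr_t)arg;
    return pthread_create(tid, attr, start_routine, arg);
}
```

Threads whose keys compare equal would then be treated as interchangeable when matching trace1 against trace2.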
It might make sense to use the first argument to pthread_create (the
address to which the thread id is written, as opposed to the id itself)
as well, since I can't see it usually making sense to write multiple
thread ids to the same place.
These are my ideas; does anyone have better ones? Or is this by chance a
problem with a documented solution?
Thanks,
Arch Stanton
-
Re: how to best "characterize" a thread
Arch Stanton wrote:
> This is an unusual thread question because it's not about thread
> programming per se but rather thread _auditing_. I'm working on a
> process auditing tool (think strace but put to a different purpose).
> Imagine that a threaded program is run twice under control of this
> auditor and the results are left in two output files "trace1" and
> "trace2". I need to be able to "match up" the threads between trace1 and
> trace2 for comparison purposes. And BTW, for now I only want to consider
> pthreads; if the answer can be adapted to other implementations I can
> probably handle that myself later on.
>
> The obvious answer is to compare thread ids but this breaks down pretty
> quickly. Some platforms issue random thread ids while others increment
> from 1, but even on the latter type, due to the basic asynchronous
> nature of threads, thread #3 in one run might be #4 in another.
>
> Right now the best I can think of is a combination of (1) the address
> from which pthread_create() was called, (2) the address of the
> start_routine, and (3) the address of the argument to the start_routine.
> It seems to me that, though it's possible for that same combination to
> occur multiple times in the same process, even if it did, the threads
> would be logically interchangeable (i.e. they'd be doing the same thing
> so it would be ok to treat them the same).
>
> I can see at least one flaw in the above scheme: the data that the 'arg'
> pointer addresses could change even if the pointer itself didn't. I don't
> see an answer to that.
>
> It might make sense to use the first argument to pthread_create (the
> address to which the thread id is written, as opposed to the id itself)
> as well, since I can't see it usually making sense to write multiple
> thread ids to the same place.
>
> These are my ideas; does anyone have better ones? Or is this by chance a
> problem with a documented solution?
I may have misunderstood, so let me restate what I think your
problem is. You have a multi-threaded program in which various
threads write data to the same output file, perhaps a log of some
kind. Running the program a second time, even with the same inputs,
may produce a different log because "corresponding" threads may be
scheduled differently in the two executions. Your job is to separate
the individual threads' log entries from each of the two logs, and to
match them with the "corresponding" threads' entries from the other log.
If that's not what you're after, just skip this message ...
It seems to me that you must be in control of what is written to
the log, since you're contemplating making a change to it. Instead
of relying on implementation-specific and out-of-your-control things
like thread IDs and so on, why not just assign each thread an "ID"
of your own when you launch it? Just pass your own sequence number
or string or whatever to each new thread at pthread_create() time,
and thereafter have each thread tag its own log entries with the ID
you gave it. That is, your log entries would be tagged "T1" and "T2"
even if the system's thread IDs in one run were 42 and 4242, and in
the other run 25 and 99.
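
A minimal sketch of that idea, assuming the program controls its own thread creation and logging. The names worker_ctx, log_tagged, worker and run_workers are made up for illustration, and the IDs are simply assigned in creation order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical per-thread context: the application picks the ID itself. */
struct worker_ctx {
    int   id;    /* assigned in creation order, so stable across runs */
    FILE *log;   /* log stream shared by all threads */
};

/* Write one log line prefixed with the application-assigned tag, e.g. "T1". */
static void log_tagged(const struct worker_ctx *ctx, const char *msg)
{
    /* POSIX stdio locks the stream per call, so lines do not interleave. */
    fprintf(ctx->log, "T%d: %s\n", ctx->id, msg);
}

static void *worker(void *arg)
{
    struct worker_ctx *ctx = arg;
    log_tagged(ctx, "started");
    log_tagged(ctx, "finished");
    return NULL;
}

/* Launch n workers tagged T1..Tn and wait for them; returns 0 on success. */
static int run_workers(FILE *log, int n)
{
    enum { MAX_WORKERS = 16 };
    pthread_t tids[MAX_WORKERS];
    struct worker_ctx ctxs[MAX_WORKERS];
    int started = 0;
    int rc = 0;

    assert(n >= 0 && n <= MAX_WORKERS);
    for (int i = 0; i < n; i++) {
        ctxs[i].id = i + 1;    /* our own ID, independent of pthread_t */
        ctxs[i].log = log;
        rc = pthread_create(&tids[i], NULL, worker, &ctxs[i]);
        if (rc != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return rc;
}
```

The order of the lines in the log can still differ from run to run, but the entries of each thread can be pulled out by tag and compared with the entries carrying the same tag in the other log.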
... or have I mistaken your intent?
--
Eric.Sosman@sun.com
-
Re: how to best "characterize" a thread
On Sep 18, 5:00 pm, Arch Stanton wrote:
> It might make sense to use the first argument to
> pthread_create (the address to which the thread id is written,
> as opposed to the id itself) as well, since I can't see it
> usually making sense to write multiple thread ids to the same
> place.
If the threads are detached, there's no reason to keep the
pthread_t variable around. When starting a detached thread,
I'll use a local variable for it, which disappears when I return
from the function. And if I call the function again, there's a
distinct possibility that it will end up at the same address.
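
A small sketch of the pattern, for illustration only. The names detached_worker and start_detached, and the addr_out parameter, are hypothetical; addr_out exists purely so that the address passed as the first argument of pthread_create() can be inspected afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static void *detached_worker(void *arg)
{
    (void)arg;
    return NULL;
}

/*
 * Start a detached thread, keeping the pthread_t only in a local variable.
 * The address of that local is stored in *addr_out as an integer, purely
 * for inspection; the variable itself is gone once the function returns.
 * Returns the result of pthread_create().
 */
static int start_detached(uintptr_t *addr_out)
{
    pthread_t tid;    /* local: nobody will ever join this thread */
    pthread_attr_t attr;
    int rc;

    assert(addr_out != NULL);
    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    rc = pthread_create(&tid, &attr, detached_worker, NULL);
    pthread_attr_destroy(&attr);
    *addr_out = (uintptr_t)&tid;
    return rc;
}
```

Calling start_detached() twice from the same place will often store the same value in addr_out both times, although nothing guarantees it, so the first argument would not tell the two threads apart.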
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34