Forking from a thread? - Unix
-
Forking from a thread?
Hello everyone,
I'm currently writing an extension to Nautilus, and it needs to
retrieve some data by calling a child process and parsing the output
via popen(). I'd like to do this from a thread so Nautilus can
continue on its merry way, but I was just wondering if there's anything
I should do when forking from a thread. I've read the man pages for
pthreads and fork, and both of them seem to say that I don't need to do
anything special; I just wanted it confirmed by experienced
programmers.
Thanks,
Rob Hoelz
-
Re: Forking from a thread?
Rob Hoelz wrote On 10/15/07 12:46,:
> Hello everyone,
>
> I'm currently writing an extension to Nautilus, and it needs to
> retrieve some data by calling a child process and parsing the output
> via popen(). I'd like to do this from a thread so Nautilus can
> continue on its merry way, but I was just wondering if there's anything
> I should do when forking from a thread. I've read the man pages for
> pthreads and fork, and both of them seem to say that I don't need to do
> anything special; I just wanted it confirmed by experienced
> programmers.
You should be all right IF you "don't do much" in
the child process before calling exec*().
When you fork() a multi-threaded process, the only
thread in the child is a duplicate of the thread that
called fork() in the parent. The problem is that the
child also has copies of all the parent's data, including
its mutexes and other locks. So if the parent's thread T1
holds a mutex M1 at the moment when T2 calls fork(), the
child process inherits a copy of the locked M1 and a copy
of the running T2, but the child has no T1 to unlock M1.
If T2 in the child tries to lock M1, it will block forever.
It would be a Bad Idea, for example, for the child to
call malloc(): If the parent's T1 was executing malloc()
at the time of fork() and if malloc() held any internal
locks, those locks' copies in the child will still be
locked -- with no prospect of ever being released.
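To make the inherited-lock problem concrete, here is a minimal sketch (all names are made up for illustration, and it assumes Linux-style unnamed semaphores). T1 holds M1 while another thread forks, and the child then probes its copy of M1. The child uses pthread_mutex_trylock() only so the demonstration reports instead of hanging; touching an inherited mutex in the child is exactly what real code should avoid.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;
static sem_t m1_is_held;   /* T1 -> forking thread: "I hold M1" */
static sem_t may_release;  /* forking thread -> T1: "you may unlock now" */

/* T1: take M1 and keep holding it while the other thread forks. */
static void *t1_holds_m1(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&m1);
    sem_post(&m1_is_held);
    sem_wait(&may_release);
    pthread_mutex_unlock(&m1);
    return NULL;
}

/* Returns 1 if the child found its copy of M1 locked, 0 if it did not,
 * and -1 if the experiment itself could not be set up. */
static int child_sees_locked_m1(void)
{
    pthread_t t1;
    int status = 0;
    int result = -1;

    if (sem_init(&m1_is_held, 0, 0) != 0 || sem_init(&may_release, 0, 0) != 0)
        return -1;
    if (pthread_create(&t1, NULL, t1_holds_m1, NULL) != 0)
        return -1;
    sem_wait(&m1_is_held);            /* from here on, T1 holds M1 */

    pid_t pid = fork();               /* the calling thread plays T2 */
    if (pid == 0) {
        /* Child: T1 does not exist here, yet M1's copy is still locked.
         * A plain pthread_mutex_lock() at this point would never return. */
        _exit(pthread_mutex_trylock(&m1) == EBUSY ? 0 : 1);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = (WEXITSTATUS(status) == 0);

    sem_post(&may_release);           /* let T1 unlock and finish */
    pthread_join(t1, NULL);
    return result;
}
```

Calling child_sees_locked_m1() from any thread runs the experiment once. The same reasoning applies to locks you cannot see, such as whatever malloc() uses internally.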
Also, the child has copies of the parent's data that
was being protected by M1, and since M1 was locked for a
reason (one presumes), it is more than likely that the data
structures are in a fragile or inconsistent state. It
would be a Bad Idea for the child's T2 to try to use such
data structures without locking "because there's no one
here but me," because it might find them full of garbage.
So: fork() away, but the child should do as little
as possible before calling some form of exec*(). In
particular, the child should not call malloc() nor any
of its relatives, nor call exit() (use _exit() if you
must), nor use the C library's standard I/O machinery
(write error messages to 2, not to stderr), nor try to
touch any of the inherited synchronization objects. It
is possible to gain a little more freedom, but that *is*
a lot of additional work.
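As a rough sketch of what "as little as possible" can look like, the following hand-rolled stand-in for popen() does nothing in the child except close(), dup2(), execv(), write() to descriptor 2, and _exit(). The names run_and_capture, capture_job, and capture_thread are hypothetical, and execv() with an absolute path is used on the assumption that you would rather not depend on how a PATH search is implemented.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (an absolute path) and capture its
 * stdout into buf as a NUL-terminated string. Returns the number of bytes
 * captured, or -1 on failure or a non-zero exit status. */
static ssize_t run_and_capture(char *const argv[], char *buf, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: no malloc(), no stdio, no exit(), no inherited locks. */
        static const char msg[] = "exec failed\n";
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) != -1) {
            close(fds[1]);
            execv(argv[0], argv);
        }
        if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
            /* nothing safe left to do about it */
        }
        _exit(127);
    }

    /* Parent: read until EOF or until the buffer is full, then reap. */
    close(fds[1]);
    size_t used = 0;
    while (used < cap - 1) {
        ssize_t n = read(fds[0], buf + used, cap - 1 - used);
        if (n > 0)
            used += (size_t)n;
        else if (n == 0 || errno != EINTR)
            break;
    }
    buf[used] = '\0';
    close(fds[0]);

    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r != pid || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)used;
}

/* Hypothetical job record so the helper can run on a worker thread. */
struct capture_job {
    char *const *argv;
    char out[256];
    ssize_t len;
};

static void *capture_thread(void *arg)
{
    struct capture_job *job = arg;
    job->len = run_and_capture(job->argv, job->out, sizeof job->out);
    return NULL;
}
```

Start capture_thread with pthread_create(), passing a capture_job whose argv is something like { "/bin/sh", "-c", "echo hello", NULL }; after pthread_join() the output is in job.out. Whether calling waitpid() yourself is appropriate inside a host application that manages its own children is a separate question this sketch does not address.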
--
Eric.Sosman@sun.com
-
Re: Forking from a thread?
Eric Sosman wrote:
> Rob Hoelz wrote On 10/15/07 12:46,:
> > Hello everyone,
> >
> > I'm currently writing an extension to Nautilus, and it needs to
> > retrieve some data by calling a child process and parsing the output
> > via popen(). I'd like to do this from a thread so Nautilus can
> > continue on its merry way, but I was just wondering if there's
> > anything I should do when forking from a thread. I've read the man
> > pages for pthreads and fork, and both of them seem to say that I
> > don't need to do anything special; I just wanted it confirmed by
> > experienced programmers.
>
> You should be all right IF you "don't do much" in
> the child process before calling exec*().
>
> When you fork() a multi-threaded process, the only
> thread in the child is a duplicate of the thread that
> called fork() in the parent. The problem is that the
> child also has copies of all the parent's data, including
> its mutexes and other locks. So if the parent's thread T1
> holds a mutex M1 at the moment when T2 calls fork(), the
> child process inherits a copy of the locked M1 and a copy
> of the running T2, but the child has no T1 to unlock M1.
> If T2 in the child tries to lock M1, it will block forever.
> It would be a Bad Idea, for example, for the child to
> call malloc(): If the parent's T1 was executing malloc()
> at the time of fork() and if malloc() held any internal
> locks, those locks' copies in the child will still be
> locked -- with no prospect of ever being released.
>
> Also, the child has copies of the parent's data that
> was being protected by M1, and since M1 was locked for a
> reason (one presumes), it is more than likely that the data
> structures are in a fragile or inconsistent state. It
> would be a Bad Idea for the child's T2 to try to use such
> data structures without locking "because there's no one
> here but me," because it might find them full of garbage.
>
> So: fork() away, but the child should do as little
> as possible before calling some form of exec*(). In
> particular, the child should not call malloc() nor any
> of its relatives, nor call exit() (use _exit() if you
> must), nor use the C library's standard I/O machinery
> (write error messages to 2, not to stderr), nor try to
> touch any of the inherited synchronization objects. It
> is possible to gain a little more freedom, but that *is*
> a lot of additional work.
>
Thanks for the quick reply! All I'm calling in the child process is
close(), dup2(), chdir(), then exec(), so unless they do some freaky
stuff with mutexes that I'm not aware of, I think I'll be ok. However,
now I've encountered a slightly different problem: I would like my
main thread to block on a condition variable in a function, waiting for
my other thread to generate a result and then signal that condition
variable. Here's some sample code:
/* Pseudo-C */
setupCallbackForOtherThread(pthread_cond_signal(&notifyTrigger));
pthread_mutex_unlock(&cacheLock);
pthread_cond_notify(&workerTrigger);
pthread_cond_wait(&notifyTrigger);
My fear is that the other thread will call the callback that signals
notifyTrigger sometime after the callback has been set up and sometime
before pthread_cond_wait(&notifyTrigger) is called in the main thread,
thus causing the main thread to block forever on that call (only the
main thread ever blocks on notifyTrigger, and the other thread only
calls a distinct callback once). What would be a good solution to this
dilemma? I can provide more code if needed.
Thanks,
Rob Hoelz
-
Re: Forking from a thread?
Rob Hoelz wrote On 10/15/07 15:23,:
> [...]
> now I've encountered a slightly different problem: I would like my
> main thread to block on a condition variable in a function, waiting for
> my other thread to generate a result and then signal that condition
> variable. Here's some sample code:
>
> /* Pseudo-C */
> setupCallbackForOtherThread(pthread_cond_signal(&notifyTrigger));
> pthread_mutex_unlock(&cacheLock);
> pthread_cond_notify(&workerTrigger);
> pthread_cond_wait(&notifyTrigger);
>
> My fear is that the other thread will call the callback that signals
> notifyTrigger sometime after the callback has been set up and sometime
> before pthread_cond_wait(&notifyTrigger) is called in the main thread,
> thus causing the main thread to block forever on that call (only the
> main thread ever blocks on notifyTrigger, and the other thread only
> calls a distinct callback once). What would be a good solution to this
> dilemma? [...]
You should never wait "for" a condition variable (in
fact, you cannot do so reliably). The thing you wait
for is some kind of predicate: "work is complete" or
"Elvis has left the building." Lock the mutex that
protects the data items that make up the predicate,
test the predicate, and if it isn't satisfied call
pthread_cond_wait(). When it returns, TEST AGAIN and
call pthread_cond_wait() again if the predicate still
doesn't hold. One of these times you'll presumably find
that the predicate is true, at which point you can take
whatever action is appropriate and then release the mutex.
Here's the canonical pattern:
pthread_mutex_lock(&mutex);
while (! predicate()) /* what you "wait for" */
    pthread_cond_wait(&condvar, &mutex);
do_protected_things();
pthread_mutex_unlock(&mutex);
.... and the partner in crime does
pthread_mutex_lock(&mutex);
make_predicate_true(); /* THIS is the "event" */
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&condvar);
If your code uses condvars but doesn't use this pattern
(perhaps with doodads and fripperies), your code is wrong.
R-O-N-G, wrong. Reread your source of Pthreads information.
(Note: There have been heated debates about whether to
unlock the mutex before or after signalling the condvar.
Despite T.A. Edison, "heated" does not imply "enlightened.")
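Applied to the "main thread waits for the worker's result" case from the question, the pattern might look like the sketch below. The struct and function names are invented for illustration and are not taken from the poster's code. The predicate is the `ready` flag, so a signal that arrives before the waiter gets there is not lost: the waiter sees `ready` already set and never calls pthread_cond_wait() at all.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state between the main thread and the worker. */
struct result_slot {
    pthread_mutex_t lock;
    pthread_cond_t  ready_cv;
    bool            ready;   /* the predicate: "a result is available" */
    int             value;   /* the data the predicate refers to */
};

static void slot_init(struct result_slot *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->ready_cv, NULL);
    s->ready = false;
    s->value = 0;
}

/* Worker side: make the predicate true under the lock, then signal. */
static void slot_publish(struct result_slot *s, int value)
{
    pthread_mutex_lock(&s->lock);
    s->value = value;
    s->ready = true;                 /* THIS is the "event" */
    pthread_mutex_unlock(&s->lock);
    pthread_cond_signal(&s->ready_cv);
}

/* Waiter side: test the predicate, wait, and test again after waking. */
static int slot_wait(struct result_slot *s)
{
    pthread_mutex_lock(&s->lock);
    while (!s->ready)
        pthread_cond_wait(&s->ready_cv, &s->lock);
    int value = s->value;
    pthread_mutex_unlock(&s->lock);
    return value;
}

/* Hypothetical worker that produces a single result. */
static void *worker(void *arg)
{
    slot_publish(arg, 42);
    return NULL;
}
```

The main thread calls slot_init(), starts worker with pthread_create(), and later calls slot_wait(). It does not matter whether the worker publishes before or after the main thread reaches slot_wait(), which is the ordering the original question was worried about.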
--
Eric.Sosman@sun.com