Forking from a thread? - Unix

  1. Forking from a thread?

    Hello everyone,

    I'm currently writing an extension to Nautilus, and it needs to
    retrieve some data by calling a child process and parsing the output
    via popen(). I'd like to do this from a thread so Nautilus can
    continue on its merry way, but I was just wondering if there's anything
    I should do when forking from a thread. I've read the man pages for
    pthreads and fork, and both of them seem to say that I don't need to do
    anything special; I just wanted it confirmed by experienced
    programmers.

    Thanks,
    Rob Hoelz

  2. Re: Forking from a thread?

    Rob Hoelz wrote on 10/15/07 12:46:
    > Hello everyone,
    >
    > I'm currently writing an extension to Nautilus, and it needs to
    > retrieve some data by calling a child process and parsing the output
    > via popen(). I'd like to do this from a thread so Nautilus can
    > continue on its merry way, but I was just wondering if there's anything
    > I should do when forking from a thread. I've read the man pages for
    > pthreads and fork, and both of them seem to say that I don't need to do
    > anything special; I just wanted it confirmed by experienced
    > programmers.


    You should be all right IF you "don't do much" in
    the child process before calling exec*().

    When you fork() a multi-threaded process, the only
    thread in the child is a duplicate of the thread that
    called fork() in the parent. The problem is that the
    child also has copies of all the parent's data, including
    its mutexes and other locks. So if the parent's thread T1
    holds a mutex M1 at the moment when T2 calls fork(), the
    child process inherits a copy of the locked M1 and a copy
    of the running T2, but the child has no T1 to unlock M1.
    If T2 in the child tries to lock M1, it will block forever.
    It would be a Bad Idea, for example, for the child to
    call malloc(): If the parent's T1 was executing malloc()
    at the time of fork() and if malloc() held any internal
    locks, those locks' copies in the child will still be
    locked -- with no prospect of ever being released.
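
    For a lock the application itself owns, one known way around the
    inherited-locked-mutex problem is pthread_atfork(). The sketch below
    is only an illustration: m1, t1_body() and the other names are
    hypothetical stand-ins for M1, T1 and T2 above, and it does nothing
    for locks hidden inside libraries such as malloc().

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical application-owned lock, standing in for M1. */
static pthread_mutex_t m1 = PTHREAD_MUTEX_INITIALIZER;

/* prepare handler: the forking thread takes M1, so no other thread
 * can be holding it at the instant of the fork. */
static void before_fork(void) { pthread_mutex_lock(&m1); }

/* parent and child handler: release the lock taken in before_fork(). */
static void after_fork(void) { pthread_mutex_unlock(&m1); }

/* Register the handlers once, e.g. at start-up. Returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork, after_fork);
}

/* Stand-in for T1: holds M1 for a short while. */
static void *t1_body(void *arg)
{
    struct timespec ts = { 0, 50 * 1000 * 1000 };   /* 50 ms */
    (void)arg;
    pthread_mutex_lock(&m1);
    nanosleep(&ts, NULL);
    pthread_mutex_unlock(&m1);
    return NULL;
}

/* Stand-in for T2: forks while T1 may hold M1, and the child then
 * takes M1 itself. Returns the child's exit status, or -1 on error. */
int fork_and_lock_in_child(void)
{
    pthread_t t1;
    int status;

    if (pthread_create(&t1, NULL, t1_body, NULL) != 0)
        return -1;

    pid_t pid = fork();             /* before_fork() may wait for T1 here */
    if (pid == 0) {
        pthread_mutex_lock(&m1);    /* would hang on a copy locked by T1 */
        pthread_mutex_unlock(&m1);
        _exit(0);
    }

    pthread_join(t1, NULL);
    if (pid == -1 || waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    Call install_fork_handlers() once before any fork(); with the
    handlers in place fork_and_lock_in_child() should return 0. The
    cost is that every fork() now waits for M1, and each further lock
    needs the same treatment.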

    Also, the child has copies of the parent's data that
    was being protected by M1, and since M1 was locked for a
    reason (one presumes), it is more than likely that the data
    structures are in a fragile or inconsistent state. It
    would be a Bad Idea for the child's T2 to try to use such
    data structures without locking "because there's no one
    here but me," because it might find them full of garbage.

    So: fork() away, but the child should do as little
    as possible before calling some form of exec*(). In
    particular, the child should not call malloc() nor any
    of its relatives, nor call exit() (use _exit() if you
    must), nor use the C library's standard I/O machinery
    (write() error messages to file descriptor 2, not to stderr), nor try to
    touch any of the inherited synchronization objects. It
    is possible to gain a little more freedom, but that *is*
    a lot of additional work.
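
    The advice above can be sketched roughly as follows. This is a
    minimal illustration, not a drop-in replacement for popen(): the
    helper name run_and_capture() and its buffer interface are
    assumptions made for the example, argv[0] must be a full path
    because execv() does no PATH search, and EINTR handling is left out.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program argv[0] with arguments argv,
 * capture up to cap-1 bytes of its standard output in buf and
 * NUL-terminate it. Returns the number of bytes captured, or -1. */
ssize_t run_and_capture(char *const argv[], char *buf, size_t cap)
{
    int fds[2];

    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: nothing but async-signal-safe calls until exec.
         * No malloc(), no stdio, no exit(). */
        static const char msg[] = "exec failed\n";
        close(fds[0]);
        if (fds[1] != STDOUT_FILENO) {
            if (dup2(fds[1], STDOUT_FILENO) == -1)
                _exit(127);
            close(fds[1]);
        }
        execv(argv[0], argv);
        /* Only reached if execv() failed: report on fd 2, not stderr. */
        ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)ignored;
        _exit(127);
    }

    /* Parent: read until the buffer is full or the child closes the pipe. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(fds[0], buf + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (ssize_t)used;
}
```

    Called with an argv such as {"/bin/sh", "-c", "echo hello", NULL},
    the helper fills buf with the child's output. Everything the child
    does between fork() and execv() stays within the restrictions
    listed above.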

    --
    Eric.Sosman@sun.com

  3. Re: Forking from a thread?

    Eric Sosman wrote:

    > [...]


    Thanks for the quick reply! All I'm calling in the child process is
    close(), dup2(), chdir(), then exec(), so unless they do some freaky
    stuff with mutexes that I'm not aware of, I think I'll be ok. However,
    now I've encountered a slightly different problem: I would like my
    main thread to block on a condition variable in a function, waiting for
    my other thread to generate a result and then signal that condition
    variable. Here's some sample code:

    /* Pseudo-C */
    setupCallbackForOtherThread(pthread_cond_signal(&notifyTrigger));
    pthread_mutex_unlock(&cacheLock);
    pthread_cond_notify(&workerTrigger);
    pthread_cond_wait(&notifyTrigger);

    My fear is that the other thread will call the callback that signals
    notifyTrigger sometime after the callback has been set up and sometime
    before pthread_cond_wait(&notifyTrigger) is called in the main thread,
    thus causing the main thread to block forever on that call (only the
    main thread ever blocks on notifyTrigger, and the other thread only
    calls a distinct callback once). What would be a good solution to this
    dilemma? I can provide more code if needed.

    Thanks,
    Rob Hoelz

  4. Re: Forking from a thread?

    Rob Hoelz wrote on 10/15/07 15:23:
    > [...]
    > now I've encountered a slightly different problem: I would like my
    > main thread to block on a condition variable in a function, waiting for
    > my other thread to generate a result and then signal that condition
    > variable. Here's some sample code:
    >
    > /* Pseudo-C */
    > setupCallbackForOtherThread(pthread_cond_signal(&notifyTrigger));
    > pthread_mutex_unlock(&cacheLock);
    > pthread_cond_notify(&workerTrigger);
    > pthread_cond_wait(&notifyTrigger);
    >
    > My fear is that the other thread will call the callback that signals
    > notifyTrigger sometime after the callback has been set up and sometime
    > before pthread_cond_wait(&notifyTrigger) is called in the main thread,
    > thus causing the main thread to block forever on that call (only the
    > main thread ever blocks on notifyTrigger, and the other thread only
    > calls a distinct callback once). What would be a good solution to this
    > dilemma? [...]


    You should never wait "for" a condition variable (in
    fact, you cannot do so reliably). The thing you wait
    for is some kind of predicate: "work is complete" or
    "Elvis has left the building." Lock the mutex that
    protects the data items that make up the predicate,
    test the predicate, and if it isn't satisfied call
    pthread_cond_wait(). When it returns, TEST AGAIN and
    call pthread_cond_wait() again if the predicate still
    doesn't hold. One of these times you'll presumably find
    that the predicate is true, at which point you can take
    whatever action is appropriate and then release the mutex.

    Here's the canonical pattern:

    pthread_mutex_lock(&mutex);
    while (!predicate())    /* what you "wait for" */
        pthread_cond_wait(&condvar, &mutex);
    do_protected_things();
    pthread_mutex_unlock(&mutex);

    ... and the partner in crime does:

    pthread_mutex_lock(&mutex);
    make_predicate_true(); /* THIS is the "event" */
    pthread_mutex_unlock(&mutex);
    pthread_cond_signal(&condvar);

    If your code uses condvars but doesn't use this pattern
    (perhaps with doodads and fripperies), your code is wrong.
    R-O-N-G, wrong. Reread your source of Pthreads information.
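
    Applied to the worker-and-callback situation in the question, the
    pattern might look like the sketch below. All names (lock, ready,
    have_result, worker(), compute_and_wait()) are invented for the
    illustration, and the doubling is only a stand-in for real work.
    The point is that a signal sent before the main thread starts
    waiting is not lost: have_result is already true, so the main
    thread never calls pthread_cond_wait() at all.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state; the names are illustrative only. */
static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
static bool have_result = false;    /* the predicate */
static int  result;                 /* data protected by lock */

/* Worker: produce a result, make the predicate true, then signal. */
static void *worker(void *arg)
{
    int value = *(int *)arg;

    pthread_mutex_lock(&lock);
    result = value * 2;             /* stand-in for the real work */
    have_result = true;             /* this is the "event" */
    pthread_mutex_unlock(&lock);
    pthread_cond_signal(&ready);
    return NULL;
}

/* Start a worker and block until it has published its result.
 * Returns the result, or -1 if the thread could not be created. */
int compute_and_wait(int input)
{
    pthread_t tid;

    pthread_mutex_lock(&lock);
    have_result = false;            /* reset the predicate under the lock */
    pthread_mutex_unlock(&lock);

    if (pthread_create(&tid, NULL, worker, &input) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    while (!have_result)            /* re-test after every wakeup */
        pthread_cond_wait(&ready, &lock);
    int value = result;
    pthread_mutex_unlock(&lock);

    pthread_join(tid, NULL);
    return value;
}
```

    The while loop also covers spurious wakeups, which
    pthread_cond_wait() is permitted to produce.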

    (Note: There have been heated debates about whether to
    unlock the mutex before or after signalling the condvar.
    Despite T.A. Edison, "heated" does not imply "enlightened.")

    --
    Eric.Sosman@sun.com
