handling multiple SIGCHLDs reliably - Unix


  1. handling multiple SIGCHLDs reliably


    Hi all,

    My program seems to not handle multiple SIGCHLDs reliably; some
    SIGCHLDs seem to be getting "lost". I've googled the manual several
    times, and looked for other resources, but seem to be at a loss still.

    Here is the signal handler

    void catcher ( int signo, siginfo_t *psig, void *ctx)
    {
    pid_t pid;
    int status;

    pid = -1;
    status = 0;
    pid = waitpid( -1, &status, WNOHANG);
    printf(" catcher: caught signal %d around process %d\n", signo,
    pid );
    }

    and here is the main program that sets up the catcher

    struct sigaction handler_action;

    sigemptyset( & (handler_action.sa_mask) );
    sigaddset( & (handler_action.sa_mask), SIGINT );
    sigaddset( & (handler_action.sa_mask), SIGQUIT );
    sigaddset( & (handler_action.sa_mask), SIGCHLD ); /* reliably block
    SIGCHLDs in handler */
    handler_action.sa_flags = SA_SIGINFO|SA_NOCLDWAIT;
    handler_action.sa_sigaction = catcher;

    sigaction ( SIGCHLD, &handler_action, (struct sigaction *)0 );

    /* remaining part of main program */
    i = atoi(argv[1]); /* this is usually 4 or 5 */

    while (i > 0)
    {
    if ((pid = fork()) == 0) { /* forks i children */

    printf(" running command in background \n");
    execv(argv[2], &argv[2]);
    perror("The execv() call must have failed");
    exit(255);

    } else {

    printf (" child pid = %d\n", pid);
    }
    i--;
    }

    for (;;) /* main program continues to do this, waiting for Ctrl-C */
    {
    printf (" zzz \n");
    sleep(3);
    }
    My questions:

    1. I've set up the sa_mask to block SIGCHLDs so that means each time
    the handler runs, it should block other SIGCHLDs i.e. handler should
    be invoked 5 times. But my handler is called only once or twice at
    most. What happens to the other SIGCHLDs? (This is an AIX 5.3 system,
    BTW)
    2. I don't see any zombies so I assume the children did exit properly,
    fair assumption?
    3. I am relying on the waitpid() in the handler to return the pid of
    the child that was most likely responsible for the SIGCHLD. But the
    waitpid always returns -1. Can anyone explain this?
    4. Is there a different/better way to do this? My basic requirement is
    to start a child process in the background and for the main program to
    get notified when the child exits. My manual-and-google-reading seems
    to indicate sigaction() + waitpid() as the best way so far. Any other
    ideas?

    Thanks in advance
    Sam

  2. Re: handling multiple SIGCHLDs reliably

    sam.n.seaborn@gmail.com wrote:
    > Hi all,
    >
    > My program seems to not handle multiple SIGCHLDs reliably; some
    > SIGCHLDs seem to be getting "lost". I've googled the manual several
    > times, and looked for other resources, but seem to be at a loss still.
    >
    > Here is the signal handler
    >
    > void catcher ( int signo, siginfo_t *psig, void *ctx)
    > {
    > pid_t pid;
    > int status;
    >
    > pid = -1;
    > status = 0;
    > pid = waitpid( -1, &status, WNOHANG);
    > printf(" catcher: caught signal %d around process %d\n", signo,
    > pid );
    > }


    You should loop on waitpid() until you get a return code of 0 to handle
    multiple simultaneous child exits (or -1 in the case of an error).

    > 3. I am relying on the waitpid() in the handler to return the pid of
    > the child that was most likely responsible for the SIGCHLD. But the
    > waitpid always returns -1. Can anyone explain this?


    If the return code is -1, check the value of "errno" to find out the
    actual problem.

    Chris

  3. Re: handling multiple SIGCHLDs reliably

    On Aug 19, 4:08 pm, sam.n.seab...@gmail.com wrote:
    > My program seems to not handle multiple SIGCHLDs reliably; some
    > SIGCHLDs seem to be getting "lost". I've googled the manual several
    > times, and looked for other resources, but seem to be at a loss still.

    ....
    > sigaddset( & (handler_action.sa_mask), SIGCHLD ); /* reliably block
    > SIGCHLDs in handler */


    This is unnecessary, as the signal being handled is blocked unless the
    SA_NODEFER flag is set in sa_flags.


    > handler_action.sa_flags = SA_SIGINFO|SA_NOCLDWAIT;


    Here's your bug: you set SA_NOCLDWAIT. Please read the description of
    that flag again to see why it makes your code behave the way it does.
    Note that any references in that description to wait() also apply to
    waitpid(), wait3(), and wait4().

    Also, why are you using SA_SIGINFO? You're not using the siginfo_t
    passed to the signal handler. If you're concerned about losing
    SIGCHLD signals, then call waitpid() with WNOHANG in a loop in the
    handler until it returns 0 (no more exited children) or -1.
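
    For what SA_SIGINFO is actually good for, here is a minimal sketch of a
    handler that reads the siginfo_t. The names record_child and
    siginfo_identifies_child are hypothetical, and the sketch assumes a
    single child and that a pid fits in sig_atomic_t. With several children
    the signals can merge, so si_pid names only one of them and a waitpid()
    loop is still needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Filled in by the handler; assumes sig_atomic_t is wide enough for a pid. */
static volatile sig_atomic_t seen_pid = 0;
static volatile sig_atomic_t seen_code = 0;

/* Three-argument handler: this is the form SA_SIGINFO selects. */
static void record_child(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    seen_code = info->si_code;              /* CLD_EXITED, CLD_KILLED, ... */
    seen_pid = (sig_atomic_t)info->si_pid;  /* child that caused SIGCHLD */
}

/* Hypothetical demo: returns 1 if siginfo_t identified the one child. */
static int siginfo_identifies_child(void)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t pid;
    int status = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = record_child;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;               /* note: no SA_NOCLDWAIT */
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Block SIGCHLD so it can only be delivered inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    seen_pid = 0;
    pid = fork();
    assert(pid != -1);
    if (pid == 0)
        _exit(7);                           /* child: exit with a known status */

    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);
    while (seen_pid == 0)
        sigsuspend(&wait_mask);             /* atomically unblock and wait */
    sigprocmask(SIG_SETMASK, &old, NULL);

    /* The child is still a zombie until it is waited for. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return seen_pid == pid && seen_code == CLD_EXITED
        && WIFEXITED(status) && WEXITSTATUS(status) == 7;
}
```

    Calling siginfo_identifies_child() from main() is enough to try it.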


    Philip Guenther

  4. Re: handling multiple SIGCHLDs reliably

    sam.n.seaborn@gmail.com writes:
    > My program seems to not handle multiple SIGCHLDs reliably; some
    > SIGCHLDs seem to be getting "lost".


    [ see original for code ]

    > 1. I've set up the sa_mask to block SIGCHLDs so that means each time
    > the handler runs, it should block other SIGCHLDs i.e. handler should
    > be invoked 5 times.


    As someone else already wrote: The signal being caught is
    automatically blocked while the handler runs (unless requested
    otherwise).

    > But my handler is called only once or twice at
    > most. What happens to the other SIGCHLDs? (This is an AIX 5.3 system,
    > BTW)


    Usually, there are none: at most one instance of a given (non-realtime)
    signal can be pending for a process.
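
    A rough way to see this for yourself (a sketch, not taken from the
    original program; count_only and count_sigchld_deliveries are
    hypothetical names): block SIGCHLD, let several children exit, then
    unblock it and count how often the handler runs. waitid() with WNOWAIT
    is used only to learn that a child has exited without reaping it. On a
    system where standard signals are not queued, one delivery would be
    expected no matter how many children exited.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16

static volatile sig_atomic_t handler_calls = 0;

/* Counts deliveries only; deliberately does not reap anything. */
static void count_only(int signo)
{
    (void)signo;
    handler_calls++;
}

/* Hypothetical demo: returns how many times the handler ran for n exits. */
static int count_sigchld_deliveries(int n)
{
    struct sigaction sa;
    sigset_t chld;
    pid_t pids[MAX_CHILDREN];
    int i, calls;

    if (n < 1 || n > MAX_CHILDREN)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = count_only;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Hold SIGCHLD back while all the children terminate. */
    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);
    sigprocmask(SIG_BLOCK, &chld, NULL);

    handler_calls = 0;
    for (i = 0; i < n; i++) {
        pids[i] = fork();
        assert(pids[i] != -1);
        if (pids[i] == 0)
            _exit(0);
    }

    /* Wait until each child has exited, but leave it unreaped (WNOWAIT). */
    for (i = 0; i < n; i++) {
        siginfo_t info;
        if (waitid(P_PID, (id_t)pids[i], &info, WEXITED | WNOWAIT) == -1)
            return -1;
    }

    /* The pending SIGCHLD is delivered before sigprocmask() returns. */
    sigprocmask(SIG_UNBLOCK, &chld, NULL);
    calls = handler_calls;

    /* Clean up the zombies. */
    for (i = 0; i < n; i++)
        waitpid(pids[i], NULL, 0);
    return calls;
}
```

    With five children, count_sigchld_deliveries(5) should come back as 1
    rather than 5 on such a system.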

    > 2. I don't see any zombies so I assume the children did exit properly,
    > fair assumption?


    You are mixing up two different things: You had specified SA_NOCLDWAIT
    when setting up the signal handler. This requests that terminated
    child processes are not turned into zombies after they have
    exited. Usually, they are, so that the parent process can access the
    exit status at some later time. This doesn't depend on the way the
    child terminated.

    > 3. I am relying on the waitpid() in the handler to return the pid of
    > the child that was most likely responsible for the SIGCHLD. But the
    > waitpid always returns -1. Can anyone explain this?


    Since SA_NOCLDWAIT makes the kernel keep no status information for
    terminated child processes, there is no such information to return,
    and waitpid() fails with errno set to ECHILD.
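
    The effect can be reproduced with a small sketch like the one below.
    The function name waitpid_errno_under_nocldwait is hypothetical, and
    the sketch leaves the SIGCHLD action at SIG_DFL instead of installing a
    handler, to keep it short. It only assumes what POSIX describes for
    SA_NOCLDWAIT: a blocking waitpid() waits until the children are gone
    and then fails.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the errno left by waitpid() when
 * SA_NOCLDWAIT is set, or 0 if waitpid() unexpectedly succeeded. */
static int waitpid_errno_under_nocldwait(void)
{
    struct sigaction sa, old;
    pid_t pid, r;
    int status = 0;
    int err = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_DFL;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOCLDWAIT;     /* exited children leave no zombie */
    if (sigaction(SIGCHLD, &sa, &old) == -1)
        return -1;

    pid = fork();
    assert(pid != -1);
    if (pid == 0)
        _exit(0);

    /* Blocks until the child is gone, then fails: its status was discarded. */
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r == -1)
        err = errno;                /* save before any further calls */

    sigaction(SIGCHLD, &old, NULL); /* restore the previous disposition */
    return err;
}
```

    Comparing the return value against ECHILD shows whether the system
    behaves as described.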

  5. Re: handling multiple SIGCHLDs reliably

    On Aug 19, 4:30 pm, Chris Friesen wrote:
    > sam.n.seab...@gmail.com wrote:
    > > Hi all,
    > >
    > > My program seems to not handle multiple SIGCHLDs reliably; some
    > > SIGCHLDs seem to be getting "lost". I've googled the manual several
    > > times, and looked for other resources, but seem to be at a loss still.
    > >
    > > Here is the signal handler
    > >
    > > void catcher ( int signo, siginfo_t *psig, void *ctx)
    > > {
    > >     pid_t pid;
    > >     int status;
    > >
    > >     pid = -1;
    > >     status = 0;
    > >     pid = waitpid( -1, &status, WNOHANG);
    > >     printf(" catcher: caught signal %d around process %d\n", signo,
    > >     pid );
    > > }
    >
    > You should loop on waitpid() until you get a return code of 0 to handle
    > multiple simultaneous child exits (or -1 in the case of an error).


    Why would looping it make a difference? I don't see it.


  6. Re: handling multiple SIGCHLDs reliably

    K-mart Cashier writes:
    > On Aug 19, 4:30 pm, Chris Friesen wrote:
    >> sam.n.seab...@gmail.com wrote:
    >> > Hi all,
    >> >
    >> > My program seems to not handle multiple SIGCHLDs reliably; some
    >> > SIGCHLDs seem to be getting "lost". I've googled the manual several
    >> > times, and looked for other resources, but seem to be at a loss still.
    >> >
    >> > Here is the signal handler
    >> >
    >> > void catcher ( int signo, siginfo_t *psig, void *ctx)
    >> > {
    >> >     pid_t pid;
    >> >     int status;
    >> >
    >> >     pid = -1;
    >> >     status = 0;
    >> >     pid = waitpid( -1, &status, WNOHANG);
    >> >     printf(" catcher: caught signal %d around process %d\n", signo,
    >> >     pid );
    >> > }
    >>
    >> You should loop on waitpid() until you get a return code of 0 to handle
    >> multiple simultaneous child exits (or -1 in the case of an error).
    >
    > Why would looping it make a difference?


    Because only one SIGCHLD can be pending at any point in time, n
    deceased children can (and usually will) result in fewer than n signals
    being sent. Hence the loop, which ensures that every child already dead
    at that moment is dealt with.
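
    A minimal sketch of such a looping handler, assuming SA_NOCLDWAIT is
    not set. The names reap_children and spawn_and_reap are hypothetical
    and not part of the original program. The handler keeps calling
    waitpid() with WNOHANG until nothing more can be collected, so a single
    SIGCHLD is enough to reap any number of exited children.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped = 0;    /* children collected so far */

/* Reap every child that has exited by now, not just one per signal. */
static void reap_children(int signo)
{
    int saved_errno = errno;                /* waitpid() may change errno */
    int status;

    (void)signo;
    /* > 0: a child was reaped; 0: none have exited yet; -1: no children. */
    while (waitpid(-1, &status, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical demo: fork n children that exit at once; returns the
 * number of children the handler reaped. */
static int spawn_and_reap(int n)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_children;
    sigemptyset(&sa.sa_mask);               /* SIGCHLD is blocked in the handler anyway */
    sa.sa_flags = SA_RESTART;               /* note: no SA_NOCLDWAIT */
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    /* Block SIGCHLD so it is only delivered inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    reaped = 0;
    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        assert(pid != -1);
        if (pid == 0)
            _exit(0);
    }

    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);
    while (reaped < n)
        sigsuspend(&wait_mask);             /* atomically unblock and wait */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped;
}
```

    The handler may run fewer than n times, but the count it accumulates
    still reaches n, which is the point of the loop.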

