How to eliminate the defunct problem? - TCP-IP

  1. How to eliminate the defunct problem?

    I am writing a simple BSD socket server under Linux. It listens on
    port 9999 and uses select() to detect when a client connects. When a
    client connects, fork() creates a new child process, which handles the
    connection between server and client.

    Now I have 3 questions: (1) Why does the child process terminate
    immediately (it looks like a crash) after the client closes the
    connection? (2) How can I eliminate the defunct child processes?
    (3) How can I determine whether a connection has been closed by the
    peer or is still alive? (Use select()?)

    To run server side: ./test
    To run client side: ./test followed by any argument (main() starts
    the client when argc>=2)

    To terminate the program: CTRL+C

    The following is my code:
    ------------------------------------
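    /* The header names were lost when this post was archived. The list
       below is a reconstruction: the headers that the code in this post
       needs, not necessarily the original eleven. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <signal.h>
    #include <unistd.h>
    #include <sys/select.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>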

    using namespace std;

    #include "bruce_assertion.h"

    bool is_stop=false;

    void on_int(int signum)
    {
        printf("This program is about to quit.\n");
        is_stop=true;
        signal(SIGINT, SIG_DFL);
    }

    void establish_connection(int l_socket)
    {
        DECLARE_DBG_SCOPE(establish_connection, true)

        char buf[256]="";
        int socket=-1;
        /*
        sockaddr_un addr={0};
        */
        sockaddr_in addr={0};
        socklen_t addr_len=sizeof(addr);
        fd_set fds;
        timeval tv={0};

        srand((unsigned int)time(NULL));

        socket=accept(l_socket, (sockaddr *)&addr, &addr_len);
        ASSERT_CRT(socket>=0, "Fail to accept a connection.\n");

        while(!is_stop)
        {
            usleep(1000000);
            printf("1111\n");
            FD_ZERO(&fds); FD_SET(socket, &fds);
            if(select(socket+1, &fds, NULL, NULL, &tv))
            {
                read(socket, buf, 256);
                printf("RECV: %s\n", buf);
            }

            FD_ZERO(&fds); FD_SET(socket, &fds);
            printf("2222\n");
            if(select(socket+1, NULL, &fds, NULL, &tv))
            {
                sprintf(buf, "Server: %04d", rand()%1000);
                write(socket, buf, 13);
            }

            FD_ZERO(&fds); FD_SET(socket, &fds);
            printf("3333\n");
            if(select(socket+1, NULL, NULL, &fds, &tv))
                break;
            printf("4444\n");
        }

        close(socket);
        printf("[Server] Child process exit.\n");
        exit(0);
    }

    void start_client(void)
    {
        DECLARE_DBG_SCOPE(main, true)

        char buf[256]="";
        int socket=-1;
        /*
        sockaddr_un addr={0};
        */
        sockaddr_in addr={0};
        socklen_t addr_len=sizeof(addr);
        fd_set fds;
        timeval tv={0};

        /*
        addr.sun_family=AF_UNIX;
        strcpy(addr.sun_path, "test.socket");
        */
        addr.sin_family=AF_INET;
        addr.sin_port=9999;
        addr.sin_addr.s_addr=inet_addr("10.110.142.230");
        /*
        socket=::socket(PF_UNIX, SOCK_STREAM, 0);
        */
        socket=::socket(PF_INET, SOCK_STREAM, 0);
        ASSERT_CRT(socket>=0, "Fail to create socket.\n");

        ASSERT_CRT(connect(socket, (sockaddr *)&addr, addr_len)>=0,
                   "Fail to connection to socket.\n");

        while(!is_stop)
        {
            usleep(1000000);
            FD_ZERO(&fds); FD_SET(socket, &fds);
            if(select(socket+1, &fds, NULL, NULL, &tv))
            {
                read(socket, buf, 256);
                printf("RECV: %s\n", buf);
            }

            FD_ZERO(&fds); FD_SET(socket, &fds);
            if(select(socket+1, NULL, &fds, NULL, &tv))
            {
                sprintf(buf, "Client: %04d", rand()%1000);
                write(socket, buf, 13);
            }

            FD_ZERO(&fds); FD_SET(socket, &fds);
            if(select(socket+1, NULL, NULL, &fds, &tv))
                break;
        }

        close(socket);
        printf("Client process exit.\n");
        exit(0);
    }

    int main(int argc, char argv[])
    {
        DECLARE_DBG_SCOPE(main, true)

        int socket=-1;
        /*
        sockaddr_un addr={0};
        */
        sockaddr_in addr={0};
        socklen_t addr_len=sizeof(addr);
        fd_set fds;
        timeval tv={0};

        signal(SIGINT, on_int);

        if(argc>=2)
            start_client();
        /*
        addr.sun_family=AF_UNIX;
        strcpy(addr.sun_path, "test.socket");
        */
        addr.sin_family=AF_INET;
        addr.sin_port=9999;

        /*
        socket=::socket(PF_UNIX, SOCK_STREAM, 0);
        */
        socket=::socket(PF_INET, SOCK_STREAM, 0);
        ASSERT_CRT(socket>=0, "Fail to create socket.\n");
        /*
        ASSERT_CRT(bind(socket, (sockaddr *)&addr, addr_len)>=0, "Fail to
        perform binding.\n");
        */
        ASSERT_CRT(bind(socket, (sockaddr *)&addr, addr_len)>=0,
                   "Fail to perform binding.\n");
        ASSERT_CRT(!listen(socket, 1), "Fail to perform listening.\n");

        while(!is_stop)
        {
            usleep(1000000);
            printf("Waiting...Parent...\n");
            FD_ZERO(&fds); FD_SET(socket, &fds);

            if(select(socket+1, &fds, NULL, NULL, &tv))
            {
                printf("A new connection is about to be established.\n");
                if(fork())
                    continue;
                establish_connection(socket);
            }
        }

        printf("[Server] Parent process exit.\n");
        close(socket);
        /*
        unlink(addr.sun_path);
        */
        return 0;
    }

  2. Re: How to eliminate the defunct problem?

    In article
    <04672c61-1821-4d0b-b171-4c15a3ed6ed5@g16g2000pri.googlegroups.com>,
    Bruce Hsu wrote:

    > I am writing a simple BSD socket server under Linux. It listens on
    > port 9999 and uses select() to determine whether a client connects or
    > not. A new child process is created by fork() if a client connects.
    > The new child process will take care of the connection between server
    > and client.
    >
    > Now, I have 3 questions: (1) Why the child process terminates
    > immediately (It just looks like crash) after the client close the
    > connection? (2) How to eliminate the defunct child process? (3) How to


    You need to set up a handler for SIGCHLD and call wait() when a child
    exits. On some operating systems (Linux included), setting SIGCHLD to
    SIG_IGN will prevent exited children from becoming zombies, which is
    what ps reports as "defunct".
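
    A minimal sketch of the SIGCHLD approach, assuming POSIX sigaction()
    and waitpid(). The names on_sigchld, install_sigchld_reaper and
    reaped_children are made up for illustration and are not part of the
    posted program. The handler loops with WNOHANG because several child
    exits can be delivered as a single signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children reaped so far (illustrative bookkeeping only);
   sig_atomic_t can be updated safely from a signal handler. */
static volatile sig_atomic_t reaped_children = 0;

/* SIGCHLD handler: reap every child that has already exited.
   WNOHANG makes waitpid() return instead of blocking once no
   exited child is left. */
static void on_sigchld(int signum)
{
    int saved_errno = errno;   /* waitpid() may overwrite errno */

    (void)signum;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_sigchld_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    /* SA_RESTART: restart interrupted calls where the system allows it.
       SA_NOCLDSTOP: no signal when a child is merely stopped. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    assert(sa.sa_handler != NULL);
    return sigaction(SIGCHLD, &sa, NULL);
}
```

    The parent would call install_sigchld_reaper() once, before its
    accept loop. One caveat: on Linux, select() is not restarted after a
    signal even with SA_RESTART. It returns -1 with errno set to EINTR,
    so a loop that tests select() only for "nonzero" would need to treat
    that case as "try again".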

    > determine whether a connection is closed by peer or alive? (Use
    > select()?)


    Check the return value of read(). It will return 0 when the peer has
    closed the connection and you've read everything it sent.

    > while(!is_stop)
    > {
    > usleep(1000000);
    > printf("1111\n");
    > FD_ZERO(&fds); FD_SET(socket, &fds);
    > if(select(socket+1, &fds, NULL, NULL, &tv))
    > {
    > read(socket, buf, 256);


    You need to check the return value of read(). If it's -1, there was an
    error. If it's 0, it means EOF, i.e. the client has closed the
    connection. And if it's >0 it tells you how much data was read.
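
    The three cases can be sketched as a small helper. This is only an
    illustration: read_some and enum read_status are hypothetical names,
    and retrying on EINTR is an added assumption about how the caller
    wants interrupted reads handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Possible outcomes of one read attempt. */
enum read_status { READ_DATA, READ_EOF, READ_ERROR };

/* Read up to cap bytes from fd into buf, retrying if a signal
   interrupts the call. On READ_DATA, *len holds the byte count;
   buf is NOT NUL-terminated. */
static enum read_status read_some(int fd, char *buf, size_t cap, size_t *len)
{
    ssize_t n;

    assert(buf != NULL && len != NULL && cap > 0);
    do {
        n = read(fd, buf, cap);
    } while (n == -1 && errno == EINTR);

    if (n == -1)
        return READ_ERROR;   /* errno describes the failure */
    if (n == 0)
        return READ_EOF;     /* peer closed, nothing left to read */
    *len = (size_t)n;
    return READ_DATA;
}
```

    In the loop above, READ_EOF would be the point at which the child
    closes its socket and exits, which also answers question (3).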

    > printf("RECV: %s\n", buf);


    %s expects a null-terminated string, but read() doesn't null-terminate
    the buffer.
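
    One way around this, sketched here under the assumption that the
    byte count returned by read() has been kept, is to pass that count
    as a precision so printf never reads past the received data.
    print_received is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Print exactly len received bytes. The precision in "%.*s" caps how
   many bytes printf reads, so buf need not be NUL-terminated.
   Returns whatever fprintf returns (characters written, or negative
   on error). */
static int print_received(FILE *out, const char *buf, ssize_t len)
{
    assert(out != NULL && buf != NULL && len >= 0);
    return fprintf(out, "RECV: %.*s\n", (int)len, buf);
}
```

    The alternative is to read at most sizeof buf - 1 bytes and store a
    zero byte after the last byte received.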

    > }
    > void start_client(void)
    > {
    > while(!is_stop)
    > {
    > usleep(1000000);
    > FD_ZERO(&fds); FD_SET(socket, &fds);
    > if(select(socket+1, &fds, NULL, NULL, &tv))
    > {
    > read(socket, buf, 256);


    You need to check the return value of read(). If it's -1, there was an
    error. If it's 0, it means EOF, i.e. the peer (here, the server) has
    closed the connection. And if it's >0 it tells you how much data was
    read.

    > printf("RECV: %s\n", buf);


    %s expects a null-terminated string, but read() doesn't null-terminate
    the buffer.

    > }


    --
    Barry Margolin,
    barmar@alum.mit.edu
    Arlington, MA
    *** PLEASE don't copy me on replies, I'll read them in the group ***

  3. Re: How to eliminate the defunct problem?

    On Jun 11, 8:20 pm, Barry Margolin wrote:
    > %s expects a null-terminated string, but read() doesn't null-terminate
    > the buffer.
    >


    I thought the data and/or string sent over TCP/IP(?) was
    null-terminated.

    Chad


  4. Re: How to eliminate the defunct problem?

    ["Followup-To:" header set to comp.protocols.tcp-ip.]

    On Thu, 12 Jun 2008 14:28:32 +0000 (UTC), John Gordon wrote:
    > In <2b8d1a90-058b-4f95-82b6-699caa975c9a@34g2000hsf.googlegroups.com> K-mart Cashier writes:
    >
    >> I thought the data/and or string sent over TCP/IP(?) was null
    >> terminated .

    >
    > No. Why would it be? Strings are the only data type which need to be
    > null-terminated; it would be a waste for any other kind of data.


    (NUL-terminated is the proper term, I think.)

    True, but maybe misleading. Many or most protocols on top of TCP
    are text based, i.e. they pass strings around, but none of the popular
    ones send NUL-terminated strings. They tend to use CRLF, or '.' on
    a line of its own, or close the socket, things like that.

    /Jorgen

    --
    // Jorgen Grahn \X/ snipabacken.se> R'lyeh wgah'nagl fhtagn!

  5. Re: How to eliminate the defunct problem?

    On Jun 12, 5:54 am, K-mart Cashier wrote:

    > I thought the data/and or string sent over TCP/IP(?) was null
    > terminated .
    >
    > Chad


    That doesn't matter. We're talking about *RECEIVING*, not sending.

    DS


  6. Re: How to eliminate the defunct problem?

    On Jun 12, 1:04 pm, David Schwartz wrote:
    > On Jun 12, 5:54 am, K-mart Cashier wrote:
    >
    > > I thought the data/and or string sent over TCP/IP(?) was null
    > > terminated .

    >
    > > Chad

    >
    > That doesn't matter. We're talking about *RECEIVING*, not sending.
    >
    > DS


    Huh? What do you mean? I'm once again drawing a blank.

    Chad

  7. Re: How to eliminate the defunct problem?

    On Jun 12, 5:14 pm, K-mart Cashier wrote:

    > Huh? What do you mean? I'm once again drawing a blank.
    >
    > Chad


    Suppose you always send messages with a dot at the end. When I call
    'recv' or 'read', there's no guarantee I'll get a whole message. So
    there's no guarantee that what I receive will have a dot at the end.

    Sending X does not guarantee receiving X, when X is defined in terms
    of properties the transport does not preserve.
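
    As a rough sketch of what that means for the receiver: keep the bytes
    received so far and keep reading until the terminator shows up.
    Everything named here (struct msgbuf, read_message, consume_message,
    the 512-byte limit, the use of EMSGSIZE) is an assumption made for
    the example, not something from the posted program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Bytes received so far that do not yet form a complete message. */
struct msgbuf {
    char   data[512];
    size_t used;
};

/* Keep reading from fd until mb holds at least one delim byte.
   Returns the message length including delim, 0 if the peer closed
   the connection first, or -1 on error (errno is EMSGSIZE when the
   buffer fills up without a delimiter). */
static ssize_t read_message(int fd, struct msgbuf *mb, char delim)
{
    assert(mb != NULL && mb->used <= sizeof mb->data);
    for (;;) {
        /* memchr, not strchr: the data is not a C string. */
        char *end = memchr(mb->data, delim, mb->used);
        if (end != NULL)
            return (ssize_t)(end - mb->data) + 1;
        if (mb->used == sizeof mb->data) {
            errno = EMSGSIZE;
            return -1;
        }
        ssize_t n = read(fd, mb->data + mb->used,
                         sizeof mb->data - mb->used);
        if (n == -1 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n <= 0)
            return n;           /* 0: EOF, -1: error */
        mb->used += (size_t)n;
    }
}

/* Drop the first len bytes once the caller has processed them;
   bytes belonging to the next message stay in the buffer. */
static void consume_message(struct msgbuf *mb, size_t len)
{
    assert(mb != NULL && len <= mb->used);
    memmove(mb->data, mb->data + len, mb->used - len);
    mb->used -= len;
}
```

    A single read() may deliver half a message or one and a half, so the
    caller processes the returned length and then calls consume_message()
    to keep whatever arrived after the dot.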

    DS

  8. Re: How to eliminate the defunct problem?

    On Jun 12, 8:24 pm, Jorgen Grahn wrote:
    > ["Followup-To:" header set to comp.protocols.tcp-ip.]
    >
    > On Thu, 12 Jun 2008 14:28:32 +0000 (UTC), John Gordon wrote:
    > > In <2b8d1a90-058b-4f95-82b6-699caa975...@34g2000hsf.googlegroups.com> K-mart Cashier writes:

    >
    > >> I thought the data/and or string sent over TCP/IP(?) was null
    > >> terminated .

    >
    > > No. Why would it be? Strings are the only data type which need to be
    > > null-terminated; it would be a waste for any other kind of data.

    >
    > (NUL-terminated is the proper term, I think.)

    No, NUL is ASCII. The proper term is 'string'.
    The definition of string given by ISO C is that a string is a
    contiguous sequence of characters terminated by and including the
    first null character.

    > True, but maybe misleading. Many or most protocols on top of TCP
    > are text based, i.e. they pass strings around, but none of the popular
    > ones send NUL-terminated strings. They tend to use CRLF, or '.' on
    > a line of its own, or close the socket, things like that.

    A common idiom when dealing with such protocols is:

    char buffer[512];
    ssize_t recvn;

    recvn = recv(sock, buffer, sizeof buffer - 1, 0);
    if(recvn == -1) {
        /* error; on an O_NONBLOCK socket with no data waiting,
           errno is EAGAIN or EWOULDBLOCK */
    } else if(recvn == 0) {
        /* peer closed the connection */
    } else {
        buffer[recvn] = 0;
    }

  9. Re: How to eliminate the defunct problem?

    On Jun 13, 7:27 am, vipps...@gmail.com wrote:

    > A common idiom when dealing with such protocols is:
    >
    > char buffer[512];
    > ssize_t recvn;
    >
    > recvn = recv(sock, buffer, sizeof buffer - 1, 0);
    > if(recvn == -1) {
    >     /* error; on an O_NONBLOCK socket with no data waiting,
    >        errno is EAGAIN or EWOULDBLOCK */
    > } else if(recvn == 0) {
    >     /* peer closed the connection */
    > } else {
    >     buffer[recvn] = 0;
    > }


    There are two things you have to be aware of when using this approach:

    1) The other end, either maliciously or due to a bug, might send an
    embedded zero byte. As a result, 'strlen(buffer)' may not be equal to
    'recvn'. It's important to detect this condition and handle it sanely
    (possibly by terminating the connection).

    2) If you are scanning for a terminator like '.' or a line ending, an
    embedded zero byte will cause you to scan only the bytes before the
    embedded zero. So don't use 'strchr', use 'memchr', or handle an
    embedded zero some other way.
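
    Both points can be sketched with memchr, which takes an explicit
    length and so does not stop at a zero byte. has_embedded_nul and
    find_terminator are hypothetical helper names; len is the count
    returned by recv().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Nonzero if the first len received bytes contain a zero byte, in
   which case strlen(buf) would be smaller than len. */
static int has_embedded_nul(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}

/* Offset of the first term byte within the first len bytes, or -1 if
   it does not occur. Unlike strchr, the scan covers all len bytes even
   when a zero byte appears before the terminator. */
static long find_terminator(const char *buf, size_t len, char term)
{
    const char *hit;

    assert(buf != NULL);
    hit = memchr(buf, term, len);
    return hit != NULL ? (long)(hit - buf) : -1L;
}
```

    For the six received bytes 'a' 'b' 0 'c' 'd' '.', strchr(buf, '.')
    finds nothing because it stops at the zero byte, while
    find_terminator(buf, 6, '.') reports offset 5 and has_embedded_nul
    flags the input so the connection can be dropped if that is the
    chosen policy.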

    DS
