Select not seeing a readable fd on a listening socket - Linux


  1. Select not seeing a readable fd on a listening socket

    Hi, I have a weird problem that I've been getting nowhere with for
    quite some time, and I was hoping someone here could point me in the
    right direction.

    I have two apps, usually living on separate machines, one a TCP
    client, one a TCP server. They use non-blocking I/O and use select to
    multiplex between multiple connections. When the two apps are on
    separate machines this works very solidly: long up-times, no drops,
    and clear diagnostics when something does go wrong.

    The problem comes when I run them on the same machine talking to one
    another. The client successfully connects to the server and believes
    it can then do work. The server, however, NEVER sees the listening
    socket marked readable by select, which prevents the server from
    reacting correctly.

    I have a variety of options set on these sockets:

    O_NONBLOCK, FD_CLOEXEC, SO_REUSEADDR, SO_OOBINLINE, TCP_NODELAY.

    I check the return values of all pertinent calls (socket, bind,
    listen, accept, connect, read/write etc.) and nothing appears untoward
    or failing.

    I'm building for Linux version 2.6.7 (gcc version 3.3.3 20040412
    (Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6))

    I swear I've gone through Richard Stevens' book with a fine-toothed
    comb now, but I am still at a loss! Any help or pointers would be
    greatly appreciated,

    Joe


  2. Re: Select not seeing a readable fd on a listening socket

    On Oct 30, 10:19 am, Joe wrote:

    [..snip..]


    I should have mentioned that in the failure case the server's call
    to socket() returns 0, which seems odd as I don't close the
    stdin/out/err file descriptors. On the rare occasions it returns an
    fd > 0 the system works.


  3. Re: Select not seeing a readable fd on a listening socket

    On Oct 30, 12:09 pm, Joe wrote:
    >
    > [..snip..]
    >
    > I should have mentioned that on the server in the failure case the
    > call to socket() returns 0, which seems odd as I don't close the stdin/
    > out/err file descriptors. On the rare occasions it returns an fd > 0
    > the system works.


    My bad. I was closing an fd I had initialised to 0. Doh! Nut.
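
    For anyone hitting the same symptom: 0 is a valid descriptor
    (stdin), so a variable initialised to 0 looks "open" to any cleanup
    code. Closing it frees slot 0, and the next socket() call is handed
    descriptor 0 because the lowest free number is always returned. A
    minimal sketch of the usual guard, using -1 as the "not open" value.
    The struct conn type, CONN_INIT and conn_close are hypothetical
    names for illustration, not taken from Joe's code:

```c
#include <unistd.h>

/* Hypothetical per-connection record (an assumption for this sketch). */
struct conn {
    int fd;                 /* -1 means "no descriptor held" */
};

/* Initialise with -1, never 0: 0 is stdin and is a valid descriptor. */
#define CONN_INIT { -1 }

/* Close the descriptor only if one is actually held, then mark the
 * record unused so a second call cannot close an unrelated descriptor.
 * Returns 0 if nothing needed closing, otherwise the result of close(). */
static int conn_close(struct conn *c)
{
    int rc = 0;

    if (c->fd >= 0) {
        rc = close(c->fd);
        c->fd = -1;
    }
    return rc;
}
```

    With this convention a record that never held a descriptor can be
    passed to conn_close() safely, and stdin stays open.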


  4. Re: Select not seeing a readable fd on a listening socket

    Joe writes:

    > On Oct 30, 10:19 am, Joe wrote:
    >> Hi, I have a weird problem that I've been getting nowhere with for
    >> quite some time and was hoping this might be the place to point me in
    >> the right direction?
    >>


    [..snip..]

    > I should have mentioned that on the server in the failure case the
    > call to socket() returns 0, which seems odd as I don't close the stdin/
    > out/err file descriptors. On the rare occasions it returns an fd > 0
    > the system works.
    >


    "I don't close the stdin..." Does this mean that you deliberately
    leave them open? If so, have you verified[1] that nothing closes them
    by mistake?

    [1] By running the server under strace or setting a breakpoint on
    calls to close

    --
    mailto:av1474@comtv.ru
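
    As a complement to strace, the same check can be made from inside
    the program: fcntl() with F_GETFD fails with EBADF on a descriptor
    that is not open. A rough sketch, assuming it is called just before
    socket(); fd_is_open and count_open_std_fds are hypothetical helper
    names:

```c
#include <errno.h>
#include <fcntl.h>

/* Returns 1 if fd refers to an open descriptor, 0 if it does not. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Counts how many of stdin (0), stdout (1) and stderr (2) are open.
 * Anything below 3 means something has closed a standard descriptor,
 * and the next socket() call will reuse the lowest free slot. */
static int count_open_std_fds(void)
{
    int fd, n = 0;

    for (fd = 0; fd <= 2; fd++)
        n += fd_is_open(fd);
    return n;
}
```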

  5. Re: Select not seeing a readable fd on a listening socket


    Joe wrote:

    > The problem comes when I run them on the same machine talking to one
    > another. The client successfully gets a connection to the server and
    > believes it can then do work. The server however NEVER has the
    > listening socket set readable in select, thus preventing the server
    > reacting correctly.


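    The usual problem is that you forgot to put the descriptor back in the
    fd set. select() overwrites the set with only the descriptors that are
    ready, so it has to be rebuilt before every call. The mistake usually
    comes from code like this:

    A) set up fd set
    B) call select
    C) if (listen socket is readable) do stuff
    D) if (data socket is readable) do stuff
    E) goto B (OOPS! We wanted to goto A)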

    It's hard to say without seeing your code though.

    DS
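
    The loop described in steps A to E can be sketched as a helper that
    rebuilds the read set on every call, so that "goto A" happens by
    construction. This is only an illustration under assumed names
    (wait_readable, data_fds and n_data are hypothetical), not the
    poster's code:

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Rebuild the read set from scratch, then wait for activity.
 * select() replaces the contents of *ready with only the descriptors
 * that are readable, so a set left over from a previous call must never
 * be reused. On Linux the timeout may be modified as well, so the
 * caller should reinitialise it before each call too.
 * Returns select()'s result; on success *ready holds the ready fds. */
static int wait_readable(int listen_fd, const int *data_fds, size_t n_data,
                         struct timeval *timeout, fd_set *ready)
{
    int max_fd = listen_fd;
    size_t i;

    /* Step A: set up the fd set, every time. */
    FD_ZERO(ready);
    FD_SET(listen_fd, ready);
    for (i = 0; i < n_data; i++) {
        FD_SET(data_fds[i], ready);
        if (data_fds[i] > max_fd)
            max_fd = data_fds[i];
    }

    /* Step B: call select on the freshly built set. */
    return select(max_fd + 1, ready, NULL, NULL, timeout);
}
```

    The caller then tests FD_ISSET(listen_fd, &ready) to decide whether
    to accept(), tests each data socket in the same way, and calls
    wait_readable() again at the top of the loop.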

