Select not seeing a readable fd on a listening socket - Linux
Hi, I have a weird problem that I've been getting nowhere with for
quite some time, and I was hoping this might be the place to point me in
the right direction.
I have two apps, usually living on separate machines, one a TCP
client, one a TCP server. They use non-blocking I/O and use select to
multiplex between multiple connections. When the two apps are on
separate machines this works very solidly: long up-times, no drops,
and excellent detection of why something has gone wrong when it does.
The problem comes when I run them on the same machine talking to one
another. The client successfully gets a connection to the server and
believes it can then do work. The server, however, NEVER sees the
listening socket marked readable by select, which prevents the server
from reacting correctly.
I have a variety of options set on these sockets:
O_NONBLOCK, FD_CLOEXEC, SO_REUSEADDR, SO_OOBINLINE, TCP_NODELAY.
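For readers following along, here is a minimal sketch of how those five settings are typically applied to a TCP socket on Linux. The helper name configure_socket is hypothetical and is not taken from the poster's code. It mainly shows that O_NONBLOCK and FD_CLOEXEC are fcntl flags of two different kinds, while the other three are socket options.

```c
#include <fcntl.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: apply the five flags/options listed above to one
 * TCP socket. Returns 0 on success, -1 on the first failing call. */
int configure_socket(int fd)
{
    int on = 1;
    int flags;

    /* O_NONBLOCK is a file status flag: read-modify-write via F_GETFL/F_SETFL. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    /* FD_CLOEXEC is a descriptor flag: it uses F_GETFD/F_SETFD instead. */
    flags = fcntl(fd, F_GETFD, 0);
    if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;

    /* The remaining three are socket options set with setsockopt(). */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on) == -1)
        return -1;
    if (setsockopt(fd, SOL_SOCKET, SO_OOBINLINE, &on, sizeof on) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof on) == -1)
        return -1;

    return 0;
}
```

A caller would pass the descriptor returned by socket() (before bind/listen for the server side) and treat a -1 return like any other failed system call.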
I check the return values of all pertinent calls (socket, bind,
listen, accept, connect, read/write etc.) and nothing appears to be
failing.
I'm building for Linux version 2.6.7 (gcc version 3.3.3 20040412
(Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6))
I swear I've gone through Richard Stevens' book with a fine-toothed
comb now, but I am still at a loss! Any help or pointers would be
greatly appreciated,
Joe
-
Re: Select not seeing a readable fd on a listening socket
On Oct 30, 10:19 am, Joe wrote:
> [..snip..]
I should have mentioned that on the server, in the failure case, the
call to socket() returns 0, which seems odd as I don't close the
stdin/out/err file descriptors. On the rare occasions it returns an
fd > 0, the system works.
-
Re: Select not seeing a readable fd on a listening socket
On Oct 30, 12:09 pm, Joe wrote:
> On Oct 30, 10:19 am, Joe wrote:
> > [..snip..]
>
> I should have mentioned that on the server, in the failure case, the
> call to socket() returns 0, which seems odd as I don't close the
> stdin/out/err file descriptors. On the rare occasions it returns an
> fd > 0, the system works.
My bad. I was closing an fd I had initialised to 0, which closed stdin,
so the next socket() call returned 0. Doh! Nut.
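A common way to avoid this class of bug is to use -1, not 0, as the "no descriptor" value, because 0 is a valid descriptor (stdin). The sketch below is only an illustration of that idea. The struct and function names are hypothetical and do not come from the poster's code.

```c
#include <unistd.h>

/* Hypothetical connection record; the names are illustrative only. */
struct conn {
    int fd;   /* -1 means "no descriptor owned" */
};

/* Start with -1, never 0: 0 is a valid descriptor (stdin). */
void conn_init(struct conn *c)
{
    c->fd = -1;
}

/* Close only a descriptor we actually own, then reset the sentinel.
 * Returns 1 if close() was called, 0 if there was nothing to close. */
int conn_close(struct conn *c)
{
    if (c->fd < 0)
        return 0;
    close(c->fd);
    c->fd = -1;
    return 1;
}
```

With this convention, a cleanup path that runs before the socket was ever opened does nothing, instead of closing stdin and letting the next socket() call return descriptor 0.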
-
Re: Select not seeing a readable fd on a listening socket
Joe writes:
> On Oct 30, 10:19 am, Joe wrote:
>> Hi, I have a weird problem that I've been getting nowhere with for
>> quite some time and was hoping this might be the place to point me in
>> the right direction?
>>
[..snip..]
> I should have mentioned that on the server, in the failure case, the
> call to socket() returns 0, which seems odd as I don't close the
> stdin/out/err file descriptors. On the rare occasions it returns an
> fd > 0, the system works.
>
"I don't close the stdin..." does this mean that you willingly don't
close them? If so, have you verified[1] that something else doesn't
close them by mistake?
[1] By running under strace or setting a breakpoint on calls to close()
--
mailto:av1474@comtv.ru
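Besides strace or a breakpoint, a program can also check from the inside whether descriptors 0, 1 and 2 are still open. This is a different technique from the one suggested above, sketched under the assumption that fcntl(fd, F_GETFD) failing with EBADF means the descriptor is closed. The function names are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd refers to an open descriptor, 0 if it is closed (EBADF),
 * and -1 on any other fcntl() error. */
int fd_is_open(int fd)
{
    if (fcntl(fd, F_GETFD) != -1)
        return 1;
    return errno == EBADF ? 0 : -1;
}

/* Returns how many of stdin, stdout and stderr are currently open. */
int count_open_std_fds(void)
{
    int fd, n = 0;

    for (fd = STDIN_FILENO; fd <= STDERR_FILENO; fd++)
        if (fd_is_open(fd) == 1)
            n++;
    return n;
}
```

Calling count_open_std_fds() just before socket() and logging anything other than 3 would show whether one of the standard descriptors has already been closed by that point.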
-
Re: Select not seeing a readable fd on a listening socket
Joe wrote:
> The problem comes when I run them on the same machine talking to one
> another. The client successfully gets a connection to the server and
> believes it can then do work. The server, however, NEVER sees the
> listening socket marked readable by select, which prevents the server
> from reacting correctly.
The usual problem is forgetting to put the descriptor back in the
fd set, typically because of code like this:
A) set up fd set
B) call select
C) if(listen socket) do stuff
D) if(data socket) do stuff
E) goto B (OOPS! We wanted to goto A)
It's hard to say without seeing your code though.
DS
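The reason step E matters is that select() overwrites the fd set it is given, leaving only the ready descriptors set, so the set has to be rebuilt before every call. Here is a minimal sketch of the corrected loop body, assuming a single listening socket plus an array of data sockets. The names build_read_set and poll_once and the callback parameters are hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>

/* Step A: rebuild the read set from scratch and return the highest fd.
 * select() modifies the set in place, so this runs before every call. */
int build_read_set(fd_set *set, int listen_fd, const int *data_fds, int n)
{
    int i, maxfd = listen_fd;

    FD_ZERO(set);
    FD_SET(listen_fd, set);
    for (i = 0; i < n; i++) {
        FD_SET(data_fds[i], set);
        if (data_fds[i] > maxfd)
            maxfd = data_fds[i];
    }
    return maxfd;
}

/* One pass of steps A to D. The caller loops by calling this again,
 * which is the "goto A" the list above asks for.
 * Returns whatever select() returned. */
int poll_once(int listen_fd, const int *data_fds, int n,
              void (*on_listen)(int), void (*on_data)(int))
{
    fd_set rfds;
    int i, ready, maxfd;

    maxfd = build_read_set(&rfds, listen_fd, data_fds, n);   /* A */
    ready = select(maxfd + 1, &rfds, NULL, NULL, NULL);      /* B */
    if (ready <= 0)
        return ready;
    if (FD_ISSET(listen_fd, &rfds))                          /* C */
        on_listen(listen_fd);
    for (i = 0; i < n; i++)                                  /* D */
        if (FD_ISSET(data_fds[i], &rfds))
            on_data(data_fds[i]);
    return ready;
}
```

A server would call poll_once() in a for (;;) loop, with on_listen calling accept() and on_data reading from the ready socket. Because the set is rebuilt inside each pass, the listening socket cannot silently drop out of it.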