broken client, or broken TCP stack? - TCP-IP



  1. broken client, or broken TCP stack?

    Platform: Client and Server are running 32-bit FedoraCore 6

    Not shown: Successful connection open, transfer of data from client
    to server. Also, this is a subset of the original capture, and
    therefore the relative seq/ack numbers start at 1. Sorry for some
    possible confusion.

    In testing the connection-loss-then-reconnect function in the client,
    the server is restarted after some arbitrary period of data transfer
    from client to server. At the restart, the server TCP sends a FIN ACK
    to the client, and the client TCP ACKS that packet.

    Within the debugger, however, the client is happily writing to that
    socket but the trace shows that none of this is actually getting on
    the wire. Finally, we restart the client and the client TCP sends a
    RST to the server.

    Socket options on the client are KEEPALIVE and LINGER {1,0}.

    I would expect that the client TCP would have immediately sent a FIN
    ACK after its ACK of the server FIN ACK.

    Is this a broken client, a broken client TCP, or something else?

    Thanks very much for any insight. I'm extremely happy to do my own
    work if there's a FM to read, in which case I'd be really grateful for
    a good link that describes what I'm seeing here. I haven't found this
    case explained in Stevens yet (TCP v1 or Unix Network Programming:
    Network APIs) but I'm still looking.

    |Time    | 192.168.169.29    | 192.168.169.32    |

    [....]

    |0.000   | PSH, ACK - Len: 503                  | Seq = 1   Ack = 1
    |        | (57397) ------------------> (5001)   |
    |0.000   | ACK                                  | Seq = 1   Ack = 504
    |        | (57397) <------------------ (5001)   |
    |0.000   | PSH, ACK - Len: 194                  | Seq = 504 Ack = 1
    |        | (57397) ------------------> (5001)   |
    |0.000   | ACK                                  | Seq = 1   Ack = 698
    |        | (57397) <------------------ (5001)   |
    |38.124  | FIN, ACK                             | Seq = 1   Ack = 698   # Server Restarted
    |        | (57397) <------------------ (5001)   |
    |38.126  | ACK                                  | Seq = 698 Ack = 2
    |        | (57397) ------------------> (5001)   |
    |619.097 | RST, ACK                             | Seq = 698 Ack = 2     # Client Restarted
    |        | (57397) ------------------> (5001)   |
    |627.107 | SYN                                  | Seq = 0   Ack = ?
    |        | (57549) ------------------> (5001)   |
    |627.107 | SYN, ACK                             | Seq = 0   Ack = 1
    |        | (57549) <------------------ (5001)   |
    |627.107 | ACK                                  | Seq = 1   Ack = 1
    |        | (57549) ------------------> (5001)   |

  2. Re: broken client, or broken TCP stack?

    On Nov 6, 4:49 am, robop...@gmail.com wrote:

    > Platform: Client and Server are running 32-bit FedoraCore 6


    I have tried quite hard to follow your sequence of events, but I just
    can't do it. Your descriptions are very compressed and vague.

    > Not shown: Successful connection open, transfer of data from client
    > to server. Also, this is a subset of the original capture, and
    > therefore the relative seq/ack numbers start at 1. Sorry for some
    > possible confusion.


    > In testing the connection-loss-then-reconnect function in the client,


    Is this an application-level thing or an OS TCP-level thing?

    > the server is restarted after some arbitrary period of data transfer
    > from client to server. At the restart, the server TCP sends a FIN ACK
    > to the client, and the client TCP ACKS that packet.


    Why would the server send a FIN ACK after it's restarted? It would
    have no knowledge of the previous connection.

    > Within the debugger, however, the client is happily writing to that
    > socket but the trace shows that none of this is actually getting on
    > the wire.


    Seems normal. That's a common TCP state.

    > Finally, we restart the client and the client TCP sends a
    > RST to the server.


    When you say "restart the client", are you talking about the
    application or the machine? Are you rebooting the OS? Is the TCP
    connection intended to be kept across the restart?

    > Socket options on the client are KEEPALIVE and LINGER {1,0}.
    >
    > I would expect that the client TCP would have immediately sent a FIN
    > ACK after its ACK of the server FIN ACK.


    Are you talking about after one or the other has been restarted? And
    by restarted are we talking applications, systems, or what?

    > Is this a broken client, a broken client TCP, or something else?


    No idea. I can't follow your descriptions.

    > |0.000    |          PSH, ACK - Len: 503          |Seq = 1 Ack = 1
    > |         |(57397)  ------------------>  (5001)   |
    > |0.000    |          ACK       |                  |Seq = 1 Ack = 504
    > |         |(57397)  <------------------  (5001)   |
    > |0.000    |          PSH, ACK - Len: 194          |Seq = 504 Ack = 1
    > |         |(57397)  ------------------>  (5001)   |
    > |0.000    |          ACK       |                  |Seq = 1 Ack = 698
    > |         |(57397)  <------------------  (5001)   |
    > |38.124   |          FIN, ACK  |                  |Seq = 1 Ack =


    There doesn't seem to be a direction on this FIN,ACK that I can see.
    But the line wrapping has made it very hard to read anyway.

    > 698    # Server Restarted
    > |         |(57397)  <------------------  (5001)   |


    Is this the direction on the FIN,ACK above? Or what?

    > |38.126   |          ACK       |                  |Seq = 698 Ack = 2
    > |         |(57397)  ------------------>  (5001)   |
    > |619.097  |          RST, ACK  |                  |Seq = 698 Ack =
    > 2    # Client Restarted
    > |         |(57397)  ------------------>  (5001)   |


    Where does this line end? There are |'s everywhere.

    > |627.107  |          SYN       |                  |Seq = 0 Ack = ?
    > |         |(57549)  ------------------>  (5001)   |
    > |627.107  |          SYN, ACK  |                  |Seq = 0 Ack = 1
    > |         |(57549)  <------------------  (5001)   |
    > |627.107  |          ACK       |                  |Seq = 1 Ack = 1
    > |         |(57549)  ------------------>  (5001)   |


    Perhaps a link to a URL where it's readable?

    DS

  3. Re: broken client, or broken TCP stack?

    On Thu, 06 Nov 2008 04:49:22 -0800, robopoet wrote:

    > Within the debugger, however, the client is happily writing to that
    > socket but the trace shows that none of this is actually getting on the
    > wire. Finally, we restart the client and the client TCP sends a RST to
    > the server.


    Quite probably the first packets do get out. At a certain moment, the
    window is filled, so the sender cannot send any more packets. That does
    not mean the sender cannot accept more data from the program; the sender
    is free to buffer it internally somewhere.

    But I may be way off here...

    M4

  4. Re: broken client, or broken TCP stack?

    On Nov 6, 7:49 am, robop...@gmail.com wrote:
    > Platform: Client and Server are running 32-bit FedoraCore 6
    >
    > Not shown: Successful connection open, transfer of data from client
    > to server. Also, this is a subset of the original capture, and
    > therefore the relative seq/ack numbers start at 1. If you open the
    > trace of the connection (below), please bear in mind that the
    > connection has been in progress for some time even though the
    > acks = 1.


    Note: The terms "client" and "server" generally refer to the data-
    sending and data-receiving applications running on two different
    machines.

    The exception to this is when I say "client TCP" or "server TCP" in
    which case I am talking about the TCP in the kernel on the machine
    running the client or server application.

    I am testing that the client application can detect and recover from a
    connection loss with the server application. The specific test case
    is that the server application has been restarted, which is a
    production outlier case but nonetheless has to be handled.

    The rest of the comments reference the trace of the behavior I am
    describing, which can be viewed as plain text at:

    http://mysite.verizon.net/vze3jmtr/tmx/bug_961.txt

    When the server application is restarted, the server TCP sends a FIN
    ACK to the client TCP, which ACKs this right away. Within the
    debugger, however, I see the client app continue to write() on the
    socket handle, without error, even though no packets appear on the
    wire. After about 10 minutes, I restart the client application and
    see the client TCP send a RST to the server TCP.

    I am surprised by this. I would expect that the client TCP would send
    a FIN ACK back to the server TCP upon receipt of the FIN ACK, which it
    evidently received. I am also surprised that the client application
    can continue to write() to the socket handle ad infinitum without an
    error being reported.

    Am I correct in being surprised by both of these behaviors? If not,
    what do I not understand about either TCP or socket programming? A
    citation would be great.

    If I am correct in being surprised by either or both of these
    behaviors, is there something I can do about it in the client
    application or is it a TCP/kernel issue?

    > Socket options on the client are KEEPALIVE and LINGER {1,0}.


    Forget trying to read this. Please look at

    http://mysite.verizon.net/vze3jmtr/tmx/bug_961.txt

    if you'd like one that should be readable.

    > |Time     | 192.168.169.29    | 192.168.169.32    |
    >
    > [....]
    >
    > |0.000    |          PSH, ACK - Len: 503          |Seq = 1 Ack = 1
    > |         |(57397)  ------------------>  (5001)   |


    Thanks much.

  5. Re: broken client, or broken TCP stack?

    On Nov 6, 8:37 am, David Schwartz wrote:
    > On Nov 6, 4:49 am, robop...@gmail.com wrote:
    >
    > > Platform: Client and Server are running 32-bit FedoraCore 6

    >
    > I have tried quite hard to follow your sequence of events, but I just
    > can't do it. Your descriptions are very compressed and vague.


    Sorry about that. I've posted a clarification followup to my initial
    post that I hope helps.

    >> [....]

    > [....]


    > Perhaps a link to an URL where it's readable?


    http://mysite.verizon.net/vze3jmtr/tmx/bug_961.txt

  6. Re: broken client, or broken TCP stack?

    In comp.unix.programmer robopoet@gmail.com wrote:

    > When the server application is restarted, the server TCP sends a FIN
    > ACK to the client TCP, which ACKs this right away. Within the
    > debugger, however, I see the client app continue to write() on the
    > socket handle, without error, even though no packets appear on the
    > wire. After about 10 minutes, I restart the client application and
    > see the client TCP send a RST to the server TCP.


    > I am surprised by this. I would expect that the client TCP would send
    > a FIN ACK back to the server TCP upon receipt of the FIN ACK, which it
    > evidently received. I am also surprised that the client application
    > can continue to write() to the socket handle ad infinitum without an
    > error being reported.


    FIN only means that the sender of the FIN will be sending no more
    data. It says nothing about its willingness to accept more data.

    Sending of a FIN is triggered by an application-level call to either
    shutdown() or close(). A TCP stack will never spontaneously generate
    a FIN in response to a FIN received from the remote.

    rick jones
    --
    oxymoron n, commuter in a gas-guzzling luxury SUV with an American flag
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  7. Re: broken client, or broken TCP stack?

    On Nov 6, 2:11 pm, Rick Jones wrote:
    > In comp.unix.programmer robop...@gmail.com wrote:
    > > When the server application is restarted, the server TCP sends a FIN
    > > ACK to the client TCP, which ACKs this right away. Within the
    > > debugger, however, I see the client app continue to write() on the
    > > socket handle, without error, even though no packets appear on the
    > > wire. After about 10 minutes, I restart the client application and
    > > see the client TCP send a RST to the server TCP.
    > > I am surprised by this. I would expect that the client TCP would send
    > > a FIN ACK back to the server TCP upon receipt of the FIN ACK, which it
    > > evidently received. I am also surprised that the client application
    > > can continue to write() to the socket handle ad infinitum without an
    > > error being reported.

    >
    > FIN only means that the sender of the FIN will be sending no more
    > data. It says nothing about its willingness to accept more data.
    >
    > Sending of a FIN is triggered by an application-level call to either
    > shutdown() or close(). A TCP stack will never spontaneously generate
    > a FIN in response to a FIN received from the remote.


    Thanks Rick.

    Is there a way, within the client application, to learn that the other
    side of the connection has sent a FIN? It sounds like my expectation
    that the client application's write() would fail is incorrectly
    informed.

    Even though in this specific case the client application doesn't
    read() anything from the server application, maybe I can make the
    socket non-blocking, then do a read() on it and check for a -1 return
    with errno set to EPIPE, EBADF, EINVAL... or, at any rate, something
    other than EWOULDBLOCK.

    Or would a zero from the read() (=> EOF) be how this is conveyed?

    Argh. Any specific chapters in the usual references I should
    (re)read?

    Thanks much,

    Robo

    >
    > rick jones
    > --
    > oxymoron n, commuter in a gas-guzzling luxury SUV with an American flag
    > these opinions are mine, all mine; HP might not want them anyway...
    > feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...



  8. Re: broken client, or broken TCP stack?

    2008-11-6, 11:43(-08), robopoet@gmail.com:
    [...]
    >> FIN only means that the sender of the FIN will be sending no more
    >> data. It says nothing about its willingness to accept more data.

    [...]
    > Is there a way within the client application to learn that the other
    > side of the connection has sent a FIN?

    [...]
    > Or would a zero from the read() (=> EOF) be how this is conveyed?

    [...]

    Yes. That's the same for a pipe(2) or socketpair(2) for
    instance, and it seems quite intuitive to me.

    --
    Stéphane

  9. Re: broken client, or broken TCP stack?

    In comp.protocols.tcp-ip robopoet@gmail.com wrote:
    > On Nov 6, 2:11 pm, Rick Jones wrote:
    > > FIN only means that the sender of the FIN will be sending no more
    > > data. It says nothing about its willingness to accept more data.
    > >
    > > Sending of a FIN is triggered by an application-level call to either
    > > shutdown() or close(). A TCP stack will never spontaneously generate
    > > a FIN in response to a FIN received from the remote.


    > Thanks Rick.


    > Is there a way within the client application to learn that the other
    > side of the connection has sent a FIN? It sounds like my expectation
    > that the client application's write() would fail is incorrectly
    > informed.


    Perhaps slightly muddled...

    > Even though in this specific case the client application doesn't
    > read() anything from the server application, maybe I can make the
    > socket non-blocking, then do a read() on it and check for a -1 return
    > with errno set to EPIPE, EBADF, EINVAL... or, at any rate, something
    > other than EWOULDBLOCK.


    Actual error returns on a socket call only happen with hard errors -
    that is, when RSTs are received. Receipt of a FIN will not cause any subsequent
    socket call to return an error. IIRC even a second, third,
    fourth... read after the first which gets the zero return will still
    get a return of zero.

    > Or would a zero from the read() (=> EOF) be how this is conveyed?


    Yes. The read return of zero is the indication that a FIN has arrived.

    Now we get into a grey area - was the FIN triggered by the remote
    application calling close()/shutdown(SHUT_RDWR), or was it simply a
    shutdown(SHUT_WR)? In and of itself the FIN and read return of zero
    does not say. If it was a close()/shutdown(SHUT_RDWR) then a
    subsequent write() at this end _may_ fail with an error message.
    Typically it would be the second or later write after the arrival of
    the FIN - the first write will cause a segment to arrive at the remote
    which was not expecting data. That triggers a RST, which most of the
    time makes it back to the sender, terminating the connection with
    extreme prejudice. The next socket call after that happens will get
    an error return.

    If however it was just a shutdown(SHUT_WR) on the remote, the remote
    is still willing to accept data, and the local can keep calling
    write() to its heart's content.

    > Argh. Any specific chapters in the usual references I should
    > (re)read?


    I don't have chapter numbers memorized, but thumbing through the
    likes of Unix Network Programming, Third (IIRC) Edition, would be
    goodness I suppose.

    rick jones
    --
    denial, anger, bargaining, depression, acceptance, rebirth...
    where do you want to be today?
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  10. Re: broken client, or broken TCP stack?

    robopoet@gmail.com wrote:
    > On Nov 6, 7:49 am, robop...@gmail.com wrote:

    [snip]
    > I am testing that the client application can detect and recover from a
    > connection loss with the server application. The specific test case
    > is that the server application has been restarted, which is a
    > production outlier case but nonetheless has to be handled.
    >
    > The rest of the comments reference the trace of the behavior I am
    > describing, which can be viewed as plain text at:
    >
    > http://mysite.verizon.net/vze3jmtr/tmx/bug_961.txt
    >
    > When the server application is restarted, the server TCP sends a FIN
    > ACK to the client TCP, which ACKs this right away. Within the
    > debugger, however, I see the client app continue to write() on the
    > socket handle, without error, even though no packets appear on the
    > wire. After about 10 minutes, I restart the client application and
    > see the client TCP send a RST to the server TCP.
    >
    > I am surprised by this. I would expect that the client TCP would send
    > a FIN ACK back to the server TCP upon receipt of the FIN ACK, which it
    > evidently received.


    I think this has been covered by other posts.

    > I am also surprised that the client application
    > can continue to write() to the socket handle ad infinitum without an
    > error being reported.


    It's not surprising that write()s are successful even though no data are
    transmitted, since the data may be buffered in kernel memory. If you
    waited long enough or increased the data rate, write() would eventually
    block.

    However, the fact that no data are transmitted is surprising. Given the
    information available, the only explanation I can come up with is that
    the FIN-set packet from the server also advertised a zero window, but I
    can't think why that would be the case. I suspect it is more likely I am
    missing something than there is a bug in the TCP stack.

    How do you restart the server?

    Alex

  11. Re: broken client, or broken TCP stack?

    On Nov 6, 3:53 pm, Alex Fraser wrote:
    > robop...@gmail.com wrote:
    > > On Nov 6, 7:49 am, robop...@gmail.com wrote:

    > [snip]
    > > I am testing that the client application can detect and recover from a
    > > connection loss with the server application. The specific test case
    > > is that the server application has been restarted, which is a
    > > production outlier case but nonetheless has to be handled.

    >
    > > The rest of the comments reference the trace of the behavior I am
    > > describing, which can be viewed as plain text at:

    >
    > >http://mysite.verizon.net/vze3jmtr/tmx/bug_961.txt

    >
    > > When the server application is restarted, the server TCP sends a FIN
    > > ACK to the client TCP, which ACKs this right away. Within the
    > > debugger, however, I see the client app continue to write() on the
    > > socket handle, without error, even though no packets appear on the
    > > wire. After about 10 minutes, I restart the client application and
    > > see the client TCP send a RST to the server TCP.

    >
    > > I am surprised by this. I would expect that the client TCP would send
    > > a FIN ACK back to the server TCP upon receipt of the FIN ACK, which it
    > > evidently received.

    >
    > I think this has been covered by other posts.
    >
    > > I am also surprised that the client application
    > > can continue to write() to the socket handle ad infinitum without an
    > > error being reported.

    >
    > It's not surprising that write()s are successful even though no data are
    > transmitted, since the data may be buffered in kernel memory. If you
    > waited long enough or increased the data rate, write() would eventually
    > block.
    >
    > However, the fact that no data are transmitted is surprising. Given the
    > information available, the only explanation I can come up with is that
    > the FIN-set packet from the server also advertised a zero window, but I
    > can't think why that would be the case. I suspect it is more likely I am
    > missing something than there is a bug in the TCP stack.


    The window size on the FIN is much bigger than 0 -- 38196 or so (I
    don't have the capture in front of me, but I remember checking this).

    > How do you restart the server?


    killproc (sends a SIGTERM) then startproc, each wrapped by 'restart'
    => 'stop' then 'start' in an /etc/init.d script.

  12. Re: broken client, or broken TCP stack?

    robopoet@gmail.com wrote:
    > On Nov 6, 3:53 pm, Alex Fraser wrote:
    >> robop...@gmail.com wrote:
    >>> On Nov 6, 7:49 am, robop...@gmail.com wrote:

    [snip]
    >>> When the server application is restarted, the server TCP sends a FIN
    >>> ACK to the client TCP, which ACKs this right away. Within the
    >>> debugger, however, I see the client app continue to write() on the
    >>> socket handle, without error, even though no packets appear on the
    >>> wire. After about 10 minutes, I restart the client application and
    >>> see the client TCP send a RST to the server TCP.

    [snip]
    > The window size on the FIN is much bigger than 0 -- 38196 or so (I
    > don't have the capture in front of me, but I remember checking this).


    OK.

    >> How do you restart the server?

    >
    > killproc (sends a SIGTERM) then startproc, each wrapped by 'restart'
    > => 'stop' then 'start' in an /etc/init.d script.


    Sorry, I didn't ask the right question. I was trying to understand what
    happens to the server process(es) at a system call level when you
    restart the server. What happens after SIGTERM is raised?

    If the connected socket in the server is closed (eg because the process
    exits) with no unread data, a FIN being sent is normal, and if a segment
    is received after that, a RST will be sent.

    If, as seems to be the case here, the client is periodically making
    small (<=MSS) write() calls and the network is fast compared to the
    write() frequency, I'd expect the following sequence:

    1. The FIN is received by the client. This also acknowledges all data
    previously sent by the client.
    2. Some time later, the client calls write(), which returns OK.
    3. A segment is transmitted.
    4. The segment is received and a RST sent.
    5. The RST is received.
    6. Some time later, the client calls write() again. This time it fails
    with EPIPE or causes SIGPIPE to be raised.

    The question is, why does this sequence - in particular #3 which should
    follow from #2 - not apply? Zero window, which you have ruled out, is
    the only reason I can see.

    Alex

  13. Re: broken client, or broken TCP stack?

    On Nov 7, 2:27 am, Alex Fraser wrote:
    > robop...@gmail.com wrote:
    > > On Nov 6, 3:53 pm, Alex Fraser wrote:
    > >> robop...@gmail.com wrote:
    > >>> On Nov 6, 7:49 am, robop...@gmail.com wrote:

    > [snip]
    > >>> When the server application is restarted, the server TCP sends a FIN
    > >>> ACK to the client TCP, which ACKs this right away. Within the
    > >>> debugger, however, I see the client app continue to write() on the
    > >>> socket handle, without error, even though no packets appear on the
    > >>> wire. After about 10 minutes, I restart the client application and
    > >>> see the client TCP send a RST to the server TCP.

    > [snip]
    > > The window size on the FIN is much bigger than 0 -- 38196 or so (I
    > > don't have the capture in front of me, but I remember checking this).

    >
    > OK.
    >
    > >> How do you restart the server?

    >
    > > killproc (sends a SIGTERM) then startproc, each wrapped by 'restart'
    > > => 'stop' then 'start' in an /etc/init.d script.

    >
    > Sorry, I didn't ask the right question. I was trying to understand what
    > happens to the server process(es) at a system call level when you
    > restart the server. What happens after SIGTERM is raised?
    >
    > If the connected socket in the server is closed (eg because the process
    > exits) with no unread data, a FIN being sent is normal, and if a segment
    > is received after that, a RST will be sent.


    It might be ugly to figure that out, but I'll try. The server
    application is running inside a JVM. I'll see if there's some
    explicit handling of the socket close/shutdown in the server
    application on SIGTERM, or if it just leaves everything open and has
    the JVM deal with it.

    Let me see what I can do.

    > If, as seems to be the case here, the client is periodically making
    > small (<=MSS) write() calls and the network is fast compared to the
    > write() frequency, I'd expect the following sequence:
    >
    > 1. The FIN is received by the client. This also acknowledges all data
    > previously sent by the client.
    > 2. Some time later, the client calls write(), which returns OK.
    > 3. A segment is transmitted.
    > 4. The segment is received and a RST sent.
    > 5. The RST is received.
    > 6. Some time later, the client calls write() again. This time it fails
    > with EPIPE or causes SIGPIPE to be raised.


    I took a screenshot of this section of the packet capture of the
    session:

    http://mysite.verizon.net/vze3jmtr/tmx/bug961.PNG

    .32 is the server side, .29 is the client side.

    Packet # 32 is the FIN ACK from the server-side. The window is 39168.
    Packet # 33 is the ACK of that.
    Packet # 34 is the RST from the client side, about 10 minutes later,
    when I finally restart the client app.

    In the meantime, the client app is happily writing to the socket
    handle and nothing is popping up on the wire. The client app is
    normal C / socket API stuff.

    I'll rerun the test and see what's happening to the SEND-Q on the
    client app side of the connection -- although that doesn't explain why
    nothing is showing up on the wire. The client app output rate is
    about 1400 bytes every 0.02ms (thumbnail estimate, but close enough to
    show it's generating output).

    > The question is, why does this sequence - in particular #3 which should
    > follow from #2 - not apply? Zero window, which you have ruled out, is
    > the only reason I can see.


    It's got me scratching my head, that's for sure.

    Thanks very much for your help so far.

