Looking for a clue: weird RST segments in mid-stream after sack'd segment?
Hi,
I'm trying to figure out this weird FTP upload problem we're having
connecting from one Linux box to another across two NAT boxes. We're
using passive mode connections and they seem to work just fine most of
the time. In rare instances, however, the connection just breaks down
in mid-stream.
I've tcpdumped liberally on the external interface of our NAT box, but
the results keep puzzling me. Maybe somebody can help me make sense of
what I'm seeing here.
I've uploaded a complete dump of an aborted and a completed connection
each (tcpdump options used: -vvtttS) at
[url]http://baetzler.de/sandbox/aborted.txt.gz[/url] and
[url]http://baetzler.de/sandbox/completed.txt.gz[/url] for your reference.
I'll summarise the relevant parts and my questions here.
1) Successfully completed data transfer:
000000 IP (tos 0x0, ttl 63, id 61087, offset 0, flags [DF], proto: TCP
(6), length: 60) client.xxx > server.yyy: SWE, cksum 0xcc03
(correct), 2129476676:2129476676(0) win 5840 <mss 1460,sackOK,timestamp
460247771 0,nop,wscale 7>
015726 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto: TCP (6),
length: 60) server.yyy > client.xxx: S, cksum 0x7e4c (correct),
1606907384:1606907384(0) ack 2129476677 win 5792 <mss
1460,sackOK,timestamp 3448154964 460247771,nop,wscale 2>
000128 IP (tos 0x0, ttl 63, id 61088, offset 0, flags [DF], proto: TCP
(6), length: 52) client.xxx > server.yyy: ., cksum 0xc384 (correct),
2129476677:2129476677(0) ack 1606907385 win 46
<nop,nop,timestamp 460247772 3448154964>
=> connection setup is fine: SACK is permitted both ways, window scaling
is on (client announces wscale 7, server wscale 2), no ECN.
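For reading the "win" values further down: the window scale is announced per direction in the SYN, and each side's later win field is shifted by the factor that same side announced. A minimal sketch using the values from this dump; the function name is only illustrative:

```python
def effective_window(win_field: int, wscale: int) -> int:
    """Window in bytes: the raw 16-bit win field shifted by the sender's wscale."""
    return win_field << wscale

# Values from the dump: the client SYN carries wscale 7, the server SYN wscale 2.
client_window = effective_window(46, 7)      # client's "win 46"
server_window = effective_window(32682, 2)   # server's "win 32682"
print(client_window, server_window)  # 5888 130728
```

The win values in the SYN segments themselves (5840 and 5792) are not scaled.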
[some packets later]
000006 IP (tos 0x0, ttl 63, id 61454, offset 0, flags [DF], proto: TCP
(6), length: 1500) client.xxx > server.yyy: .
2129992709:2129994157(1448) ack 1606907385 win 46 <nop,nop,timestamp
460247788 3448155004>
000038 IP (tos 0x0, ttl 63, id 61455, offset 0, flags [DF], proto: TCP
(6), length: 1500) client.xxx > server.yyy: .
2129994157:2129995605(1448) ack 1606907385 win 46 <nop,nop,timestamp
460247788 3448155004>
=> the client has just sent a bunch of segments to the server, filling
up the server's receive window
000030 IP (tos 0x0, ttl 55, id 57185, offset 0, flags [DF], proto: TCP
(6), length: 64) server.yyy > client.xxx: ., cksum 0xc112 (correct),
1606907385:1606907385(0) ack 2129864877 win 32682
<nop,nop,timestamp 3448155004 460247787,nop,nop,sack 1
{2129866325:2129867773}>
=> sack indicates packet loss or reordering - server reports
out-of-order segment?
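To make the SACK concrete: the cumulative ACK plus the SACK block describe which bytes the server is still missing. A small sketch with the numbers from the segment above; the helper name is made up for illustration:

```python
def sack_holes(cum_ack, sack_blocks):
    """Return (start, end) byte ranges above cum_ack not covered by SACK blocks."""
    holes = []
    edge = cum_ack
    for left, right in sorted(sack_blocks):
        if left > edge:
            holes.append((edge, left))
        edge = max(edge, right)
    return holes

# ack 2129864877 with sack 1 {2129866325:2129867773}
holes = sack_holes(2129864877, [(2129866325, 2129867773)])
print(holes)                                  # [(2129864877, 2129866325)]
print([end - start for start, end in holes])  # [1448]
```

So the server is reporting exactly one missing 1448-byte segment, followed by one segment it has already received. Sequence number wraparound is ignored here.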
000021 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6),
length: 40) client.xxx > server.yyy: R, cksum 0x4d4f (correct),
2129864877:2129864877(0) win 0
=> I don't understand why the client tcp sends this RST segment and why
it doesn't terminate the connection.
000005 IP (tos 0x0, ttl 55, id 57186, offset 0, flags [DF], proto: TCP
(6), length: 52) server.yyy > client.xxx: ., cksum 0x4e68 (correct),
1606907385:1606907385(0) ack 2129867773 win 32085
<nop,nop,timestamp 3448155004 460247787>
=> acks previously missing segments.
[continues to FIN]
2) Aborted connection
[Connection setup and options like above, so I'll jump right to the end]
000006 IP (tos 0x0, ttl 63, id 12194, offset 0, flags [DF], proto: TCP
(6), length: 1500) client.xxx > server.yyy: . 913295966:913297414(1448)
ack 787488674 win 46 <nop,nop,timestamp 461390389 3451011384>
039531 IP (tos 0x0, ttl 55, id 24493, offset 0, flags [DF], proto: TCP
(6), length: 52) server.yyy > client.xxx: ., cksum 0xbff2 (correct),
787488674:787488674(0) ack 913167790 win 32768 <nop,nop,timestamp
3451011394 461390387>
=> Server's receive window allows another packet
000139 IP (tos 0x0, ttl 63, id 12195, offset 0, flags [DF], proto: TCP
(6), length: 1500) client.xxx > server.yyy: . 913297414:913298862(1448)
ack 787488674 win 46 <nop,nop,timestamp 461390393 3451011394>
015779 IP (tos 0x0, ttl 55, id 24494, offset 0, flags [DF], proto: TCP
(6), length: 64) server.yyy > client.xxx: ., cksum 0x7743 (correct),
787488674:787488674(0) ack 913167790 win 32768 <nop,nop,timestamp
3451011398 461390387,nop,nop,sack 1 {913297414:913298862}>
=> Server receives out-of-order segment
000010 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6),
length: 40) client.xxx > server.yyy: R, cksum 0x3451 (correct),
913167790:913167790(0) win 0
=> Mysterious RST again
198903 IP (tos 0x0, ttl 63, id 12196, offset 0, flags [DF], proto: TCP
(6), length: 1500) client.xxx > server.yyy: . 913167790:913169238(1448)
ack 787488674 win 46 <nop,nop,timestamp 461390415 3451011394>
=> Client starts to resend lost packet(s)
015834 IP (tos 0x0, ttl 55, id 62381, offset 0, flags [DF], proto: TCP
(6), length: 40) server.yyy > client.xxx: R, cksum 0xf1da (correct),
787488674:787488674(0) win 0
=> Server gives up?
To summarize my questions:
- Why is the client's tcp sometimes sending an RST segment in reply to
an ACK with a SACK attached?
- Why does the server sometimes ignore the client's RST even though it
seems to be valid in all cases that I've seen?
- What could I do to improve connection resilience?
Your input is very much appreciated. If you tell me to RTFM, please
include name of the manual and chapter - I'm willing to do the work, but
I don't know where to start right now.
TIA,
Thomas
--
(signature left blank intentionally)
Re: Looking for a clue: weird RST segments in mid-stream after sack'd segment?
Thomas Bätzler wrote:[color=blue]
> Hi,
>
> I'm trying to figure out this weird FTP upload problem we're having
> connecting from one Linux box to another across two NAT boxes. We're
> using passive mode connections and they seem to work just fine most of
> the time. In rare instances, however, the connection just breaks down
> in mid-stream.
>
> I've tcpdumped liberally on the external interface of our NAT box, but
> the results keep puzzling me. Maybe somebody can help me make sense of
> what I'm seeing here.
>
> I've uploaded a complete dump of an aborted and a completed connection
> each (tcpdump options used: -vvtttS) at
> [url]http://baetzler.de/sandbox/aborted.txt.gz[/url] and
> [url]http://baetzler.de/sandbox/completed.txt.gz[/url] for your reference.
>
> I'll summarise the relevant parts and my questions here.
>
> 1) Successfully completed data transfer:
>
> 000000 IP (tos 0x0, ttl 63, id 61087, offset 0, flags [DF], proto: TCP
> (6), length: 60) client.xxx > server.yyy: SWE, cksum 0xcc03
> (correct), 2129476676:2129476676(0) win 5840 <mss 1460,sackOK,timestamp
> 460247771 0,nop,wscale 7>
> 015726 IP (tos 0x0, ttl 55, id 0, offset 0, flags [DF], proto: TCP (6),
> length: 60) server.yyy > client.xxx: S, cksum 0x7e4c (correct),
> 1606907384:1606907384(0) ack 2129476677 win 5792 <mss
> 1460,sackOK,timestamp 3448154964 460247771,nop,wscale 2>
> 000128 IP (tos 0x0, ttl 63, id 61088, offset 0, flags [DF], proto: TCP
> (6), length: 52) client.xxx > server.yyy: ., cksum 0xc384 (correct),
> 2129476677:2129476677(0) ack 1606907385 win 46
> <nop,nop,timestamp 460247772 3448154964>
>
> => connection setup is fine, we agree on wscale 2, SACK option, no ECN.
>
> [some packets later]
>
> 000006 IP (tos 0x0, ttl 63, id 61454, offset 0, flags [DF], proto: TCP
> (6), length: 1500) client.xxx > server.yyy: .
> 2129992709:2129994157(1448) ack 1606907385 win 46 <nop,nop,timestamp
> 460247788 3448155004>
> 000038 IP (tos 0x0, ttl 63, id 61455, offset 0, flags [DF], proto: TCP
> (6), length: 1500) client.xxx > server.yyy: .
> 2129994157:2129995605(1448) ack 1606907385 win 46 <nop,nop,timestamp
> 460247788 3448155004>
>
> => just sent a bunch of segments to the server, filling up its transmit
> window
>
> 000030 IP (tos 0x0, ttl 55, id 57185, offset 0, flags [DF], proto: TCP
> (6), length: 64) server.yyy > client.xxx: ., cksum 0xc112 (correct),
> 1606907385:1606907385(0) ack 2129864877 win 32682
> <nop,nop,timestamp 3448155004 460247787,nop,nop,sack 1
> {2129866325:2129867773}>
>
> => sack indicates packet loss or reordering - server reports
> out-of-order segment?
>
> 000021 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6),
> length: 40) client.xxx > server.yyy: R, cksum 0x4d4f (correct),
> 2129864877:2129864877(0) win 0
>
> => I don't understand why the client tcp sends this RST segment and why
> it doesn't terminate the connection.[/color]
The RST is bogus. Its SEQ number is 2129864877, which is below
2129867773 (data the server has apparently already seen), which in turn
is below 2129995605 (the last thing we have seen the client send before
the RST).
AFAIK, the RST SEQ number must be within the receive window in order to
be acceptable. If the client had actually sent it, I would expect
exactly 2129995605, which is both the client's next SEQ and the last
ACK (2129864877) + 32682*4.
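The window arithmetic can be checked with a few lines. This is only a sketch of the classic RFC 793 style test (RCV.NXT <= SEG.SEQ < RCV.NXT + RCV.WND). The function name is made up, and the server's RCV.NXT values are inferred from the ACKs in the posted dump, which was taken on the NAT box and not on the server itself. It also assumes wscale 2 for the aborted connection, since its setup was described as the same.

```python
MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits

def rst_acceptable(seq, rcv_nxt, win_field, wscale):
    """True if RCV.NXT <= seq < RCV.NXT + window, computed modulo 2**32."""
    rcv_wnd = win_field << wscale
    return (seq - rcv_nxt) % MOD < rcv_wnd

# Completed transfer: the server's next ACK is 2129867773 (win 32085), so
# assume RCV.NXT had already moved there when the RST reached the server.
completed = rst_acceptable(2129864877, 2129867773, 32085, 2)

# Aborted transfer: the server is still stuck at ACK 913167790 (win 32768).
aborted = rst_acceptable(913167790, 913167790, 32768, 2)

# Right window edge from the SACK'd ACK; it equals the client's next SEQ.
edge = (2129864877 + (32682 << 2)) % MOD

print(completed, aborted, edge)  # False True 2129995605
```

If those assumptions hold, this would fit the server ignoring the RST in the completed transfer (the hole was already filled) but acting on it in the aborted one (the hole was still open).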
I think someone fakes the RST in reply to seeing ACK 2129864877 from the
server.
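The IP headers in the posted dump seem to point the same way: the client's data segments show ttl 63 with incrementing IP ids, while the RSTs show ttl 64 and id 0. A rough sketch that pulls those fields out of two records from the aborted trace (rejoined onto one line each). The regex is a throwaway helper for this particular tcpdump text layout, not a general parser.

```python
import re

# Two records from the aborted trace in this thread, one line each.
LINES = [
    "000139 IP (tos 0x0, ttl 63, id 12195, offset 0, flags [DF], proto: TCP (6), "
    "length: 1500) client.xxx > server.yyy: . 913297414:913298862(1448) "
    "ack 787488674 win 46",
    "000010 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), "
    "length: 40) client.xxx > server.yyy: R, cksum 0x3451 (correct), "
    "913167790:913167790(0) win 0",
]

# Hypothetical helper pattern: ttl, IP id, source, destination, TCP flags.
HEADER = re.compile(
    r"ttl (\d+), id (\d+),.*?length: \d+\) (\S+) > (\S+): ([A-Z.]+)"
)

def parse(line):
    """Extract the fields of interest, or None if the line does not match."""
    m = HEADER.search(line)
    if m is None:
        return None
    ttl, ip_id, src, dst, flags = m.groups()
    return {"ttl": int(ttl), "id": int(ip_id), "src": src, "dst": dst, "flags": flags}

packets = [p for p in map(parse, LINES) if p is not None]
data_ttls = {p["ttl"] for p in packets if "R" not in p["flags"]}
rst_ttls = {p["ttl"] for p in packets if "R" in p["flags"]}
print(data_ttls, rst_ttls)  # {63} {64}
```

Assuming the client starts with a TTL of 64, a packet still showing ttl 64 on the NAT box's external interface has probably not been forwarded from the client at all. That is only a hint, not proof of where the RST comes from.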
Marc
--
_ _ Marc A. Donges +49 721 6904-2130
'v' Klosterweg 28 / E110
/ \ 76131 Karlsruhe
W W [url]http://www.hadiko.de/~marc/[/url]