I need a Linux TCP stack guru - Embedded



  1. I need a Linux TCP stack guru

    I am looking for someone who knows the internals of the TCP implementation
    on Linux (2.6.10 or thereabouts). Here's a brief overview of the issue I'm
    trying to resolve:

    Background:
    I'm trying to optimize transfers over a local GigE connection. The Linux
    machine (MIPS) is supposed to send 500K+ of data using a single send()
    function from the test application. The socket buffer size is set to more
    than 1MB. Nagle is disabled (not that it should matter in this case). I've
    essentially disabled congestion control by initializing tcp_cwnd to something
    like 128. I've done everything I can think of to make sure the kernel and/or
    TCP stack have no reason to do anything but send this chunk of TCP data as
    fast as possible.
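
For readers following along, the socket-level part of this setup can be sketched as below. This is only an illustration, not the original test application: the function name `make_bulk_sender` and the 1 MiB default are assumptions. On Linux the kernel doubles the requested SO_SNDBUF value for bookkeeping and caps it at net.core.wmem_max, so the value read back may differ from the one requested.

```python
import socket


def make_bulk_sender(sndbuf_bytes=1 << 20):
    """Create a TCP socket tuned for one large send() (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask for a send buffer large enough to hold the whole 500K+ chunk.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    # Disable Nagle so small trailing segments are not held back.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_bulk_sender()
    # Report what the kernel actually granted; it may be capped or doubled.
    print("SO_SNDBUF granted:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    print("TCP_NODELAY:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
```

The initial congestion window is not a per-socket option here; the tcp_cwnd change mentioned above is a kernel-side modification and is not covered by this sketch.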

    Problem:
    Whenever the Linux TCP stack receives a packet from the peer indicating a
    larger window size, it seems to cause a delay of about 350 microseconds
    before additional TCP processing occurs on this connection. This occurs
    BEFORE the peer's window ever gets too small for the Linux machine to
    stop filling it, so it's not that the window closed and Linux had to stop
    sending data to the peer.

    Analysis:
    Doing the math, this chunk should transfer in under 5 milliseconds
    (really, closer to 4 msec). Instead, it's taking around 20 msec. There
    are 41 of these window-opening delay events in my test transfer, adding
    at least 15 msec to the transfer time.
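
The arithmetic behind these figures can be checked with a short script. The assumptions are loud ones: "500K" is taken as 500 KiB, the link as exactly 1 Gbit/s, the MSS as 1460 bytes (no TCP timestamps), and each full segment as 1538 bytes on the wire (headers, FCS, preamble and inter-frame gap). None of these values are stated in the post.

```python
PAYLOAD_BYTES = 500 * 1024      # assumed meaning of "500K"
LINK_BPS = 1_000_000_000        # nominal GigE line rate
MSS = 1460                      # assumed TCP payload per segment
WIRE_BYTES_PER_SEGMENT = 1538   # 1460 + TCP/IP/Ethernet headers, FCS, preamble, gap
STALL_EVENTS = 41               # from the post
STALL_SECONDS = 350e-6          # "about 350 microseconds" per event

# Time to move the payload alone, ignoring all framing overhead.
raw_ms = PAYLOAD_BYTES * 8 / LINK_BPS * 1e3

# Same transfer with per-segment framing overhead included.
wire_ms = PAYLOAD_BYTES * (WIRE_BYTES_PER_SEGMENT / MSS) * 8 / LINK_BPS * 1e3

# Total time lost to the window-update stalls.
stall_ms = STALL_EVENTS * STALL_SECONDS * 1e3

total_ms = wire_ms + stall_ms

if __name__ == "__main__":
    print(f"payload only : {raw_ms:.3f} ms")
    print(f"with framing : {wire_ms:.3f} ms")
    print(f"stalls       : {stall_ms:.2f} ms")
    print(f"estimated    : {total_ms:.2f} ms")
```

Under these assumptions the ideal transfer is a little over 4 msec and the stalls add roughly 14 msec, for a total just under 19 msec. That is close to the observed 20 msec; the stall total comes out slightly under the 15 msec quoted, which suggests the real per-event delay was a bit above 350 microseconds.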

    I don't know if I've explained this as clearly as I'd like. I could really
    use a quick chat with someone who knows the workings of the Linux stack
    inside and out (especially with regard to congestion control and
    ACK/window processing).

    Patrick
    ========= For LAN/WAN Protocol Analysis, check out PacketView Pro! =========
    Patrick Klos Email: patrick@klos.com
    Klos Technologies, Inc. Web: http://www.klos.com/
    ==================== http://www.loving-long-island.com/ ====================

  2. Re: I need a Linux TCP stack guru

    Are you being bitten by TCP's slow start feature here? TCP connections
    do a slow start in case the connection crosses a congested link, so
    that they don't make the situation worse. After some period with good
    ACKs and a good RTT, TCP winds up to full throughput.

    It's a known problem with TCP on very fast uncongested networks, and
    can restrict TCP throughput. It also hits apps where there are lots and
    lots of small TCP sessions (like the web :-().

    Check out RFC 2001; Google returns loads of refs.
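
As a rough illustration of why slow start matters for a transfer this size, the sketch below counts how many round trips an idealized sender needs when the congestion window doubles every RTT. It is a textbook simplification, not the Linux implementation: it ignores the receiver window, ssthresh, delayed ACKs and losses, and the function name and the 1460-byte MSS are assumptions.

```python
import math


def slow_start_rounds(total_segments, initial_cwnd):
    """Count RTT rounds needed to send total_segments with cwnd doubling each round."""
    sent = 0
    cwnd = initial_cwnd
    rounds = 0
    while sent < total_segments:
        sent += cwnd   # one window's worth of segments goes out this round
        cwnd *= 2      # idealized slow start: window doubles per RTT
        rounds += 1
    return rounds


# 500 KiB split into 1460-byte segments (assumed MSS).
SEGMENTS = math.ceil(500 * 1024 / 1460)

if __name__ == "__main__":
    for iw in (2, 128):
        print(f"initial cwnd {iw:3d}: {slow_start_rounds(SEGMENTS, iw)} round trips")
```

With an initial window of 2 segments the model needs 8 round trips; with the initial window of 128 described in the original post it needs 2, which is why raising the initial cwnd removes most of the slow start penalty on a LAN.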





  3. Re: I need a Linux TCP stack guru

    Jim Jackson wrote:
    >Are you being bitten by TCP's slow start feature here? TCP connections
    >do a slow start in case the connection crosses a congested link, so
    >that they don't make the situation worse. After some period with good
    >ACKs and a good RTT, TCP winds up to full throughput.


    Thanks for the reply. Although slow start may also be involved, I determined
    that the primary cause of the delays I was seeing was interrupt
    coalescing. When I disabled interrupt coalescing on the Ethernet adapter,
    my transfer times became consistently shorter.
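
For anyone wanting to try the same thing, a minimal sketch follows, assuming the adapter's driver exposes coalescing settings through ethtool (the post does not say how coalescing was disabled, and many embedded drivers need a driver-specific method instead). The helper name and the interface name "eth0" are hypothetical; the script only builds and prints the command rather than running it.

```python
import shlex


def coalesce_off_command(iface):
    """Build an ethtool command asking for an RX interrupt with no added delay.

    Assumes the driver accepts the rx-usecs coalescing parameter; some
    drivers reject it or require other parameters to be set as well.
    """
    return ["ethtool", "-C", iface, "rx-usecs", "0"]


def coalesce_show_command(iface):
    """Build the ethtool command that prints the current coalescing settings."""
    return ["ethtool", "-c", iface]


if __name__ == "__main__":
    # Print the commands instead of executing them; running them needs root
    # and a driver that supports the coalescing interface.
    print(shlex.join(coalesce_show_command("eth0")))
    print(shlex.join(coalesce_off_command("eth0")))
```

Turning coalescing off trades CPU load for latency: each received frame, including each ACK, raises its own interrupt, so the stack sees window updates sooner at the cost of a higher interrupt rate.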

    >It's a known problem with TCP on very fast uncongested networks, and
    >can restrict TCP throughput. It also hits apps where there are lots and
    >lots of small TCP sessions (like the web :-().
    >
    >Check out RFC 2001; Google returns loads of refs.


    I'll check that out. I'm still seeing symptoms that appear to be
    slow-start-like, but they don't happen all the time. Does Linux TCP
    "remember" congestion information on a per-interface basis rather than
    on a per-connection basis?

    Patrick

  4. Re: I need a Linux TCP stack guru


    You wrote:
    > >Check out RFC 2001; Google returns loads of refs.


    > I'll check that out. I'm still seeing symptoms that appear to be
    > slow-start-like, but they don't happen all the time. Does Linux TCP
    > "remember" congestion information on a per-interface basis rather than
    > on a per-connection basis?


    I can't see how it could. It might cache connection info by destination
    in case there are multiple TCP sessions to the same endpoint - it
    sounds like it would be a neat optimisation - but sorry, I'm no Linux
    kernel TCP gearhead, so I don't know. What kernel version are you using?




  5. Re: I need a Linux TCP stack guru

    Patrick Klos wrote:
    > Does Linux TCP "remember" congestion information on a per-interface
    > basis rather than on a per-connection basis?


    It's kept in the metrics portion of the routing cache. It's based on
    broader route selection criteria, not the interface. Stored metrics
    include things like RTT, cwnd, initial cwnd, slow-start threshold
    (ssthresh), PMTU, negotiated MSS, etc. TCP also has per-connection state,
    of course. Storing metrics in the routing tables seems pretty common; I
    know several other TCP implementations that do the same (e.g. Sun
    Solaris, at least as of a few years ago). This is the obvious way of
    doing it, since the route picked greatly affects network behavior, and
    two connections to the same address can end up with different routes,
    and so may need different metrics.
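
One practical consequence is that metrics saved from an earlier connection can seed a later one to the same destination, which may explain symptoms that come and go between test runs. Linux exposes a sysctl, net.ipv4.tcp_no_metrics_save, that controls whether metrics are saved when a connection closes. The sketch below only reads it; the helper names are hypothetical, and whether the sysctl is present on a given 2.6.x kernel build should be checked rather than assumed.

```python
from pathlib import Path

SYSCTL_PATH = "/proc/sys/net/ipv4/tcp_no_metrics_save"


def parse_no_metrics_save(text):
    """Return True if metrics saving is enabled (sysctl value 0), else False."""
    return text.strip() == "0"


def metrics_saving_enabled(path=SYSCTL_PATH):
    """Read the sysctl; return None when the file is missing or unreadable."""
    try:
        return parse_no_metrics_save(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    state = metrics_saving_enabled()
    if state is None:
        print("tcp_no_metrics_save not available on this system")
    elif state:
        print("TCP saves per-route metrics when connections close")
    else:
        print("TCP metric saving is disabled")
```

Writing 1 to that sysctl (as root) for the duration of a benchmark is one way to keep each run from inheriting cwnd or ssthresh from the previous one.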
