Nagle's Algorithm Blues - TCP-IP



Thread: Nagle's Algorithm Blues

  1. Nagle's Algorithm Blues

    Hi,

    I have a question.

    The TCP stack has Nagle's Algo enabled. TCP sends a 100 byte packet to
    the other end. Assume that it gets lost. Now, TCP gets a 50 byte packet
    from the upper layer and needs to again send it to the same remote end.
    TCP waits till it receives an ACK from the other end before it sends a
    new segment carrying these 50 bytes.

    Since the 100 byte TCP packet was lost, the sender's retransmission
    timer expires and it retransmits. Assume that for some reason this
    retransmission also gets lost.

    My question is this:

    For how long will the sender hold on to this 50 byte packet? Will it
    wait till eternity till it receives an ACK for the 100 byte packet, or
    will it, after some threshold, send this packet?

    Thanks,
    Abhishek


  2. Re: Nagle's Algorithm Blues

    >For how long will the sender hold on to this 50 byte packet? Will it
    >wait till eternity till it receives an ACK for the 100 byte packet, or
    >will it after some threshold, send this packet?


    Sounds a bit like homework to me, but actually a not-yet-ACKed
    previously sent segment should not delay one that is ready to be
    transmitted, unless sending this new segment would violate the
    receive window announced so far by the receiving end. In other words,
    retransmitting segments is done elsewhere - at least in my
    implementation. So, to answer your question, the sender will send out
    the 50 bytes as soon as Nagle says it's OK. Last but not least, the
    receiving end usually is able to handle out-of-order segments and
    buffer the segment carrying the 50 bytes before they are delivered to
    the receiving application layer, up until the first 100 bytes are
    finally retransmitted and received. Then, once the receiver gets the
    50 out-of-order bytes, it will send out an ACK for the last segment
    received in order (the one before the 100 bytes), which in turn could
    (depending on the implementation of the sending stack) initiate the
    immediate retransmission of the missing 100 byte segment.

    So, the bottom line is that the Nagle algorithm is in almost all
    cases a good thing and almost never creates a problem. I think more
    problems are introduced (performance-wise) by people turning it off
    "just because" than because it's active.

    HTH

    Markus


  3. Re: Nagle's Algorithm Blues

    > Sounds a bit like homework to me, but actually a not-yet-ACKed
    > previously sent segment should not delay one that is ready to be
    > transmitted, unless sending this new segment would violate the
    > receive window announced so far by the receiving end. In other words,
    > retransmitting segments is done elsewhere - at least in my
    > implementation. So, to answer your question, the sender will send out


    Yes, retransmission is done somewhere else.

    But if you look at your implementation, you will see that in some
    cases the idleness check fails and you delay sending the segment
    until you receive the ACK from the other end.

    In tcp_output (BSD) idle is TRUE if maximum sequence number sent
    (snd_max) equals the oldest unacknowledged sequence number (snd_una),
    that is, if an ACK is not expected from the other end. Thus for us,
    idle would be set to 0 as we expect an ACK for the 100 bytes that we
    had sent and for which we have not yet received an ACK.

    > the 50 bytes as soon as Nagle says it's OK. Last but not least, the
    > receiving end usually is able to handle out-of-order segments and
    > buffer the segment carrying the 50 bytes before they are delivered to
    > the receiving application layer, up until the first 100 bytes are
    > finally retransmitted and received. Then, once the receiver gets the
    > 50 out-of-order bytes, it will send out an ACK for the last segment
    > received in order (the one before the 100 bytes), which in turn could
    > (depending on the implementation of the sending stack) initiate the
    > immediate retransmission of the missing 100 byte segment.


    Definitely.

    Cheers



  4. Re: Nagle's Algorithm Blues

    Abhishek wrote:
    > Hi,
    >
    > I have a question.
    >
    > The TCP stack has Nagle's Algo enabled. TCP sends a 100 byte packet to
    > the other end. Assume that it gets lost. Now, TCP gets a 50 byte packet
    > from the upper layer and needs to again send it to the same remote end.
    > TCP waits till it receives an ACK from the other end before it sends a
    > new segment carrying these 50 bytes.
    >
    > Since the 100 byte TCP pkt was lost, the senders retransmission timer
    > expires and it retransmits again. Assume that for some reason this also
    > gets lost.
    >
    > My question is this:
    >
    > For how long will the sender hold on to this 50 byte packet? Will it
    > wait till eternity till it receives an ACK for the 100 byte packet, or
    > will it after some threshold, send this packet?
    >
    > Thanks,
    > Abhishek
    >


    Neither. TCP will not send the 50 byte packet as long as it has not
    received the ACK for the 100 byte packet, since Nagle prevents this.
    However, there are two scenarios that could happen, which are
    implementation dependent.

    First, when it comes time to retransmit the 100 byte packet, it is
    perfectly reasonable for TCP to send all 150 bytes in a single
    packet at this point. Since TCP is a byte stream protocol, it
    can combine the data.

    On the other hand, some TCP stacks treat the data more like packets
    than streams, and will not send the 50 bytes. In that case it will
    have to hold on to them. However it is not for eternity. Since the
    100 bytes are not being ACK'd, the connection will eventually abort
    after what is commonly known as the tcp_ip_abort_interval. This is
    typically 6 minutes, but it could be something else.

    Of course, the sender keeps a copy of all the data it has sent
    until it receives the ACK for that data.

    --
    blu

    Rose are #FF0000, Violets are #0000FF. All my base are belong to you.
    ----------------------------------------------------------------------
    Brian Utterback - OP/N1 RPE, Sun Microsystems, Inc.
    Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom

  5. Re: Nagle's Algorithm Blues

    Brian Utterback wrote:
    > after what is commonly known as the tcp_ip_abort_interval. This is


    Well, "commonly known" among those with TCP/IP stacks sharing a
    "Mentat" ancestry at least.

    rick jones
    --
    Wisdom Teeth are impacted, people are affected by the effects of events.
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  6. Re: Nagle's Algorithm Blues

    Abhishek wrote:
    > For how long will the sender hold on to this 50 byte packet? Will it
    > wait till eternity till it receives an ACK for the 100 byte packet,
    > or will it after some threshold, send this packet?


    Even if it were to send the 50 bytes before the previously sent 100
    bytes were ACKnowledged, the receiving application will not receive
    the 50 bytes until after it receives the previous 100 bytes.

    rick jones
    IIRC the rest has been handled in other responses.
    --
    a wide gulf separates "what if" from "if only"
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  7. Re: Nagle's Algorithm Blues

    On 2006-05-07, Markus Zingg wrote:

    > So, the bottom line is that Nagle Algorithm is in almost all cases a
    > good thing and almost never creates a problem. I think more problems
    > are introdruced (performance wise) by people turning it off "just
    > because" then because it's active.


    I wanted to chime in here not to refute the statement above, but just
    to add another opinion. Nagle can be bad for real-time applications
    such as the transmission of status information critical to a factory
    operation, perhaps a medical concern, or financial industry
    information where profit is tied to the accuracy and timeliness of
    data. These cases are rare, I'd think. But there are practical cases
    where 1 ms actually counts, especially in financial institutions
    involved in securities trading.

    Nagle accommodates the lossy-network case - in the instances I
    mentioned above, the TCP/IP internetwork has been specifically
    designed to be as lossless as possible and is usually only as
    geographically large as a city or two, with long-haul links comprised
    of redundant high-speed optical transports.

    /dmfh

    ----
    __| |_ __ / _| |_ ____ __
    dmfh @ / _` | ' \| _| ' \ _ / _\ \ /
    \__,_|_|_|_|_| |_||_| (_) \__/_\_\
    ----

  8. Re: Nagle's Algorithm Blues

    dmfh@n0spam.dmfh.cx.spamn0t wrote...
    > On 2006-05-07, Markus Zingg wrote:
    >
    > > So, the bottom line is that Nagle Algorithm is in almost all cases a
    > > good thing and almost never creates a problem. I think more problems
    > > are introdruced (performance wise) by people turning it off "just
    > > because" then because it's active.

    >
    > I wanted to chime in here not to refute the statement above, but just
    > add another opinion. Nagle can be bad for real-time applications such
    > as the transmission of status information critical to a factory
    > operation, perhaps a medical concern, and financial industry
    > information where profit is tied to the accuracy and timeliness of
    > data. These cases are rare, I'd think. There are practical
    > cases where 1ms actually counts, especially in financial institutions
    > involved in securities trading.
    >
    > Nagle accommodates for a lossy network case - in the instances I
    > mentioned above the TCP/IP internetwork has been specifically
    > designed to be as loss-less as possible and is usually only as
    > geographically large as a city or two with long-haul links comprised
    > of redundant high-speed optical transports.


    Yep. Which is exactly why you're often better off using UDP @ L3 for
    those types of applications, and writing whatever customized timing
    and msg loss handling logic into the app as needed. (Flamesuit on.)

    Or, as in your example case of the factory operations app, perhaps
    using a deterministic network structure rather than Ethernet (which
    is what all the factory operations systems I am aware of do).

    --
    NBC, ABC, CBS, CNN, FNC, and MSNBC: Disney Channel for grownups.


  9. Re: Nagle's Algorithm Blues

    On 2006-05-23, udp4me@domain.invalid wrote:

    > dmfh@n0spam.dmfh.cx.spamn0t wrote...


    >> Nagle accommodates for a lossy network case - in the instances I
    >> mentioned above the TCP/IP internetwork has been specifically
    >> designed to be as loss-less as possible and is usually only as
    >> geographically large as a city or two with long-haul links comprised
    >> of redundant high-speed optical transports.

    >
    > Yep. Which is exactly why you're often better off using UDP @ L3 for
    > those types of applications, and writing whatever customized timing
    > and msg loss handling logic into the app as needed. (Flamesuit on.)


    No flamesuit required. I much prefer healthy debate to "being
    right" - nothing much comes out of just being right. I agree with
    you that UDP @ L3 is an answer for this - in financial networks
    various multicast topologies arise out of this need. TCP/IP can be
    tweaked up for high performance and some of the stack studies I've
    read from the mid-1990s on are wonderful. Solaris 10 seems to be
    delivering on this promise as well.

    With abundant CPU resources available, moving flow control and
    message verification into the application is possible.
    Unfortunately, a lot of the messaging middleware seems to abstract
    the application developer away from what I'd call "smart use" of
    both the messaging and network transport layers. Sending messages
    much larger than the physical MTU of a network interface, and
    relying on the middleware to deliver them without detailed knowledge
    of how the underlying OS IP stack buffers and processes packets,
    leads to corner-case conditions that pop up at undesirable moments!

    The penchant of the industry to encapsulate protocols in protocols
    more than "one layer deep" - such as TCP carrying HTTP, in which is
    contained, say, SOAP or a DCE RPC transaction - makes this problem
    more complex. It is, however, part of the promise that the old OSI
    model is delivering upon, however fuzzy L4-L7 has become.

    And as for Ethernet topologies, I do miss vampire taps and
    repeaters, but I don't miss mucking up a long piece of RG-8!

    /dmfh

    ----
    __| |_ __ / _| |_ ____ __
    dmfh @ / _` | ' \| _| ' \ _ / _\ \ /
    \__,_|_|_|_|_| |_||_| (_) \__/_\_\
    ----

  10. Re: Nagle's Algorithm Blues

    udp4me@domain.invalid wrote:

    >Yep. Which is exactly why you're often better off using UDP @ L3 for
    >those types of applications, and writing whatever customized timing
    >and msg loss handling logic into the app as needed. (Flamesuit on.)


    That's perfectly rational, provided you include congestion detection
    and avoidance in your UDP-based application protocol.


    >Or, as in your example case of the factory operations app, perhaps
    >using a deterministic network structure, rather than Ethernet (which
    >is what all factory operations systems I am aware of, do).


    But that is horse manure, no matter how much or for how many decades
    the deterministic snake oil vendors have been pumping their wares.

    The trouble is that when it comes down to it, no network is what the
    MAP, TOP, token ring, FDDI, and other deterministic salescritters
    mean by deterministic. On the other hand, when not overloaded,
    Ethernet is just as deterministic as the rest. One of my old favorite
    ways to show the nature of the snake oil was to ask about worst-case
    token latency on FDDI rings. The answer is a whole lot of seconds,
    and in fact longer than the worst-case CSMA/CD backoff. Then there is
    the non-determinism in any network when bit rot happens. No one who
    has fought token ring style networks when something in the ring is
    flaky can honestly talk about determinism.

    There are a bunch of good reasons to choose something other than
    classic CSMA/CD on copper on many factory floors, although none of
    them have anything to do with CSMA/CD itself.

    Then there is the fact that "Ethernet" today generally involves no
    CSMA/CD. Ethernet has become roughly what HP's 100VG-AnyLAN
    salescritters said their pet was going to be, and what it would have
    been if they'd not tried to make it too general and so made its
    worst-case latency worse than 100 Mbit/s CSMA/CD.
    --
    Vernon Schryver vjs@rhyolite.com
