Nagle Algorithm - TCP-IP


Thread: Nagle Algorithm

  1. Nagle Algorithm

    I'm a little confused with regard to the Nagle algorithm. From
    googling I've read that the idea behind this algorithm is to reduce
    the number of small packets transmitted across the wire by introducing
    a short 100ms delay for packets smaller than some minimum. I believe
    this minimum is controlled by tcp_naglim_def on Solaris systems.

    #ndd -get /dev/tcp tcp_naglim_def
    4095

    My question is how to properly disable this behavior when that may be
    desired. Is it sufficient to disable the behavior on the server side
    of the tcp connection, or must something be changed on both sides to
    effectively disable it?

    TIA,
    Vic

  2. Re: Nagle Algorithm

    Vic wrote:
    > My question is how to properly disable this behavior when that may be
    > desired. Is it sufficient to disable the behavior on the server side
    > of the tcp connection or must something be changed on both sides to
    > effectively disable.


    Since the Nagle algorithm can only be applied to outgoing data, and since
    there is no standard way to control it from other network nodes, disabling
    it on your server would affect only data outbound from that server, not
    data outbound from the clients.

    You would need to write the client and server apps such that they both set
    the TCP_NODELAY option (typically using the setsockopt function) on the
    socket(s) you wish to disable the algorithm on.

  3. Re: Nagle Algorithm

    It has been a while since I've trotted-out this boilerplate, but on
    the off chance it might be helpful to you:

    In broad terms, whenever an application does a send() call, the logic
    of the Nagle algorithm is supposed to go something like this:

    1) Is the quantity of data in this send, plus any queued, unsent data,
    greater than the MSS (Maximum Segment Size) for this connection? If
    yes, send the data in the user's send now (modulo any other
    constraints such as receiver's advertised window and the TCP
    congestion window). If no, go to 2.

    2) Is the connection to the remote otherwise idle? That is, is there
    no unACKed data outstanding on the network? If yes, send the data in
    the user's send now. If no, queue the data and wait. Either the
    application will continue to call send() with enough data to get to a
    full MSS-worth of data, or the remote will ACK all the currently sent,
    unACKed data, or our retransmission timer will expire.

    Now, where applications run into trouble is when they have what might
    be described as "write, write, read" behaviour, where they present
    logically associated data to the transport in separate 'send' calls
    and those sends are typically less than the MSS for the connection.
    It isn't so much that they run afoul of Nagle as they run into issues
    with the interaction of Nagle and the other heuristics operating on
    the remote. In particular, the delayed ACK heuristics.

    When a receiving TCP is deciding whether or not to send an ACK back to
    the sender, in broad handwaving terms it goes through logic similar to
    this:

    a) is there data being sent back to the sender? if yes, piggy-back the
    ACK on the data segment.

    b) is there a window update being sent back to the sender? if yes,
    piggy-back the ACK on the window update.

    c) has the standalone ACK timer expired?

    Window updates are generally triggered by the following heuristics:

    i) would the window update be for a non-trivial fraction of the window
    - typically somewhere at or above 1/4 the window, that is, has the
    application "consumed" at least that much data? if yes, send a
    window update. if no, check ii.

    ii) would the window update be for at least 2*MSS worth of data, that
    is, has the application "consumed" at least 2*MSS? if yes, send a
    window update, if no, wait.

    Now, going back to that write, write, read application, on the sending
    side, the first write will be transmitted by TCP via logic rule 2 -
    the connection is otherwise idle. However, the second small send will
    be delayed as there is at that point unACKnowledged data outstanding
    on the connection.

    At the receiver, that small TCP segment will arrive and will be passed
    to the application. The application does not have the entire app-level
    message, so it will not send a reply (data to TCP) back. The typical
    TCP window is much much larger than the MSS, so no window update would
    be triggered by heuristic i. The data just arrived is < 2*MSS, so no
    window update from heuristic ii. Since there is no window update, no
    ACK is sent by heuristic b.

    So, that leaves heuristic c - the standalone ACK timer. That ranges
    anywhere between 50 and 200 milliseconds depending on the TCP stack in
    use.

    If you've read this far, we can now take a look at the effect of
    various things touted as "fixes" to applications experiencing this
    interaction. We take as our example a client-server application where
    both the client and the server are implemented with a write of a small
    application header, followed by application data. First, the
    "default" case which is with Nagle enabled (TCP_NODELAY _NOT_ set) and
    with standard ACK behaviour:

    Client                             Server
    Req Header ->
                                    <- Standalone ACK after Nms
    Req Data ->
                                    <- Possible standalone ACK
                                    <- Rsp Header
    Standalone ACK ->
                                    <- Rsp Data
    Possible standalone ACK ->


    For two "messages" we end-up with at least six segments on the wire.
    The possible standalone ACKs will depend on whether the server's
    response time, or client's think time is longer than the standalone
    ACK interval on their respective sides. Now, if TCP_NODELAY is set we
    see:


    Client                             Server
    Req Header ->
    Req Data ->
                                    <- Possible Standalone ACK after Nms
                                    <- Rsp Header
                                    <- Rsp Data
    Possible Standalone ACK ->

    In theory, we are down to four segments on the wire, which seems good,
    but frankly we can do better. First though, consider what happens
    when someone disables delayed ACKs:

    Client                             Server
    Req Header ->
                                    <- Immediate Standalone ACK
    Req Data ->
                                    <- Immediate Standalone ACK
                                    <- Rsp Header
    Immediate Standalone ACK ->
                                    <- Rsp Data
    Immediate Standalone ACK ->

    Now we definitely see 8 segments on the wire. It will also be that way
    if both TCP_NODELAY is set and delayed ACKs are disabled.

    How about if the application did the "right" thing in the first place?
    That is, sent the logically associated data at the same time:


    Client                             Server
    Request ->
                                    <- Possible Standalone ACK
                                    <- Response
    Possible Standalone ACK ->

    We are down to two segments on the wire.

    For "small" packets, the CPU cost is about the same regardless of data
    or ACK. This means that the application which is making the proper
    gathering send call will spend far fewer CPU cycles in the networking
    stack.

    rick jones
    --
    denial, anger, bargaining, depression, acceptance, rebirth...
    where do you want to be today?
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  4. Re: Nagle Algorithm

    On Dec 12, 1:48 pm, Vic wrote:
    > I'm a little confused with regard to the the Nagle algorithm. From
    > googling I've read that the idea behind this algorithm is to reduce
    > the number of small packets transmitted across the wire by introducing
    > a short 100ms delay for packets smaller than some minimum. I believe
    > this minimum is controlled by tcp_naglim_def on Solaris systems.
    >
    > #ndd -get /dev/tcp tcp_naglim_def
    > 4095
    >
    > My question is how to properly disable this behavior when that may be
    > desired. Is it sufficient to disable the behavior on the server side
    > of the tcp connection or must something be changed on both sides to
    > effectively disable.


    Before you go any further, you should understand that about 95% of the
    time people think they need to disable Nagle, their belief is actually
    incorrect. Even if disabling Nagle actually solves their problem, it
    is still almost always the wrong way to solve the problem.

    Can you explain why you think you need to disable Nagle?

    If your application runs over the Internet, you are almost certainly
    wrong in your belief that disabling Nagle is the right thing to do. If
    it's a pure LAN application, you might be right, but it's still very
    unlikely.

    Did you know that Nagle has no effect on request/response protocols?
    So it is never appropriate to disable Nagle on web servers.

    DS

  5. Re: Nagle Algorithm

    David Schwartz wrote:
    > Even if disabling Nagle actually solves their problem, it is still
    > almost always the wrong way to solve the problem.


    Better to say "makes the symptoms go away."

    > Can you explain why you think you need to disable Nagle?


    > If your application runs over the Internet, you are almost certainly
    > wrong in your belief that disabling Nagle is the right thing to do. If
    > it's a pure LAN application, you might be right, but it's still very
    > unlikely.


    > Did you know that Nagle has no effect on request/response protocols?


    Properly implemented, single-request-outstanding request/response
    protocols at least - 99 times out of 10.

    > So it is never appropriate to disable Nagle on web servers.


    If there is a stream of logically associated "stuff" flowing, and each
    chunk of logically associated data is sub-MSS, then it might make
    sense to disable Nagle, even on a web server, or at least a web
    client, and at least when the requests are pipelined rather than
    simply persistent. It would be nice in the pipelined case for
    requests 2 through N to the web server be on their way before the
    first bytes of response 1 arrives, and it would be good to be sure
    that responses 2 through N aren't delayed simply because all the bytes
    of response 1 have not yet been ACKed by the client.

    For simply persistent connections, indeed, Nagle on/off should not
    matter, since that is the one request at a time on the connection
    case.

    rick jones
    --
    The computing industry isn't as much a game of "Follow The Leader" as
    it is one of "Ring Around the Rosy" or perhaps "Duck Duck Goose."
    - Rick Jones
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  6. Re: Nagle Algorithm

    >
    > Can you explain why you think you need to disable Nagle?
    >


    The application is a Solaris iSCSI initiator and I want to do some
    testing with an IO generation tool writing small blocksizes to the
    storage made available via the initiator. There are reports on the
    OpenSolaris storage discuss list where performance problems have been
    encountered and resolved by disabling Nagle. So what I wanted to do
    here is try and get a reasonable understanding of what Nagle is
    supposed to do and what the implications are if I disable Nagle.




  7. Re: Nagle Algorithm

    > Did you know that Nagle has no effect on request/response protocols?
    > So it is never appropriate to disable Nagle on web servers.


    Rick Jones tried to blow smoke over this howler, so let me clear the fog:

    VIRTUALLY EVERY WEB SERVER DISABLES NAGLE

    That includes every http browser, every http server, file servers such as Samba,
    and version control systems such as SVN.

    It is "never appropriate" to have an unintuitive, opaque, hard-to-debug performance hazard.
    The Nagle algorithm is one, and disabling it is always appropriate.

    The classic counter-argument is that Nagle only punishes bad programming.
    Partly true, but the punishment is a performance death penalty for a minor sin.

    Yes, Rick Jones found a bizarre setting on ttcp that shows reduced performance
    with Nagle disabled, but the reduced performance is to be expected from those settings,
    is highly visible (e.g. high cpu usage, high system call counts),
    is easily debugged, and is easily fixed.

    Vendors have even disabled Nagle in telnet, the program for which the algorithm was invented.
    Please stop promoting this perilous pitfall.

    Tom Truscott

  8. Re: Nagle Algorithm

    Tom Truscott wrote:
    > > Did you know that Nagle has no effect on request/response protocols?
    > > So it is never appropriate to disable Nagle on web servers.


    > Rick Jones tried to blow smoke over this howler, so let me clear the
    > fog:


    Hi Tom, how have you been?

    > VIRTUALLY EVERY WEB SERVER DISABLES NAGLE


    > That includes every http browser, every http server, file servers
    > such as Samba, and version control systems such as SVN.


    You left out either openssh or openssl, I cannot recall which -
    perhaps both.

    > It is "never appropriate" to have an unintuitive, opaque,
    > hard-to-debug performance hazard. The Nagle algorithm is one, and
    > disabling it is always appropriate.


    > The classic counter-argument is that Nagle only punishes bad
    > programming. Partly true, but the punishment is a performance death
    > penalty for a minor sin.


    > Yes, Rick Jones found a bizarre setting on ttcp that shows reduced
    > performance with Nagle disabled, but the reduced performance is to
    > be expected from those settings, is highly visible (e.g. high cpu
    > usage, high system call counts), is easily debugged, and is easily
    > fixed.


    Please, I "never" run ttcp, just netperf.

    > Vendors have even disabled Nagle in telnet, the program for which
    > the algorithm was invented.


    Indeed, even my employer did that in HP-UX, and then promptly found a
    number of sites where the devices connecting via telnet got completely
    swamped by the great streams of single-character segments that ensued.

    rick jones

    > Please stop promoting this perilous pitfall.


    > Tom Truscott


    --
    No need to believe in either side, or any side. There is no cause.
    There's only yourself. The belief is in your own precision. - Jobert
    these opinions are mine, all mine; HP might not want them anyway...
    feel free to post, OR email to rick.jones2 in hp.com but NOT BOTH...

  9. Re: Nagle Algorithm

    On Dec 13, 12:25 pm, Vic wrote:
    > > Can you explain why you think you need to disable Nagle?

    >
    > The application is a Solaris iSCSI initiator and I want to do some
    > testing with an IO generation tool writing small blocksizes to the
    > storage made available via the initiator. There are reports on the
    > OpenSolaris storage discuss list where performance problems have been
    > encountered and resolved by disabling Nagle. So what I wanted to do
    > here is try and get a reasonable understanding of what Nagle is
    > supposed to do and what the implications are if I disable Nagle.


    That's actually a very good reason to disable Nagle. Just understand
    that if disabling Nagle helps, that indicates that something was wrong
    and it should be fixed. Disabling Nagle will demonstrate that the
    problem is poor write buffering in the server, and you can use that to
    know what to fix.

    DS
