Sysctl knob(s) to set TCP 'nagle' time-out? - FreeBSD


Thread: Sysctl knob(s) to set TCP 'nagle' time-out?

  1. Sysctl knob(s) to set TCP 'nagle' time-out?

    Hi,

    I'm wondering if anything exists to set this. When you create an INET
    socket without the 'TCP_NODELAY' flag, the network layer does 'Nagling'
    on your transmitted data. Sometimes, with hosts that use delayed ACKs
    (net.inet.tcp.delayed_ack), it creates a dead-lock where the host will
    not ACK until it gets another packet and the client will not send
    another packet until it gets an ACK.

    The dead-lock gets broken by a time-out, which I think is around 200ms?

    But I would like to change that time-out if possible to something
    lower, yet I can't really see any sysctl knobs that have a name that
    suggests they do that.

    So does anyone know IF this can be tuned and if so by what?

    Cheers,
    Jerahmy.

    (And yes, you could solve it by setting the TCP_NODELAY flag on the
    socket, but not everything has programmed in options to set it and you
    don't always have access to the source; besides, setting a sysctl value
    would be much simpler than recompiling stuff.)
    _______________________________________________
    freebsd-stable@freebsd.org mailing list
    http://lists.freebsd.org/mailman/lis...freebsd-stable
    To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"


  2. Re: Sysctl knob(s) to set TCP 'nagle' time-out?


    :Hi,
    :
    :I'm wondering if anything exists to set this. When you create an INET
    :socket without the 'TCP_NODELAY' flag, the network layer does 'Nagling'
    :on your transmitted data. Sometimes, with hosts that use delayed ACKs
    :(net.inet.tcp.delayed_ack), it creates a dead-lock where the host will
    :not ACK until it gets another packet and the client will not send
    :another packet until it gets an ACK.
    :
    :The dead-lock gets broken by a time-out, which I think is around 200ms?
    :
    :But I would like to change that time-out if possible to something
    :lower, yet I can't really see any sysctl knobs that have a name that
    :suggests they do that.
    :
    :So does anyone know IF this can be tuned and if so by what?
    :
    :Cheers,
    :Jerahmy.
    :
    :(And yes, you could solve it by setting the TCP_NODELAY flag on the
    :socket, but not everything has programmed in options to set it and you
    :don't always have access to the source; besides, setting a sysctl value
    :would be much simpler than recompiling stuff.)

    There is a sysctl which adjusts the delayed-ack timing; it's
    called net.inet.tcp.delacktime. The default is 1/10 of a second
    (100 == 100 ms = 1/10 of a second).
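
    As a rough sketch of how that knob could be inspected or changed from
    a script: the helper names below are made up for illustration, the
    50 ms value is only an example, and actually setting a sysctl needs
    root on a FreeBSD host. On any other platform the read helper simply
    returns None.

```python
import subprocess
import sys
from typing import List, Optional

DELACKTIME = "net.inet.tcp.delacktime"  # sysctl name taken from the post above


def sysctl_set_argv(name: str, value: int) -> List[str]:
    """Build the command line for setting a sysctl (run it as root)."""
    return ["sysctl", f"{name}={value}"]


def read_sysctl(name: str) -> Optional[str]:
    """Return the current value, or None when not running on FreeBSD."""
    if not sys.platform.startswith("freebsd"):
        return None
    result = subprocess.run(["sysctl", "-n", name],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Example: the command that would lower the delayed-ack time to 50 ms.
set_cmd = sysctl_set_argv(DELACKTIME, 50)
current = read_sysctl(DELACKTIME)
```

    Note that this only touches the delayed-ack timer on the local host;
    it does nothing to Nagle itself.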

    BUT, it shouldn't be possible for Nagle to deadlock against delayed acks
    unless the TCP implementation is broken somehow. A delayed ack is
    simply that: the ack is delayed 100 ms in order to improve its
    chances of being piggy-backed on return data. The ack is not blocked
    completely, just delayed, and certain events (such as the receiving
    end turning around and sending data back, which is typical for an
    interactive connection) will cause the delay to be aborted and the
    ack to be sent immediately with the return data.

    Can it break down and cause excessive lag? Yes, it can. Interactive
    games almost universally have to disable Nagle because the lag is
    actually due to the data relay from client 1 -> server, which then relays
    the interactive event to client 2. Without an immediate interactive
    response to client 1, the ack gets delayed, and the next event from
    client 1 hits Nagle and stops dead in the water until the first event
    reaches client 2 and client 2 reacts to it (client 2 -> server ->
    abort delayed ack and send -> client 1, at which point client 1's Nagle
    allows the second event to be transmitted). That isn't a deadlock, just
    really poor interactive performance in that particular situation.

    Delayed acks also have a safety valve. The spec says that an ack
    cannot be delayed for more than two packets. In a batch link, when the
    second (unacked) packet is received, the delayed ack is aborted and
    an ack is immediately returned to the sender. This is to prevent
    congestion control (which is based on acks) from getting completely
    out of whack and also to prevent the TCP window from getting exhausted.
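
    The three ways a delayed ack can end up being sent (piggy-backed on
    return data, forced out by a second packet, or released by the timer)
    can be illustrated with a toy model. This is only a sketch of the
    behaviour described above, not kernel code; the class and method
    names are invented, and times are plain integers in milliseconds.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class DelayedAckReceiver:
    """Toy model of a receiver that delays ACKs (not real kernel code)."""

    delack_ms: int = 100                 # assumed default, as quoted for delacktime
    pending_since: Optional[int] = None  # arrival time of the segment awaiting an ACK
    acks: List[Tuple[int, str]] = field(default_factory=list)  # (time_ms, reason)

    def _send_ack(self, when: int, reason: str) -> None:
        self.acks.append((when, reason))
        self.pending_since = None

    def _expire(self, now: int) -> None:
        # The delay is bounded: once delack_ms has passed, the ACK goes out alone.
        if self.pending_since is not None and now - self.pending_since >= self.delack_ms:
            self._send_ack(self.pending_since + self.delack_ms, "timer")

    def on_segment(self, now: int) -> None:
        self._expire(now)
        if self.pending_since is not None:
            # Safety valve: a second unacked segment forces an immediate ACK.
            self._send_ack(now, "second-segment")
        else:
            self.pending_since = now

    def on_reply_data(self, now: int) -> None:
        self._expire(now)
        if self.pending_since is not None:
            # Return traffic carries the ACK for free.
            self._send_ack(now, "piggyback")

    def on_tick(self, now: int) -> None:
        self._expire(now)


rx = DelayedAckReceiver()
rx.on_segment(0)
rx.on_segment(5)        # second packet arrives before the timer fires
rx.on_segment(50)
rx.on_reply_data(60)    # application answers, ACK rides along
rx.on_segment(200)
rx.on_tick(400)         # nothing else happened, the timer released the ACK
```

    After this run, rx.acks records one ack per case, each tagged with
    the reason it was sent.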

    In any case, the usual solution is to disable Nagle rather than mess
    with delayed acks. What we need is a new Nagle that understands the
    new reality for interactive connections... something that doesn't break
    performance in the 'server in the middle' data relaying case.

    -Matt



  3. Re: Sysctl knob(s) to set TCP 'nagle' time-out?

    On Mon, Jun 23, 2008 at 05:25:49PM +1000, Jerahmy Pocott wrote:
    > So does anyone know IF this can be tuned and if so by what?


    You can tune it with net.inet.tcp.delacktime - it should be in ms.

    David.


  4. Re: Sysctl knob(s) to set TCP 'nagle' time-out?


    On 23/06/2008, at 6:27 PM, Matthew Dillon wrote:

    > Can it break down and cause excessive lag? Yes, it can. Interactive
    > games almost universally have to disable Nagle because the lag is
    > actually due to the data relay from client 1 -> server, which then
    > relays the interactive event to client 2. Without an immediate
    > interactive response to client 1, the ack gets delayed, and the next
    > event from client 1 hits Nagle and stops dead in the water until the
    > first event reaches client 2 and client 2 reacts to it (client 2 ->
    > server -> abort delayed ack and send -> client 1, at which point
    > client 1's Nagle allows the second event to be transmitted). That
    > isn't a deadlock, just really poor interactive performance in that
    > particular situation.


    Yeah, that's what I'm talking about.

    True, it's not really a dead-lock, but it's terribly slow! The
    interaction can cause a 200ms delay on a LAN, as can be seen with
    samba if you disable tcp_nodelay.


    > In any case, the usual solution is to disable Nagle rather than mess
    > with delayed acks. What we need is a new Nagle that understands the
    > new reality for interactive connections... something that doesn't
    > break performance in the 'server in the middle' data relaying case.



    Exactly, there is nothing really wrong with delayed acks. But with
    sysctl I CAN disable and mess with the delayed acks, but I can't seem
    to do anything to Nagle.

    That's why I was thinking if I could change the Nagle time-out to 0ms it
    would effectively disable it..

    Cheers.
    J.


  5. Re: Sysctl knob(s) to set TCP 'nagle' time-out?


    On 23/06/2008, at 7:00 PM, David Malone wrote:

    > On Mon, Jun 23, 2008 at 05:25:49PM +1000, Jerahmy Pocott wrote:
    >> So does anyone know IF this can be tuned and if so by what?
    >
    > You can tune it with net.inet.tcp.delacktime - it should be in ms.


    Yeah I saw that one. But that only changes the delayed ack...

    The default value of 100ms seems fairly reasonable unless you're
    talking about a LAN..

    I guess what I really want to do is disable Nagle in the TCP stack, but
    since you do that with the setsockopt call on a per-socket basis, I'm
    guessing there isn't any system-wide tunable for it.
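
    For reference, the per-socket route looks roughly like this when you
    do have access to the source. A minimal sketch using Python's
    standard socket module; the helper name is made up, and nothing is
    connected or sent.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with the Nagle algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY is a per-socket option at the IPPROTO_TCP level.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_nodelay_socket()
# Read the option back; a non-zero value means Nagle is off for this socket.
nodelay_enabled = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
sock.close()
```

    The option has to be set by the program that owns the socket, which
    is exactly why it does not help with software you cannot change.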

    Thanks,
    Jerahmy.


  6. Re: Sysctl knob(s) to set TCP 'nagle' time-out?

    Matthew Dillon wrote:
    > In any case, the usual solution is to disable Nagle rather than mess
    > with delayed acks. What we need is a new Nagle that understands the
    > new reality for interactive connections... something that doesn't break
    > performance in the 'server in the middle' data relaying case.


    One possibility I see is a statistic about DelACKs per TCP connection,
    counting those that were rightfully delayed (with hindsight). I.e.,
    if an ACK is delayed, but there was no chance to piggy-back it or to
    combine it with another ACK, it could have been sent without delay.
    Only those delayed ACKs that reduce load are "good", all others cause
    additional state to be maintained and may increase latencies for no
    good reason.

    Therefore, I thought about starting with Nagle enabled, but giving up
    on delaying ACKs when doing so is found to be ineffective.

    The only problem with this approach is that once TCP_NODELAY is
    implicitly set due to measured behavior of the communication, a
    situation that would benefit from delayed ACKs can no longer be
    detected. (Well, you could measure the delay between an ACK and
    the next data sent to the same destination, and disable TCP_NODELAY
    if ACKs could have been piggy-backed on data packets without too
    much delay.) Maybe we could really have TCP auto-tune with respect
    to use of delayed ACKs ...

    I had suggested this years back, when the issue was discussed, but
    consensus was, that you should just set TCP_NODELAY. But automatic
    adjustment could also (implicitly) take RTT, window size into
    consideration. And to me, automatic setting of TCP_NODELAY seems
    more useful than automatic clearing (after delayed ACKs had been
    found to be of no use for a window of say 8 or 16 ACKs).

    The implementation would be quite simple: Whenever a delayed ACK
    is sent, check whether it is sent on its own (bad) or whether it
    could be piggy-backed (good). If, say, 7 of 8 delayed ACKs had to
    be sent as ACK-only packets, anyway, set TCP_NODELAY and do not
    bother to keep on deciding whether delayed ACKs had become useful
    in a different phase of the communication. If you want to be able
    to automatically disable TCP_NODELAY, then just set a time-stamp
    whenever an ACK is sent and when the next data is sent through
    this same socket, check whether delaying the ACK would have allowed
    it to be sent with that data packet (i.e. the delay was less than the
    maximum hold time of the delayed ACK). If it would have been
    beneficial to delay ACKs (say 3 out of a window of 4), then clear
    TCP_NODELAY.
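
    The "7 of 8" half of that idea can be sketched as a small counter.
    This is only an illustration of the bookkeeping: the class name and
    its defaults are invented, it covers only the automatic setting of
    the flag (not the time-stamp based clearing), and it is not how any
    existing stack behaves.

```python
from collections import deque


class DelAckAdvisor:
    """Track recent delayed ACKs and latch 'nodelay' when delaying rarely helps."""

    def __init__(self, window: int = 8, threshold: int = 7) -> None:
        # True in the window means the delayed ACK had to be sent on its own.
        self.recent = deque(maxlen=window)
        self.threshold = threshold
        self.nodelay = False

    def record_delayed_ack(self, sent_alone: bool) -> bool:
        if self.nodelay:
            return True  # once set, stop evaluating (as proposed above)
        self.recent.append(sent_alone)
        window_full = len(self.recent) == self.recent.maxlen
        if window_full and sum(self.recent) >= self.threshold:
            self.nodelay = True
        return self.nodelay


# Seven ACK-only packets and one piggy-backed ACK: the flag latches on the 8th.
mostly_alone = DelAckAdvisor()
latched = [mostly_alone.record_delayed_ack(alone) for alone in [True] * 7 + [False]]

# Half of the delayed ACKs were piggy-backed: delaying is still paying off.
mixed = DelAckAdvisor()
for alone in [True, False] * 4:
    mixed.record_delayed_ack(alone)
```

    A sliding window is used here so that an early burst of useful
    delayed ACKs does not mask a later phase where they stop helping.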

    I have no idea, whether SMP locking would be problematic, but I
    guess the checks and counter updates could be put in sections
    that are appropriately locked, anyway.

    Regards, STefan


  7. Re: Sysctl knob(s) to set TCP 'nagle' time-out?


    :One possibility I see is a statistic about DelACKs per TCP connection,
    :counting those that were rightfully delayed (with hindsight). I.e.,
    :if an ACK is delayed, but there was no chance to piggy-back it or to
    :combine it with another ACK, it could have been sent without delay.
    :Only those delayed ACKs that reduce load are "good", all others cause
    :additional state to be maintained and may increase latencies for no
    :good reason.
    :
    :...
    :consideration. And to me, automatic setting of TCP_NODELAY seems
    :more useful than automatic clearing (after delayed ACKs had been
    :found to be of no use for a window of say 8 or 16 ACKs).
    :
    :The implementation would be quite simple: Whenever a delayed ACK
    :is sent, check whether it is sent on its own (bad) or whether it
    :could be piggy-backed (good). If, say, 7 of 8 delayed ACKs had to
    :be sent as ACK-only packets, anyway, set TCP_NODELAY and do not
    :bother to keep on deciding whether delayed ACKs had become useful
    :in a different phase of the communication. If you want to be able
    :to automatically disable TCP_NODELAY, then just set a time-stamp
    :...
    :Regards, STefan

    That's an interesting approach. I think it would catch some
    of the cases, but not enough of them. If the round-trip in
    the server-relaying case is less than the delayed-ack time, the acks
    will still wind up piggy-backed on return traffic but the latency
    will also still remain horrible.

    It should be noted that Nagle can cause high latencies even when
    delayed acks are turned off. Nagle's delay is not timed... in its
    simplest description it prevents packets from being transmitted
    for new data coming from userland if the data already in the
    sockbuf (and presumably already transmitted) has not yet been
    acknowledged.
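
    In that simplest form the rule is a pure decision with no timer in
    it, which can be written down in a few lines. A sketch only: the
    function name is made up, the 1460-byte MSS is an assumed typical
    Ethernet value, and real stacks also take the send window and other
    state into account.

```python
def nagle_can_send(pending_bytes: int, unacked_bytes: int,
                   mss: int = 1460, nodelay: bool = False) -> bool:
    """Decide whether new data may be transmitted right now."""
    if pending_bytes <= 0:
        return False           # nothing to send
    if nodelay:
        return True            # TCP_NODELAY bypasses the rule entirely
    if pending_bytes >= mss:
        return True            # a full-sized segment is never held back
    # A small write goes out only when nothing is awaiting acknowledgement.
    return unacked_bytes == 0


first_small_write = nagle_can_send(pending_bytes=10, unacked_bytes=0)
second_small_write = nagle_can_send(pending_bytes=10, unacked_bytes=10)
```

    The second small write is held until the ack for the first arrives,
    however long that takes, which is where the ack latency turns into
    application latency.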

    For interactive traffic this means that Nagle is putting the screws
    on the packet stream even if the acks aren't delayed, simply from the
    ack latency. With delayed acks turned off the latency is lower, but
    not 0, so interactive traffic is still being held up by Nagle. The
    effect is noticeable even on a LAN. Jerahmy brought up Samba... that
    is an excellent example. NFS-over-TCP would be another good example.

    Any protocol which multiplexes multiple commands from different
    sources over the same connection gets really messed up (slowed down)
    by Nagle.

    On the flip side, Nagle can't just be turned off by default because
    it would cause streaming connections from user programs which do tiny
    writes to generate a lot of unnecessarily tiny packets. This can become
    apparent when using SSH over a slow link. Numerous programs run from
    a shell generate fairly inefficient packets which could have easily
    been batched when operating over SSH. The result can be sludgy
    performance for output which ought to be batched up by TCP but isn't,
    because SSH turns off Nagle unconditionally.

    -Matt
    Matthew Dillon



  8. Re: Sysctl knob(s) to set TCP 'nagle' time-out?


    On 24/06/2008, at 2:42 AM, Matthew Dillon wrote:

    > It should be noted that Nagle can cause high latencies even when
    > delayed acks are turned off. Nagle's delay is not timed... in its
    > simplest description it prevents packets from being transmitted
    > for new data coming from userland if the data already in the
    > sockbuf (and presumably already transmitted) has not yet been
    > acknowledged.


    Assuming that a full data packet can't be constructed in the time it
    takes for the acknowledgement. If you CAN construct a whole packet
    in that time then Nagle is either doing a good job or you're sending
    large amounts of data..

    Perhaps Nagle a) needs a time-out, though I don't really think that
    would help, or b) should use a dynamic 'in-flight' count where it tries
    to maintain x packets in flight and only holds packets up when that
    value is reached. The idea being that you get the ack on your first
    packet at the same time as the host should be getting your second
    packet.

    That way you still get to concatenate lots of small packets being
    generated in a short space of time, but don't hold up sending data
    because of the ack latency. It should also be possible to detect if
    the remote host is using delayed acks and compensate for that?
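
    Option b) could look something like the following. This is purely a
    sketch of the idea as described here, not an existing algorithm or
    kernel option; the function name, the default of 2 packets in flight
    and the 1460-byte MSS are all assumptions made for illustration.

```python
def inflight_nagle_can_send(pending_bytes: int, packets_in_flight: int,
                            max_in_flight: int = 2, mss: int = 1460) -> bool:
    """Hold small writes back only once max_in_flight packets are unacked."""
    if pending_bytes <= 0:
        return False
    if pending_bytes >= mss:
        return True  # full-sized packets are never delayed
    # With max_in_flight=1 this reduces to the classic Nagle behaviour.
    return packets_in_flight < max_in_flight


# One small packet is still unacked; a second one may go out anyway.
second_packet = inflight_nagle_can_send(pending_bytes=10, packets_in_flight=1)
# Two are unacked; the third small write waits and gets concatenated.
third_packet = inflight_nagle_can_send(pending_bytes=10, packets_in_flight=2)
```

    Choosing x is the hard part: it would have to track the round-trip
    time for the second packet to land just as the first ack comes back.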

    Though I've not considered it in much detail.

