[PATCH net-2.6 0/3]: TCP fixes - Kernel

This is a discussion on [PATCH net-2.6 0/3]: TCP fixes - Kernel ; This fixes Bugzilla #10384 tcp_simple_retransmit does L increment without any checking whatsoever for overflowing S+L when Reno is in use. The simplest scenario I can currently think of is rather complex in practice (there might be some more straightforward cases ...

+ Reply to Thread
Results 1 to 4 of 4

Thread: [PATCH net-2.6 0/3]: TCP fixes

  1. [PATCH net-2.6 2/3] [TCP]: tcp_simple_retransmit can cause S+L

    This fixes Bugzilla #10384

    tcp_simple_retransmit does L increment without any checking
    whatsoever for overflowing S+L when Reno is in use.

    The simplest scenario I can currently think of is rather
    complex in practice (there might be some more straightforward
    cases though). Ie., if mss is reduced during mtu probing, it
    may end up marking everything lost and if some duplicate ACKs
    arrived prior to that sacked_out will be non-zero as well,
    leading to S+L > packets_out, tcp_clean_rtx_queue on the next
    cumulative ACK or tcp_fastretrans_alert on the next duplicate
    ACK will fix the S counter.

    More straightforward (but questionable) solution would be to
    just call tcp_reset_reno_sack() in tcp_simple_retransmit but
    it would negatively impact the probe's retransmission, ie.,
    the retransmissions would not occur if some duplicate ACKs
    had arrived.

    So I had to add reno sacked_out reseting to CA_Loss state
    when the first cumulative ACK arrives (this stale sacked_out
    might actually be the explanation for the reports of left_out
    overflows in kernel prior to 2.6.23 and S+L overflow reports
    of 2.6.24). However, this alone won't be enough to fix kernel
    before 2.6.24 because it is building on top of the commit
    1b6d427bb7e ([TCP]: Reduce sacked_out with reno when purging
    write_queue) to keep the sacked_out from overflowing.

    Signed-off-by: Ilpo Järvinen
    Reported-by: Alessandro Suardi
    ---
    include/net/tcp.h | 2 ++
    net/ipv4/tcp_input.c | 24 ++++++++++++++++++------
    net/ipv4/tcp_output.c | 3 +++
    3 files changed, 23 insertions(+), 6 deletions(-)

    diff --git a/include/net/tcp.h b/include/net/tcp.h
    index 723b368..8f5fc52 100644
    --- a/include/net/tcp.h
    +++ b/include/net/tcp.h
    @@ -760,6 +760,8 @@ static inline unsigned int tcp_packets_in_flight(const struct tcp_sock *tp)
    return tp->packets_out - tcp_left_out(tp) + tp->retrans_out;
    }

    +extern int tcp_limit_reno_sacked(struct tcp_sock *tp);
    +
    /* If cwnd > ssthresh, we may raise ssthresh to be half-way to cwnd.
    * The exception is rate halving phase, when cwnd is decreasing towards
    * ssthresh.
    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index 58cad8b..8cace40 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -1625,13 +1625,11 @@ out:
    return flag;
    }

    -/* If we receive more dupacks than we expected counting segments
    - * in assumption of absent reordering, interpret this as reordering.
    - * The only another reason could be bug in receiver TCP.
    +/* Limits sacked_out so that sum with lost_out isn't ever larger than
    + * packets_out. Returns zero if sacked_out adjustement wasn't necessary.
    */
    -static void tcp_check_reno_reordering(struct sock *sk, const int addend)
    +int tcp_limit_reno_sacked(struct tcp_sock *tp)
    {
    - struct tcp_sock *tp = tcp_sk(sk);
    u32 holes;

    holes = max(tp->lost_out, 1U);
    @@ -1639,8 +1637,20 @@ static void tcp_check_reno_reordering(struct sock *sk, const int addend)

    if ((tp->sacked_out + holes) > tp->packets_out) {
    tp->sacked_out = tp->packets_out - holes;
    - tcp_update_reordering(sk, tp->packets_out + addend, 0);
    + return 1;
    }
    + return 0;
    +}
    +
    +/* If we receive more dupacks than we expected counting segments
    + * in assumption of absent reordering, interpret this as reordering.
    + * The only another reason could be bug in receiver TCP.
    + */
    +static void tcp_check_reno_reordering(struct sock *sk, const int addend)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + if (tcp_limit_reno_sacked(tp))
    + tcp_update_reordering(sk, tp->packets_out + addend, 0);
    }

    /* Emulate SACKs for SACKless connection: account for a new dupack. */
    @@ -2600,6 +2610,8 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    case TCP_CA_Loss:
    if (flag & FLAG_DATA_ACKED)
    icsk->icsk_retransmits = 0;
    + if (tcp_is_reno(tp) && flag & FLAG_SND_UNA_ADVANCED)
    + tcp_reset_reno_sack(tp);
    if (!tcp_try_undo_loss(sk)) {
    tcp_moderate_cwnd(tp);
    tcp_xmit_retransmit_queue(sk);
    diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
    index 6e25540..441fdd3 100644
    --- a/net/ipv4/tcp_output.c
    +++ b/net/ipv4/tcp_output.c
    @@ -1808,6 +1808,9 @@ void tcp_simple_retransmit(struct sock *sk)
    if (!lost)
    return;

    + if (tcp_is_reno(tp))
    + tcp_limit_reno_sacked(tp);
    +
    tcp_verify_left_out(tp);

    /* Don't muck with the congestion window here.
    --
    1.5.2.2

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. [PATCH net-2.6 0/3]: TCP fixes

    Hi Dave,

    Here are some TCP fixes that resulted from the last weeks reports
    & debug. The first one is quite likely to hit but it's considerably
    harder to get it print an overflow warning, while I've seen two
    reports about the second one (one of them is for 2.6.24.y), please
    ignore the earlier version of the tcp_simple_retransmit patch. The
    FRTO patch is once again result of code review rather than some
    report but seems necessary to avoid detection of some not that
    likely cases as spurious RTOs when there were some other losses
    in the same window with the probe. Nevertheless, it should be safe
    to return non-FRTO behavior.

    ....So far, they're just compile tested. I'll see if I find some time
    to boot them on the evening.

    There might still be one fackets_out miscounter awaiting detection
    with debug patch (I hope Georgi catches it) because none of these
    seem an obvious reason for triggery of the !sacked_out && fackets_out
    trap.

    All of them are also valid for stables but please note that it won't
    be too easy, at least for the first & second patch, because of recent
    changes (especially if stable != 2.6.24.y). I'll try to do the
    adapted versions later on for the stable.

    --
    i.


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. [PATCH net-2.6 1/3] [TCP]: Fix NewReno's fast rexmit/recovery problems with GSOed skb

    Fixes a long-standing bug which makes NewReno recovery crippled.
    With GSO the whole head skb was marked as LOST which is in
    violation of NewReno procedure that only wants to mark one packet
    and ended up breaking our TCP code by causing counter overflow
    because our code was built on top of assumption about valid
    NewReno procedure. This manifested as triggering a WARN_ON for
    the overflow in a number of places.

    It seems relatively safe alternative to just do nothing if
    tcp_fragment fails due to oom because another duplicate ACK is
    likely to be received soon and the fragmentation will be retried.

    Special thanks goes to Soeren Sonnenburg who was
    lucky enough to be able to reproduce this so that the warning
    for the overflow was hit. It's not as easy task as it seems even
    if this bug happens quite often because the amount of outstanding
    data is pretty significant for the mismarkings to lead to an
    overflow.

    Because it's very late in 2.6.25-rc cycle (if this even makes in
    time), I didn't want to touch anything with SACK enabled here.
    Fragmenting might be useful for it as well but it's more or less
    a policy decision rather than mandatory fix. Thus there's no need
    to rush and we can postpone considering tcp_fragment with SACK
    for 2.6.26.

    In 2.6.24 and earlier, this very same bug existed but the effect
    is slightly different because of a small changes in the if
    conditions that fit to the patch's context. With them nothing
    got lost marker and thus no retransmissions happened.

    Signed-off-by: Ilpo Järvinen
    ---
    net/ipv4/tcp_input.c | 22 ++++++++++++++++++----
    1 files changed, 18 insertions(+), 4 deletions(-)

    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index 6e46b4c..58cad8b 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -2138,7 +2138,9 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
    {
    struct tcp_sock *tp = tcp_sk(sk);
    struct sk_buff *skb;
    - int cnt;
    + int cnt, oldcnt;
    + int err;
    + unsigned int mss;

    BUG_TRAP(packets <= tp->packets_out);
    if (tp->lost_skb_hint) {
    @@ -2157,13 +2159,25 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
    tp->lost_skb_hint = skb;
    tp->lost_cnt_hint = cnt;

    + if (after(TCP_SKB_CB(skb)->end_seq, tp->high_seq))
    + break;
    +
    + oldcnt = cnt;
    if (tcp_is_fack(tp) || tcp_is_reno(tp) ||
    (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
    cnt += tcp_skb_pcount(skb);

    - if (((!fast_rexmit || (tp->lost_out > 0)) && (cnt > packets)) ||
    - after(TCP_SKB_CB(skb)->end_seq, tp->high_seq))
    - break;
    + if ((!fast_rexmit || (tp->lost_out > 0)) && (cnt > packets)) {
    + if (tcp_is_sack(tp) || (oldcnt >= packets))
    + break;
    +
    + mss = skb_shinfo(skb)->gso_size;
    + err = tcp_fragment(sk, skb, (packets - oldcnt) * mss, mss);
    + if (err < 0)
    + break;
    + cnt = packets;
    + }
    +
    if (!(TCP_SKB_CB(skb)->sacked & (TCPCB_SACKED_ACKED|TCPCB_LOST))) {
    TCP_SKB_CB(skb)->sacked |= TCPCB_LOST;
    tp->lost_out += tcp_skb_pcount(skb);
    --
    1.5.2.2

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: [PATCH net-2.6 1/3] [TCP]: Fix NewReno's fast rexmit/recovery problems with GSOed skb

    On Mon, 7 Apr 2008, Ilpo Järvinen wrote:

    > Fixes a long-standing bug which makes NewReno recovery crippled.
    > With GSO the whole head skb was marked as LOST which is in
    > violation of NewReno procedure that only wants to mark one packet
    > and ended up breaking our TCP code by causing counter overflow
    > because our code was built on top of assumption about valid
    > NewReno procedure. This manifested as triggering a WARN_ON for
    > the overflow in a number of places.
    >
    > It seems relatively safe alternative to just do nothing if
    > tcp_fragment fails due to oom because another duplicate ACK is
    > likely to be received soon and the fragmentation will be retried.
    >
    > Special thanks goes to Soeren Sonnenburg who was
    > lucky enough to be able to reproduce this so that the warning
    > for the overflow was hit. It's not as easy task as it seems even
    > if this bug happens quite often because the amount of outstanding
    > data is pretty significant for the mismarkings to lead to an
    > overflow.
    >
    > Because it's very late in 2.6.25-rc cycle (if this even makes in
    > time), I didn't want to touch anything with SACK enabled here.
    > Fragmenting might be useful for it as well but it's more or less
    > a policy decision rather than mandatory fix. Thus there's no need
    > to rush and we can postpone considering tcp_fragment with SACK
    > for 2.6.26.
    >
    > In 2.6.24 and earlier, this very same bug existed but the effect
    > is slightly different because of a small changes in the if
    > conditions that fit to the patch's context. With them nothing
    > got lost marker and thus no retransmissions happened.
    >
    > Signed-off-by: Ilpo Järvinen
    > ---
    > net/ipv4/tcp_input.c | 22 ++++++++++++++++++----
    > 1 files changed, 18 insertions(+), 4 deletions(-)
    >
    > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    > index 6e46b4c..58cad8b 100644
    > --- a/net/ipv4/tcp_input.c
    > +++ b/net/ipv4/tcp_input.c
    > @@ -2138,7 +2138,9 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
    > {
    > struct tcp_sock *tp = tcp_sk(sk);
    > struct sk_buff *skb;
    > - int cnt;
    > + int cnt, oldcnt;
    > + int err;
    > + unsigned int mss;
    >
    > BUG_TRAP(packets <= tp->packets_out);
    > if (tp->lost_skb_hint) {
    > @@ -2157,13 +2159,25 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
    > tp->lost_skb_hint = skb;
    > tp->lost_cnt_hint = cnt;
    >
    > + if (after(TCP_SKB_CB(skb)->end_seq, tp->high_seq))
    > + break;
    > +
    > + oldcnt = cnt;
    > if (tcp_is_fack(tp) || tcp_is_reno(tp) ||
    > (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
    > cnt += tcp_skb_pcount(skb);
    >
    > - if (((!fast_rexmit || (tp->lost_out > 0)) && (cnt > packets)) ||
    > - after(TCP_SKB_CB(skb)->end_seq, tp->high_seq))
    > - break;
    > + if ((!fast_rexmit || (tp->lost_out > 0)) && (cnt > packets)) {


    ....Nah, this won't work too well. I'll repost soon.

    > + if (tcp_is_sack(tp) || (oldcnt >= packets))
    > + break;
    > +
    > + mss = skb_shinfo(skb)->gso_size;
    > + err = tcp_fragment(sk, skb, (packets - oldcnt) * mss, mss);
    > + if (err < 0)
    > + break;
    > + cnt = packets;
    > + }
    > +
    > if (!(TCP_SKB_CB(skb)->sacked & (TCPCB_SACKED_ACKED|TCPCB_LOST))) {
    > TCP_SKB_CB(skb)->sacked |= TCPCB_LOST;
    > tp->lost_out += tcp_skb_pcount(skb);
    >


    --
    i.

+ Reply to Thread