[PATCHv2 net-2.6 3/4] [TCP]: tcp_simple_retransmit can cause S+L - Kernel

This is a discussion on [PATCHv2 net-2.6 3/4] [TCP]: tcp_simple_retransmit can cause S+L - Kernel ; This fixes Bugzilla #10384 tcp_simple_retransmit does L increment without any checking whatsoever for overflowing S+L when Reno is in use. The simplest scenario I can currently think of is rather complex in practice (there might be some more straightforward cases ...

+ Reply to Thread
Results 1 to 2 of 2

Thread: [PATCHv2 net-2.6 3/4] [TCP]: tcp_simple_retransmit can cause S+L

  1. [PATCHv2 net-2.6 3/4] [TCP]: tcp_simple_retransmit can cause S+L

    This fixes Bugzilla #10384

    tcp_simple_retransmit does L increment without any checking
    whatsoever for overflowing S+L when Reno is in use.

    The simplest scenario I can currently think of is rather
    complex in practice (there might be some more straightforward
    cases though). Ie., if mss is reduced during mtu probing, it
    may end up marking everything lost and if some duplicate ACKs
    arrived prior to that sacked_out will be non-zero as well,
    leading to S+L > packets_out, tcp_clean_rtx_queue on the next
    cumulative ACK or tcp_fastretrans_alert on the next duplicate
    ACK will fix the S counter.

    More straightforward (but questionable) solution would be to
    just call tcp_reset_reno_sack() in tcp_simple_retransmit but
    it would negatively impact the probe's retransmission, ie.,
    the retransmissions would not occur if some duplicate ACKs
    had arrived.

    So I had to add reno sacked_out reseting to CA_Loss state
    when the first cumulative ACK arrives (this stale sacked_out
    might actually be the explanation for the reports of left_out
    overflows in kernel prior to 2.6.23 and S+L overflow reports
    of 2.6.24). However, this alone won't be enough to fix kernel
    before 2.6.24 because it is building on top of the commit
    1b6d427bb7e ([TCP]: Reduce sacked_out with reno when purging
    write_queue) to keep the sacked_out from overflowing.

    Signed-off-by: Ilpo Järvinen
    Reported-by: Alessandro Suardi
    ---
    include/net/tcp.h | 2 ++
    net/ipv4/tcp_input.c | 24 ++++++++++++++++++------
    net/ipv4/tcp_output.c | 3 +++
    3 files changed, 23 insertions(+), 6 deletions(-)

    diff --git a/include/net/tcp.h b/include/net/tcp.h
    index 723b368..8f5fc52 100644
    --- a/include/net/tcp.h
    +++ b/include/net/tcp.h
    @@ -760,6 +760,8 @@ static inline unsigned int tcp_packets_in_flight(const struct tcp_sock *tp)
    return tp->packets_out - tcp_left_out(tp) + tp->retrans_out;
    }

    +extern int tcp_limit_reno_sacked(struct tcp_sock *tp);
    +
    /* If cwnd > ssthresh, we may raise ssthresh to be half-way to cwnd.
    * The exception is rate halving phase, when cwnd is decreasing towards
    * ssthresh.
    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index f053bfe..a2a06d6 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -1625,13 +1625,11 @@ out:
    return flag;
    }

    -/* If we receive more dupacks than we expected counting segments
    - * in assumption of absent reordering, interpret this as reordering.
    - * The only another reason could be bug in receiver TCP.
    +/* Limits sacked_out so that sum with lost_out isn't ever larger than
    + * packets_out. Returns zero if sacked_out adjustement wasn't necessary.
    */
    -static void tcp_check_reno_reordering(struct sock *sk, const int addend)
    +int tcp_limit_reno_sacked(struct tcp_sock *tp)
    {
    - struct tcp_sock *tp = tcp_sk(sk);
    u32 holes;

    holes = max(tp->lost_out, 1U);
    @@ -1639,8 +1637,20 @@ static void tcp_check_reno_reordering(struct sock *sk, const int addend)

    if ((tp->sacked_out + holes) > tp->packets_out) {
    tp->sacked_out = tp->packets_out - holes;
    - tcp_update_reordering(sk, tp->packets_out + addend, 0);
    + return 1;
    }
    + return 0;
    +}
    +
    +/* If we receive more dupacks than we expected counting segments
    + * in assumption of absent reordering, interpret this as reordering.
    + * The only another reason could be bug in receiver TCP.
    + */
    +static void tcp_check_reno_reordering(struct sock *sk, const int addend)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + if (tcp_limit_reno_sacked(tp))
    + tcp_update_reordering(sk, tp->packets_out + addend, 0);
    }

    /* Emulate SACKs for SACKless connection: account for a new dupack. */
    @@ -2600,6 +2610,8 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    case TCP_CA_Loss:
    if (flag & FLAG_DATA_ACKED)
    icsk->icsk_retransmits = 0;
    + if (tcp_is_reno(tp) && flag & FLAG_SND_UNA_ADVANCED)
    + tcp_reset_reno_sack(tp);
    if (!tcp_try_undo_loss(sk)) {
    tcp_moderate_cwnd(tp);
    tcp_xmit_retransmit_queue(sk);
    diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
    index 6e25540..441fdd3 100644
    --- a/net/ipv4/tcp_output.c
    +++ b/net/ipv4/tcp_output.c
    @@ -1808,6 +1808,9 @@ void tcp_simple_retransmit(struct sock *sk)
    if (!lost)
    return;

    + if (tcp_is_reno(tp))
    + tcp_limit_reno_sacked(tp);
    +
    tcp_verify_left_out(tp);

    /* Don't muck with the congestion window here.
    --
    1.5.2.2

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. [PATCHv2 net-2.6 4/4] [TCP]: Don't allow FRTO to take place while MTU is being probed

    MTU probe can cause some remedies for FRTO because the normal
    packet ordering may be violated allowing FRTO to make a wrong
    decision (it might not be that serious threat for anything
    though). Thus it's safer to not run FRTO while MTU probe is
    underway.

    It seems that the basic FRTO variant should also look for an
    skb at probe_seq.start to check if that's retransmitted one
    but I didn't implement it now (plain seqno in window check
    isn't robust against wraparounds).

    Signed-off-by: Ilpo Järvinen
    ---
    net/ipv4/tcp_input.c | 5 +++++
    1 files changed, 5 insertions(+), 0 deletions(-)

    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index a2a06d6..2b09a89 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -1691,11 +1691,16 @@ static inline void tcp_reset_reno_sack(struct tcp_sock *tp)
    int tcp_use_frto(struct sock *sk)
    {
    const struct tcp_sock *tp = tcp_sk(sk);
    + const struct inet_connection_sock *icsk = inet_csk(sk);
    struct sk_buff *skb;

    if (!sysctl_tcp_frto)
    return 0;

    + /* MTU probe and F-RTO won't really play nicely along currently */
    + if (icsk->icsk_mtup.probe_size)
    + return 0;
    +
    if (IsSackFrto())
    return 1;

    --
    1.5.2.2

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread