Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost() - Kernel

This is a discussion on Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost() - Kernel ; (cc netdev) On Wed, 20 Feb 2008 20:04:39 -0800 (PST) Giangiacomo Mariotti wrote: > This is what I got with dmesg : > > [ 266.978695] WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost() > [ 266.978701] Pid: 0, comm: swapper Not tainted 2.6.24.2-my001 ...

+ Reply to Thread
Results 1 to 18 of 18

Thread: Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

  1. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()


    (cc netdev)

    On Wed, 20 Feb 2008 20:04:39 -0800 (PST) Giangiacomo Mariotti wrote:

    > This is what I got with dmesg :
    >
    > [ 266.978695] WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()
    > [ 266.978701] Pid: 0, comm: swapper Not tainted 2.6.24.2-my001 #1
    > [ 266.978703]
    > [ 266.978704] Call Trace:
    > [ 266.978706] [] tcp_ack+0x16d8/0x197f
    > [ 266.978721] [] __wake_up+0x38/0x4e
    > [ 266.978727] [] tcp_rcv_established+0xe2/0x8cb
    > [ 266.978732] [] tcp_v4_do_rcv+0x30/0x39c
    > [ 266.978738] [] tcp_v4_rcv+0x99b/0xa06
    > [ 266.978743] [] __netdev_alloc_skb+0x29/0x43
    > [ 266.978749] [] ip_local_deliver_finish+0x152/0x212
    > [ 266.978753] [] ip_rcv_finish+0x2f8/0x31b
    > [ 266.978758] [] netif_receive_skb+0x3ae/0x3d1
    > [ 266.978763] [] rtl8169_rx_interrupt+0x45f/0x53e
    > [ 266.978768] [] rtl8169_poll+0x36/0x16a
    > [ 266.978773] [] net_rx_action+0xb7/0x1f3
    > [ 266.978778] [] __do_softirq+0x65/0xce
    > [ 266.978782] [] default_idle+0x0/0x3d
    > [ 266.978786] [] call_softirq+0x1c/0x28
    > [ 266.978789] [] do_softirq+0x2c/0x7d
    > [ 266.978792] [] irq_exit+0x3f/0x84
    > [ 266.978794] [] do_IRQ+0xb6/0xd5
    > [ 266.978797] [] default_idle+0x0/0x3d
    > [ 266.978800] [] ret_from_intr+0x0/0xa
    > [ 266.978801] [] default_idle+0x29/0x3d
    > [ 266.978809] [] cpu_idle+0x93/0xbb
    > [ 266.978813] [] start_kernel+0x2bb/0x2c7
    > [ 266.978818] [] _sinittext+0x123/0x12a
    > [ 266.978821]
    >
    > This though didn't cause any user-visible problem.
    >
    > .config file :

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Sat, 23 Feb 2008, Andrew Morton wrote:

    >
    > (cc netdev)
    >
    > On Wed, 20 Feb 2008 20:04:39 -0800 (PST) Giangiacomo Mariotti wrote:
    >
    > > This is what I got with dmesg :
    > >
    > > [ 266.978695] WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()
    > > [ 266.978701] Pid: 0, comm: swapper Not tainted 2.6.24.2-my001 #1
    > > [ 266.978703]
    > > [ 266.978704] Call Trace:
    > > [ 266.978706] [] tcp_ack+0x16d8/0x197f
    > > [ 266.978721] [] __wake_up+0x38/0x4e
    > > [ 266.978727] [] tcp_rcv_established+0xe2/0x8cb
    > > [ 266.978732] [] tcp_v4_do_rcv+0x30/0x39c
    > > [ 266.978738] [] tcp_v4_rcv+0x99b/0xa06
    > > [ 266.978743] [] __netdev_alloc_skb+0x29/0x43
    > > [ 266.978749] [] ip_local_deliver_finish+0x152/0x212
    > > [ 266.978753] [] ip_rcv_finish+0x2f8/0x31b
    > > [ 266.978758] [] netif_receive_skb+0x3ae/0x3d1
    > > [ 266.978763] [] rtl8169_rx_interrupt+0x45f/0x53e
    > > [ 266.978768] [] rtl8169_poll+0x36/0x16a
    > > [ 266.978773] [] net_rx_action+0xb7/0x1f3
    > > [ 266.978778] [] __do_softirq+0x65/0xce
    > > [ 266.978782] [] default_idle+0x0/0x3d
    > > [ 266.978786] [] call_softirq+0x1c/0x28
    > > [ 266.978789] [] do_softirq+0x2c/0x7d
    > > [ 266.978792] [] irq_exit+0x3f/0x84
    > > [ 266.978794] [] do_IRQ+0xb6/0xd5
    > > [ 266.978797] [] default_idle+0x0/0x3d
    > > [ 266.978800] [] ret_from_intr+0x0/0xa
    > > [ 266.978801] [] default_idle+0x29/0x3d
    > > [ 266.978809] [] cpu_idle+0x93/0xbb
    > > [ 266.978813] [] start_kernel+0x2bb/0x2c7
    > > [ 266.978818] [] _sinittext+0x123/0x12a
    > > [ 266.978821]
    > >


    Are you able to reproduce this in any way? I did in the past a debug patch
    that verifies TCP's write queue state by the hard way, ie., by bruteforce
    walking often enough to catch inconsistencies early enough to find out
    the root cause. I'll try to find that for you after I first go through
    the 2.6.24.2's code once again (but I'm pretty busy at this moment, so
    it might take a small while)...

    > > This though didn't cause any user-visible problem.


    Usually it's very insignificant to see them, unless you have them in very
    large quantities (it usually triggers for the same occurance in a number
    of places where that very same thing is being checked, thus having many
    of them in a row once is not what I mean here).

    Were there Leak printouts as well a bit after that? If not, this is
    triggered with either non-SACK TCP or it is a genuine S+L bits bug.

    --
    i.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Wed, 27 Feb 2008, Guillaume Chazarain wrote:

    > On Wed, Feb 27, 2008 at 10:29 AM, Ilpo Järvinen
    > wrote:
    > > I did in the past a debug patch
    > > that verifies TCP's write queue state by the hard way, ie., by
    > > bruteforce
    > > walking often enough to catch inconsistencies early enough to find
    > > out
    > > the root cause.

    >
    > Are you talking about this one:
    > http://marc.info/?l=linux-netdev&m=119482084511178 ?
    > I attached a forward port to current git.


    No, I'd much more complete set of tests than in that one.

    > I am using this patch, and caught this maybe related error while
    > Bittorrenting:
    >
    > KERNEL: assertion (packets <= tp->packets_out) failed at
    > /home/g/linux-2.6/net/ipv4/tcp_input.c (2145)
    > KERNEL: assertion (packets <= tp->packets_out) failed at
    > /home/g/linux-2.6/net/ipv4/tcp_input.c (2145)
    > ------------[ cut here ]------------
    > WARNING: at /home/g/linux-2.6/net/ipv4/tcp_input.c:2515
    > tcp_fastretrans_alert+0xa5/0xa4f()


    You seem to be good in catching these... :-)


    --
    i.

  4. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Wed, 27 Feb 2008, Ilpo Järvinen wrote:

    > On Wed, 27 Feb 2008, Guillaume Chazarain wrote:
    >
    > > Are you talking about this one:
    > > http://marc.info/?l=linux-netdev&m=119482084511178 ?
    > > I attached a forward port to current git.

    >
    > No, I'd much more complete set of tests than in that one.


    This below is for 2.6.25-rcs, won't work in 2.6.24, I'll customize it for
    that soon. Hopefully there isn't anything stupid in it (this time) which
    makes it to trigger spuriously.

    --
    i.

    [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y)

    ---
    include/net/tcp.h | 9 +++-
    net/ipv4/tcp_input.c | 18 +++++++-
    net/ipv4/tcp_ipv4.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++++
    net/ipv4/tcp_output.c | 23 +++++++--
    4 files changed, 169 insertions(+), 8 deletions(-)

    diff --git a/include/net/tcp.h b/include/net/tcp.h
    index 7de4ea3..acf5546 100644
    --- a/include/net/tcp.h
    +++ b/include/net/tcp.h
    @@ -272,6 +272,9 @@ DECLARE_SNMP_STAT(struct tcp_mib, tcp_statistics);
    #define TCP_ADD_STATS_BH(field, val) SNMP_ADD_STATS_BH(tcp_statistics, field, val)
    #define TCP_ADD_STATS_USER(field, val) SNMP_ADD_STATS_USER(tcp_statistics, field, val)

    +extern void tcp_print_queue(struct sock *sk);
    +extern void tcp_verify_wq(struct sock *sk);
    +
    extern void tcp_v4_err(struct sk_buff *skb, u32);

    extern void tcp_shutdown (struct sock *sk, int how);
    @@ -768,7 +771,11 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
    }

    /* Use define here intentionally to get WARN_ON location shown at the caller */
    -#define tcp_verify_left_out(tp) WARN_ON(tcp_left_out(tp) > tp->packets_out)
    +#define tcp_verify_left_out(tp) \
    + do {\
    + WARN_ON(tcp_left_out(tp) > tp->packets_out); \
    + tcp_verify_wq((struct sock *)tp); \
    + } while(0)

    extern void tcp_enter_cwr(struct sock *sk, const int set_ssthresh);
    extern __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst);
    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index 19c449f..c897c93 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -1426,8 +1426,10 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb,
    int first_sack_index;

    if (!tp->sacked_out) {
    - if (WARN_ON(tp->fackets_out))
    + if (WARN_ON(tp->fackets_out)) {
    + tcp_verify_left_out(tp);
    tp->fackets_out = 0;
    + }
    tcp_highest_sack_reset(sk);
    }

    @@ -2136,6 +2138,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
    struct sk_buff *skb;
    int cnt;

    + tcp_verify_left_out(tp);
    +
    BUG_TRAP(packets <= tp->packets_out);
    if (tp->lost_skb_hint) {
    skb = tp->lost_skb_hint;
    @@ -2501,6 +2505,8 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    (tcp_fackets_out(tp) > tp->reordering));
    int fast_rexmit = 0;

    + tcp_verify_left_out(tp);
    +
    if (WARN_ON(!tp->packets_out && tp->sacked_out))
    tp->sacked_out = 0;
    if (WARN_ON(!tp->sacked_out && tp->fackets_out))
    @@ -2645,6 +2651,10 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    if (do_lost || (tcp_is_fack(tp) && tcp_head_timedout(sk)))
    tcp_update_scoreboard(sk, fast_rexmit);
    tcp_cwnd_down(sk, flag);
    +
    + WARN_ON(tcp_write_queue_head(sk) == NULL);
    + WARN_ON(!tp->packets_out);
    +
    tcp_xmit_retransmit_queue(sk);
    }

    @@ -2848,6 +2858,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets)
    tcp_clear_all_retrans_hints(tp);
    }

    + tcp_verify_left_out(tp);
    +
    if (skb && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED))
    flag |= FLAG_SACK_RENEGING;

    @@ -3175,6 +3187,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
    prior_fackets = tp->fackets_out;
    prior_in_flight = tcp_packets_in_flight(tp);

    + tcp_verify_left_out(tp);
    +
    if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
    /* Window is constant, pure forward advance.
    * No more checks are required.
    @@ -3237,6 +3251,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
    if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
    dst_confirm(sk->sk_dst_cache);

    + tcp_verify_left_out(tp);
    +
    return 1;

    no_queue:
    diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
    index 00156bf..71a4646 100644
    --- a/net/ipv4/tcp_ipv4.c
    +++ b/net/ipv4/tcp_ipv4.c
    @@ -108,6 +108,133 @@ struct inet_hashinfo __cacheline_aligned tcp_hashinfo = {
    .lhash_wait = __WAIT_QUEUE_HEAD_INITIALIZER(tcp_hashinfo.lhash_w ait),
    };

    +void tcp_print_queue(struct sock *sk)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + struct sk_buff *skb;
    + char s[50+1];
    + char h[50+1];
    + int idx = 0;
    + int i;
    +
    + i = 0;
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + printk(KERN_ERR "head %u %p\n", i, skb);
    + else
    + printk(KERN_ERR "skb %u %p\n", i, skb);
    + i++;
    + }
    +
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + break;
    +
    + for (i = 0; i < tcp_skb_pcount(skb); i++) {
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
    + s[idx] = 'S';
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + s[idx] = 'B';
    +
    + } else if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
    + s[idx] = 'L';
    + } else {
    + s[idx] = ' ';
    + }
    + if (s[idx] != ' ' && skb->len < tp->mss_cache)
    + s[idx] += 'a' - 'A';
    +
    + if (i == 0) {
    + if (skb == tcp_highest_sack(sk))
    + h[idx] = 'h';
    + else
    + h[idx] = '+';
    + } else {
    + h[idx] = '-';
    + }
    +
    + if (++idx >= 50) {
    + s[idx] = 0;
    + h[idx] = 0;
    + printk(KERN_ERR "TCP wq(s) %s\n", s);
    + printk(KERN_ERR "TCP wq(h) %s\n", h);
    + idx = 0;
    + }
    + }
    + }
    + if (idx) {
    + s[idx] = '<';
    + s[idx+1] = 0;
    + h[idx] = '<';
    + h[idx+1] = 0;
    + printk(KERN_ERR "TCP wq(s) %s\n", s);
    + printk(KERN_ERR "TCP wq(h) %s\n", h);
    + }
    + printk(KERN_ERR "l%u s%u f%u p%u seq: su%u hs%u sn%u\n",
    + tp->lost_out, tp->sacked_out, tp->fackets_out, tp->packets_out,
    + tp->snd_una, tcp_highest_sack_seq(tp), tp->snd_nxt);
    +}
    +
    +void tcp_verify_wq(struct sock *sk)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + u32 lost = 0;
    + u32 sacked = 0;
    + u32 packets = 0;
    + u32 fackets = 0;
    + int hs_valid = 0;
    + struct sk_buff *skb;
    +
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + break;
    +
    + if ((fackets == packets) && (skb == tp->highest_sack))
    + hs_valid = 1;
    +
    + packets += tcp_skb_pcount(skb);
    +
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
    + sacked += tcp_skb_pcount(skb);
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + printk(KERN_ERR "Sacked bitmap S+L: %u %u-%u/%u\n",
    + TCP_SKB_CB(skb)->sacked,
    + TCP_SKB_CB(skb)->end_seq - tp->snd_una,
    + TCP_SKB_CB(skb)->seq - tp->snd_una,
    + tp->snd_una);
    + fackets = packets;
    + hs_valid = 0;
    + }
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + lost += tcp_skb_pcount(skb);
    + }
    +
    + if ((fackets == packets) && (tp->highest_sack == tcp_send_head(sk)))
    + hs_valid = 1;
    +
    + if ((lost != tp->lost_out) ||
    + (tcp_is_sack(tp) && (sacked != tp->sacked_out)) ||
    + ((sacked || (tcp_is_sack(tp) && tp->sacked_out)) && !hs_valid) ||
    + (packets != tp->packets_out) ||
    + (fackets != tp->fackets_out) ||
    + tcp_left_out(tp) > tp->packets_out) {
    + printk(KERN_ERR "P: %u L: %u vs %u S: %u vs %u F: %u vs %u w: %u-%u (%u)\n",
    + tp->packets_out,
    + lost, tp->lost_out,
    + sacked, tp->sacked_out,
    + fackets, tp->fackets_out,
    + tp->snd_una, tp->snd_nxt,
    + tp->rx_opt.sack_ok);
    + tcp_print_queue(sk);
    + }
    +
    + WARN_ON(lost != tp->lost_out);
    + WARN_ON(tcp_is_sack(tp) && (sacked != tp->sacked_out));
    + WARN_ON(packets != tp->packets_out);
    + WARN_ON(fackets != tp->fackets_out);
    + WARN_ON(tcp_left_out(tp) > tp->packets_out);
    +}
    +
    static inline __u32 tcp_v4_init_sequence(struct sk_buff *skb)
    {
    return secure_tcp_sequence_number(ip_hdr(skb)->daddr,
    diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
    index ed750f9..257de86 100644
    --- a/net/ipv4/tcp_output.c
    +++ b/net/ipv4/tcp_output.c
    @@ -779,10 +779,9 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
    tp->lost_out -= diff;

    /* Adjust Reno SACK estimate. */
    - if (tcp_is_reno(tp) && diff > 0) {
    + if (tcp_is_reno(tp) && diff > 0)
    tcp_dec_pcount_approx_int(&tp->sacked_out, diff);
    - tcp_verify_left_out(tp);
    - }
    +
    tcp_adjust_fackets_out(sk, skb, diff);
    }

    @@ -790,6 +789,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
    skb_header_release(buff);
    tcp_insert_write_queue_after(skb, buff, sk);

    + tcp_verify_left_out(tp);
    +
    return 0;
    }

    @@ -1463,6 +1464,7 @@ static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
    } else if (result > 0) {
    sent_pkts = 1;
    }
    + tcp_verify_left_out(tp);

    while ((skb = tcp_send_head(sk))) {
    unsigned int limit;
    @@ -1764,6 +1766,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb,
    tcp_clear_retrans_hints_partial(tp);

    sk_wmem_free_skb(sk, next_skb);
    + tcp_verify_left_out(tp);
    }

    /* Do a simple retransmit without using the backoff mechanisms in
    @@ -1795,13 +1798,13 @@ void tcp_simple_retransmit(struct sock *sk)
    }
    }

    + tcp_verify_left_out(tp);
    +
    tcp_clear_all_retrans_hints(tp);

    if (!lost)
    return;

    - tcp_verify_left_out(tp);
    -
    /* Don't muck with the congestion window here.
    * Reason is that we do not increase amount of _data_
    * in network, but units changed and effective
    @@ -1888,6 +1891,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
    tcp_init_nondata_skb(skb, TCP_SKB_CB(skb)->end_seq - 1,
    TCP_SKB_CB(skb)->flags);
    skb->ip_summed = CHECKSUM_NONE;
    +
    + tcp_verify_left_out(tp);
    }
    }

    @@ -1970,8 +1975,10 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
    * packet to be MSS sized and all the
    * packet counting works out.
    */
    - if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
    + if (tcp_packets_in_flight(tp) >= tp->snd_cwnd) {
    + tcp_verify_left_out(tp);
    return;
    + }

    if (sacked & TCPCB_LOST) {
    if (!(sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))) {
    @@ -1997,6 +2004,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
    }
    }

    + tcp_verify_left_out(tp);
    +
    /* OK, demanded retransmission is finished. */

    /* Forward retransmissions are possible only during Recovery. */
    @@ -2054,6 +2063,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)

    NET_INC_STATS_BH(LINUX_MIB_TCPFORWARDRETRANS);
    }
    +
    + tcp_verify_left_out(tp);
    }

    /* Send a fin. The caller locks the socket for us. This cannot be
    --
    1.5.2.2


  5. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Thu, 28 Feb 2008 10:22:27 +0200 (EET) "Ilpo Järvinen" wrote:

    > [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y)
    >
    > ---
    > include/net/tcp.h | 9 +++-
    > net/ipv4/tcp_input.c | 18 +++++++-
    > net/ipv4/tcp_ipv4.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++++
    > net/ipv4/tcp_output.c | 23 +++++++--


    I'll put this in -mm, see if we can flush anything out. Please let me know
    if/when it's obsolete, updated, etc.

    What is "S+L"?
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Thu, 28 Feb 2008, Andrew Morton wrote:

    > On Thu, 28 Feb 2008 10:22:27 +0200 (EET) "Ilpo Järvinen" wrote:
    >
    > > [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y)
    > >
    > > ---
    > > include/net/tcp.h | 9 +++-
    > > net/ipv4/tcp_input.c | 18 +++++++-
    > > net/ipv4/tcp_ipv4.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++++
    > > net/ipv4/tcp_output.c | 23 +++++++--

    >
    > I'll put this in -mm, see if we can flush anything out. Please let me know
    > if/when it's obsolete, updated, etc.
    >
    > What is "S+L"?


    I'll let Ilpo give the definitive answer. But to test if I'm starting
    to grasp this, I'll give my understanding. I believe 'S' means that a
    transmitted TCP skb has been acknowledged by a SACK, while 'L' means
    that a transmitted SKB is believed lost. Since the 'S' state implies
    that the packet has actually been successfully received, it should not be
    possible for it to be considered lost ('L' state). Thus an "S+L" state
    for a TCP skb is an internally inconsistent state and an indication of
    a TCP bug.

    Anyone feel free to correct me if I'm way off base in my understanding.

    -Bill
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Thu, 28 Feb 2008, Bill Fink wrote:

    > On Thu, 28 Feb 2008, Andrew Morton wrote:
    >
    > > On Thu, 28 Feb 2008 10:22:27 +0200 (EET) "Ilpo Järvinen" wrote:
    > >
    > > > [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y)
    > > >
    > > > ---
    > > > include/net/tcp.h | 9 +++-
    > > > net/ipv4/tcp_input.c | 18 +++++++-
    > > > net/ipv4/tcp_ipv4.c | 127 +++++++++++++++++++++++++++++++++++++++++++++++++
    > > > net/ipv4/tcp_output.c | 23 +++++++--

    > >
    > > I'll put this in -mm, see if we can flush anything out.


    Ok, thanks. Were you aware of the considerable cpu consumption it will
    cause...? I.e., scanning throught the write queue in a number of place per
    ACK will certainly show up if somebody tests with netperf or so... ;-)
    ....Just please make sure it won't leak into mainline (for sure you would
    have done that without this explicit note :-)).

    Good thing in that debug patch is that it catches inconsistencies
    immediately when they happen even if the cheap trap (which is in mainline)
    wouldn't ever see them because the situation would correct itself due to
    some other event.

    > > Please let me know if/when it's obsolete, updated, etc.


    Ok. Since many seem to now reporting this, I suppose the cause is
    relatively easy to find.

    > > What is "S+L"?

    >
    > I'll let Ilpo give the definitive answer. But to test if I'm starting
    > to grasp this, I'll give my understanding. I believe 'S' means that a
    > transmitted TCP skb has been acknowledged by a SACK, while 'L' means
    > that a transmitted SKB is believed lost. Since the 'S' state implies
    > that the packet has actually been successfully received, it should not be
    > possible for it to be considered lost ('L' state). Thus an "S+L" state
    > for a TCP skb is an internally inconsistent state and an indication of
    > a TCP bug.
    >
    > Anyone feel free to correct me if I'm way off base in my understanding.


    Yes, this is exactly what it means. There's a big comment about them in
    the net/ipv4/tcp_input.c too. I answered to a similar question (but Bill
    mostly told all of it already):
    http://marc.info/?l=linux-netdev&m=120099888912383&w=2

    We can do only cheap checking for sacked_out+lost_out > packets_out in
    mainline, and if that's true those warnings get printed but they won't
    necessarily tell the location of the bug because there might be
    considerable "latency" before that check triggers. On the other hand, this
    S+L debug patch verifies skb's ->sacked bitmaps against sacked/lost_out
    counters in multiple places per ACK and will catch the inconsistencies
    immediately at the site where they occurred (even if sacked_out + lost_out
    would still be below or equal to packets_out).



    --
    i.

  8. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Thu, Feb 28, 2008 at 9:22 AM, Ilpo Järvinen
    wrote:
    > [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y)


    Bittorrenting with this patch applied floods my dmesg. Here is a log dump:

    http://guichaz.free.fr/tcp-debug.log.bz2 (2.3M compressed, 113M uncompressed)

    It does not contain any "KERNEL: assertion (packets <=
    tp->packets_out) failed at" line, so I'm afraid it's just noise.

    Cheers.

    --
    Guillaume
    N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±‘êçzX§¶›¡Ü ¨}©ž²Æ*zÚ&j:+v‰¨¾«‘êçzZ+€Ê+zf£¢·hšˆ§~†*†Ûiÿûàz¹®w¥¢¸?™¨è*Ú&¢)ߢf”ù^jÇ«y§m…á@A«a¶Úÿ 0¶ìh®å’i

  9. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Sun, 2 Mar 2008, Guillaume Chazarain wrote:

    > On Thu, Feb 28, 2008 at 9:22 AM, Ilpo Järvinen

    wrote:
    > > [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y)

    >
    > Bittorrenting with this patch applied floods my dmesg. Here is a log
    > dump:
    >
    > http://guichaz.free.fr/tcp-debug.log.bz2 (2.3M compressed, 113M
    > uncompressed)


    In future, please inline at least the first one of them, if not sure
    where to cut, too much won't hurt... :-)

    > It does not contain any "KERNEL: assertion (packets <=
    > tp->packets_out) failed at" line, so I'm afraid it's just noise.


    At least it catches one bug which could cause that assertion (it is much
    more rigid than the assertion and thus it catched it even though you
    won't see that assertion to ever trigger :-)).

    Could you next figure out what is at:
    [] tcp_ack+0x621/0xd2f



    --
    i.

  10. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Sun, 2 Mar 2008, Guillaume Chazarain wrote:

    >> On Sun, Mar 2, 2008 at 1:38 PM, Ilpo Järvinen wrote:
    >>> It does not contain any "KERNEL: assertion (packets <=
    >> > tp->packets_out) failed at" line, so I'm afraid it's just noise.


    Doh, you were right in this one...

    >> At least it catches one bug which could cause that assertion (it is much
    >> more rigid than the assertion and thus it catched it even though you
    >> won't see that assertion to ever trigger :-)).

    >
    > Great :-)


    ....I spoke too early, it was just that the verify call was placed into
    a place where the fackets_out is not yet reduced (I had too many version
    of that patch when I first did that and probably picked wrong one of
    them as a starting point, I'm sorry about that). I'll send an updated
    patch tomorrow for you and also correct it so that I don't need to ask
    things like this again (as long as one pastes couple of first occuring
    stacktraces):

    >>> Could you next figure out what is at:



    --
    i.

  11. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Sun, 2 Mar 2008, Ilpo Järvinen wrote:

    > On Sun, 2 Mar 2008, Guillaume Chazarain wrote:
    >
    > >> On Sun, Mar 2, 2008 at 1:38 PM, Ilpo Järvinen wrote:
    > >>> It does not contain any "KERNEL: assertion (packets <=
    > >> > tp->packets_out) failed at" line, so I'm afraid it's just noise.

    >
    > Doh, you were right in this one...
    >
    > >> At least it catches one bug which could cause that assertion (it is much
    > >> more rigid than the assertion and thus it catched it even though you
    > >> won't see that assertion to ever trigger :-)).

    > >
    > > Great :-)

    >
    > ...I spoke too early, it was just that the verify call was placed into
    > a place where the fackets_out is not yet reduced (I had too many version
    > of that patch when I first did that and probably picked wrong one of
    > them as a starting point, I'm sorry about that). I'll send an updated
    > patch tomorrow for you and also correct it so that I don't need to ask
    > things like this again (as long as one pastes couple of first occuring
    > stacktraces):


    Here is the updated version... One of the stacktraces near the beginning
    (2nd or later) should now contain the line where the trap fired rather
    than having to figure that out from the stacktrace with eip address.

    Andrew, this is a minor update for it which avoids spurious triggers in
    tcp_clean_rtx_queue.

    --
    i.

    [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y), v1.1

    Debugs sacked & lost skb inconstencies in TCP write queue &
    verifies fackets_out related variables.
    ---
    include/net/tcp.h | 10 +++-
    net/ipv4/tcp_input.c | 18 ++++++-
    net/ipv4/tcp_ipv4.c | 131 +++++++++++++++++++++++++++++++++++++++++++++++++
    net/ipv4/tcp_output.c | 23 ++++++--
    4 files changed, 174 insertions(+), 8 deletions(-)

    diff --git a/include/net/tcp.h b/include/net/tcp.h
    index 7de4ea3..19192d8 100644
    --- a/include/net/tcp.h
    +++ b/include/net/tcp.h
    @@ -272,6 +272,9 @@ DECLARE_SNMP_STAT(struct tcp_mib, tcp_statistics);
    #define TCP_ADD_STATS_BH(field, val) SNMP_ADD_STATS_BH(tcp_statistics, field, val)
    #define TCP_ADD_STATS_USER(field, val) SNMP_ADD_STATS_USER(tcp_statistics, field, val)

    +extern void tcp_print_queue(struct sock *sk);
    +extern int tcp_verify_wq(struct sock *sk);
    +
    extern void tcp_v4_err(struct sk_buff *skb, u32);

    extern void tcp_shutdown (struct sock *sk, int how);
    @@ -768,7 +771,12 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
    }

    /* Use define here intentionally to get WARN_ON location shown at the caller */
    -#define tcp_verify_left_out(tp) WARN_ON(tcp_left_out(tp) > tp->packets_out)
    +#define tcp_verify_left_out(tp) \
    + do {\
    + int res; \
    + res = tcp_verify_wq((struct sock *)tp); \
    + WARN_ON(res || tcp_left_out(tp) > tp->packets_out); \
    + } while(0)

    extern void tcp_enter_cwr(struct sock *sk, const int set_ssthresh);
    extern __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst);
    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index 19c449f..bb0bdda 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -1426,8 +1426,10 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb,
    int first_sack_index;

    if (!tp->sacked_out) {
    - if (WARN_ON(tp->fackets_out))
    + if (WARN_ON(tp->fackets_out)) {
    + tcp_verify_left_out(tp);
    tp->fackets_out = 0;
    + }
    tcp_highest_sack_reset(sk);
    }

    @@ -2136,6 +2138,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
    struct sk_buff *skb;
    int cnt;

    + tcp_verify_left_out(tp);
    +
    BUG_TRAP(packets <= tp->packets_out);
    if (tp->lost_skb_hint) {
    skb = tp->lost_skb_hint;
    @@ -2501,6 +2505,8 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    (tcp_fackets_out(tp) > tp->reordering));
    int fast_rexmit = 0;

    + tcp_verify_left_out(tp);
    +
    if (WARN_ON(!tp->packets_out && tp->sacked_out))
    tp->sacked_out = 0;
    if (WARN_ON(!tp->sacked_out && tp->fackets_out))
    @@ -2645,6 +2651,10 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    if (do_lost || (tcp_is_fack(tp) && tcp_head_timedout(sk)))
    tcp_update_scoreboard(sk, fast_rexmit);
    tcp_cwnd_down(sk, flag);
    +
    + WARN_ON(tcp_write_queue_head(sk) == NULL);
    + WARN_ON(!tp->packets_out);
    +
    tcp_xmit_retransmit_queue(sk);
    }

    @@ -2887,6 +2897,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets)
    }
    }

    + tcp_verify_left_out(tp);
    +
    #if FASTRETRANS_DEBUG > 0
    BUG_TRAP((int)tp->sacked_out >= 0);
    BUG_TRAP((int)tp->lost_out >= 0);
    @@ -3175,6 +3187,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
    prior_fackets = tp->fackets_out;
    prior_in_flight = tcp_packets_in_flight(tp);

    + tcp_verify_left_out(tp);
    +
    if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
    /* Window is constant, pure forward advance.
    * No more checks are required.
    @@ -3237,6 +3251,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
    if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
    dst_confirm(sk->sk_dst_cache);

    + tcp_verify_left_out(tp);
    +
    return 1;

    no_queue:
    diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
    index 00156bf..1a59b3c 100644
    --- a/net/ipv4/tcp_ipv4.c
    +++ b/net/ipv4/tcp_ipv4.c
    @@ -108,6 +108,137 @@ struct inet_hashinfo __cacheline_aligned tcp_hashinfo = {
    .lhash_wait = __WAIT_QUEUE_HEAD_INITIALIZER(tcp_hashinfo.lhash_w ait),
    };

    +void tcp_print_queue(struct sock *sk)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + struct sk_buff *skb;
    + char s[50+1];
    + char h[50+1];
    + int idx = 0;
    + int i;
    +
    + i = 0;
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + printk(KERN_ERR "head %u %p\n", i, skb);
    + else
    + printk(KERN_ERR "skb %u %p\n", i, skb);
    + i++;
    + }
    +
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + break;
    +
    + for (i = 0; i < tcp_skb_pcount(skb); i++) {
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
    + s[idx] = 'S';
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + s[idx] = 'B';
    +
    + } else if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
    + s[idx] = 'L';
    + } else {
    + s[idx] = ' ';
    + }
    + if (s[idx] != ' ' && skb->len < tp->mss_cache)
    + s[idx] += 'a' - 'A';
    +
    + if (i == 0) {
    + if (skb == tcp_highest_sack(sk))
    + h[idx] = 'h';
    + else
    + h[idx] = '+';
    + } else {
    + h[idx] = '-';
    + }
    +
    + if (++idx >= 50) {
    + s[idx] = 0;
    + h[idx] = 0;
    + printk(KERN_ERR "TCP wq(s) %s\n", s);
    + printk(KERN_ERR "TCP wq(h) %s\n", h);
    + idx = 0;
    + }
    + }
    + }
    + if (idx) {
    + s[idx] = '<';
    + s[idx+1] = 0;
    + h[idx] = '<';
    + h[idx+1] = 0;
    + printk(KERN_ERR "TCP wq(s) %s\n", s);
    + printk(KERN_ERR "TCP wq(h) %s\n", h);
    + }
    + printk(KERN_ERR "l%u s%u f%u p%u seq: su%u hs%u sn%u\n",
    + tp->lost_out, tp->sacked_out, tp->fackets_out, tp->packets_out,
    + tp->snd_una, tcp_highest_sack_seq(tp), tp->snd_nxt);
    +}
    +
    +int tcp_verify_wq(struct sock *sk)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + u32 lost = 0;
    + u32 sacked = 0;
    + u32 packets = 0;
    + u32 fackets = 0;
    + int hs_valid = 0;
    + int inconsitent = 0;
    + struct sk_buff *skb;
    +
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + break;
    +
    + if ((fackets == packets) && (skb == tp->highest_sack))
    + hs_valid = 1;
    +
    + packets += tcp_skb_pcount(skb);
    +
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
    + sacked += tcp_skb_pcount(skb);
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + printk(KERN_ERR "Sacked bitmap S+L: %u %u-%u/%u\n",
    + TCP_SKB_CB(skb)->sacked,
    + TCP_SKB_CB(skb)->end_seq - tp->snd_una,
    + TCP_SKB_CB(skb)->seq - tp->snd_una,
    + tp->snd_una);
    + fackets = packets;
    + hs_valid = 0;
    + }
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + lost += tcp_skb_pcount(skb);
    + }
    +
    + if ((fackets == packets) && (tp->highest_sack == tcp_send_head(sk)))
    + hs_valid = 1;
    +
    + if ((lost != tp->lost_out) ||
    + (tcp_is_sack(tp) && (sacked != tp->sacked_out)) ||
    + ((sacked || (tcp_is_sack(tp) && tp->sacked_out)) && !hs_valid) ||
    + (packets != tp->packets_out) ||
    + (fackets != tp->fackets_out) ||
    + tcp_left_out(tp) > tp->packets_out) {
    + printk(KERN_ERR "P: %u L: %u vs %u S: %u vs %u F: %u vs %u w: %u-%u (%u)\n",
    + tp->packets_out,
    + lost, tp->lost_out,
    + sacked, tp->sacked_out,
    + fackets, tp->fackets_out,
    + tp->snd_una, tp->snd_nxt,
    + tp->rx_opt.sack_ok);
    + tcp_print_queue(sk);
    + inconsistent = 1;
    + }
    +
    + WARN_ON(lost != tp->lost_out);
    + WARN_ON(tcp_is_sack(tp) && (sacked != tp->sacked_out));
    + WARN_ON(packets != tp->packets_out);
    + WARN_ON(fackets != tp->fackets_out);
    + WARN_ON(tcp_left_out(tp) > tp->packets_out);
    +
    + return inconsistent;
    +}
    +
    static inline __u32 tcp_v4_init_sequence(struct sk_buff *skb)
    {
    return secure_tcp_sequence_number(ip_hdr(skb)->daddr,
    diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
    index ed750f9..257de86 100644
    --- a/net/ipv4/tcp_output.c
    +++ b/net/ipv4/tcp_output.c
    @@ -779,10 +779,9 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
    tp->lost_out -= diff;

    /* Adjust Reno SACK estimate. */
    - if (tcp_is_reno(tp) && diff > 0) {
    + if (tcp_is_reno(tp) && diff > 0)
    tcp_dec_pcount_approx_int(&tp->sacked_out, diff);
    - tcp_verify_left_out(tp);
    - }
    +
    tcp_adjust_fackets_out(sk, skb, diff);
    }

    @@ -790,6 +789,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
    skb_header_release(buff);
    tcp_insert_write_queue_after(skb, buff, sk);

    + tcp_verify_left_out(tp);
    +
    return 0;
    }

    @@ -1463,6 +1464,7 @@ static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
    } else if (result > 0) {
    sent_pkts = 1;
    }
    + tcp_verify_left_out(tp);

    while ((skb = tcp_send_head(sk))) {
    unsigned int limit;
    @@ -1764,6 +1766,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb,
    tcp_clear_retrans_hints_partial(tp);

    sk_wmem_free_skb(sk, next_skb);
    + tcp_verify_left_out(tp);
    }

    /* Do a simple retransmit without using the backoff mechanisms in
    @@ -1795,13 +1798,13 @@ void tcp_simple_retransmit(struct sock *sk)
    }
    }

    + tcp_verify_left_out(tp);
    +
    tcp_clear_all_retrans_hints(tp);

    if (!lost)
    return;

    - tcp_verify_left_out(tp);
    -
    /* Don't muck with the congestion window here.
    * Reason is that we do not increase amount of _data_
    * in network, but units changed and effective
    @@ -1888,6 +1891,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
    tcp_init_nondata_skb(skb, TCP_SKB_CB(skb)->end_seq - 1,
    TCP_SKB_CB(skb)->flags);
    skb->ip_summed = CHECKSUM_NONE;
    +
    + tcp_verify_left_out(tp);
    }
    }

    @@ -1970,8 +1975,10 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
    * packet to be MSS sized and all the
    * packet counting works out.
    */
    - if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
    + if (tcp_packets_in_flight(tp) >= tp->snd_cwnd) {
    + tcp_verify_left_out(tp);
    return;
    + }

    if (sacked & TCPCB_LOST) {
    if (!(sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))) {
    @@ -1997,6 +2004,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
    }
    }

    + tcp_verify_left_out(tp);
    +
    /* OK, demanded retransmission is finished. */

    /* Forward retransmissions are possible only during Recovery. */
    @@ -2054,6 +2063,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)

    NET_INC_STATS_BH(LINUX_MIB_TCPFORWARDRETRANS);
    }
    +
    + tcp_verify_left_out(tp);
    }

    /* Send a fin. The caller locks the socket for us. This cannot be
    --
    1.5.2.2


  12. Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

    On Mon, 3 Mar 2008, Ilpo Järvinen wrote:

    > On Sun, 2 Mar 2008, Ilpo Järvinen wrote:
    >
    > > On Sun, 2 Mar 2008, Guillaume Chazarain wrote:
    > >
    > > >> On Sun, Mar 2, 2008 at 1:38 PM, Ilpo Järvinen wrote:
    > > >>> It does not contain any "KERNEL: assertion (packets <=
    > > >> > tp->packets_out) failed at" line, so I'm afraid it's just noise.

    > >
    > > Doh, you were right in this one...


    I did some filtering among those and found out that some still point out
    bug (there might be indication of tcp_mark_head_lost inconsistency as
    well but it's nearly impossible to track with all the incorporated noise
    which is causing which). ...I'll post the patch separately right after
    this.

    > > ...I spoke too early, it was just that the verify call was placed into
    > > a place where the fackets_out is not yet reduced (I had too many version
    > > of that patch when I first did that and probably picked wrong one of
    > > them as a starting point, I'm sorry about that). I'll send an updated
    > > patch tomorrow for you and also correct it so that I don't need to ask
    > > things like this again (as long as one pastes couple of first occuring
    > > stacktraces):

    >
    > Here is the updated version... One of the stacktraces near the beginning
    > (2nd or later) should now contain the line where the trap fired rather
    > than having to figure that out from the stacktrace with eip address.
    >
    > Andrew, this is a minor update for it which avoids spurious triggers in
    > tcp_clean_rtx_queue.


    Argh, I thought I compile tested it by make net/ipv4/tcp_*.o but found
    typo there even after that... :-(

    --
    i.

    [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y), v1.2

    Debugs sacked & lost skb inconstencies in TCP write queue &
    verifies fackets_out related variables.
    ---
    include/net/tcp.h | 10 +++-
    net/ipv4/tcp_input.c | 18 ++++++-
    net/ipv4/tcp_ipv4.c | 131 +++++++++++++++++++++++++++++++++++++++++++++++++
    net/ipv4/tcp_output.c | 23 ++++++--
    4 files changed, 174 insertions(+), 8 deletions(-)

    diff --git a/include/net/tcp.h b/include/net/tcp.h
    index 7de4ea3..19192d8 100644
    --- a/include/net/tcp.h
    +++ b/include/net/tcp.h
    @@ -272,6 +272,9 @@ DECLARE_SNMP_STAT(struct tcp_mib, tcp_statistics);
    #define TCP_ADD_STATS_BH(field, val) SNMP_ADD_STATS_BH(tcp_statistics, field, val)
    #define TCP_ADD_STATS_USER(field, val) SNMP_ADD_STATS_USER(tcp_statistics, field, val)

    +extern void tcp_print_queue(struct sock *sk);
    +extern int tcp_verify_wq(struct sock *sk);
    +
    extern void tcp_v4_err(struct sk_buff *skb, u32);

    extern void tcp_shutdown (struct sock *sk, int how);
    @@ -768,7 +771,12 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
    }

    /* Use define here intentionally to get WARN_ON location shown at the caller */
    -#define tcp_verify_left_out(tp) WARN_ON(tcp_left_out(tp) > tp->packets_out)
    +#define tcp_verify_left_out(tp) \
    + do {\
    + int res; \
    + res = tcp_verify_wq((struct sock *)tp); \
    + WARN_ON(res || tcp_left_out(tp) > tp->packets_out); \
    + } while(0)

    extern void tcp_enter_cwr(struct sock *sk, const int set_ssthresh);
    extern __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst);
    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index 19c449f..bb0bdda 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -1426,8 +1426,10 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb,
    int first_sack_index;

    if (!tp->sacked_out) {
    - if (WARN_ON(tp->fackets_out))
    + if (WARN_ON(tp->fackets_out)) {
    + tcp_verify_left_out(tp);
    tp->fackets_out = 0;
    + }
    tcp_highest_sack_reset(sk);
    }

    @@ -2136,6 +2138,8 @@ static void tcp_mark_head_lost(struct sock *sk, int packets, int fast_rexmit)
    struct sk_buff *skb;
    int cnt;

    + tcp_verify_left_out(tp);
    +
    BUG_TRAP(packets <= tp->packets_out);
    if (tp->lost_skb_hint) {
    skb = tp->lost_skb_hint;
    @@ -2501,6 +2505,8 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    (tcp_fackets_out(tp) > tp->reordering));
    int fast_rexmit = 0;

    + tcp_verify_left_out(tp);
    +
    if (WARN_ON(!tp->packets_out && tp->sacked_out))
    tp->sacked_out = 0;
    if (WARN_ON(!tp->sacked_out && tp->fackets_out))
    @@ -2645,6 +2651,10 @@ static void tcp_fastretrans_alert(struct sock *sk, int pkts_acked, int flag)
    if (do_lost || (tcp_is_fack(tp) && tcp_head_timedout(sk)))
    tcp_update_scoreboard(sk, fast_rexmit);
    tcp_cwnd_down(sk, flag);
    +
    + WARN_ON(tcp_write_queue_head(sk) == NULL);
    + WARN_ON(!tp->packets_out);
    +
    tcp_xmit_retransmit_queue(sk);
    }

    @@ -2887,6 +2897,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets)
    }
    }

    + tcp_verify_left_out(tp);
    +
    #if FASTRETRANS_DEBUG > 0
    BUG_TRAP((int)tp->sacked_out >= 0);
    BUG_TRAP((int)tp->lost_out >= 0);
    @@ -3175,6 +3187,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
    prior_fackets = tp->fackets_out;
    prior_in_flight = tcp_packets_in_flight(tp);

    + tcp_verify_left_out(tp);
    +
    if (!(flag & FLAG_SLOWPATH) && after(ack, prior_snd_una)) {
    /* Window is constant, pure forward advance.
    * No more checks are required.
    @@ -3237,6 +3251,8 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag)
    if ((flag & FLAG_FORWARD_PROGRESS) || !(flag & FLAG_NOT_DUP))
    dst_confirm(sk->sk_dst_cache);

    + tcp_verify_left_out(tp);
    +
    return 1;

    no_queue:
    diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
    index 00156bf..d36c67c 100644
    --- a/net/ipv4/tcp_ipv4.c
    +++ b/net/ipv4/tcp_ipv4.c
    @@ -108,6 +108,137 @@ struct inet_hashinfo __cacheline_aligned tcp_hashinfo = {
    .lhash_wait = __WAIT_QUEUE_HEAD_INITIALIZER(tcp_hashinfo.lhash_w ait),
    };

    +void tcp_print_queue(struct sock *sk)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + struct sk_buff *skb;
    + char s[50+1];
    + char h[50+1];
    + int idx = 0;
    + int i;
    +
    + i = 0;
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + printk(KERN_ERR "head %u %p\n", i, skb);
    + else
    + printk(KERN_ERR "skb %u %p\n", i, skb);
    + i++;
    + }
    +
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + break;
    +
    + for (i = 0; i < tcp_skb_pcount(skb); i++) {
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
    + s[idx] = 'S';
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + s[idx] = 'B';
    +
    + } else if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST) {
    + s[idx] = 'L';
    + } else {
    + s[idx] = ' ';
    + }
    + if (s[idx] != ' ' && skb->len < tp->mss_cache)
    + s[idx] += 'a' - 'A';
    +
    + if (i == 0) {
    + if (skb == tcp_highest_sack(sk))
    + h[idx] = 'h';
    + else
    + h[idx] = '+';
    + } else {
    + h[idx] = '-';
    + }
    +
    + if (++idx >= 50) {
    + s[idx] = 0;
    + h[idx] = 0;
    + printk(KERN_ERR "TCP wq(s) %s\n", s);
    + printk(KERN_ERR "TCP wq(h) %s\n", h);
    + idx = 0;
    + }
    + }
    + }
    + if (idx) {
    + s[idx] = '<';
    + s[idx+1] = 0;
    + h[idx] = '<';
    + h[idx+1] = 0;
    + printk(KERN_ERR "TCP wq(s) %s\n", s);
    + printk(KERN_ERR "TCP wq(h) %s\n", h);
    + }
    + printk(KERN_ERR "l%u s%u f%u p%u seq: su%u hs%u sn%u\n",
    + tp->lost_out, tp->sacked_out, tp->fackets_out, tp->packets_out,
    + tp->snd_una, tcp_highest_sack_seq(tp), tp->snd_nxt);
    +}
    +
    +int tcp_verify_wq(struct sock *sk)
    +{
    + struct tcp_sock *tp = tcp_sk(sk);
    + u32 lost = 0;
    + u32 sacked = 0;
    + u32 packets = 0;
    + u32 fackets = 0;
    + int hs_valid = 0;
    + int inconsistent = 0;
    + struct sk_buff *skb;
    +
    + tcp_for_write_queue(skb, sk) {
    + if (skb == tcp_send_head(sk))
    + break;
    +
    + if ((fackets == packets) && (skb == tp->highest_sack))
    + hs_valid = 1;
    +
    + packets += tcp_skb_pcount(skb);
    +
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) {
    + sacked += tcp_skb_pcount(skb);
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + printk(KERN_ERR "Sacked bitmap S+L: %u %u-%u/%u\n",
    + TCP_SKB_CB(skb)->sacked,
    + TCP_SKB_CB(skb)->end_seq - tp->snd_una,
    + TCP_SKB_CB(skb)->seq - tp->snd_una,
    + tp->snd_una);
    + fackets = packets;
    + hs_valid = 0;
    + }
    + if (TCP_SKB_CB(skb)->sacked & TCPCB_LOST)
    + lost += tcp_skb_pcount(skb);
    + }
    +
    + if ((fackets == packets) && (tp->highest_sack == tcp_send_head(sk)))
    + hs_valid = 1;
    +
    + if ((lost != tp->lost_out) ||
    + (tcp_is_sack(tp) && (sacked != tp->sacked_out)) ||
    + ((sacked || (tcp_is_sack(tp) && tp->sacked_out)) && !hs_valid) ||
    + (packets != tp->packets_out) ||
    + (fackets != tp->fackets_out) ||
    + tcp_left_out(tp) > tp->packets_out) {
    + printk(KERN_ERR "P: %u L: %u vs %u S: %u vs %u F: %u vs %u w: %u-%u (%u)\n",
    + tp->packets_out,
    + lost, tp->lost_out,
    + sacked, tp->sacked_out,
    + fackets, tp->fackets_out,
    + tp->snd_una, tp->snd_nxt,
    + tp->rx_opt.sack_ok);
    + tcp_print_queue(sk);
    + inconsistent = 1;
    + }
    +
    + WARN_ON(lost != tp->lost_out);
    + WARN_ON(tcp_is_sack(tp) && (sacked != tp->sacked_out));
    + WARN_ON(packets != tp->packets_out);
    + WARN_ON(fackets != tp->fackets_out);
    + WARN_ON(tcp_left_out(tp) > tp->packets_out);
    +
    + return inconsistent;
    +}
    +
    static inline __u32 tcp_v4_init_sequence(struct sk_buff *skb)
    {
    return secure_tcp_sequence_number(ip_hdr(skb)->daddr,
    diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
    index ed750f9..257de86 100644
    --- a/net/ipv4/tcp_output.c
    +++ b/net/ipv4/tcp_output.c
    @@ -779,10 +779,9 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
    tp->lost_out -= diff;

    /* Adjust Reno SACK estimate. */
    - if (tcp_is_reno(tp) && diff > 0) {
    + if (tcp_is_reno(tp) && diff > 0)
    tcp_dec_pcount_approx_int(&tp->sacked_out, diff);
    - tcp_verify_left_out(tp);
    - }
    +
    tcp_adjust_fackets_out(sk, skb, diff);
    }

    @@ -790,6 +789,8 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
    skb_header_release(buff);
    tcp_insert_write_queue_after(skb, buff, sk);

    + tcp_verify_left_out(tp);
    +
    return 0;
    }

    @@ -1463,6 +1464,7 @@ static int tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle)
    } else if (result > 0) {
    sent_pkts = 1;
    }
    + tcp_verify_left_out(tp);

    while ((skb = tcp_send_head(sk))) {
    unsigned int limit;
    @@ -1764,6 +1766,7 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb,
    tcp_clear_retrans_hints_partial(tp);

    sk_wmem_free_skb(sk, next_skb);
    + tcp_verify_left_out(tp);
    }

    /* Do a simple retransmit without using the backoff mechanisms in
    @@ -1795,13 +1798,13 @@ void tcp_simple_retransmit(struct sock *sk)
    }
    }

    + tcp_verify_left_out(tp);
    +
    tcp_clear_all_retrans_hints(tp);

    if (!lost)
    return;

    - tcp_verify_left_out(tp);
    -
    /* Don't muck with the congestion window here.
    * Reason is that we do not increase amount of _data_
    * in network, but units changed and effective
    @@ -1888,6 +1891,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
    tcp_init_nondata_skb(skb, TCP_SKB_CB(skb)->end_seq - 1,
    TCP_SKB_CB(skb)->flags);
    skb->ip_summed = CHECKSUM_NONE;
    +
    + tcp_verify_left_out(tp);
    }
    }

    @@ -1970,8 +1975,10 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
    * packet to be MSS sized and all the
    * packet counting works out.
    */
    - if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
    + if (tcp_packets_in_flight(tp) >= tp->snd_cwnd) {
    + tcp_verify_left_out(tp);
    return;
    + }

    if (sacked & TCPCB_LOST) {
    if (!(sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))) {
    @@ -1997,6 +2004,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
    }
    }

    + tcp_verify_left_out(tp);
    +
    /* OK, demanded retransmission is finished. */

    /* Forward retransmissions are possible only during Recovery. */
    @@ -2054,6 +2063,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)

    NET_INC_STATS_BH(LINUX_MIB_TCPFORWARDRETRANS);
    }
    +
    + tcp_verify_left_out(tp);
    }

    /* Send a fin. The caller locks the socket for us. This cannot be
    --
    1.5.2.2


  13. [PATCH net-2.6] [TCP]: Must count fack_count also when skipping

    On Mon, 3 Mar 2008, Ilpo Järvinen wrote:

    > On Mon, 3 Mar 2008, Ilpo Järvinen wrote:
    >
    > > On Sun, 2 Mar 2008, Ilpo Järvinen wrote:
    > >
    > > > On Sun, 2 Mar 2008, Guillaume Chazarain wrote:
    > > >
    > > > >> On Sun, Mar 2, 2008 at 1:38 PM, Ilpo Järvinen wrote:
    > > > >>> It does not contain any "KERNEL: assertion (packets <=
    > > > >> > tp->packets_out) failed at" line, so I'm afraid it's just noise.
    > > >
    > > > Doh, you were right in this one...

    >
    > I did some filtering among those and found out that some still point out
    > bug (there might be indication of tcp_mark_head_lost inconsistency as
    > well but it's nearly impossible to track with all the incorporated noise
    > which is causing which). ...I'll post the patch separately right after
    > this.


    Dave, at least this is needed for TCP packet counter correctness, there
    could be other one hiding still because this only makes fackets_out too
    small (there could be second order effect though), while people are seeing
    a result of too large fackets_out in tcp_mark_head_lost.

    It's a bit shame that I didn't notice this when I verified time-seqno
    graphs after that sacktag rewrite (it mostly results just a substle
    difference in the slope).

    With this & the debug patch I didn't get anything into my logs but others
    might be more successful if there are other bugs still to solve that
    require more sophisticated network conditions to occur.


    --
    i.

    [PATCH net-2.6] [TCP]: Must count fack_count also when skipping

    It makes fackets_out to grow too slowly compared with the
    real write queue.

    This shouldn't cause those BUG_TRAP(packets <= tp->packets_out)
    to trigger but how knows how such inconsistent fackets_out
    affects here and there around TCP when everything is nowadays
    assuming accurate fackets_out. So lets see if this silences
    them all.

    Reported by Guillaume Chazarain .

    Signed-off-by: Ilpo Järvinen
    ---
    net/ipv4/tcp_input.c | 14 +++++++++-----
    1 files changed, 9 insertions(+), 5 deletions(-)

    diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
    index 19c449f..7facdb0 100644
    --- a/net/ipv4/tcp_input.c
    +++ b/net/ipv4/tcp_input.c
    @@ -1367,7 +1367,7 @@ static struct sk_buff *tcp_sacktag_walk(struct sk_buff *skb, struct sock *sk,
    * a normal way
    */
    static struct sk_buff *tcp_sacktag_skip(struct sk_buff *skb, struct sock *sk,
    - u32 skip_to_seq)
    + u32 skip_to_seq, int *fack_count)
    {
    tcp_for_write_queue_from(skb, sk) {
    if (skb == tcp_send_head(sk))
    @@ -1375,6 +1375,8 @@ static struct sk_buff *tcp_sacktag_skip(struct sk_buff *skb, struct sock *sk,

    if (!before(TCP_SKB_CB(skb)->end_seq, skip_to_seq))
    break;
    +
    + *fack_count += tcp_skb_pcount(skb);
    }
    return skb;
    }
    @@ -1390,7 +1392,7 @@ static struct sk_buff *tcp_maybe_skipping_dsack(struct sk_buff *skb,
    return skb;

    if (before(next_dup->start_seq, skip_to_seq)) {
    - skb = tcp_sacktag_skip(skb, sk, next_dup->start_seq);
    + skb = tcp_sacktag_skip(skb, sk, next_dup->start_seq, fack_count);
    tcp_sacktag_walk(skb, sk, NULL,
    next_dup->start_seq, next_dup->end_seq,
    1, fack_count, reord, flag);
    @@ -1537,7 +1539,8 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb,

    /* Head todo? */
    if (before(start_seq, cache->start_seq)) {
    - skb = tcp_sacktag_skip(skb, sk, start_seq);
    + skb = tcp_sacktag_skip(skb, sk, start_seq,
    + &fack_count);
    skb = tcp_sacktag_walk(skb, sk, next_dup,
    start_seq,
    cache->start_seq,
    @@ -1565,7 +1568,8 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb,
    goto walk;
    }

    - skb = tcp_sacktag_skip(skb, sk, cache->end_seq);
    + skb = tcp_sacktag_skip(skb, sk, cache->end_seq,
    + &fack_count);
    /* Check overlap against next cached too (past this one already) */
    cache++;
    continue;
    @@ -1577,7 +1581,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb,
    break;
    fack_count = tp->fackets_out;
    }
    - skb = tcp_sacktag_skip(skb, sk, start_seq);
    + skb = tcp_sacktag_skip(skb, sk, start_seq, &fack_count);

    walk:
    skb = tcp_sacktag_walk(skb, sk, next_dup, start_seq, end_seq,
    --
    1.5.2.2


  14. Re: [PATCH net-2.6] [TCP]: Must count fack_count also when skipping

    From: "Ilpo_Järvinen"
    Date: Mon, 3 Mar 2008 15:53:12 +0200 (EET)

    > [PATCH net-2.6] [TCP]: Must count fack_count also when skipping
    >
    > It makes fackets_out to grow too slowly compared with the
    > real write queue.
    >
    > This shouldn't cause those BUG_TRAP(packets <= tp->packets_out)
    > to trigger but how knows how such inconsistent fackets_out
    > affects here and there around TCP when everything is nowadays
    > assuming accurate fackets_out. So lets see if this silences
    > them all.
    >
    > Reported by Guillaume Chazarain .
    >
    > Signed-off-by: Ilpo Järvinen


    Applied, thanks Ilpo.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  15. Re: [PATCH net-2.6] [TCP]: Must count fack_count also when skipping

    On Mon, 03 Mar 2008 15:53:12 +0200, Ilpo Järvinen
    wrote:

    > [PATCH net-2.6] [TCP]: Must count fack_count also when skipping
    >
    > It makes fackets_out to grow too slowly compared with the real write
    > queue.
    >
    > This shouldn't cause those BUG_TRAP(packets <= tp->packets_out) to
    > trigger but how knows how such inconsistent fackets_out affects here and
    > there around TCP when everything is nowadays assuming accurate
    > fackets_out. So lets see if this silences them all.
    >
    > Reported by Guillaume Chazarain .


    Will this patch be applied to 2.6.24 stable? I think I have been hit by
    the same problem recently:

    WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()
    Pid: 16959, comm: X Tainted: P 2.6.24.3-desktop-3mnb #1

    Call Trace:
    [] tcp_ack+0x180c/0x1d60
    [] _read_lock_bh+0x9/0x20
    [] tcp_rcv_state_process+0x3b5/0xd00
    [] tcp_v4_do_rcv+0xc8/0x3f0
    [] :nf_conntrack_ipv4:ipv4_confirm+0x33/0x60
    [] nf_iterate+0x66/0xc0
    [] tcp_v4_rcv+0x898/0xaf0
    [] ip_local_deliver_finish+0xc3/0x250
    [] ip_rcv_finish+0x114/0x3b0
    [] ip_rcv+0x205/0x2f0
    [] netif_receive_skb+0x3ac/0x490
    [] :forcedeth:nv_napi_poll+0xf9/0x850
    [] net_rx_action+0x128/0x230
    [] __do_softirq+0x75/0xe0
    [] call_softirq+0x1c/0x30
    [] do_softirq+0x35/0x90
    [] irq_exit+0x88/0x90
    [] do_IRQ+0x80/0x100
    [] ret_from_intr+0x0/0xa

    WARNING: at net/ipv4/tcp_input.c:2413 tcp_fastretrans_alert()
    Pid: 4053, comm: rsyslogd Tainted: P 2.6.24.3-desktop-3mnb #1

    Call Trace:
    [] tcp_ack+0x1d2d/0x1d60
    [] _read_lock_bh+0x9/0x20
    [] tcp_rcv_state_process+0x3b5/0xd00
    [] tcp_v4_do_rcv+0xc8/0x3f0
    [] :nf_conntrack_ipv4:ipv4_confirm+0x33/0x60
    [] nf_iterate+0x66/0xc0
    [] tcp_v4_rcv+0x898/0xaf0
    [] ip_local_deliver_finish+0xc3/0x250
    [] ip_rcv_finish+0x114/0x3b0
    [] ip_rcv+0x205/0x2f0
    [] netif_receive_skb+0x3ac/0x490
    [] :forcedeth:nv_napi_poll+0xf9/0x850
    [] net_rx_action+0x128/0x230
    [] __do_softirq+0x75/0xe0
    [] call_softirq+0x1c/0x30
    [] do_softirq+0x35/0x90
    [] irq_exit+0x88/0x90
    [] do_IRQ+0x80/0x100
    [] ret_from_intr+0x0/0xa


    --
    Frederik Himpe

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  16. Re: [PATCH net-2.6] [TCP]: Must count fack_count also when skipping

    From: Frederik Himpe
    Date: Mon, 24 Mar 2008 20:36:08 +0000 (UTC)

    > On Mon, 03 Mar 2008 15:53:12 +0200, Ilpo Järvinen
    > wrote:
    >
    > > [PATCH net-2.6] [TCP]: Must count fack_count also when skipping
    > >
    > > It makes fackets_out to grow too slowly compared with the real write
    > > queue.
    > >
    > > This shouldn't cause those BUG_TRAP(packets <= tp->packets_out) to
    > > trigger but how knows how such inconsistent fackets_out affects here and
    > > there around TCP when everything is nowadays assuming accurate
    > > fackets_out. So lets see if this silences them all.
    > >
    > > Reported by Guillaume Chazarain .

    >
    > Will this patch be applied to 2.6.24 stable? I think I have been hit by
    > the same problem recently:


    I'll push it to the -stable folks for their next release
    since people are actively hitting it.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  17. Re: [PATCH net-2.6] [TCP]: Must count fack_count also when skipping

    On Mon, 24 Mar 2008, David Miller wrote:

    > From: Frederik Himpe
    > Date: Mon, 24 Mar 2008 20:36:08 +0000 (UTC)
    >
    > > On Mon, 03 Mar 2008 15:53:12 +0200, Ilpo Järvinen
    > > wrote:
    > >
    > > > [PATCH net-2.6] [TCP]: Must count fack_count also when skipping
    > > >
    > > > It makes fackets_out to grow too slowly compared with the real write
    > > > queue.
    > > >
    > > > This shouldn't cause those BUG_TRAP(packets <= tp->packets_out) to
    > > > trigger but how knows how such inconsistent fackets_out affects here and
    > > > there around TCP when everything is nowadays assuming accurate
    > > > fackets_out. So lets see if this silences them all.
    > > >
    > > > Reported by Guillaume Chazarain .

    > >
    > > Will this patch be applied to 2.6.24 stable? I think I have been hit by
    > > the same problem recently:

    >
    > I'll push it to the -stable folks for their next release
    > since people are actively hitting it.


    Please don't, it's not the right fix, it fixed a bug that was
    introduced post 2.6.24 by this commit:

    commit 68f8353b480e5f2e136c38a511abdbb88eaa8ce2
    Author: Ilpo Järvinen
    Date: Thu Nov 15 19:50:37 2007 -0800

    [TCP]: Rewrite SACK block processing & sack_recv_cache use


    There's something else wrong with the 2.6.24.y. I already knew that and
    was therefore planning next to run extensive set of tests on 2.6.24ish
    kernel with some torrent mixed with some netem stimuli but haven't yet had
    time for that as I had to find and resolve an hw incompatibility issues
    between ddr2s and mobo before being a week away. I'll post an 2.6.24.y
    adapted TCP debug patch once I get that done for my tests (in case
    somebody else is interested in running with it besides me).

    Btw, it would have been polite to cc me as well (I suppose you just didn't
    notice that somebody dropped me in between :-)), not a big prob though as
    I found it out early anyway because I'm trying to catch up what has been
    reported against TCP during my week away.


    --
    i.

  18. Re: [PATCH net-2.6] [TCP]: Must count fack_count also when skipping

    From: "Ilpo_Järvinen"
    Date: Tue, 25 Mar 2008 23:24:55 +0200 (EET)

    > Please don't, it's not the right fix, it fixed a bug that was
    > introduced post 2.6.24 by this commit:


    Ok.

    > Btw, it would have been polite to cc me as well (I suppose you just didn't
    > notice that somebody dropped me in between :-)), not a big prob though as
    > I found it out early anyway because I'm trying to catch up what has been
    > reported against TCP during my week away.


    Sorry, my bad.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread