2.6.24-rc6: possible recursive locking detected - Kernel

This is a discussion on 2.6.24-rc6: possible recursive locking detected - Kernel ; hi, a few minutes after upgrading from -rc5 to -rc6 I got: [ 1310.670986] ============================================= [ 1310.671690] [ INFO: possible recursive locking detected ] [ 1310.672097] 2.6.24-rc6 #1 [ 1310.672421] --------------------------------------------- [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock: [ ...

+ Reply to Thread
Results 1 to 17 of 17

Thread: 2.6.24-rc6: possible recursive locking detected

  1. 2.6.24-rc6: possible recursive locking detected

    hi,

    a few minutes after upgrading from -rc5 to -rc6 I got:

    [ 1310.670986] =============================================
    [ 1310.671690] [ INFO: possible recursive locking detected ]
    [ 1310.672097] 2.6.24-rc6 #1
    [ 1310.672421] ---------------------------------------------
    [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
    [ 1310.673238] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    [ 1310.673869]
    [ 1310.673870] but task is already holding lock:
    [ 1310.674567] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    [ 1310.675267]
    [ 1310.675268] other info that might help us debug this:
    [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
    [ 1310.676334] #0: (rcu_read_lock){..--}, at: [] net_rx_action+0x60/0x1b0
    [ 1310.677251] #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x100/0x470
    [ 1310.677924] #2: (rcu_read_lock){..--}, at: [] ip_local_deliver_finish+0x32/0x210
    [ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [] sock_def_readable+0x1e/0x80
    [ 1310.679250] #4: (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    [ 1310.680151]
    [ 1310.680152] stack backtrace:
    [ 1310.680772] Pid: 3692, comm: FahCore_a0.exe Not tainted 2.6.24-rc6 #1
    [ 1310.681209] [] show_trace_log_lvl+0x1a/0x30
    [ 1310.681659] [] show_trace+0x12/0x20
    [ 1310.682085] [] dump_stack+0x6a/0x70
    [ 1310.682512] [] __lock_acquire+0x971/0x10c0
    [ 1310.682961] [] lock_acquire+0x5e/0x80
    [ 1310.683392] [] _spin_lock_irqsave+0x38/0x50
    [ 1310.683914] [] __wake_up+0x1b/0x50
    [ 1310.684337] [] ep_poll_safewake+0x9a/0xc0
    [ 1310.684822] [] ep_poll_callback+0x8b/0xe0
    [ 1310.685265] [] __wake_up_common+0x48/0x70
    [ 1310.685712] [] __wake_up+0x37/0x50
    [ 1310.686136] [] sock_def_readable+0x7a/0x80
    [ 1310.686579] [] sock_queue_rcv_skb+0xeb/0x150
    [ 1310.687028] [] udp_queue_rcv_skb+0x139/0x2a0
    [ 1310.687554] [] __udp4_lib_rcv+0x2f1/0x7e0
    [ 1310.687996] [] udp_rcv+0x12/0x20
    [ 1310.688415] [] ip_local_deliver_finish+0x125/0x210
    [ 1310.688881] [] ip_local_deliver+0x2d/0x90
    [ 1310.689323] [] ip_rcv_finish+0xeb/0x300
    [ 1310.689760] [] ip_rcv+0x195/0x230
    [ 1310.690182] [] netif_receive_skb+0x37c/0x470
    [ 1310.690632] [] process_backlog+0x69/0xc0
    [ 1310.691175] [] net_rx_action+0x137/0x1b0
    [ 1310.691681] [] __do_softirq+0x52/0xb0
    [ 1310.692006] [] do_softirq+0x94/0xe0
    [ 1310.692301] =======================


    This is a single CPU machine, and the box was quite busy due to disk I/O
    (load 6-8). The machine continues to run and all is well now. Even the
    application mentioned above (FahCore_a0.exe) is running fine
    ("Folding@Home", cpu bound). The binary is located on an jfs filesystem,
    which was also under heavy I/O. Can someone tell me why the backtrace
    shows so much net* stuff? There was not much net I/O...

    more details and .config: http://nerdbynature.de/bits/2.6.24-rc6

    Thanks,
    Christian.
    --
    BOFH excuse #312:

    incompatible bit-registration operators
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: 2.6.24-rc6: possible recursive locking detected

    [Added some CCs]

    On Thursday, 3 of January 2008, Christian Kujau wrote:
    > hi,
    >
    > a few minutes after upgrading from -rc5 to -rc6 I got:
    >
    > [ 1310.670986] =============================================
    > [ 1310.671690] [ INFO: possible recursive locking detected ]
    > [ 1310.672097] 2.6.24-rc6 #1
    > [ 1310.672421] ---------------------------------------------
    > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
    > [ 1310.673238] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > [ 1310.673869]
    > [ 1310.673870] but task is already holding lock:
    > [ 1310.674567] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > [ 1310.675267]
    > [ 1310.675268] other info that might help us debug this:
    > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
    > [ 1310.676334] #0: (rcu_read_lock){..--}, at: [] net_rx_action+0x60/0x1b0
    > [ 1310.677251] #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x100/0x470
    > [ 1310.677924] #2: (rcu_read_lock){..--}, at: [] ip_local_deliver_finish+0x32/0x210
    > [ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [] sock_def_readable+0x1e/0x80
    > [ 1310.679250] #4: (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > [ 1310.680151]
    > [ 1310.680152] stack backtrace:
    > [ 1310.680772] Pid: 3692, comm: FahCore_a0.exe Not tainted 2.6.24-rc6 #1
    > [ 1310.681209] [] show_trace_log_lvl+0x1a/0x30
    > [ 1310.681659] [] show_trace+0x12/0x20
    > [ 1310.682085] [] dump_stack+0x6a/0x70
    > [ 1310.682512] [] __lock_acquire+0x971/0x10c0
    > [ 1310.682961] [] lock_acquire+0x5e/0x80
    > [ 1310.683392] [] _spin_lock_irqsave+0x38/0x50
    > [ 1310.683914] [] __wake_up+0x1b/0x50
    > [ 1310.684337] [] ep_poll_safewake+0x9a/0xc0
    > [ 1310.684822] [] ep_poll_callback+0x8b/0xe0
    > [ 1310.685265] [] __wake_up_common+0x48/0x70
    > [ 1310.685712] [] __wake_up+0x37/0x50
    > [ 1310.686136] [] sock_def_readable+0x7a/0x80
    > [ 1310.686579] [] sock_queue_rcv_skb+0xeb/0x150
    > [ 1310.687028] [] udp_queue_rcv_skb+0x139/0x2a0
    > [ 1310.687554] [] __udp4_lib_rcv+0x2f1/0x7e0
    > [ 1310.687996] [] udp_rcv+0x12/0x20
    > [ 1310.688415] [] ip_local_deliver_finish+0x125/0x210
    > [ 1310.688881] [] ip_local_deliver+0x2d/0x90
    > [ 1310.689323] [] ip_rcv_finish+0xeb/0x300
    > [ 1310.689760] [] ip_rcv+0x195/0x230
    > [ 1310.690182] [] netif_receive_skb+0x37c/0x470
    > [ 1310.690632] [] process_backlog+0x69/0xc0
    > [ 1310.691175] [] net_rx_action+0x137/0x1b0
    > [ 1310.691681] [] __do_softirq+0x52/0xb0
    > [ 1310.692006] [] do_softirq+0x94/0xe0
    > [ 1310.692301] =======================
    >
    >
    > This is a single CPU machine, and the box was quite busy due to disk I/O
    > (load 6-8). The machine continues to run and all is well now. Even the
    > application mentioned above (FahCore_a0.exe) is running fine
    > ("Folding@Home", cpu bound). The binary is located on an jfs filesystem,
    > which was also under heavy I/O. Can someone tell me why the backtrace
    > shows so much net* stuff? There was not much net I/O...
    >
    > more details and .config: http://nerdbynature.de/bits/2.6.24-rc6
    >
    > Thanks,
    > Christian.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: 2.6.24-rc6: possible recursive locking detected



    [ Added even more CCs :-) Seems eventpoll & net related. ]

    * Rafael J. Wysocki wrote:

    > [Added some CCs]
    >
    > On Thursday, 3 of January 2008, Christian Kujau wrote:
    > > hi,
    > >
    > > a few minutes after upgrading from -rc5 to -rc6 I got:
    > >
    > > [ 1310.670986] =============================================
    > > [ 1310.671690] [ INFO: possible recursive locking detected ]
    > > [ 1310.672097] 2.6.24-rc6 #1
    > > [ 1310.672421] ---------------------------------------------
    > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
    > > [ 1310.673238] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > [ 1310.673869]
    > > [ 1310.673870] but task is already holding lock:
    > > [ 1310.674567] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > [ 1310.675267]
    > > [ 1310.675268] other info that might help us debug this:
    > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
    > > [ 1310.676334] #0: (rcu_read_lock){..--}, at: [] net_rx_action+0x60/0x1b0
    > > [ 1310.677251] #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x100/0x470
    > > [ 1310.677924] #2: (rcu_read_lock){..--}, at: [] ip_local_deliver_finish+0x32/0x210
    > > [ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [] sock_def_readable+0x1e/0x80
    > > [ 1310.679250] #4: (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > [ 1310.680151]
    > > [ 1310.680152] stack backtrace:
    > > [ 1310.680772] Pid: 3692, comm: FahCore_a0.exe Not tainted 2.6.24-rc6 #1
    > > [ 1310.681209] [] show_trace_log_lvl+0x1a/0x30
    > > [ 1310.681659] [] show_trace+0x12/0x20
    > > [ 1310.682085] [] dump_stack+0x6a/0x70
    > > [ 1310.682512] [] __lock_acquire+0x971/0x10c0
    > > [ 1310.682961] [] lock_acquire+0x5e/0x80
    > > [ 1310.683392] [] _spin_lock_irqsave+0x38/0x50
    > > [ 1310.683914] [] __wake_up+0x1b/0x50
    > > [ 1310.684337] [] ep_poll_safewake+0x9a/0xc0
    > > [ 1310.684822] [] ep_poll_callback+0x8b/0xe0
    > > [ 1310.685265] [] __wake_up_common+0x48/0x70
    > > [ 1310.685712] [] __wake_up+0x37/0x50
    > > [ 1310.686136] [] sock_def_readable+0x7a/0x80
    > > [ 1310.686579] [] sock_queue_rcv_skb+0xeb/0x150
    > > [ 1310.687028] [] udp_queue_rcv_skb+0x139/0x2a0
    > > [ 1310.687554] [] __udp4_lib_rcv+0x2f1/0x7e0
    > > [ 1310.687996] [] udp_rcv+0x12/0x20
    > > [ 1310.688415] [] ip_local_deliver_finish+0x125/0x210
    > > [ 1310.688881] [] ip_local_deliver+0x2d/0x90
    > > [ 1310.689323] [] ip_rcv_finish+0xeb/0x300
    > > [ 1310.689760] [] ip_rcv+0x195/0x230
    > > [ 1310.690182] [] netif_receive_skb+0x37c/0x470
    > > [ 1310.690632] [] process_backlog+0x69/0xc0
    > > [ 1310.691175] [] net_rx_action+0x137/0x1b0
    > > [ 1310.691681] [] __do_softirq+0x52/0xb0
    > > [ 1310.692006] [] do_softirq+0x94/0xe0
    > > [ 1310.692301] =======================
    > >
    > >
    > > This is a single CPU machine, and the box was quite busy due to disk I/O
    > > (load 6-8). The machine continues to run and all is well now. Even the
    > > application mentioned above (FahCore_a0.exe) is running fine
    > > ("Folding@Home", cpu bound). The binary is located on an jfs filesystem,
    > > which was also under heavy I/O. Can someone tell me why the backtrace
    > > shows so much net* stuff? There was not much net I/O...
    > >
    > > more details and .config: http://nerdbynature.de/bits/2.6.24-rc6
    > >
    > > Thanks,
    > > Christian.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: 2.6.24-rc6: possible recursive locking detected

    On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
    >
    > > > [ 1310.670986] =============================================
    > > > [ 1310.671690] [ INFO: possible recursive locking detected ]
    > > > [ 1310.672097] 2.6.24-rc6 #1
    > > > [ 1310.672421] ---------------------------------------------
    > > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
    > > > [ 1310.673238] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > [ 1310.673869]
    > > > [ 1310.673870] but task is already holding lock:
    > > > [ 1310.674567] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > [ 1310.675267]
    > > > [ 1310.675268] other info that might help us debug this:
    > > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
    > > > [ 1310.676334] #0: (rcu_read_lock){..--}, at: [] net_rx_action+0x60/0x1b0
    > > > [ 1310.677251] #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x100/0x470
    > > > [ 1310.677924] #2: (rcu_read_lock){..--}, at: [] ip_local_deliver_finish+0x32/0x210
    > > > [ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [] sock_def_readable+0x1e/0x80
    > > > [ 1310.679250] #4: (&q->lock){++..}, at: [] __wake_up+0x1b/0x50


    The net part might just be a red herring, since the problem is that
    __wake_up is somehow reentering itself.

    Cheers,
    --
    Visit Openswan at http://www.openswan.org/
    Email: Herbert Xu ~{PmV>HI~}
    Home Page: http://gondor.apana.org.au/~herbert/
    PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: 2.6.24-rc6: possible recursive locking detected


    On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
    > On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
    > >
    > > > > [ 1310.670986] =============================================
    > > > > [ 1310.671690] [ INFO: possible recursive locking detected ]
    > > > > [ 1310.672097] 2.6.24-rc6 #1
    > > > > [ 1310.672421] ---------------------------------------------
    > > > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
    > > > > [ 1310.673238] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > > [ 1310.673869]
    > > > > [ 1310.673870] but task is already holding lock:
    > > > > [ 1310.674567] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > > [ 1310.675267]
    > > > > [ 1310.675268] other info that might help us debug this:
    > > > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
    > > > > [ 1310.676334] #0: (rcu_read_lock){..--}, at: [] net_rx_action+0x60/0x1b0
    > > > > [ 1310.677251] #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x100/0x470
    > > > > [ 1310.677924] #2: (rcu_read_lock){..--}, at: [] ip_local_deliver_finish+0x32/0x210
    > > > > [ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [] sock_def_readable+0x1e/0x80
    > > > > [ 1310.679250] #4: (&q->lock){++..}, at: [] __wake_up+0x1b/0x50

    >
    > The net part might just be a red herring, since the problem is that
    > __wake_up is somehow reentering itself.


    /*
    * Perform a safe wake up of the poll wait list. The problem is that
    * with the new callback'd wake up system, it is possible that the
    * poll callback is reentered from inside the call to wake_up() done
    * on the poll wait queue head. The rule is that we cannot reenter the
    * wake up code from the same task more than EP_MAX_POLLWAKE_NESTS times,
    * and we cannot reenter the same wait queue head at all. This will
    * enable to have a hierarchy of epoll file descriptor of no more than
    * EP_MAX_POLLWAKE_NESTS deep. We need the irq version of the spin lock
    * because this one gets called by the poll callback, that in turn is called
    * from inside a wake_up(), that might be called from irq context.
    */

    Seems to suggest that the epoll code can indeed recurse into wakeup.

    Davide, Johannes, any ideas?

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: 2.6.24-rc6: possible recursive locking detected


    On Sat, 2008-01-05 at 17:53 +0100, Peter Zijlstra wrote:
    > On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
    > > On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
    > > >
    > > > > > [ 1310.670986] =============================================
    > > > > > [ 1310.671690] [ INFO: possible recursive locking detected ]
    > > > > > [ 1310.672097] 2.6.24-rc6 #1
    > > > > > [ 1310.672421] ---------------------------------------------
    > > > > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
    > > > > > [ 1310.673238] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > > > [ 1310.673869]
    > > > > > [ 1310.673870] but task is already holding lock:
    > > > > > [ 1310.674567] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > > > [ 1310.675267]
    > > > > > [ 1310.675268] other info that might help us debug this:
    > > > > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
    > > > > > [ 1310.676334] #0: (rcu_read_lock){..--}, at: [] net_rx_action+0x60/0x1b0
    > > > > > [ 1310.677251] #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x100/0x470
    > > > > > [ 1310.677924] #2: (rcu_read_lock){..--}, at: [] ip_local_deliver_finish+0x32/0x210
    > > > > > [ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [] sock_def_readable+0x1e/0x80
    > > > > > [ 1310.679250] #4: (&q->lock){++..}, at: [] __wake_up+0x1b/0x50

    > >
    > > The net part might just be a red herring, since the problem is that
    > > __wake_up is somehow reentering itself.

    >
    > /*
    > * Perform a safe wake up of the poll wait list. The problem is that
    > * with the new callback'd wake up system, it is possible that the
    > * poll callback is reentered from inside the call to wake_up() done
    > * on the poll wait queue head. The rule is that we cannot reenter the
    > * wake up code from the same task more than EP_MAX_POLLWAKE_NESTS times,
    > * and we cannot reenter the same wait queue head at all. This will
    > * enable to have a hierarchy of epoll file descriptor of no more than
    > * EP_MAX_POLLWAKE_NESTS deep. We need the irq version of the spin lock
    > * because this one gets called by the poll callback, that in turn is called
    > * from inside a wake_up(), that might be called from irq context.
    > */
    >
    > Seems to suggest that the epoll code can indeed recurse into wakeup.
    >
    > Davide, Johannes, any ideas?


    Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
    something like:

    wake_up_nested(..., wake_nests);

    although I'm not quite sure that is correct, my understanding of this
    code is still fragile at best.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  7. Re: 2.6.24-rc6: possible recursive locking detected

    On Sat, 5 Jan 2008, Peter Zijlstra wrote:

    >
    > On Sat, 2008-01-05 at 17:53 +0100, Peter Zijlstra wrote:
    > > On Sat, 2008-01-05 at 18:12 +1100, Herbert Xu wrote:
    > > > On Fri, Jan 04, 2008 at 09:30:49AM +0100, Ingo Molnar wrote:
    > > > >
    > > > > > > [ 1310.670986] =============================================
    > > > > > > [ 1310.671690] [ INFO: possible recursive locking detected ]
    > > > > > > [ 1310.672097] 2.6.24-rc6 #1
    > > > > > > [ 1310.672421] ---------------------------------------------
    > > > > > > [ 1310.672828] FahCore_a0.exe/3692 is trying to acquire lock:
    > > > > > > [ 1310.673238] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > > > > [ 1310.673869]
    > > > > > > [ 1310.673870] but task is already holding lock:
    > > > > > > [ 1310.674567] (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > > > > > [ 1310.675267]
    > > > > > > [ 1310.675268] other info that might help us debug this:
    > > > > > > [ 1310.675952] 5 locks held by FahCore_a0.exe/3692:
    > > > > > > [ 1310.676334] #0: (rcu_read_lock){..--}, at: [] net_rx_action+0x60/0x1b0
    > > > > > > [ 1310.677251] #1: (rcu_read_lock){..--}, at: [] netif_receive_skb+0x100/0x470
    > > > > > > [ 1310.677924] #2: (rcu_read_lock){..--}, at: [] ip_local_deliver_finish+0x32/0x210
    > > > > > > [ 1310.678460] #3: (clock-AF_INET){-.-?}, at: [] sock_def_readable+0x1e/0x80
    > > > > > > [ 1310.679250] #4: (&q->lock){++..}, at: [] __wake_up+0x1b/0x50
    > > >
    > > > The net part might just be a red herring, since the problem is that
    > > > __wake_up is somehow reentering itself.

    > >
    > > /*
    > > * Perform a safe wake up of the poll wait list. The problem is that
    > > * with the new callback'd wake up system, it is possible that the
    > > * poll callback is reentered from inside the call to wake_up() done
    > > * on the poll wait queue head. The rule is that we cannot reenter the
    > > * wake up code from the same task more than EP_MAX_POLLWAKE_NESTS times,
    > > * and we cannot reenter the same wait queue head at all. This will
    > > * enable to have a hierarchy of epoll file descriptor of no more than
    > > * EP_MAX_POLLWAKE_NESTS deep. We need the irq version of the spin lock
    > > * because this one gets called by the poll callback, that in turn is called
    > > * from inside a wake_up(), that might be called from irq context.
    > > */
    > >
    > > Seems to suggest that the epoll code can indeed recurse into wakeup.
    > >
    > > Davide, Johannes, any ideas?

    >
    > Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
    > something like:
    >
    > wake_up_nested(..., wake_nests);
    >
    > although I'm not quite sure that is correct, my understanding of this
    > code is still fragile at best.


    I remember I talked with Arjan about this time ago. Basically, since 1)
    you can drop an epoll fd inside another epoll fd 2) callback-based wakeups
    are used, you can see a wake_up() from inside another wake_up(), but they
    will never refer to the same lock instance.
    Think about:

    dfd = socket(...);
    efd1 = epoll_create();
    efd2 = epoll_create();
    epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
    epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);

    When a packet arrives to the device underneath "dfd", the net code will
    issue a wake_up() on its poll wake list. Epoll (efd1) has installed a
    callback wakeup entry on that queue, and the wake_up() performed by the
    "dfd" net code will end up in ep_poll_callback(). At this point epoll
    (efd1) notices that it may have some event ready, so it needs to wake up
    the waiters on its poll wait list (efd2). So it calls ep_poll_safewake()
    that ends up in another wake_up(), after having checked about the
    recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to
    avoid stack blasting. Never hit the same queue, to avoid loops like:

    epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
    epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
    epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
    epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);

    The code "if (tncur->wq == wq || ..." prevents re-entering the same
    queue/lock.
    I don't know how the lockdep code works, so I can't say about
    wake_up_nested(). Although I have a feeling is not enough in this case.
    A solution may be to move the call to ep_poll_safewake() (that'd become a
    simple wake_up()) inside a tasklet or whatever is today trendy for delayed
    work. But his kinda scares me to be honest, since epoll has already a
    bunch of places where it could be asynchronously hit (plus performance
    regression will need to be verified).



    - Davide


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: 2.6.24-rc6: possible recursive locking detected

    On Sat, 5 Jan 2008, Davide Libenzi wrote:
    > A solution may be to move the call to ep_poll_safewake() (that'd become a
    > simple wake_up()) inside a tasklet or whatever is today trendy for delayed
    > work. But his kinda scares me to be honest, since epoll has already a
    > bunch of places where it could be asynchronously hit (plus performance
    > regression will need to be verified).


    Although I'm not able to reproduce this one right now, I'm happy to test
    any patches you guys come up with.

    Thanks for your time,
    Christian.
    --
    BOFH excuse #78:

    Yes, yes, its called a design limitation
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: 2.6.24-rc6: possible recursive locking detected

    [Davide Libenzi - Sat, Jan 05, 2008 at 01:35:25PM -0800]
    | On Sat, 5 Jan 2008, Peter Zijlstra wrote:
    |
    [...snip...]
    | I remember I talked with Arjan about this time ago. Basically, since 1)
    | you can drop an epoll fd inside another epoll fd 2) callback-based wakeups
    | are used, you can see a wake_up() from inside another wake_up(), but they
    | will never refer to the same lock instance.
    | Think about:
    |
    | dfd = socket(...);
    | efd1 = epoll_create();
    | efd2 = epoll_create();
    | epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
    | epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
    |
    | When a packet arrives to the device underneath "dfd", the net code will
    | issue a wake_up() on its poll wake list. Epoll (efd1) has installed a
    | callback wakeup entry on that queue, and the wake_up() performed by the
    | "dfd" net code will end up in ep_poll_callback(). At this point epoll
    | (efd1) notices that it may have some event ready, so it needs to wake up
    | the waiters on its poll wait list (efd2). So it calls ep_poll_safewake()
    | that ends up in another wake_up(), after having checked about the
    | recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to
    | avoid stack blasting. Never hit the same queue, to avoid loops like:
    |
    | epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
    | epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
    | epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
    | epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);
    |
    | The code "if (tncur->wq == wq || ..." prevents re-entering the same
    | queue/lock.
    | I don't know how the lockdep code works, so I can't say about
    | wake_up_nested(). Although I have a feeling is not enough in this case.
    | A solution may be to move the call to ep_poll_safewake() (that'd become a
    | simple wake_up()) inside a tasklet or whatever is today trendy for delayed
    | work. But his kinda scares me to be honest, since epoll has already a
    | bunch of places where it could be asynchronously hit (plus performance
    | regression will need to be verified).
    |
    |
    |
    | - Davide
    |
    |

    it's quite possible that i'm wrong but just interested...
    why in ep_poll_safewake() the assignment

    struct list_head *lsthead = &psw->wake_task_list;

    is not protected by spinlock?

    - Cyrill -
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  10. Re: 2.6.24-rc6: possible recursive locking detected

    [Cyrill Gorcunov - Mon, Jan 07, 2008 at 12:44:42AM +0300]
    | [Davide Libenzi - Sat, Jan 05, 2008 at 01:35:25PM -0800]
    | | On Sat, 5 Jan 2008, Peter Zijlstra wrote:
    | |
    | [...snip...]
    | | I remember I talked with Arjan about this time ago. Basically, since 1)
    | | you can drop an epoll fd inside another epoll fd 2) callback-based wakeups
    | | are used, you can see a wake_up() from inside another wake_up(), but they
    | | will never refer to the same lock instance.
    | | Think about:
    | |
    | | dfd = socket(...);
    | | efd1 = epoll_create();
    | | efd2 = epoll_create();
    | | epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...);
    | | epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
    | |
    | | When a packet arrives to the device underneath "dfd", the net code will
    | | issue a wake_up() on its poll wake list. Epoll (efd1) has installed a
    | | callback wakeup entry on that queue, and the wake_up() performed by the
    | | "dfd" net code will end up in ep_poll_callback(). At this point epoll
    | | (efd1) notices that it may have some event ready, so it needs to wake up
    | | the waiters on its poll wait list (efd2). So it calls ep_poll_safewake()
    | | that ends up in another wake_up(), after having checked about the
    | | recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to
    | | avoid stack blasting. Never hit the same queue, to avoid loops like:
    | |
    | | epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...);
    | | epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...);
    | | epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...);
    | | epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...);
    | |
    | | The code "if (tncur->wq == wq || ..." prevents re-entering the same
    | | queue/lock.
    | | I don't know how the lockdep code works, so I can't say about
    | | wake_up_nested(). Although I have a feeling is not enough in this case.
    | | A solution may be to move the call to ep_poll_safewake() (that'd become a
    | | simple wake_up()) inside a tasklet or whatever is today trendy for delayed
    | | work. But his kinda scares me to be honest, since epoll has already a
    | | bunch of places where it could be asynchronously hit (plus performance
    | | regression will need to be verified).
    | |
    | |
    | |
    | | - Davide
    | |
    | |
    |
    | it's quite possible that i'm wrong but just interested...
    | why in ep_poll_safewake() the assignment
    |
    | struct list_head *lsthead = &psw->wake_task_list;
    |
    | is not protected by spinlock?
    |
    | - Cyrill -

    it was a completely stupid question... please drop

    - Cyrill -
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  11. Re: 2.6.24-rc6: possible recursive locking detected

    On 01/05, Peter Zijlstra wrote:
    >
    > Since EP_MAX_POLLWAKE_NESTS < MAX_LOCKDEP_SUBCLASSES we could perhaps do
    > something like:
    >
    > wake_up_nested(..., wake_nests);


    I think this would be the most correct change. But I wonder if it is possible
    to do something more generic (but otoh more stupid/hackish and less safe).

    Consider this "just for illustration" patch,

    --- t/kernel/lockdep.c 2007-11-09 12:57:31.000000000 +0300
    +++ t/kernel/lockdep.c 2008-01-07 19:43:50.000000000 +0300
    @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
    struct held_lock *prev;
    int i;

    - for (i = 0; i < curr->lockdep_depth; i++) {
    + for (i = curr->lockdep_depth; --i >= 0; ) {
    prev = curr->held_locks + i;
    if (prev->class != next->class)
    continue;
    +
    + if (prev->trylock == -1)
    + return 2;
    /*
    * Allow read-after-read recursion of the same
    * lock class (i.e. read_lock(lock)+read_lock(lock)):
    -------------------------------------------------------------------------

    Now,

    // trylock == -1
    #define spin_mark_nested(l) \
    lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
    #define spin_unmark_nested(l) \
    lock_release(&(l)->dep_map, 1, _THIS_IP_)

    and ep_poll_safewake() can do:

    /* Do really wake up now */
    spin_mark_nested(&wq->lock);
    wake_up(wq);
    spin_unmark_nested(&wq->lock);

    Possible?

    Oleg.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: 2.6.24-rc6: possible recursive locking detected

    On 01/07, Oleg Nesterov wrote:
    >
    > Consider this "just for illustration" patch,
    >
    > --- t/kernel/lockdep.c 2007-11-09 12:57:31.000000000 +0300
    > +++ t/kernel/lockdep.c 2008-01-07 19:43:50.000000000 +0300
    > @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
    > struct held_lock *prev;
    > int i;
    >
    > - for (i = 0; i < curr->lockdep_depth; i++) {
    > + for (i = curr->lockdep_depth; --i >= 0; ) {
    > prev = curr->held_locks + i;
    > if (prev->class != next->class)
    > continue;
    > +
    > + if (prev->trylock == -1)
    > + return 2;
    > /*
    > * Allow read-after-read recursion of the same
    > * lock class (i.e. read_lock(lock)+read_lock(lock)):
    > -------------------------------------------------------------------------
    >
    > Now,
    >
    > // trylock == -1
    > #define spin_mark_nested(l) \
    > lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
    > #define spin_unmark_nested(l) \
    > lock_release(&(l)->dep_map, 1, _THIS_IP_)
    >
    > and ep_poll_safewake() can do:
    >
    > /* Do really wake up now */
    > spin_mark_nested(&wq->lock);
    > wake_up(wq);
    > spin_unmark_nested(&wq->lock);


    I tested the patch above with the following code,

    wait_queue_head_t w1, w2, w3;

    init_waitqueue_head(&w1);
    init_waitqueue_head(&w2);
    init_waitqueue_head(&w3);

    local_irq_disable();
    spin_lock(&w1.lock);

    spin_mark_nested(&w2.lock);
    spin_lock(&w2.lock);

    spin_mark_nested(&w3.lock);
    wake_up(&w3);
    spin_unmark_nested(&w3.lock);

    spin_unlock(&w2.lock);
    spin_unmark_nested(&w2.lock);

    spin_unlock(&w1.lock);
    local_irq_enable();

    seems to work. What do you think?

    Oleg.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  13. Re: 2.6.24-rc6: possible recursive locking detected

    On Sun, 6 Jan 2008, Christian Kujau wrote:

    > On Sat, 5 Jan 2008, Davide Libenzi wrote:
    > > A solution may be to move the call to ep_poll_safewake() (that'd become a
    > > simple wake_up()) inside a tasklet or whatever is today trendy for delayed
    > > work. But his kinda scares me to be honest, since epoll has already a
    > > bunch of places where it could be asynchronously hit (plus performance
    > > regression will need to be verified).

    >
    > Although I'm not able to reproduce this one right now, I'm happy to test any
    > patches you guys come up with.


    There's no need to reproduce it. It's there, it's among us



    - Davide


    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  14. Re: 2.6.24-rc6: possible recursive locking detected


    On Mon, 2008-01-07 at 20:49 +0300, Oleg Nesterov wrote:
    > On 01/07, Oleg Nesterov wrote:
    > >
    > > Consider this "just for illustration" patch,
    > >
    > > --- t/kernel/lockdep.c 2007-11-09 12:57:31.000000000 +0300
    > > +++ t/kernel/lockdep.c 2008-01-07 19:43:50.000000000 +0300
    > > @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
    > > struct held_lock *prev;
    > > int i;
    > >
    > > - for (i = 0; i < curr->lockdep_depth; i++) {
    > > + for (i = curr->lockdep_depth; --i >= 0; ) {
    > > prev = curr->held_locks + i;
    > > if (prev->class != next->class)
    > > continue;
    > > +
    > > + if (prev->trylock == -1)
    > > + return 2;
    > > /*
    > > * Allow read-after-read recursion of the same
    > > * lock class (i.e. read_lock(lock)+read_lock(lock)):
    > > -------------------------------------------------------------------------
    > >
    > > Now,
    > >
    > > // trylock == -1
    > > #define spin_mark_nested(l) \
    > > lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
    > > #define spin_unmark_nested(l) \
    > > lock_release(&(l)->dep_map, 1, _THIS_IP_)
    > >
    > > and ep_poll_safewake() can do:
    > >
    > > /* Do really wake up now */
    > > spin_mark_nested(&wq->lock);
    > > wake_up(wq);
    > > spin_unmark_nested(&wq->lock);

    >
    > I tested the patch above with the following code,
    >
    > wait_queue_head_t w1, w2, w3;
    >
    > init_waitqueue_head(&w1);
    > init_waitqueue_head(&w2);
    > init_waitqueue_head(&w3);
    >
    > local_irq_disable();
    > spin_lock(&w1.lock);
    >
    > spin_mark_nested(&w2.lock);
    > spin_lock(&w2.lock);
    >
    > spin_mark_nested(&w3.lock);
    > wake_up(&w3);
    > spin_unmark_nested(&w3.lock);
    >
    > spin_unlock(&w2.lock);
    > spin_unmark_nested(&w2.lock);
    >
    > spin_unlock(&w1.lock);
    > local_irq_enable();
    >
    > seems to work. What do you think?


    I've been pondering this for a while, and some days I really like it,
    some days I don't.

    The problem I have with it is that it becomes very easy to falsely
    annotate problems away - its a very powerful annotation. That said, its
    almost powerful enough to annotate the device semaphore/mutex problem.

    I think I'll do wake_up_nested() for now and keep this around.

    Thanks for this very nice idea though.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  15. Re: 2.6.24-rc6: possible recursive locking detected

    On 01/13, Peter Zijlstra wrote:
    >
    > On Mon, 2008-01-07 at 20:49 +0300, Oleg Nesterov wrote:
    >
    > > On 01/07, Oleg Nesterov wrote:
    > > >
    > > > Consider this "just for illustration" patch,
    > > >
    > > > --- t/kernel/lockdep.c 2007-11-09 12:57:31.000000000 +0300
    > > > +++ t/kernel/lockdep.c 2008-01-07 19:43:50.000000000 +0300
    > > > @@ -1266,10 +1266,13 @@ check_deadlock(struct task_struct *curr,
    > > > struct held_lock *prev;
    > > > int i;
    > > >
    > > > - for (i = 0; i < curr->lockdep_depth; i++) {
    > > > + for (i = curr->lockdep_depth; --i >= 0; ) {
    > > > prev = curr->held_locks + i;
    > > > if (prev->class != next->class)
    > > > continue;
    > > > +
    > > > + if (prev->trylock == -1)
    > > > + return 2;
    > > > /*
    > > > * Allow read-after-read recursion of the same
    > > > * lock class (i.e. read_lock(lock)+read_lock(lock)):
    > > > -------------------------------------------------------------------------
    > > >
    > > > Now,
    > > >
    > > > // trylock == -1
    > > > #define spin_mark_nested(l) \
    > > > lock_acquire(&(l)->dep_map, 0, -1, 0, 2, _THIS_IP_)
    > > > #define spin_unmark_nested(l) \
    > > > lock_release(&(l)->dep_map, 1, _THIS_IP_)
    > > >
    > > > and ep_poll_safewake() can do:
    > > >
    > > > /* Do really wake up now */
    > > > spin_mark_nested(&wq->lock);
    > > > wake_up(wq);
    > > > spin_unmark_nested(&wq->lock);

    > >
    > > seems to work. What do you think?

    >
    > I've been pondering this for a while, and some days I really like it,
    > some days I don't.
    >
    > The problem I have with it is that it becomes very easy to falsely
    > annotate problems away - its a very powerful annotation.


    Also, I don't like the overloading of ->trylock, this is really hackish.

    > I think I'll do wake_up_nested() for now and keep this around.


    Agreed.

    Perhaps it is a bit easier to use spin_lock_nested() + __wake_up_common()
    directly (we have a lot of wake_up_xxx helpers), but this is up to you.


    Offtopic question. Why do we have so many lockdep stuff in timer.c and hrtimer.c ?
    We never lock 2 bases at the same time, except in migrate_timers(). We can kill
    double_spin_lock() and base_lock_keys[] and just use spin_lock_nested in
    migrate_timers(), no?

    Oleg.

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  16. hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected)

    ( trimmed CC list )

    Sorry for the delay, this message seems to have gotten lost in my
    inbox :-/

    On Tue, 2008-01-15 at 00:27 +0300, Oleg Nesterov wrote:
    > On 01/13, Peter Zijlstra wrote:


    > Offtopic question. Why do we have so many lockdep stuff in timer.c and hrtimer.c ?
    > We never lock 2 bases at the same time, except in migrate_timers(). We can kill
    > double_spin_lock() and base_lock_keys[] and just use spin_lock_nested in
    > migrate_timers(), no?


    Lets ask Thomas.. :-)

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  17. Re: hrtimers and lockdep (was: Re: 2.6.24-rc6: possible recursive locking detected)

    On Wed, 30 Jan 2008, Peter Zijlstra wrote:

    > ( trimmed CC list )
    >
    > Sorry for the delay, this message seems to have gotten lost in my
    > inbox :-/
    >
    > On Tue, 2008-01-15 at 00:27 +0300, Oleg Nesterov wrote:
    > > On 01/13, Peter Zijlstra wrote:

    >
    > > Offtopic question. Why do we have so many lockdep stuff in timer.c and hrtimer.c ?
    > > We never lock 2 bases at the same time, except in migrate_timers(). We can kill
    > > double_spin_lock() and base_lock_keys[] and just use spin_lock_nested in
    > > migrate_timers(), no?

    >
    > Lets ask Thomas.. :-)


    No objections as long as it all works

    tglx
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread