[PATCH 0/2][RT] powerpc - fix bug in irq reverse mapping radix tree (Resend) - Kernel

  1. [PATCH 0/2][RT] powerpc - fix bug in irq reverse mapping radix tree (Resend)

    (This is a resend, as vger dropped my previous attempt; sorry for the duplication.)

    Hi,

    here are 2 patches that fix the following bug, which occurs on IBM pSeries under
    an RT kernel:

    BUG: sleeping function called from invalid context swapper(1) at kernel/rtmutex.c:739
    in_atomic():1 [00000002], irqs_disabled():1
    Call Trace:
    [c0000001e20f3340] [c000000000010370] .show_stack+0x70/0x1bc (unreliable)
    [c0000001e20f33f0] [c000000000049380] .__might_sleep+0x11c/0x138
    [c0000001e20f3470] [c0000000002a2f64] .__rt_spin_lock+0x3c/0x98
    [c0000001e20f34f0] [c0000000000c3f20] .kmem_cache_alloc+0x68/0x184
    [c0000001e20f3590] [c000000000193f3c] .radix_tree_node_alloc+0xf0/0x144
    [c0000001e20f3630] [c000000000195190] .radix_tree_insert+0x18c/0x2fc
    [c0000001e20f36f0] [c00000000000c710] .irq_radix_revmap+0x1a4/0x1e4
    [c0000001e20f37b0] [c00000000003b3f0] .xics_startup+0x30/0x54
    [c0000001e20f3840] [c00000000008b864] .setup_irq+0x26c/0x370
    [c0000001e20f38f0] [c00000000008ba68] .request_irq+0x100/0x158
    [c0000001e20f39a0] [c0000000001ee9c0] .hvc_open+0xb4/0x148
    [c0000001e20f3a40] [c0000000001d72ec] .tty_open+0x200/0x368
    [c0000001e20f3af0] [c0000000000ce928] .chrdev_open+0x1f4/0x25c
    [c0000001e20f3ba0] [c0000000000c8bf0] .__dentry_open+0x188/0x2c8
    [c0000001e20f3c50] [c0000000000c8dec] .do_filp_open+0x50/0x70
    [c0000001e20f3d70] [c0000000000c8e8c] .do_sys_open+0x80/0x148
    [c0000001e20f3e20] [c00000000000928c] .init_post+0x4c/0x100
    [c0000001e20f3ea0] [c0000000003c0e0c] .kernel_init+0x428/0x478
    [c0000001e20f3f90] [c000000000027448] .kernel_thread+0x4c/0x68


    The root cause of this bug is that the XICS interrupt controller uses a radix
    tree for its reverse irq mapping, and that we cannot allocate the tree nodes
    (even with GFP_ATOMIC) with preemption disabled: as the trace shows, on RT
    kmem_cache_alloc() ends up in __rt_spin_lock(), which may sleep.

    In fact, preemption is disabled twice, in nested fashion, by the time we want
    to allocate a new node:

    - setup_irq() does a spin_lock_irqsave() before calling xics_startup() which
    then calls irq_radix_revmap() to insert a new node in the tree

    - irq_radix_revmap() also does a spin_lock_irqsave() (in irq_radix_wrlock())
    before the radix_tree_insert()

    The first patch moves the call to irq_radix_revmap() from xics_startup() out to
    xics_host_map_direct() and xics_host_map_lpar() which are called with preemption
    enabled.
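
The idea behind the first patch can be sketched in plain userspace C. This is only a model under stated assumptions, not kernel code: `atomic_depth`, `model_alloc()`, `insert_from_startup()` and `insert_from_map()` are hypothetical names, the flat `revmap[]` array stands in for the radix tree, and an allocation attempted inside a "locked" section simply fails instead of triggering the BUG shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_HWIRQS 16

/* Nesting depth of the modelled "preemption disabled" sections. */
static int atomic_depth;

static void model_lock_irqsave(void)      { atomic_depth++; }
static void model_unlock_irqrestore(void) { atomic_depth--; }

/*
 * Stand-in for an allocator that may sleep. Instead of reporting a BUG,
 * it fails when called from a modelled atomic section.
 */
static void *model_alloc(size_t size)
{
	if (atomic_depth > 0)
		return NULL;
	return malloc(size);
}

/* Flat stand-in for the reverse map: hwirq -> allocated node holding virq. */
static int *revmap[NR_HWIRQS];

static bool model_revmap_insert(unsigned int hwirq, int virq)
{
	int *node;

	if (hwirq >= NR_HWIRQS)
		return false;
	node = model_alloc(sizeof(*node));
	if (!node)
		return false;
	*node = virq;
	free(revmap[hwirq]);	/* drop any previous mapping */
	revmap[hwirq] = node;
	return true;
}

/* Old shape: the insertion is done from the startup hook, under the lock. */
static bool insert_from_startup(unsigned int hwirq, int virq)
{
	bool ok;

	model_lock_irqsave();	/* models the lock taken by setup_irq() */
	ok = model_revmap_insert(hwirq, virq);
	model_unlock_irqrestore();
	return ok;
}

/* New shape: the insertion is done at map time, before any lock is taken. */
static bool insert_from_map(unsigned int hwirq, int virq)
{
	return model_revmap_insert(hwirq, virq);
}

static int model_lookup(unsigned int hwirq)
{
	if (hwirq >= NR_HWIRQS || !revmap[hwirq])
		return -1;
	return *revmap[hwirq];
}
```

In this model the insertion attempted from the startup path fails because the allocator is reached with the lock held, while the same insertion done from the map path succeeds.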

    The second patch is a little more involved in that it takes advantage of
    the concurrent radix tree to simplify the locking requirements and allows
    a new node to be allocated outside a preemption-disabled section.

    I just hope I've correctly understood the semantics of the concurrent radix
    tree and got the (absence of) locking right.

    Sebastien.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

    On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
    > From: Sebastien Dugue
    > Date: Tue, 22 Jul 2008 11:56:41 +0200
    > Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
    > lockless
    >
    > The radix tree used by interrupt controllers for their irq reverse
    > mapping (currently only the XICS found on pSeries) have a complex locking
    > scheme dating back to before the advent of the concurrent radix tree on
    > preempt-rt.
    >
    > Take advantage of this and of the fact that the items of the tree are
    > pointers to a static array (irq_map) elements which can never go under us
    > to simplify the locking.
    >
    > Concurrency between readers and writers are handled by the intrinsic
    > properties of the concurrent radix tree. Concurrency between the tree
    > initialization which is done asynchronously with readers and writers access
    > is handled via an atomic variable (revmap_trees_allocated) set when the
    > tree has been initialized and checked before any reader or writer access
    > just like we used to check for tree.gfp_mask != 0 before.
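
The initialization guard described in the quoted text (a flag set once the tree is initialized and checked before any reader or writer access) might look roughly like the following in userspace C11. This is a sketch under assumptions, not the patch itself: only the name `revmap_trees_allocated` comes from the description; the flat `revmap[]` array, the return codes and the function names are hypothetical, and the sketch covers only the ordering between initialization and later accesses, not reader/writer concurrency.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_HWIRQS 16

enum {
	REVMAP_NOT_READY = -2,	/* tree not initialized yet (hypothetical code) */
	REVMAP_NONE      = -1	/* no mapping for this hwirq (hypothetical code) */
};

/* Flat stand-in for the tree: hwirq -> virq, or REVMAP_NONE. */
static int revmap[NR_HWIRQS];

/* Set once, after the map has been fully initialized. */
static atomic_bool revmap_trees_allocated;

static void revmap_trees_init(void)
{
	for (int i = 0; i < NR_HWIRQS; i++)
		revmap[i] = REVMAP_NONE;
	/* Release: the initialization above is visible before the flag is. */
	atomic_store_explicit(&revmap_trees_allocated, true,
			      memory_order_release);
}

static bool revmap_ready(void)
{
	/* Acquire: pairs with the release store in revmap_trees_init(). */
	return atomic_load_explicit(&revmap_trees_allocated,
				    memory_order_acquire);
}

static int revmap_lookup(unsigned int hwirq)
{
	if (!revmap_ready())
		return REVMAP_NOT_READY;	/* caller must use another path */
	if (hwirq >= NR_HWIRQS)
		return REVMAP_NONE;
	return revmap[hwirq];
}

static bool revmap_insert(unsigned int hwirq, int virq)
{
	if (!revmap_ready() || hwirq >= NR_HWIRQS)
		return false;
	revmap[hwirq] = virq;
	return true;
}
```

Any access made before `revmap_trees_init()` has run is refused rather than touching an uninitialized structure, which is the role the description gives to the flag.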


    Hmm, the RCU radix tree has been in mainline for quite a while too. I thought
    Ben had already converted this code over ages ago...

    Nothing against the -rt patch, but mainline should probably be updated
    to use RCU as well?

  3. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

    On Thu, 24 Jul 2008 21:11:34 +1000 Nick Piggin wrote:

    > On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
    > > From: Sebastien Dugue
    > > Date: Tue, 22 Jul 2008 11:56:41 +0200
    > > Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
    > > lockless
    > >
    > > The radix tree used by interrupt controllers for their irq reverse
    > > mapping (currently only the XICS found on pSeries) have a complex locking
    > > scheme dating back to before the advent of the concurrent radix tree on
    > > preempt-rt.
    > >
    > > Take advantage of this and of the fact that the items of the tree are
    > > pointers to a static array (irq_map) elements which can never go under us
    > > to simplify the locking.
    > >
    > > Concurrency between readers and writers are handled by the intrinsic
    > > properties of the concurrent radix tree. Concurrency between the tree
    > > initialization which is done asynchronously with readers and writers access
    > > is handled via an atomic variable (revmap_trees_allocated) set when the
    > > tree has been initialized and checked before any reader or writer access
    > > just like we used to check for tree.gfp_mask != 0 before.

    >
    > Hmm, RCU radix tree is in mainline too for quite a while. I thought
    > Ben had already converted this code over ages ago...


    Mainline does not have the concurrent radix tree which this patch
    is based on, but maybe it's overkill and the RCU radix tree is enough.
    Not sure, will have to think about it a bit more.

    >
    > Nothing against the -rt patch, but mainline should probably be updated
    > to use RCU as well?
    >


    If rcu radix tree is enough, then definitely yes.

    Sebastien.


  4. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless


    > > Concurrency between readers and writers are handled by the intrinsic
    > > properties of the concurrent radix tree. Concurrency between the tree
    > > initialization which is done asynchronously with readers and writers access
    > > is handled via an atomic variable (revmap_trees_allocated) set when the
    > > tree has been initialized and checked before any reader or writer access
    > > just like we used to check for tree.gfp_mask != 0 before.

    >
    > Hmm, RCU radix tree is in mainline too for quite a while. I thought
    > Ben had already converted this code over ages ago...
    >
    > Nothing against the -rt patch, but mainline should probably be updated
    > to use RCU as well?


    No, I haven't updated that code yet, and yes, we should do it :-)

    Cheers,
    Ben.



  5. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

    On Thu, 2008-07-24 at 14:18 +0200, Sebastien Dugue wrote:
    > On Thu, 24 Jul 2008 21:11:34 +1000 Nick Piggin wrote:
    >
    > > On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
    > > > From: Sebastien Dugue
    > > > Date: Tue, 22 Jul 2008 11:56:41 +0200
    > > > Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
    > > > lockless
    > > >
    > > > The radix tree used by interrupt controllers for their irq reverse
    > > > mapping (currently only the XICS found on pSeries) have a complex locking
    > > > scheme dating back to before the advent of the concurrent radix tree on
    > > > preempt-rt.
    > > >
    > > > Take advantage of this and of the fact that the items of the tree are
    > > > pointers to a static array (irq_map) elements which can never go under us
    > > > to simplify the locking.
    > > >
    > > > Concurrency between readers and writers are handled by the intrinsic
    > > > properties of the concurrent radix tree. Concurrency between the tree
    > > > initialization which is done asynchronously with readers and writers access
    > > > is handled via an atomic variable (revmap_trees_allocated) set when the
    > > > tree has been initialized and checked before any reader or writer access
    > > > just like we used to check for tree.gfp_mask != 0 before.

    > >
    > > Hmm, RCU radix tree is in mainline too for quite a while. I thought
    > > Ben had already converted this code over ages ago...

    >
    > Mainline does not have the concurrent radix tree which this patch
    > is based on, but maybe it's overkill and the RCU radix tree is enough.
    > Not sure, will have to think about it a bit more.


    Should be. The model of the concurrent radix tree can be mapped to
    spinlock + rcu radix tree.

    So instead of:

    > + DEFINE_RADIX_TREE_CONTEXT(ctx, tree);
    > + radix_tree_lock(&ctx);
    > + radix_tree_insert(ctx.tree, hwirq, &irq_map[virq]);
    > + radix_tree_unlock(&ctx);



    you then write:

    spin_lock(&host->revmap_data.tree_lock);
    radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
    spin_unlock(&host->revmap_data.tree_lock);


    The only advantage of the concurrent radix tree over this model is that
    it can potentially do multiple modification operations at the same time.

    Still, cool that you used it ;-)
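
The "spinlock + lockless readers" model described above can be illustrated with a small userspace C11 sketch. It is not kernel code and makes several assumptions: a flat array of atomic pointers replaces the RCU radix tree, an `atomic_flag` spin loop replaces `spin_lock()`, and `revmap_insert()`, `revmap_delete()` and `revmap_lookup()` are hypothetical names. As in the patch description, the stored items point into a static array (`irq_map`), so a reader never dereferences freed memory and the sketch needs no grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_HWIRQS 64
#define NR_VIRQS  32

struct irq_map_entry {
	unsigned int hwirq;
};

/* Static array: entries never go away, so readers can hold pointers freely. */
static struct irq_map_entry irq_map[NR_VIRQS];

/* Flat stand-in for the tree: one atomic slot per hwirq. */
static _Atomic(struct irq_map_entry *) revmap[NR_HWIRQS];

/* Writers are serialized by one "big hammer" lock. */
static atomic_flag tree_lock = ATOMIC_FLAG_INIT;

static void tree_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&tree_lock,
						 memory_order_acquire))
		;	/* spin */
}

static void tree_lock_release(void)
{
	atomic_flag_clear_explicit(&tree_lock, memory_order_release);
}

/* Writer side: modifications happen one at a time, under the lock. */
static int revmap_insert(unsigned int hwirq, unsigned int virq)
{
	if (hwirq >= NR_HWIRQS || virq >= NR_VIRQS)
		return -1;
	tree_lock_acquire();
	irq_map[virq].hwirq = hwirq;
	/* Publish the entry only after it has been filled in. */
	atomic_store_explicit(&revmap[hwirq], &irq_map[virq],
			      memory_order_release);
	tree_lock_release();
	return 0;
}

static void revmap_delete(unsigned int hwirq)
{
	if (hwirq >= NR_HWIRQS)
		return;
	tree_lock_acquire();
	atomic_store_explicit(&revmap[hwirq], (struct irq_map_entry *)NULL,
			      memory_order_release);
	tree_lock_release();
}

/* Reader side: no lock taken; returns the virq, or -1 if unmapped. */
static int revmap_lookup(unsigned int hwirq)
{
	struct irq_map_entry *entry;

	if (hwirq >= NR_HWIRQS)
		return -1;
	entry = atomic_load_explicit(&revmap[hwirq], memory_order_acquire);
	return entry ? (int)(entry - irq_map) : -1;
}
```

Lookups may run concurrently with each other and with a single writer, while insertions and deletions are serialized by the lock.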


  6. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless


    Hi Peter,

    On Fri, 25 Jul 2008 09:49:37 +0200 Peter Zijlstra wrote:

    > On Thu, 2008-07-24 at 14:18 +0200, Sebastien Dugue wrote:
    > > On Thu, 24 Jul 2008 21:11:34 +1000 Nick Piggin wrote:
    > >
    > > > On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
    > > > > From: Sebastien Dugue
    > > > > Date: Tue, 22 Jul 2008 11:56:41 +0200
    > > > > Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
    > > > > lockless
    > > > >
    > > > > The radix tree used by interrupt controllers for their irq reverse
    > > > > mapping (currently only the XICS found on pSeries) have a complex locking
    > > > > scheme dating back to before the advent of the concurrent radix tree on
    > > > > preempt-rt.
    > > > >
    > > > > Take advantage of this and of the fact that the items of the tree are
    > > > > pointers to a static array (irq_map) elements which can never go under us
    > > > > to simplify the locking.
    > > > >
    > > > > Concurrency between readers and writers are handled by the intrinsic
    > > > > properties of the concurrent radix tree. Concurrency between the tree
    > > > > initialization which is done asynchronously with readers and writers access
    > > > > is handled via an atomic variable (revmap_trees_allocated) set when the
    > > > > tree has been initialized and checked before any reader or writer access
    > > > > just like we used to check for tree.gfp_mask != 0 before.
    > > >
    > > > Hmm, RCU radix tree is in mainline too for quite a while. I thought
    > > > Ben had already converted this code over ages ago...

    > >
    > > Mainline does not have the concurrent radix tree which this patch
    > > is based on, but maybe it's overkill and the RCU radix tree is enough.
    > > Not sure, will have to think about it a bit more.

    >
    > Should be. The model of the concurrent radix tree can be mapped to
    > spinlock + rcu radix tree.
    >
    > So instead of:
    >
    > > + DEFINE_RADIX_TREE_CONTEXT(ctx, tree);
    > > + radix_tree_lock(&ctx);
    > > + radix_tree_insert(ctx.tree, hwirq, &irq_map[virq]);
    > > + radix_tree_unlock(&ctx);

    >
    >
    > you then write:
    >
    > spin_lock(&host->revmap_data.tree_lock);
    > radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
    > spin_unlock(&host->revmap_data.tree_lock);
    >


    Cool, that will indeed make it much easier to have something applicable
    to mainline which works with preempt-rt.

    >
    > The only advantage of the concurrent radix tree over this model is that
    > it can potentially do multiple modification operations at the same time.


    Well in theory that can happen if a module is loaded which creates a mapping
    while another one is unloaded at the same time. The time window is pretty narrow,
    but still present nonetheless. That's why I chose to use the concurrent version.

    >
    > Still, cool that you used it ;-)



    Yep, looked like what was needed until I realized it was not available in
    mainline. Nice work though and good paper for explaining it all.

    Sebastien.



  7. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

    On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
    >
    >
    > The only advantage of the concurrent radix tree over this model is that
    > it can potentially do multiple modification operations at the same time.


    Yup, we do not need that for the irq revmap... concurrent lookup is all we need.

    Cheers,
    Ben.



  8. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

    On Fri, 25 Jul 2008 18:27:20 +1000 Benjamin Herrenschmidt wrote:

    > On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
    > >
    > >
    > > The only advantage of the concurrent radix tree over this model is that
    > > it can potentially do multiple modification operations at the same time.

    >
    > Yup, we do not need that for the irq revmap... concurrent lookup is all we need.
    >


    Shouldn't we care about concurrent insertion and deletion in the tree? I agree
    that concern might be a bit artificial but in theory that can happen.

    Sebastien.

  9. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

    On Fri, 2008-07-25 at 10:36 +0200, Sebastien Dugue wrote:
    > On Fri, 25 Jul 2008 18:27:20 +1000 Benjamin Herrenschmidt wrote:
    >
    > > On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
    > > >
    > > >
    > > > The only advantage of the concurrent radix tree over this model is that
    > > > it can potentially do multiple modification operations at the same time.

    > >
    > > Yup, we do not need that for the irq revmap... concurrent lookup is all we need.
    > >

    >
    > Shouldn't we care about concurrent insertion and deletion in the tree? I agree
    > that concern might be a bit artificial but in theory that can happen.


    Yes, we just need to protect it with a big hammer, like a spinlock, it's
    not a performance critical code path.

    Ben.



  10. Re: [PATCH 2/2][RT] powerpc - Make the irq reverse mapping radix tree lockless

    On Fri, 25 Jul 2008 18:40:21 +1000 Benjamin Herrenschmidt wrote:

    > On Fri, 2008-07-25 at 10:36 +0200, Sebastien Dugue wrote:
    > > On Fri, 25 Jul 2008 18:27:20 +1000 Benjamin Herrenschmidt wrote:
    > >
    > > > On Fri, 2008-07-25 at 09:49 +0200, Peter Zijlstra wrote:
    > > > >
    > > > >
    > > > > The only advantage of the concurrent radix tree over this model is that
    > > > > it can potentially do multiple modification operations at the same time.
    > > >
    > > > Yup, we do not need that for the irq revmap... concurrent lookup is all we need.
    > > >

    > >
    > > Shouldn't we care about concurrent insertion and deletion in the tree? I agree
    > > that concern might be a bit artificial but in theory that can happen.

    >
    > Yes, we just need to protect it with a big hammer, like a spinlock, it's
    > not a performance critical code path.


    Agreed. Will look into this in the next few days.

    Thanks,

    Sebastien.