[PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless - Kernel

This is a discussion on [PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless - Kernel ; Hi , here is a respin of the patches I posted last week for the RT kernel now targeted for mainline ( http://lkml.org/lkml/2008/7/24/98 ). Thomas, steven, a note for you at the end. The goal of this patchset is to ...

+ Reply to Thread
Results 1 to 13 of 13

Thread: [PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless

  1. [PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless

    Hi ,

    here is a respin of the patches I posted last week for the RT kernel now targeted
    for mainline (http://lkml.org/lkml/2008/7/24/98). Thomas, steven, a note for you
    at the end.

    The goal of this patchset is to simplify the locking constraints on the radix
    tree used for IRQ reverse mapping on the pSeries machines and provide lockless
    access to this tree.

    This also solves the following BUG under preempt-rt:

    BUG: sleeping function called from invalid context swapper(1) at kernel/rtmutex.c:739
    in_atomic():1 [00000002], irqs_disabled():1
    Call Trace:
    [c0000001e20f3340] [c000000000010370] .show_stack+0x70/0x1bc (unreliable)
    [c0000001e20f33f0] [c000000000049380] .__might_sleep+0x11c/0x138
    [c0000001e20f3470] [c0000000002a2f64] .__rt_spin_lock+0x3c/0x98
    [c0000001e20f34f0] [c0000000000c3f20] .kmem_cache_alloc+0x68/0x184
    [c0000001e20f3590] [c000000000193f3c] .radix_tree_node_alloc+0xf0/0x144
    [c0000001e20f3630] [c000000000195190] .radix_tree_insert+0x18c/0x2fc
    [c0000001e20f36f0] [c00000000000c710] .irq_radix_revmap+0x1a4/0x1e4
    [c0000001e20f37b0] [c00000000003b3f0] .xics_startup+0x30/0x54
    [c0000001e20f3840] [c00000000008b864] .setup_irq+0x26c/0x370
    [c0000001e20f38f0] [c00000000008ba68] .request_irq+0x100/0x158
    [c0000001e20f39a0] [c0000000001ee9c0] .hvc_open+0xb4/0x148
    [c0000001e20f3a40] [c0000000001d72ec] .tty_open+0x200/0x368
    [c0000001e20f3af0] [c0000000000ce928] .chrdev_open+0x1f4/0x25c
    [c0000001e20f3ba0] [c0000000000c8bf0] .__dentry_open+0x188/0x2c8
    [c0000001e20f3c50] [c0000000000c8dec] .do_filp_open+0x50/0x70
    [c0000001e20f3d70] [c0000000000c8e8c] .do_sys_open+0x80/0x148
    [c0000001e20f3e20] [c00000000000928c] .init_post+0x4c/0x100
    [c0000001e20f3ea0] [c0000000003c0e0c] .kernel_init+0x428/0x478
    [c0000001e20f3f90] [c000000000027448] .kernel_thread+0x4c/0x68

    The root cause of this bug lies in the fact that the XICS interrupt controller
    uses a radix tree for its reverse irq mapping and that we cannot allocate the tree
    nodes (even GFP_ATOMIC) with preemption disabled.

    In fact, we have 2 nested preemption disabling when we want to allocate
    a new node:

    - setup_irq() does a spin_lock_irqsave() before calling xics_startup() which
    then calls irq_radix_revmap() to insert a new node in the tree

    - irq_radix_revmap() also does a spin_lock_irqsave() (in irq_radix_wrlock())
    before the radix_tree_insert()

    Also, if an IRQ gets registered before the tree is initialized (namely the
    IPI), it will be inserted into the tree in interrupt context once the tree
    have been initialized, hence the need for a spin_lock_irqsave() in the insertion
    path.

    This serie is split into 3 patches:

    - The first patch moves the initialization of the radix tree earlier in the
    boot process before any IRQ gets registered, but after the mm is up.

    - The second patch splits irq_radix_revmap() into its 2 components: one
    for lookup and one for insertion into the radix tree.

    - And finally, the third patch makes the radix tree fully lockless on the
    lookup side.


    Here is the diffstat for the whole patchset:

    arch/powerpc/kernel/irq.c | 134 ++++++++-------------------------
    arch/powerpc/platforms/pseries/smp.c | 1 +
    arch/powerpc/platforms/pseries/xics.c | 11 +--
    include/asm-powerpc/irq.h | 24 +++++-
    4 files changed, 58 insertions(+), 112 deletions(-)


    Thomas, Steven, the first 2 patches can be applied seamlessly to 2.6.26-rt1
    with offsets, the third patch has a trivial to fix reject in
    arch/powerpc/kernel/irq.c because the irq_big_lock is changed to a raw spinlock
    in preempt-rt. If you want those patches for RT, just flag me, I have those
    sitting on my test box.



    Thanks,

    Sebastien.



    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. [PATCH] powerpc - Separate the irq radix tree insertion and lookup

    irq_radix_revmap() currently serves 2 purposes, irq mapping lookup
    and insertion which happen in interrupt and process context respectively.

    Separate the function into its 2 components, one for lookup only and one
    for insertion only.


    Signed-off-by: Sebastien Dugue
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    ---
    arch/powerpc/kernel/irq.c | 25 ++++++++++++++-----------
    arch/powerpc/platforms/pseries/xics.c | 11 ++++-------
    include/asm-powerpc/irq.h | 18 +++++++++++++++---
    3 files changed, 33 insertions(+), 21 deletions(-)

    diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
    index 0a1445c..083b181 100644
    --- a/arch/powerpc/kernel/irq.c
    +++ b/arch/powerpc/kernel/irq.c
    @@ -900,34 +900,37 @@ void __init irq_radix_revmap_init(void)
    }
    }

    -unsigned int irq_radix_revmap(struct irq_host *host,
    - irq_hw_number_t hwirq)
    +unsigned int irq_radix_revmap_lookup(struct irq_host *host,
    + irq_hw_number_t hwirq)
    {
    struct irq_map_entry *ptr;
    - unsigned int virq;
    + unsigned int virq = NO_IRQ;
    unsigned long flags;

    WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);

    - /* Try to resolve */
    irq_radix_rdlock(&flags);
    ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
    irq_radix_rdunlock(flags);

    - /* Found it, return */
    - if (ptr) {
    + if (ptr)
    virq = ptr - irq_map;
    - return virq;
    - }

    - /* If not there, try to insert it */
    - virq = irq_find_mapping(host, hwirq);
    + return virq;
    +}
    +
    +void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
    + irq_hw_number_t hwirq)
    +{
    + unsigned long flags;
    +
    + WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);
    +
    if (virq != NO_IRQ) {
    irq_radix_wrlock(&flags);
    radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
    irq_radix_wrunlock(flags);
    }
    - return virq;
    }

    unsigned int irq_linear_revmap(struct irq_host *host,
    diff --git a/arch/powerpc/platforms/pseries/xics.c b/arch/powerpc/platforms/pseries/xics.c
    index 0fc830f..6b1a005 100644
    --- a/arch/powerpc/platforms/pseries/xics.c
    +++ b/arch/powerpc/platforms/pseries/xics.c
    @@ -310,12 +310,6 @@ static void xics_mask_irq(unsigned int virq)

    static unsigned int xics_startup(unsigned int virq)
    {
    - unsigned int irq;
    -
    - /* force a reverse mapping of the interrupt so it gets in the cache */
    - irq = (unsigned int)irq_map[virq].hwirq;
    - irq_radix_revmap(xics_host, irq);
    -
    /* unmask it */
    xics_unmask_irq(virq);
    return 0;
    @@ -346,7 +340,7 @@ static inline unsigned int xics_remap_irq(unsigned int vec)

    if (vec == XICS_IRQ_SPURIOUS)
    return NO_IRQ;
    - irq = irq_radix_revmap(xics_host, vec);
    + irq = irq_radix_revmap_lookup(xics_host, vec);
    if (likely(irq != NO_IRQ))
    return irq;

    @@ -530,6 +524,9 @@ static int xics_host_map(struct irq_host *h, unsigned int virq,
    {
    pr_debug("xics: map virq %d, hwirq 0x%lx\n", virq, hw);

    + /* Insert the interrupt mapping into the radix tree for fast lookup */
    + irq_radix_revmap_insert(xics_host, virq, hw);
    +
    get_irq_desc(virq)->status |= IRQ_LEVEL;
    set_irq_chip_and_handler(virq, xics_irq_chip, handle_fasteoi_irq);
    return 0;
    diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
    index 47b8119..5c88acf 100644
    --- a/include/asm-powerpc/irq.h
    +++ b/include/asm-powerpc/irq.h
    @@ -243,15 +243,27 @@ extern unsigned int irq_create_direct_mapping(struct irq_host *host);
    extern void __init irq_radix_revmap_init(void);

    /**
    - * irq_radix_revmap - Find a linux virq from a hw irq number.
    + * irq_radix_revmap_insert - Insert a hw irq to linux virq number mapping.
    + * @host: host owning this hardware interrupt
    + * @virq: linux irq number
    + * @hwirq: hardware irq number in that host space
    + *
    + * This is for use by irq controllers that use a radix tree reverse
    + * mapping for fast lookup.
    + */
    +extern void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
    + irq_hw_number_t hwirq);
    +
    +/**
    + * irq_radix_revmap_lookup - Find a linux virq from a hw irq number.
    * @host: host owning this hardware interrupt
    * @hwirq: hardware irq number in that host space
    *
    * This is a fast path, for use by irq controller code that uses radix tree
    * revmaps
    */
    -extern unsigned int irq_radix_revmap(struct irq_host *host,
    - irq_hw_number_t hwirq);
    +extern unsigned int irq_radix_revmap_lookup(struct irq_host *host,
    + irq_hw_number_t hwirq);

    /**
    * irq_linear_revmap - Find a linux virq from a hw irq number.
    --
    1.5.5.1

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. [PATCH] powerpc - Initialize the irq radix tree earlier

    The radix tree used for fast irq reverse mapping by the XICS is initialized
    late in the boot process, after the first interrupt (IPI) gets registered
    and after the first IPI is received.

    This patch moves the initialization of the XICS radix tree earlier into
    the boot process in smp_xics_probe() (the mm is already up but no interrupts
    have been registered at that point) to avoid having to insert a mapping into
    the tree in interrupt context. This will help in simplifying the locking
    constraints and move to a lockless radix tree in subsequent patches.

    As a nice side effect, there is no need any longer to check for
    (host->revmap_data.tree.gfp_mask != 0) to know if the tree have been
    initialized.


    Signed-off-by: Sebastien Dugue
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    ---
    arch/powerpc/kernel/irq.c | 44 +++++++++------------------------
    arch/powerpc/platforms/pseries/smp.c | 1 +
    include/asm-powerpc/irq.h | 5 ++++
    3 files changed, 18 insertions(+), 32 deletions(-)

    diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
    index 6ac8612..0a1445c 100644
    --- a/arch/powerpc/kernel/irq.c
    +++ b/arch/powerpc/kernel/irq.c
    @@ -840,9 +840,6 @@ void irq_dispose_mapping(unsigned int virq)
    host->revmap_data.linear.revmap[hwirq] = NO_IRQ;
    break;
    case IRQ_HOST_MAP_TREE:
    - /* Check if radix tree allocated yet */
    - if (host->revmap_data.tree.gfp_mask == 0)
    - break;
    irq_radix_wrlock(&flags);
    radix_tree_delete(&host->revmap_data.tree, hwirq);
    irq_radix_wrunlock(flags);
    @@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
    }
    EXPORT_SYMBOL_GPL(irq_find_mapping);

    +void __init irq_radix_revmap_init(void)
    +{
    + struct irq_host *h;
    +
    + list_for_each_entry(h, &irq_hosts, link) {
    + if (h->revmap_type == IRQ_HOST_MAP_TREE)
    + INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
    + }
    +}

    unsigned int irq_radix_revmap(struct irq_host *host,
    irq_hw_number_t hwirq)
    {
    - struct radix_tree_root *tree;
    struct irq_map_entry *ptr;
    unsigned int virq;
    unsigned long flags;

    WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);

    - /* Check if the radix tree exist yet. We test the value of
    - * the gfp_mask for that. Sneaky but saves another int in the
    - * structure. If not, we fallback to slow mode
    - */
    - tree = &host->revmap_data.tree;
    - if (tree->gfp_mask == 0)
    - return irq_find_mapping(host, hwirq);
    -
    - /* Now try to resolve */
    + /* Try to resolve */
    irq_radix_rdlock(&flags);
    - ptr = radix_tree_lookup(tree, hwirq);
    + ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
    irq_radix_rdunlock(flags);

    /* Found it, return */
    @@ -927,7 +924,7 @@ unsigned int irq_radix_revmap(struct irq_host *host,
    virq = irq_find_mapping(host, hwirq);
    if (virq != NO_IRQ) {
    irq_radix_wrlock(&flags);
    - radix_tree_insert(tree, hwirq, &irq_map[virq]);
    + radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
    irq_radix_wrunlock(flags);
    }
    return virq;
    @@ -1035,23 +1032,6 @@ void irq_early_init(void)
    get_irq_desc(i)->status |= IRQ_NOREQUEST;
    }

    -/* We need to create the radix trees late */
    -static int irq_late_init(void)
    -{
    - struct irq_host *h;
    - unsigned long flags;
    -
    - irq_radix_wrlock(&flags);
    - list_for_each_entry(h, &irq_hosts, link) {
    - if (h->revmap_type == IRQ_HOST_MAP_TREE)
    - INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
    - }
    - irq_radix_wrunlock(flags);
    -
    - return 0;
    -}
    -arch_initcall(irq_late_init);
    -
    #ifdef CONFIG_VIRQ_DEBUG
    static int virq_debug_show(struct seq_file *m, void *private)
    {
    diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
    index 9d8f8c8..b143fe7 100644
    --- a/arch/powerpc/platforms/pseries/smp.c
    +++ b/arch/powerpc/platforms/pseries/smp.c
    @@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)

    static int __init smp_xics_probe(void)
    {
    + irq_radix_revmap_init();
    xics_request_IPIs();

    return cpus_weight(cpu_possible_map);
    diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
    index 1ef8e30..47b8119 100644
    --- a/include/asm-powerpc/irq.h
    +++ b/include/asm-powerpc/irq.h
    @@ -237,6 +237,11 @@ extern unsigned int irq_find_mapping(struct irq_host *host,
    */
    extern unsigned int irq_create_direct_mapping(struct irq_host *host);

    +/*
    + * Initialize the radix tree used by some irq controllers
    + */
    +extern void __init irq_radix_revmap_init(void);
    +
    /**
    * irq_radix_revmap - Find a linux virq from a hw irq number.
    * @host: host owning this hardware interrupt
    --
    1.5.5.1

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. [PATCH] powerpc - Make the irq reverse mapping radix tree lockless

    The radix trees used by interrupt controllers for their irq reverse mapping
    (currently only the XICS found on pSeries) have a complex locking scheme
    dating back to before the advent of the lockless radix tree.

    Take advantage of this and of the fact that the items of the tree are
    pointers to a static array (irq_map) elements which can never go under us
    to simplify the locking.

    Concurrency between readers and writers is handled by the intrinsic
    properties of the lockless radix tree. Concurrency between writers is handled
    with a spinlock added to the irq_host structure.


    Signed-off-by: Sebastien Dugue
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    ---
    arch/powerpc/kernel/irq.c | 75 ++++++--------------------------------------
    include/asm-powerpc/irq.h | 1 +
    2 files changed, 12 insertions(+), 64 deletions(-)

    diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
    index 083b181..3aa683b 100644
    --- a/arch/powerpc/kernel/irq.c
    +++ b/arch/powerpc/kernel/irq.c
    @@ -458,8 +458,6 @@ void do_softirq(void)

    static LIST_HEAD(irq_hosts);
    static DEFINE_SPINLOCK(irq_big_lock);
    -static DEFINE_PER_CPU(unsigned int, irq_radix_reader);
    -static unsigned int irq_radix_writer;
    struct irq_map_entry irq_map[NR_IRQS];
    static unsigned int irq_virq_count = NR_IRQS;
    static struct irq_host *irq_default_host;
    @@ -602,57 +600,6 @@ void irq_set_virq_count(unsigned int count)
    irq_virq_count = count;
    }

    -/* radix tree not lockless safe ! we use a brlock-type mecanism
    - * for now, until we can use a lockless radix tree
    - */
    -static void irq_radix_wrlock(unsigned long *flags)
    -{
    - unsigned int cpu, ok;
    -
    - spin_lock_irqsave(&irq_big_lock, *flags);
    - irq_radix_writer = 1;
    - smp_mb();
    - do {
    - barrier();
    - ok = 1;
    - for_each_possible_cpu(cpu) {
    - if (per_cpu(irq_radix_reader, cpu)) {
    - ok = 0;
    - break;
    - }
    - }
    - if (!ok)
    - cpu_relax();
    - } while(!ok);
    -}
    -
    -static void irq_radix_wrunlock(unsigned long flags)
    -{
    - smp_wmb();
    - irq_radix_writer = 0;
    - spin_unlock_irqrestore(&irq_big_lock, flags);
    -}
    -
    -static void irq_radix_rdlock(unsigned long *flags)
    -{
    - local_irq_save(*flags);
    - __get_cpu_var(irq_radix_reader) = 1;
    - smp_mb();
    - if (likely(irq_radix_writer == 0))
    - return;
    - __get_cpu_var(irq_radix_reader) = 0;
    - smp_wmb();
    - spin_lock(&irq_big_lock);
    - __get_cpu_var(irq_radix_reader) = 1;
    - spin_unlock(&irq_big_lock);
    -}
    -
    -static void irq_radix_rdunlock(unsigned long flags)
    -{
    - __get_cpu_var(irq_radix_reader) = 0;
    - local_irq_restore(flags);
    -}
    -
    static int irq_setup_virq(struct irq_host *host, unsigned int virq,
    irq_hw_number_t hwirq)
    {
    @@ -807,7 +754,6 @@ void irq_dispose_mapping(unsigned int virq)
    {
    struct irq_host *host;
    irq_hw_number_t hwirq;
    - unsigned long flags;

    if (virq == NO_IRQ)
    return;
    @@ -840,9 +786,9 @@ void irq_dispose_mapping(unsigned int virq)
    host->revmap_data.linear.revmap[hwirq] = NO_IRQ;
    break;
    case IRQ_HOST_MAP_TREE:
    - irq_radix_wrlock(&flags);
    + spin_lock(&host->tree_lock);
    radix_tree_delete(&host->revmap_data.tree, hwirq);
    - irq_radix_wrunlock(flags);
    + spin_unlock(&host->tree_lock);
    break;
    }

    @@ -895,8 +841,10 @@ void __init irq_radix_revmap_init(void)
    struct irq_host *h;

    list_for_each_entry(h, &irq_hosts, link) {
    - if (h->revmap_type == IRQ_HOST_MAP_TREE)
    + if (h->revmap_type == IRQ_HOST_MAP_TREE) {
    INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
    + spin_lock_init(&h->tree_lock);
    + }
    }
    }

    @@ -905,13 +853,14 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host,
    {
    struct irq_map_entry *ptr;
    unsigned int virq = NO_IRQ;
    - unsigned long flags;

    WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);

    - irq_radix_rdlock(&flags);
    + /*
    + * No rcu_read_lock(ing) needed, the ptr returned can't go under us
    + * as it's referencing an entry in the static irq_map table.
    + */
    ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
    - irq_radix_rdunlock(flags);

    if (ptr)
    virq = ptr - irq_map;
    @@ -922,14 +871,12 @@ unsigned int irq_radix_revmap_lookup(struct irq_host *host,
    void irq_radix_revmap_insert(struct irq_host *host, unsigned int virq,
    irq_hw_number_t hwirq)
    {
    - unsigned long flags;
    -
    WARN_ON(host->revmap_type != IRQ_HOST_MAP_TREE);

    if (virq != NO_IRQ) {
    - irq_radix_wrlock(&flags);
    + spin_lock(&host->tree_lock);
    radix_tree_insert(&host->revmap_data.tree, hwirq, &irq_map[virq]);
    - irq_radix_wrunlock(flags);
    + spin_unlock(&host->tree_lock);
    }
    }

    diff --git a/include/asm-powerpc/irq.h b/include/asm-powerpc/irq.h
    index 5c88acf..2ae395f 100644
    --- a/include/asm-powerpc/irq.h
    +++ b/include/asm-powerpc/irq.h
    @@ -121,6 +121,7 @@ struct irq_host {
    } linear;
    struct radix_tree_root tree;
    } revmap_data;
    + spinlock_t tree_lock;
    struct irq_host_ops *ops;
    void *host_data;
    irq_hw_number_t inval_irq;
    --
    1.5.5.1

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  5. Re: [PATCH 0/3] powerpc - Make the irq reverse mapping tree lockless


    OK, I goofed up with git-format-patch, forgot the --numbered option.

    The patches subjects should read:

    [PATCH 1/3] powerpc - Initialize the irq radix tree earlier
    [PATCH 2/3] powerpc - Separate the irq radix tree insertion and lookup
    [PATCH 3/3] powerpc - Make the irq reverse mapping radix tree lockless

    Sorry for that.

    Sebastien.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  6. Re: [PATCH] powerpc - Initialize the irq radix tree earlier

    On Thu, 2008-07-31 at 11:40 +0200, Sebastien Dugue wrote:
    > The radix tree used for fast irq reverse mapping by the XICS is initialized
    > late in the boot process, after the first interrupt (IPI) gets registered
    > and after the first IPI is received.
    >
    > This patch moves the initialization of the XICS radix tree earlier into
    > the boot process in smp_xics_probe() (the mm is already up but no interrupts
    > have been registered at that point) to avoid having to insert a mapping into
    > the tree in interrupt context. This will help in simplifying the locking
    > constraints and move to a lockless radix tree in subsequent patches.
    >
    > As a nice side effect, there is no need any longer to check for
    > (host->revmap_data.tree.gfp_mask != 0) to know if the tree have been
    > initialized.


    Hi Sebastien,

    This is a nice cleanup, I think

    > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
    > index 6ac8612..0a1445c 100644
    > --- a/arch/powerpc/kernel/irq.c
    > +++ b/arch/powerpc/kernel/irq.c
    > @@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
    > }
    > EXPORT_SYMBOL_GPL(irq_find_mapping);
    >
    > +void __init irq_radix_revmap_init(void)
    > +{
    > + struct irq_host *h;
    > +
    > + list_for_each_entry(h, &irq_hosts, link) {
    > + if (h->revmap_type == IRQ_HOST_MAP_TREE)
    > + INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
    > + }
    > +}


    Note irq_radix_revmap_init() loops over all irq_hosts ...

    > diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
    > index 9d8f8c8..b143fe7 100644
    > --- a/arch/powerpc/platforms/pseries/smp.c
    > +++ b/arch/powerpc/platforms/pseries/smp.c
    > @@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)
    >
    > static int __init smp_xics_probe(void)
    > {
    > + irq_radix_revmap_init();
    > xics_request_IPIs();


    But now it's only called from the xics setup code.

    Which seems a bit ugly. In practice it doesn't matter because at the
    moment xics is the only user of the radix revmap. But if we're going to
    switch to this sort of initialisation I think xics should only be
    init'ing the revmap for itself.


    This boot ordering stuff is pretty hairy, so I might have missed
    something, but this is how the code is ordered AFAICT:
    
    start_kernel()
    init_IRQ()
    ...
    local_irq_enable()
    ...
    rest_init()
    kernel_thread()
    kernel_init()
    smp_prepare_cpus()
    smp_xics_probe() (via smp_ops->probe())


    What's stopping us from taking an irq between local_irq_enable() and
    smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?


    cheers

    --
    Michael Ellerman
    OzLabs, IBM Australia Development Lab

    wwweb: http://michael.ellerman.id.au
    phone: +61 2 6212 1183 (tie line 70 21183)

    We do not inherit the earth from our ancestors,
    we borrow it from our children. - S.M.A.R.T Person

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.6 (GNU/Linux)

    iD8DBQBIkaTIdSjSd0sB4dIRAlLzAJ9UCXxhuVH9Lau6Ps3Yr8 aelr4a+gCfd2el
    T0F4Q23W4JmI2+5fFQ89OvI=
    =NyUP
    -----END PGP SIGNATURE-----


  7. Re: [PATCH] powerpc - Initialize the irq radix tree earlier


    Hi Michael,

    On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman wrote:

    > On Thu, 2008-07-31 at 11:40 +0200, Sebastien Dugue wrote:
    > > The radix tree used for fast irq reverse mapping by the XICS is initialized
    > > late in the boot process, after the first interrupt (IPI) gets registered
    > > and after the first IPI is received.
    > >
    > > This patch moves the initialization of the XICS radix tree earlier into
    > > the boot process in smp_xics_probe() (the mm is already up but no interrupts
    > > have been registered at that point) to avoid having to insert a mapping into
    > > the tree in interrupt context. This will help in simplifying the locking
    > > constraints and move to a lockless radix tree in subsequent patches.
    > >
    > > As a nice side effect, there is no need any longer to check for
    > > (host->revmap_data.tree.gfp_mask != 0) to know if the tree have been
    > > initialized.

    >
    > Hi Sebastien,
    >
    > This is a nice cleanup, I think


    Thanks.

    >
    > > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
    > > index 6ac8612..0a1445c 100644
    > > --- a/arch/powerpc/kernel/irq.c
    > > +++ b/arch/powerpc/kernel/irq.c
    > > @@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
    > > }
    > > EXPORT_SYMBOL_GPL(irq_find_mapping);
    > >
    > > +void __init irq_radix_revmap_init(void)
    > > +{
    > > + struct irq_host *h;
    > > +
    > > + list_for_each_entry(h, &irq_hosts, link) {
    > > + if (h->revmap_type == IRQ_HOST_MAP_TREE)
    > > + INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
    > > + }
    > > +}

    >
    > Note irq_radix_revmap_init() loops over all irq_hosts ...


    Yep, but there's only one host (xics)

    >
    > > diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
    > > index 9d8f8c8..b143fe7 100644
    > > --- a/arch/powerpc/platforms/pseries/smp.c
    > > +++ b/arch/powerpc/platforms/pseries/smp.c
    > > @@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)
    > >
    > > static int __init smp_xics_probe(void)
    > > {
    > > + irq_radix_revmap_init();
    > > xics_request_IPIs();

    >
    > But now it's only called from the xics setup code.
    >
    > Which seems a bit ugly. In practice it doesn't matter because at the
    > moment xics is the only user of the radix revmap. But if we're going to
    > switch to this sort of initialisation I think xics should only be
    > init'ing the revmap for itself.


    You're right, that's what I intended to do from the beginning but
    stumbled upon ... hmm, can't remember what, that made me change
    my mind. But I agree, I'm not particularly proud of that. Will look
    again into that.

    >
    >
    > This boot ordering stuff is pretty hairy, so I might have missed
    > something, but this is how the code is ordered AFAICT:
    > 
    > start_kernel()
    > init_IRQ()
    > ...
    > local_irq_enable()
    > ...
    > rest_init()
    > kernel_thread()
    > kernel_init()
    > smp_prepare_cpus()
    > smp_xics_probe() (via smp_ops->probe())
    >
    >
    > What's stopping us from taking an irq between local_irq_enable() and
    > smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?


    It's hairy, I agree, but as you've mentioned no one has done a request_irq()
    at that point. The first one to do it is smp_xics_probe() for the IPI.

    Thanks for your comments.

    Sebastien.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  8. Re: [PATCH] powerpc - Initialize the irq radix tree earlier

    On Thu, 31 Jul 2008 14:00:02 +0200 Sebastien Dugue wrote:

    >
    > Hi Michael,
    >
    > On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman wrote:
    >
    > > On Thu, 2008-07-31 at 11:40 +0200, Sebastien Dugue wrote:
    > > > The radix tree used for fast irq reverse mapping by the XICS is initialized
    > > > late in the boot process, after the first interrupt (IPI) gets registered
    > > > and after the first IPI is received.
    > > >
    > > > This patch moves the initialization of the XICS radix tree earlier into
    > > > the boot process in smp_xics_probe() (the mm is already up but no interrupts
    > > > have been registered at that point) to avoid having to insert a mapping into
    > > > the tree in interrupt context. This will help in simplifying the locking
    > > > constraints and move to a lockless radix tree in subsequent patches.
    > > >
    > > > As a nice side effect, there is no need any longer to check for
    > > > (host->revmap_data.tree.gfp_mask != 0) to know if the tree have been
    > > > initialized.

    > >
    > > Hi Sebastien,
    > >
    > > This is a nice cleanup, I think

    >
    > Thanks.
    >
    > >
    > > > diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
    > > > index 6ac8612..0a1445c 100644
    > > > --- a/arch/powerpc/kernel/irq.c
    > > > +++ b/arch/powerpc/kernel/irq.c
    > > > @@ -893,28 +890,28 @@ unsigned int irq_find_mapping(struct irq_host *host,
    > > > }
    > > > EXPORT_SYMBOL_GPL(irq_find_mapping);
    > > >
    > > > +void __init irq_radix_revmap_init(void)
    > > > +{
    > > > + struct irq_host *h;
    > > > +
    > > > + list_for_each_entry(h, &irq_hosts, link) {
    > > > + if (h->revmap_type == IRQ_HOST_MAP_TREE)
    > > > + INIT_RADIX_TREE(&h->revmap_data.tree, GFP_ATOMIC);
    > > > + }
    > > > +}

    > >
    > > Note irq_radix_revmap_init() loops over all irq_hosts ...

    >
    > Yep, but there's only one host (xics)
    >
    > >
    > > > diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c
    > > > index 9d8f8c8..b143fe7 100644
    > > > --- a/arch/powerpc/platforms/pseries/smp.c
    > > > +++ b/arch/powerpc/platforms/pseries/smp.c
    > > > @@ -130,6 +130,7 @@ static void smp_xics_message_pass(int target, int msg)
    > > >
    > > > static int __init smp_xics_probe(void)
    > > > {
    > > > + irq_radix_revmap_init();
    > > > xics_request_IPIs();

    > >
    > > But now it's only called from the xics setup code.
    > >
    > > Which seems a bit ugly. In practice it doesn't matter because at the
    > > moment xics is the only user of the radix revmap. But if we're going to
    > > switch to this sort of initialisation I think xics should only be
    > > init'ing the revmap for itself.

    >
    > You're right, that's what I intended to do from the beginning but
    > stumbled upon ... hmm, can't remember what, that made me change
    > my mind.


    Ah yes, I wanted to do it from xics_init_host() but backed off
    because at that point the mm is not up. But it does not make a difference
    as the first request_irq() happens after the mm is up. A bit shaky I
    concede.

    > But I agree, I'm not particularly proud of that. Will look
    > again into that.
    >
    > >
    > >
    > > This boot ordering stuff is pretty hairy, so I might have missed
    > > something, but this is how the code is ordered AFAICT:
    > > 
    > > start_kernel()
    > > init_IRQ()
    > > ...
    > > local_irq_enable()
    > > ...
    > > rest_init()
    > > kernel_thread()
    > > kernel_init()
    > > smp_prepare_cpus()
    > > smp_xics_probe() (via smp_ops->probe())
    > >
    > >
    > > What's stopping us from taking an irq between local_irq_enable() and
    > > smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?

    >
    > It's hairy, I agree, but as you've mentioned no one has done a request_irq()
    > at that point. The first one to do it is smp_xics_probe() for the IPI.
    >
    > Thanks for your comments.
    >
    > Sebastien.
    > _______________________________________________
    > Linuxppc-dev mailing list
    > Linuxppc-dev@ozlabs.org
    > https://ozlabs.org/mailman/listinfo/linuxppc-dev

    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  9. Re: [PATCH] powerpc - Initialize the irq radix tree earlier

    On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
    > On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman wrote:
    > >
    > > This boot ordering stuff is pretty hairy, so I might have missed
    > > something, but this is how the code is ordered AFAICT:
    > > 
    > > start_kernel()
    > > init_IRQ()
    > > ...
    > > local_irq_enable()
    > > ...
    > > rest_init()
    > > kernel_thread()
    > > kernel_init()
    > > smp_prepare_cpus()
    > > smp_xics_probe() (via smp_ops->probe())
    > >
    > >
    > > What's stopping us from taking an irq between local_irq_enable() and
    > > smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?

    >
    > It's hairy, I agree, but as you've mentioned no one has done a request_irq()
    > at that point. The first one to do it is smp_xics_probe() for the IPI.


    Hmm, I don't think that's strong enough. I can trivially cause irqs to
    fire during a kexec reboot just by mashing the keyboard.

    And during a kdump boot all sorts of stuff could be firing. Even during
    a clean boot, from firmware, I don't think we can guarantee that
    nothing's going to fire.

    ... after a bit of testing ..

    It seems it actually works (sort of).

    xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:

    ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);

    And because host->revmap_data.tree was zalloc'ed we trip on the first
    check here:



    cheers

    --
    Michael Ellerman
    OzLabs, IBM Australia Development Lab

    wwweb: http://michael.ellerman.id.au
    phone: +61 2 6212 1183 (tie line 70 21183)

    We do not inherit the earth from our ancestors,
    we borrow it from our children. - S.M.A.R.T Person

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.6 (GNU/Linux)

    iD8DBQBIkbbwdSjSd0sB4dIRAnL8AJ0U6V/hgMyz63PAXyleQFBUyt5PowCeN+MS
    aW9sJwt87wxn3x89zotwskk=
    =waH1
    -----END PGP SIGNATURE-----


  10. Re: [PATCH] powerpc - Initialize the irq radix tree earlier

    On Thu, 2008-07-31 at 22:58 +1000, Michael Ellerman wrote:
    > On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
    > > On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman wrote:
    > > >
    > > > This boot ordering stuff is pretty hairy, so I might have missed
    > > > something, but this is how the code is ordered AFAICT:
    > > > 
    > > > start_kernel()
    > > > init_IRQ()
    > > > ...
    > > > local_irq_enable()
    > > > ...
    > > > rest_init()
    > > > kernel_thread()
    > > > kernel_init()
    > > > smp_prepare_cpus()
    > > > smp_xics_probe() (via smp_ops->probe())
    > > >
    > > >
    > > > What's stopping us from taking an irq between local_irq_enable() and
    > > > smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?

    > >
    > > It's hairy, I agree, but as you've mentioned no one has done a request_irq()
    > > at that point. The first one to do it is smp_xics_probe() for the IPI.

    >
    > Hmm, I don't think that's strong enough. I can trivially cause irqs to
    > fire during a kexec reboot just by mashing the keyboard.
    >
    > And during a kdump boot all sorts of stuff could be firing. Even during
    > a clean boot, from firmware, I don't think we can guarantee that
    > nothing's going to fire.
    >
    > .. after a bit of testing ..
    >
    > It seems it actually works (sort of).
    >
    > xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:
    >
    > ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
    >
    > And because host->revmap_data.tree was zalloc'ed we trip on the first
    > check here:


    @#$% ctrl-enter == send!

    Continuing ...

    void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
    {
    unsigned int height, shift;
    struct radix_tree_node *node, **slot;

    node = rcu_dereference(root->rnode);
    if (node == NULL)
    return NULL;

    Which means irq_radix_revmap_lookup() will return NO_IRQ, which iscool.


    So I think it can fly, as long as we're happy that we can't reverse map
    anything until smp_xics_probe() - and I think that's true, as any irq we
    take will be invalid.

    cheers

    --
    Michael Ellerman
    OzLabs, IBM Australia Development Lab

    wwweb: http://michael.ellerman.id.au
    phone: +61 2 6212 1183 (tie line 70 21183)

    We do not inherit the earth from our ancestors,
    we borrow it from our children. - S.M.A.R.T Person

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.6 (GNU/Linux)

    iD8DBQBIkbezdSjSd0sB4dIRAkisAJsHzgyg5HoFJkP7Re4K4x +cY9A2XACgq8YY
    uk3cPoEt5eWGPqvfhs/YxrI=
    =vBiQ
    -----END PGP SIGNATURE-----


  11. Re: [PATCH] powerpc - Initialize the irq radix tree earlier

    On Thu, 31 Jul 2008 23:01:39 +1000 Michael Ellerman wrote:

    > On Thu, 2008-07-31 at 22:58 +1000, Michael Ellerman wrote:
    > > On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
    > > > On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman wrote:
    > > > >
    > > > > This boot ordering stuff is pretty hairy, so I might have missed
    > > > > something, but this is how the code is ordered AFAICT:
    > > > > 
    > > > > start_kernel()
    > > > > init_IRQ()
    > > > > ...
    > > > > local_irq_enable()
    > > > > ...
    > > > > rest_init()
    > > > > kernel_thread()
    > > > > kernel_init()
    > > > > smp_prepare_cpus()
    > > > > smp_xics_probe() (via smp_ops->probe())
    > > > >
    > > > >
    > > > > What's stopping us from taking an irq between local_irq_enable() and
    > > > > smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?
    > > >
    > > > It's hairy, I agree, but as you've mentioned no one has done a request_irq()
    > > > at that point. The first one to do it is smp_xics_probe() for the IPI.

    > >
    > > Hmm, I don't think that's strong enough. I can trivially cause irqs to
    > > fire during a kexec reboot just by mashing the keyboard.
    > >
    > > And during a kdump boot all sorts of stuff could be firing. Even during
    > > a clean boot, from firmware, I don't think we can guarantee that
    > > nothing's going to fire.
    > >
    > > .. after a bit of testing ..
    > >
    > > It seems it actually works (sort of).
    > >
    > > xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:
    > >
    > > ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
    > >
    > > And because host->revmap_data.tree was zalloc'ed we trip on the first
    > > check here:

    >
    > @#$% ctrl-enter == send!
    >
    > Continuing ...
    >
    > void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
    > {
    > unsigned int height, shift;
    > struct radix_tree_node *node, **slot;
    >
    > node = rcu_dereference(root->rnode);
    > if (node == NULL)
    > return NULL;
    >
    > Which means irq_radix_revmap_lookup() will return NO_IRQ, which is cool.


    Which is what I intended so that as long as no IRQ is registered we
    return NO_IRQ.

    >
    >
    > So I think it can fly, as long as we're happy that we can't reverse map
    > anything until smp_xics_probe() - and I think that's true, as any irq we
    > take will be invalid.


    That's true as no IRQs are registered before smp_xics_probe() and for any
    interrupt we might get before that, irq_radix_revmap_lookup() will return
    NO_IRQ.

    Thanks,

    Sebastien.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  12. Re: [PATCH] powerpc - Initialize the irq radix tree earlier

    On Thu, 2008-07-31 at 15:26 +0200, Sebastien Dugue wrote:
    > On Thu, 31 Jul 2008 23:01:39 +1000 Michael Ellerman wrote:
    >
    > > On Thu, 2008-07-31 at 22:58 +1000, Michael Ellerman wrote:
    > > > On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
    > > > > On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman wrote:
    > > > > >
    > > > > > This boot ordering stuff is pretty hairy, so I might have missed
    > > > > > something, but this is how the code is ordered AFAICT:
    > > > > > 
    > > > > > start_kernel()
    > > > > > init_IRQ()
    > > > > > ...
    > > > > > local_irq_enable()
    > > > > > ...
    > > > > > rest_init()
    > > > > > kernel_thread()
    > > > > > kernel_init()
    > > > > > smp_prepare_cpus()
    > > > > > smp_xics_probe() (via smp_ops->probe())
    > > > > >
    > > > > >
    > > > > > What's stopping us from taking an irq between local_irq_enable() and
    > > > > > smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?
    > > > >
    > > > > It's hairy, I agree, but as you've mentioned no one has done a request_irq()
    > > > > at that point. The first one to do it is smp_xics_probe() for the IPI.
    > > >
    > > > Hmm, I don't think that's strong enough. I can trivially cause irqs to
    > > > fire during a kexec reboot just by mashing the keyboard.
    > > >
    > > > And during a kdump boot all sorts of stuff could be firing. Even during
    > > > a clean boot, from firmware, I don't think we can guarantee that
    > > > nothing's going to fire.
    > > >
    > > > .. after a bit of testing ..
    > > >
    > > > It seems it actually works (sort of).
    > > >
    > > > xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:
    > > >
    > > > ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
    > > >
    > > > And because host->revmap_data.tree was zalloc'ed we trip on the first
    > > > check here:

    > >
    > > @#$% ctrl-enter == send!
    > >
    > > Continuing ...
    > >
    > > void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
    > > {
    > > unsigned int height, shift;
    > > struct radix_tree_node *node, **slot;
    > >
    > > node = rcu_dereference(root->rnode);
    > > if (node == NULL)
    > > return NULL;
    > >
    > > Which means irq_radix_revmap_lookup() will return NO_IRQ, which is cool.

    >
    > Which is what I intended so that as long as no IRQ is registered we
    > return NO_IRQ.
    >
    > >
    > >
    > > So I think it can fly, as long as we're happy that we can't reverse map
    > > anything until smp_xics_probe() - and I think that's true, as any irq we
    > > take will be invalid.

    >
    > That's true as no IRQs are registered before smp_xics_probe() and for any
    > interrupt we might get before that, irq_radix_revmap_lookup() will return
    > NO_IRQ.


    Cool, we agree

    My only worry is that we might be relying on on the particular radix
    tree implementation a bit too much. Is it documented somewhere that
    the /very/ first check is for root->rnode != NULL, and the rest of the
    root may be unintialised?

    And I think it needs a big fat comment in the irq code saying that it's
    safe because revmap_data is zalloc'ed, and that means the radix lookup
    will fail (safely).

    cheers

    --
    Michael Ellerman
    OzLabs, IBM Australia Development Lab

    wwweb: http://michael.ellerman.id.au
    phone: +61 2 6212 1183 (tie line 70 21183)

    We do not inherit the earth from our ancestors,
    we borrow it from our children. - S.M.A.R.T Person

    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.4.6 (GNU/Linux)

    iD8DBQBIkcCOdSjSd0sB4dIRAuxGAJ0TguBTpS0p4xdDf+dAZK TcHyTkvQCeNc7S
    6UpVTe06vYU5VRVjCJNvKYU=
    =IyZ5
    -----END PGP SIGNATURE-----


  13. Re: [PATCH] powerpc - Initialize the irq radix tree earlier

    On Thu, 31 Jul 2008 23:39:26 +1000 Michael Ellerman wrote:

    > On Thu, 2008-07-31 at 15:26 +0200, Sebastien Dugue wrote:
    > > On Thu, 31 Jul 2008 23:01:39 +1000 Michael Ellerman wrote:
    > >
    > > > On Thu, 2008-07-31 at 22:58 +1000, Michael Ellerman wrote:
    > > > > On Thu, 2008-07-31 at 14:00 +0200, Sebastien Dugue wrote:
    > > > > > On Thu, 31 Jul 2008 21:40:56 +1000 Michael Ellerman wrote:
    > > > > > >
    > > > > > > This boot ordering stuff is pretty hairy, so I might have missed
    > > > > > > something, but this is how the code is ordered AFAICT:
    > > > > > > 
    > > > > > > start_kernel()
    > > > > > > init_IRQ()
    > > > > > > ...
    > > > > > > local_irq_enable()
    > > > > > > ...
    > > > > > > rest_init()
    > > > > > > kernel_thread()
    > > > > > > kernel_init()
    > > > > > > smp_prepare_cpus()
    > > > > > > smp_xics_probe() (via smp_ops->probe())
    > > > > > >
    > > > > > >
    > > > > > > What's stopping us from taking an irq between local_irq_enable() and
    > > > > > > smp_xics_probe() ? Is it just that no one's request_irq()'ed them yet?
    > > > > >
    > > > > > It's hairy, I agree, but as you've mentioned no one has done a request_irq()
    > > > > > at that point. The first one to do it is smp_xics_probe() for the IPI.
    > > > >
    > > > > Hmm, I don't think that's strong enough. I can trivially cause irqs to
    > > > > fire during a kexec reboot just by mashing the keyboard.
    > > > >
    > > > > And during a kdump boot all sorts of stuff could be firing. Even during
    > > > > a clean boot, from firmware, I don't think we can guarantee that
    > > > > nothing's going to fire.
    > > > >
    > > > > .. after a bit of testing ..
    > > > >
    > > > > It seems it actually works (sort of).
    > > > >
    > > > > xics_remap_irq() calls irq_radix_revmap_lookup(), which calls:
    > > > >
    > > > > ptr = radix_tree_lookup(&host->revmap_data.tree, hwirq);
    > > > >
    > > > > And because host->revmap_data.tree was zalloc'ed we trip on the first
    > > > > check here:
    > > >
    > > > @#$% ctrl-enter == send!
    > > >
    > > > Continuing ...
    > > >
    > > > void *radix_tree_lookup(struct radix_tree_root *root, unsigned long index)
    > > > {
    > > > unsigned int height, shift;
    > > > struct radix_tree_node *node, **slot;
    > > >
    > > > node = rcu_dereference(root->rnode);
    > > > if (node == NULL)
    > > > return NULL;
    > > >
    > > > Which means irq_radix_revmap_lookup() will return NO_IRQ, which is cool.

    > >
    > > Which is what I intended so that as long as no IRQ is registered we
    > > return NO_IRQ.
    > >
    > > >
    > > >
    > > > So I think it can fly, as long as we're happy that we can't reverse map
    > > > anything until smp_xics_probe() - and I think that's true, as any irq we
    > > > take will be invalid.

    > >
    > > That's true as no IRQs are registered before smp_xics_probe() and for any
    > > interrupt we might get before that, irq_radix_revmap_lookup() will return
    > > NO_IRQ.

    >
    > Cool, we agree
    >
    > My only worry is that we might be relying on on the particular radix
    > tree implementation a bit too much.


    Well maybe we could revert back to testing a flag just like we
    do for host->revmap_data.tree.gfp_mask != 0. Dunno.

    > Is it documented somewhere that
    > the /very/ first check is for root->rnode != NULL, and the rest of the
    > root may be unintialised?


    Not in anything I could read except in looking at the code.

    >
    > And I think it needs a big fat comment in the irq code saying that it's
    > safe because revmap_data is zalloc'ed, and that means the radix lookup
    > will fail (safely).


    Yep, right. Will advertise this properly for the next round if this
    remains the prefered solution.

    Thanks,

    Sebastien.



    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread