[PATCH -rt 0/4] nmi_watchdog fixes for -rt - Kernel


  1. [PATCH -rt 0/4] nmi_watchdog fixes for -rt

    Hi,

    Here is a patchset of nmi_watchdog fixes for the -rt kernel.
    These patches are against 2.6.24.4-rt4.
    Could you please review?

    thanks,
    Hiroshi Shimamoto
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. [PATCH -rt 2/4] x86: return true for NMI handled

    From: Hiroshi Shimamoto

    An NMI sent for show_regs is reported as an unknown NMI when nmi_watchdog
    is in local APIC mode, because lapic_wd_event() fails while the perfctr is
    still running. If the NMI is for show_regs, nmi_watchdog_tick() should
    return 1.

    On x86_32, the call to irq_show_regs_callback() is moved to the top of
    nmi_watchdog_tick(), the same as on x86_64.

    Signed-off-by: Hiroshi Shimamoto
    ---
    arch/x86/kernel/nmi_32.c | 10 +++++-----
    arch/x86/kernel/nmi_64.c | 9 +++++----
    include/linux/sched.h | 2 +-
    3 files changed, 11 insertions(+), 10 deletions(-)

    diff --git a/arch/x86/kernel/nmi_32.c b/arch/x86/kernel/nmi_32.c
    index da9deb3..d1f92ca 100644
    --- a/arch/x86/kernel/nmi_32.c
    +++ b/arch/x86/kernel/nmi_32.c
    @@ -350,10 +350,10 @@ void nmi_show_all_regs(void)

     static DEFINE_RAW_SPINLOCK(nmi_print_lock);
     
    -notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
    +notrace int irq_show_regs_callback(int cpu, struct pt_regs *regs)
     {
     	if (!nmi_show_regs[cpu])
    -		return;
    +		return 0;
     
     	nmi_show_regs[cpu] = 0;
     	spin_lock(&nmi_print_lock);
    @@ -362,6 +362,7 @@ notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
     		per_cpu(irq_stat, cpu).apic_timer_irqs);
     	show_regs(regs);
     	spin_unlock(&nmi_print_lock);
    +	return 1;
     }
     
     notrace __kprobes int
    @@ -376,8 +377,9 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
     	unsigned int sum;
     	int touched = 0;
     	int cpu = smp_processor_id();
    -	int rc=0;
    +	int rc;
     
    +	rc = irq_show_regs_callback(cpu, regs);
     	__profile_tick(CPU_PROFILING, regs);
     
     	/* check for other users first */
    @@ -404,8 +406,6 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
     	sum = per_cpu(irq_stat, cpu).apic_timer_irqs +
     		per_cpu(irq_stat, cpu).irq0_irqs;
     
    -	irq_show_regs_callback(cpu, regs);
    -
     	/* if the apic timer isn't firing, this cpu isn't doing much */
     	/* if the none of the timers isn't firing, this cpu isn't doing much */
     	if (!touched && last_irq_sums[cpu] == sum) {
    diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
    index 5d3073c..afc0317 100644
    --- a/arch/x86/kernel/nmi_64.c
    +++ b/arch/x86/kernel/nmi_64.c
    @@ -340,10 +340,10 @@ void nmi_show_all_regs(void)

     static DEFINE_RAW_SPINLOCK(nmi_print_lock);
     
    -notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
    +notrace int irq_show_regs_callback(int cpu, struct pt_regs *regs)
     {
     	if (!nmi_show_regs[cpu])
    -		return;
    +		return 0;
     
     	nmi_show_regs[cpu] = 0;
     	spin_lock(&nmi_print_lock);
    @@ -351,6 +351,7 @@ notrace void irq_show_regs_callback(int cpu, struct pt_regs *regs)
     	printk(KERN_WARNING "apic_timer_irqs: %d\n", read_pda(apic_timer_irqs));
     	show_regs(regs);
     	spin_unlock(&nmi_print_lock);
    +	return 1;
     }
     
     notrace int __kprobes
    @@ -359,9 +360,9 @@ nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
     	int sum;
     	int touched = 0;
     	int cpu = smp_processor_id();
    -	int rc = 0;
    +	int rc;
     
    -	irq_show_regs_callback(cpu, regs);
    +	rc = irq_show_regs_callback(cpu, regs);
     	__profile_tick(CPU_PROFILING, regs);
     
     	/* check for other users first */
    diff --git a/include/linux/sched.h b/include/linux/sched.h
    index 4176f87..a37200a 100644
    --- a/include/linux/sched.h
    +++ b/include/linux/sched.h
    @@ -292,7 +292,7 @@ static inline void show_state(void)
    }

    extern void show_regs(struct pt_regs *);
    -extern void irq_show_regs_callback(int cpu, struct pt_regs *regs);
    +extern int irq_show_regs_callback(int cpu, struct pt_regs *regs);

    /*
    * TASK is a pointer to the task whose backtrace we want to see (or NULL for current
    --
    1.5.4.1
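
To illustrate the return-value change described in this patch, here is a minimal userspace sketch, not kernel code. The names sketch_show_regs, sketch_show_regs_callback, sketch_watchdog_tick and NR_CPUS_SKETCH are hypothetical stand-ins, and the perfctr check is reduced to a boolean parameter. It only shows how a nonzero return from the callback keeps the tick from reporting the NMI as unhandled.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU nmi_show_regs[] request flags. */
static int sketch_show_regs[NR_CPUS_SKETCH];

/* Returns 1 if this NMI was a show_regs request for `cpu`, 0 otherwise. */
static int sketch_show_regs_callback(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	if (!sketch_show_regs[cpu])
		return 0;
	sketch_show_regs[cpu] = 0;	/* acknowledge the request */
	/* the real callback dumps the registers at this point */
	return 1;
}

/*
 * Simplified tick. `wd_event` models whether the watchdog source
 * (lapic_wd_event() in the patch description) claimed the NMI.
 * A return value of 0 means the caller would treat the NMI as unknown.
 */
static int sketch_watchdog_tick(int cpu, bool wd_event)
{
	int rc = sketch_show_regs_callback(cpu);

	if (wd_event)
		rc = 1;
	return rc;
}
```

With a pending request on a CPU, sketch_watchdog_tick(cpu, false) returns 1 even though the watchdog source did not claim the NMI; without a pending request it returns 0.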


  3. [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on

    From: Hiroshi Shimamoto

    The nmi_show_regs flags should be set before sending the NMI.

    Signed-off-by: Hiroshi Shimamoto
    ---
    arch/x86/kernel/nmi_64.c | 4 ++--
    1 files changed, 2 insertions(+), 2 deletions(-)

    diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
    index d187ab9..69cc737 100644
    --- a/arch/x86/kernel/nmi_64.c
    +++ b/arch/x86/kernel/nmi_64.c
    @@ -327,11 +327,11 @@ void nmi_show_all_regs(void)
     	if (system_state == SYSTEM_BOOTING)
     		return;
     
    -	smp_send_nmi_allbutself();
    -
     	for_each_online_cpu(i)
     		nmi_show_regs[i] = 1;
     
    +	smp_send_nmi_allbutself();
    +
     	for_each_online_cpu(i) {
     		while (nmi_show_regs[i] == 1)
     			barrier();
    --
    1.5.4.1
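
The ordering problem this patch addresses can be sketched in userspace as follows. This is an illustration under simplifying assumptions, not kernel code: NMI delivery is modelled as a synchronous function call (the worst case, where every handler runs before the sender continues), and all sketch_* names and NR_CPUS_SKETCH are hypothetical.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU nmi_show_regs[] request flags. */
static int sketch_show_regs[NR_CPUS_SKETCH];

/* Handler run on a CPU that takes the NMI: clears its flag if one is set. */
static int sketch_nmi_handler(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	if (!sketch_show_regs[cpu])
		return 0;	/* nothing requested yet: the NMI is ignored */
	sketch_show_regs[cpu] = 0;
	return 1;
}

/* Deliver the NMI to every CPU except `self`; returns how many handled it. */
static int sketch_send_nmi_allbutself(int self)
{
	int cpu, handled = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (cpu != self)
			handled += sketch_nmi_handler(cpu);
	return handled;
}

static void sketch_set_flags(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sketch_show_regs[cpu] = 1;
}

/* Number of flags still set, i.e. CPUs a waiter would still spin on. */
static int sketch_pending(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		n += sketch_show_regs[cpu];
	return n;
}

/* Old order: send the NMI first, set the flags afterwards. */
static int sketch_request_old(int self)
{
	int handled = sketch_send_nmi_allbutself(self);

	sketch_set_flags();
	return handled;
}

/* New order from the patch: set the flags, then send the NMI. */
static int sketch_request_new(int self)
{
	sketch_set_flags();
	return sketch_send_nmi_allbutself(self);
}
```

In this model the old order leaves every flag pending because no handler saw a request, while the new order clears every flag except the sender's own.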

  4. [PATCH -rt 3/4] x86: nmi_watchdog NMI needed for irq_show_regs_callback()

    From: Hiroshi Shimamoto

    The -rt kernel doesn't panic immediately when an NMI lockup is detected,
    because the kernel waits for show_regs on all CPUs, but NMIs do not
    arrive that frequently.

    Signed-off-by: Hiroshi Shimamoto
    ---
    arch/x86/kernel/nmi_32.c | 7 +++++++
    arch/x86/kernel/nmi_64.c | 8 +++++++-
    2 files changed, 14 insertions(+), 1 deletions(-)

    diff --git a/arch/x86/kernel/nmi_32.c b/arch/x86/kernel/nmi_32.c
    index f55f05b..da9deb3 100644
    --- a/arch/x86/kernel/nmi_32.c
    +++ b/arch/x86/kernel/nmi_32.c
    @@ -428,6 +428,13 @@ nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
    if (i == cpu)
    continue;
    nmi_show_regs[i] = 1;
    + }
    +
    + smp_send_nmi_allbutself();
    +
    + for_each_online_cpu(i) {
    + if (i == cpu)
    + continue;
    while (nmi_show_regs[i] == 1)
    cpu_relax();
    }
    diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
    index 69cc737..5d3073c 100644
    --- a/arch/x86/kernel/nmi_64.c
    +++ b/arch/x86/kernel/nmi_64.c
    @@ -412,10 +412,16 @@ nmi_watchdog_tick(struct pt_regs * regs, unsigned reason)
    if (i == cpu)
    continue;
    nmi_show_regs[i] = 1;
    + }
    +
    + smp_send_nmi_allbutself();
    +
    + for_each_online_cpu(i) {
    + if (i == cpu)
    + continue;
    while (nmi_show_regs[i] == 1)
    cpu_relax();
    }
    -
    die_nmi("NMI Watchdog detected LOCKUP on CPU %d\n", regs,
    panic_on_timeout);
    }
    --
    1.5.4.1


  5. Re: [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on



    On Mon, 28 Apr 2008, Hiroshi Shimamoto wrote:
    > diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
    > index d187ab9..69cc737 100644
    > --- a/arch/x86/kernel/nmi_64.c
    > +++ b/arch/x86/kernel/nmi_64.c
    > @@ -327,11 +327,11 @@ void nmi_show_all_regs(void)
    > if (system_state == SYSTEM_BOOTING)
    > return;
    >
    > - smp_send_nmi_allbutself();
    > -
    > for_each_online_cpu(i)
    > nmi_show_regs[i] = 1;


    Hi Hiroshi,

    I know this wasn't your code to begin with but, how does this function
    exit? I mean, we set an array where each index per online cpu is set to
    one, then do an "nmi_allbutself", and then wait on those indexes to turn
    zero, one at a time. If we are CPU 0 here, we set that index to 1, then
    enter the loop, and will block forever on this "while" loop below.

    Am I missing something?

    Thanks,

    -- Steve



    > + smp_send_nmi_allbutself();
    > +
    > for_each_online_cpu(i) {
    > while (nmi_show_regs[i] == 1)
    > barrier();
    > --
    > 1.5.4.1
    >


  6. Re: [PATCH -rt 0/4] nmi_watchdog fixes for -rt


    On Mon, 28 Apr 2008, Hiroshi Shimamoto wrote:

    > Hi,
    >
    > Here is a patchset of nmi_watchdog fixes for -rt kernel.
    > These patches are against 2.6.24.4-rt4.
    > Could you please review?


    Patches look good!

    I'll queue them up for the next release.

    Thanks,

    -- Steve


  7. Re: [PATCH -rt 1/4] x86_64: send NMI after nmi_show_regs on

    Steven Rostedt wrote:
    >
    > On Mon, 28 Apr 2008, Hiroshi Shimamoto wrote:
    >> diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
    >> index d187ab9..69cc737 100644
    >> --- a/arch/x86/kernel/nmi_64.c
    >> +++ b/arch/x86/kernel/nmi_64.c
    >> @@ -327,11 +327,11 @@ void nmi_show_all_regs(void)
    >> if (system_state == SYSTEM_BOOTING)
    >> return;
    >>
    >> - smp_send_nmi_allbutself();
    >> -
    >> for_each_online_cpu(i)
    >> nmi_show_regs[i] = 1;

    >
    > Hi Hiroshi,
    >
    > I know this wasn't your code to begin with but, how does this function
    > exit? I mean, we set an array where each index per online cpu is set to
    > one, then do an "nmi_allbutself", and then wait on those indexes to turn
    > zero, one at a time. If we are CPU 0 here, we set that index to 1, then
    > enter the loop, and will block forever on this "while" loop below.


    Hm, I'm not quite sure about the case when NMI is disabled.
    If NMI is working, the issuing CPU will receive an NMI and the flag is
    cleared in the NMI handler.
    I'll look into it again and will work on it if needed.

    thanks,
    Hiroshi Shimamoto
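
The question raised in this exchange can be modelled with a small userspace sketch. It is an illustration only, and the thread itself leaves the question open. All sketch_* names, NR_CPUS_SKETCH, and the two boolean parameters are hypothetical: skip_self mirrors the "if (i == cpu) continue;" form used in patch 3/4, and self_gets_nmi models the reply's assumption that a working NMI watchdog eventually delivers an NMI to the issuing CPU as well. Instead of spinning, the function reports whether the wait loop would terminate.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Stand-in for the per-CPU nmi_show_regs[] request flags. */
static int sketch_show_regs[NR_CPUS_SKETCH];

/* Handler on a CPU that takes an NMI: acknowledges any pending request. */
static void sketch_nmi_handler(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	sketch_show_regs[cpu] = 0;
}

/*
 * Returns true if every flag the issuing CPU waits on gets cleared,
 * i.e. the real spin loop would exit.
 */
static bool sketch_show_all_regs(int self, bool skip_self, bool self_gets_nmi)
{
	int cpu;

	/* Request a register dump from each CPU. */
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (skip_self && cpu == self)
			continue;
		sketch_show_regs[cpu] = 1;
	}

	/* "allbutself": every other CPU takes the NMI. */
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		if (cpu != self)
			sketch_nmi_handler(cpu);

	/* Periodic watchdog NMI on the issuing CPU, if it is running. */
	if (self_gets_nmi)
		sketch_nmi_handler(self);

	/* Would the wait loop finish? */
	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		if (skip_self && cpu == self)
			continue;
		if (sketch_show_regs[cpu])
			return false;
	}
	return true;
}
```

In this model, waiting on all CPUs including the sender only terminates when the sender also receives an NMI; skipping the sender's own index terminates in both cases.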

