Fix x86 32 bit FRAME_POINTER chasing code - Kernel


  1. Fix x86 32 bit FRAME_POINTER chasing code

    This patch is simple; I don't know if it's a .24 candidate; the bug is pretty bad but not a recent regression,
    and there is obviously some risk with touching this code.



    Subject: Fix x86 32 bit FRAME_POINTER chasing code
    From: Arjan van de Ven

    The current x86 32 bit FRAME_POINTER chasing code has a nasty bug in
    that the EBP tracer doesn't actually update the value of EBP it is
    tracing, so that the code doesn't actually switch to the irq stack properly.

    The result is a truncated backtrace:

    WARNING: at timeroops.c:8 kerneloops_regression_test() (Not tainted)
    Pid: 0, comm: swapper Not tainted 2.6.24-0.77.rc4.git4.fc9 #1
    [] show_trace_log_lvl+0x1a/0x2f
    [] show_trace+0x12/0x14
    [] dump_stack+0x6c/0x72
    [] kerneloops_regression_test+0x44/0x46 [timeroops]
    [] run_timer_softirq+0x127/0x18f
    [] __do_softirq+0x78/0xff
    [] do_softirq+0x74/0xf7
    =======================

    This patch fixes the code to update EBP properly, and to check the EIP
    before printing (as the non-framepointer backtracer does) so that
    the same test backtrace now looks like this:

    WARNING: at timeroops.c:8 kerneloops_regression_test()
    Pid: 0, comm: swapper Not tainted 2.6.24-rc7 #4
    [] show_trace_log_lvl+0x1a/0x2f
    [] show_trace+0x12/0x14
    [] dump_stack+0x6a/0x70
    [] kerneloops_regression_test+0x3b/0x3d [timeroops]
    [] run_timer_softirq+0x11b/0x17c
    [] __do_softirq+0x42/0x94
    [] do_softirq+0x50/0xb6
    [] irq_exit+0x37/0x67
    [] do_IRQ+0x9a/0xaf
    [] common_interrupt+0x2e/0x34
    [] cpuidle_idle_call+0x52/0x78
    [] cpu_idle+0x46/0x60
    [] rest_init+0x43/0x45
    [] start_kernel+0x279/0x27f
    =======================

    This shows that the backtrace goes all the way down to user context now.
    This bug was found during the port to 64 bit of the frame pointer backtracer.

    Signed-off-by: Arjan van de Ven

    ---
    arch/x86/kernel/traps_32.c | 4 +++-
    1 file changed, 3 insertions(+), 1 deletion(-)

    Index: linux-2.6.24-rc7/arch/x86/kernel/traps_32.c
    ===================================================================
    --- linux-2.6.24-rc7.orig/arch/x86/kernel/traps_32.c
    +++ linux-2.6.24-rc7/arch/x86/kernel/traps_32.c
    @@ -124,7 +124,8 @@ static inline unsigned long print_contex
                     unsigned long addr;

                     addr = frame->return_address;
    -                ops->address(data, addr);
    +                if (__kernel_text_address(addr))
    +                        ops->address(data, addr);
                     /*
                      * break out of recursive entries (such as
                      * end_of_stack_stop_unwind_function). Also,
    @@ -132,6 +133,7 @@ static inline unsigned long print_contex
                      * move downwards!
                      */
                     next = frame->next_frame;
    +                ebp = (unsigned long) next;
                     if (next <= frame)
                             break;
                     frame = next;
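
    To illustrate why the missing EBP update truncates the trace, here is a
    minimal user-space sketch, not the kernel code. It assumes the caller walks
    one stack at a time and feeds the EBP returned from one walk into the next.
    All names here (walk_one_stack, trace_both_stacks, is_text_address,
    struct trace, the update_ebp flag and the 0x1000-0x2000 "text" range) are
    hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Frame-pointer frame layout: saved frame pointer, then return address. */
struct stack_frame {
    struct stack_frame *next_frame;
    unsigned long return_address;
};

#define MAX_TRACE 16

/* Collected return addresses (stands in for printing them). */
struct trace {
    unsigned long addr[MAX_TRACE];
    int n;
};

/* Hypothetical stand-in for __kernel_text_address(): a fixed fake range. */
static int is_text_address(unsigned long addr)
{
    return addr >= 0x1000 && addr < 0x2000;
}

/*
 * Walk the frame chain while it stays inside the stack region [lo, hi).
 * Returns the frame pointer the caller should continue from on the next
 * stack. With update_ebp == 0 this mimics the bug: the original value is
 * returned unchanged.
 */
static uintptr_t walk_one_stack(uintptr_t lo, uintptr_t hi, uintptr_t ebp,
                                int update_ebp, struct trace *t)
{
    uintptr_t frame = ebp;

    assert(t != NULL);
    while (frame >= lo && frame + sizeof(struct stack_frame) <= hi) {
        const struct stack_frame *f = (const struct stack_frame *)frame;
        uintptr_t next = (uintptr_t)f->next_frame;

        /* Only record addresses that look like code. */
        if (is_text_address(f->return_address) && t->n < MAX_TRACE)
            t->addr[t->n++] = f->return_address;

        if (update_ebp)
            ebp = next;     /* the one-line fix: remember where we are */
        if (next <= frame)  /* a frame pointer may never move downwards */
            break;
        frame = next;
    }
    return ebp;
}

/* Build a fake "irq stack" that links into a fake "task stack", walk both. */
static int trace_both_stacks(int update_ebp, struct trace *t)
{
    static struct stack_frame mem[8];
    uintptr_t irq_lo = (uintptr_t)&mem[0], irq_hi = (uintptr_t)&mem[4];
    uintptr_t task_lo = (uintptr_t)&mem[4], task_hi = (uintptr_t)(mem + 8);
    uintptr_t ebp;

    /* irq stack: two frames; the second one links into the task stack. */
    mem[0] = (struct stack_frame){ &mem[1], 0x1010 };
    mem[1] = (struct stack_frame){ &mem[5], 0x1020 };
    /* task stack: two frames; the chain ends with a NULL frame pointer. */
    mem[5] = (struct stack_frame){ &mem[6], 0x1030 };
    mem[6] = (struct stack_frame){ NULL, 0x1040 };

    t->n = 0;
    ebp = (uintptr_t)&mem[0];
    ebp = walk_one_stack(irq_lo, irq_hi, ebp, update_ebp, t);
    ebp = walk_one_stack(task_lo, task_hi, ebp, update_ebp, t);
    (void)ebp;
    return t->n;
}
```

    In this sketch, trace_both_stacks(1, &t) records all four return addresses,
    while trace_both_stacks(0, &t) records only the two from the first stack:
    the stale frame pointer still points into the first region, so the second
    walk rejects it immediately.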

    --
    If you want to reach me at my work email, use arjan@linux.intel.com
    For development, discussion and tips for power savings,
    visit http://www.lesswatts.org
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: Fix x86 32 bit FRAME_POINTER chasing code


    * Arjan van de Ven wrote:

    > +++ linux-2.6.24-rc7/arch/x86/kernel/traps_32.c
    > @@ -124,7 +124,8 @@ static inline unsigned long print_contex
    >                  unsigned long addr;
    >
    >                  addr = frame->return_address;
    > -                ops->address(data, addr);
    > +                if (__kernel_text_address(addr))
    > +                        ops->address(data, addr);
    >                  /*
    >                   * break out of recursive entries (such as
    >                   * end_of_stack_stop_unwind_function). Also,
    > @@ -132,6 +133,7 @@ static inline unsigned long print_contex
    >                   * move downwards!
    >                   */
    >                  next = frame->next_frame;
    > +                ebp = (unsigned long) next;
    >                  if (next <= frame)


    thanks, applied. Nice catch!

    > This patch is simple; I don't know if it's a .24 candidate; the bug is
    > pretty bad but not a recent regression, and there is obviously some
    > risk with touching this code.


    it's a 2.6.24.1 candidate i believe. We trigger plenty of various
    crashes during x86.git maintenance and others hit various crashes in
    -mm, so by the time .1 is released we'll have it in .25 and can backport
    it. Most folks/distros will update to 2.6.24.1 very quickly so there's
    no risk of months loss of quality to kerneloops.org data either.

    if there's more than 1-2 weeks to the v2.6.24 release we could merge it
    right now as well:

    Acked-by: Ingo Molnar

    because in a week we'll trigger plenty of crashes in -git based x86
    trees and will know about any regressions and will be able to reasonably
    trust it.

    Ingo

  3. Re: Fix x86 32 bit FRAME_POINTER chasing code

    On Thu, Jan 10, 2008 at 07:54:38AM +0100, Ingo Molnar wrote:
    >...
    > it's a 2.6.24.1 candidate i believe. We trigger plenty of various
    > crashes during x86.git maintenance and others hit various crashes in
    > -mm, so by the time .1 is released we'll have it in .25 and can backport
    > it. Most folks/distros will update to 2.6.24.1 very quickly so there's
    > no risk of months loss of quality to kerneloops.org data either.
    >...


    -stable should not introduce additional regressions, and my personal
    impression is that it already introduces too many regressions.

    Either this patch is considered both important and safe enough for
    getting into Linus' tree now or it's simply too late for 2.6.24*

    > Ingo


    cu
    Adrian

    --

    "Is there not promise of rain?" Ling Tan asked suddenly out
    of the darkness. There had been need of rain for many days.
    "Only a promise," Lao Er said.
    Pearl S. Buck - Dragon Seed


  4. Re: Fix x86 32 bit FRAME_POINTER chasing code

    On 01/10/2008 01:54 AM, Ingo Molnar wrote:
    > * Arjan van de Ven wrote:
    >
    >> +++ linux-2.6.24-rc7/arch/x86/kernel/traps_32.c
    >> @@ -124,7 +124,8 @@ static inline unsigned long print_contex
    >>                  unsigned long addr;
    >>
    >>                  addr = frame->return_address;
    >> -                ops->address(data, addr);
    >> +                if (__kernel_text_address(addr))
    >> +                        ops->address(data, addr);
    >>                  /*
    >>                   * break out of recursive entries (such as
    >>                   * end_of_stack_stop_unwind_function). Also,
    >> @@ -132,6 +133,7 @@ static inline unsigned long print_contex
    >>                   * move downwards!
    >>                   */
    >>                  next = frame->next_frame;
    >> +                ebp = (unsigned long) next;
    >>                  if (next <= frame)

    >
    > thanks, applied. Nice catch!
    >
    >> This patch is simple; I don't know if it's a .24 candidate; the bug is
    >> pretty bad but not a recent regression, and there is obviously some
    >> risk with touching this code.

    >
    > it's a 2.6.24.1 candidate i believe. We trigger plenty of various
    > crashes during x86.git maintenance and others hit various crashes in
    > -mm, so by the time .1 is released we'll have it in .25 and can backport
    > it. Most folks/distros will update to 2.6.24.1 very quickly so there's
    > no risk of months loss of quality to kerneloops.org data either.
    >


    Using the same logic, why not put it in 2.6.24 and then remove it in 2.6.24.1
    if it's broken?

  5. Re: Fix x86 32 bit FRAME_POINTER chasing code


    * Chuck Ebbert wrote:

    > > it's a 2.6.24.1 candidate i believe. We trigger plenty of various
    > > crashes during x86.git maintenance and others hit various crashes in
    > > -mm, so by the time .1 is released we'll have it in .25 and can
    > > backport it. Most folks/distros will update to 2.6.24.1 very quickly
    > > so there's no risk of months loss of quality to kerneloops.org data
    > > either.

    >
    > Using the same logic, why not put it in 2.6.24 and then remove it in
    > 2.6.24.1 if it's broken?


    hm, did you understand my .25 reference to mean 2.6.25-final? I meant
    2.6.25-rc1. Or if there's a 2.6.24-rc8 planned then we could put it in
    right now [with the bisectability fixes, i.e. not the original series].
    I.e. IMO what we shouldn't do is to put in these fixes without having had
    _some_ -rc release in between.

    Ingo