[crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060! - Kernel



Thread: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!

  1. [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!


    2.6.24-rc4-git5, got this cpufreq crash on x86 64-bit, during 'make
    randconfig' random bootup testing:

    powernow-k8: BIOS error - no PSB or ACPI _PSS objects
    ------------[ cut here ]------------
    kernel BUG at drivers/cpufreq/cpufreq.c:1060!
    invalid opcode: 0000 [1] SMP
    [...]
    RIP: 0010:[] [] cpufreq_remove_dev+0x160/0x2ad
    [...]
    Call Trace:
    [] sysdev_driver_unregister+0x53/0x8a
    [] cpufreq_register_driver+0x148/0x188
    [] kernel_init+0x14b/0x318
    [] child_rip+0xa/0x12
    [] kernel_init+0x0/0x318
    [] child_rip+0x0/0x12

    The kernel is 2.6.24-rc4-git5-ish + x86.git (with no cpufreq changes
    relative to upstream). Crashlog and config attached. Will try with
    vanilla -latest as well.

    Ingo


  2. Re: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!


    * Ingo Molnar wrote:

    > 2.6.24-rc4-git5, got this cpufreq crash on x86 64-bit, during 'make
    > randconfig' random bootup testing:


    hm, does not seem to be easily reproducible. I tried 10 bootups and 2 of
    them failed.

    Ingo
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!

    On 12/12/07, Ingo Molnar wrote:
    >
    > 2.6.24-rc4-git5, got this cpufreq crash on x86 64-bit, during 'make
    > randconfig' random bootup testing:


    Ingo, since you already scripted this, maybe you can add
    "modprobe everything/rmmod everything" test after successful bootup.
    It will catch an amazing amount of stuff, I promise.

    Ditto for modprobe/rmmod/modprobe and modprobe/rmmod/cat /proc,
    cat /sys smoke tests.

  4. Re: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!


    * Alexey Dobriyan wrote:

    > On 12/12/07, Ingo Molnar wrote:
    > >
    > > 2.6.24-rc4-git5, got this cpufreq crash on x86 64-bit, during 'make
    > > randconfig' random bootup testing:

    >
    > Ingo, since you already scripted this, maybe you can add "modprobe
    > everything/rmmod everything" test after successful bootup. It will
    > catch an amazing amount of stuff, I promise.


    something close to that is one of my standard tests: booting up an
    allyesconfig kernel.

    Ingo

  5. Re: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!

    On Wed, Dec 12, 2007 at 10:11:44AM +0100, Ingo Molnar wrote:

    > 2.6.24-rc4-git5, got this cpufreq crash on x86 64-bit, during 'make
    > randconfig' random bootup testing:


    You hit all the fun bugs.

    Just before we initialise cpufreq's notifier list...

    > Testing NMI watchdog ... <4>WARNING: CPU#0: NMI appears to be stuck (0->0)!


    eek?

    > powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00)
    > powernow-k8: BIOS error - no PSB or ACPI _PSS objects
    > ------------[ cut here ]------------
    > kernel BUG at drivers/cpufreq/cpufreq.c:1060!


    The actual BUG you hit is

    	if (unlikely(lock_policy_rwsem_write(cpu)))
    		BUG();

    It _looks_ like we're leaking a refcount on that lock, but
    I don't see where. It's a shame you can't reproduce this easily,
    as cpufreq.debug=7 would give us more clues.
    (And CONFIG_CPUFREQ_DEBUG=y)

    I'll think about this some more.

    Dave

    --
    http://www.codemonkey.org.uk

  6. Re: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!

    On Wed, Dec 12, 2007 at 11:40:13AM -0500, Dave Jones wrote:

    > > powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (1 cpu cores) (version 2.20.00)
    > > powernow-k8: BIOS error - no PSB or ACPI _PSS objects
    > > ------------[ cut here ]------------
    > > kernel BUG at drivers/cpufreq/cpufreq.c:1060!

    >
    > The actual BUG you hit is
    >
    > 	if (unlikely(lock_policy_rwsem_write(cpu)))
    > 		BUG();
    >
    > It _looks_ like we're leaking a refcount on that lock, but
    > I don't see where. It's a shame you can't reproduce this easily,
    > as cpufreq.debug=7 would give us more clues.
    > (And CONFIG_CPUFREQ_DEBUG=y)


    So we're missing some unlocks in some error paths.
    It's feasible you hit one of those.
    This patch should be the fix for that.

    Dave

    diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
    index 5e626b1..79581fa 100644
    --- a/drivers/cpufreq/cpufreq.c
    +++ b/drivers/cpufreq/cpufreq.c
    @@ -841,19 +841,25 @@ static int cpufreq_add_dev (struct sys_device * sys_dev)
     	drv_attr = cpufreq_driver->attr;
     	while ((drv_attr) && (*drv_attr)) {
     		ret = sysfs_create_file(&policy->kobj, &((*drv_attr)->attr));
    -		if (ret)
    +		if (ret) {
    +			unlock_policy_rwsem_write(cpu);
     			goto err_out_driver_exit;
    +		}
     		drv_attr++;
     	}
     	if (cpufreq_driver->get){
     		ret = sysfs_create_file(&policy->kobj, &cpuinfo_cur_freq.attr);
    -		if (ret)
    +		if (ret) {
    +			unlock_policy_rwsem_write(cpu);
     			goto err_out_driver_exit;
    +		}
     	}
     	if (cpufreq_driver->target){
     		ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
    -		if (ret)
    +		if (ret) {
    +			unlock_policy_rwsem_write(cpu);
     			goto err_out_driver_exit;
    +		}
     	}

     	spin_lock_irqsave(&cpufreq_driver_lock, flags);
    --
    http://www.codemonkey.org.uk
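
For readers following along, here is a minimal userspace sketch of the error-path pattern the patch above addresses. It is not kernel code: `toy_lock`, `create_file`, `add_dev_leaky` and `add_dev_fixed` are hypothetical names invented for illustration, and the sketch assumes (as the added `unlock_policy_rwsem_write()` calls imply) that the write lock is held while the sysfs files are created. The "leaky" variant jumps to the cleanup label with the lock still held; the "fixed" variant releases it first.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-CPU policy lock; not kernel code. */
struct toy_lock {
	int held;	/* 1 while the write side is held */
};

static void toy_lock_write(struct toy_lock *l)
{
	assert(!l->held);	/* in this toy, re-acquiring a leaked lock trips here */
	l->held = 1;
}

static void toy_unlock_write(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Hypothetical setup step: reports an error when step == fail_at. */
static int create_file(int step, int fail_at)
{
	return step == fail_at ? -1 : 0;
}

/* Pre-patch shape: the error path leaves the function with the lock held. */
static int add_dev_leaky(struct toy_lock *l, int fail_at)
{
	int ret;

	toy_lock_write(l);
	ret = create_file(1, fail_at);
	if (ret)
		goto err_out;	/* lock is still held at this point */
	toy_unlock_write(l);
	return 0;

err_out:
	return ret;
}

/* Post-patch shape: drop the lock before jumping to the cleanup label. */
static int add_dev_fixed(struct toy_lock *l, int fail_at)
{
	int ret;

	toy_lock_write(l);
	ret = create_file(1, fail_at);
	if (ret) {
		toy_unlock_write(l);
		goto err_out;
	}
	toy_unlock_write(l);
	return 0;

err_out:
	return ret;
}
```

Calling `add_dev_leaky(&lock, 1)` on a zero-initialised `struct toy_lock` returns an error and leaves `lock.held` set, while `add_dev_fixed(&lock, 1)` returns the same error with `lock.held` cleared. Whether this leak is what actually triggered the BUG() in the report is left open in the thread itself.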

  7. Re: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!


    * Dave Jones wrote:

    > > It _looks_ like we're leaking a refcount on that lock, but I don't
    > > see where. It's a shame you can't reproduce this easily, as
    > > cpufreq.debug=7 would give us more clues. (And
    > > CONFIG_CPUFREQ_DEBUG=y)

    >
    > So we're missing some unlocks in some error paths. It's feasible you
    > hit one of those. This patch should be the fix for that.


    since it's not really reproducible (i failed to get it since then), how
    about you push your fix upstream (it's an obviously correct fix), we
    consider this regression fixed and i'll re-notify you if there's still
    any problem left. It's not like there's any escape from make randconfig
    bootup test coverage in the long run ;-)

    Ingo

  8. Re: [crash] kernel BUG at drivers/cpufreq/cpufreq.c:1060!

    On Thu, Dec 13, 2007 at 11:17:11AM +0100, Ingo Molnar wrote:
    >
    > * Dave Jones wrote:
    >
    > > > It _looks_ like we're leaking a refcount on that lock, but I don't
    > > > see where. It's a shame you can't reproduce this easily, as
    > > > cpufreq.debug=7 would give us more clues. (And
    > > > CONFIG_CPUFREQ_DEBUG=y)

    > >
    > > So we're missing some unlocks in some error paths. It's feasible you
    > > hit one of those. This patch should be the fix for that.

    >
    > since it's not really reproducible (i failed to get it since then), how
    > about you push your fix upstream (it's an obviously correct fix), we
    > consider this regression fixed and i'll re-notify you if there's still
    > any problem left. It's not like there's any escape from make randconfig
    > bootup test coverage in the long run ;-)


    Yeah, will push it to Linus today.

    Dave

    --
    http://www.codemonkey.org.uk
