2.6.28-rc4: BUG: spinlock lockup on CPU#0 - Kernel

This is a discussion on 2.6.28-rc4: BUG: spinlock lockup on CPU#0 - Kernel ; Hi It is on sparc64 (Ultra 10). 2.6.28-rc4-00011-g3ad4f59 BUG: spinlock lockup on CPU#0, metalog/1322, 00000000007f1430 Kernel panic - not syncing: Aiee, killing interrupt handler! It seems Sysrq does not work. David, how can I provide more info before I reboot ...

+ Reply to Thread
Results 1 to 4 of 4

Thread: 2.6.28-rc4: BUG: spinlock lockup on CPU#0

  1. 2.6.28-rc4: BUG: spinlock lockup on CPU#0

    Hi

    It is on sparc64 (Ultra 10). 2.6.28-rc4-00011-g3ad4f59

    BUG: spinlock lockup on CPU#0, metalog/1322, 00000000007f1430
    Kernel panic - not syncing: Aiee, killing interrupt handler!

    It seems Sysrq does not work.

    David, how can I provide more info before I reboot it ?

    Full dmesg:

    [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.27.0 2000/08/23 15:45'
    [ 0.000000] PROMLIB: Root node compatible:
    [ 0.000000] Linux version 2.6.28-rc4-00011-g3ad4f59 (fireball@cygnus) (gcc version 4.2.4 (Gentoo 4.2.4 p1.0)) #6 PREEMPT Tue Nov 11 03:16:19 MSK 2008
    [ 0.000000] console [earlyprom0] enabled
    [ 0.000000] ARCH: SUN4U
    [ 0.000000] Ethernet address: 08:00:20:f5:ba:1d
    [ 0.000000] Kernel: Using 3 locked TLB entries for main kernel image.
    [ 0.000000] Remapping the kernel... done.
    [ 0.000000] OF stdout device is: /pci@1f,0/pci@1,1/SUNW,m64B@2
    [ 0.000000] PROM: Built device tree with 43896 bytes of memory.
    [ 0.000000] Top of RAM: 0x3ff46000, Total RAM: 0x3ff34000
    [ 0.000000] Memory hole size: 0MB
    [ 0.000000] [0000000200000000-fffff80001000000] page_structs=131072 node=0 entry=0/0
    [ 0.000000] [0000000200000000-fffff80001400000] page_structs=131072 node=0 entry=1/0
    [ 0.000000] Allocated 1056768 bytes for kernel page tables.
    [ 0.000000] Zone PFN ranges:
    [ 0.000000] Normal 0x00000000 -> 0x0001ffa3
    [ 0.000000] Movable zone start PFN for each node
    [ 0.000000] early_node_map[3] active PFN ranges
    [ 0.000000] 0: 0x00000000 -> 0x0001ff7f
    [ 0.000000] 0: 0x0001ff80 -> 0x0001ff97
    [ 0.000000] 0: 0x0001ff9f -> 0x0001ffa3
    [ 0.000000] Booting Linux...
    [ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 129946
    [ 0.000000] Kernel command line: root=/dev/sda2 netconsole=@192.168.1.5/eth1,6667@192.168.1.3/00:08:c7:da:88:fb
    [ 0.000000] Preemptible RCU implementation.
    [ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
    [ 0.000000] clocksource: mult[245d1] shift[16]
    [ 0.000000] clockevent: mult[70a3d70a] shift[32]
    [ 20.820092] Console: colour dummy device 80x25
    [ 20.920473] console handover: boot [earlyprom0] -> real [tty0]
    [ 21.029396] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
    [ 21.029516] ... MAX_LOCKDEP_SUBCLASSES: 8
    [ 21.029581] ... MAX_LOCK_DEPTH: 48
    [ 21.029644] ... MAX_LOCKDEP_KEYS: 8191
    [ 21.029710] ... CLASSHASH_SIZE: 4096
    [ 21.029776] ... MAX_LOCKDEP_ENTRIES: 8192
    [ 21.029842] ... MAX_LOCKDEP_CHAINS: 16384
    [ 21.029908] ... CHAINHASH_SIZE: 8192
    [ 21.029975] memory used by lock dependency info: 3839 kB
    [ 21.030052] per task-struct memory footprint: 1920 bytes
    [ 21.030129] ------------------------
    [ 21.030185] | Locking API testsuite:
    [ 21.030243] ----------------------------------------------------------------------------
    [ 21.030354] | spin |wlock |rlock |mutex | wsem | rsem |
    [ 21.030467] --------------------------------------------------------------------------
    [ 21.030620] A-A deadlock: ok | ok | ok | ok | ok | ok |
    [ 21.056564] A-B-B-A deadlock: ok | ok | ok | ok | ok | ok |
    [ 21.081869] A-B-B-C-C-A deadlock: ok | ok | ok | ok | ok | ok |
    [ 21.107377] A-B-C-A-B-C deadlock: ok | ok | ok | ok | ok | ok |
    [ 21.132867] A-B-B-C-C-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
    [ 21.158603] A-B-C-D-B-D-D-A deadlock: ok | ok | ok | ok | ok | ok |
    [ 21.184318] A-B-C-D-B-C-D-A deadlock: ok | ok | ok | ok | ok | ok |
    [ 21.210082] double unlock: ok | ok | ok | ok | ok | ok |
    [ 21.235279] initialize held: ok | ok | ok | ok | ok | ok |
    [ 21.260474] bad unlock order: ok | ok | ok | ok | ok | ok |
    [ 21.285859] --------------------------------------------------------------------------
    [ 21.286009] recursive read-lock: | ok | | ok |
    [ 21.294539] recursive read-lock #2: | ok | | ok |
    [ 21.303071] mixed read-write-lock: | ok | | ok |
    [ 21.311616] mixed write-read-lock: | ok | | ok |
    [ 21.320169] --------------------------------------------------------------------------
    [ 21.320321] hard-irqs-on + irq-safe-A/12: ok | ok | ok |
    [ 21.332968] soft-irqs-on + irq-safe-A/12: ok | ok | ok |
    [ 21.345631] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 21.358286] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
    [ 21.370941] sirq-safe-A => hirqs-on/12: ok | ok | ok |
    [ 21.383607] sirq-safe-A => hirqs-on/21: ok | ok | ok |
    [ 21.396266] hard-safe-A + irqs-on/12: ok | ok | ok |
    [ 21.408931] soft-safe-A + irqs-on/12: ok | ok | ok |
    [ 21.421603] hard-safe-A + irqs-on/21: ok | ok | ok |
    [ 21.434274] soft-safe-A + irqs-on/21: ok | ok | ok |
    [ 21.446950] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 21.459698] soft-safe-A + unsafe-B #1/123: ok | ok | ok |
    [ 21.472460] hard-safe-A + unsafe-B #1/132: ok | ok | ok |
    [ 21.485217] soft-safe-A + unsafe-B #1/132: ok | ok | ok |
    [ 21.497963] hard-safe-A + unsafe-B #1/213: ok | ok | ok |
    [ 21.510714] soft-safe-A + unsafe-B #1/213: ok | ok | ok |
    [ 21.523465] hard-safe-A + unsafe-B #1/231: ok | ok | ok |
    [ 21.536200] soft-safe-A + unsafe-B #1/231: ok | ok | ok |
    [ 21.548947] hard-safe-A + unsafe-B #1/312: ok | ok | ok |
    [ 21.561637] soft-safe-A + unsafe-B #1/312: ok | ok | ok |
    [ 21.574330] hard-safe-A + unsafe-B #1/321: ok | ok | ok |
    [ 21.587066] soft-safe-A + unsafe-B #1/321: ok | ok | ok |
    [ 21.599807] hard-safe-A + unsafe-B #2/123: ok | ok | ok |
    [ 21.612546] soft-safe-A + unsafe-B #2/123: ok | ok | ok |
    [ 21.625304] hard-safe-A + unsafe-B #2/132: ok | ok | ok |
    [ 21.638044] soft-safe-A + unsafe-B #2/132: ok | ok | ok |
    [ 21.650807] hard-safe-A + unsafe-B #2/213: ok | ok | ok |
    [ 21.663550] soft-safe-A + unsafe-B #2/213: ok | ok | ok |
    [ 21.676311] hard-safe-A + unsafe-B #2/231: ok | ok | ok |
    [ 21.689054] soft-safe-A + unsafe-B #2/231: ok | ok | ok |
    [ 21.701799] hard-safe-A + unsafe-B #2/312: ok | ok | ok |
    [ 21.714547] soft-safe-A + unsafe-B #2/312: ok | ok | ok |
    [ 21.727314] hard-safe-A + unsafe-B #2/321: ok | ok | ok |
    [ 21.740048] soft-safe-A + unsafe-B #2/321: ok | ok | ok |
    [ 21.752805] hard-irq lock-inversion/123: ok | ok | ok |
    [ 21.765560] soft-irq lock-inversion/123: ok | ok | ok |
    [ 21.778332] hard-irq lock-inversion/132: ok | ok | ok |
    [ 21.791099] soft-irq lock-inversion/132: ok | ok | ok |
    [ 21.803888] hard-irq lock-inversion/213: ok | ok | ok |
    [ 21.816643] soft-irq lock-inversion/213: ok | ok | ok |
    [ 21.829421] hard-irq lock-inversion/231: ok | ok | ok |
    [ 21.842171] soft-irq lock-inversion/231: ok | ok | ok |
    [ 21.854949] hard-irq lock-inversion/312: ok | ok | ok |
    [ 21.867715] soft-irq lock-inversion/312: ok | ok | ok |
    [ 21.880502] hard-irq lock-inversion/321: ok | ok | ok |
    [ 21.893255] soft-irq lock-inversion/321: ok | ok | ok |
    [ 21.906022] hard-irq read-recursion/123: ok |
    [ 21.910349] soft-irq read-recursion/123: ok |
    [ 21.914676] hard-irq read-recursion/132: ok |
    [ 21.918993] soft-irq read-recursion/132: ok |
    [ 21.923317] hard-irq read-recursion/213: ok |
    [ 21.927632] soft-irq read-recursion/213: ok |
    [ 21.931954] hard-irq read-recursion/231: ok |
    [ 21.936271] soft-irq read-recursion/231: ok |
    [ 21.940594] hard-irq read-recursion/312: ok |
    [ 21.944916] soft-irq read-recursion/312: ok |
    [ 21.949238] hard-irq read-recursion/321: ok |
    [ 21.953551] soft-irq read-recursion/321: ok |
    [ 21.957876] -------------------------------------------------------
    [ 21.957988] Good, all 218 testcases passed! |
    [ 21.958074] ---------------------------------
    [ 21.963659] Dentry cache hash table entries: 131072 (order: 7, 1048576 bytes)
    [ 21.968279] Inode-cache hash table entries: 65536 (order: 6, 524288 bytes)
    [ 22.071515] Memory: 1024992k available (2920k kernel code, 1344k data, 136k init) [fffff80000000000,000000003ff46000]
    [ 22.072160] SLUB: Genslabs=13, HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
    [ 22.156160] Calibrating delay using timer specific routine.. 888.86 BogoMIPS (lpj=1479779)
    [ 22.157595] Mount-cache hash table entries: 512
    [ 22.171305] net_namespace: 648 bytes
    [ 22.173816] NET: Registered protocol family 16
    [ 22.250788] /pci@1f,0: PCI IO[1fe02000000] MEM[1ff00000000]
    [ 22.251009] /pci@1f,0: SABRE PCI Bus Module ver[0:0]
    [ 22.252227] PCI: Scanning PBM /pci@1f,0
    [ 22.256690] pci 0000:02:01.0: PME# supported from D0 D1 D2 D3hot
    [ 22.256815] pci 0000:02:01.0: PME# disabled
    [ 22.357451] SCSI subsystem initialized
    [ 22.400547] AUXIO: Found device at /pci@1f,0/pci@1,1/ebus@1/auxio@14,726000
    [ 22.404167] /pci@1f,0/pci@1,1/ebus@1/eeprom@14,0: Mostek regs at 0x1fff1000000
    [ 22.425298] NET: Registered protocol family 2
    [ 22.533868] IP route cache hash table entries: 8192 (order: 3, 65536 bytes)
    [ 22.539031] TCP established hash table entries: 32768 (order: 6, 524288 bytes)
    [ 22.542581] TCP bind hash table entries: 32768 (order: 7, 1835008 bytes)
    [ 22.574770] TCP: Hash tables configured (established 32768 bind 32768)
    [ 22.575053] TCP reno registered
    [ 22.614450] NET: Registered protocol family 1
    [ 22.615982] power: Control reg at 1fff1724000
    [ 22.635240] Kprobe smoke test started
    [ 22.759844] Kprobe smoke test passed successfully
    [ 22.959192] Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
    [ 22.971385] msgmni has been set to 2003
    [ 22.974919] io scheduler noop registered
    [ 22.978475] io scheduler cfq registered (default)
    [ 22.982618] atyfb: 3D RAGE PRO (Mach64 GP, PQFP, PCI) [0x4750 rev 0x7c]
    [ 22.984071] atyfb: 4M SGRAM (1:1), 14.31818 MHz XTAL, 230 MHz PLL, 100 Mhz MCLK, 100 MHz XCLK
    [ 22.986493] fbcon: ATY Mach64 (fb0) is primary device
    [ 23.059089] Console: switching to colour frame buffer device 144x56
    [ 23.083274] atyfb: fb0: ATY Mach64 frame buffer device on PCI
    [ 23.179801] /pci@1f,0/pci@1,1/ebus@1/su@14,3083f8: Keyboard port at 1fff13083f8, irq 6
    [ 23.180774] /pci@1f,0/pci@1,1/ebus@1/su@14,3062f8: Mouse port at 1fff13062f8, irq 7
    [ 23.186766] sunhme.c:v3.10 August 26, 2008 David S. Miller (davem@davemloft.net)
    [ 23.196517] eth0: HAPPY MEAL (PCI/CheerIO) 10/100BaseT Ethernet 08:00:20:f5:ba:1d
    [ 23.198099] e100: Intel(R) PRO/100 Network Driver, 3.5.23-k4-NAPI
    [ 23.198415] e100: Copyright(c) 1999-2006 Intel Corporation
    [ 23.223898] e100 0000:02:01.0: PME# disabled
    [ 23.231072] e100: eth1: e100_probe: addr 0x1ff00002000, irq 15, MAC addr 00:d0:b7:e4:d6:f1
    [ 23.232687] netconsole: local port 6665
    [ 23.232950] netconsole: local IP 192.168.1.5
    [ 23.233173] netconsole: interface eth1
    [ 23.233367] netconsole: remote port 6667
    [ 23.233571] netconsole: remote IP 192.168.1.3
    [ 23.233801] netconsole: remote ethernet address 00:08:c7:da:88:fb
    [ 23.234141] netconsole: device eth1 not up yet, forcing it
    [ 23.279743] e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex
    [ 23.280283] netconsole: carrier detect appears untrustworthy, waiting 4 seconds
    [ 27.323532] console [netcon0] enabled
    [ 28.521598] netconsole: network logging started
    [ 28.532863] Driver 'sd' needs updating - please use bus_type methods
    [ 28.543959] Driver 'sr' needs updating - please use bus_type methods
    [ 28.557924] scsi0 : pata_cmd64x
    [ 28.573880] scsi1 : pata_cmd64x
    [ 28.584894] ata1: PATA max MWDMA2 cmd 0x1fe02c00000 ctl 0x1fe02c00008 bmdma 0x1fe02c00020 irq 14
    [ 28.594900] ata2: PATA max MWDMA2 cmd 0x1fe02c00010 ctl 0x1fe02c00018 bmdma 0x1fe02c00028 irq 14
    [ 28.762973] ata1.00: ATA-5: ST320011A, 3.10, max UDMA/100
    [ 28.772425] ata1.00: 39102336 sectors, multi 0: LBA
    [ 28.789484] ata1.00: configured for MWDMA2
    [ 28.966201] scsi 0:0:0:0: Direct-Access ATA ST320011A 3.10 PQ: 0 ANSI: 5
    [ 28.981385] sd 0:0:0:0: [sda] 39102336 512-byte hardware sectors: (20.0 GB/18.6 GiB)
    [ 28.991592] sd 0:0:0:0: [sda] Write Protect is off
    [ 29.002396] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [ 29.014373] sd 0:0:0:0: [sda] 39102336 512-byte hardware sectors: (20.0 GB/18.6 GiB)
    [ 29.024624] sd 0:0:0:0: [sda] Write Protect is off
    [ 29.035137] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
    [ 29.045375] sda: sda1 sda2
    [ 29.077794] sd 0:0:0:0: [sda] Attached SCSI disk
    [ 29.091617] sd 0:0:0:0: Attached scsi generic sg0 type 0
    [ 29.107658] mice: PS/2 mouse device common for all mice
    [ 29.124328] rtc-m48t59 rtc-m48t59.0: rtc core: registered m48t59 as rtc0
    [ 29.147410] TCP cubic registered
    [ 29.157979] NET: Registered protocol family 17
    [ 29.176361] RPC: Registered udp transport module.
    [ 29.186071] RPC: Registered tcp transport module.
    [ 29.201571] rtc-m48t59 rtc-m48t59.0: setting system clock to 2008-11-11 03:24:40 UTC (1226373880)
    [ 29.771270] input: Sun Type 5 keyboard as /class/input/input0
    [ 29.962863] EXT3-fs: mounted filesystem with ordered data mode.
    [ 29.973192] VFS: Mounted root (ext3 filesystem) readonly.
    [ 29.984508] kjournald starting. Commit interval 5 seconds
    [66033.381833] BUG: spinlock lockup on CPU#0, metalog/1322, 00000000007f1430
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: 2.6.28-rc4: BUG: spinlock lockup on CPU#0

    From: Alexander Beregalov
    Date: Wed, 12 Nov 2008 02:32:30 +0300

    > It is on sparc64 (Ultra 10). 2.6.28-rc4-00011-g3ad4f59
    >
    > BUG: spinlock lockup on CPU#0, metalog/1322, 00000000007f1430
    > Kernel panic - not syncing: Aiee, killing interrupt handler!
    >
    > It seems Sysrq does not work.
    >
    > David, how can I provide more info before I reboot it ?
    >
    > Full dmesg:


    Hmm, this BUG message should have given you a register dump and a
    backtrace.

    Can you give it a "Alt-SysRQ-p" and show the output?

    Actually, if you got this over a serial console or
    similar, you're not getting the register dump because
    the console loglevel is filtering them out.

    If that's what's happening, reboot with "ignore_loglevel"
    (I always boot with this option, by adding it to the
    default kernel command line options in /etc/silo.conf)

    Also, in situations where the system is already booted and
    you want to see things like the Alt-SysRQ-p register dumps
    on the console, you'll need to type something like
    "Alt-SysRQ-8" to bump the console log level.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: 2.6.28-rc4: BUG: spinlock lockup on CPU#0

    2008/11/12 David Miller :
    > From: Alexander Beregalov
    > Date: Wed, 12 Nov 2008 02:32:30 +0300
    >
    >> It is on sparc64 (Ultra 10). 2.6.28-rc4-00011-g3ad4f59
    >>
    >> BUG: spinlock lockup on CPU#0, metalog/1322, 00000000007f1430
    >> Kernel panic - not syncing: Aiee, killing interrupt handler!
    >>
    >> It seems Sysrq does not work.
    >>
    >> David, how can I provide more info before I reboot it ?
    >>
    >> Full dmesg:

    >
    > Hmm, this BUG message should have given you a register dump and a
    > backtrace.
    >
    > Can you give it a "Alt-SysRQ-p" and show the output?

    No, sysrq does not work, even Stop-A does not work now.
    How could it be.

    >
    > Actually, if you got this over a serial console or
    > similar, you're not getting the register dump because
    > the console loglevel is filtering them out.

    Yes, I use netconsole.

    >
    > If that's what's happening, reboot with "ignore_loglevel"
    > (I always boot with this option, by adding it to the
    > default kernel command line options in /etc/silo.conf)
    >
    > Also, in situations where the system is already booted and
    > you want to see things like the Alt-SysRQ-p register dumps
    > on the console, you'll need to type something like
    > "Alt-SysRQ-8" to bump the console log level.


    Yes, I know, thanks.

    Ok, I have rebooted it, let's hope this will happen again...

    Hm, fsck says Superblock write time is in future.
    But rtc-m48t59 works correctly now
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  4. Re: 2.6.28-rc4: BUG: spinlock lockup on CPU#0

    From: "Alexander Beregalov"
    Date: Wed, 12 Nov 2008 03:09:54 +0300

    > 2008/11/12 David Miller :
    > > From: Alexander Beregalov
    > > Date: Wed, 12 Nov 2008 02:32:30 +0300
    > >
    > >> It is on sparc64 (Ultra 10). 2.6.28-rc4-00011-g3ad4f59
    > >>
    > >> BUG: spinlock lockup on CPU#0, metalog/1322, 00000000007f1430
    > >> Kernel panic - not syncing: Aiee, killing interrupt handler!
    > >>
    > >> It seems Sysrq does not work.
    > >>
    > >> David, how can I provide more info before I reboot it ?
    > >>
    > >> Full dmesg:

    > >
    > > Hmm, this BUG message should have given you a register dump and a
    > > backtrace.
    > >
    > > Can you give it a "Alt-SysRQ-p" and show the output?

    > No, sysrq does not work, even Stop-A does not work now.
    > How could it be.


    The cpu stuck on the spinlock has interrupts disabled, but this
    is the cpu that services console interrupts.

    That's how it could be.
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

+ Reply to Thread