WARNING in 2.6.25-07422-gb66e1f1 - Kernel





    Hi, I got this on boot:

    ------------[ cut here ]------------
    WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()
    Modules linked in: kvm_amd kvm powernow_k8 ehci_hcd ohci_hcd usbcore forcedeth
    Pid: 1681, comm: fsck.ext3 Not tainted 2.6.25-07422-gb66e1f1-dirty #74

    Call Trace:
    [] warn_on_slowpath+0x64/0xa0
    [] generic_make_request+0x194/0x250
    [] submit_bio+0x97/0x140
    [] blk_remove_plug+0x7d/0x90
    [] raid5_unplug_device+0x44/0x110
    [] sync_page+0x2e/0x50
    [] __wait_on_bit+0x65/0x90
    [] wait_on_page_bit+0x78/0x80
    [] wake_bit_function+0x0/0x40
    [] pagevec_lookup_tag+0x1a/0x30
    [] wait_on_page_writeback_range+0xbd/0x130
    [] do_writepages+0x20/0x40
    [] __filemap_fdatawrite_range+0x52/0x60
    [] filemap_write_and_wait+0x44/0x50
    [] __blkdev_put+0x148/0x1b0
    [] __fput+0xb1/0x1c0
    [] filp_close+0x48/0x80
    [] sys_close+0x9f/0x110
    [] system_call_after_swapgs+0x7b/0x80

    ---[ end trace 94c0787a2e4d19eb ]---

    Just for the record: I patched the powernow_k8 module, which is the reason
    for the "-dirty" suffix, but I got the above trace with vanilla git as well.
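    For context on what that WARN appears to be checking: around this merge
    window the block layer's queue flags were made non-atomic, and the
    queue_flag_clear() helper in include/linux/blkdev.h warns when a flag is
    changed without the queue lock held; blk_remove_plug() clears
    QUEUE_FLAG_PLUGGED, and in this trace it is reached via
    raid5_unplug_device(). Below is a minimal user-space sketch of that
    assertion pattern only - the struct, field names, and locking model are
    illustrative, not the kernel's actual code:

    ```c
    #include <stdio.h>
    #include <stdbool.h>

    /* Hypothetical stand-in for a request queue: a "locked" marker
     * plays the role of queue_is_locked(q) in the real kernel. */
    struct queue {
        bool locked;
        unsigned flags;
    };

    #define QUEUE_FLAG_PLUGGED 1u

    /* Mirrors the shape of queue_flag_clear(): warn once per call if the
     * caller does not hold the queue lock, then clear the flag anyway. */
    static void queue_flag_clear(struct queue *q, unsigned flag)
    {
        if (!q->locked)
            printf("WARNING: queue flag changed without queue lock held\n");
        q->flags &= ~flag;
    }

    int main(void)
    {
        struct queue q = { .locked = false, .flags = QUEUE_FLAG_PLUGGED };

        /* Buggy path: flag cleared without the lock -> warning fires,
         * analogous to the blk_remove_plug() call in the trace above. */
        queue_flag_clear(&q, QUEUE_FLAG_PLUGGED);

        /* Clean path: take the "lock" first -> no warning. */
        q.locked = true;
        q.flags |= QUEUE_FLAG_PLUGGED;
        queue_flag_clear(&q, QUEUE_FLAG_PLUGGED);

        printf("flags=%u\n", q.flags);
        return 0;
    }
    ```

    The point of the sketch is that the warning is about *where* the flag is
    cleared from (lock not held), not about the flag value itself being wrong.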

    .config attached and full dmesg here:

    Linux version 2.6.25-07422-gb66e1f1-dirty (light@graviton) (gcc version 4.2.3
    (Gentoo 4.2.3 p1.0)) #74 SMP PREEMPT Sat May 3 11:39:12 CEST 2008
    Command line: root=/dev/md1 init=/sbin/init sata_nv.swncq=1
    snd-hda-intel.enable_msi=0
    BIOS-provided physical RAM map:
    BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
    BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
    BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
    BIOS-e820: 0000000000100000 - 00000000cbef0000 (usable)
    BIOS-e820: 00000000cbef0000 - 00000000cbef3000 (ACPI NVS)
    BIOS-e820: 00000000cbef3000 - 00000000cbf00000 (ACPI data)
    BIOS-e820: 00000000cc000000 - 00000000d0000000 (reserved)
    BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
    BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
    BIOS-e820: 0000000100000000 - 0000000130000000 (usable)
    Entering add_active_range(0, 0, 159) 0 entries of 256 used
    Entering add_active_range(0, 256, 835312) 1 entries of 256 used
    Entering add_active_range(0, 1048576, 1245184) 2 entries of 256 used
    max_pfn_mapped = 1245184
    x86: PAT support disabled.
    init_memory_mapping
    DMI 2.4 present.
    ACPI: RSDP 000F7610, 0024 (r2 Nvidia)
    ACPI: XSDT CBEF3100, 004C (r1 Nvidia ASUSACPI 42302E31 AWRD 0)
    ACPI: FACP CBEF9C80, 00F4 (r3 Nvidia ASUSACPI 42302E31 AWRD 0)
    ACPI: DSDT CBEF3280, 6987 (r1 NVIDIA ASUSACPI 1000 MSFT 3000000)
    ACPI: FACS CBEF0000, 0040
    ACPI: SSDT CBEF9E80, 01C4 (r1 PTLTD POWERNOW 1 LTP 1)
    ACPI: HPET CBEFA0C0, 0038 (r1 Nvidia ASUSACPI 42302E31 AWRD 98)
    ACPI: MCFG CBEFA140, 003C (r1 Nvidia ASUSACPI 42302E31 AWRD 0)
    ACPI: APIC CBEF9DC0, 007C (r1 Nvidia ASUSACPI 42302E31 AWRD 0)
    Entering add_active_range(0, 0, 159) 0 entries of 256 used
    Entering add_active_range(0, 256, 835312) 1 entries of 256 used
    Entering add_active_range(0, 1048576, 1245184) 2 entries of 256 used
    early res: 0 [0-fff] BIOS data page
    early res: 1 [6000-7fff] TRAMPOLINE
    early res: 2 [200000-7e7e1f] TEXT DATA BSS
    early res: 3 [9f000-fffff] BIOS reserved
    early res: 4 [8000-dfff] PGTABLE
    [ffffe20000000000-ffffe20002dfffff] PMD ->
    [ffff810001200000-ffff810003ffffff] on node 0
    [ffffe20002e00000-ffffe200043fffff] PMD ->
    [ffff81000c000000-ffff81000d5fffff] on node 0
    Zone PFN ranges:
    DMA 0 -> 4096
    DMA32 4096 -> 1048576
    Normal 1048576 -> 1245184
    Movable zone start PFN for each node
    early_node_map[3] active PFN ranges
    0: 0 -> 159
    0: 256 -> 835312
    0: 1048576 -> 1245184
    On node 0 totalpages: 1031823
    DMA zone: 56 pages used for memmap
    DMA zone: 1619 pages reserved
    DMA zone: 2324 pages, LIFO batch:0
    DMA32 zone: 14280 pages used for memmap
    DMA32 zone: 816936 pages, LIFO batch:31
    Normal zone: 2688 pages used for memmap
    Normal zone: 193920 pages, LIFO batch:31
    Movable zone: 0 pages used for memmap
    ACPI: PM-Timer IO Port: 0x4008
    ACPI: Local APIC address 0xfee00000
    ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
    ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
    ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
    ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
    ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
    IOAPIC[0]: apic_id 2, version 0, address 0xfec00000, GSI 0-23
    ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
    ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
    ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
    ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
    ACPI: IRQ0 used by override.
    ACPI: IRQ2 used by override.
    ACPI: IRQ9 used by override.
    ACPI: IRQ14 used by override.
    ACPI: IRQ15 used by override.
    Setting APIC routing to flat
    ACPI: HPET id: 0x10de8201 base: 0xfefff000
    Using ACPI (MADT) for SMP configuration information
    Allocating PCI resources starting at d1000000 (gap: d0000000:10000000)
    SMP: Allowing 2 CPUs, 0 hotplug CPUs
    PERCPU: Allocating 33832 bytes of per cpu data
    NR_CPUS: 2, nr_cpu_ids: 2
    Built 1 zonelists in Zone order, mobility grouping on. Total pages: 1013180
    Kernel command line: root=/dev/md1 init=/sbin/init sata_nv.swncq=1
    snd-hda-intel.enable_msi=0
    Initializing CPU#0
    PID hash table entries: 4096 (order: 12, 32768 bytes)
    Extended CMOS year: 2000
    TSC calibrated against PM_TIMER
    Marking TSC unstable due to TSCs unsynchronized
    time.c: Detected 2004.178 MHz processor.
    spurious 8259A interrupt: IRQ7.
    Console: colour VGA+ 80x25
    console [tty0] enabled
    Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
    Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
    Checking aperture...
    Node 0: aperture @ 4000000 size 32 MB
    Aperture pointing to e820 RAM. Ignoring.
    No AGP bridge found
    Your BIOS doesn't leave a aperture memory hole
    Please enable the IOMMU option in the BIOS setup
    This costs you 64 MB of RAM
    Mapping aperture over 65536 KB of RAM @ 4000000
    Memory: 3978912k/4980736k available (3731k kernel code, 147748k reserved,
    1528k data, 264k init)
    CPA: page pool initialized 1 of 1 pages preallocated
    SLUB: Genslabs=12, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
    hpet clockevent registered
    Calibrating delay using timer specific routine.. 4010.92 BogoMIPS
    (lpj=2005461)
    Mount-cache hash table entries: 256
    CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
    CPU: L2 Cache: 512K (64 bytes/line)
    CPU: Physical Processor ID: 0
    CPU: Processor Core ID: 0
    ACPI: Core revision 20080321
    CPU0: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
    Using local APIC timer interrupts.
    APIC timer calibration result 12526108
    Detected 12.526 MHz APIC timer.
    Booting processor 1/1 ip 6000
    Initializing CPU#1
    Calibrating delay using timer specific routine.. 4008.42 BogoMIPS
    (lpj=2004213)
    CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
    CPU: L2 Cache: 512K (64 bytes/line)
    CPU: Physical Processor ID: 0
    CPU: Processor Core ID: 1
    x86: PAT support disabled.
    CPU1: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 02
    Brought up 2 CPUs
    Total of 2 processors activated (8019.34 BogoMIPS).
    net_namespace: 936 bytes
    xor: automatically using best checksumming function: generic_sse
    generic_sse: 5716.000 MB/sec
    xor: using function: generic_sse (5716.000 MB/sec)
    NET: Registered protocol family 16
    No dock devices found.
    node 0 link 0: io port [a000, ffff]
    TOM: 00000000d0000000 aka 3328M
    node 0 link 0: mmio [a0000, bffff]
    node 0 link 0: mmio [d0000000, dfffffff]
    node 0 link 0: mmio [f0000000, fe02ffff]
    node 0 link 0: mmio [e0000000, e02fffff]
    TOM2: 0000000130000000 aka 4864M
    bus: [00,02] on node 0 link 0
    bus: 00 index 0 io port: [0, ffff]
    bus: 00 index 1 mmio: [a0000, bffff]
    bus: 00 index 2 mmio: [d0000000, efffffff]
    bus: 00 index 3 mmio: [f0000000, ffffffff]
    bus: 00 index 4 mmio: [130000000, fcffffffff]
    ACPI: bus type pci registered
    PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
    PCI: MCFG area at e0000000 reserved in E820
    PCI: Using MMCONFIG at e0000000 - efffffff
    PCI: Using configuration type 1 for base access
    ACPI: EC: Look up EC in DSDT
    ACPI: Interpreter enabled
    ACPI: (supports S0 S1 S3 S5)
    ACPI: Using IOAPIC for interrupt routing
    ACPI: PCI Root Bridge [PCI0] (0000:00)
    PCI: Transparent bridge - 0000:00:10.0
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
    ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.HUB0._PRT]
    ACPI: PCI Interrupt Link [LNK1] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LNK2] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LNK3] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LNK4] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LNK5] (IRQs 5 7 9 10 *11 14 15)
    ACPI: PCI Interrupt Link [LNK6] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LNK7] (IRQs *5 7 9 10 11 14 15)
    ACPI: PCI Interrupt Link [LNK8] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LUBA] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LUBB] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LMAC] (IRQs 5 7 9 10 *11 14 15)
    ACPI: PCI Interrupt Link [LACI] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LAZA] (IRQs 5 7 9 10 *11 14 15)
    ACPI: PCI Interrupt Link [LPMU] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LMCI] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LSMB] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LUB2] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LIDE] (IRQs 5 7 9 10 11 14 15) *0, disabled.
    ACPI: PCI Interrupt Link [LSID] (IRQs *5 7 9 10 11 14 15)
    ACPI: PCI Interrupt Link [LFID] (IRQs *5 7 9 10 11 14 15)
    ACPI: PCI Interrupt Link [APC1] (IRQs 16) *0, disabled.
    ACPI: PCI Interrupt Link [APC2] (IRQs 17) *0, disabled.
    ACPI: PCI Interrupt Link [APC3] (IRQs 18) *0, disabled.
    ACPI: PCI Interrupt Link [APC4] (IRQs 19) *0, disabled.
    ACPI: PCI Interrupt Link [APC5] (IRQs 16) *0
    ACPI: PCI Interrupt Link [APC6] (IRQs 16) *0, disabled.
    ACPI: PCI Interrupt Link [APC7] (IRQs 16) *0
    ACPI: PCI Interrupt Link [APC8] (IRQs 16) *0, disabled.
    ACPI: PCI Interrupt Link [APCF] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APCG] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APCH] (IRQs 20 21 22 23) *0
    ACPI: PCI Interrupt Link [APCJ] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APMU] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [AAZA] (IRQs 20 21 22 23) *0
    ACPI: PCI Interrupt Link [APCK] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APCS] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APCL] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APCM] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APCZ] (IRQs 20 21 22 23) *0, disabled.
    ACPI: PCI Interrupt Link [APSI] (IRQs 20 21 22 23) *0
    ACPI: PCI Interrupt Link [APSJ] (IRQs 20 21 22 23) *0
    Linux Plug and Play Support v0.97 (c) Adam Belay
    pnp: PnP ACPI init
    ACPI: bus type pnp registered
    pnp: PnP ACPI: found 18 devices
    ACPI: ACPI bus type pnp unregistered
    SCSI subsystem initialized
    libata version 3.00 loaded.
    PCI: Using ACPI for IRQ routing
    DMAR: parse DMAR table failure.
    PCI-DMA: Disabling AGP.
    PCI-DMA: aperture base @ 4000000 size 65536 KB
    PCI-DMA: using GART IOMMU.
    PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
    hpet0: at MMIO 0xfefff000, IRQs 2, 8, 31
    hpet0: 3 32-bit timers, 25000000 Hz
    Switched to high resolution mode on CPU 0
    Switched to high resolution mode on CPU 1
    system 00:01: ioport range 0x4000-0x407f has been reserved
    system 00:01: ioport range 0x4080-0x40ff has been reserved
    system 00:01: ioport range 0x4400-0x447f has been reserved
    system 00:01: ioport range 0x4480-0x44ff has been reserved
    system 00:01: ioport range 0x4800-0x487f has been reserved
    system 00:01: ioport range 0x4880-0x48ff has been reserved
    system 00:01: ioport range 0x2000-0x207f has been reserved
    system 00:01: ioport range 0x2080-0x20ff has been reserved
    system 00:01: iomem range 0xcc000000-0xcfffffff could not be reserved
    system 00:02: ioport range 0x4d0-0x4d1 has been reserved
    system 00:02: ioport range 0x800-0x87f has been reserved
    system 00:02: ioport range 0x290-0x297 has been reserved
    system 00:10: iomem range 0xe0000000-0xefffffff could not be reserved
    system 00:11: iomem range 0xf0000-0xf3fff could not be reserved
    system 00:11: iomem range 0xf4000-0xf7fff could not be reserved
    system 00:11: iomem range 0xf8000-0xfbfff could not be reserved
    system 00:11: iomem range 0xfc000-0xfffff could not be reserved
    system 00:11: iomem range 0xfefff000-0xfefff0ff could not be reserved
    system 00:11: iomem range 0xcbef0000-0xcbefffff could not be reserved
    system 00:11: iomem range 0xffff0000-0xffffffff could not be reserved
    system 00:11: iomem range 0x0-0x9ffff could not be reserved
    system 00:11: iomem range 0x100000-0xcbeeffff could not be reserved
    system 00:11: iomem range 0xcbf00000-0xcfefffff could not be reserved
    system 00:11: iomem range 0xfec00000-0xfec00fff could not be reserved
    system 00:11: iomem range 0xfee00000-0xfeefffff could not be reserved
    system 00:11: iomem range 0xfefff000-0xfeffffff could not be reserved
    system 00:11: iomem range 0xfff80000-0xfff80fff could not be reserved
    system 00:11: iomem range 0xfff90000-0xfffbffff could not be reserved
    system 00:11: iomem range 0xfffed000-0xfffeffff could not be reserved
    PCI: Bridge: 0000:00:03.0
    IO window: a000-afff
    MEM window: 0xfdc00000-0xfdcfffff
    PREFETCH window: 0x00000000fdd00000-0x00000000fddfffff
    PCI: Bridge: 0000:00:10.0
    IO window: b000-bfff
    MEM window: 0xfdb00000-0xfdbfffff
    PREFETCH window: 0x00000000fde00000-0x00000000fdefffff
    PCI: Setting latency timer of device 0000:00:03.0 to 64
    PCI: Setting latency timer of device 0000:00:10.0 to 64
    NET: Registered protocol family 2
    IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
    TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
    TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
    TCP: Hash tables configured (established 262144 bind 65536)
    TCP reno registered
    NET: Registered protocol family 1
    Total HugeTLB memory allocated, 0
    Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
    msgmni has been set to 7772 for ipc namespace ffffffff806ef0e0
    async_tx: api initialized (sync-only)
    Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
    io scheduler noop registered
    io scheduler anticipatory registered
    io scheduler deadline registered (default)
    io scheduler cfq registered
    pci 0000:00:00.0: Enabling HT MSI Mapping
    pci 0000:00:03.0: Enabling HT MSI Mapping
    pci 0000:00:05.0: Boot video device
    pci 0000:00:09.0: Enabling HT MSI Mapping
    pci 0000:00:0e.0: Enabling HT MSI Mapping
    pci 0000:00:0f.0: Enabling HT MSI Mapping
    pci 0000:00:10.0: Enabling HT MSI Mapping
    pci 0000:00:10.1: Enabling HT MSI Mapping
    PCI: Setting latency timer of device 0000:00:03.0 to 64
    assign_interrupt_mode Found MSI capability
    Allocate Port Service[0000:00:03.0:pcie00]
    Allocate Port Service[0000:00:03.0:pcie03]
    input: Power Button (FF) as /class/input/input0
    ACPI: Power Button (FF) [PWRF]
    input: Power Button (CM) as /class/input/input1
    ACPI: Power Button (CM) [PWRB]
    ACPI: PNP0C0B:00 is registered as cooling_device0
    ACPI: Fan [FAN] (on)
    ACPI: ACPI0007:00 is registered as cooling_device1
    ACPI: ACPI0007:01 is registered as cooling_device2
    ACPI: LNXTHERM:01 is registered as thermal_zone0
    ACPI: Thermal Zone [THRM] (40 C)
    lp: driver loaded but no devices found
    Real Time Clock Driver v1.12ac
    hpet_resources: 0xfefff000 is busy
    Non-volatile memory driver v1.2
    Linux agpgart interface v0.103
    Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
    serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
    serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
    00:09: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
    00:0a: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
    IT8716 SuperIO detected.
    parport_pc 00:0b: reported by Plug and Play ACPI
    parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE,EPP]
    lp0: using parport0 (interrupt-driven).
    loop: module loaded
    tun: Universal TUN/TAP device driver, 1.6
    tun: (C) 1999-2004 Max Krasnyansky
    r8169 Gigabit Ethernet driver 2.2LK-NAPI loaded
    ACPI: PCI Interrupt Link [APC5] enabled at IRQ 16
    ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [APC5] -> GSI 16 (level, low) ->
    IRQ 16
    PCI: Setting latency timer of device 0000:01:00.0 to 64
    eth0: RTL8168b/8111b at 0xffffc20000026000, 00:08:54:52:28:41, XID 30000000
    IRQ 318
    Driver 'sd' needs updating - please use bus_type methods
    Driver 'sr' needs updating - please use bus_type methods
    sata_nv 0000:00:0e.0: version 3.5
    ACPI: PCI Interrupt Link [APSI] enabled at IRQ 23
    ACPI: PCI Interrupt 0000:00:0e.0[A] -> Link [APSI] -> GSI 23 (level, low) ->
    IRQ 23
    sata_nv 0000:00:0e.0: Using SWNCQ mode
    PCI: Setting latency timer of device 0000:00:0e.0 to 64
    scsi0 : sata_nv
    scsi1 : sata_nv
    ata1: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xe000 irq 23
    ata2: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xe008 irq 23
    ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata1.00: ATA-7: ST3320620AS, 3.AAK, max UDMA/133
    ata1.00: 625142448 sectors, multi 1: LBA48 NCQ (depth 31/32)
    ata1.00: configured for UDMA/133
    ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata2.00: ATA-7: ST3320620AS, 3.AAK, max UDMA/133
    ata2.00: 625142448 sectors, multi 1: LBA48 NCQ (depth 31/32)
    ata2.00: configured for UDMA/133
    scsi 0:0:0:0: Direct-Access ATA ST3320620AS 3.AA PQ: 0 ANSI: 5
    sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sd 0:0:0:0: [sda] 625142448 512-byte hardware sectors (320073 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
    sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 >
    sd 0:0:0:0: [sda] Attached SCSI disk
    sd 0:0:0:0: Attached scsi generic sg0 type 0
    scsi 1:0:0:0: Direct-Access ATA ST3320620AS 3.AA PQ: 0 ANSI: 5
    sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
    sd 1:0:0:0: [sdb] Write Protect is off
    sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
    sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
    sd 1:0:0:0: [sdb] Write Protect is off
    sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
    sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sdb: sdb1 sdb2 < sdb5 sdb6 sdb7 sdb8 sdb9 >
    sd 1:0:0:0: [sdb] Attached SCSI disk
    sd 1:0:0:0: Attached scsi generic sg1 type 0
    ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 22
    ACPI: PCI Interrupt 0000:00:0f.0[A] -> Link [APSJ] -> GSI 22 (level, low) ->
    IRQ 22
    sata_nv 0000:00:0f.0: Using SWNCQ mode
    PCI: Setting latency timer of device 0000:00:0f.0 to 64
    scsi2 : sata_nv
    scsi3 : sata_nv
    ata3: SATA max UDMA/133 cmd 0x9e0 ctl 0xbe0 bmdma 0xcc00 irq 22
    ata4: SATA max UDMA/133 cmd 0x960 ctl 0xb60 bmdma 0xcc08 irq 22
    ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata3.00: ATA-7: ST3320620AS, 3.AAK, max UDMA/133
    ata3.00: 625142448 sectors, multi 1: LBA48 NCQ (depth 31/32)
    ata3.00: configured for UDMA/133
    ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
    ata4.00: ATA-7: ST3320620AS, 3.AAD, max UDMA/133
    ata4.00: 625142448 sectors, multi 1: LBA48 NCQ (depth 31/32)
    ata4.00: configured for UDMA/133
    scsi 2:0:0:0: Direct-Access ATA ST3320620AS 3.AA PQ: 0 ANSI: 5
    sd 2:0:0:0: [sdc] 625142448 512-byte hardware sectors (320073 MB)
    sd 2:0:0:0: [sdc] Write Protect is off
    sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
    sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sd 2:0:0:0: [sdc] 625142448 512-byte hardware sectors (320073 MB)
    sd 2:0:0:0: [sdc] Write Protect is off
    sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
    sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sdc: sdc1 sdc2 < sdc5 sdc6 sdc7 sdc8 sdc9 >
    sd 2:0:0:0: [sdc] Attached SCSI disk
    sd 2:0:0:0: Attached scsi generic sg2 type 0
    scsi 3:0:0:0: Direct-Access ATA ST3320620AS 3.AA PQ: 0 ANSI: 5
    sd 3:0:0:0: [sdd] 625142448 512-byte hardware sectors (320073 MB)
    sd 3:0:0:0: [sdd] Write Protect is off
    sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
    sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sd 3:0:0:0: [sdd] 625142448 512-byte hardware sectors (320073 MB)
    sd 3:0:0:0: [sdd] Write Protect is off
    sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
    sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support
    DPO or FUA
    sdd: sdd1 sdd2 < sdd5 sdd6 sdd7 sdd8 sdd9 >
    sd 3:0:0:0: [sdd] Attached SCSI disk
    sd 3:0:0:0: Attached scsi generic sg3 type 0
    pata_amd 0000:00:0d.0: version 0.3.10
    PCI: Setting latency timer of device 0000:00:0d.0 to 64
    scsi4 : pata_amd
    scsi5 : pata_amd
    ata5: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0xf400 irq 14
    ata6: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xf408 irq 15
    ata5.01: NODEV after polling detection
    ata5.00: ATAPI: NU DVDRW DDW-081, BX32, max UDMA/33
    ata5: nv_mode_filter: 0x739f&0x701f->0x701f, BIOS=0x7000 (0xc0000000)
    ACPI=0x701f (60:600:0x13)
    ata5.00: configured for UDMA/33
    ata6: port disabled. ignoring.
    scsi 4:0:0:0: CD-ROM NU DVDRW DDW-081 BX32 PQ: 0 ANSI: 5
    sr0: scsi3-mmc drive: 12x/40x writer cd/rw xa/form2 cdda tray
    Uniform CD-ROM driver Revision: 3.20
    sr 4:0:0:0: Attached scsi CD-ROM sr0
    sr 4:0:0:0: Attached scsi generic sg4 type 5
    PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
    serio: i8042 KBD port at 0x60,0x64 irq 1
    serio: i8042 AUX port at 0x60,0x64 irq 12
    gameport: NS558 PnP Gameport is pnp00:0f/gameport0, io 0x201, speed 638kHz
    mice: PS/2 mouse device common for all mice
    input: AT Translated Set 2 keyboard as /class/input/input2
    i2c /dev entries driver
    i2c-adapter i2c-0: nForce2 SMBus adapter at 0x4c00
    i2c-adapter i2c-1: nForce2 SMBus adapter at 0x4c40
    it87: Found IT8716F chip at 0x290, revision 0
    it87: in3 is VCC (+5V)
    it87: in7 is VCCH (+5V Stand-By)
    input: ImPS/2 Generic Wheel Mouse as /class/input/input3
    md: raid0 personality registered for level 0
    md: raid1 personality registered for level 1
    raid6: int64x1 2011 MB/s
    raid6: int64x2 2468 MB/s
    raid6: int64x4 2628 MB/s
    raid6: int64x8 1976 MB/s
    raid6: sse2x1 2753 MB/s
    raid6: sse2x2 3687 MB/s
    raid6: sse2x4 3816 MB/s
    raid6: using algorithm sse2x4 (3816 MB/s)
    md: raid6 personality registered for level 6
    md: raid5 personality registered for level 5
    md: raid4 personality registered for level 4
    md: multipath personality registered for level -4
    device-mapper: uevent: version 1.0.3
    device-mapper: ioctl: 4.13.0-ioctl (2007-10-18) initialised:
    dm-devel@redhat.com
    cpuidle: using governor ladder
    cpuidle: using governor menu
    Advanced Linux Sound Architecture Driver Version 1.0.16.
    ACPI: PCI Interrupt Link [AAZA] enabled at IRQ 21
    ACPI: PCI Interrupt 0000:00:10.1[B] -> Link [AAZA] -> GSI 21 (level, low) ->
    IRQ 21
    PCI: Setting latency timer of device 0000:00:10.1 to 64
    ALSA device list:
    #0: HDA NVidia at 0xfe024000 irq 21
    ip_tables: (C) 2000-2006 Netfilter Core Team
    TCP cubic registered
    NET: Registered protocol family 17
    RPC: Registered udp transport module.
    RPC: Registered tcp transport module.
    Installing 9P2000 support
    md: Autodetecting RAID arrays.
    md: Scanned 20 and added 20 devices.
    md: autorun ...
    md: considering sdd9 ...
    md: adding sdd9 ...
    md: sdd8 has different UUID to sdd9
    md: sdd7 has different UUID to sdd9
    md: sdd6 has different UUID to sdd9
    md: sdd1 has different UUID to sdd9
    md: adding sdc9 ...
    md: sdc8 has different UUID to sdd9
    md: sdc7 has different UUID to sdd9
    md: sdc6 has different UUID to sdd9
    md: sdc1 has different UUID to sdd9
    md: adding sdb9 ...
    md: sdb8 has different UUID to sdd9
    md: sdb7 has different UUID to sdd9
    md: sdb6 has different UUID to sdd9
    md: sdb1 has different UUID to sdd9
    md: adding sda9 ...
    md: sda8 has different UUID to sdd9
    md: sda7 has different UUID to sdd9
    md: sda6 has different UUID to sdd9
    md: sda1 has different UUID to sdd9
    md: created md4
    md: bind
    md: bind
    md: bind
    md: bind
    md: running:
    raid5: device sdd9 operational as raid disk 3
    raid5: device sdc9 operational as raid disk 2
    raid5: device sdb9 operational as raid disk 1
    raid5: device sda9 operational as raid disk 0
    raid5: allocated 4274kB for md4
    raid5: raid level 5 set md4 active with 4 out of 4 devices, algorithm 2
    RAID5 conf printout:
    --- rd:4 wd:4
    disk 0, o:1, dev:sda9
    disk 1, o:1, dev:sdb9
    disk 2, o:1, dev:sdc9
    disk 3, o:1, dev:sdd9
    md: considering sdd8 ...
    md: adding sdd8 ...
    md: sdd7 has different UUID to sdd8
    md: sdd6 has different UUID to sdd8
    md: sdd1 has different UUID to sdd8
    md: adding sdc8 ...
    md: sdc7 has different UUID to sdd8
    md: sdc6 has different UUID to sdd8
    md: sdc1 has different UUID to sdd8
    md: adding sdb8 ...
    md: sdb7 has different UUID to sdd8
    md: sdb6 has different UUID to sdd8
    md: sdb1 has different UUID to sdd8
    md: adding sda8 ...
    md: sda7 has different UUID to sdd8
    md: sda6 has different UUID to sdd8
    md: sda1 has different UUID to sdd8
    md: created md3
    md: bind
    md: bind
    md: bind
    md: bind
    md: running:
    raid5: device sdd8 operational as raid disk 3
    raid5: device sdc8 operational as raid disk 2
    raid5: device sdb8 operational as raid disk 1
    raid5: device sda8 operational as raid disk 0
    raid5: allocated 4274kB for md3
    raid5: raid level 5 set md3 active with 4 out of 4 devices, algorithm 2
    RAID5 conf printout:
    --- rd:4 wd:4
    disk 0, o:1, dev:sda8
    disk 1, o:1, dev:sdb8
    disk 2, o:1, dev:sdc8
    disk 3, o:1, dev:sdd8
    md: considering sdd7 ...
    md: adding sdd7 ...
    md: sdd6 has different UUID to sdd7
    md: sdd1 has different UUID to sdd7
    md: adding sdc7 ...
    md: sdc6 has different UUID to sdd7
    md: sdc1 has different UUID to sdd7
    md: adding sdb7 ...
    md: sdb6 has different UUID to sdd7
    md: sdb1 has different UUID to sdd7
    md: adding sda7 ...
    md: sda6 has different UUID to sdd7
    md: sda1 has different UUID to sdd7
    md: created md2
    md: bind
    md: bind
    md: bind
    md: bind
    md: running:
    md2: setting max_sectors to 128, segment boundary to 32767
    raid0: looking at sdd7
    raid0: comparing sdd7(2939776) with sdd7(2939776)
    raid0: END
    raid0: ==> UNIQUE
    raid0: 1 zones
    raid0: looking at sdc7
    raid0: comparing sdc7(2939776) with sdd7(2939776)
    raid0: EQUAL
    raid0: looking at sdb7
    raid0: comparing sdb7(2939776) with sdd7(2939776)
    raid0: EQUAL
    raid0: looking at sda7
    raid0: comparing sda7(2939776) with sdd7(2939776)
    raid0: EQUAL
    raid0: FINAL 1 zones
    raid0: done.
    raid0 : md_size is 11759104 blocks.
    raid0 : conf->hash_spacing is 11759104 blocks.
    raid0 : nb_zone is 1.
    raid0 : Allocating 8 bytes for hash.
    md: considering sdd6 ...
    md: adding sdd6 ...
    md: sdd1 has different UUID to sdd6
    md: adding sdc6 ...
    md: sdc1 has different UUID to sdd6
    md: adding sdb6 ...
    md: sdb1 has different UUID to sdd6
    md: adding sda6 ...
    md: sda1 has different UUID to sdd6
    md: created md1
    md: bind
    md: bind
    md: bind
    md: bind
    md: running:
    raid5: device sdd6 operational as raid disk 3
    raid5: device sdc6 operational as raid disk 2
    raid5: device sdb6 operational as raid disk 1
    raid5: device sda6 operational as raid disk 0
    raid5: allocated 4274kB for md1
    raid5: raid level 5 set md1 active with 4 out of 4 devices, algorithm 2
    RAID5 conf printout:
    --- rd:4 wd:4
    disk 0, o:1, dev:sda6
    disk 1, o:1, dev:sdb6
    disk 2, o:1, dev:sdc6
    disk 3, o:1, dev:sdd6
    md: considering sdd1 ...
    md: adding sdd1 ...
    md: adding sdc1 ...
    md: adding sdb1 ...
    md: adding sda1 ...
    md: created md0
    md: bind
    md: bind
    md: bind
    md: bind
    md: running:
    raid1: raid set md0 active with 4 out of 4 mirrors
    md: ... autorun DONE.
    kjournald starting. Commit interval 5 seconds
    EXT3-fs: mounted filesystem with writeback data mode.
    VFS: Mounted root (ext3 filesystem) readonly.
    Freeing unused kernel memory: 264k freed
    udev: renamed network interface eth0 to eth1
    forcedeth: Reverse Engineered nForce ethernet driver. Version 0.61.
    ACPI: PCI Interrupt Link [APCH] enabled at IRQ 20
    ACPI: PCI Interrupt 0000:00:14.0[A] -> Link [APCH] -> GSI 20 (level, low) ->
    IRQ 20
    PCI: Setting latency timer of device 0000:00:14.0 to 64
    usbcore: registered new interface driver usbfs
    usbcore: registered new interface driver hub
    usbcore: registered new device driver usb
    ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver
    forcedeth 0000:00:14.0: ifname eth0, PHY OUI 0x5043 @ 1, addr
    00:17:31:e2:89:2f
    forcedeth 0000:00:14.0: highdma pwrctl timirq gbit lnktim desc-v3
    ACPI: PCI Interrupt Link [APCF] enabled at IRQ 23
    ACPI: PCI Interrupt 0000:00:0b.0[A] -> Link [APCF] -> GSI 23 (level, low) ->
    IRQ 23
    PCI: Setting latency timer of device 0000:00:0b.0 to 64
    ohci_hcd 0000:00:0b.0: OHCI Host Controller
    ohci_hcd 0000:00:0b.0: new USB bus registered, assigned bus number 1
    ohci_hcd 0000:00:0b.0: irq 23, io mem 0xfe02f000
    usb usb1: configuration #1 chosen from 1 choice
    hub 1-0:1.0: USB hub found
    hub 1-0:1.0: 8 ports detected
    ACPI: PCI Interrupt Link [APCL] enabled at IRQ 22
    ACPI: PCI Interrupt 0000:00:0b.1[B] -> Link [APCL] -> GSI 22 (level, low) ->
    IRQ 22
    PCI: Setting latency timer of device 0000:00:0b.1 to 64
    ehci_hcd 0000:00:0b.1: EHCI Host Controller
    ehci_hcd 0000:00:0b.1: new USB bus registered, assigned bus number 2
    ehci_hcd 0000:00:0b.1: debug port 1
    PCI: cache line size of 64 is not supported by device 0000:00:0b.1
    ehci_hcd 0000:00:0b.1: irq 22, io mem 0xfe02e000
    ehci_hcd 0000:00:0b.1: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
    usb usb2: configuration #1 chosen from 1 choice
    hub 2-0:1.0: USB hub found
    hub 2-0:1.0: 8 ports detected
    usb 2-1: new high speed USB device using ehci_hcd and address 2
    usb 2-1: configuration #1 chosen from 1 choice
    hub 2-1:1.0: USB hub found
    hub 2-1:1.0: 3 ports detected
    powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors
    (2 cpu cores) (version 2.20.00)
    powernow-k8: overridding fid 0xc (2000 MHz) to vid 0xe
    powernow-k8: overridding fid 0xa (1800 MHz) to vid 0x10
    powernow-k8: overridding fid 0x2 (1000 MHz) to vid 0x16
    powernow-k8: 0 : fid 0xc (2000 MHz), vid 0xe
    powernow-k8: 1 : fid 0xa (1800 MHz), vid 0x10
    powernow-k8: 2 : fid 0x2 (1000 MHz), vid 0x16
    usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    usb 2-1.3: configuration #1 chosen from 1 choice
    Clocksource tsc unstable (delta = -117343945 ns)
    ------------[ cut here ]------------
    WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()
    Modules linked in: kvm_amd kvm powernow_k8 ehci_hcd ohci_hcd usbcore forcedeth
    Pid: 1681, comm: fsck.ext3 Not tainted 2.6.25-07422-gb66e1f1-dirty #74

    Call Trace:
    [] warn_on_slowpath+0x64/0xa0
    [] generic_make_request+0x194/0x250
    [] submit_bio+0x97/0x140
    [] blk_remove_plug+0x7d/0x90
    [] raid5_unplug_device+0x44/0x110
    [] sync_page+0x2e/0x50
    [] __wait_on_bit+0x65/0x90
    [] wait_on_page_bit+0x78/0x80
    [] wake_bit_function+0x0/0x40
    [] pagevec_lookup_tag+0x1a/0x30
    [] wait_on_page_writeback_range+0xbd/0x130
    [] do_writepages+0x20/0x40
    [] __filemap_fdatawrite_range+0x52/0x60
    [] filemap_write_and_wait+0x44/0x50
    [] __blkdev_put+0x148/0x1b0
    [] __fput+0xb1/0x1c0
    [] filp_close+0x48/0x80
    [] sys_close+0x9f/0x110
    [] system_call_after_swapgs+0x7b/0x80

    ---[ end trace 94c0787a2e4d19eb ]---
    EXT3 FS on md1, internal journal
    kjournald starting. Commit interval 5 seconds
    EXT3 FS on md2, internal journal
    EXT3-fs: mounted filesystem with writeback data mode.
    kjournald starting. Commit interval 5 seconds
    EXT3 FS on md3, internal journal
    EXT3-fs: mounted filesystem with writeback data mode.
    eth0: no link during initialization.
    r8169: eth1: link up
    r8169: eth1: link up
    NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
    NFSD: starting 90-second grace period

    --
    (°= =°)
    //\ Prakash Punnoor /\\
    V_/ \_V



  2. Re: WARNING in 2.6.25-07422-gb66e1f1

    Hi,

    I've CC'ed a few guys who may help.

    Prakash Punnoor writes:
    > Hi, I got this on boot:
    >
    > usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    > usb 2-1.3: configuration #1 chosen from 1 choice
    > Clocksource tsc unstable (delta = -117343945 ns)
    > ------------[ cut here ]------------
    > WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()
    > Modules linked in: kvm_amd kvm powernow_k8 ehci_hcd ohci_hcd usbcore forcedeth
    > Pid: 1681, comm: fsck.ext3 Not tainted 2.6.25-07422-gb66e1f1-dirty #74
    >
    > Call Trace:
    > [] warn_on_slowpath+0x64/0xa0
    > [] generic_make_request+0x194/0x250
    > [] submit_bio+0x97/0x140
    > [] blk_remove_plug+0x7d/0x90
    > [] raid5_unplug_device+0x44/0x110
    > [] sync_page+0x2e/0x50
    > [] __wait_on_bit+0x65/0x90
    > [] wait_on_page_bit+0x78/0x80
    > [] wake_bit_function+0x0/0x40
    > [] pagevec_lookup_tag+0x1a/0x30
    > [] wait_on_page_writeback_range+0xbd/0x130
    > [] do_writepages+0x20/0x40
    > [] __filemap_fdatawrite_range+0x52/0x60
    > [] filemap_write_and_wait+0x44/0x50
    > [] __blkdev_put+0x148/0x1b0
    > [] __fput+0xb1/0x1c0
    > [] filp_close+0x48/0x80
    > [] sys_close+0x9f/0x110
    > [] system_call_after_swapgs+0x7b/0x80
    >
    > ---[ end trace 94c0787a2e4d19eb ]---
    > EXT3 FS on md1, internal journal
    > kjournald starting. Commit interval 5 seconds
    > EXT3 FS on md2, internal journal
    > EXT3-fs: mounted filesystem with writeback data mode.
    > kjournald starting. Commit interval 5 seconds
    > EXT3 FS on md3, internal journal
    > EXT3-fs: mounted filesystem with writeback data mode.
    > eth0: no link during initialization.
    > r8169: eth1: link up
    > r8169: eth1: link up
    > NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
    > NFSD: starting 90-second grace period


    Is this problem reproducible?

    The trace is printed by Jens's latest change from Apr 29 (commit
    7663c1e2792a9662b23dec6e19bfcd3d55360b8f, "Improve queue_is_locked()"), but it
    looks like raid5_unplug_device() is called during the EXT3 check on md0 [why?].

    -Jacek
    --
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  3. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Sun, May 04 2008, Jacek Luczak wrote:
    > Hi,
    >
    > I've CC:-ed few guys which may help.
    >
    > Prakash Punnoor writes:
    > > Hi, I got this on boot:
    > >
    > > usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    > > usb 2-1.3: configuration #1 chosen from 1 choice
    > > Clocksource tsc unstable (delta = -117343945 ns)
    > > ------------[ cut here ]------------
    > > WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()
    > > Modules linked in: kvm_amd kvm powernow_k8 ehci_hcd ohci_hcd usbcore forcedeth
    > > Pid: 1681, comm: fsck.ext3 Not tainted 2.6.25-07422-gb66e1f1-dirty #74
    > >
    > > Call Trace:
    > > [] warn_on_slowpath+0x64/0xa0
    > > [] generic_make_request+0x194/0x250
    > > [] submit_bio+0x97/0x140
    > > [] blk_remove_plug+0x7d/0x90
    > > [] raid5_unplug_device+0x44/0x110
    > > [] sync_page+0x2e/0x50
    > > [] __wait_on_bit+0x65/0x90
    > > [] wait_on_page_bit+0x78/0x80
    > > [] wake_bit_function+0x0/0x40
    > > [] pagevec_lookup_tag+0x1a/0x30
    > > [] wait_on_page_writeback_range+0xbd/0x130
    > > [] do_writepages+0x20/0x40
    > > [] __filemap_fdatawrite_range+0x52/0x60
    > > [] filemap_write_and_wait+0x44/0x50
    > > [] __blkdev_put+0x148/0x1b0
    > > [] __fput+0xb1/0x1c0
    > > [] filp_close+0x48/0x80
    > > [] sys_close+0x9f/0x110
    > > [] system_call_after_swapgs+0x7b/0x80
    > >
    > > ---[ end trace 94c0787a2e4d19eb ]---
    > > EXT3 FS on md1, internal journal
    > > kjournald starting. Commit interval 5 seconds
    > > EXT3 FS on md2, internal journal
    > > EXT3-fs: mounted filesystem with writeback data mode.
    > > kjournald starting. Commit interval 5 seconds
    > > EXT3 FS on md3, internal journal
    > > EXT3-fs: mounted filesystem with writeback data mode.
    > > eth0: no link during initialization.
    > > r8169: eth1: link up
    > > r8169: eth1: link up
    > > NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
    > > NFSD: starting 90-second grace period

    >
    > Is this problem reproducible?
    >
    > Trace is printed by last change from Jens on Apr-29 (commit:
    > 7663c1e2792a9662b23dec6e19bfcd3d55360b8f, Improve queue_is_locked()), but it
    > looks like raid5_unplug_device() is called during EXT3 check on md0 [why?].


    Looks like it caught a real bug there - unfortunately we have to check
    for ->queue_lock here as well, in case this is another stacked device and
    not the bottom device. Does this make the warning go away for you?

    diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    index 087eee0..958f26b 100644
    --- a/drivers/md/raid5.c
    +++ b/drivers/md/raid5.c
    @@ -3264,6 +3264,8 @@ static void raid5_unplug_device(struct request_queue *q)
     	unsigned long flags;
    
     	spin_lock_irqsave(&conf->device_lock, flags);
    +	if (q->queue_lock)
    +		spin_lock(q->queue_lock);
    
     	if (blk_remove_plug(q)) {
     		conf->seq_flush++;
    @@ -3271,6 +3273,8 @@ static void raid5_unplug_device(struct request_queue *q)
     	}
     	md_wakeup_thread(mddev->thread);
    
    +	if (q->queue_lock)
    +		spin_unlock(q->queue_lock);
     	spin_unlock_irqrestore(&conf->device_lock, flags);
    
     	unplug_slaves(mddev);

    --
    Jens Axboe


  4. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Sunday May 4, jens.axboe@oracle.com wrote:
    > On Sun, May 04 2008, Jacek Luczak wrote:
    > > Hi,
    > >
    > > I've CC:-ed few guys which may help.
    > >
    > > Prakash Punnoor writes:
    > > > Hi, I got this on boot:
    > > >
    > > > usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    > > > usb 2-1.3: configuration #1 chosen from 1 choice
    > > > Clocksource tsc unstable (delta = -117343945 ns)
    > > > ------------[ cut here ]------------
    > > > WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()

    ....
    >
    > Looks like it caught a real bug there - unfortunately we have to check
    > for ->queue_lock here as well, if this is another stacked devices and
    > not the bottom device. Does this make the warning go away for you?
    >
    > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    > index 087eee0..958f26b 100644
    > --- a/drivers/md/raid5.c
    > +++ b/drivers/md/raid5.c
    > @@ -3264,6 +3264,8 @@ static void raid5_unplug_device(struct request_queue *q)
    > unsigned long flags;
    >
    > spin_lock_irqsave(&conf->device_lock, flags);
    > + if (q->queue_lock)
    > + spin_lock(q->queue_lock);
    >
    > if (blk_remove_plug(q)) {
    > conf->seq_flush++;
    > @@ -3271,6 +3273,8 @@ static void raid5_unplug_device(struct request_queue *q)
    > }
    > md_wakeup_thread(mddev->thread);
    >
    > + if (q->queue_lock)
    > + spin_unlock(q->queue_lock);
    > spin_unlock_irqrestore(&conf->device_lock, flags);
    >
    > unplug_slaves(mddev);
    >


    I suspect that will just cause more problems, as the 'q' for an md
    device never gets ->queue_lock initialised.
    I suspect the correct thing to do is set
    q->queue_lock = &conf->device_lock;

    at some stage, probably immediately after device_lock is initialised
    in 'run'.

    I was discussing this with Dan Williams starting
    http://marc.info/?l=linux-raid&m=120951839903995&w=4
    though we don't have an agreed patch yet.

    I'm wondering why you mention the issue of stacked devices, though. I
    don't see how it applies. Could you explain?

    Thanks,
    NeilBrown

  5. Re: WARNING in 2.6.25-07422-gb66e1f1


    On Mon, 2008-05-05 at 00:24 -0700, Neil Brown wrote:
    > On Sunday May 4, jens.axboe@oracle.com wrote:
    > > On Sun, May 04 2008, Jacek Luczak wrote:
    > > > Hi,
    > > >
    > > > I've CC:-ed few guys which may help.
    > > >
    > > > Prakash Punnoor writes:
    > > > > Hi, I got this on boot:
    > > > >
    > > > > usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    > > > > usb 2-1.3: configuration #1 chosen from 1 choice
    > > > > Clocksource tsc unstable (delta = -117343945 ns)
    > > > > ------------[ cut here ]------------
    > > > > WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()

    > ...
    > >
    > > Looks like it caught a real bug there - unfortunately we have to check
    > > for ->queue_lock here as well, if this is another stacked devices and
    > > not the bottom device. Does this make the warning go away for you?
    > >

    [..]
    > I suspect that will just cause more problems, as the 'q' for an md
    > device never gets ->queue_lock initialised.
    > I suspect the correct thing to do is set
    > q->queue_lock = &conf->device_lock;
    >
    > at some stage, probably immediately after device_lock is initialised
    > in 'run'.
    >
    > I was discussing this with Dan Williams starting
    > http://marc.info/?l=linux-raid&m=120951839903995&w=4
    > though we don't have an agreed patch yet.


    The patch below appears to work for the raid5 case, but I am
    encountering a new issue when testing linear arrays. raid0/1/10 do not
    trigger this issue.

    $ mdadm --create /dev/md0 /dev/loop[0-3] -n 4 -l linear
    mdadm: RUN_ARRAY failed: Invalid argument # huh?
    mdadm: stopped /dev/md0
    $ cat /proc/mdstat
    Personalities : [raid0] [linear]
    unused devices:
    $ mdadm --create /dev/md0 /dev/loop[0-3] -n 4 -l linear
    Segmentation fault

    [293399.915068] BUG: unable to handle kernel NULL pointer dereference at 00000000
    [293399.931249] IP: [] find_usage_backwards+0x9c/0xb6
    [293399.945735] *pde = 00000000
    [293399.957323] Oops: 0000 [#1] SMP
    [293399.968978] Modules linked in: raid456 async_xor async_memcpy async_tx xor linear loop ipt_MASQUERADE iptable_nat nf_nat bridge rfcomm l2cap bluetooth ]
    [293400.093457]
    [293400.105809] Pid: 30652, comm: mdadm Not tainted (2.6.25-imsm #63)
    [293400.123339] EIP: 0060:[] EFLAGS: 00210046 CPU: 2
    [293400.140261] EIP is at find_usage_backwards+0x9c/0xb6
    [293400.156651] EAX: 00000002 EBX: 00000000 ECX: 00000001 EDX: 0000a9a8
    [293400.174211] ESI: 00000000 EDI: d54f2400 EBP: d1db9ba8 ESP: d1db9b9c
    [293400.191645] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
    [293400.207967] Process mdadm (pid: 30652, ti=d1db9000 task=e0f28000 task.ti=d1db9000)
    [293400.216021] Stack: e0f284f0 e0f28000 00000004 d1db9bb8 c0441d2d e0f284f0 e0f28000 d1db9bd4
    [293400.236094] c0442032 c06d1fed 00000010 00200246 e0f284f0 d54f2400 d1db9c24 c0442b63
    [293400.256296] 0000025d 00000002 00000000 00000000 f72cd3ec 00000001 e0f28000 00000000
    [293400.276699] Call Trace:
    [293400.302628] [] ? check_usage_backwards+0x19/0x3b
    [293400.320626] [] ? mark_lock+0x228/0x399
    [293400.337629] [] ? __lock_acquire+0x440/0xad5
    [293400.355036] [] ? mark_held_locks+0x41/0x5c
    [293400.372027] [] ? native_sched_clock+0x8d/0x9f
    [293400.389053] [] ? lock_acquire+0x57/0x73
    [293400.405617] [] ? linear_conf+0xac/0x399 [linear]
    [293400.422874] [] ? _spin_lock+0x1c/0x49
    [293400.439193] [] ? linear_conf+0xac/0x399 [linear]
    [293400.456628] [] ? linear_conf+0xac/0x399 [linear]
    [293400.474060] [] ? mark_held_locks+0x41/0x5c
    [293400.491130] [] ? __mutex_unlock_slowpath+0xe1/0xe9
    [293400.509098] [] ? lock_release_holdtime+0x3f/0x44
    [293400.526942] [] ? do_md_run+0x514/0x9ea
    [293400.543989] [] ? linear_run+0x11/0x71 [linear]
    [293400.561848] [] ? do_md_run+0x6cf/0x9ea
    [293400.579013] [] ? _spin_unlock_irq+0x22/0x26
    [293400.596696] [] ? mark_held_locks+0x41/0x5c
    [293400.614585] [] ? mutex_lock_interruptible_nested+0x25f/0x273
    [293400.634244] [] ? trace_hardirqs_on+0xe1/0x102
    [293400.652580] [] ? mutex_lock_interruptible_nested+0x269/0x273
    [293400.672573] [] ? md_ioctl+0xb8/0xdc6
    [293400.690261] [] ? md_ioctl+0xbac/0xdc6
    [293400.708073] [] ? native_sched_clock+0x8d/0x9f
    [293400.726798] [] ? lock_release_holdtime+0x3f/0x44
    [293400.745947] [] ? _spin_unlock_irqrestore+0x36/0x3c
    [293400.765174] [] ? trace_hardirqs_on+0xe1/0x102
    [293400.783838] [] ? down+0x2b/0x2f
    [293400.801143] [] ? blkdev_driver_ioctl+0x49/0x5b
    [293400.819931] [] ? blkdev_ioctl+0x71b/0x769
    [293400.837909] [] ? free_hot_cold_page+0x15c/0x185
    [293400.856024] [] ? native_sched_clock+0x8d/0x9f
    [293400.873546] [] ? lock_release_holdtime+0x3f/0x44
    [293400.891111] [] ? _spin_unlock_irqrestore+0x36/0x3c
    [293400.908540] [] ? trace_hardirqs_on+0xe1/0x102
    [293400.925100] [] ? block_ioctl+0x16/0x1b
    [293400.940642] [] ? block_ioctl+0x0/0x1b
    [293400.956000] [] ? vfs_ioctl+0x22/0x67
    [293400.971108] [] ? do_vfs_ioctl+0x264/0x27b
    [293400.986610] [] ? sys_ioctl+0x40/0x5a
    [293401.001599] [] ? sysenter_past_esp+0x6a/0xb1
    [293401.017331] =======================
    [293401.031194] Code: 89 3d 30 86 a2 c0 b8 02 00 00 00 eb 33 8b 9f b4 00 00 00 eb 16 8b 43 08 8d 56 01 e8 6f ff ff ff 83 f8 02 74 1b 85 c0 74 17 8b 1b <8b>
    [293401.073207] EIP: [] find_usage_backwards+0x9c/0xb6 SS:ESP 0068:d1db9b9c
    [293401.121680] ---[ end trace 6a498ad836843586 ]---

    ---
    md: tell blk-core about device_lock for protecting the queue flags

    From: Dan Williams

    Now that queue flags are no longer atomic (commit
    75ad23bc0fcb4f992a5d06982bf0857ab1738e9e), blk-core checks that the queue is
    locked via ->queue_lock. As noticed by Neil, conf->device_lock already
    satisfies this requirement.

    Signed-off-by: Dan Williams
    ---

    drivers/md/linear.c | 6 ++++++
    drivers/md/multipath.c | 6 ++++++
    drivers/md/raid0.c | 6 ++++++
    drivers/md/raid1.c | 7 ++++++-
    drivers/md/raid10.c | 7 ++++++-
    drivers/md/raid5.c | 2 ++
    include/linux/raid/linear.h | 1 +
    include/linux/raid/raid0.h | 1 +
    8 files changed, 34 insertions(+), 2 deletions(-)


    diff --git a/drivers/md/linear.c b/drivers/md/linear.c
    index 0b85117..d026f08 100644
    --- a/drivers/md/linear.c
    +++ b/drivers/md/linear.c
    @@ -122,6 +122,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
    cnt = 0;
    conf->array_size = 0;

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    int j = rdev->raid_disk;
    dev_info_t *disk = conf->disks + j;
    @@ -133,8 +137,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)

    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
    index 42ee1a2..ee7df38 100644
    --- a/drivers/md/multipath.c
    +++ b/drivers/md/multipath.c
    @@ -436,6 +436,10 @@ static int multipath_run (mddev_t *mddev)
    goto out_free_conf;
    }

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    conf->working_disks = 0;
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    @@ -446,8 +450,10 @@ static int multipath_run (mddev_t *mddev)
    disk = conf->multipaths + disk_idx;
    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, not that we ever expect a device with
    * a merge_bvec_fn to be involved in multipath */
    diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
    index 818b482..deb5609 100644
    --- a/drivers/md/raid0.c
    +++ b/drivers/md/raid0.c
    @@ -117,6 +117,10 @@ static int create_strip_zones (mddev_t *mddev)
    if (!conf->devlist)
    return 1;

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    /* The first zone must contain all devices, so here we check that
    * there is a proper alignment of slots to devices and find them all
    */
    @@ -138,8 +142,10 @@ static int create_strip_zones (mddev_t *mddev)
    }
    zone->dev[j] = rdev1;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev1->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
    index 6778b7c..a01fc7e 100644
    --- a/drivers/md/raid1.c
    +++ b/drivers/md/raid1.c
    @@ -1935,6 +1935,10 @@ static int run(mddev_t *mddev)
    if (!conf->r1bio_pool)
    goto out_no_mem;

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    if (disk_idx >= mddev->raid_disks
    @@ -1944,8 +1948,10 @@ static int run(mddev_t *mddev)

    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    @@ -1958,7 +1964,6 @@ static int run(mddev_t *mddev)
    }
    conf->raid_disks = mddev->raid_disks;
    conf->mddev = mddev;
    - spin_lock_init(&conf->device_lock);
    INIT_LIST_HEAD(&conf->retry_list);

    spin_lock_init(&conf->resync_lock);
    diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
    index 5938fa9..c28af78 100644
    --- a/drivers/md/raid10.c
    +++ b/drivers/md/raid10.c
    @@ -2082,6 +2082,10 @@ static int run(mddev_t *mddev)
    goto out_free_conf;
    }

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    if (disk_idx >= mddev->raid_disks
    @@ -2091,8 +2095,10 @@ static int run(mddev_t *mddev)

    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    @@ -2103,7 +2109,6 @@ static int run(mddev_t *mddev)

    disk->head_position = 0;
    }
    - spin_lock_init(&conf->device_lock);
    INIT_LIST_HEAD(&conf->retry_list);

    spin_lock_init(&conf->resync_lock);
    diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    index ee0ea91..59964a7 100644
    --- a/drivers/md/raid5.c
    +++ b/drivers/md/raid5.c
    @@ -4257,6 +4257,8 @@ static int run(mddev_t *mddev)
    goto abort;
    }
    spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    init_waitqueue_head(&conf->wait_for_stripe);
    init_waitqueue_head(&conf->wait_for_overlap);
    INIT_LIST_HEAD(&conf->handle_list);
    diff --git a/include/linux/raid/linear.h b/include/linux/raid/linear.h
    index ba15469..1bb90cf 100644
    --- a/include/linux/raid/linear.h
    +++ b/include/linux/raid/linear.h
    @@ -19,6 +19,7 @@ struct linear_private_data
    sector_t array_size;
    int preshift; /* shift before dividing by hash_spacing */
    dev_info_t disks[0];
    + spinlock_t device_lock;
    };


    diff --git a/include/linux/raid/raid0.h b/include/linux/raid/raid0.h
    index 1b2dda0..3d20d14 100644
    --- a/include/linux/raid/raid0.h
    +++ b/include/linux/raid/raid0.h
    @@ -21,6 +21,7 @@ struct raid0_private_data

    sector_t hash_spacing;
    int preshift; /* shift this before divide by hash_spacing */
    + spinlock_t device_lock;
    };

    typedef struct raid0_private_data raid0_conf_t;



  6. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Sunday 04 May 2008, Jens Axboe wrote:
    > On Sun, May 04 2008, Jacek Luczak wrote:
    > > Hi,
    > >
    > > I've CC:-ed few guys which may help.
    > >
    > > Prakash Punnoor writes:
    > > > Hi, I got this on boot:
    > > >
    > > > usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    > > > usb 2-1.3: configuration #1 chosen from 1 choice
    > > > Clocksource tsc unstable (delta = -117343945 ns)
    > > > ------------[ cut here ]------------
    > > > WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()
    > > > Modules linked in: kvm_amd kvm powernow_k8 ehci_hcd ohci_hcd usbcore
    > > > forcedeth Pid: 1681, comm: fsck.ext3 Not tainted
    > > > 2.6.25-07422-gb66e1f1-dirty #74
    > > >
    > > > Call Trace:
    > > > [] warn_on_slowpath+0x64/0xa0
    > > > [] generic_make_request+0x194/0x250
    > > > [] submit_bio+0x97/0x140
    > > > [] blk_remove_plug+0x7d/0x90
    > > > [] raid5_unplug_device+0x44/0x110
    > > > [] sync_page+0x2e/0x50
    > > > [] __wait_on_bit+0x65/0x90
    > > > [] wait_on_page_bit+0x78/0x80
    > > > [] wake_bit_function+0x0/0x40
    > > > [] pagevec_lookup_tag+0x1a/0x30
    > > > [] wait_on_page_writeback_range+0xbd/0x130
    > > > [] do_writepages+0x20/0x40
    > > > [] __filemap_fdatawrite_range+0x52/0x60
    > > > [] filemap_write_and_wait+0x44/0x50
    > > > [] __blkdev_put+0x148/0x1b0
    > > > [] __fput+0xb1/0x1c0
    > > > [] filp_close+0x48/0x80
    > > > [] sys_close+0x9f/0x110
    > > > [] system_call_after_swapgs+0x7b/0x80
    > > >
    > > > ---[ end trace 94c0787a2e4d19eb ]---


    [...]
    >
    > Looks like it caught a real bug there - unfortunately we have to check
    > for ->queue_lock here as well, if this is another stacked devices and
    > not the bottom device. Does this make the warning go away for you?


    Nope, it does not. But I see that some other comments have already been posted
    about this issue...
    --
    (°= =°)
    //\ Prakash Punnoor /\\
    V_/ \_V



  7. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Mon, May 05 2008, Neil Brown wrote:
    > On Sunday May 4, jens.axboe@oracle.com wrote:
    > > On Sun, May 04 2008, Jacek Luczak wrote:
    > > > Hi,
    > > >
    > > > I've CC:-ed few guys which may help.
    > > >
    > > > Prakash Punnoor writes:
    > > > > Hi, I got this on boot:
    > > > >
    > > > > usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    > > > > usb 2-1.3: configuration #1 chosen from 1 choice
    > > > > Clocksource tsc unstable (delta = -117343945 ns)
    > > > > ------------[ cut here ]------------
    > > > > WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()

    > ...
    > >
    > > Looks like it caught a real bug there - unfortunately we have to check
    > > for ->queue_lock here as well, if this is another stacked devices and
    > > not the bottom device. Does this make the warning go away for you?
    > >
    > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    > > index 087eee0..958f26b 100644
    > > --- a/drivers/md/raid5.c
    > > +++ b/drivers/md/raid5.c
    > > @@ -3264,6 +3264,8 @@ static void raid5_unplug_device(struct request_queue *q)
    > > unsigned long flags;
    > >
    > > spin_lock_irqsave(&conf->device_lock, flags);
    > > + if (q->queue_lock)
    > > + spin_lock(q->queue_lock);
    > >
    > > if (blk_remove_plug(q)) {
    > > conf->seq_flush++;
    > > @@ -3271,6 +3273,8 @@ static void raid5_unplug_device(struct request_queue *q)
    > > }
    > > md_wakeup_thread(mddev->thread);
    > >
    > > + if (q->queue_lock)
    > > + spin_unlock(q->queue_lock);
    > > spin_unlock_irqrestore(&conf->device_lock, flags);
    > >
    > > unplug_slaves(mddev);
    > >

    >
    > I suspect that will just cause more problems, as the 'q' for an md
    > device never gets ->queue_lock initialised.
    > I suspect the correct thing to do is set
    > q->queue_lock = &conf->device_lock;
    >
    > at some stage, probably immediately after device_lock is initialised
    > in 'run'.
    >
    > I was discussing this with Dan Williams starting
    > http://marc.info/?l=linux-raid&m=120951839903995&w=4
    > though we don't have an agreed patch yet.


    I agree with the usage of the device lock. I (mistakenly) thought that
    raid5 used the bottom device queue for that unplug - I see that it does
    not, so where does the warning come from? mddev->queue->queue_lock
    should be NULL, since md never sets it and it's zeroed to begin with??

    > I'm wondering why you mention the issues of stacked devices though. I
    > don't see how it applies. Could you explain?


    See above: if the queue had been the bottom queue, ->queue_lock may or
    may not be NULL, depending on whether this is the real device or
    (another) stacked device.

    --
    Jens Axboe


  8. Re: WARNING in 2.6.25-07422-gb66e1f1

    Prakash Punnoor wrote:
    > On Sunday 04 May 2008, Jens Axboe wrote:
    >> On Sun, May 04 2008, Jacek Luczak wrote:
    >>> Hi,
    >>>
    >>> I've CC:-ed few guys which may help.
    >>>
    >>> Prakash Punnoor writes:
    >>>> Hi, I got this on boot:
    >>>>
    >>>> usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    >>>> usb 2-1.3: configuration #1 chosen from 1 choice
    >>>> Clocksource tsc unstable (delta = -117343945 ns)
    >>>> ------------[ cut here ]------------
    >>>> WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()
    >>>> Modules linked in: kvm_amd kvm powernow_k8 ehci_hcd ohci_hcd usbcore
    >>>> forcedeth Pid: 1681, comm: fsck.ext3 Not tainted
    >>>> 2.6.25-07422-gb66e1f1-dirty #74
    >>>>
    >>>> Call Trace:
    >>>> [] warn_on_slowpath+0x64/0xa0
    >>>> [] generic_make_request+0x194/0x250
    >>>> [] submit_bio+0x97/0x140
    >>>> [] blk_remove_plug+0x7d/0x90
    >>>> [] raid5_unplug_device+0x44/0x110
    >>>> [] sync_page+0x2e/0x50
    >>>> [] __wait_on_bit+0x65/0x90
    >>>> [] wait_on_page_bit+0x78/0x80
    >>>> [] wake_bit_function+0x0/0x40
    >>>> [] pagevec_lookup_tag+0x1a/0x30
    >>>> [] wait_on_page_writeback_range+0xbd/0x130
    >>>> [] do_writepages+0x20/0x40
    >>>> [] __filemap_fdatawrite_range+0x52/0x60
    >>>> [] filemap_write_and_wait+0x44/0x50
    >>>> [] __blkdev_put+0x148/0x1b0
    >>>> [] __fput+0xb1/0x1c0
    >>>> [] filp_close+0x48/0x80
    >>>> [] sys_close+0x9f/0x110
    >>>> [] system_call_after_swapgs+0x7b/0x80
    >>>>
    >>>> ---[ end trace 94c0787a2e4d19eb ]---

    >
    > [...]
    >> Looks like it caught a real bug there - unfortunately we have to check
    >> for ->queue_lock here as well, if this is another stacked device and
    >> not the bottom device. Does this make the warning go away for you?

    >
    > Nope it does not. But I see some other comments have already been posted about
    > this issue...


    Hmm, I get a similar warning with raid1 also.

    [ 415.851920] kjournald starting. Commit interval 5 seconds
    [ 416.282180] ------------[ cut here ]------------
    [ 416.282184] WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x51/0x84()
    [ 416.282186] Modules linked in: i915 drm fuse snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_seq_device loop usbhid ff_memless firewire_ohci firewire_core thermal shpchp uhci_hcd sr_mod pci_hotplug pata_acpi i2c_i801 ehci_hcd crc_itu_t button serio_raw processor i2c_core evdev
    [ 416.282241] Pid: 1118, comm: md0_raid1 Not tainted 2.6.26-rc1-00065-g5717922-dirty #824
    [ 416.282243]
    [ 416.282244] Call Trace:
    [ 416.282250] [] warn_on_slowpath+0x51/0x8c
    [ 416.282255] [] __wake_up+0x38/0x4f
    [ 416.282260] [] enqueue_hrtimer+0xdd/0xea
    [ 416.282263] [] blk_remove_plug+0x51/0x84
    [ 416.282267] [] flush_pending_writes+0x3b/0x84
    [ 416.282271] [] raid1d+0x4b/0xd97
    [ 416.282274] [] sync_buffer+0x0/0x3f
    [ 416.282277] [] hrtick_set+0x9d/0x106
    [ 416.282282] [] thread_return+0x68/0xbb
    [ 416.282288] [] schedule_timeout+0x1e/0xc9
    [ 416.282292] [] md_thread+0xe4/0x102
    [ 416.282296] [] autoremove_wake_function+0x0/0x2e
    [ 416.282300] [] md_thread+0x0/0x102
    [ 416.282303] [] kthread+0x47/0x73
    [ 416.282306] [] schedule_tail+0x27/0x5b
    [ 416.282315] [] child_rip+0xa/0x12
    [ 416.282319] [] kthread+0x0/0x73
    [ 416.282322] [] child_rip+0x0/0x12
    [ 416.282325]
    [ 416.282328] ---[ end trace 4873a8598d8bf786 ]---
    [ 416.282463] EXT3 FS on md0, internal journal


    Is there any other patch to try out?

    Regards,

    Gabriel C

  9. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Thu, 2008-05-08 at 11:39 -0700, Rafael J. Wysocki wrote:
    > I get a similar warning with RAID1 on one of my test boxes:
    >
    > WARNING: at /home/rafael/src/linux-2.6/include/linux/blkdev.h:443 blk_remove_plug+0x85/0xa0()
    > Modules linked in: raid456 async_xor async_memcpy async_tx xor raid0 ehci_hcd ohci_hcd sd_mod edd raid1 ext3 jbd fan sata_uli pata_ali thermal processor
    > Pid: 2159, comm: md1_raid1 Not tainted 2.6.26-rc1 #158
    >
    > Call Trace:
    > [] warn_on_slowpath+0x5f/0x80
    > [] ? __lock_acquire+0x748/0x10d0
    > [] blk_remove_plug+0x85/0xa0
    > [] :raid1:flush_pending_writes+0x44/0xb0
    > [] :raid1:raid1d+0x59/0xfe0
    > [] ? __lock_acquire+0x748/0x10d0
    > [] ? trace_hardirqs_on+0xbf/0x150
    > [] md_thread+0x3c/0x110
    > [] ? autoremove_wake_function+0x0/0x40
    > [] ? md_thread+0x0/0x110
    > [] kthread+0x4d/0x80
    > [] child_rip+0xa/0x12
    > [] ? restore_args+0x0/0x30
    > [] ? kthread+0x0/0x80
    > [] ? child_rip+0x0/0x12
    >
    > ---[ end trace 05d4e0844c61f45d ]---
    >
    > This is the WARN_ON_ONCE(!queue_is_locked(q)) in queue_flag_clear(),
    > apparently.


    Yes, it triggers on all RAID levels. The patch in this message:

    http://marc.info/?l=linux-raid&m=121001065404056&w=2

    ...fixes the raid 0/1/10/5/6 cases, but I am still trying to isolate an
    issue (potentially unrelated) with linear arrays.

    --
    Dan


  10. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Monday, 5 of May 2008, Jens Axboe wrote:
    > On Mon, May 05 2008, Neil Brown wrote:
    > > On Sunday May 4, jens.axboe@oracle.com wrote:
    > > > On Sun, May 04 2008, Jacek Luczak wrote:
    > > > > Hi,
    > > > >
    > > > > I've CC:-ed few guys which may help.
    > > > >
    > > > > Prakash Punnoor pisze:
    > > > > > Hi, I got this on boot:
    > > > > >
    > > > > > usb 2-1.3: new full speed USB device using ehci_hcd and address 3
    > > > > > usb 2-1.3: configuration #1 chosen from 1 choice
    > > > > > Clocksource tsc unstable (delta = -117343945 ns)
    > > > > > ------------[ cut here ]------------
    > > > > > WARNING: at include/linux/blkdev.h:443 blk_remove_plug+0x7d/0x90()

    > > ...
    > > >
    > > > Looks like it caught a real bug there - unfortunately we have to check
    > > > for ->queue_lock here as well, if this is another stacked device and
    > > > not the bottom device. Does this make the warning go away for you?
    > > >
    > > > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    > > > index 087eee0..958f26b 100644
    > > > --- a/drivers/md/raid5.c
    > > > +++ b/drivers/md/raid5.c
    > > > @@ -3264,6 +3264,8 @@ static void raid5_unplug_device(struct request_queue *q)
    > > > unsigned long flags;
    > > >
    > > > spin_lock_irqsave(&conf->device_lock, flags);
    > > > + if (q->queue_lock)
    > > > + spin_lock(q->queue_lock);
    > > >
    > > > if (blk_remove_plug(q)) {
    > > > conf->seq_flush++;
    > > > @@ -3271,6 +3273,8 @@ static void raid5_unplug_device(struct request_queue *q)
    > > > }
    > > > md_wakeup_thread(mddev->thread);
    > > >
    > > > + if (q->queue_lock)
    > > > + spin_unlock(q->queue_lock);
    > > > spin_unlock_irqrestore(&conf->device_lock, flags);
    > > >
    > > > unplug_slaves(mddev);
    > > >

    > >
    > > I suspect that will just cause more problems, as the 'q' for an md
    > > device never gets ->queue_lock initialised.
    > > I suspect the correct thing to do is set
    > > q->queue_lock = &conf->device_lock;
    > >
    > > at some stage, probably immediately after device_lock is initialised
    > > in 'run'.
    > >
    > > I was discussing this with Dan Williams starting
    > > http://marc.info/?l=linux-raid&m=120951839903995&w=4
    > > though we don't have an agreed patch yet.

    >
    > I agree with the usage of the device lock. I (mistakenly) thought that
    > raid5 used the bottom device queue for that unplug - I see that it does
    > not, so where does the warning come from? mddev->queue->queue_lock
    > should be NULL, since md never sets it and it's zeroed to begin with??
    >
    > > I'm wondering why you mention the issues of stacked devices though. I
    > > don't see how it applies. Could you explain?

    >
    > See above, if the queue had been the bottom queue, ->queue_lock may or
    > may not be NULL depending on whether this is the real device or
    > (another) stacked device.


    I get a similar warning with RAID1 on one of my test boxes:

    WARNING: at /home/rafael/src/linux-2.6/include/linux/blkdev.h:443 blk_remove_plug+0x85/0xa0()
    Modules linked in: raid456 async_xor async_memcpy async_tx xor raid0 ehci_hcd ohci_hcd sd_mod edd raid1 ext3 jbd fan sata_uli pata_ali thermal processor
    Pid: 2159, comm: md1_raid1 Not tainted 2.6.26-rc1 #158

    Call Trace:
    [] warn_on_slowpath+0x5f/0x80
    [] ? __lock_acquire+0x748/0x10d0
    [] blk_remove_plug+0x85/0xa0
    [] :raid1:flush_pending_writes+0x44/0xb0
    [] :raid1:raid1d+0x59/0xfe0
    [] ? __lock_acquire+0x748/0x10d0
    [] ? trace_hardirqs_on+0xbf/0x150
    [] md_thread+0x3c/0x110
    [] ? autoremove_wake_function+0x0/0x40
    [] ? md_thread+0x0/0x110
    [] kthread+0x4d/0x80
    [] child_rip+0xa/0x12
    [] ? restore_args+0x0/0x30
    [] ? kthread+0x0/0x80
    [] ? child_rip+0x0/0x12

    ---[ end trace 05d4e0844c61f45d ]---

    This is the WARN_ON_ONCE(!queue_is_locked(q)) in queue_flag_clear(),
    apparently.
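
    That check can be modeled in userspace roughly as follows (an illustrative sketch, not the kernel source; a plain int stands in for spinlock_t, and the sketch returns whether WARN_ON_ONCE would fire instead of actually warning):

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Single-threaded sketch of the logic behind
     * WARN_ON_ONCE(!queue_is_locked(q)) in queue_flag_clear(). */

    typedef int spinlock_t;

    #define QUEUE_FLAG_PLUGGED 0

    struct request_queue {
        unsigned long queue_flags;
        spinlock_t *queue_lock;   /* md never initialises this: stays NULL */
    };

    static void spin_lock(spinlock_t *l)      { *l = 1; }
    static void spin_unlock(spinlock_t *l)    { *l = 0; }
    static int  spin_is_locked(spinlock_t *l) { return *l; }

    /* true only if a lock is registered *and* currently held */
    static int queue_is_locked(struct request_queue *q)
    {
        return q->queue_lock && spin_is_locked(q->queue_lock);
    }

    /* returns 1 when the kernel's WARN_ON_ONCE would fire */
    static int queue_flag_clear(int flag, struct request_queue *q)
    {
        int would_warn = !queue_is_locked(q);
        q->queue_flags &= ~(1UL << flag);
        return would_warn;
    }

    int main(void)
    {
        spinlock_t device_lock = 0;
        struct request_queue q = { 1UL << QUEUE_FLAG_PLUGGED, NULL };

        /* md's queue: ->queue_lock is NULL, so the warning fires */
        assert(queue_flag_clear(QUEUE_FLAG_PLUGGED, &q) == 1);

        /* after pointing ->queue_lock at a held lock, it stays quiet */
        q.queue_flags = 1UL << QUEUE_FLAG_PLUGGED;
        q.queue_lock = &device_lock;
        spin_lock(&device_lock);
        assert(queue_flag_clear(QUEUE_FLAG_PLUGGED, &q) == 0);
        spin_unlock(&device_lock);

        assert(q.queue_flags == 0);
        printf("ok\n");
        return 0;
    }
    ```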

    Thanks,
    Rafael

  11. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Thu, 2008-05-08 at 11:46 -0700, Dan Williams wrote:
    > On Thu, 2008-05-08 at 11:39 -0700, Rafael J. Wysocki wrote:
    > > I get a similar warning with RAID1 on one of my test boxes:
    > >
    > > WARNING: at /home/rafael/src/linux-2.6/include/linux/blkdev.h:443 blk_remove_plug+0x85/0xa0()
    > > Modules linked in: raid456 async_xor async_memcpy async_tx xor raid0 ehci_hcd ohci_hcd sd_mod edd raid1 ext3 jbd fan sata_uli pata_ali thermal processor
    > > Pid: 2159, comm: md1_raid1 Not tainted 2.6.26-rc1 #158
    > >
    > > Call Trace:
    > > [] warn_on_slowpath+0x5f/0x80
    > > [] ? __lock_acquire+0x748/0x10d0
    > > [] blk_remove_plug+0x85/0xa0
    > > [] :raid1:flush_pending_writes+0x44/0xb0
    > > [] :raid1:raid1d+0x59/0xfe0
    > > [] ? __lock_acquire+0x748/0x10d0
    > > [] ? trace_hardirqs_on+0xbf/0x150
    > > [] md_thread+0x3c/0x110
    > > [] ? autoremove_wake_function+0x0/0x40
    > > [] ? md_thread+0x0/0x110
    > > [] kthread+0x4d/0x80
    > > [] child_rip+0xa/0x12
    > > [] ? restore_args+0x0/0x30
    > > [] ? kthread+0x0/0x80
    > > [] ? child_rip+0x0/0x12
    > >
    > > ---[ end trace 05d4e0844c61f45d ]---
    > >
    > > This is the WARN_ON_ONCE(!queue_is_locked(q)) in queue_flag_clear(),
    > > apparently.

    >
    > Yes, it triggers on all RAID levels. The patch in this message:
    >
    > http://marc.info/?l=linux-raid&m=121001065404056&w=2

    > ...fixes the raid 0/1/10/5/6 cases, but I am still trying to isolate an
    > issue (potentially unrelated) with linear arrays.
    >


    Gah, 'device_lock' can not come after 'disks[0]' in 'struct
    linear_private_data'. Updated patch below. Simple testing passes:
    'mdadm --create /dev/md0; mkfs.ext3 /dev/md0' for each raid level
    linear, 0, 1, 10, 5, and 6.

    ---snip--->
    Subject: md: tell blk-core about device_lock for protecting the queue flags
    From: Dan Williams

    Now that queue flags are no longer atomic (commit:
    75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) blk-core checks the queue is locked
    via ->queue_lock. As noticed by Neil conf->device_lock already satisfies this
    requirement.

    Signed-off-by: Dan Williams
    ---

    drivers/md/linear.c | 6 ++++++
    drivers/md/multipath.c | 6 ++++++
    drivers/md/raid0.c | 6 ++++++
    drivers/md/raid1.c | 7 ++++++-
    drivers/md/raid10.c | 7 ++++++-
    drivers/md/raid5.c | 2 ++
    include/linux/raid/linear.h | 3 ++-
    include/linux/raid/raid0.h | 1 +
    8 files changed, 35 insertions(+), 3 deletions(-)


    diff --git a/drivers/md/linear.c b/drivers/md/linear.c
    index 0b85117..d026f08 100644
    --- a/drivers/md/linear.c
    +++ b/drivers/md/linear.c
    @@ -122,6 +122,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
    cnt = 0;
    conf->array_size = 0;

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    int j = rdev->raid_disk;
    dev_info_t *disk = conf->disks + j;
    @@ -133,8 +137,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)

    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
    index 42ee1a2..ee7df38 100644
    --- a/drivers/md/multipath.c
    +++ b/drivers/md/multipath.c
    @@ -436,6 +436,10 @@ static int multipath_run (mddev_t *mddev)
    goto out_free_conf;
    }

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    conf->working_disks = 0;
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    @@ -446,8 +450,10 @@ static int multipath_run (mddev_t *mddev)
    disk = conf->multipaths + disk_idx;
    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, not that we ever expect a device with
    * a merge_bvec_fn to be involved in multipath */
    diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
    index 818b482..deb5609 100644
    --- a/drivers/md/raid0.c
    +++ b/drivers/md/raid0.c
    @@ -117,6 +117,10 @@ static int create_strip_zones (mddev_t *mddev)
    if (!conf->devlist)
    return 1;

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    /* The first zone must contain all devices, so here we check that
    * there is a proper alignment of slots to devices and find them all
    */
    @@ -138,8 +142,10 @@ static int create_strip_zones (mddev_t *mddev)
    }
    zone->dev[j] = rdev1;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev1->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
    index 6778b7c..a01fc7e 100644
    --- a/drivers/md/raid1.c
    +++ b/drivers/md/raid1.c
    @@ -1935,6 +1935,10 @@ static int run(mddev_t *mddev)
    if (!conf->r1bio_pool)
    goto out_no_mem;

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    if (disk_idx >= mddev->raid_disks
    @@ -1944,8 +1948,10 @@ static int run(mddev_t *mddev)

    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    @@ -1958,7 +1964,6 @@ static int run(mddev_t *mddev)
    }
    conf->raid_disks = mddev->raid_disks;
    conf->mddev = mddev;
    - spin_lock_init(&conf->device_lock);
    INIT_LIST_HEAD(&conf->retry_list);

    spin_lock_init(&conf->resync_lock);
    diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
    index 5938fa9..c28af78 100644
    --- a/drivers/md/raid10.c
    +++ b/drivers/md/raid10.c
    @@ -2082,6 +2082,10 @@ static int run(mddev_t *mddev)
    goto out_free_conf;
    }

    + spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    if (disk_idx >= mddev->raid_disks
    @@ -2091,8 +2095,10 @@ static int run(mddev_t *mddev)

    disk->rdev = rdev;

    + spin_lock(&conf->device_lock);
    blk_queue_stack_limits(mddev->queue,
    rdev->bdev->bd_disk->queue);
    + spin_unlock(&conf->device_lock);
    /* as we don't honour merge_bvec_fn, we must never risk
    * violating it, so limit ->max_sector to one PAGE, as
    * a one page request is never in violation.
    @@ -2103,7 +2109,6 @@ static int run(mddev_t *mddev)

    disk->head_position = 0;
    }
    - spin_lock_init(&conf->device_lock);
    INIT_LIST_HEAD(&conf->retry_list);

    spin_lock_init(&conf->resync_lock);
    diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    index ee0ea91..59964a7 100644
    --- a/drivers/md/raid5.c
    +++ b/drivers/md/raid5.c
    @@ -4257,6 +4257,8 @@ static int run(mddev_t *mddev)
    goto abort;
    }
    spin_lock_init(&conf->device_lock);
    + /* blk-core uses queue_lock to verify protection of the queue flags */
    + mddev->queue->queue_lock = &conf->device_lock;
    init_waitqueue_head(&conf->wait_for_stripe);
    init_waitqueue_head(&conf->wait_for_overlap);
    INIT_LIST_HEAD(&conf->handle_list);
    diff --git a/include/linux/raid/linear.h b/include/linux/raid/linear.h
    index ba15469..3c35e1e 100644
    --- a/include/linux/raid/linear.h
    +++ b/include/linux/raid/linear.h
    @@ -18,7 +18,8 @@ struct linear_private_data
    sector_t hash_spacing;
    sector_t array_size;
    int preshift; /* shift before dividing by hash_spacing */
    - dev_info_t disks[0];
    + spinlock_t device_lock;
    + dev_info_t disks[0]; /* grows depending on 'raid_disks' */
    };


    diff --git a/include/linux/raid/raid0.h b/include/linux/raid/raid0.h
    index 1b2dda0..3d20d14 100644
    --- a/include/linux/raid/raid0.h
    +++ b/include/linux/raid/raid0.h
    @@ -21,6 +21,7 @@ struct raid0_private_data

    sector_t hash_spacing;
    int preshift; /* shift this before divide by hash_spacing */
    + spinlock_t device_lock;
    };

    typedef struct raid0_private_data raid0_conf_t;




  12. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Thursday May 8, dan.j.williams@intel.com wrote:
    > On Thu, 2008-05-08 at 11:46 -0700, Dan Williams wrote:
    > Subject: md: tell blk-core about device_lock for protecting the queue flags
    > From: Dan Williams
    >
    > Now that queue flags are no longer atomic (commit:
    > 75ad23bc0fcb4f992a5d06982bf0857ab1738e9e) blk-core checks the queue is locked
    > via ->queue_lock. As noticed by Neil conf->device_lock already satisfies this
    > requirement.
    >
    > Signed-off-by: Dan Williams
    > ---
    >
    > drivers/md/linear.c | 6 ++++++
    > drivers/md/multipath.c | 6 ++++++
    > drivers/md/raid0.c | 6 ++++++
    > drivers/md/raid1.c | 7 ++++++-
    > drivers/md/raid10.c | 7 ++++++-
    > drivers/md/raid5.c | 2 ++
    > include/linux/raid/linear.h | 3 ++-
    > include/linux/raid/raid0.h | 1 +
    > 8 files changed, 35 insertions(+), 3 deletions(-)
    >
    >
    > diff --git a/drivers/md/linear.c b/drivers/md/linear.c
    > index 0b85117..d026f08 100644
    > --- a/drivers/md/linear.c
    > +++ b/drivers/md/linear.c
    > @@ -122,6 +122,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
    > cnt = 0;
    > conf->array_size = 0;
    >
    > + spin_lock_init(&conf->device_lock);
    > + /* blk-core uses queue_lock to verify protection of the queue flags */
    > + mddev->queue->queue_lock = &conf->device_lock;
    > +
    > rdev_for_each(rdev, tmp, mddev) {
    > int j = rdev->raid_disk;
    > dev_info_t *disk = conf->disks + j;
    > @@ -133,8 +137,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
    >
    > disk->rdev = rdev;
    >
    > + spin_lock(&conf->device_lock);
    > blk_queue_stack_limits(mddev->queue,
    > rdev->bdev->bd_disk->queue);
    > + spin_unlock(&conf->device_lock);
    > /* as we don't honour merge_bvec_fn, we must never risk
    > * violating it, so limit ->max_sector to one PAGE, as
    > * a one page request is never in violation.


    This shouldn't be necessary.
    There is no actual race here -- mddev->queue->queue_flags is not going to be
    accessed by anyone else until do_md_run does
    mddev->queue->make_request_fn = mddev->pers->make_request;
    which is much later.
    So we only need to be sure that "queue_is_locked" doesn't complain.
    And as q->queue_lock is still NULL at this point, it won't complain.

    I think that the *only* change that is needed is to put

    > + /* blk-core uses queue_lock to verify protection of the queue flags */
    > + mddev->queue->queue_lock = &conf->device_lock;


    after each
    > + spin_lock_init(&conf->device_lock);


    i.e. in raid1.c, raid10.c and raid5.c

    ??

    NeilBrown

  13. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Thu, May 8, 2008 at 7:15 PM, Neil Brown wrote:
    > On Thursday May 8, dan.j.williams@intel.com wrote:
    > > @@ -133,8 +137,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
    > >
    > > disk->rdev = rdev;
    > >
    > > + spin_lock(&conf->device_lock);
    > > blk_queue_stack_limits(mddev->queue,
    > > rdev->bdev->bd_disk->queue);
    > > + spin_unlock(&conf->device_lock);
    > > /* as we don't honour merge_bvec_fn, we must never risk
    > > * violating it, so limit ->max_sector to one PAGE, as
    > > * a one page request is never in violation.

    >
    > This shouldn't be necessary.
    > There is no actual race here -- mddev->queue->queue_flags is not going to be
    > accessed by anyone else until do_md_run does
    > mddev->queue->make_request_fn = mddev->pers->make_request;
    > which is much later.
    > So we only need to be sure that "queue_is_locked" doesn't complain.
    > And as q->queue_lock is still NULL at this point, it won't complain.
    >
    > I think that the *only* change that is needed is to put
    >
    >
    > > + /* blk-core uses queue_lock to verify protection of the queue flags */
    > > + mddev->queue->queue_lock = &conf->device_lock;

    >
    > after each
    >
    > > + spin_lock_init(&conf->device_lock);

    >
    > i.e. in raid1.c, raid10.c and raid5.c
    >
    > ??


    Yes, locking shouldn't be needed at those points; however, the warning
    still fires because blk_queue_stack_limits() is using
    queue_flag_clear() instead of queue_flag_clear_unlocked(). Taking a look at
    converting it to queue_flag_clear_unlocked() uncovered a couple more
    overlooked sites (multipath.c:multipath_add_disk and
    raid1.c:raid1_add_disk) where ->run has already been called...

    The options I am thinking of all seem ugly:
    1/ keep the unnecessary locking in MD
    2/ make blk_queue_stack_limits() use queue_flag_clear_unlocked() even
    though it needs to be locked sometimes
    3/ conditionally use queue_flag_clear_unlocked if !t->queue_lock
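
    Option 3 might look roughly like this in a userspace sketch (illustrative only, not a proposed kernel patch; the int spinlock and the stack_cluster_flag() helper are stand-ins for the real types and for the QUEUE_FLAG_CLUSTER portion of blk_queue_stack_limits()):

    ```c
    #include <assert.h>
    #include <stdio.h>

    typedef int spinlock_t;

    #define QUEUE_FLAG_CLUSTER 0

    struct request_queue {
        unsigned long queue_flags;
        spinlock_t *queue_lock;
    };

    static void spin_lock(spinlock_t *l)   { *l = 1; }
    static void spin_unlock(spinlock_t *l) { *l = 0; }

    static void queue_flag_clear(int flag, struct request_queue *q)
    {
        q->queue_flags &= ~(1UL << flag);   /* caller holds q->queue_lock */
    }

    static void queue_flag_clear_unlocked(int flag, struct request_queue *q)
    {
        q->queue_flags &= ~(1UL << flag);   /* no lock registered yet */
    }

    /* option 3: lock if a queue lock is registered, else clear unlocked */
    static void stack_cluster_flag(struct request_queue *t,
                                   struct request_queue *b)
    {
        if (!(b->queue_flags & (1UL << QUEUE_FLAG_CLUSTER))) {
            if (t->queue_lock) {
                spin_lock(t->queue_lock);
                queue_flag_clear(QUEUE_FLAG_CLUSTER, t);
                spin_unlock(t->queue_lock);
            } else {
                queue_flag_clear_unlocked(QUEUE_FLAG_CLUSTER, t);
            }
        }
    }

    int main(void)
    {
        spinlock_t lock = 0;
        struct request_queue bottom = { 0, NULL };   /* cluster flag off */
        struct request_queue top    = { 1UL << QUEUE_FLAG_CLUSTER, NULL };

        /* md at ->run() time: no queue_lock yet, unlocked path is taken */
        stack_cluster_flag(&top, &bottom);
        assert(top.queue_flags == 0);

        /* later, with a registered lock, the locked path is taken */
        top.queue_flags = 1UL << QUEUE_FLAG_CLUSTER;
        top.queue_lock = &lock;
        stack_cluster_flag(&top, &bottom);
        assert(top.queue_flags == 0 && lock == 0);

        printf("ok\n");
        return 0;
    }
    ```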

    --
    Dan

  14. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Friday May 9, neilb@suse.de wrote:
    > On Thursday May 8, dan.j.williams@intel.com wrote:
    > > @@ -133,8 +137,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
    > >
    > > disk->rdev = rdev;
    > >
    > > + spin_lock(&conf->device_lock);
    > > blk_queue_stack_limits(mddev->queue,
    > > rdev->bdev->bd_disk->queue);
    > > + spin_unlock(&conf->device_lock);
    > > /* as we don't honour merge_bvec_fn, we must never risk
    > > * violating it, so limit ->max_sector to one PAGE, as
    > > * a one page request is never in violation.

    >
    > This shouldn't be necessary.
    > There is no actual race here -- mddev->queue->queue_flags is not going to be
    > accessed by anyone else until do_md_run does
    > mddev->queue->make_request_fn = mddev->pers->make_request;
    > which is much later.
    > So we only need to be sure that "queue_is_locked" doesn't complain.
    > And as q->queue_lock is still NULL at this point, it won't complain.


    Sorry, I got that backwards. It will complain, won't it? :-)

    I gotta say that I think it shouldn't. Introducing a spinlock in
    linear.c, raid0.c, multipath.c just to silence a "WARN_ON" seems like
    the wrong thing to do. Of course we could just use q->__queue_lock so
    we don't have to add a new lock, but we still have to take the lock
    unnecessarily.

    Unfortunately I cannot find a nice solution that both avoids clutter
    in md code and also protects against carelessly changing flags without
    a proper lock.....

    Maybe....
    We could get blk_queue_stack_limits to lock the queue, and always
    spin_lock_init __queue_lock. Then the only change needed in linear.c
    et al would be to set ->queue_lock to &->__queue_lock.

    Jens: What do you think of this??

    NeilBrown



    diff --git a/block/blk-core.c b/block/blk-core.c
    index b754a4a..2d31dc2 100644
    --- a/block/blk-core.c
    +++ b/block/blk-core.c
    @@ -479,6 +479,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
    kobject_init(&q->kobj, &blk_queue_ktype);

    mutex_init(&q->sysfs_lock);
    + spin_lock_init(&q->__queue_lock);

    return q;
    }
    @@ -541,10 +542,8 @@ blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
    * if caller didn't supply a lock, they get per-queue locking with
    * our embedded lock
    */
    - if (!lock) {
    - spin_lock_init(&q->__queue_lock);
    + if (!lock)
    lock = &q->__queue_lock;
    - }

    q->request_fn = rfn;
    q->prep_rq_fn = NULL;
    diff --git a/block/blk-settings.c b/block/blk-settings.c
    index bb93d4c..488199a 100644
    --- a/block/blk-settings.c
    +++ b/block/blk-settings.c
    @@ -286,8 +286,14 @@ void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b)
    t->max_hw_segments = min(t->max_hw_segments, b->max_hw_segments);
    t->max_segment_size = min(t->max_segment_size, b->max_segment_size);
    t->hardsect_size = max(t->hardsect_size, b->hardsect_size);
    - if (!test_bit(QUEUE_FLAG_CLUSTER, &b->queue_flags))
    + if (!t->queue_lock)
    + WARN_ON_ONCE(1);
    + else if (!test_bit(QUEUE_FLAG_CLUSTER, &b->queue_flags)) {
    + unsigned long flags;
    + spin_lock_irqsave(t->queue_lock, flags);
    queue_flag_clear(QUEUE_FLAG_CLUSTER, t);
    + spin_unlock_irqrestore(t->queue_lock, flags);
    + }
    }
    EXPORT_SYMBOL(blk_queue_stack_limits);

    diff --git a/drivers/md/linear.c b/drivers/md/linear.c
    index 0b85117..552f81b 100644
    --- a/drivers/md/linear.c
    +++ b/drivers/md/linear.c
    @@ -250,6 +250,7 @@ static int linear_run (mddev_t *mddev)
    {
    linear_conf_t *conf;

    + mddev->queue->queue_lock = &mddev->queue->__queue_lock;
    conf = linear_conf(mddev, mddev->raid_disks);

    if (!conf)
    diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
    index 42ee1a2..90f85e4 100644
    --- a/drivers/md/multipath.c
    +++ b/drivers/md/multipath.c
    @@ -417,6 +417,7 @@ static int multipath_run (mddev_t *mddev)
    * bookkeeping area. [whatever we allocate in multipath_run(),
    * should be freed in multipath_stop()]
    */
    + mddev->queue->queue_lock = &mddev->queue->__queue_lock;

    conf = kzalloc(sizeof(multipath_conf_t), GFP_KERNEL);
    mddev->private = conf;
    diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
    index 818b482..a179c8f 100644
    --- a/drivers/md/raid0.c
    +++ b/drivers/md/raid0.c
    @@ -280,6 +280,7 @@ static int raid0_run (mddev_t *mddev)
    (mddev->chunk_size>>1)-1);
    blk_queue_max_sectors(mddev->queue, mddev->chunk_size >> 9);
    blk_queue_segment_boundary(mddev->queue, (mddev->chunk_size>>1) - 1);
    + mddev->queue->queue_lock = &mddev->queue->__queue_lock;

    conf = kmalloc(sizeof (raid0_conf_t), GFP_KERNEL);
    if (!conf)
    diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
    index 6778b7c..ac409b7 100644
    --- a/drivers/md/raid1.c
    +++ b/drivers/md/raid1.c
    @@ -1935,6 +1935,9 @@ static int run(mddev_t *mddev)
    if (!conf->r1bio_pool)
    goto out_no_mem;

    + spin_lock_init(&conf->device_lock);
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    if (disk_idx >= mddev->raid_disks
    @@ -1958,7 +1961,6 @@ static int run(mddev_t *mddev)
    }
    conf->raid_disks = mddev->raid_disks;
    conf->mddev = mddev;
    - spin_lock_init(&conf->device_lock);
    INIT_LIST_HEAD(&conf->retry_list);

    spin_lock_init(&conf->resync_lock);
    diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
    index 5938fa9..740f670 100644
    --- a/drivers/md/raid10.c
    +++ b/drivers/md/raid10.c
    @@ -2082,6 +2082,9 @@ static int run(mddev_t *mddev)
    goto out_free_conf;
    }

    + spin_lock_init(&conf->device_lock);
    + mddev->queue->queue_lock = &conf->device_lock;
    +
    rdev_for_each(rdev, tmp, mddev) {
    disk_idx = rdev->raid_disk;
    if (disk_idx >= mddev->raid_disks
    @@ -2103,7 +2106,6 @@ static int run(mddev_t *mddev)

    disk->head_position = 0;
    }
    - spin_lock_init(&conf->device_lock);
    INIT_LIST_HEAD(&conf->retry_list);

    spin_lock_init(&conf->resync_lock);
    diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    index 087eee0..4fafc79 100644
    --- a/drivers/md/raid5.c
    +++ b/drivers/md/raid5.c
    @@ -4256,6 +4256,7 @@ static int run(mddev_t *mddev)
    goto abort;
    }
    spin_lock_init(&conf->device_lock);
    + mddev->queue->queue_lock = &conf->device_lock;
    init_waitqueue_head(&conf->wait_for_stripe);
    init_waitqueue_head(&conf->wait_for_overlap);
    INIT_LIST_HEAD(&conf->handle_list);

  15. Re: WARNING in 2.6.25-07422-gb66e1f1


    On Thu, 2008-05-08 at 22:38 -0700, Neil Brown wrote:
    > On Friday May 9, neilb@suse.de wrote:
    > > On Thursday May 8, dan.j.williams@intel.com wrote:
    > > > @@ -133,8 +137,10 @@ static linear_conf_t *linear_conf(mddev_t *mddev, int raid_disks)
    > > >
    > > > disk->rdev = rdev;
    > > >
    > > > + spin_lock(&conf->device_lock);
    > > > blk_queue_stack_limits(mddev->queue,
    > > > rdev->bdev->bd_disk->queue);
    > > > + spin_unlock(&conf->device_lock);
    > > > /* as we don't honour merge_bvec_fn, we must never risk
    > > > * violating it, so limit ->max_sector to one PAGE, as
    > > > * a one page request is never in violation.

    > >
    > > This shouldn't be necessary.
    > > There is no actual race here -- mddev->queue->queue_flags is not going to be
    > > accessed by anyone else until do_md_run does
    > > mddev->queue->make_request_fn = mddev->pers->make_request;
    > > which is much later.
    > > So we only need to be sure that "queue_is_locked" doesn't complain.
    > > And as q->queue_lock is still NULL at this point, it won't complain.

    >
    > Sorry, I got that backwards. It will complain, won't it. :-)
    >
    > I gotta say that I think it shouldn't. Introducing a spinlock in
    > linear.c, raid0.c, multipath.c just to silence a "WARN_ON" seems like
    > the wrong thing to do. Of course we could just use q->__queue_lock so
    > we don't have to add a new lock, but we still have to take the lock
    > unnecessarily.
    >
    > Unfortunately I cannot find a nice solution that both avoids clutter
    > in md code and also protects against carelessly changing flags without
    > a proper lock.....
    >
    > Maybe....
    > We could get blk_queue_stack_limits to lock the queue, and always
    > spin_lock_init __queue_lock. Then the only change needed in linear.c
    > et al would be to set ->queue_lock to &->__queue_lock.
    >
    > Jens: What do you think of this??
    >
    > diff --git a/block/blk-core.c b/block/blk-core.c
    > index b754a4a..2d31dc2 100644
    > --- a/block/blk-core.c
    > +++ b/block/blk-core.c
    > @@ -479,6 +479,7 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
    >  	kobject_init(&q->kobj, &blk_queue_ktype);
    >
    >  	mutex_init(&q->sysfs_lock);
    > +	spin_lock_init(&q->__queue_lock);
    >
    >  	return q;
    >  }
    > @@ -541,10 +542,8 @@ blk_init_queue_node(request_fn_proc *rfn, spinlock_t *lock, int node_id)
    >  	 * if caller didn't supply a lock, they get per-queue locking with
    >  	 * our embedded lock
    >  	 */
    > -	if (!lock) {
    > -		spin_lock_init(&q->__queue_lock);
    > +	if (!lock)
    >  		lock = &q->__queue_lock;
    > -	}
    >
    >  	q->request_fn = rfn;
    >  	q->prep_rq_fn = NULL;
    > diff --git a/block/blk-settings.c b/block/blk-settings.c
    > index bb93d4c..488199a 100644
    > --- a/block/blk-settings.c
    > +++ b/block/blk-settings.c
    > @@ -286,8 +286,14 @@ void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b)
    >  	t->max_hw_segments = min(t->max_hw_segments, b->max_hw_segments);
    >  	t->max_segment_size = min(t->max_segment_size, b->max_segment_size);
    >  	t->hardsect_size = max(t->hardsect_size, b->hardsect_size);
    > -	if (!test_bit(QUEUE_FLAG_CLUSTER, &b->queue_flags))
    > +	if (!t->queue_lock)
    > +		WARN_ON_ONCE(1);
    > +	else if (!test_bit(QUEUE_FLAG_CLUSTER, &b->queue_flags)) {
    > +		unsigned long flags;
    > +		spin_lock_irqsave(&t->queue_lock, flags);
    >  		queue_flag_clear(QUEUE_FLAG_CLUSTER, t);
    > +		spin_unlock_irqrestore(&t->queue_lock, flags);
    > +	}
    >  }
    >  EXPORT_SYMBOL(blk_queue_stack_limits);
    >
    > diff --git a/drivers/md/linear.c b/drivers/md/linear.c
    > index 0b85117..552f81b 100644
    > --- a/drivers/md/linear.c
    > +++ b/drivers/md/linear.c
    > @@ -250,6 +250,7 @@ static int linear_run (mddev_t *mddev)
    >  {
    >  	linear_conf_t *conf;
    >
    > +	mddev->queue_lock = &mddev->__queue_lock;
    >  	conf = linear_conf(mddev, mddev->raid_disks);
    >
    >  	if (!conf)
    > diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
    > index 42ee1a2..90f85e4 100644
    > --- a/drivers/md/multipath.c
    > +++ b/drivers/md/multipath.c
    > @@ -417,6 +417,7 @@ static int multipath_run (mddev_t *mddev)
    >  	 * bookkeeping area. [whatever we allocate in multipath_run(),
    >  	 * should be freed in multipath_stop()]
    >  	 */
    > +	mddev->queue_lock = &mddev->__queue_lock;
    >
    >  	conf = kzalloc(sizeof(multipath_conf_t), GFP_KERNEL);
    >  	mddev->private = conf;
    > diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
    > index 818b482..a179c8f 100644
    > --- a/drivers/md/raid0.c
    > +++ b/drivers/md/raid0.c
    > @@ -280,6 +280,7 @@ static int raid0_run (mddev_t *mddev)
    >  				   (mddev->chunk_size>>1)-1);
    >  	blk_queue_max_sectors(mddev->queue, mddev->chunk_size >> 9);
    >  	blk_queue_segment_boundary(mddev->queue, (mddev->chunk_size>>1) - 1);
    > +	mddev->queue_lock = &mddev->__queue_lock;
    >
    >  	conf = kmalloc(sizeof (raid0_conf_t), GFP_KERNEL);
    >  	if (!conf)
    > diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
    > index 6778b7c..ac409b7 100644
    > --- a/drivers/md/raid1.c
    > +++ b/drivers/md/raid1.c
    > @@ -1935,6 +1935,9 @@ static int run(mddev_t *mddev)
    >  	if (!conf->r1bio_pool)
    >  		goto out_no_mem;
    >
    > +	spin_lock_init(&conf->device_lock);
    > +	mddev->queue->queue_lock = &conf->device_lock;
    > +
    >  	rdev_for_each(rdev, tmp, mddev) {
    >  		disk_idx = rdev->raid_disk;
    >  		if (disk_idx >= mddev->raid_disks
    > @@ -1958,7 +1961,6 @@ static int run(mddev_t *mddev)
    >  	}
    >  	conf->raid_disks = mddev->raid_disks;
    >  	conf->mddev = mddev;
    > -	spin_lock_init(&conf->device_lock);
    >  	INIT_LIST_HEAD(&conf->retry_list);
    >
    >  	spin_lock_init(&conf->resync_lock);
    > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
    > index 5938fa9..740f670 100644
    > --- a/drivers/md/raid10.c
    > +++ b/drivers/md/raid10.c
    > @@ -2082,6 +2082,9 @@ static int run(mddev_t *mddev)
    >  		goto out_free_conf;
    >  	}
    >
    > +	spin_lock_init(&conf->device_lock);
    > +	mddev->queue->queue_lock = &mddev->queue->__queue_lock;
    > +
    >  	rdev_for_each(rdev, tmp, mddev) {
    >  		disk_idx = rdev->raid_disk;
    >  		if (disk_idx >= mddev->raid_disks
    > @@ -2103,7 +2106,6 @@ static int run(mddev_t *mddev)
    >
    >  		disk->head_position = 0;
    >  	}
    > -	spin_lock_init(&conf->device_lock);
    >  	INIT_LIST_HEAD(&conf->retry_list);
    >
    >  	spin_lock_init(&conf->resync_lock);
    > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
    > index 087eee0..4fafc79 100644
    > --- a/drivers/md/raid5.c
    > +++ b/drivers/md/raid5.c
    > @@ -4256,6 +4256,7 @@ static int run(mddev_t *mddev)
    >  		goto abort;
    >  	}
    >  	spin_lock_init(&conf->device_lock);
    > +	mddev->queue->queue_lock = &conf->device_lock;
    >  	init_waitqueue_head(&conf->wait_for_stripe);
    >  	init_waitqueue_head(&conf->wait_for_overlap);
    >  	INIT_LIST_HEAD(&conf->handle_list);
    >


    Yes, this is simpler than what I had... spotted some fixups.

    --
    Dan

    diff --git a/block/blk-settings.c b/block/blk-settings.c
    index 488199a..8dd8641 100644
    --- a/block/blk-settings.c
    +++ b/block/blk-settings.c
    @@ -290,9 +290,9 @@ void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b)
     		WARN_ON_ONCE(1);
     	else if (!test_bit(QUEUE_FLAG_CLUSTER, &b->queue_flags)) {
     		unsigned long flags;
    -		spin_lock_irqsave(&t->queue_lock, flags);
    +		spin_lock_irqsave(t->queue_lock, flags);
     		queue_flag_clear(QUEUE_FLAG_CLUSTER, t);
    -		spin_unlock_irqrestore(&t->queue_lock, flags);
    +		spin_unlock_irqrestore(t->queue_lock, flags);
     	}
     }
    EXPORT_SYMBOL(blk_queue_stack_limits);
    diff --git a/drivers/md/linear.c b/drivers/md/linear.c
    index 552f81b..1074824 100644
    --- a/drivers/md/linear.c
    +++ b/drivers/md/linear.c
    @@ -250,7 +250,7 @@ static int linear_run (mddev_t *mddev)
     {
     	linear_conf_t *conf;

    -	mddev->queue_lock = &mddev->__queue_lock;
    +	mddev->queue->queue_lock = &mddev->queue->__queue_lock;
     	conf = linear_conf(mddev, mddev->raid_disks);

     	if (!conf)
    diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
    index 90f85e4..4f4d1f3 100644
    --- a/drivers/md/multipath.c
    +++ b/drivers/md/multipath.c
    @@ -417,7 +417,7 @@ static int multipath_run (mddev_t *mddev)
     	 * bookkeeping area. [whatever we allocate in multipath_run(),
     	 * should be freed in multipath_stop()]
     	 */
    -	mddev->queue_lock = &mddev->__queue_lock;
    +	mddev->queue->queue_lock = &mddev->queue->__queue_lock;

     	conf = kzalloc(sizeof(multipath_conf_t), GFP_KERNEL);
     	mddev->private = conf;
    diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
    index a179c8f..914c04d 100644
    --- a/drivers/md/raid0.c
    +++ b/drivers/md/raid0.c
    @@ -280,7 +280,7 @@ static int raid0_run (mddev_t *mddev)
     				   (mddev->chunk_size>>1)-1);
     	blk_queue_max_sectors(mddev->queue, mddev->chunk_size >> 9);
     	blk_queue_segment_boundary(mddev->queue, (mddev->chunk_size>>1) - 1);
    -	mddev->queue_lock = &mddev->__queue_lock;
    +	mddev->queue->queue_lock = &mddev->queue->__queue_lock;

     	conf = kmalloc(sizeof (raid0_conf_t), GFP_KERNEL);
     	if (!conf)
    diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
    index f46d448..8536ede 100644
    --- a/drivers/md/raid10.c
    +++ b/drivers/md/raid10.c
    @@ -2083,7 +2083,7 @@ static int run(mddev_t *mddev)
     	}

     	spin_lock_init(&conf->device_lock);
    -	mddev->queue->queue_lock = &mddev->queue->__queue_lock;
    +	mddev->queue->queue_lock = &conf->device_lock;

     	rdev_for_each(rdev, tmp, mddev) {
     		disk_idx = rdev->raid_disk;

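    [Editor's note] The mechanics behind the WARNING and the fix can be sketched in plain userspace C. This is a simplified model, not kernel code: the struct fields and helper names only mirror the 2.6.25 block-layer helpers, a plain `int` stands in for `spinlock_t`, and `queue_is_locked()` here only checks for a NULL lock pointer (the real helper also consults `spin_is_locked()`). The point it illustrates: `queue_flag_clear()` warns when the queue does not appear locked, and a queue whose `->queue_lock` was never assigned counts as unlocked, which is exactly what the md personalities hit until they point `->queue_lock` at a real lock.

    ```c
    /* Userspace sketch (assumptions, not kernel source) of the warning path. */
    #include <stdio.h>
    #include <stdbool.h>

    struct request_queue {
        unsigned long queue_flags;
        int *queue_lock;        /* stand-in for spinlock_t *; NULL until set */
        int __queue_lock;       /* embedded per-queue lock, as in blk-core */
    };

    #define QUEUE_FLAG_CLUSTER 1

    static bool queue_is_locked(struct request_queue *q)
    {
        /* A NULL ->queue_lock counts as "not locked", so the check fails. */
        return q->queue_lock != NULL;
    }

    static void queue_flag_clear(unsigned int flag, struct request_queue *q)
    {
        if (!queue_is_locked(q))
            fprintf(stderr, "WARNING: queue_flag_clear without queue_lock\n");
        q->queue_flags &= ~(1UL << flag);
    }

    int main(void)
    {
        struct request_queue md_queue = {
            .queue_flags = 1UL << QUEUE_FLAG_CLUSTER,
        };

        /* Before the fix: md called blk_queue_stack_limits() (and thus
         * queue_flag_clear()) with ->queue_lock still NULL -> warning. */
        queue_flag_clear(QUEUE_FLAG_CLUSTER, &md_queue);

        /* The fix: assign ->queue_lock first, either to the embedded
         * __queue_lock (linear/raid0/multipath) or to conf->device_lock
         * (raid1/raid5); after that, clearing flags is warning-free. */
        md_queue.queue_lock = &md_queue.__queue_lock;
        queue_flag_clear(QUEUE_FLAG_CLUSTER, &md_queue);

        printf("flags=%lu\n", md_queue.queue_flags);
        return 0;
    }
    ```

    Run standalone, the first `queue_flag_clear()` prints the warning and the second is silent, mirroring the before/after behaviour of the patches above.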


  16. Re: WARNING in 2.6.25-07422-gb66e1f1

    On Monday May 12, dan.j.williams@intel.com wrote:
    >
    > Yes, this is simpler than what I had... spotted some fixups.
    >


    Ahh, you noticed that I hadn't actually compiled it. :-)
    Thanks.

    I've sent it off to Linus.

    NeilBrown
