2.6.23-rc9 boot failure (megaraid?) - Kernel


  1. 2.6.23-rc9 boot failure (megaraid?)

    2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.

    System is a Dell Poweredge with PERC 2/DC with RAID1 volume.

    From 2.6.23-rc9:

    Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
    ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
    PIIX4: IDE controller at PCI slot 0000:00:07.1
    eth1: Optical link UP (Full Duplex, Flow Control: )
    PIIX4: chipset revision 1
    PIIX4: not 100% native mode: will probe irqs later
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
    hdc: SAMSUNG SC-140B, ATAPI CD/DVD-ROM drive
    ide1 at 0x170-0x177,0x376 on irq 15
    hdc: ATAPI 40X CD-ROM drive, 128kB Cache, UDMA(33)
    Uniform CD-ROM driver Revision: 3.20
    ide-floppy driver 0.99.newide
    ACPI: PCI Interrupt 0000:00:0d.1[A] -> GSI 17 (level, low) -> IRQ 18
    megaraid: found 0x8086:0x1960:bus 0:slot 13:func 1
    scsi0:Found MegaRAID controller at 0xf8812000, IRQ:18
    megaraid: [1.06:1p00] detected 1 logical drives.
    megaraid: channel[0] is raid.
    megaraid: channel[1] is raid.
    scsi0 : LSI Logic MegaRAID 1.06 254 commands 16 targs 5 chans 7 luns
    scsi0: scanning scsi channel 0 for logical drives.
    scsi 0:0:0:0: Direct-Access MegaRAID LD0 RAID1 8568R 1.06 PQ: 0 ANSI: 2
    scsi0: scanning scsi channel 4 [P0] for physical devices.
    scsi0: scanning scsi channel 5 [P1] for physical devices.
    st: Version 20070203, fixed bufsize 32768, s/g segs 256
    sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Asking for cache data failed
    sd 0:0:0:0: [sda] Assuming drive cache: write through
    sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Asking for cache data failed
    sd 0:0:0:0: [sda] Assuming drive cache: write through
    sda: sda1
    sda: p1 exceeds device capacity
    sd 0:0:0:0: [sda] Attached SCSI disk
    PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
    serio: i8042 KBD port at 0x60,0x64 irq 1
    serio: i8042 AUX port at 0x60,0x64 irq 12
    mice: PS/2 mouse device common for all mice
    input: PC Speaker as /class/input/input1
    input: AT Translated Set 2 keyboard as /class/input/input2
    i2c /dev entries driver
    piix4_smbus 0000:00:07.3: Found 0000:00:07.3 device
    piix4_smbus 0000:00:07.3: Host SMBus controller not enabled!
    NET: Registered protocol family 26
    TCP cubic registered
    Initializing XFRM netlink socket
    NET: Registered protocol family 1
    NET: Registered protocol family 17
    NET: Registered protocol family 15
    Starting balanced_irq
    Using IPI Shortcut mode
    attempt to access beyond end of device
    sda: rw=0, want=67, limit=1
    EXT3-fs: unable to read superblock
    attempt to access beyond end of device
    sda: rw=0, want=67, limit=1
    EXT2-fs: unable to read superblock
    attempt to access beyond end of device
    sda: rw=0, want=129, limit=1
    isofs_fill_super: bread failed, dev=sda1, iso_blknum=16, block=32
    attempt to access beyond end of device
    sda: rw=0, want=131, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17542979, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541955, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541731, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17542971, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541947, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541723, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17542379, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541355, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541131, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17542371, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541347, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=17541123, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14394267, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14393243, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14393019, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14394259, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14393235, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14393011, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14393667, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14392643, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14392419, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14393659, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14392635, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=14392411, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=1315, limit=1
    attempt to access beyond end of device
    sda: rw=0, want=1091, limit=1
    UDF-fs: No partition found (1)
    List of all partitions:
    1600 4194302 hdc driver: ide-cdrom
    0800 0 sda driver: sd
    0801 8771458 sda1
    No filesystem could mount root, tried: ext3 ext2 iso9660 udf
    Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1)
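
A note on reading the numbers in the log above: "want" appears to be the first sector past the end of the request, counted from the start of the whole disk, and "limit" is the disk size in 512-byte sectors. Once the disk is believed to hold a single sector, every read of sda1 has to fail. The sketch below is a user-space illustration in C, not kernel code. The helper names are hypothetical, and the partition start at sector 63 and the 1024-byte probe block size are assumptions; they are chosen because they reproduce want=67 (ext3/ext2 superblock in block 1) and want=129 (isofs, block=32) from the log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): returns the first sector
 * past the end of a one-block read, relative to the start of the disk.
 * part_start is an ASSUMED partition offset in 512-byte sectors.
 */
static uint64_t request_end_sector(uint64_t part_start, uint64_t fs_block,
                                   uint32_t fs_block_size)
{
    assert(fs_block_size > 0 && fs_block_size % 512 == 0);
    uint64_t sectors_per_block = fs_block_size / 512;

    /* start of the block on disk, plus the length of the block */
    return part_start + fs_block * sectors_per_block + sectors_per_block;
}

/* Returns 1 when a request ending at "want" runs past "limit" sectors. */
static int beyond_end_of_device(uint64_t want, uint64_t limit)
{
    return want > limit;
}
```

With limit=1 (the broken capacity) both reads are rejected; with limit=17547264 (the capacity 2.6.22.9 reports) the same reads fit easily.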


    From 2.6.22.9:

    megaraid: found 0x8086:0x1960:bus 0:slot 13:func 1
    scsi0:Found MegaRAID controller at 0xf8812000, IRQ:18
    megaraid: [1.06:1p00] detected 1 logical drives.
    megaraid: channel[0] is raid.
    megaraid: channel[1] is raid.
    scsi0 : LSI Logic MegaRAID 1.06 254 commands 16 targs 5 chans 7 luns
    scsi0: scanning scsi channel 0 for logical drives.
    scsi 0:0:0:0: Direct-Access MegaRAID LD0 RAID1 8568R 1.06 PQ: 0 ANSI: 2
    scsi0: scanning scsi channel 4 [P0] for physical devices.
    scsi0: scanning scsi channel 5 [P1] for physical devices.
    st: Version 20070203, fixed bufsize 32768, s/g segs 256
    sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Asking for cache data failed
    sd 0:0:0:0: [sda] Assuming drive cache: write through
    sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Asking for cache data failed
    sd 0:0:0:0: [sda] Assuming drive cache: write through
    sda: sda1
    sd 0:0:0:0: [sda] Attached SCSI disk
    PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
    serio: i8042 KBD port at 0x60,0x64 irq 1
    serio: i8042 AUX port at 0x60,0x64 irq 12
    mice: PS/2 mouse device common for all mice
    input: PC Speaker as /class/input/input1
    input: AT Translated Set 2 keyboard as /class/input/input2
    i2c /dev entries driver
    piix4_smbus 0000:00:07.3: Found 0000:00:07.3 device
    piix4_smbus 0000:00:07.3: Host SMBus controller not enabled!
    NET: Registered protocol family 26
    TCP cubic registered
    Initializing XFRM netlink socket
    NET: Registered protocol family 1
    NET: Registered protocol family 17
    NET: Registered protocol family 15
    Starting balanced_irq
    Using IPI Shortcut mode
    kjournald starting. Commit interval 5 seconds
    EXT3-fs: mounted filesystem with ordered data mode.
    VFS: Mounted root (ext3 filesystem) readonly.
    Freeing unused kernel memory: 260k freed
    EXT3 FS on sda1, internal journal




    00:0d.1 I2O: Intel Corporation 80960RP [i960RP Microprocessor] (rev 02) (prog-if 01)
    Subsystem: Dell PowerEdge Expandable RAID Controller 2/DC
    Flags: bus master, medium devsel, latency 64, IRQ 18
    Memory at f7000000 (32-bit, prefetchable) [size=4M]
    [virtual] Expansion ROM at 50000000 [disabled] [size=32K]
    Capabilities:




    CONFIG_SCSI=y
    CONFIG_SCSI_DMA=y
    CONFIG_SCSI_NETLINK=y
    CONFIG_SCSI_PROC_FS=y
    CONFIG_SCSI_LOGGING=y
    CONFIG_SCSI_FC_ATTRS=y
    CONFIG_SCSI_LOWLEVEL=y
    CONFIG_MEGARAID_LEGACY=y


    --
    Burton Windle bwindle@fint.org

    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/

  2. Re: 2.6.23-rc9 boot failure (megaraid?)

    Cc's added, the complete bug report is at
    http://lkml.org/lkml/2007/10/2/243

    On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    >
    > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    >...


    Thanks for your report.

    Diff'ing the dmesg's shows:

    <-- snip -->

    scsi0: scanning scsi channel 4 [P0] for physical devices.
    scsi0: scanning scsi channel 5 [P1] for physical devices.
    st: Version 20070203, fixed bufsize 32768, s/g segs 256
    -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Asking for cache data failed
    sd 0:0:0:0: [sda] Assuming drive cache: write through
    -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    sd 0:0:0:0: [sda] Write Protect is off
    sd 0:0:0:0: [sda] Asking for cache data failed
    sd 0:0:0:0: [sda] Assuming drive cache: write through
    sda: sda1
    + sda: p1 exceeds device capacity

    <-- snip -->
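
The changed lines are what one would expect if the READ CAPACITY reply buffer came back untouched (all zeros). A READ CAPACITY(10) reply carries two big-endian 32-bit fields: the last logical block address and the block length. The sketch below is a user-space illustration in C, not the sd driver's code; the struct and function names are hypothetical. It decodes a reply the way the log messages suggest: sector count is last LBA plus one, and a block length of 0 falls back to 512.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type for this sketch. */
struct capacity {
    uint64_t sectors;       /* number of addressable sectors */
    uint32_t sector_size;   /* bytes per sector, after the fallback */
    int size_was_zero;      /* 1 if the reply carried sector size 0 */
};

/* Read a big-endian 32-bit value from four bytes. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode an 8-byte READ CAPACITY(10) reply. */
static struct capacity parse_read_capacity10(const uint8_t *reply)
{
    struct capacity c;

    assert(reply != 0);
    c.sectors = (uint64_t)be32(reply) + 1;  /* reply holds the LAST LBA */
    c.sector_size = be32(reply + 4);
    c.size_was_zero = (c.sector_size == 0);
    if (c.size_was_zero)
        c.sector_size = 512;                /* "assuming 512" */
    return c;
}
```

An all-zero reply decodes to 1 sector of 512 bytes, matching the 2.6.23-rc9 lines; a reply with last LBA 17547263 and block length 512 decodes to the 17547264 sectors that 2.6.22.9 reports.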

    Does reverting the commit below fix the problem?

    cu
    Adrian


    commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
    Author: FUJITA Tomonori
    Date: Mon May 14 20:17:27 2007 +0900

    [SCSI] megaraid_old: convert to use the data buffer accessors

    - remove the unnecessary map_single path.

    - convert to use the new accessors for the sg lists and the
    parameters.

    Jens Axboe did the for_each_sg cleanup.

    Signed-off-by: FUJITA Tomonori
    Acked-by: Sumant Patro
    Signed-off-by: James Bottomley

    diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    index 40ee07d..3907f67 100644
    --- a/drivers/scsi/megaraid.c
    +++ b/drivers/scsi/megaraid.c
    @@ -523,10 +523,8 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int *busy)
     	/*
     	 * filter the internal and ioctl commands
     	 */
    -	if((cmd->cmnd[0] == MEGA_INTERNAL_CMD)) {
    -		return cmd->request_buffer;
    -	}
    -
    +	if((cmd->cmnd[0] == MEGA_INTERNAL_CMD))
    +		return (scb_t *)cmd->host_scribble;

     	/*
     	 * We know what channels our logical drives are on - mega_find_card()
    @@ -657,22 +655,14 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int *busy)

     		case MODE_SENSE: {
     			char *buf;
    +			struct scatterlist *sg;

    -			if (cmd->use_sg) {
    -				struct scatterlist *sg;
    +			sg = scsi_sglist(cmd);
    +			buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;

    -				sg = (struct scatterlist *)cmd->request_buffer;
    -				buf = kmap_atomic(sg->page, KM_IRQ0) +
    -					sg->offset;
    -			} else
    -				buf = cmd->request_buffer;
     			memset(buf, 0, cmd->cmnd[4]);
    -			if (cmd->use_sg) {
    -				struct scatterlist *sg;
    +			kunmap_atomic(buf - sg->offset, KM_IRQ0);

    -				sg = (struct scatterlist *)cmd->request_buffer;
    -				kunmap_atomic(buf - sg->offset, KM_IRQ0);
    -			}
     			cmd->result = (DID_OK << 16);
     			cmd->scsi_done(cmd);
     			return NULL;
    @@ -1551,23 +1541,15 @@ mega_cmd_done(adapter_t *adapter, u8 completed[], int nstatus, int status)
     			islogical = adapter->logdrv_chan[cmd->device->channel];
     			if( cmd->cmnd[0] == INQUIRY && !islogical ) {

    -				if( cmd->use_sg ) {
    -					sgl = (struct scatterlist *)
    -						cmd->request_buffer;
    -
    -					if( sgl->page ) {
    -						c = *(unsigned char *)
    +				sgl = scsi_sglist(cmd);
    +				if( sgl->page ) {
    +					c = *(unsigned char *)
     						page_address((&sgl[0])->page) +
     						(&sgl[0])->offset;
    -					}
    -					else {
    -						printk(KERN_WARNING
    -							"megaraid: invalid sg.\n");
    -						c = 0;
    -					}
    -				}
    -				else {
    -					c = *(u8 *)cmd->request_buffer;
    +				} else {
    +					printk(KERN_WARNING
    +					       "megaraid: invalid sg.\n");
    +					c = 0;
     				}

     				if(IS_RAID_CH(adapter, cmd->device->channel) &&
    @@ -1704,30 +1686,14 @@ mega_rundoneq (adapter_t *adapter)
     static void
     mega_free_scb(adapter_t *adapter, scb_t *scb)
     {
    -	unsigned long length;
    -
     	switch( scb->dma_type ) {

     	case MEGA_DMA_TYPE_NONE:
     		break;

    -	case MEGA_BULK_DATA:
    -		if (scb->cmd->use_sg == 0)
    -			length = scb->cmd->request_bufflen;
    -		else {
    -			struct scatterlist *sgl =
    -				(struct scatterlist *)scb->cmd->request_buffer;
    -			length = sgl->length;
    -		}
    -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    -			       length, scb->dma_direction);
    -		break;
    -
     	case MEGA_SGLIST:
    -		pci_unmap_sg(adapter->dev, scb->cmd->request_buffer,
    -			scb->cmd->use_sg, scb->dma_direction);
    +		scsi_dma_unmap(scb->cmd);
     		break;
    -
     	default:
     		break;
     	}
    @@ -1767,80 +1733,33 @@ __mega_busywait_mbox (adapter_t *adapter)
     static int
     mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
     {
    -	struct scatterlist *sgl;
    -	struct page *page;
    -	unsigned long offset;
    -	unsigned int length;
    +	struct scatterlist *sg;
     	Scsi_Cmnd *cmd;
     	int sgcnt;
     	int idx;

     	cmd = scb->cmd;

    -	/* Scatter-gather not used */
    -	if( cmd->use_sg == 0 || (cmd->use_sg == 1 &&
    -				!adapter->has_64bit_addr)) {
    -
    -		if (cmd->use_sg == 0) {
    -			page = virt_to_page(cmd->request_buffer);
    -			offset = offset_in_page(cmd->request_buffer);
    -			length = cmd->request_bufflen;
    -		} else {
    -			sgl = (struct scatterlist *)cmd->request_buffer;
    -			page = sgl->page;
    -			offset = sgl->offset;
    -			length = sgl->length;
    -		}
    -
    -		scb->dma_h_bulkdata = pci_map_page(adapter->dev,
    -						  page, offset,
    -						  length,
    -						  scb->dma_direction);
    -		scb->dma_type = MEGA_BULK_DATA;
    -
    -		/*
    -		 * We need to handle special 64-bit commands that need a
    -		 * minimum of 1 SG
    -		 */
    -		if( adapter->has_64bit_addr ) {
    -			scb->sgl64[0].address = scb->dma_h_bulkdata;
    -			scb->sgl64[0].length = length;
    -			*buf = (u32)scb->sgl_dma_addr;
    -			*len = (u32)length;
    -			return 1;
    -		}
    -		else {
    -			*buf = (u32)scb->dma_h_bulkdata;
    -			*len = (u32)length;
    -		}
    -		return 0;
    -	}
    -
    -	sgl = (struct scatterlist *)cmd->request_buffer;
    -
     	/*
     	 * Copy Scatter-Gather list info into controller structure.
     	 *
     	 * The number of sg elements returned must not exceed our limit
     	 */
    -	sgcnt = pci_map_sg(adapter->dev, sgl, cmd->use_sg,
    -			scb->dma_direction);
    +	sgcnt = scsi_dma_map(cmd);

     	scb->dma_type = MEGA_SGLIST;

    -	BUG_ON(sgcnt > adapter->sglen);
    +	BUG_ON(sgcnt > adapter->sglen || sgcnt < 0);

     	*len = 0;

    -	for( idx = 0; idx < sgcnt; idx++, sgl++ ) {
    -
    -		if( adapter->has_64bit_addr ) {
    -			scb->sgl64[idx].address = sg_dma_address(sgl);
    -			*len += scb->sgl64[idx].length = sg_dma_len(sgl);
    -		}
    -		else {
    -			scb->sgl[idx].address = sg_dma_address(sgl);
    -			*len += scb->sgl[idx].length = sg_dma_len(sgl);
    +	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
    +		if (adapter->has_64bit_addr) {
    +			scb->sgl64[idx].address = sg_dma_address(sg);
    +			*len += scb->sgl64[idx].length = sg_dma_len(sg);
    +		} else {
    +			scb->sgl[idx].address = sg_dma_address(sg);
    +			*len += scb->sgl[idx].length = sg_dma_len(sg);
     		}
     	}

    @@ -4494,7 +4413,7 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
     	scmd->device = sdev;

     	scmd->device->host = adapter->host;
    -	scmd->request_buffer = (void *)scb;
    +	scmd->host_scribble = (void *)scb;
     	scmd->cmnd[0] = MEGA_INTERNAL_CMD;

     	scb->state |= SCB_ACTIVE;

  3. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Tue, 2 Oct 2007, Adrian Bunk wrote:

    > Cc's added, the complete bug report is at
    > http://lkml.org/lkml/2007/10/2/243
    >
    > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    >> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    >>
    >> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    >> ...

    >
    > Thanks for your report.
    >
    > Does reverting the commit below fix the problem?
    >
    > cu
    > Adrian
    >
    >
    > commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
    > Author: FUJITA Tomonori
    > Date: Mon May 14 20:17:27 2007 +0900
    >


    Confirmed; reverting the above (snipped) patch does fix the issue.

    --
    Burton Windle bwindle@fint.org


  4. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Tuesday, 2 October 2007 20:46, Burton Windle wrote:
    > On Tue, 2 Oct 2007, Adrian Bunk wrote:
    >
    > > Cc's added, the complete bug report is at
    > > http://lkml.org/lkml/2007/10/2/243
    > >
    > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > >> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > >>
    > >> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > >> ...

    > >
    > > Thanks for your report.
    > >
    > > Does reverting the commit below fix the problem?
    > >
    > > cu
    > > Adrian
    > >
    > >
    > > commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
    > > Author: FUJITA Tomonori
    > > Date: Mon May 14 20:17:27 2007 +0900
    > >

    >
    > Confirmed; reverting the above (snipped) patch does fix the issue.


    I've created a bugzilla entry for your report at:

    http://bugzilla.kernel.org/show_bug.cgi?id=9113

    Please add a summary of your observations in there.

    Greetings,
    Rafael

  5. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > Cc's added, the complete bug report is at
    > http://lkml.org/lkml/2007/10/2/243
    >
    > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > >
    > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > >...

    >
    > Thanks for your report.
    >
    > Diff'ing the dmesg's shows:
    >
    > <-- snip -->
    >
    > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > sd 0:0:0:0: [sda] Write Protect is off
    > sd 0:0:0:0: [sda] Asking for cache data failed
    > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > sd 0:0:0:0: [sda] Write Protect is off
    > sd 0:0:0:0: [sda] Asking for cache data failed
    > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > sda: sda1
    > + sda: p1 exceeds device capacity
    >
    > <-- snip -->
    >
    > -	case MEGA_BULK_DATA:
    > -		if (scb->cmd->use_sg == 0)
    > -			length = scb->cmd->request_bufflen;
    > -		else {
    > -			struct scatterlist *sgl =
    > -				(struct scatterlist *)scb->cmd->request_buffer;
    > -			length = sgl->length;
    > -		}
    > -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > -			       length, scb->dma_direction);
    > -		break;
    > -


    This is the problem piece I think. We've reintroduced a very old bug:

    commit 51c928c34fa7cff38df584ad01de988805877dba
    Author: James Bottomley
    Date: Sat Oct 1 09:38:05 2005 -0500

    [SCSI] Legacy MegaRAID: Fix READ CAPACITY

    Some Legacy megaraid cards can't actually cope with the scatter/gather
    version of the READ CAPACITY command (which is what we now send them
    since altering all SCSI internal I/O to go via the block layer). Fix
    this (and a few other broken megaraid driver assumptions) by sending
    the non-sg version of the command if the sg list only has a single
    element.

    Signed-off-by: James Bottomley

    So what we have to do is put back the check for use_sg == 1 and send
    that as a bulk transfer command.

    James
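
The rule described above (a single-element sg list on a controller without 64-bit addressing goes out as a plain bulk transfer, everything else as a scatter/gather list) can be sketched in user-space C as follows. This is an illustration only, not driver code: `struct seg`, `struct xfer`, `build_transfer` and `sg_table_addr` are hypothetical stand-ins for the already DMA-mapped scatterlist, the values handed to the controller, and the address of the controller-visible sg table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatter/gather element. */
struct seg {
    uint32_t dma_addr;
    uint32_t len;
};

/* Hypothetical stand-in for what gets handed to the controller. */
struct xfer {
    int sg_entries;   /* 0 means "bulk transfer, no sg list" */
    uint32_t buf;     /* bulk: data address; sg: address of the sg table */
    uint32_t len;     /* total bytes to transfer */
};

static struct xfer build_transfer(const struct seg *segs, int nsegs,
                                  int has_64bit_addr, uint32_t sg_table_addr)
{
    struct xfer x = { 0, 0, 0 };
    int i;

    assert(segs != 0 && nsegs >= 1);

    /* One element and no 64-bit addressing: send the non-sg form. */
    if (nsegs == 1 && !has_64bit_addr) {
        x.buf = segs[0].dma_addr;
        x.len = segs[0].len;
        return x;               /* sg_entries stays 0 */
    }

    /* Otherwise point the controller at the sg table. */
    for (i = 0; i < nsegs; i++)
        x.len += segs[i].len;
    x.buf = sg_table_addr;
    x.sg_entries = nsegs;
    return x;
}
```

An 8-byte READ CAPACITY buffer normally maps to a single element, so on a 32-bit-only controller it takes the first branch and the card never sees an sg list for it.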



  6. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Tue, 02 Oct 2007 15:38:13 -0500
    James Bottomley wrote:

    > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > Cc's added, the complete bug report is at
    > > http://lkml.org/lkml/2007/10/2/243
    > >
    > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > >
    > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > >...

    > >
    > > Thanks for your report.
    > >
    > > Diff'ing the dmesg's shows:
    > >
    > > <-- snip -->
    > >
    > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > sd 0:0:0:0: [sda] Write Protect is off
    > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > sd 0:0:0:0: [sda] Write Protect is off
    > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > sda: sda1
    > > + sda: p1 exceeds device capacity
    > >
    > > <-- snip -->
    > >
    > > -	case MEGA_BULK_DATA:
    > > -		if (scb->cmd->use_sg == 0)
    > > -			length = scb->cmd->request_bufflen;
    > > -		else {
    > > -			struct scatterlist *sgl =
    > > -				(struct scatterlist *)scb->cmd->request_buffer;
    > > -			length = sgl->length;
    > > -		}
    > > -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > -			       length, scb->dma_direction);
    > > -		break;
    > > -

    >
    > This is the problem piece I think. We've reintroduced a very old bug:
    >
    > commit 51c928c34fa7cff38df584ad01de988805877dba
    > Author: James Bottomley
    > Date: Sat Oct 1 09:38:05 2005 -0500
    >
    > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    >
    > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > version of the READ CAPACITY command (which is what we now send them
    > since altering all SCSI internal I/O to go via the block layer). Fix
    > this (and a few other broken megaraid driver assumptions) by sending
    > the non-sg version of the command if the sg list only has a single
    > element.
    >
    > Signed-off-by: James Bottomley
    >
    > So what we have to do is put back the check for use_sg == 1 and send
    > that as a bulk transfer command.


    Sorry again. Needs to check sg count before dma mapping.


    diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    index 3907f67..ae0b220 100644
    --- a/drivers/scsi/megaraid.c
    +++ b/drivers/scsi/megaraid.c
    @@ -1737,9 +1737,12 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
     	Scsi_Cmnd *cmd;
     	int sgcnt;
     	int idx;
    +	int bulkdata;

     	cmd = scb->cmd;

    +	bulkdata = (scsi_sg_count(cmd) == 1) ? 1 : 0;
    +
     	/*
     	 * Copy Scatter-Gather list info into controller structure.
     	 *
    @@ -1753,6 +1756,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)

     	*len = 0;

    +	if (bulkdata && !adapter->has_64bit_addr) {
    +		sg = scsi_sglist(cmd);
    +		scb->dma_h_bulkdata = sg_dma_address(sg);
    +		*buf = (u32)scb->dma_h_bulkdata;
    +		*len = sg_dma_len(sg);
    +		return 0;
    +	}
    +
     	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
     		if (adapter->has_64bit_addr) {
     			scb->sgl64[idx].address = sg_dma_address(sg);

  7. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Tue, 02 Oct 2007 15:38:13 -0500
    James Bottomley wrote:

    > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > Cc's added, the complete bug report is at
    > > http://lkml.org/lkml/2007/10/2/243
    > >
    > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > >
    > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > >...

    > >
    > > Thanks for your report.
    > >
    > > Diff'ing the dmesg's shows:
    > >
    > > <-- snip -->
    > >
    > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > sd 0:0:0:0: [sda] Write Protect is off
    > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > sd 0:0:0:0: [sda] Write Protect is off
    > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > sda: sda1
    > > + sda: p1 exceeds device capacity
    > >
    > > <-- snip -->
    > >
    > > -	case MEGA_BULK_DATA:
    > > -		if (scb->cmd->use_sg == 0)
    > > -			length = scb->cmd->request_bufflen;
    > > -		else {
    > > -			struct scatterlist *sgl =
    > > -				(struct scatterlist *)scb->cmd->request_buffer;
    > > -			length = sgl->length;
    > > -		}
    > > -		pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > -			       length, scb->dma_direction);
    > > -		break;
    > > -

    >
    > This is the problem piece I think. We've reintroduced a very old bug:
    >
    > commit 51c928c34fa7cff38df584ad01de988805877dba
    > Author: James Bottomley
    > Date: Sat Oct 1 09:38:05 2005 -0500
    >
    > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    >
    > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > version of the READ CAPACITY command (which is what we now send them
    > since altering all SCSI internal I/O to go via the block layer). Fix
    > this (and a few other broken megaraid driver assumptions) by sending
    > the non-sg version of the command if the sg list only has a single
    > element.
    >
    > Signed-off-by: James Bottomley
    >
    > So what we have to do is put back the check for use_sg == 1 and send
    > that as a bulk transfer command.


    Sorry about this. Can this fix the problem?

    Thanks,


    diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    index 3907f67..da56163 100644
    --- a/drivers/scsi/megaraid.c
    +++ b/drivers/scsi/megaraid.c
    @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)

     	*len = 0;

    +	if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
    +		sg = scsi_sglist(cmd);
    +		scb->dma_h_bulkdata = sg_dma_address(sg);
    +		*buf = (u32)scb->dma_h_bulkdata;
    +		*len = sg_dma_len(sg);
    +		return 0;
    +	}
    +
     	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
     		if (adapter->has_64bit_addr) {
     			scb->sgl64[idx].address = sg_dma_address(sg);

  8. RE: 2.6.23-rc9 boot failure (megaraid?)



    > -----Original Message-----
    > From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
    > Sent: Tuesday, October 02, 2007 5:01 PM
    > To: James.Bottomley@SteelEye.com
    > Cc: bunk@kernel.org; bwindle@fint.org;
    > linux-kernel@vger.kernel.org; jens.axboe@oracle.com;
    > fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID
    > Linux; linux-scsi@vger.kernel.org
    > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
    >
    > On Tue, 02 Oct 2007 15:38:13 -0500
    > James Bottomley wrote:
    >
    > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > > Cc's added, the complete bug report is at
    > > > http://lkml.org/lkml/2007/10/2/243
    > > >
    > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > > >
    > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > > >...
    > > >
    > > > Thanks for your report.
    > > >
    > > > Diff'ing the dmesg's shows:
    > > >
    > > > <-- snip -->
    > > >
    > > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > sda: sda1
    > > > + sda: p1 exceeds device capacity
    > > >
    > > > <-- snip -->
    > > >
    > > > - case MEGA_BULK_DATA:
    > > > - if (scb->cmd->use_sg == 0)
    > > > - length = scb->cmd->request_bufflen;
    > > > - else {
    > > > - struct scatterlist *sgl =
    > > > - (struct scatterlist *)scb->cmd->request_buffer;
    > > > - length = sgl->length;
    > > > - }
    > > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > > - length, scb->dma_direction);
    > > > - break;
    > > > -
    > >
    > > This is the problem piece I think. We've reintroduced a very old bug:
    > >
    > > commit 51c928c34fa7cff38df584ad01de988805877dba
    > > Author: James Bottomley
    > > Date: Sat Oct 1 09:38:05 2005 -0500
    > >
    > > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    > >
    > > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > > version of the READ CAPACITY command (which is what we now send them
    > > since altering all SCSI internal I/O to go via the block layer). Fix
    > > this (and a few other broken megaraid driver assumptions) by sending
    > > the non-sg version of the command if the sg list only has a single
    > > element.
    > >
    > > Signed-off-by: James Bottomley
    > >
    > > So what we have to do is put back the check for use_sg == 1 and send
    > > that as a bulk transfer command.
    >
    > Sorry about this. Can this fix the problem?
    >
    > Thanks,
    >
    >
    > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    > index 3907f67..da56163 100644
    > --- a/drivers/scsi/megaraid.c
    > +++ b/drivers/scsi/megaraid.c
    > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
    >
    > *len = 0;
    >
    > + if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
    > + sg = scsi_sglist(cmd);
    > + scb->dma_h_bulkdata = sg_dma_address(sg);
    > + *buf = (u32)scb->dma_h_bulkdata;
    > + *len = sg_dma_len(sg);
    > + return 0;
    > + }
    > +
    > scsi_for_each_sg(cmd, sg, sgcnt, idx) {
    > if (adapter->has_64bit_addr) {
    > scb->sgl64[idx].address = sg_dma_address(sg);
    >



    With this patch I see the correct logical disk size reported.
    Thanks.

    Sumant
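
For readers puzzling over the dmesg lines quoted in this thread: the bogus size is what an all-zero READ CAPACITY(10) reply decodes to. The sketch below is a standalone userspace illustration, not the kernel's sd code. The struct and function names are made up for this example, and it assumes the standard 8-byte big-endian reply layout (last LBA, then block length) plus the "assume 512" fallback that the log message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded READ CAPACITY(10) result (hypothetical type for this sketch). */
struct capacity {
    uint64_t sectors;     /* last LBA + 1 */
    uint32_t sector_size; /* bytes per sector, after the 512 fallback */
};

/* Read a big-endian 32-bit value from four bytes. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode the 8-byte reply: bytes 0-3 last LBA, bytes 4-7 block length. */
static struct capacity decode_read_capacity10(const uint8_t *buf)
{
    struct capacity c;

    assert(buf != NULL);
    c.sectors = (uint64_t)be32(buf) + 1;
    c.sector_size = be32(buf + 4);
    if (c.sector_size == 0)
        c.sector_size = 512; /* "Sector size 0 reported, assuming 512." */
    return c;
}

/* Size in decimal megabytes, truncated, as printed in the log. */
static uint64_t capacity_mb(struct capacity c)
{
    return c.sectors * c.sector_size / 1000000u;
}
```

Feeding it the bytes 01 0B BF FF 00 00 02 00 gives 17547264 sectors of 512 bytes (8984 MB), the healthy value in the log. Eight zero bytes, which is what the driver sees if the controller never fills the buffer, give 1 sector with the size falling back to 512 (0 MB), matching the failing boot.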

  9. RE: 2.6.23-rc9 boot failure (megaraid?)

    On Wed, 3 Oct 2007 17:32:55 -0600
    "Patro, Sumant" wrote:

    >
    >
    > > -----Original Message-----
    > > From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
    > > Sent: Tuesday, October 02, 2007 5:01 PM
    > > To: James.Bottomley@SteelEye.com
    > > Cc: bunk@kernel.org; bwindle@fint.org;
    > > linux-kernel@vger.kernel.org; jens.axboe@oracle.com;
    > > fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID
    > > Linux; linux-scsi@vger.kernel.org
    > > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
    > >
    > > On Tue, 02 Oct 2007 15:38:13 -0500
    > > James Bottomley wrote:
    > >
    > > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > > > Cc's added, the complete bug report is at
    > > > > http://lkml.org/lkml/2007/10/2/243
    > > > >
    > > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > > > >
    > > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > > > >...
    > > > >
    > > > > Thanks for your report.
    > > > >
    > > > > Diff'ing the dmesg's shows:
    > > > >
    > > > > <-- snip -->
    > > > >
    > > > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > sda: sda1
    > > > > + sda: p1 exceeds device capacity
    > > > >
    > > > > <-- snip -->
    > > > >
    > > > > - case MEGA_BULK_DATA:
    > > > > - if (scb->cmd->use_sg == 0)
    > > > > - length = scb->cmd->request_bufflen;
    > > > > - else {
    > > > > - struct scatterlist *sgl =
    > > > > - (struct scatterlist *)scb->cmd->request_buffer;
    > > > > - length = sgl->length;
    > > > > - }
    > > > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > > > - length, scb->dma_direction);
    > > > > - break;
    > > > > -
    > > >
    > > > This is the problem piece I think. We've reintroduced a very old bug:
    > > >
    > > > commit 51c928c34fa7cff38df584ad01de988805877dba
    > > > Author: James Bottomley
    > > > Date: Sat Oct 1 09:38:05 2005 -0500
    > > >
    > > > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    > > >
    > > > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > > > version of the READ CAPACITY command (which is what we now send them
    > > > since altering all SCSI internal I/O to go via the block layer). Fix
    > > > this (and a few other broken megaraid driver assumptions) by sending
    > > > the non-sg version of the command if the sg list only has a single
    > > > element.
    > > >
    > > > Signed-off-by: James Bottomley
    > > >
    > > > So what we have to do is put back the check for use_sg == 1 and send
    > > > that as a bulk transfer command.
    > >
    > > Sorry about this. Can this fix the problem?
    > >
    > > Thanks,
    > >
    > >
    > > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    > > index 3907f67..da56163 100644
    > > --- a/drivers/scsi/megaraid.c
    > > +++ b/drivers/scsi/megaraid.c
    > > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
    > >
    > > *len = 0;
    > >
    > > + if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
    > > + sg = scsi_sglist(cmd);
    > > + scb->dma_h_bulkdata = sg_dma_address(sg);
    > > + *buf = (u32)scb->dma_h_bulkdata;
    > > + *len = sg_dma_len(sg);
    > > + return 0;
    > > + }
    > > +
    > > scsi_for_each_sg(cmd, sg, sgcnt, idx) {
    > > if (adapter->has_64bit_addr) {
    > > scb->sgl64[idx].address = sg_dma_address(sg);
    > >

    >
    >
    > With this patch I see the correct logical disk size reported.
    > Thanks.


    Great, thanks for testing!

    Can you try the following patch instead of the above patch?

    http://marc.info/?l=linux-scsi&m=119137033016550&w=2


    I know the changes are pretty trivial and it should work...

  10. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Thu, Oct 04 2007, FUJITA Tomonori wrote:
    > On Wed, 3 Oct 2007 17:32:55 -0600
    > "Patro, Sumant" wrote:
    >
    > >
    > >
    > > > -----Original Message-----
    > > > From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
    > > > Sent: Tuesday, October 02, 2007 5:01 PM
    > > > To: James.Bottomley@SteelEye.com
    > > > Cc: bunk@kernel.org; bwindle@fint.org;
    > > > linux-kernel@vger.kernel.org; jens.axboe@oracle.com;
    > > > fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID
    > > > Linux; linux-scsi@vger.kernel.org
    > > > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
    > > >
    > > > On Tue, 02 Oct 2007 15:38:13 -0500
    > > > James Bottomley wrote:
    > > >
    > > > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > > > > Cc's added, the complete bug report is at
    > > > > > http://lkml.org/lkml/2007/10/2/243
    > > > > >
    > > > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > > > > >
    > > > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > > > > >...
    > > > > >
    > > > > > Thanks for your report.
    > > > > >
    > > > > > Diff'ing the dmesg's shows:
    > > > > >
    > > > > > <-- snip -->
    > > > > >
    > > > > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > > > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > sda: sda1
    > > > > > + sda: p1 exceeds device capacity
    > > > > >
    > > > > > <-- snip -->
    > > > > >
    > > > > > - case MEGA_BULK_DATA:
    > > > > > - if (scb->cmd->use_sg == 0)
    > > > > > - length = scb->cmd->request_bufflen;
    > > > > > - else {
    > > > > > - struct scatterlist *sgl =
    > > > > > - (struct scatterlist *)scb->cmd->request_buffer;
    > > > > > - length = sgl->length;
    > > > > > - }
    > > > > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > > > > - length, scb->dma_direction);
    > > > > > - break;
    > > > > > -
    > > > >
    > > > > This is the problem piece I think. We've reintroduced a very old bug:
    > > > >
    > > > > commit 51c928c34fa7cff38df584ad01de988805877dba
    > > > > Author: James Bottomley
    > > > > Date: Sat Oct 1 09:38:05 2005 -0500
    > > > >
    > > > > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    > > > >
    > > > > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > > > > version of the READ CAPACITY command (which is what we now send them
    > > > > since altering all SCSI internal I/O to go via the block layer). Fix
    > > > > this (and a few other broken megaraid driver assumptions) by sending
    > > > > the non-sg version of the command if the sg list only has a single
    > > > > element.
    > > > >
    > > > > Signed-off-by: James Bottomley
    > > > >
    > > > > So what we have to do is put back the check for use_sg == 1 and send
    > > > > that as a bulk transfer command.
    > > >
    > > > Sorry about this. Can this fix the problem?
    > > >
    > > > Thanks,
    > > >
    > > >
    > > > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    > > > index 3907f67..da56163 100644
    > > > --- a/drivers/scsi/megaraid.c
    > > > +++ b/drivers/scsi/megaraid.c
    > > > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
    > > >
    > > > *len = 0;
    > > >
    > > > + if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
    > > > + sg = scsi_sglist(cmd);
    > > > + scb->dma_h_bulkdata = sg_dma_address(sg);
    > > > + *buf = (u32)scb->dma_h_bulkdata;
    > > > + *len = sg_dma_len(sg);
    > > > + return 0;
    > > > + }
    > > > +
    > > > scsi_for_each_sg(cmd, sg, sgcnt, idx) {
    > > > if (adapter->has_64bit_addr) {
    > > > scb->sgl64[idx].address = sg_dma_address(sg);
    > > >

    > >
    > >
    > > With this patch I see the correct logical disk size reported.
    > > Thanks.

    >
    > Great, thanks for testing!
    >
    > Can you try the following patch instead of the above patch?
    >
    > http://marc.info/?l=linux-scsi&m=119137033016550&w=2
    >
    >
    > I know the changes are pretty trivial and it should work...


    Tomo, this is the patch I added.

    --
    Jens Axboe


  11. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Thu, 4 Oct 2007 09:28:34 +0200
    Jens Axboe wrote:

    > On Thu, Oct 04 2007, FUJITA Tomonori wrote:
    > > On Wed, 3 Oct 2007 17:32:55 -0600
    > > "Patro, Sumant" wrote:
    > >
    > > >
    > > >
    > > > > -----Original Message-----
    > > > > From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
    > > > > Sent: Tuesday, October 02, 2007 5:01 PM
    > > > > To: James.Bottomley@SteelEye.com
    > > > > Cc: bunk@kernel.org; bwindle@fint.org;
    > > > > linux-kernel@vger.kernel.org; jens.axboe@oracle.com;
    > > > > fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID
    > > > > Linux; linux-scsi@vger.kernel.org
    > > > > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
    > > > >
    > > > > On Tue, 02 Oct 2007 15:38:13 -0500
    > > > > James Bottomley wrote:
    > > > >
    > > > > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > > > > > Cc's added, the complete bug report is at
    > > > > > > http://lkml.org/lkml/2007/10/2/243
    > > > > > >
    > > > > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > > > > > >
    > > > > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > > > > > >...
    > > > > > >
    > > > > > > Thanks for your report.
    > > > > > >
    > > > > > > Diff'ing the dmesg's shows:
    > > > > > >
    > > > > > > <-- snip -->
    > > > > > >
    > > > > > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > > > > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > > > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > > sda: sda1
    > > > > > > + sda: p1 exceeds device capacity
    > > > > > >
    > > > > > > <-- snip -->
    > > > > > >
    > > > > > > - case MEGA_BULK_DATA:
    > > > > > > - if (scb->cmd->use_sg == 0)
    > > > > > > - length = scb->cmd->request_bufflen;
    > > > > > > - else {
    > > > > > > - struct scatterlist *sgl =
    > > > > > > - (struct scatterlist *)scb->cmd->request_buffer;
    > > > > > > - length = sgl->length;
    > > > > > > - }
    > > > > > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > > > > > - length, scb->dma_direction);
    > > > > > > - break;
    > > > > > > -
    > > > > >
    > > > > > This is the problem piece I think. We've reintroduced a very old bug:
    > > > > >
    > > > > > commit 51c928c34fa7cff38df584ad01de988805877dba
    > > > > > Author: James Bottomley
    > > > > > Date: Sat Oct 1 09:38:05 2005 -0500
    > > > > >
    > > > > > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    > > > > >
    > > > > > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > > > > > version of the READ CAPACITY command (which is what we now send them
    > > > > > since altering all SCSI internal I/O to go via the block layer). Fix
    > > > > > this (and a few other broken megaraid driver assumptions) by sending
    > > > > > the non-sg version of the command if the sg list only has a single
    > > > > > element.
    > > > > >
    > > > > > Signed-off-by: James Bottomley
    > > > > >
    > > > > > So what we have to do is put back the check for use_sg == 1 and send
    > > > > > that as a bulk transfer command.
    > > > >
    > > > > Sorry about this. Can this fix the problem?
    > > > >
    > > > > Thanks,
    > > > >
    > > > >
    > > > > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    > > > > index 3907f67..da56163 100644
    > > > > --- a/drivers/scsi/megaraid.c
    > > > > +++ b/drivers/scsi/megaraid.c
    > > > > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
    > > > >
    > > > > *len = 0;
    > > > >
    > > > > + if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
    > > > > + sg = scsi_sglist(cmd);
    > > > > + scb->dma_h_bulkdata = sg_dma_address(sg);
    > > > > + *buf = (u32)scb->dma_h_bulkdata;
    > > > > + *len = sg_dma_len(sg);
    > > > > + return 0;
    > > > > + }
    > > > > +
    > > > > scsi_for_each_sg(cmd, sg, sgcnt, idx) {
    > > > > if (adapter->has_64bit_addr) {
    > > > > scb->sgl64[idx].address = sg_dma_address(sg);
    > > > >
    > > >
    > > >
    > > > With this patch I see the correct logical disk size reported.
    > > > Thanks.

    > >
    > > Great, thanks for testing!
    > >
    > > Can you try the following patch instead of the above patch?
    > >
    > > http://marc.info/?l=linux-scsi&m=119137033016550&w=2
    > >
    > >
    > > I know the changes are pretty trivial and it should work...

    >
    > Tomo, this is the patch I added.


    Thanks. I thought that it will be sent via scsi-misc because the scsi
    accessor patch introduced this bug. But either is ok with me.

    BTW, please add my sign-off.

    -
    [SCSI] megaraid_old: fix scatter/gather for legacy megaraid cards

    Some legacy megaraid cards (!has_64bit_addr case) can't cope with the
    scatter/gather version of the READ CAPACITY command. We need to send
    the non-sg version of the command if the sg list only has a single
    element.

    commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e reintroduced this bug,
    which was fixed long ago (commit 51c928c34fa7cff38df584ad01de988805877dba).

    Signed-off-by: FUJITA Tomonori
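
The rule in the commit message can be modelled outside the kernel. The following is a simplified userspace sketch, not the driver code: `struct seg`, `struct sg_entry`, `build_transfer` and the `table_dma` parameter are hypothetical stand-ins for the mapped scatterlist, the controller's sg table, `mega_build_sglist` and the table's bus address, and the 32-bit and 64-bit table formats are collapsed into one type for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatter/gather segment. */
struct seg {
    uint64_t dma_addr;
    uint32_t dma_len;
};

/* Hypothetical stand-in for one entry of the controller's sg table. */
struct sg_entry {
    uint64_t address;
    uint32_t length;
};

/*
 * Describe a transfer to the controller.
 *
 * Returns the number of sg table entries used. A return value of 0 means
 * no table was built: *buf and *len describe a single flat buffer, so the
 * command can be sent in its non-sg form.
 */
static unsigned int build_transfer(const struct seg *segs, unsigned int nsegs,
                                   bool has_64bit_addr,
                                   struct sg_entry *table, uint32_t table_dma,
                                   uint32_t *buf, uint32_t *len)
{
    unsigned int i;

    *len = 0;

    /* Legacy (32-bit) card with one segment: bypass the sg table. */
    if (nsegs == 1 && !has_64bit_addr) {
        *buf = (uint32_t)segs[0].dma_addr;
        *len = segs[0].dma_len;
        return 0;
    }

    /* Otherwise copy every segment into the table and total the length. */
    for (i = 0; i < nsegs; i++) {
        table[i].address = segs[i].dma_addr;
        table[i].length = segs[i].dma_len;
        *len += segs[i].dma_len;
    }

    /* The controller is handed the table's address, not the data's. */
    *buf = table_dma;
    return nsegs;
}
```

In this sketch a single 8-byte segment on a legacy card comes back with a count of 0 and `*buf` pointing straight at the data, which is the case the READ CAPACITY reply falls into. Two or more segments, or any request on a 64-bit-capable card, still go through the table.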

  12. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Thu, Oct 04 2007, FUJITA Tomonori wrote:
    > On Thu, 4 Oct 2007 09:28:34 +0200
    > Jens Axboe wrote:
    >
    > > On Thu, Oct 04 2007, FUJITA Tomonori wrote:
    > > > On Wed, 3 Oct 2007 17:32:55 -0600
    > > > "Patro, Sumant" wrote:
    > > >
    > > > >
    > > > >
    > > > > > -----Original Message-----
    > > > > > From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
    > > > > > Sent: Tuesday, October 02, 2007 5:01 PM
    > > > > > To: James.Bottomley@SteelEye.com
    > > > > > Cc: bunk@kernel.org; bwindle@fint.org;
    > > > > > linux-kernel@vger.kernel.org; jens.axboe@oracle.com;
    > > > > > fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID
    > > > > > Linux; linux-scsi@vger.kernel.org
    > > > > > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
    > > > > >
    > > > > > On Tue, 02 Oct 2007 15:38:13 -0500
    > > > > > James Bottomley wrote:
    > > > > >
    > > > > > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > > > > > > Cc's added, the complete bug report is at
    > > > > > > > http://lkml.org/lkml/2007/10/2/243
    > > > > > > >
    > > > > > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > > > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > > > > > > >
    > > > > > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > > > > > > >...
    > > > > > > >
    > > > > > > > Thanks for your report.
    > > > > > > >
    > > > > > > > Diff'ing the dmesg's shows:
    > > > > > > >
    > > > > > > > <-- snip -->
    > > > > > > >
    > > > > > > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > > > > > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > > > > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > > > sda: sda1
    > > > > > > > + sda: p1 exceeds device capacity
    > > > > > > >
    > > > > > > > <-- snip -->
    > > > > > > >
    > > > > > > > - case MEGA_BULK_DATA:
    > > > > > > > - if (scb->cmd->use_sg == 0)
    > > > > > > > - length = scb->cmd->request_bufflen;
    > > > > > > > - else {
    > > > > > > > - struct scatterlist *sgl =
    > > > > > > > - (struct scatterlist *)scb->cmd->request_buffer;
    > > > > > > > - length = sgl->length;
    > > > > > > > - }
    > > > > > > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > > > > > > - length, scb->dma_direction);
    > > > > > > > - break;
    > > > > > > > -
    > > > > > >
    > > > > > > This is the problem piece I think. We've reintroduced a very old bug:
    > > > > > >
    > > > > > > commit 51c928c34fa7cff38df584ad01de988805877dba
    > > > > > > Author: James Bottomley
    > > > > > > Date: Sat Oct 1 09:38:05 2005 -0500
    > > > > > >
    > > > > > > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    > > > > > >
    > > > > > > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > > > > > > version of the READ CAPACITY command (which is what we now send them
    > > > > > > since altering all SCSI internal I/O to go via the block layer). Fix
    > > > > > > this (and a few other broken megaraid driver assumptions) by sending
    > > > > > > the non-sg version of the command if the sg list only has a single
    > > > > > > element.
    > > > > > >
    > > > > > > Signed-off-by: James Bottomley
    > > > > > >
    > > > > > > So what we have to do is put back the check for use_sg == 1 and send
    > > > > > > that as a bulk transfer command.
    > > > > >
    > > > > > Sorry about this. Can this fix the problem?
    > > > > >
    > > > > > Thanks,
    > > > > >
    > > > > >
    > > > > > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    > > > > > index 3907f67..da56163 100644
    > > > > > --- a/drivers/scsi/megaraid.c
    > > > > > +++ b/drivers/scsi/megaraid.c
    > > > > > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
    > > > > >
    > > > > > *len = 0;
    > > > > >
    > > > > > + if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
    > > > > > + sg = scsi_sglist(cmd);
    > > > > > + scb->dma_h_bulkdata = sg_dma_address(sg);
    > > > > > + *buf = (u32)scb->dma_h_bulkdata;
    > > > > > + *len = sg_dma_len(sg);
    > > > > > + return 0;
    > > > > > + }
    > > > > > +
    > > > > > scsi_for_each_sg(cmd, sg, sgcnt, idx) {
    > > > > > if (adapter->has_64bit_addr) {
    > > > > > scb->sgl64[idx].address = sg_dma_address(sg);
    > > > > >
    > > > >
    > > > >
    > > > > With this patch I see the correct logical disk size reported.
    > > > > Thanks.
    > > >
    > > > Great, thanks for testing!
    > > >
    > > > Can you try the following patch instead of the above patch?
    > > >
    > > > http://marc.info/?l=linux-scsi&m=119137033016550&w=2
    > > >
    > > >
    > > > I know the changes are pretty trivial and it should work...

    > >
    > > Tomo, this is the patch I added.

    >
    > Thanks. I thought that it will be sent via scsi-misc because the scsi
    > accessor patch introduced this bug. But either is ok with me.


    If it only affects the driver _after_ the scsi accessor patch and as
    such doesn't screw over git-block, then I'll drop it for sure.

    --
    Jens Axboe


  13. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
    >...
    > Tomo, this is the patch I added.


    Please excuse my comment in case this was already clear:

    You are aware that this bug is a regression in 2.6.23-rc and the patch
    should therefore go to Linus ASAP and not after the release of 2.6.23?

    > Jens Axboe


    cu
    Adrian

    --

    "Is there not promise of rain?" Ling Tan asked suddenly out
    of the darkness. There had been need of rain for many days.
    "Only a promise," Lao Er said.
    Pearl S. Buck - Dragon Seed


  14. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Thu, Oct 04 2007, FUJITA Tomonori wrote:
    > On Thu, 4 Oct 2007 12:48:58 +0200
    > Adrian Bunk wrote:
    >
    > > On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
    > > >...
    > > > Tomo, this is the patch I added.

    > >
    > > Please excuse my comment in case this was already clear:
    > >
    > > You are aware that this bug is a regression in 2.6.23-rc and the patch
    > > should therefore go to Linus ASAP and not after the release of 2.6.23?

    >
    > Oops, you are right. This should go via scsi-rc-fixes tree ASAP.


    Irk, the scsi accessor stuff is already in, I forgot and thought it was
    pending for 2.6.24. So rush the patch upstream please!

    --
    Jens Axboe


  15. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Thu, 4 Oct 2007 12:48:58 +0200
    Adrian Bunk wrote:

    > On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
    > >...
    > > Tomo, this is the patch I added.

    >
    > Please excuse my comment in case this was already clear:
    >
    > You are aware that this bug is a regression in 2.6.23-rc and the patch
    > should therefore go to Linus ASAP and not after the release of 2.6.23?


    Oops, you are right. This should go via scsi-rc-fixes tree ASAP.

  16. Re: 2.6.23-rc9 boot failure (megaraid?)

    On Thu, 2007-10-04 at 12:36 +0200, Jens Axboe wrote:
    > On Thu, Oct 04 2007, FUJITA Tomonori wrote:
    > > On Thu, 4 Oct 2007 09:28:34 +0200
    > > Jens Axboe wrote:
    > >
    > > > On Thu, Oct 04 2007, FUJITA Tomonori wrote:
    > > > > On Wed, 3 Oct 2007 17:32:55 -0600
    > > > > "Patro, Sumant" wrote:
    > > > >
    > > > > >
    > > > > >
    > > > > > > -----Original Message-----
    > > > > > > From: FUJITA Tomonori [mailto:fujita.tomonori@lab.ntt.co.jp]
    > > > > > > Sent: Tuesday, October 02, 2007 5:01 PM
    > > > > > > To: James.Bottomley@SteelEye.com
    > > > > > > Cc: bunk@kernel.org; bwindle@fint.org;
    > > > > > > linux-kernel@vger.kernel.org; jens.axboe@oracle.com;
    > > > > > > fujita.tomonori@lab.ntt.co.jp; Patro, Sumant; DL-MegaRAID
    > > > > > > Linux; linux-scsi@vger.kernel.org
    > > > > > > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
    > > > > > >
    > > > > > > On Tue, 02 Oct 2007 15:38:13 -0500
    > > > > > > James Bottomley wrote:
    > > > > > >
    > > > > > > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
    > > > > > > > > Cc's added, the complete bug report is at
    > > > > > > > > http://lkml.org/lkml/2007/10/2/243
    > > > > > > > >
    > > > > > > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
    > > > > > > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
    > > > > > > > > >
    > > > > > > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
    > > > > > > > > >...
    > > > > > > > >
    > > > > > > > > Thanks for your report.
    > > > > > > > >
    > > > > > > > > Diff'ing the dmesg's shows:
    > > > > > > > >
    > > > > > > > > <-- snip -->
    > > > > > > > >
    > > > > > > > > scsi0: scanning scsi channel 4 [P0] for physical devices.
    > > > > > > > > scsi0: scanning scsi channel 5 [P1] for physical devices.
    > > > > > > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
    > > > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
    > > > > > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
    > > > > > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
    > > > > > > > > sd 0:0:0:0: [sda] Write Protect is off
    > > > > > > > > sd 0:0:0:0: [sda] Asking for cache data failed
    > > > > > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
    > > > > > > > > sda: sda1
    > > > > > > > > + sda: p1 exceeds device capacity
    > > > > > > > >
    > > > > > > > > <-- snip -->
    > > > > > > > >
    > > > > > > > > - case MEGA_BULK_DATA:
    > > > > > > > > - if (scb->cmd->use_sg == 0)
    > > > > > > > > - length = scb->cmd->request_bufflen;
    > > > > > > > > - else {
    > > > > > > > > - struct scatterlist *sgl =
    > > > > > > > > - (struct scatterlist *)scb->cmd->request_buffer;
    > > > > > > > > - length = sgl->length;
    > > > > > > > > - }
    > > > > > > > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
    > > > > > > > > - length, scb->dma_direction);
    > > > > > > > > - break;
    > > > > > > > > -
    > > > > > > >
    > > > > > > > This is the problem piece I think. We've reintroduced a very old bug:
    > > > > > > >
    > > > > > > > commit 51c928c34fa7cff38df584ad01de988805877dba
    > > > > > > > Author: James Bottomley
    > > > > > > > Date: Sat Oct 1 09:38:05 2005 -0500
    > > > > > > >
    > > > > > > > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
    > > > > > > >
    > > > > > > > Some Legacy megaraid cards can't actually cope with the scatter/gather
    > > > > > > > version of the READ CAPACITY command (which is what we now send them
    > > > > > > > since altering all SCSI internal I/O to go via the block layer). Fix
    > > > > > > > this (and a few other broken megaraid driver assumptions) by sending
    > > > > > > > the non-sg version of the command if the sg list only has a single
    > > > > > > > element.
    > > > > > > >
    > > > > > > > Signed-off-by: James Bottomley
    > > > > > > >
    > > > > > > > So what we have to do is put back the check for use_sg == 1 and send
    > > > > > > > that as a bulk transfer command.
    > > > > > >
    > > > > > > Sorry about this. Can this fix the problem?
    > > > > > >
    > > > > > > Thanks,
    > > > > > >
    > > > > > >
    > > > > > > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
    > > > > > > index 3907f67..da56163 100644
    > > > > > > --- a/drivers/scsi/megaraid.c
    > > > > > > +++ b/drivers/scsi/megaraid.c
    > > > > > > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
    > > > > > >
    > > > > > > *len = 0;
    > > > > > >
    > > > > > > + if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
    > > > > > > + sg = scsi_sglist(cmd);
    > > > > > > + scb->dma_h_bulkdata = sg_dma_address(sg);
    > > > > > > + *buf = (u32)scb->dma_h_bulkdata;
    > > > > > > + *len = sg_dma_len(sg);
    > > > > > > + return 0;
    > > > > > > + }
    > > > > > > +
    > > > > > > scsi_for_each_sg(cmd, sg, sgcnt, idx) {
    > > > > > > if (adapter->has_64bit_addr) {
    > > > > > > scb->sgl64[idx].address = sg_dma_address(sg);
    > > > > > >
    > > > > >
    > > > > >
    > > > > > With this patch I see the correct logical disk size reported.
    > > > > > Thanks.
    > > > >
    > > > > Great, thanks for testing!
    > > > >
    > > > > Can you try the following patch instead of the above patch?
    > > > >
    > > > > http://marc.info/?l=linux-scsi&m=119137033016550&w=2
    > > > >
    > > > >
    > > > > I know the changes are pretty trivial and it should work...
    > > >
    > > > Tomo, this is the patch I added.

    > >
    > > Thanks. I thought that it would be sent via scsi-misc because the scsi
    > > accessor patch introduced this bug. But either is ok with me.

    >
    > If it only affects the driver _after_ the scsi accessor patch and as
    > such doesn't screw over git-block, then I'll drop it for sure.


    No, this is a release critical fix ... I'll roll it up and send it in
    for 2.6.23.

    James
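
For readers following the quoted patch, here is a minimal user-space sketch of the decision it restores: a command with exactly one scatter/gather segment on a non-64-bit adapter is handed to the controller as a plain buffer, and everything else goes through a scatter/gather table. This is an illustrative model only, not the driver code. `struct seg`, `struct hw_sge`, `build_sglist` and the `table_dma` parameter are hypothetical stand-ins, and the model uses one table entry layout for both addressing modes, which the real driver does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatter/gather segment. */
struct seg {
    uint64_t dma_addr;
    uint32_t dma_len;
};

/* Hypothetical stand-in for one entry of the table the firmware walks. */
struct hw_sge {
    uint64_t address;
    uint32_t length;
};

/*
 * Returns the number of s/g entries the controller must walk.
 * A return value of 0 means "no s/g list": *buf and *len then describe
 * the data buffer directly (the non-sg form of the command).
 */
static int build_sglist(const struct seg *segs, int nseg, int has_64bit_addr,
                        struct hw_sge *table, uint32_t table_dma,
                        uint32_t *buf, uint32_t *len)
{
    assert(segs != NULL && nseg > 0);

    *len = 0;

    /* Single segment, 32-bit addressing: bypass the s/g table entirely. */
    if (nseg == 1 && !has_64bit_addr) {
        *buf = (uint32_t)segs[0].dma_addr;
        *len = segs[0].dma_len;
        return 0;
    }

    /* General case: fill the table and point the command at it. */
    for (int i = 0; i < nseg; i++) {
        table[i].address = segs[i].dma_addr;
        table[i].length = segs[i].dma_len;
        *len += segs[i].dma_len;
    }
    *buf = table_dma;
    return nseg;
}
```

READ CAPACITY(10) returns only 8 bytes, so its data buffer normally maps to a single segment. In this model such a request takes the first branch and returns 0, which corresponds to the "non-sg version of the command" described in the 2005 commit message.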


