xvm I/O errors - SGI

  1. xvm I/O errors


    I'm having a hard time deciphering the xvm errors that are appearing in my
    Origin200's syslog. It's obvious that there's a bad disk somewhere in the
    FC Clariion array, but I'm not sure which disk needs to be replaced.

    The error messages I'm getting are:


    Jul 30 08:49:57 6A:tristan unix: |$(3)<6>dksc4d2vol: SCSI driver error:
    Command timed out

    Jul 30 08:49:57 4A:tristan unix: WARNING: XVM: READ I/O error - errno=5
    dev=0xf4 bp=0xa8000000256a5500 b_flags=0xd b_addr=0xa800000025f86000
    b_pages=0x0 b_blkno=0xc08e80, ior=0xa800000047284300
    io_flags=0xc000000000004009 io_resid=0xa800000000000000
    io_error=0xa8000000be030001

    Jul 30 08:49:57 4A:tristan unix: WARNING: XVM: original buffer for
    bp=0xa8000000256a5500 is topbp=0xa800000000766788 flags=0x9
    b_addr=0xa800000025f86000 b_pages=0x0 b_blkno=0x784f200 b_iodone=0x0

    Jul 30 08:49:57 1A:tristan unix: ALERT: I/O error in filesystem
    ("/data03") meta-data dev 0x43001a9 block 0x784f200

    Jul 30 08:49:57 1A:tristan unix: ALERT: I/O error in filesystem
    ("/data03") meta-data dev 0x43001a9 block 0x784f200

    Jul 30 08:49:57 2A:tristan unix: ("xfs_trans_read_buf") b_error 5
    b_bcount 8192 b_resid 0

    Jul 30 08:49:57 2A:tristan unix: ("xfs_trans_read_buf") b_error 5
    b_bcount 8192 b_resid 0



    I'd have assumed that xvm would tell me which disk was going bad, but I
    can't tell from the error messages. I presume it's one of the devices
    mentioned, but which disk in the array does that refer to?

    The Origin200 is running 6.5.24m, and the attached array is used as a JBOD
    configured as a single stripe across ten 18GB 10,000rpm FC disks. I have
    two spare disks on hand, ready to go into the array, but I don't know
    which disk to replace.

    Any clues, anyone?

    --
    Steven Harrison
    Unix Systems Administrator

    0 OK, 0:1

  2. Re: xvm I/O errors

    If a disk is bad, an amber LED will light up above it on the array, or
    at least that is the case on my Clariion.
    Mine is a DPE; I'm not sure whether the DAE does this as well.

    Zach

  3. Re: xvm I/O errors

    Zach McDanel wrote:
    > If a disk is bad, an amber LED will light up above it on the array, or
    > at least that is the case on my Clariion.
    > Mine is a DPE; I'm not sure whether the DAE does this as well.
    >
    > Zach


    Hi,

    You need to replace disk 2 on controller 4.
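
    That reading comes from the device name in the first syslog line: assuming
    the name follows a dksc<controller>d<target>vol layout, dksc4d2vol is
    disk 2 on controller 4, and errno=5 in the XVM line is the generic EIO.
    A rough Python sketch of that decoding follows. The helper names are
    hypothetical (this is not an IRIX tool), and the name pattern is an
    assumption inferred from this one log line.

```python
import errno
import re

# Assumed pattern: "dksc<controller>d<target>vol", inferred from the
# device name in the syslog line quoted in this thread.
DEVICE_RE = re.compile(r"dksc(\d+)d(\d+)vol")
ERRNO_RE = re.compile(r"errno=(\d+)")


def parse_scsi_device(line):
    """Return (controller, target) from a SCSI driver error line, or None."""
    match = DEVICE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_errno(line):
    """Return the symbolic errno name from an XVM error line, or None."""
    match = ERRNO_RE.search(line)
    if match is None:
        return None
    return errno.errorcode.get(int(match.group(1)))


# Sample lines copied from the syslog excerpt in the original post.
scsi_line = ("Jul 30 08:49:57 6A:tristan unix: |$(3)<6>dksc4d2vol: "
             "SCSI driver error: Command timed out")
xvm_line = ("Jul 30 08:49:57 4A:tristan unix: WARNING: XVM: READ I/O error "
            "- errno=5 dev=0xf4")

print(parse_scsi_device(scsi_line))  # (4, 2)
print(parse_errno(xvm_line))         # EIO
```

    The XVM and XFS lines only report the failed block on the striped volume;
    the SCSI driver line is the one that names a physical device.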

    Regards,
    Cor


  4. Re: xvm I/O errors


    On Fri, 30 Jul 2004, Zach McDanel wrote:

    > If a disk is bad, an amber LED will light up above it on the array, or
    > at least that is the case on my Clariion.
    > Mine is a DPE; I'm not sure whether the DAE does this as well.


    I did indeed get an amber LED, but only after a power cycle of both the
    array and the Origin.

    The disk was replaced and the data was restored. Thanks to all who helped!

    --
    Steven Harrison
    Unix Systems Administrator

    9 STOP statement, 0:1
