Tape Drive Status Incorrectly Reported on L180 Library - Veritas Net Backup

Thread: Tape Drive Status Incorrectly Reported on L180 Library

  1. Tape Drive Status Incorrectly Reported on L180 Library


    I have a StorageTek L180 library with IBM LTO-2 drives connected to
    a Sun media/master server via fibre channel. I have six drives and I recently
    migrated to a new media/master server. I'm having a problem with two of the
    drives in the system. They show up at the PROM when I run 'probe-scsi-all',
    in the OS when I run 'cfgadm -al' and 'sgscan tape', and in robtest.
    However, when I load a tape into them with robtest, or when NetBackup loads
    a tape into them, the tape seemingly disappears, a 'media not present' error
    is returned and the drive downs itself.
    I have rebuilt the drives starting at the fibre channel level by deleting
    the Fabric_WWN_Map file and running cfgadm with the 'force_configure' option,
    rebuilt my rmt drivers by running sg.build to build a new st.conf, removing
    the existing /dev/rmt drivers, editing /kernel/drv/st.conf and doing a reconfiguration
    reboot, reinstalled the sg drivers with the sg.build and sg.install commands
    and run the device configuration wizard in NetBackup to detect the drives.
    The drives are properly detected, but they don't work. Below is an example
    of what happens when I try to load a tape with robtest. Note how the drive
    can correctly return the barcode of the loaded tape, but as soon as I try
    to unload the tape the command fails with a 'media not present' error. Any
    help that anyone can give me in resolving this would be greatly appreciated.
    These drives are less than three months old, so I have a hard time believing
    that I've had two of them fail at the same time.

    Thanks,

    Jamie Jamison



  2. Re: Tape Drive Status Incorrectly Reported on L180 Library


    Jamie


    Sounds like you've done a good lump of investigation already, but it's clear
    there's something still amiss.

    You'll need to break down the problem(s) further: isolate, divide and conquer.
    I've got a setup very similar to yours: L180/LTO2/Sun/SAN (and shared storage).

    Here's what I'd do: if you're sharing the LTOs, start with the master server
    and forget about the rest for now. Manually put a tape in a drive, query all
    the drives by hand with mt and determine the rmt device for each of the tapes in
    turn. This way you'll know exactly which rmt is which. Use mt -f <device> rewoffl
    to dismount each tape from each drive in turn. At this point you'll know
    if all drives are correctly defined and whether your drives can correctly
    handle tapes.
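
    A minimal sketch of that mapping step, assuming Python 3.7 or later is
    available on the server and that mt exits non-zero when a drive has no tape
    loaded (which is how Solaris mt normally behaves). The function names and the
    /dev/rmt/*cbn device naming in the usage note are illustrative assumptions,
    not part of NetBackup:

```python
import subprocess


def probe_drive(device, run=subprocess.run):
    """Run `mt -f <device> status`; return (ok, combined output).

    `run` is injectable so the logic can be exercised without real hardware.
    """
    result = run(["mt", "-f", device, "status"],
                 capture_output=True, text=True)
    output = (result.stdout + result.stderr).strip()
    return result.returncode == 0, output


def map_loaded_drives(devices, run=subprocess.run):
    """Return {device: bool}, True where mt reported status without error."""
    return {dev: probe_drive(dev, run=run)[0] for dev in devices}
```

    With one tape loaded by hand, calling map_loaded_drives() on the list of
    no-rewind device paths (for example, whatever glob.glob("/dev/rmt/*cbn")
    returns on your host) should show exactly one True entry, which tells you
    which rmt belongs to that physical drive.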

    Repeat for all drives that are governed by shared storage, issuing the mt
    commands on each host in turn. You'll then have a breakdown of all devices
    as seen from all servers, so configuring each device from each shared
    server should be clear. Equally, any zoning/config problems should be
    evident and in need of a fix. Bear in mind that there's no reason why a shared
    drive should have the same rmt on each of the media servers.
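
    Once you have a per-host breakdown, the comparison can be sketched as below.
    This assumes you have recorded, by hand, which rmt path each physical drive
    maps to on each host. The drive labels, host names and function name are
    hypothetical:

```python
def find_missing_drives(host_maps):
    """Given {host: {drive_label: rmt_path}}, return {host: [missing labels]}.

    A drive counts as missing on a host when at least one other host can see
    it. Differing rmt paths for the same drive are expected and not flagged.
    """
    all_drives = set()
    for mapping in host_maps.values():
        all_drives.update(mapping)

    missing = {}
    for host, mapping in host_maps.items():
        absent = sorted(all_drives - set(mapping))
        if absent:
            missing[host] = absent
    return missing
```

    Any host that turns up in the result cannot see a drive that another host
    can, which points at zoning or device configuration on that host rather than
    at the drive itself.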

    Once these are all ironed out, you should be able to clearly separate
    the tape devices from the robot. Once you know the tape drives are all working,
    move on to the robot. If the robot can't mount tapes (and it does sound that
    way: it sounds like it's putting them in the wrong place if they 'disappear'),
    then either the robot itself is incorrectly configured or the definition
    of the robot config in NetBackup is wrong. I recall there's a convoluted
    set of numbering and labelling methods for the robot and its drives, and I've
    fallen foul of this: the tapes head off to the wrong drives.
    You can down all drives but one, examine each one in isolation and see
    if you can mount to it from each of the shared media servers. Then you'll
    know what's right and what's not, and hopefully see the problems.
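
    The one-drive-at-a-time check can be written down as a comparison, sketched
    here under the assumption that you note two things by hand for each robot
    drive number: the device path NetBackup is configured to use, and the device
    path where the tape was actually seen after the robot loaded it. The function
    name and data layout are hypothetical:

```python
def find_misnumbered_drives(configured, observed):
    """Compare {robot_drive_no: rmt_path} as configured against as observed.

    Returns {robot_drive_no: (configured_path, observed_path)} for every drive
    where the two disagree. observed_path is None when the tape was not seen
    on any device for that drive number.
    """
    mismatches = {}
    for drive_no, cfg_path in configured.items():
        seen_path = observed.get(drive_no)
        if seen_path != cfg_path:
            mismatches[drive_no] = (cfg_path, seen_path)
    return mismatches
```

    Two entries that point at each other's paths would suggest swapped drive
    numbers in the robot definition, which would be consistent with a tape that
    seems to disappear after the move.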

    Regards, Jim



  3. Re: Tape Drive Status Incorrectly Reported on L180 Library


    That sounds like a lot to do, a clean sweep. I'd hate to see you go through
    all these procedures and only find the cause at the very end. Worse, you could
    mess up things that already worked.

    I'm not trying to discredit Jim's methods. Before you go on that mission,
    try doing away with the switches, etc. and connect the problem drives
    directly to the HBA if you can. Then run rmt -blah rewind ...this is equal to
    unload. If that works, check your cable seating.


