Are Suns fussy about fibre channel disks?? - SUN



Thread: Are Suns fussy about fibre channel disks??

  1. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Cydrome Leader wrote:
    > I prefer preventative maintenance, not cleaning up larger messes later.


    A few "metattach" or "metareplace" are a larger mess?

    --
    Daniel

  2. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.hardware Daniel Rock wrote:
    > In comp.sys.sun.admin Cydrome Leader wrote:
    >> I prefer preventative maintenance, not cleaning up larger messes later.

    >
    > A few "metattach" or "metareplace" are a larger mess?


    I guess if you're bored and have nothing better to do with a computer than
    stuff it full of blatantly broken drives from a junk pile, then rebuild
    the data you're just going to lose anyways next week, because you're
    probably also using RAM that's "mostly" OK and SCSI card that "sort of
    works" on a system board that's "almost always fine" with power supplies
    with fans that "usually" spin, then go for it.

    Some people have slightly different standards, and know that drives don't
    fix themselves, and always just get worse and should be replaced at the
    first signs of trouble, at your convenience, not when they finally do
    catastrophically fail.







  3. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Cydrome Leader wrote:
    > In comp.sys.sun.hardware Daniel Rock wrote:
    >> In comp.sys.sun.admin Cydrome Leader wrote:
    >>> I prefer preventative maintenance, not cleaning up larger messes later.

    >>
    >> A few "metattach" or "metareplace" are a larger mess?

    >
    > Some people have slightly different standards, and know that drives don't
    > fix themselves, and always just get worse and should be replaced at the
    > first signs of trouble, at your convenience, not when they finally do
    > catastrophically fail.


    Do you just replace a flat tire or the entire car?


    Let's calculate the probability of a total failure...

    Normal SCSI drives have an AFR of ~3%. Let's say the AFR of these drives
    is 10 times higher (i.e. 30%). Let's also assume it takes on average 48 hours
    to replace a broken drive.

    What is the probability that two drives fail within 48 hours?

    The probability is ~0.05% p.a. (0.3 * 0.3 * (2/365))
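
The arithmetic in the post above can be reproduced with a minimal sketch. It assumes, as the post does, that the two drives fail independently and that the failure rate is spread evenly over the year; the function and parameter names are made up for illustration.

```python
def double_failure_probability(afr: float, repair_hours: float) -> float:
    """Rough annual probability that two independent drives both fail
    within the same repair window: afr * afr * (repair_days / 365)."""
    repair_days = repair_hours / 24.0
    return afr * afr * (repair_days / 365.0)


# Figures from the post: AFR of 30% and a 48-hour replacement window.
p = double_failure_probability(afr=0.30, repair_hours=48)
print(f"{p:.4%}")  # roughly 0.05%, as stated in the post
```

The result is only as good as the independence assumption behind it.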


    BTW this is the SMART output of one of the drives:

    Device: SEAGATE SX3146807FC Version: D010
    Device type: disk
    Transport protocol: Fibre channel (FCP-2)
    Device supports SMART and is Enabled
    Temperature Warning Disabled or Not Supported
    SMART Health Status: OK

    Elements in grown defect list: 8
    Vendor (Seagate) cache information
    Blocks sent to initiator = 323870666916662
    Vendor (Seagate/Hitachi) factory information
    number of hours powered up = 27478.45
    number of minutes until next internal SMART test = 10

  4. Re: Are Suns fussy about fibre channel disks??

    Daniel Rock wrote:
    >
    >
    > Let's calculate the probability of a total failure...
    >
    > Normal SCSI drives have an AFR of ~3%. Let's say the AFR of these drives
    > is 10 times higher (i.e. 30%). Let's also assume it takes on average 48 hours
    > to replace a broken drive.
    >
    > What is the probability that two drives fail within 48 hours?
    >
    > The probability is ~0.05% p.a. (0.3 * 0.3 * (2/365))


    You've assumed that drives fail independently of each other. If the
    drives have a higher probability of failure following some event (e.g.,
    a power cycle), then your calculation is flawed. Take another example;
    the drive has a 30% chance of not spinning up after a power cycle. The
    probability of a catastrophic failure is
    0.3 * 0.3 * P(power cycle)
    Since you know you will power cycle at some point, you have a 9% chance
    of losing your data at that point. Not a risk I'd take.
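
The difference between the two models can be put side by side in a short sketch. The 30% spin-up failure chance and the 48-hour window come from the posts; the function names are hypothetical, and the last helper (loss over several power cycles, each treated as an independent trial) is an added assumption, not something claimed in the thread.

```python
def p_loss_independent(afr: float, repair_hours: float) -> float:
    """Annual chance that both drives fail in the same repair window,
    assuming failures are independent and spread evenly over the year."""
    return afr * afr * (repair_hours / 24.0) / 365.0


def p_loss_at_power_cycle(p_no_spin_up: float) -> float:
    """Chance that both drives fail to spin up at one power cycle,
    assuming each drive fails to spin up independently of the other."""
    return p_no_spin_up * p_no_spin_up


def p_loss_over_cycles(p_per_cycle: float, cycles: int) -> float:
    """Chance of at least one double failure across several power cycles,
    treating each cycle as an independent trial (an added assumption)."""
    return 1.0 - (1.0 - p_per_cycle) ** cycles


print(f"independent, per year:  {p_loss_independent(0.30, 48):.4%}")
print(f"one power cycle:        {p_loss_at_power_cycle(0.30):.1%}")
print(f"five power cycles:      {p_loss_over_cycles(0.09, 5):.1%}")
```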

  5. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Douglas O'Neal wrote:
    > Since you know you will power cycle at some point, you have a 9% chance
    > of losing your data at that point. Not a risk I'd take.


    You are assuming that the drive will never again spin up after a power
    cycle.

    This assumption is flawed.

    --
    Daniel

  6. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.hardware Daniel Rock wrote:
    > In comp.sys.sun.admin Cydrome Leader wrote:
    >> In comp.sys.sun.hardware Daniel Rock wrote:
    >>> In comp.sys.sun.admin Cydrome Leader wrote:
    >>>> I prefer preventative maintenance, not cleaning up larger messes later.
    >>>
    >>> A few "metattach" or "metareplace" are a larger mess?

    >>
    >> Some people have slightly different standards, and know that drives don't
    >> fix themselves, and always just get worse and should be replaced at the
    >> first signs of trouble, at your convenience, not when they finally do
    >> catastrophically fail.

    >
    > Do you just replace a flat tire or the entire car?


    The car/tires analogy is best summed up like this: the person who uses broken
    drives puts leaky tires on their car and hopes they don't go flat, and if
    one does, they're fine with three good tires, and they saved a few
    dollars because they're witty.

    > Let's calculate the probability of a total failure...
    >
    > Normal SCSI drives have an AFR of ~3%. Let's say the AFR of these drives
    > is 10 times higher (i.e. 30%). Let's also assume it takes on average 48 hours
    > to replace a broken drive.
    >
    > What is the probability that two drives fail within 48 hours?


    more than you'd expect. I've seen plenty of double disk failures.

    > The probability is ~0.05% p.a. (0.3 * 0.3 * (2/365))


    I can simplify that equation into:

    it's stupid to put broken disks back into a machine, no matter what
    nonsense math you try to justify it with.

    >
    > BTW this is the SMART output of one of the drives:
    >
    > Device: SEAGATE SX3146807FC Version: D010
    > Device type: disk
    > Transport protocol: Fibre channel (FCP-2)
    > Device supports SMART and is Enabled
    > Temperature Warning Disabled or Not Supported
    > SMART Health Status: OK
    >
    > Elements in grown defect list: 8
    > Vendor (Seagate) cache information
    > Blocks sent to initiator = 323870666916662
    > Vendor (Seagate/Hitachi) factory information
    > number of hours powered up = 27478.45
    > number of minutes until next internal SMART test = 10


    You're blinding yourself.

    You know the drive doesn't always spin up. No amount of smart data cancels
    that out.

    just throw the drive out or RMA it.

  7. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Cydrome Leader wrote:
    > just throw the drive out or RMA it.


    Why should I pay for it?

    --
    Daniel

  8. Re: Are Suns fussy about fibre channel disks??

    Daniel Rock wrote:
    > In comp.sys.sun.admin Douglas O'Neal wrote:
    >> Since you know you will power cycle at some point, you have a 9% chance
    >> of losing your data at that point. Not a risk I'd take.

    >
    > You are assuming that the drive will never again spin up after a power
    > cycle.
    >
    > This assumption is flawed.


    Agreed, my probability is too high. But the point is that the 0.05%
    catastrophic probability you calculated is too low. And if we take
    a number somewhere in the middle, say 0.5% chance of catastrophic
    failure per power cycle, that would be way too high for me to trust
    with critical data.

  9. Re: Are Suns fussy about fibre channel disks??

    Daniel Rock wrote:
    > In comp.sys.sun.admin Cydrome Leader wrote:
    >> In comp.sys.sun.hardware Daniel Rock wrote:
    >>> In comp.sys.sun.admin Cydrome Leader wrote:
    >>>> I prefer preventative maintenance, not cleaning up larger messes later.
    >>> A few "metattach" or "metareplace" are a larger mess?

    >> Some people have slightly different standards, and know that drives don't
    >> fix themselves, and always just get worse and should be replaced at the
    >> first signs of trouble, at your convenience, not when they finally do
    >> catastrophically fail.

    >
    > Do you just replace a flat tire or the entire car?
    >
    >
    > Let's calculate the probability of a total failure...
    >
    > Normal SCSI drives have an AFR of ~3%. Let's say the AFR of these drives
    > is 10 times higher (i.e. 30%). Let's also assume it takes on average 48 hours
    > to replace a broken drive.


    MTBF is around 800kh for SCSI disks giving AFR=1.095%

    > What is the probability that two drives fail within 48 hours?
    >
    > The probability is ~0.05% p.a. (0.3 * 0.3 * (2/365))


    0.0000657% is more correct from above.


    >
    > BTW this is the SMART output of one of the drives:
    >
    > Device: SEAGATE SX3146807FC Version: D010


    But this disk has 1200000 h MTBF so here we have an AFR of 0.73%
    and your probability therefore 0.0000292%

    Quite a difference...


    http://www.seagate.com/support/disc/...3146807fc.html
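
The MTBF-to-AFR conversion used in this post appears to be the simple ratio of hours per year to MTBF hours (8760 / 800000 gives 1.095%). A minimal sketch under that assumption, with hypothetical function names, lets the figures be recomputed for any MTBF:

```python
HOURS_PER_YEAR = 8760


def afr_from_mtbf(mtbf_hours: float) -> float:
    """Approximate annualized failure rate as hours-per-year / MTBF.
    This linear approximation is only reasonable when MTBF >> 8760 h."""
    return HOURS_PER_YEAR / mtbf_hours


def two_drive_probability(afr: float, repair_hours: float = 48) -> float:
    """Same formula as earlier in the thread: afr * afr * (days / 365)."""
    return afr * afr * (repair_hours / 24.0) / 365.0


for mtbf in (800_000, 1_200_000):
    afr = afr_from_mtbf(mtbf)
    p = two_drive_probability(afr)
    print(f"MTBF {mtbf} h: AFR {afr:.3%}, two-drive probability {p:.7%}")
```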


    > Device type: disk
    > Transport protocol: Fibre channel (FCP-2)
    > Device supports SMART and is Enabled
    > Temperature Warning Disabled or Not Supported
    > SMART Health Status: OK
    >
    > Elements in grown defect list: 8
    > Vendor (Seagate) cache information
    > Blocks sent to initiator = 323870666916662
    > Vendor (Seagate/Hitachi) factory information
    > number of hours powered up = 27478.45
    > number of minutes until next internal SMART test = 10


  10. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Douglas O'Neal wrote:
    > that would be way too high for me to trust with critical data.


    Who said there was critical data?

    --
    Daniel

  11. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.hardware Daniel Rock wrote:
    > In comp.sys.sun.admin Cydrome Leader wrote:
    >> just throw the drive out or RMA it.

    >
    > Why should I pay for it?
    >


    You're a cheap hobbyist, you shouldn't pay for anything.

    do what suits your needs best.



  12. Re: Are Suns fussy about fibre channel disks??

    Thommy M. wrote:

    >> Let's calculate the probability of a total failure...
    >>
    >> Normal SCSI drives have an AFR of ~3%. Let's say the AFR of these drives
    >> is 10 times higher (i.e. 30%). Let's also assume it takes on average 48 hours
    >> to replace a broken drive.

    >
    > MTBF is around 800kh for SCSI disks giving AFR=1.095%
    >
    >> What is the probability that two drives fail within 48 hours?
    >>
    >> The probability is ~0.05% p.a. (0.3 * 0.3 * (2/365))

    >
    > 0.0000657% is more correct from above.



    You need to be careful in interpreting MTBF of disks. The MTBF is based
    on the assumption that the disk will be replaced (even if working) at
    the end of the service life, which is typically 5 years for a SCSI disk
    - I found that on the Seagate web site once.

    An MTBF of 1,000,000 hours does *not* mean the disks will last an
    average time of 1,000,000 hours or 114 years if you switch them on and
    never replace them. They will on average last a LOT less than that.
    Although I have no data on it, I doubt any single disk would be working
    114 years later!
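
One way to read an MTBF figure in line with this point is as a failure rate for a population of disks during their service life, not as a lifetime for a single disk. The sketch below assumes a constant failure rate within the service life; the fleet size of 1000 disks and the function name are made up for illustration.

```python
HOURS_PER_YEAR = 8760


def expected_failures(n_disks: int, mtbf_hours: float,
                      service_years: float = 5) -> float:
    """Expected number of failures in a fleet over the service life,
    assuming a constant failure rate of 1/MTBF per disk-hour."""
    disk_hours = n_disks * service_years * HOURS_PER_YEAR
    return disk_hours / mtbf_hours


# Naive reading: one disk lasts MTBF hours.
naive_years = 1_000_000 / HOURS_PER_YEAR
print(f"naive single-disk lifetime: {naive_years:.0f} years")

# Population reading: failures expected among 1000 disks in 5 years.
print(f"expected failures: {expected_failures(1000, 1_000_000):.1f}")
```

Under these assumptions a fleet of 1000 such disks would still see a few dozen failures within the 5-year service life.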

    During that 5 years, the disk is likely to be under warranty anyway.

    I think the point at which one disposes of disks depends on one's
    circumstances. If it's an important server in your company, it might be
    wise to replace them every 5 years. If it's on a less important system,
    you might not do so until logs indicate a problem. If it's for a home
    machine, and not one used to store important information, one might
    tolerate a few errors.

    I don't know how many Suns are used by hobbyists, but I suspect there
    are quite a few. A previous employer had a site licence for a piece of
    software, which allowed one to use a copy at home. I asked for a SPARC
    licence for use at home, and was initially declined this as "a Sun SPARC
    is not considered a home computer". After some discussions they agreed
    as a "one-off".

    I would not personally use a disk that did not reliably spin up (even at
    home as a scratch disk), but I would not criticise someone who felt in
    their circumstances that was appropriate. Clearly anyone doing that on an
    important server in their company would need their head tested!







  13. Re: Are Suns fussy about fibre channel disks??

    On 2007-10-03, Dave wrote:

    > If it's an important server in your company, it might be
    > wise to replace them every 5 years.


    You jest. We have over 3000 Unix servers. Wild guesstimate, 12,000 disks.
    Replace 2400 disks a year? Nonsense.

    --
    "Religion poisons everything."
    [email me at huge {at} huge (dot) org uk]

  14. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Huge wrote:
    > On 2007-10-03, Dave wrote:
    >
    >> If it's an important server in your company, it might be
    >> wise to replace them every 5 years.

    >
    > You jest. We have over 3000 Unix servers. Wild guesstimate, 12,000 disks.
    > Replace 2400 disks a year? Nonsense.


    Just let the machines age and you will be replacing that many at some
    point.

  15. Re: Are Suns fussy about fibre channel disks??

    Daniel Rock wrote:
    > In comp.sys.sun.admin Cydrome Leader wrote:
    >> just throw the drive out or RMA it.

    >
    > Why should I pay for it?
    >

    Rock on Daniel, I think you know more about disks than the others know
    about cars :-)
    /Jorgen

  16. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.hardware Jorgen Moquist wrote:
    > Daniel Rock wrote:
    >> In comp.sys.sun.admin Cydrome Leader wrote:
    >>> just throw the drive out or RMA it.

    >>
    >> Why should I pay for it?
    >>

    > Rock on Daniel, I think you know more about disks than the others know
    > about cars :-)
    > /Jorgen


    yup, it's always best to use broken parts, and when things do fail, to do
    nothing. Problems with machines only get better with time, they're self
    healing.

  17. Re: Are Suns fussy about fibre channel disks??

    Cydrome Leader wrote:
    > In comp.sys.sun.hardware Jorgen Moquist wrote:
    >> Daniel Rock wrote:
    >>> In comp.sys.sun.admin Cydrome Leader wrote:
    >>>> just throw the drive out or RMA it.
    >>> Why should I pay for it?
    >>>

    >> Rock on Daniel, I think you know more about disks than the others know
    >> about cars :-)
    >> /Jorgen

    >
    > yup, it's always best to use broken parts, and when things do fail, to do
    > nothing. Problems with machines only get better with time, they're self
    > healing.

    scsi and fcal disks are "selfhealing", lots of spare tracks/cyls.
    replacement and cacheing tables, one spare sector per cyl and two spare
    cyls per surface as i recall.
    and several copies of the bootcode/os/rtc.
    very easy to monitor the grown defect list or ioerrors or use SMART.
    can only see one scary situation, if 2 drives are manufactured the same day.
    well if having backups :-)
    /jorgen

  18. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Jorgen Moquist wrote:
    > Cydrome Leader wrote:
    >> In comp.sys.sun.hardware Jorgen Moquist wrote:
    >>> Daniel Rock wrote:
    >>>> In comp.sys.sun.admin Cydrome Leader wrote:
    >>>>> just throw the drive out or RMA it.
    >>>> Why should I pay for it?
    >>>>
    >>> Rock on Daniel, I think you know more about disks than the others know
    >>> about cars :-)
    >>> /Jorgen

    >>
    >> yup, it's always best to use broken parts, and when things do fail, to do
    >> nothing. Problems with machines only get better with time, they're self
    >> healing.

    > scsi and fcal disks are "selfhealing", lots of spare tracks/cyls.
    > replacement and cacheing tables, one spare sector per cyl and two spare
    > cyls per surface as i recall.


    None of this keeps errors from happening in the first place. Spare sectors
    don't make unrecoverable read errors not happen. It's also pretty well known
    that once you start to see errors (and that means the drive is warning
    you because there's something wrong), things only go downhill from there.

    these media errors tend to grow. I know this is a link to PC magazine of all
    places, but it's a short link

    http://www.pcmag.com/encyclopedia_te...i=55545,00.asp


    > and several copies of the bootcode/os/rtc.
    > very easy to monitor the grown defect list or ioerrors or use SMART.
    > can only see one scary situation, if 2 drives are manufactured the same day.
    > well if having backups :-)
    > /jorgen


    Plenty of drive problems are mechanical. Having 9000% spare data on
    platters doesn't help if you crashed a head or your disk won't spin up.

    drives don't heal themselves. They never improve the state they're in.
    If they start to throw errors, replace them.



  19. Re: Are Suns fussy about fibre channel disks??

    Cydrome Leader wrote:
    > In comp.sys.sun.admin Jorgen Moquist wrote:
    >
    >>Cydrome Leader wrote:
    >>
    >>>In comp.sys.sun.hardware Jorgen Moquist wrote:
    >>>
    >>>>Daniel Rock wrote:
    >>>>
    >>>>>In comp.sys.sun.admin Cydrome Leader wrote:
    >>>>>
    >>>>>>just throw the drive out or RMA it.
    >>>>>
    >>>>>Why should I pay for it?
    >>>>>
    >>>>
    >>>>Rock on Daniel, I think you know more about disks than the others know
    >>>>about cars :-)
    >>>>/Jorgen
    >>>
    >>>yup, it's always best to use broken parts, and when things do fail, to do
    >>>nothing. Problems with machines only get better with time, they're self
    >>>healing.

    >>
    >>scsi and fcal disks are "selfhealing", lots of spare tracks/cyls.
    >>replacement and cacheing tables, one spare sector per cyl and two spare
    >>cyls per surface as i recall.

    >
    >
    > None of this keeps errors from happening in the first place. Spare sectors
    > don't make unrecoverable read errors not happen. It's also pretty well known
    > that once you start to see errors (and that means the drive is warning
    > you because there's something wrong), things only go downhill from there.
    >


    Disk drives can and do survive a block becoming unreadable. SCSI drives
    can "revector" a bad block. In some operating systems, the disk driver
    works in conjunction with the disk to copy data from a questionable
    block to a replacement block. This looks, to the user, like "self
    healing". If a bad block is revectored, it's not an indication of a
    serious problem.

    What IS an indication of a serious problem is A PATTERN of bad blocks
    being revectored. When you see that, it's time to replace the disk.
    Do it NOW! Tomorrow may be too late.
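
The distinction drawn here between a single revectored block and a pattern of them could be checked from periodic readings of the grown defect list count. The sketch below is only an illustration: the threshold of two increases and the sample readings are hypothetical, and a real policy would need tuning for the drives in question.

```python
def defect_growth(samples: list[int]) -> list[int]:
    """Differences between consecutive grown-defect-list readings."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def shows_pattern(samples: list[int], min_increases: int = 2) -> bool:
    """True if the count rose in at least `min_increases` intervals.
    A single isolated increase is treated as a one-off revector."""
    increases = sum(1 for delta in defect_growth(samples) if delta > 0)
    return increases >= min_increases


one_off = [8, 8, 9, 9, 9]     # one block revectored, then stable
growing = [8, 10, 13, 19]     # count rises at every reading

print(shows_pattern(one_off))   # False
print(shows_pattern(growing))   # True
```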


  20. Re: Are Suns fussy about fibre channel disks??

    In comp.sys.sun.admin Richard B. Gilbert wrote:
    > Cydrome Leader wrote:
    >> In comp.sys.sun.admin Jorgen Moquist wrote:
    >>
    >>>Cydrome Leader wrote:
    >>>
    >>>>In comp.sys.sun.hardware Jorgen Moquist wrote:
    >>>>
    >>>>>Daniel Rock wrote:
    >>>>>
    >>>>>>In comp.sys.sun.admin Cydrome Leader wrote:
    >>>>>>
    >>>>>>>just throw the drive out or RMA it.
    >>>>>>
    >>>>>>Why should I pay for it?
    >>>>>>
    >>>>>
    >>>>>Rock on Daniel, I think you know more about disks than the others know
    >>>>>about cars :-)
    >>>>>/Jorgen
    >>>>
    >>>>yup, it's always best to use broken parts, and when things do fail, to do
    >>>>nothing. Problems with machines only get better with time, they're self
    >>>>healing.
    >>>
    >>>scsi and fcal disks are "selfhealing", lots of spare tracks/cyls.
    >>>replacement and cacheing tables, one spare sector per cyl and two spare
    >>>cyls per surface as i recall.

    >>
    >>
    >> None of this keeps errors from happening in the first place. Spare sectors
    >> don't make unrecoverable read errors not happen. It's also pretty well known
    >> that once you start to see errors (and that means the drive is warning
    >> you because there's something wrong), things only go downhill from there.
    >>

    >
    > Disk drives can and do survive a block becoming unreadable. SCSI drives
    > can "revector" a bad block. In some operating systems, the disk driver
    > works in conjunction with the disk to copy data from a questionable
    > block to a replacement block. This looks, to the user, like "self
    > healing". If a bad block is revectored, it's not an indication of a
    > serious problem.
    >
    > What IS an indication of a serious problem is A PATTERN of bad blocks
    > being revectored. When you see that, it's time to replace the disk.
    > Do it NOW! Tomorrow may be too late.
    >


    and it generally always is a pattern of failing blocks, not one random one
    and then things are great again for years.


