SCSI vs SATA hard disks - Hardware




  1. Re: SCSI vs SATA hard disks

    Aragorn,

    Thanks for the clear explanations. I'm much more on top of the situation
    now.

    --

    Haines Brown, KB1GRM




  2. Re: SCSI vs SATA hard disks

    Aragorn writes:


    >I'm not so sure that's a market trend. It just so happens to be that SCSI
    >is no longer considered useful in the home and office desktop market, but
    >servers are most definitely still using SCSI.


    >However, the SCSI that's being used and marketed today is no longer of the
    >parallel variant. Just as parallel ATA had to make way for serial ATA,
    >SCSI has by now already started making way for serial attached SCSI (SAS)
    >and iSCSI for storage area networks.


    I'm sorry, but you'll have to go a long way to convince me that the
    parentage of SAS is anything but:

    MarketDroid 1): Damn, everyone is buying SATA drives; the price is
    falling and we are screwed. How do we come up with a way to charge a
    premium without really doing a lot of work?

    MD2: I've got it! We'll rebadge SATA into something with SCSI in the
    name, so it sounds beefier... hmmm that's it.. Serial Attached SCSI.
    We save the investment in "SCSI" and build up the hype around it.

    MD1: But SATA really has some pluses.. Are we going to ignore them?

    MD2: We'll use Gates' ploy -- embrace and extend! We'll tweak some
    SATA specs here and there, adding some things we can talk up. But
    we'll save a bundle on connectors alone.

    ......



    In the past, SCSI server drives brought you two things: performance and
    reliability. [Think of those 9 GB Barracudas..].

    Now the issues are: Does SAS really do that much over SATA, for your
    case? And: Does paying SAS prices really give you more reliable drives,
    or just different electronics?

    --
    A host is a host from coast to coast.................wb8foz@nrk.com
    & no one will talk to a host that's close........[v].(301) 56-LINUX
    Unless the host (that isn't close).........................pob 1433
    is busy, hung or dead....................................20915-1433

  3. Re: SCSI vs SATA hard disks

    On Thursday 25 September 2008 19:14, someone identifying as *David Lesher*
    wrote in /comp.os.linux.hardware:/

    > Aragorn writes:
    >
    >> I'm not so sure that's a market trend. It just so happens to be that
    >> SCSI is no longer considered useful in the home and office desktop
    >> market, but servers are most definitely still using SCSI.
    >>
    >> However, the SCSI that's being used and marketed today is no longer of
    >> the parallel variant. Just as parallel ATA had to make way for serial
    >> ATA, SCSI has by now already started making way for serial attached SCSI
    >> (SAS) and iSCSI for storage area networks.

    >
    > I'm sorry, but you'll have to go a long way to convince me that the parentage of
    > SAS is anything but:
    >
    > MarketDroid 1): Damn, everyone is buying SATA drives; the price is
    > falling and we are screwed. How do we come up with a way to charge a
    > premium without really doing a lot of work?
    >
    > MD2: I've got it! We'll rebadge SATA into something with SCSI in the
    > name, so it sounds beefier... hmmm that's it.. Serial Attached SCSI.
    > We save the investment in "SCSI" and build up the hype around it.
    >
    > MD1: But SATA really has some pluses.. Are we going to ignore them?
    >
    > MD2: We'll use Gates' ploy -- embrace and extend! We'll tweak some
    > SATA specs here and there, adding some things we can talk up. But
    > we'll save a bundle on connectors alone.


    As with everything, technology is developed mainly to be marketed rather
    than for progress, but SAS is far more than what you describe above.

    The serialization of SCSI does offer real benefits for large enterprises
    and data centers, and it all falls within the spirit of extending the
    possibilities of SCSI; e.g. there is also iSCSI now, which tunnels SCSI
    commands over TCP/IP networks.

    > In the past, SCSI server drives brought you two things: performance and
    > reliability. [Think of those 9 GB Barracudas..].


    It still does. The drives themselves - at least the ones I know - are
    basically the same mechanisms as the U320 drives, but their maximum
    throughput is higher, whereas a parallel SCSI chain could become a
    bottleneck.

    > Now the issues are: Does SAS really do that much over SATA, for your
    > case? And: Does paying SAS prices really give you more reliable drives,
    > or just different electronics?


    SAS drives *are* SCSI drives, so they do have all the goodies that SCSI
    comes with - e.g. ECC, logging, tagged command queueing - whereas a SATA
    drive is essentially a serialized ATA drive, an attempt to make ATA/IDE
    more SCSI-like.

    Enterprise-grade SATA drives are probably just as reliable as SAS/SCSI, or
    nearly so, but they lack the features that made SCSI stand out. SATA still
    is ATA, don't forget that. ;-) Also, not all SATA drives - not even in the
    enterprise-grade range - are fit to be used in RAID arrays, whereas all SAS
    drives are RAID-rated.

    On the other hand, if you care more about cost-effectiveness than features,
    then SATA offers (far) more storage per Dollar/Euro than SCSI. But then
    again, this was already the case for PATA - aka IDE, although SATA is IDE
    as well - versus parallel SCSI.

    So the bottom line is that if you're thinking about marketing scams, the
    scam would rest with SATA rather than with SAS, because SATA was intended
    to mimic SCSI over an IDE bus, yet had to introduce its own NCQ (native
    command queueing) in place of the SCSI-specific TCQ (tagged command
    queueing), because TCQ on ATA performed poorly. Also, the difference in
    retail price between a SAS disk and a U320 SCSI disk is largely negligible.
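
    Since libata presents SATA disks through the SCSI layer on Linux, one quick
    way to see what queueing the kernel actually negotiated is the per-device
    queue depth in sysfs. A small Python sketch (standard Linux sysfs paths;
    not every device type populates them, and NCQ tops out at 32 tags while
    SAS/SCSI TCQ can go much higher):

    # List block devices with the model string and command queue depth the
    # kernel reports.  Reads standard Linux sysfs attributes; run on Linux.
    from pathlib import Path

    for dev in sorted(Path("/sys/block").iterdir()):
        model = dev / "device" / "model"
        qdepth = dev / "device" / "queue_depth"
        if model.exists() and qdepth.exists():
            print(dev.name,
                  model.read_text().strip(),
                  "queue_depth =", qdepth.read_text().strip())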

    --
    *Aragorn*
    (registered GNU/Linux user #223157)

  4. Re: SCSI vs SATA hard disks

    Haines Brown wrote:
    > ebenZEROONE@verizon.net (Hactar) writes:
    >
    >> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >> Haines Brown wrote:
    >>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>> that I will henceforth have to accept drive unreliability?

    >> _All_ drives are unreliable to some degree. The ultimate in
    >> computer-readable reliability is probably Tyvek punched tape.

    >
    > Yes, but my subjective impression is that there is a very wide
    > difference in reliability. Of the dozen SCSI drives I've used over the
    > years, only one failed on me; reading on line discussions and reviews,
    > it seems that SATA drives fail regularly.


    I've been using SATA for the last few years and haven't had any issues
    with them. I would recommend them because they provide high data
    transfer rates and high RPMs at low cost. Not to mention their
    connectors are so small that you can have six plugged into a very small
    area on the motherboard and it's still easy to manage. The cables
    themselves are also smaller than the older PATA cables, which I also love.

    I bought 2 320GB SATA2 drives 2 years ago on Newegg. I put them in a
    mirror and have had no issues. My PC is on 24x7 too. About a year ago I
    got 2 80GB SATA2 drives and put them into a stripe array (on the same
    controller as the other 2 320GB drives). Again, no issues to date.

  5. Re: SCSI vs SATA hard disks

    Hactar wrote:
    > In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
    > Haines Brown wrote:
    >> ebenZEROONE@verizon.net (Hactar) writes:
    >>
    >>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >>> Haines Brown wrote:
    >>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>>> that I will henceforth have to accept drive unreliability?
    >>> _All_ drives are unreliable to some degree. The ultimate in
    >>> computer-readable reliability is probably Tyvek punched tape.

    >> Yes, but my subjective impression is that there is a very wide
    >> difference in reliability. Of the dozen SCSI drives I've used over the
    >> years, only one failed on me; reading on line discussions and reviews,
    >> it seems that SATA drives fail regularly.
    >>
    >> I guess my question comes down to, why should one bother these days with
    >> the added expense of SCSI hard disks?

    >
    > It is my impression (which may be false and/or out of date) that the
    > instances of drive hardware that are matched with SCSI controllers are the
    > more reliable (longer-lasting) ones.


    This is a common misconception. The interface is irrelevant; the mean
    time between failures is what's relevant. Essentially you pay more for a
    disk with a longer mean time between failures, meaning it's less likely
    to fail. And yes, SCSI is dying.

  6. Re: SCSI vs SATA hard disks

    criten wrote:
    > Hactar wrote:
    >> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
    >> Haines Brown wrote:
    >>> ebenZEROONE@verizon.net (Hactar) writes:
    >>>
    >>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >>>> Haines Brown wrote:
    >>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>>>> that I will henceforth have to accept drive unreliability?
    >>>> _All_ drives are unreliable to some degree. The ultimate in
    >>>> computer-readable reliability is probably Tyvek punched tape.
    >>> Yes, but my subjective impression is that there is a very wide
    >>> difference in reliability. Of the dozen SCSI drives I've used over the
    >>> years, only one failed on me; reading on line discussions and reviews,
    >>> it seems that SATA drives fail regularly.
    >>>
    >>> I guess my question comes down to, why should one bother these days with
    >>> the added expense of SCSI hard disks?

    >>
    >> It is my impression (which may be false and/or out of date) that the
    >> instances of drive hardware that are matched with SCSI controllers are the
    >> more reliable (longer-lasting) ones.


    > This is a common misconception. The interface is irrelevant,


    Well, it makes a difference; it's just hard to say whether the interface
    improves the life of the drive.

    > the 'mean
    > time failure rate' is most certainly relevant. Essentially you pay more
    > for a disk with a longer mean time failure rate, meaning its less likely
    > to fail.


    HAHAHAHA!! Thanks, I needed a good laugh.....

    Sorry, not putting you down, just the numbers they toss out.

    I used to work for a company that wrote software for figuring out MTBF
    (Mean Time Between Failures) and spent a lot of time working with
    reliability engineers. MTBF is a bit of a guess at best.

    Okay, let's compare two Seagate drives:

    Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)

    MTBF 700,000 hours (79 years!)


    Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)

    MTBF 1,200,000 hours (136 years!)

    From the numbers above you might think the SAS drive is going to last
    twice as long. I don't think either of these drives is going to last
    50+ years unused in storage, let alone in a running system! You might
    see 5-7 years of 24/7 running, at best, before they start to drop like
    flies.

    To estimate MTBF you combine the failure rates of the parts: for parts in
    series the rates add, and the MTBF is the reciprocal of the total. So the
    more parts you use, the lower the MTBF tends to be. You can tweak the
    numbers by using fewer or higher-quality parts. If you can halve the
    number of parts the MTBF gets better: 50 average parts give a worse MTBF,
    25 average parts a better one. But 50 high-quality parts can still have a
    better MTBF than 25 average parts. You can also use a few very low
    failure-rate parts plus lots of low-quality parts and get a better MTBF
    (on paper) than with all average parts, and the thing will fail more often
    than the MTBF would make you think. Then there are other factors like
    operating temperature, humidity, and environment (dirty office / clean
    computer room / inside a flight computer in a jet / etc.).
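
    A back-of-the-envelope sketch of that parts arithmetic in Python (purely
    illustrative; the part counts and per-part failure rates below are invented
    for the example, not taken from any real drive):

    # Toy series-system MTBF: for independent parts in series, the failure
    # rates add, and the MTBF is the reciprocal of the total rate.
    def series_mtbf(failure_rates_per_hour):
        return 1.0 / sum(failure_rates_per_hour)

    average_part = 1e-7   # assumed per-hour failure rate of an "average" part
    premium_part = 2e-8   # assumed per-hour failure rate of a better part

    print(series_mtbf([average_part] * 50))   # 50 average parts ->   200,000 h
    print(series_mtbf([average_part] * 25))   # 25 average parts ->   400,000 h
    print(series_mtbf([premium_part] * 50))   # 50 premium parts -> 1,000,000 h

    Halving the part count doubles the paper MTBF, and 50 good parts still beat
    25 average ones - which is exactly why the headline number says little
    about real-world life.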

    At best you can figure from the MTBF that they're using either better or
    fewer parts. But with something that's at least 10 times off (7 vs. 70
    years), you can't judge by MTBF.

    MTBF looks great to marketing/sales people but has little real-world
    value for users.

    --
    Barry Keeney
    Chaos Consulting
    email barrykchaoscon.com

    "Rap is Square Dancing gone terribly, terribly Wrong...."

  7. Re: SCSI vs SATA hard disks



    On Mon, 20 Oct 2008, Barry Keeney wrote:

    > criten wrote:
    >> Hactar wrote:
    >>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
    >>> Haines Brown wrote:
    >>>> ebenZEROONE@verizon.net (Hactar) writes:
    >>>>
    >>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >>>>> Haines Brown wrote:
    >>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>>>>> that I will henceforth have to accept drive unreliability?
    >>>>> _All_ drives are unreliable to some degree. The ultimate in
    >>>>> computer-readable reliability is probably Tyvek punched tape.
    >>>> Yes, but my subjective impression is that there is a very wide
    >>>> difference in reliability. Of the dozen SCSI drives I've used over the
    >>>> years, only one failed on me; reading on line discussions and reviews,
    >>>> it seems that SATA drives fail regularly.
    >>>>
    >>>> I guess my question comes down to, why should one bother these days with
    >>>> the added expense of SCSI hard disks?
    >>>
    >>> It is my impression (which may be false and/or out of date) that the
    >>> instances of drive hardware that are matched with SCSI controllers are the
    >>> more reliable (longer-lasting) ones.

    >
    >> This is a common misconception. The interface is irrelevant,

    >
    > Well it makes a difference, just hard to say if the interface
    > improves the life of the drive.
    >
    >> the 'mean
    >> time failure rate' is most certainly relevant. Essentially you pay more
    >> for a disk with a longer mean time failure rate, meaning its less likely
    >> to fail.

    >
    > HAHAHAHHA!! Thanks, I needed a good laugh.....
    >
    > Sorry, Not putting you down, Just the numbers they toss out.
    >
    > I used to work for a company what wrote software for figuring
    > out the MTBF (Mean Time Between Failures) and spent a lot of
    > time working with reliability engineers. MTBF is a bit of a
    > guess at best.


    It's a pity you did not learn what MTBF actually refers to.


    >
    > Okay, Lets compare two seagate drives:
    >
    > Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
    >
    > MTBF 700,000 hours (79 years!)
    >
    >
    > Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
    >
    > MTBF 1,200,000 hours (136 years!)
    >
    > From the numbers about you might think the SAS drive is going to
    > last twice as long. I don't think either of these drives are going
    > to last 50+ years unused in storage let alone in a running system!
    > You might see 5-7 years of 24/7 running, at best, before they're
    > going to start to drop like flies.
    >



    That's not what MTBF is intended to measure. You are claiming that MTBF
    should equal lifetime and it does not.

    Essentially, MTBF measures the likelihood of a random failure, NOT an
    end-of-life failure. Arguably, MTBF is only useful to people who run large
    datacenters with many disks -- they can use MTBF to estimate the failure
    rate of their drives.
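
    To put a number on that use of MTBF, here is a small sketch (it assumes the
    constant-failure-rate model that a spec-sheet MTBF implies, which is the
    contested assumption in this thread):

    # Expected failures per year in a fleet of drives, under a constant
    # failure rate.  MTBF figures are the Seagate numbers quoted above.
    HOURS_PER_YEAR = 8766

    def expected_failures_per_year(n_drives, mtbf_hours):
        return n_drives * HOURS_PER_YEAR / mtbf_hours

    print(expected_failures_per_year(1200, 1_200_000))  # ~8.8 failures/year
    print(expected_failures_per_year(1200, 700_000))    # ~15 failures/year
    print(1_200_000 / 1200)  # mean hours between failures in the fleet: 1000 h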


  8. Re: SCSI vs SATA hard disks

    Whoever wrote:


    > On Mon, 20 Oct 2008, Barry Keeney wrote:


    >> criten wrote:
    >>> Hactar wrote:
    >>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
    >>>> Haines Brown wrote:
    >>>>> ebenZEROONE@verizon.net (Hactar) writes:
    >>>>>
    >>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >>>>>> Haines Brown wrote:
    >>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>>>>>> that I will henceforth have to accept drive unreliability?
    >>>>>> _All_ drives are unreliable to some degree. The ultimate in
    >>>>>> computer-readable reliability is probably Tyvek punched tape.
    >>>>> Yes, but my subjective impression is that there is a very wide
    >>>>> difference in reliability. Of the dozen SCSI drives I've used over the
    >>>>> years, only one failed on me; reading on line discussions and reviews,
    >>>>> it seems that SATA drives fail regularly.
    >>>>>
    >>>>> I guess my question comes down to, why should one bother these days with
    >>>>> the added expense of SCSI hard disks?
    >>>>
    >>>> It is my impression (which may be false and/or out of date) that the
    >>>> instances of drive hardware that are matched with SCSI controllers are the
    >>>> more reliable (longer-lasting) ones.

    >>
    >>> This is a common misconception. The interface is irrelevant,

    >>
    >> Well it makes a difference, just hard to say if the interface
    >> improves the life of the drive.
    >>
    >>> the 'mean
    >>> time failure rate' is most certainly relevant. Essentially you pay more
    >>> for a disk with a longer mean time failure rate, meaning its less likely
    >>> to fail.

    >>
    >> HAHAHAHHA!! Thanks, I needed a good laugh.....
    >>
    >> Sorry, Not putting you down, Just the numbers they toss out.
    >>
    >> I used to work for a company what wrote software for figuring
    >> out the MTBF (Mean Time Between Failures) and spent a lot of
    >> time working with reliability engineers. MTBF is a bit of a
    >> guess at best.


    > It's a pity you did not learn what MTBF actually refers to.


    It's the "average time between failures of a system" that's what
    it means.

    I wasn't saying *HOW* it should be used in the big picture.

    >>
    >> Okay, Lets compare two seagate drives:
    >>
    >> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
    >>
    >> MTBF 700,000 hours (79 years!)
    >>
    >>
    >> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
    >>
    >> MTBF 1,200,000 hours (136 years!)
    >>
    >> From the numbers about you might think the SAS drive is going to
    >> last twice as long. I don't think either of these drives are going
    >> to last 50+ years unused in storage let alone in a running system!
    >> You might see 5-7 years of 24/7 running, at best, before they're
    >> going to start to drop like flies.
    >>



    > That's not what MTBF is intended to measure. You are claiming that MTBF
    > should equal lifetime and it does not.


    No, that's not my claim; I know it's not. When you only see the MTBF
    number it's easy to jump to conclusions about how long something might
    last.

    I'm claiming the MTBF value is just about useless. There are much better
    ways to do life-cycle analysis/modelling. MTBF is great to toss out, but
    it has no real value by itself, out of context.

    Without knowing how the MTBF value was calculated you can't know its
    usefulness. Was it from a steady-failure-rate model like the MIL
    standards, or something else? What's the data behind the MTBF? How did
    they get this number? Did they do any real run testing, or just run the
    numbers under the ideal temperature/operating conditions that give the
    best MTBF? "Hmmm, if we run the drive at a temp of -5C, the calculations
    say the MTBF is 1,200,000 hours. That's within the listed operating
    range."

    I'm not claiming Seagate or any other drive company is lying, cheating or
    trying to mislead people; they are just putting out the info they have
    that shows their products in the best light: ideal environment, best
    possible results. You might find a paper on how they do their testing,
    but it will take some digging to figure out how they got their MTBF or
    MTTF numbers for a drive.

    > Essentially, MTBF measures the likelihood of a random failure, NOT an
    > end-of-life failure. Arguably, MTBF is only useful to people who run large
    > datacenters with many disks -- they can use MTBF to estimate the failure
    > rate of their drives.


    No, MTBF is the *AVERAGE* time between failures. That's why I hate seeing
    it used in marketing and spec sheets. It's not the real average, not even
    close (for hard drives anyway). It's not real data from years of running
    the drives; they don't have time to run the drives for years before
    sending them to market to get the real numbers. It's, at best, educated
    guessing using known data about the parts.

    If you take 1000 new drives and run them until each fails, the average
    you get won't be anything like 1,200,000 hours, even if you toss out the
    first 100 failures and only use the data from the 900 longest-lasting
    drives.

    MTBF can be useful during the early design of new devices/electronics.
    If I get an MTBF of 1000 hours and I need at least 2000 hours, I need to
    rework the design or use other methods to figure out why it's low and
    fix the design.

    MTBF isn't useful by itself. The Annual Failure Rate (AFR) might be more
    useful, depending on how they figured that out, but there are no details
    on this either. (The quoted AFR for the ST3500630AS is 0.34%; for the
    ST3500620SS, 0.73%.)
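
    For what it's worth, the usual textbook relation between the two, assuming
    a constant failure rate, is AFR = 1 - exp(-power-on hours per year / MTBF).
    A small sketch (the 2400-hour figure is only a guess at a desktop duty
    cycle, not something taken from Seagate's documentation):

    import math

    # AFR implied by an MTBF under a constant failure rate.  The power-on
    # hours per year is the hidden assumption: 8766 means 24x7 operation,
    # 2400 is a hypothetical desktop duty cycle.
    def afr_from_mtbf(mtbf_hours, power_on_hours_per_year=8766):
        return 1.0 - math.exp(-power_on_hours_per_year / mtbf_hours)

    print(afr_from_mtbf(1_200_000))      # ~0.73%, matches the ES.2 figure
    print(afr_from_mtbf(700_000))        # ~1.2% if run 24x7
    print(afr_from_mtbf(700_000, 2400))  # ~0.34% at 2400 power-on hours/year

    So the two quoted AFRs are only comparable if you know how many power-on
    hours per year each one assumes - which is exactly the "no details" problem.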


    How do I decide on a drive vendor?

    I use warranties, and how the company deals with warranty
    repairs/replacements for drives, as a guide. It's not the only thing I
    look at, but it has been useful to me.

    A short warranty (3 years or less)
    Paying to upgrade the warranty to 4 or 5 years
    Limits on warranty replacement (only one warranty replacement, etc.)
    Having to pay shipping costs

    These are warning signs that the drive might not be as good as others,
    or that it's going to cost more over the long run.

    Is the warranty for their drives sold in external enclosures the same as
    for internal drives?

    If the maker can't build a case/drive combo that they will stand
    behind as long as an internal drive, maybe I should look elsewhere.

    --
    Barry Keeney
    Chaos Consulting
    email barrykchaoscon.com

    "Rap is Square Dancing gone terribly, terribly Wrong...."

  9. Re: SCSI vs SATA hard disks



    On Mon, 20 Oct 2008, Barry Keeney wrote:

    > Whoever wrote:
    >
    >
    >> On Mon, 20 Oct 2008, Barry Keeney wrote:

    >
    >>> criten wrote:
    >>>> Hactar wrote:
    >>>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
    >>>>> Haines Brown wrote:
    >>>>>> ebenZEROONE@verizon.net (Hactar) writes:
    >>>>>>
    >>>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >>>>>>> Haines Brown wrote:
    >>>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>>>>>>> that I will henceforth have to accept drive unreliability?
    >>>>>>> _All_ drives are unreliable to some degree. The ultimate in
    >>>>>>> computer-readable reliability is probably Tyvek punched tape.
    >>>>>> Yes, but my subjective impression is that there is a very wide
    >>>>>> difference in reliability. Of the dozen SCSI drives I've used over the
    >>>>>> years, only one failed on me; reading on line discussions and reviews,
    >>>>>> it seems that SATA drives fail regularly.
    >>>>>>
    >>>>>> I guess my question comes down to, why should one bother these days with
    >>>>>> the added expense of SCSI hard disks?
    >>>>>
    >>>>> It is my impression (which may be false and/or out of date) that the
    >>>>> instances of drive hardware that are matched with SCSI controllers are the
    >>>>> more reliable (longer-lasting) ones.
    >>>
    >>>> This is a common misconception. The interface is irrelevant,
    >>>
    >>> Well it makes a difference, just hard to say if the interface
    >>> improves the life of the drive.
    >>>
    >>>> the 'mean
    >>>> time failure rate' is most certainly relevant. Essentially you pay more
    >>>> for a disk with a longer mean time failure rate, meaning its less likely
    >>>> to fail.
    >>>
    >>> HAHAHAHHA!! Thanks, I needed a good laugh.....
    >>>
    >>> Sorry, Not putting you down, Just the numbers they toss out.
    >>>
    >>> I used to work for a company what wrote software for figuring
    >>> out the MTBF (Mean Time Between Failures) and spent a lot of
    >>> time working with reliability engineers. MTBF is a bit of a
    >>> guess at best.

    >
    >> It's a pity you did not learn what MTBF actually refers to.

    >
    > It's the "average time between failures of a system" that's what
    > it means.
    >
    > I wasn't saying *HOW* it should be used in the big picture.
    >
    >>>
    >>> Okay, Lets compare two seagate drives:
    >>>
    >>> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
    >>>
    >>> MTBF 700,000 hours (79 years!)
    >>>
    >>>
    >>> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
    >>>
    >>> MTBF 1,200,000 hours (136 years!)
    >>>
    >>> From the numbers about you might think the SAS drive is going to
    >>> last twice as long. I don't think either of these drives are going
    >>> to last 50+ years unused in storage let alone in a running system!
    >>> You might see 5-7 years of 24/7 running, at best, before they're
    >>> going to start to drop like flies.
    >>>

    >
    >
    >> That's not what MTBF is intended to measure. You are claiming that MTBF
    >> should equal lifetime and it does not.

    >
    > No, That's not my claim, I know it's not. When you only see the MTBF
    > number it's easy to jump to the idea about how long something might last.
    >
    > I'm claiming the value of the "MTBF" is just about useless. There are
    > much better ways for life cycle analysis/modelling. MTBF is great to
    > toss out but has no real value by itself out of any context.
    >
    > Without knowing how the value for MTBF was calculated you can't know
    > it's usefulness. Was it from a steady failure model like the Mil standards
    > or something else? Whats the data behind the MTBF? How did they
    > get this number? Did they do any real run testing or just run the
    > numbers (ideal temp/operating conditions) that gets the best MTBF number?
    > "Hmmm if we run the drive at a temp of -5C, the calculations say the MTBF
    > is 1,200,000 hours. That's within the listed operating range."
    >
    > I'm not claiming Seagate or any other drive company is lying, cheating or
    > trying to mislead people, they are just putting out the info they have
    > that puts their products in the best light, idea enviroment/best possible
    > results. You might find a paper on how they do their testing but it'll take
    > some digging to figure out how they got their MTBF or MTTF numbers for
    > a drive.
    >
    >> Essentially, MTBF measures the likelihood of a random failure, NOT an
    >> end-of-life failure. Arguably, MTBF is only useful to people who run large
    >> datacenters with many disks -- they can use MTBF to estimate the failure
    >> rate of their drives.

    >
    > No, MTBF is the *AVERAGE* time between failures. That's why I hate
    > seeing it used in marketing and specs sheets. It's not the real average, not
    > even close (for hard drives anyway). It's not real data from years of
    > running the drives, they don't have the time to run the drives for years
    > before sending them to market to get the real numbers. It's just, at best,
    > educated guessing using known data about the parts.
    >
    > If you take a 1000 new drives and run them until each fail the average
    > you get won't be anything like 1,200,000 hours, even if you toss out
    > numbers first 100 failures and only use the 900 longest lasting drives
    > data.


    Again, you show that you don't really understand MTBF. Most drives will
    fail because they reach end-of-life (they wear out). This is irrelevant
    to MTBF.

    Instead, if you took 1,200 drives, on average you would expect one to
    fail every 1000 hours (the 1,200,000-hour MTBF divided by 1,200 drives),
    assuming that you: 1. ignore early failures and 2. swap out the drives
    before they wear out (without counting these swapped-out drives as
    failures).

    For the average user, the lifetime of the drive is more important. I'm not
    aware of drive manufacturers providing this information to consumers,
    however, like you, I believe it can be inferred from the warranties
    provided with the drives.


  10. Re: SCSI vs SATA hard disks

    Whoever wrote:


    > On Mon, 20 Oct 2008, Barry Keeney wrote:


    >> Whoever wrote:
    >>
    >>
    >>> On Mon, 20 Oct 2008, Barry Keeney wrote:

    >>
    >>>> criten wrote:
    >>>>> Hactar wrote:
    >>>>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
    >>>>>> Haines Brown wrote:
    >>>>>>> ebenZEROONE@verizon.net (Hactar) writes:
    >>>>>>>
    >>>>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >>>>>>>> Haines Brown wrote:
    >>>>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>>>>>>>> that I will henceforth have to accept drive unreliability?
    >>>>>>>> _All_ drives are unreliable to some degree. The ultimate in
    >>>>>>>> computer-readable reliability is probably Tyvek punched tape.
    >>>>>>> Yes, but my subjective impression is that there is a very wide
    >>>>>>> difference in reliability. Of the dozen SCSI drives I've used over the
    >>>>>>> years, only one failed on me; reading on line discussions and reviews,
    >>>>>>> it seems that SATA drives fail regularly.
    >>>>>>>
    >>>>>>> I guess my question comes down to, why should one bother these days with
    >>>>>>> the added expense of SCSI hard disks?
    >>>>>>
    >>>>>> It is my impression (which may be false and/or out of date) that the
    >>>>>> instances of drive hardware that are matched with SCSI controllers are the
    >>>>>> more reliable (longer-lasting) ones.
    >>>>
    >>>>> This is a common misconception. The interface is irrelevant,
    >>>>
    >>>> Well it makes a difference, just hard to say if the interface
    >>>> improves the life of the drive.
    >>>>
    >>>>> the 'mean
    >>>>> time failure rate' is most certainly relevant. Essentially you pay more
    >>>>> for a disk with a longer mean time failure rate, meaning its less likely
    >>>>> to fail.
    >>>>
    >>>> HAHAHAHHA!! Thanks, I needed a good laugh.....
    >>>>
    >>>> Sorry, Not putting you down, Just the numbers they toss out.
    >>>>
    >>>> I used to work for a company what wrote software for figuring
    >>>> out the MTBF (Mean Time Between Failures) and spent a lot of
    >>>> time working with reliability engineers. MTBF is a bit of a
    >>>> guess at best.

    >>
    >>> It's a pity you did not learn what MTBF actually refers to.

    >>
    >> It's the "average time between failures of a system" that's what
    >> it means.
    >>
    >> I wasn't saying *HOW* it should be used in the big picture.
    >>
    >>>>
    >>>> Okay, Lets compare two seagate drives:
    >>>>
    >>>> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
    >>>>
    >>>> MTBF 700,000 hours (79 years!)
    >>>>
    >>>>
    >>>> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
    >>>>
    >>>> MTBF 1,200,000 hours (136 years!)
    >>>>
    >>>> From the numbers about you might think the SAS drive is going to
    >>>> last twice as long. I don't think either of these drives are going
    >>>> to last 50+ years unused in storage let alone in a running system!
    >>>> You might see 5-7 years of 24/7 running, at best, before they're
    >>>> going to start to drop like flies.
    >>>>

    >>
    >>
    >>> That's not what MTBF is intended to measure. You are claiming that MTBF
    >>> should equal lifetime and it does not.

    >>
    >> No, That's not my claim, I know it's not. When you only see the MTBF
    >> number it's easy to jump to the idea about how long something might last.
    >>
    >> I'm claiming the value of the "MTBF" is just about useless. There are
    >> much better ways for life cycle analysis/modelling. MTBF is great to
    >> toss out but has no real value by itself out of any context.
    >>
    >> Without knowing how the value for MTBF was calculated you can't know
    >> it's usefulness. Was it from a steady failure model like the Mil standards
    >> or something else? Whats the data behind the MTBF? How did they
    >> get this number? Did they do any real run testing or just run the
    >> numbers (ideal temp/operating conditions) that gets the best MTBF number?
    >> "Hmmm if we run the drive at a temp of -5C, the calculations say the MTBF
    >> is 1,200,000 hours. That's within the listed operating range."
    >>
    >> I'm not claiming Seagate or any other drive company is lying, cheating or
    >> trying to mislead people, they are just putting out the info they have
    >> that puts their products in the best light, idea enviroment/best possible
    >> results. You might find a paper on how they do their testing but it'll take
    >> some digging to figure out how they got their MTBF or MTTF numbers for
    >> a drive.
    >>
    >>> Essentially, MTBF measures the likelihood of a random failure, NOT an
    >>> end-of-life failure. Arguably, MTBF is only useful to people who run large
    >>> datacenters with many disks -- they can use MTBF to estimate the failure
    >>> rate of their drives.

    >>
    >> No, MTBF is the *AVERAGE* time between failures. That's why I hate
    >> seeing it used in marketing and specs sheets. It's not the real average, not
    >> even close (for hard drives anyway). It's not real data from years of
    >> running the drives, they don't have the time to run the drives for years
    >> before sending them to market to get the real numbers. It's just, at best,
    >> educated guessing using known data about the parts.
    >>
    >> If you take a 1000 new drives and run them until each fail the average
    >> you get won't be anything like 1,200,000 hours, even if you toss out
    >> numbers first 100 failures and only use the 900 longest lasting drives
    >> data.


    > Again, you show that you don't really understand MTBF. Most drives will
    > fail because they reach end-of-life (they wear out). This is irrelevent to
    > MTBF.


    Something "wearing out" is a failure! Just because you don't repair
    the failed part/system doesn't mean is doesn't count as a failure! It
    should be call MTTF (Mean Time To Fail) if it's not going to be repaired.

    > Instead, if you took 1,200 drives, on average, you would expect one to
    > fail every 1000 hours, assuming that you: 1. Ignore early failures and 2.
    > swap out the drives before they wear out (without counting these
    > swapped-out drives as failures).


    No, that's a 1000-hour MTBF/MTTF; with only one failure, that's the only
    real data you have. The other 1,199 drives haven't failed, so you can't
    extrapolate their "not failing" into an MTBF. A failure rate, yes: 1 per
    1000 hours. (That's the problem with a small failure sample size.)

    Okay, let's use "your" numbers. 1 per 1000hrs and drives don't "wear out"
    just fail.

    So one drive fails after 1000 hours, the next at 2000 hours, the third at
    3000 hours, etc. At that rate, after 10 years you'd have a total of
    *only* ~88 of the original 1200 fail (8766 hours/year). With less than
    10% of the drives failing, you have a total of 3,916,000 hours of
    operation for the 88 failed drives, giving an MTBF of 44,500 hours, or
    5.08 years. (But are 1100+ drives really going to be running after 10
    years?)

    Now run the numbers up to 100 failures. That's a total of 5,050,000
    hours, for an MTBF of 50,500 hours, or 5.76 years (after running the
    test for 11.4 years!).

    Before you jump on this, remember the other 1100 drives are still
    running.... AND I'm using *your* numbers.

    At this rate the MTBF is still below what the real data - your numbers -
    are generating. If we had started out with only 100 drives, that would
    be the real MTBF for this sample (50,500 hours / 5.76 years).

    It's going to take 136+ years before all 1200 drives have failed (at 1
    per 1000 hours, or 8.766 per year).

    Keep running the numbers and the MTBF might grow toward something like
    1,200,000 hours, depending on the sample size and a flat failure rate.
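
    The same arithmetic as a small script, so anyone can rerun it (same
    assumptions as above: exactly one failure every 1000 hours, failed drives
    not replaced, no wear-out):

    # Fixed-failure-rate arithmetic: the k-th failure happens at k*1000 hours,
    # so the observed hours of the failed drives are 1000 + 2000 + ... + k*1000.
    HOURS_PER_YEAR = 8766

    def observed_mtbf(n_failures, interval_hours=1000):
        total_hours = interval_hours * n_failures * (n_failures + 1) // 2
        return total_hours, total_hours / n_failures

    print(int(10 * HOURS_PER_YEAR // 1000))  # failures in 10 years: ~87-88

    total, mtbf = observed_mtbf(88)
    print(total, mtbf, round(mtbf / HOURS_PER_YEAR, 2))  # 3,916,000 h, 44,500 h, ~5.08 y

    total, mtbf = observed_mtbf(100)
    print(total, mtbf, round(mtbf / HOURS_PER_YEAR, 2))  # 5,050,000 h, 50,500 h, ~5.76 y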

    That's the problem with flat failure rates and the models that use them:
    they don't deal with the higher number of failures at the beginning
    (infant mortality) and near the end of life (the "bathtub curve" of
    failure rates).

    Now, if the drive maker has a 5-year warranty and wants at least 90% of
    drives to make it to 5 years, you'd want the MTBF/MTTF to be around
    40,000 hours, or 4.56 years (excluding early-life failures and rising
    failure rates with age).

    So an MTBF of 1,200,000 hours is worthless without the info used to get
    the number.

    > For the average user, the lifetime of the drive is more important. I'm not
    > aware of drive manufacturers providing this information to consumers,


    Well, if you're a big computer maker like Dell/HP/etc. you're going to
    want detailed specs on parts before you decide to use them, and/or a
    warranty that works for the price point you're aiming at. They don't
    want their name hurt because they used a cheap drive that fails too
    often.

    > however, like you, I believe it can be inferred from the warranties
    > provided with the drives.


    You've got to figure they've done the math and know how much the
    warranties are going to cost per unit and they still expect a profit
    over the product life/warranty life.

    --
    Barry Keeney
    Chaos Consulting
    email barryk@chaoscon.com

    "Rap is Square Dancing gone terribly, terribly Wrong...."

  11. Re: SCSI vs SATA hard disks



    On Tue, 21 Oct 2008, Barry Keeney wrote:

    > Whoever wrote:
    >
    >
    >> On Mon, 20 Oct 2008, Barry Keeney wrote:

    >
    >>> Whoever wrote:
    >>>
    >>>
    >>>> On Mon, 20 Oct 2008, Barry Keeney wrote:
    >>>
    >>>>> criten wrote:
    >>>>>> Hactar wrote:
    >>>>>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
    >>>>>>> Haines Brown wrote:
    >>>>>>>> ebenZEROONE@verizon.net (Hactar) writes:
    >>>>>>>>
    >>>>>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
    >>>>>>>>> Haines Brown wrote:
    >>>>>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
    >>>>>>>>>> that I will henceforth have to accept drive unreliability?
    >>>>>>>>> _All_ drives are unreliable to some degree. The ultimate in
    >>>>>>>>> computer-readable reliability is probably Tyvek punched tape.
    >>>>>>>> Yes, but my subjective impression is that there is a very wide
    >>>>>>>> difference in reliability. Of the dozen SCSI drives I've used over the
    >>>>>>>> years, only one failed on me; reading on line discussions and reviews,
    >>>>>>>> it seems that SATA drives fail regularly.
    >>>>>>>>
    >>>>>>>> I guess my question comes down to, why should one bother these days with
    >>>>>>>> the added expense of SCSI hard disks?
    >>>>>>>
    >>>>>>> It is my impression (which may be false and/or out of date) that the
    >>>>>>> instances of drive hardware that are matched with SCSI controllers are the
    >>>>>>> more reliable (longer-lasting) ones.
    >>>>>
    >>>>>> This is a common misconception. The interface is irrelevant,
    >>>>>
    >>>>> Well it makes a difference, just hard to say if the interface
    >>>>> improves the life of the drive.
    >>>>>
    >>>>>> the 'mean
    >>>>>> time failure rate' is most certainly relevant. Essentially you pay more
    >>>>>> for a disk with a longer mean time failure rate, meaning its less likely
    >>>>>> to fail.
    >>>>>
    >>>>> HAHAHAHHA!! Thanks, I needed a good laugh.....
    >>>>>
    >>>>> Sorry, Not putting you down, Just the numbers they toss out.
    >>>>>
    >>>>> I used to work for a company what wrote software for figuring
    >>>>> out the MTBF (Mean Time Between Failures) and spent a lot of
    >>>>> time working with reliability engineers. MTBF is a bit of a
    >>>>> guess at best.
    >>>
    >>>> It's a pity you did not learn what MTBF actually refers to.
    >>>
    >>> It's the "average time between failures of a system" that's what
    >>> it means.
    >>>
    >>> I wasn't saying *HOW* it should be used in the big picture.
    >>>
    >>>>>
    >>>>> Okay, Lets compare two seagate drives:
    >>>>>
    >>>>> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
    >>>>>
    >>>>> MTBF 700,000 hours (79 years!)
    >>>>>
    >>>>>
    >>>>> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
    >>>>>
    >>>>> MTBF 1,200,000 hours (136 years!)
    >>>>>
    >>>>> From the numbers about you might think the SAS drive is going to
    >>>>> last twice as long. I don't think either of these drives are going
    >>>>> to last 50+ years unused in storage let alone in a running system!
    >>>>> You might see 5-7 years of 24/7 running, at best, before they're
    >>>>> going to start to drop like flies.
    >>>>>
    >>>
    >>>
    >>>> That's not what MTBF is intended to measure. You are claiming that MTBF
    >>>> should equal lifetime and it does not.
    >>>
    >>> No, That's not my claim, I know it's not. When you only see the MTBF
    >>> number it's easy to jump to the idea about how long something might last.
    >>>
    >>> I'm claiming the value of the "MTBF" is just about useless. There are
    >>> much better ways for life cycle analysis/modelling. MTBF is great to
    >>> toss out but has no real value by itself out of any context.
    >>>
    >>> Without knowing how the value for MTBF was calculated you can't know
    >>> it's usefulness. Was it from a steady failure model like the Mil standards
    >>> or something else? Whats the data behind the MTBF? How did they
    >>> get this number? Did they do any real run testing or just run the
    >>> numbers (ideal temp/operating conditions) that gets the best MTBF number?
    >>> "Hmmm if we run the drive at a temp of -5C, the calculations say the MTBF
    >>> is 1,200,000 hours. That's within the listed operating range."
    >>>
    >>> I'm not claiming Seagate or any other drive company is lying, cheating or
    >>> trying to mislead people, they are just putting out the info they have
    >>> that puts their products in the best light, idea enviroment/best possible
    >>> results. You might find a paper on how they do their testing but it'll take
    >>> some digging to figure out how they got their MTBF or MTTF numbers for
    >>> a drive.
    >>>
    >>>> Essentially, MTBF measures the likelihood of a random failure, NOT an
    >>>> end-of-life failure. Arguably, MTBF is only useful to people who run large
    >>>> datacenters with many disks -- they can use MTBF to estimate the failure
    >>>> rate of their drives.
    >>>
    >>> No, MTBF is the *AVERAGE* time between failures. That's why I hate
    >>> seeing it used in marketing and specs sheets. It's not the real average, not
    >>> even close (for hard drives anyway). It's not real data from years of
    >>> running the drives, they don't have the time to run the drives for years
    >>> before sending them to market to get the real numbers. It's just, at best,
    >>> educated guessing using known data about the parts.
    >>>
    >>> If you take a 1000 new drives and run them until each fail the average
    >>> you get won't be anything like 1,200,000 hours, even if you toss out
    >>> numbers first 100 failures and only use the 900 longest lasting drives
    >>> data.

    >
    >> Again, you show that you don't really understand MTBF. Most drives will
    >> fail because they reach end-of-life (they wear out). This is irrelevent to
    >> MTBF.

    >
    > Something "wearing out" is a failure! Just because you don't repair
    > the failed part/system doesn't mean is doesn't count as a failure! It
    > should be call MTTF (Mean Time To Fail) if it's not going to be repaired.
    >
    >> Instead, if you took 1,200 drives, on average, you would expect one to
    >> fail every 1000 hours, assuming that you: 1. Ignore early failures and 2.
    >> swap out the drives before they wear out (without counting these
    >> swapped-out drives as failures).

    >
    > No thats 1000 hrs MTBF/MTTF, with only one failure that's the only real
    > data you have. The other 1199 drives haven't failed so you can't
    > expand their "not failing" for MTBF. Failure rate yes, 1 per 1000 hours.
    > (the problem with a small failure sample size)
    >
    > Okay, let's use "your" numbers. 1 per 1000hrs and drives don't "wear out"
    > just fail.
    >
    > So one drive fails after 1000hrs, next at 2000hrs, third at 3000hrs, etc.
    > At that rate after 10 years you'd have a total of *only* ~88 of the orignal
    > 1200 fail (8766hrs/year). With less the 10% of the drives failing, you have
    > a total of 3,916,000hrs of operation of the 88 drives with a MTBF of 44,500hrs
    > or 5.08 years.(But are 1100+ drives really going to be running after
    > 10 years?)
    >
    > Now run the numbers up to 100 failures. Thats a total of 5,050,000hrs
    > or MTBF of 50,500hrs or 5.76years. (after running the test for 11.40 years!)
    >
    > Before you jump on this, remember the other 1100 drives are still
    > running.... AND I'm using *Your* numbers.


    Ah, I see the reason for your poor understanding of MTBF: you also have
    poor reading comprehension skills. What is it about "2. swap out the
    drives before they wear out (without counting these swapped-out drives
    as failures)" that you don't understand? In my scenario, not one of the
    "other 1100 drives" would still be running, because they would have been
    swapped out.

    >
    > At this rate the MTBF is still below what the real data, your
    > numbers are generating. If we started out with only a 100 drives
    > it would be the real MTBF for this sample. (50,500hrs/5.76years)
    >
    > It's going to take 136+ years before all 1200 drives should
    > fail. (at 1 per 1000hrs or 8.766 per year.)


    Again, look up the definition of MTBF. It assumes that you replace or
    repair failed units, so there is no time at which "all 1200 drives ...
    fail". Instead, they have all been swapped out (because of age, not
    necessarily failure), probably many times, and at the end of the
    experiment you still have 1200 drives.

    One more comment: you might want to study some statistics. In the
    experiment you propose, there is an ever-shrinking number of drives in
    the experiment (as they fail), yet the rate at which drives fail is
    unchanged. This seems rather unlikely -- instead, as the number of
    drives in the experiment is reduced, the number of drives that fail per
    month would also be reduced.
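
    A quick Monte Carlo sketch of that replacement scenario (exponential
    lifetimes, i.e. the constant-failure-rate assumption being argued about;
    the 10-year window and the seed are arbitrary):

    # 1200 drive slots, each lifetime drawn from an exponential distribution
    # with the quoted 1,200,000-hour MTBF, failed drives replaced immediately.
    # Wear-out is deliberately ignored.
    import random

    MTBF, SLOTS, HOURS = 1_200_000, 1200, 10 * 8766

    random.seed(1)
    failures = 0
    for _ in range(SLOTS):
        t = 0.0
        while True:
            t += random.expovariate(1.0 / MTBF)  # lifetime of the current drive
            if t > HOURS:
                break
            failures += 1                        # swap in a fresh drive

    print(failures)                  # roughly SLOTS * HOURS / MTBF, i.e. ~88
    print(SLOTS * HOURS / failures)  # observed fleet MTBF, ~1,200,000 hours

    The failure count per month stays roughly constant here precisely because
    every failed drive is replaced - which is the difference between this
    scenario and a run-to-failure test where the population shrinks.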

    I will agree with you on one thing though -- I do suspect that MTBF
    figures are artificially high.

  12. Re: SCSI vs SATA hard disks

    In article ,
    Whoever wrote:
    :
    :For the average user, the lifetime of the drive is more important. I'm not
    :aware of drive manufacturers providing this information to consumers,
    :however, like you, I believe it can be inferred from the warranties
    :provided with the drives.

    Out of curiosity, I did some calculations based on the SMART data
    for Power_On_Hours on a couple of my PATA drives:

    Drive A (ST380013A):
    ID# ATTRIBUTE_NAME VALUE WORST THRESH TYPE UPDATED RAW_VALUE
    9 Power_On_Hours 058 058 000 Old_age Always 36796

    Looks like 36796 hours might be 42% (100 - 58) of expected life.

    (36796 / .42) = 87610 hours, or 10.00 years

    Drive B (ST3500630A):
    ID# ATTRIBUTE_NAME VALUE WORST THRESH TYPE UPDATED RAW_VALUE
    9 Power_On_Hours 092 092 000 Old_age Always 7078

    (7078 / .08) = 88475 hours, or 10.10 years

    Surprisingly consistent, and strictly "FWIW", which might be not much
    since I'm making an assumption about the unknown conversion from raw
    to normalized values.
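
    For anyone who wants to repeat this on their own drives, a rough Python
    wrapper around smartctl (requires smartmontools and usually root; it makes
    the same assumption as above - that the normalized value counts down
    linearly from 100 - which is vendor-specific and may well be wrong, and
    the column layout of "smartctl -A" can differ between versions):

    # Crude life extrapolation from SMART attribute 9 (Power_On_Hours).
    import subprocess
    import sys

    def estimate_life_hours(device):
        out = subprocess.run(["smartctl", "-A", device], capture_output=True,
                             text=True, check=True).stdout
        for line in out.splitlines():
            fields = line.split()
            if len(fields) > 9 and fields[1] == "Power_On_Hours":
                value = int(fields[3])   # normalized VALUE column
                raw = int(fields[9])     # raw power-on hours
                used = (100 - value) / 100.0
                return raw, (raw / used if used else None)
        return None, None

    raw, est = estimate_life_hours(sys.argv[1] if len(sys.argv) > 1 else "/dev/sda")
    print("power-on hours:", raw, "- crude life estimate:", est)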

    --
    Bob Nichols AT comcast.net I am "RNichols42"

  13. Re: SCSI vs SATA hard disks

    >Whoever wrote:
    >> On Tue, 21 Oct 2008, Barry Keeney wrote:
    >>> Whoever wrote:

    >>
    >> Something "wearing out" is a failure! Just because you don't repair
    >> the failed part/system doesn't mean is doesn't count as a failure! It
    >> should be call MTTF (Mean Time To Fail) if it's not going to be repaired.
    >>
    >>> Instead, if you took 1,200 drives, on average, you would expect one to
    >>> fail every 1000 hours, assuming that you: 1. Ignore early failures and 2.
    >>> swap out the drives before they wear out (without counting these
    >>> swapped-out drives as failures).

    >>
    >> No thats 1000 hrs MTBF/MTTF, with only one failure that's the only real
    >> data you have. The other 1199 drives haven't failed so you can't
    >> expand their "not failing" for MTBF. Failure rate yes, 1 per 1000 hours.
    >> (the problem with a small failure sample size)
    >>
    >> Okay, let's use "your" numbers. 1 per 1000hrs and drives don't "wear out"
    >> just fail.
    >>
    >> So one drive fails after 1000hrs, next at 2000hrs, third at 3000hrs, etc.
    >> At that rate after 10 years you'd have a total of *only* ~88 of the orignal
    >> 1200 fail (8766hrs/year). With less the 10% of the drives failing, you have
    >> a total of 3,916,000hrs of operation of the 88 drives with a MTBF of 44,500hrs
    >> or 5.08 years.(But are 1100+ drives really going to be running after
    >> 10 years?)
    >>
    >> Now run the numbers up to 100 failures. Thats a total of 5,050,000hrs
    >> or MTBF of 50,500hrs or 5.76years. (after running the test for 11.40 years!)
    >>
    >> Before you jump on this, remember the other 1100 drives are still
    >> running.... AND I'm using *Your* numbers.


    > Ah, I see the reason for your poor understanding of MTBF, you have poor
    > reading comprehension skills also. What is it about "2.
    > swap out the drives before they wear out (without counting these
    > swapped-out drives as failures)." that you don't understand? In my
    > scenario, not one of the "other 1100 drives" would be running, because
    > they would have been swapped out.


    No, it just shows you don't know much about failure analysis.

    You don't wait until the product is out with customers to figure
    out how long it's going to last. You test BEFORE going to market.

    To figure out failure rates you can test or use known data.
    If you're testing you need to run *UNTIL* failure. You can't
    get failure data/rates if you're replacing your test sample before
    they fail. "We didn't have any fail, they'll last forever!" :^)

    If you're not using some sort of stress testing (higher temps/etc)
    then you can just run them until failure. You pick a sample size and run
    them until you have enough failures to figure out failure rates and
    MTBF.

    Your scenario might be how some companies run their data centers, but I
    haven't seen this. I've seen replacing everything every 3-5 years with
    newer tech, and replacing failed parts as needed, as more the norm.

    Once you buy the product, you can test however you like, but what you
    describe doesn't sound like testing; it sounds more like running a
    production environment where you want to prevent failures.

    >> At this rate the MTBF is still below what the real data, your
    >> numbers are generating. If we started out with only a 100 drives
    >> it would be the real MTBF for this sample. (50,500hrs/5.76years)
    >>
    >> It's going to take 136+ years before all 1200 drives should
    >> fail. (at 1 per 1000hrs or 8.766 per year.)


    > Again, look up the definition of MTBF. It assumes that you replace or
    > repair failed units, so there is no time at which "all 1200 drives ...
    > fail". Instead, they have all been swapped out (because of age, not
    > necessarily failure), probably many times and at the end of the
    > experiment, you still have 1200 drives.


    Not in "run to failure testing", you don't replace the failures.
    You continue to run the remaining units until either the time for the
    test is over or you reach the number of failures required, depending
    on the test parameters. If you replace a failed part, that parts start
    time is different and you'll need to keep track of this. If you're running
    a 1000 hour test this replacement will still need to run the 1000 hours,
    not whatever hours remain. Otherwise it wouldn't have been stress the
    same as the orignal units in the sample.

    > One more comment, you might want to study some statistics. In the
    > experiment you propose, there is an ever reducing number of drives in the
    > experiment (as they fail), yet the rate at which drives fail is unchanged.
    > This seems rather unlike -- instead, as the number of drives in the
    > experiement is reduced, the number of drives that fail per month would
    > also be reduced.


    Since we haven't run the real-world test, we can only model it using
    known info: 1200 drives, fixed failure rate of 1 per 1000 hours.

    With fixed-failure-rate models, the failure rate is constant and the
    number of units doesn't matter. You've got 10, one fails per hour; after
    5 hours you'll have 5. Not what you'd see in the real world, true. So
    you do get the results I used. Again, a problem with fixed-failure-rate
    models. (Hmm, I keep saying this but you don't seem to notice it....)

    In real testing this *shouldn't* be the case. You'd expect a spike of
    failures near the start, fewer in the middle, and rising numbers near
    end of life. I *think* we can agree on this?

    > I will agree with you on one thing though -- I do suspect that MTBF rates
    > are artifically high.


    Well there is that.....

    --
    Barry Keeney
    Chaos Consulting
    email barryk<@>chaoscon.com

    "Rap is Square Dancing gone terribly, terribly Wrong...."
