SCSI vs SATA hard disks - Hardware
Re: SCSI vs SATA hard disks
Aragorn,
Thanks for the clear explanations. I'm much more on top of the situation
now.
--
Haines Brown, KB1GRM
-
Re: SCSI vs SATA hard disks
Aragorn writes:
>I'm not so sure that's a market trend. It just so happens to be that SCSI
>is no longer considered useful in the home and office desktop market, but
>servers are most definitely still using SCSI.
>However, the SCSI that's being used and marketed today is no longer of the
>parallel variant. Just as parallel ATA had to make way for serial ATA,
>SCSI has by now already started making way for serial attached SCSI (SAS)
>and iSCSI for storage area networks.
I'm sorry, but you'll have to go a long way to convince me the parentage of
SAS is anything but:
MarketDroid 1): Damn, everyone is buying SATA drives; the price is
falling and we are screwed. How do we come up with a way to charge a
premium without really doing a lot of work?
MD2: I've got it! We'll rebadge SATA into something with SCSI in the
name, so it sounds beefier... hmmm that's it.. Serial Attached SCSI.
We save the investment in "SCSI" and build up the hype around it.
MD1: But SATA really has some pluses.. Are we going to ignore them?
MD2: We'll use Gates's ploy -- extend and embrace! We'll tweak some
SATA specs here and there, adding some things we can talk up. But
we'll save a bundle on connectors alone.
......
In the past, SCSI server drives brought you two things: performance and
reliability. [Think of those 9 GB Barracudas..].
Now the issues are: Does SAS really do that much over SATA, for your
case? And: Does paying SAS prices really give you more reliable drives,
or just different electronics?
--
A host is a host from coast to coast.................wb8foz@nrk.com
& no one will talk to a host that's close........[v].(301) 56-LINUX
Unless the host (that isn't close).........................pob 1433
is busy, hung or dead....................................20915-1433
-
Re: SCSI vs SATA hard disks
On Thursday 25 September 2008 19:14, someone identifying as *David Lesher*
wrote in /comp.os.linux.hardware:/
> Aragorn writes:
>
>> I'm not so sure that's a market trend. It just so happens to be that
>> SCSI is no longer considered useful in the home and office desktop
>> market, but servers are most definitely still using SCSI.
>>
>> However, the SCSI that's being used and marketed today is no longer of
>> the parallel variant. Just as parallel ATA had to make way for serial
>> ATA, SCSI has by now already started making way for serial attached SCSI
>> (SAS) and iSCSI for storage area networks.
>
> I'm sorry, but you'll have to go a long way to convince me the parentage of
> SAS is anything but:
>
> MarketDroid 1): Damn, everyone is buying SATA drives; the price is
> falling and we are screwed. How do we come up with a way to charge a
> premium without really doing a lot of work?
>
> MD2: I've got it! We'll rebadge SATA into something with SCSI in the
> name, so it sounds beefier... hmmm that's it.. Serial Attached SCSI.
> We save the investment in "SCSI" and build up the hype around it.
>
> MD1: But SATA really has some pluses.. Are we going to ignore them?
>
> MD2: We'll use Gates's ploy -- extend and embrace! We'll tweak some
> SATA specs here and there, adding some things we can talk up. But
> we'll save a bundle on connectors alone.
As with everything, technology is mainly developed to get marketed rather
than for progress, but SAS is far more than what you describe above.
The serialization of SCSI does offer some benefits with regard to large
enterprises and data centers, and it all falls within the spirit of
extending the possibilities of SCSI, e.g. there is also iSCSI now, which is
a SCSI tunnel over ethernet.
> In the past, SCSI server drives brought you two things: performance and
> reliability. [Think of those 9 GB Barracudas..].
It still does. The drives themselves - or at least, the ones I know - are
basically the same as the U320 drives, but their maximum throughput is
higher, whereas you could end up with a bottleneck on parallel SCSI chains.
> Now the issues are: Does SAS really do that much over SATA, for your
> case? And: Does paying SAS prices really give you more reliable drives,
> or just different electronics?
SAS drives *are* SCSI drives, so they do have all the goodies that SCSI
comes with - e.g. ECC, logging, tagged command queueing - whereas SATA is
actually nothing other than a serialized ATA drive in which an attempt was
made to make ATA/IDE more SCSI-like.
Enterprise-grade SATA drives are probably just or nearly as reliable as
SAS/SCSI, but they lack the features that made SCSI stand out. SATA still
is ATA, don't forget that. ;-) Also, not all SATA drives - not even in the
enterprise-grade range - are fit to be used in RAID arrays, while SAS
drives all are RAID-rated.
On the other hand, if you care more about cost-effectiveness than features,
then SATA offers (far) more storage per Dollar/Euro than SCSI. But then
again, this was already the case for PATA - aka IDE, although SATA is IDE
as well - versus parallel SCSI.
So the bottom line is that if you're thinking about marketing scams, the
scam would rather rest with SATA than with SAS, because SATA was intended
to mimic SCSI over an IDE bus, but still has to rely on the SATA-specific
NCQ (native command queueing) over the SCSI-specific TCQ (tagged command
queueing), because TCQ on SATA sucks. Also, the difference in retail price
between a SAS disk and a U320 SCSI disk is mainly negligible.
--
*Aragorn*
(registered GNU/Linux user #223157)
-
Re: SCSI vs SATA hard disks
Haines Brown wrote:
> ebenZEROONE@verizon.net (Hactar) writes:
>
>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>> Haines Brown wrote:
>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>> that I will henceforth have to accept drive unreliability?
>> _All_ drives are unreliable to some degree. The ultimate in
>> computer-readable reliability is probably Tyvek punched tape.
>
> Yes, but my subjective impression is that there is a very wide
> difference in reliability. Of the dozen SCSI drives I've used over the
> years, only one failed on me; reading on line discussions and reviews,
> it seems that SATA drives fail regularly.
I've been using SATA for the last few years and haven't had any issues
with them. I would recommend them because they provide high data
transfer rates and high RPMs with low cost. Not to mention their
connectors are so small that you can have 6 plugged into a very small
area on the motherboard and it's still easy to manage. The cables
themselves are also smaller than the older PATA cables which I also love.
I bought 2 320GB SATA2 drives 2 years ago on Newegg. I put them in a
mirror and have had no issues. My PC is on 24x7 too. About a year ago I
got 2 80GB SATA2 drives and put them into a striped array (on the same
controller as the other 2 320GB drives). Again, no issues to date.
-
Re: SCSI vs SATA hard disks
Hactar wrote:
> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
> Haines Brown wrote:
>> ebenZEROONE@verizon.net (Hactar) writes:
>>
>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>>> Haines Brown wrote:
>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>>> that I will henceforth have to accept drive unreliability?
>>> _All_ drives are unreliable to some degree. The ultimate in
>>> computer-readable reliability is probably Tyvek punched tape.
>> Yes, but my subjective impression is that there is a very wide
>> difference in reliability. Of the dozen SCSI drives I've used over the
>> years, only one failed on me; reading on line discussions and reviews,
>> it seems that SATA drives fail regularly.
>>
>> I guess my question comes down to, why should one bother these days with
>> the added expense of SCSI hard disks?
>
> It is my impression (which may be false and/or out of date) that the
> instances of drive hardware that are matched with SCSI controllers are the
> more reliable (longer-lasting) ones.
This is a common misconception. The interface is irrelevant, the 'mean
time failure rate' is most certainly relevant. Essentially you pay more
for a disk with a longer mean time failure rate, meaning it's less likely
to fail. And yes, SCSI is dying.
-
Re: SCSI vs SATA hard disks
criten wrote:
> Hactar wrote:
>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
>> Haines Brown wrote:
>>> ebenZEROONE@verizon.net (Hactar) writes:
>>>
>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>>>> Haines Brown wrote:
>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>>>> that I will henceforth have to accept drive unreliability?
>>>> _All_ drives are unreliable to some degree. The ultimate in
>>>> computer-readable reliability is probably Tyvek punched tape.
>>> Yes, but my subjective impression is that there is a very wide
>>> difference in reliability. Of the dozen SCSI drives I've used over the
>>> years, only one failed on me; reading on line discussions and reviews,
>>> it seems that SATA drives fail regularly.
>>>
>>> I guess my question comes down to, why should one bother these days with
>>> the added expense of SCSI hard disks?
>>
>> It is my impression (which may be false and/or out of date) that the
>> instances of drive hardware that are matched with SCSI controllers are the
>> more reliable (longer-lasting) ones.
> This is a common misconception. The interface is irrelevant,
Well it makes a difference, just hard to say if the interface
improves the life of the drive.
> the 'mean
> time failure rate' is most certainly relevant. Essentially you pay more
> for a disk with a longer mean time failure rate, meaning its less likely
> to fail.
HAHAHAHHA!! Thanks, I needed a good laugh.....
Sorry, not putting you down, just the numbers they toss out.
I used to work for a company that wrote software for figuring
out the MTBF (Mean Time Between Failures) and spent a lot of
time working with reliability engineers. MTBF is a bit of a
guess at best.
Okay, let's compare two Seagate drives:
Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
MTBF 700,000 hours (79 years!)
Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
MTBF 1,200,000 hours (136 years!)
From the numbers above you might think the SAS drive is going to
last twice as long. I don't think either of these drives are going
to last 50+ years unused in storage let alone in a running system!
You might see 5-7 years of 24/7 running, at best, before they're
going to start to drop like flies.
To get MTBF you combine the failure rates of the parts: in the simplest
series model the rates add, and the MTBF is the reciprocal of the sum.
So using more parts is likely to lower the MTBF. You can tweak
the numbers by using fewer or higher quality parts. If you can
halve the number of parts the MTBF will get better, so 50 average parts
give a worse MTBF and 25 average parts a better one. But 50 high quality
parts could have a better MTBF than 25 average parts. You can use a few
super low failure rate parts and lots of low quality parts and
get a better MTBF (on paper) than using all average parts, and it
will fail more often than the MTBF would make you think. Then there
are other factors like operating temperature, humidity, and environment
(dirty office/clean computer room/inside a flight computer in a
jet/etc).
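As a rough illustration of the parts-count argument above, here is a small sketch. It assumes the simplest series model, in which part failure rates add and the MTBF is the reciprocal of the sum. The per-part failure rates are made-up numbers chosen only to show the effect, not data for any real drive.

```python
def series_mtbf(part_failure_rates):
    """MTBF in hours for a series system.

    Assumes every part has a constant failure rate (failures per hour)
    and that any single part failing counts as a system failure.
    """
    return 1.0 / sum(part_failure_rates)


# Hypothetical per-part failure rates, in failures per hour.
AVERAGE_PART = 1e-6
HIGH_QUALITY_PART = 2.5e-7

fifty_average = series_mtbf([AVERAGE_PART] * 50)            # about 20,000 hours
twenty_five_average = series_mtbf([AVERAGE_PART] * 25)      # about 40,000 hours
fifty_high_quality = series_mtbf([HIGH_QUALITY_PART] * 50)  # about 80,000 hours

print(f"50 average parts:      {fifty_average:,.0f} h")
print(f"25 average parts:      {twenty_five_average:,.0f} h")
print(f"50 high quality parts: {fifty_high_quality:,.0f} h")
```

Under these assumed rates, halving the part count doubles the paper MTBF, and fifty better parts beat twenty-five average ones, which is the pattern described above.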
At best you can figure from the MTBF that either they're using
better or fewer parts. But with something that's at least 10 times off
(7 vs. 70 years) you can't judge by MTBF.
MTBF looks great to marketing/sales people but has not much real-world
value for users.
--
Barry Keeney
Chaos Consulting
email barrykchaoscon.com
"Rap is Square Dancing gone terribly, terribly Wrong...."
-
Re: SCSI vs SATA hard disks
On Mon, 20 Oct 2008, Barry Keeney wrote:
> criten wrote:
>> Hactar wrote:
>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
>>> Haines Brown wrote:
>>>> ebenZEROONE@verizon.net (Hactar) writes:
>>>>
>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>>>>> Haines Brown wrote:
>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>>>>> that I will henceforth have to accept drive unreliability?
>>>>> _All_ drives are unreliable to some degree. The ultimate in
>>>>> computer-readable reliability is probably Tyvek punched tape.
>>>> Yes, but my subjective impression is that there is a very wide
>>>> difference in reliability. Of the dozen SCSI drives I've used over the
>>>> years, only one failed on me; reading on line discussions and reviews,
>>>> it seems that SATA drives fail regularly.
>>>>
>>>> I guess my question comes down to, why should one bother these days with
>>>> the added expense of SCSI hard disks?
>>>
>>> It is my impression (which may be false and/or out of date) that the
>>> instances of drive hardware that are matched with SCSI controllers are the
>>> more reliable (longer-lasting) ones.
>
>> This is a common misconception. The interface is irrelevant,
>
> Well it makes a difference, just hard to say if the interface
> improves the life of the drive.
>
>> the 'mean
>> time failure rate' is most certainly relevant. Essentially you pay more
>> for a disk with a longer mean time failure rate, meaning its less likely
>> to fail.
>
> HAHAHAHHA!! Thanks, I needed a good laugh.....
>
> Sorry, Not putting you down, Just the numbers they toss out.
>
> I used to work for a company what wrote software for figuring
> out the MTBF (Mean Time Between Failures) and spent a lot of
> time working with reliability engineers. MTBF is a bit of a
> guess at best.
It's a pity you did not learn what MTBF actually refers to.
>
> Okay, Lets compare two seagate drives:
>
> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
>
> MTBF 700,000 hours (79 years!)
>
>
> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
>
> MTBF 1,200,000 hours (136 years!)
>
> From the numbers about you might think the SAS drive is going to
> last twice as long. I don't think either of these drives are going
> to last 50+ years unused in storage let alone in a running system!
> You might see 5-7 years of 24/7 running, at best, before they're
> going to start to drop like flies.
>
That's not what MTBF is intended to measure. You are claiming that MTBF
should equal lifetime and it does not.
Essentially, MTBF measures the likelihood of a random failure, NOT an
end-of-life failure. Arguably, MTBF is only useful to people who run large
datacenters with many disks -- they can use MTBF to estimate the failure
rate of their drives.
-
Re: SCSI vs SATA hard disks
Whoever wrote:
> On Mon, 20 Oct 2008, Barry Keeney wrote:
>> criten wrote:
>>> Hactar wrote:
>>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
>>>> Haines Brown wrote:
>>>>> ebenZEROONE@verizon.net (Hactar) writes:
>>>>>
>>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>>>>>> Haines Brown wrote:
>>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>>>>>> that I will henceforth have to accept drive unreliability?
>>>>>> _All_ drives are unreliable to some degree. The ultimate in
>>>>>> computer-readable reliability is probably Tyvek punched tape.
>>>>> Yes, but my subjective impression is that there is a very wide
>>>>> difference in reliability. Of the dozen SCSI drives I've used over the
>>>>> years, only one failed on me; reading on line discussions and reviews,
>>>>> it seems that SATA drives fail regularly.
>>>>>
>>>>> I guess my question comes down to, why should one bother these days with
>>>>> the added expense of SCSI hard disks?
>>>>
>>>> It is my impression (which may be false and/or out of date) that the
>>>> instances of drive hardware that are matched with SCSI controllers are the
>>>> more reliable (longer-lasting) ones.
>>
>>> This is a common misconception. The interface is irrelevant,
>>
>> Well it makes a difference, just hard to say if the interface
>> improves the life of the drive.
>>
>>> the 'mean
>>> time failure rate' is most certainly relevant. Essentially you pay more
>>> for a disk with a longer mean time failure rate, meaning its less likely
>>> to fail.
>>
>> HAHAHAHHA!! Thanks, I needed a good laugh.....
>>
>> Sorry, Not putting you down, Just the numbers they toss out.
>>
>> I used to work for a company what wrote software for figuring
>> out the MTBF (Mean Time Between Failures) and spent a lot of
>> time working with reliability engineers. MTBF is a bit of a
>> guess at best.
> It's a pity you did not learn what MTBF actually refers to.
It's the "average time between failures of a system" that's what
it means.
I wasn't saying *HOW* it should be used in the big picture.
>>
>> Okay, Lets compare two seagate drives:
>>
>> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
>>
>> MTBF 700,000 hours (79 years!)
>>
>>
>> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
>>
>> MTBF 1,200,000 hours (136 years!)
>>
>> From the numbers about you might think the SAS drive is going to
>> last twice as long. I don't think either of these drives are going
>> to last 50+ years unused in storage let alone in a running system!
>> You might see 5-7 years of 24/7 running, at best, before they're
>> going to start to drop like flies.
>>
> That's not what MTBF is intended to measure. You are claiming that MTBF
> should equal lifetime and it does not.
No, That's not my claim, I know it's not. When you only see the MTBF
number it's easy to jump to the idea about how long something might last.
I'm claiming the value of the "MTBF" is just about useless. There are
much better ways for life cycle analysis/modelling. MTBF is great to
toss out but has no real value by itself out of any context.
Without knowing how the value for MTBF was calculated you can't know
its usefulness. Was it from a steady failure model like the Mil standards
or something else? What's the data behind the MTBF? How did they
get this number? Did they do any real run testing or just run the
numbers (ideal temp/operating conditions) that gets the best MTBF number?
"Hmmm if we run the drive at a temp of -5C, the calculations say the MTBF
is 1,200,000 hours. That's within the listed operating range."
I'm not claiming Seagate or any other drive company is lying, cheating or
trying to mislead people, they are just putting out the info they have
that puts their products in the best light, ideal environment/best possible
results. You might find a paper on how they do their testing but it'll take
some digging to figure out how they got their MTBF or MTTF numbers for
a drive.
> Essentially, MTBF measures the likelihood of a random failure, NOT an
> end-of-life failure. Arguably, MTBF is only useful to people who run large
> datacenters with many disks -- they can use MTBF to estimate the failure
> rate of their drives.
No, MTBF is the *AVERAGE* time between failures. That's why I hate
seeing it used in marketing and specs sheets. It's not the real average, not
even close (for hard drives anyway). It's not real data from years of
running the drives, they don't have the time to run the drives for years
before sending them to market to get the real numbers. It's just, at best,
educated guessing using known data about the parts.
If you take 1000 new drives and run them until each fails, the average
you get won't be anything like 1,200,000 hours, even if you toss out
the first 100 failures and only use the data from the 900 longest
lasting drives.
MTBF can be useful during the early design of new devices/electronics.
If I get a value of MTBF of 1000 hours and I need at least 2000 hours I
need to rework the design or use other methods to figure out why it's
low and fix the design.
MTBF isn't useful by itself. The Annual Failure Rate(AFR) might be more
useful, depending how they figured that out but no details on this either.
(AFR for the ST3500630AS is 0.34%, ST3500620SS is 0.73% )
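For what it's worth, the quoted AFR figures can be roughly reproduced from the MTBF numbers. This is a sketch only: it assumes a constant failure rate, so that AFR = 1 - exp(-power-on hours per year / MTBF). The 2,400 power-on hours per year used for the desktop SATA drive is an assumption about duty cycle, not a figure from this thread.

```python
import math


def afr_from_mtbf(mtbf_hours, power_on_hours_per_year=8760):
    """Annualized failure rate implied by an MTBF.

    Assumes a constant (exponential) failure rate; 8760 hours per year
    corresponds to 24x7 operation.
    """
    return 1.0 - math.exp(-power_on_hours_per_year / mtbf_hours)


# SAS drive (ST3500620SS), assumed to run 24x7.
sas_afr = afr_from_mtbf(1_200_000)

# SATA drive (ST3500630AS), with an ASSUMED desktop duty cycle
# of 2,400 power-on hours per year.
sata_afr = afr_from_mtbf(700_000, power_on_hours_per_year=2400)

print(f"SAS  AFR: {sas_afr:.2%}")
print(f"SATA AFR: {sata_afr:.2%}")
```

If that duty-cycle assumption is anywhere near right, the lower AFR on the SATA drive says more about the hours it is expected to run than about the drive being more reliable.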
How do I decide on a drive vendor?
I use warranties and how the company deals with warranty
repairs/replacement for drives as a guide. Not going to be
the only thing I look at but it has been useful to me.
A Short warranty - 3 years or less
Paying to upgrade the warranty to 4 or 5 years.
Limits for warranty replacement (only one warranty replacement, etc)
Having to pay shipping costs
These are possible problems: the drive might not be as good as
others, or it's going to cost more over the long run.
Is the warranty for their drives in their external cases the
same as internal drives?
If the maker can't build a case/drive combo that they will stand
behind as long as an internal drive, maybe I should look elsewhere.
--
Barry Keeney
Chaos Consulting
email barrykchaoscon.com
"Rap is Square Dancing gone terribly, terribly Wrong...."
-
Re: SCSI vs SATA hard disks
On Mon, 20 Oct 2008, Barry Keeney wrote:
> Whoever wrote:
>
>
>> On Mon, 20 Oct 2008, Barry Keeney wrote:
>
>>> criten wrote:
>>>> Hactar wrote:
>>>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
>>>>> Haines Brown wrote:
>>>>>> ebenZEROONE@verizon.net (Hactar) writes:
>>>>>>
>>>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>>>>>>> Haines Brown wrote:
>>>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>>>>>>> that I will henceforth have to accept drive unreliability?
>>>>>>> _All_ drives are unreliable to some degree. The ultimate in
>>>>>>> computer-readable reliability is probably Tyvek punched tape.
>>>>>> Yes, but my subjective impression is that there is a very wide
>>>>>> difference in reliability. Of the dozen SCSI drives I've used over the
>>>>>> years, only one failed on me; reading on line discussions and reviews,
>>>>>> it seems that SATA drives fail regularly.
>>>>>>
>>>>>> I guess my question comes down to, why should one bother these days with
>>>>>> the added expense of SCSI hard disks?
>>>>>
>>>>> It is my impression (which may be false and/or out of date) that the
>>>>> instances of drive hardware that are matched with SCSI controllers are the
>>>>> more reliable (longer-lasting) ones.
>>>
>>>> This is a common misconception. The interface is irrelevant,
>>>
>>> Well it makes a difference, just hard to say if the interface
>>> improves the life of the drive.
>>>
>>>> the 'mean
>>>> time failure rate' is most certainly relevant. Essentially you pay more
>>>> for a disk with a longer mean time failure rate, meaning its less likely
>>>> to fail.
>>>
>>> HAHAHAHHA!! Thanks, I needed a good laugh.....
>>>
>>> Sorry, Not putting you down, Just the numbers they toss out.
>>>
>>> I used to work for a company what wrote software for figuring
>>> out the MTBF (Mean Time Between Failures) and spent a lot of
>>> time working with reliability engineers. MTBF is a bit of a
>>> guess at best.
>
>> It's a pity you did not learn what MTBF actually refers to.
>
> It's the "average time between failures of a system" that's what
> it means.
>
> I wasn't saying *HOW* it should be used in the big picture.
>
>>>
>>> Okay, Lets compare two seagate drives:
>>>
>>> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
>>>
>>> MTBF 700,000 hours (79 years!)
>>>
>>>
>>> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
>>>
>>> MTBF 1,200,000 hours (136 years!)
>>>
>>> From the numbers about you might think the SAS drive is going to
>>> last twice as long. I don't think either of these drives are going
>>> to last 50+ years unused in storage let alone in a running system!
>>> You might see 5-7 years of 24/7 running, at best, before they're
>>> going to start to drop like flies.
>>>
>
>
>> That's not what MTBF is intended to measure. You are claiming that MTBF
>> should equal lifetime and it does not.
>
> No, That's not my claim, I know it's not. When you only see the MTBF
> number it's easy to jump to the idea about how long something might last.
>
> I'm claiming the value of the "MTBF" is just about useless. There are
> much better ways for life cycle analysis/modelling. MTBF is great to
> toss out but has no real value by itself out of any context.
>
> Without knowing how the value for MTBF was calculated you can't know
> it's usefulness. Was it from a steady failure model like the Mil standards
> or something else? Whats the data behind the MTBF? How did they
> get this number? Did they do any real run testing or just run the
> numbers (ideal temp/operating conditions) that gets the best MTBF number?
> "Hmmm if we run the drive at a temp of -5C, the calculations say the MTBF
> is 1,200,000 hours. That's within the listed operating range."
>
> I'm not claiming Seagate or any other drive company is lying, cheating or
> trying to mislead people, they are just putting out the info they have
> that puts their products in the best light, idea enviroment/best possible
> results. You might find a paper on how they do their testing but it'll take
> some digging to figure out how they got their MTBF or MTTF numbers for
> a drive.
>
>> Essentially, MTBF measures the likelihood of a random failure, NOT an
>> end-of-life failure. Arguably, MTBF is only useful to people who run large
>> datacenters with many disks -- they can use MTBF to estimate the failure
>> rate of their drives.
>
> No, MTBF is the *AVERAGE* time between failures. That's why I hate
> seeing it used in marketing and specs sheets. It's not the real average, not
> even close (for hard drives anyway). It's not real data from years of
> running the drives, they don't have the time to run the drives for years
> before sending them to market to get the real numbers. It's just, at best,
> educated guessing using known data about the parts.
>
> If you take a 1000 new drives and run them until each fail the average
> you get won't be anything like 1,200,000 hours, even if you toss out
> numbers first 100 failures and only use the 900 longest lasting drives
> data.
Again, you show that you don't really understand MTBF. Most drives will
fail because they reach end-of-life (they wear out). This is irrelevant to
MTBF.
Instead, if you took 1,200 drives, on average, you would expect one to
fail every 1000 hours, assuming that you: 1. Ignore early failures and 2.
swap out the drives before they wear out (without counting these
swapped-out drives as failures).
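The fleet arithmetic above can be written out as a short sketch. It assumes a constant failure rate during service life and that failed drives are replaced, so the fleet size stays at 1,200. The function names are just illustrative.

```python
def expected_failures(drive_count, mtbf_hours, window_hours):
    """Expected number of random failures in a fleet over a time window.

    Assumes a constant failure rate and that failed drives are replaced,
    so the number of running drives stays the same.
    """
    return drive_count * window_hours / mtbf_hours


def hours_between_failures(drive_count, mtbf_hours):
    """Average time between failures somewhere in the fleet."""
    return mtbf_hours / drive_count


FLEET = 1200
MTBF_HOURS = 1_200_000
HOURS_PER_YEAR = 8766

per_thousand_hours = expected_failures(FLEET, MTBF_HOURS, 1000)
per_year = expected_failures(FLEET, MTBF_HOURS, HOURS_PER_YEAR)
gap = hours_between_failures(FLEET, MTBF_HOURS)

print(f"Failures per 1000 hours: {per_thousand_hours:.2f}")
print(f"Failures per year:       {per_year:.2f}")
print(f"Hours between failures:  {gap:.0f}")
```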
For the average user, the lifetime of the drive is more important. I'm not
aware of drive manufacturers providing this information to consumers,
however, like you, I believe it can be inferred from the warranties
provided with the drives.
-
Re: SCSI vs SATA hard disks
Whoever wrote:
> On Mon, 20 Oct 2008, Barry Keeney wrote:
>> Whoever wrote:
>>
>>
>>> On Mon, 20 Oct 2008, Barry Keeney wrote:
>>
>>>> criten wrote:
>>>>> Hactar wrote:
>>>>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
>>>>>> Haines Brown wrote:
>>>>>>> ebenZEROONE@verizon.net (Hactar) writes:
>>>>>>>
>>>>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>>>>>>>> Haines Brown wrote:
>>>>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>>>>>>>> that I will henceforth have to accept drive unreliability?
>>>>>>>> _All_ drives are unreliable to some degree. The ultimate in
>>>>>>>> computer-readable reliability is probably Tyvek punched tape.
>>>>>>> Yes, but my subjective impression is that there is a very wide
>>>>>>> difference in reliability. Of the dozen SCSI drives I've used over the
>>>>>>> years, only one failed on me; reading on line discussions and reviews,
>>>>>>> it seems that SATA drives fail regularly.
>>>>>>>
>>>>>>> I guess my question comes down to, why should one bother these days with
>>>>>>> the added expense of SCSI hard disks?
>>>>>>
>>>>>> It is my impression (which may be false and/or out of date) that the
>>>>>> instances of drive hardware that are matched with SCSI controllers are the
>>>>>> more reliable (longer-lasting) ones.
>>>>
>>>>> This is a common misconception. The interface is irrelevant,
>>>>
>>>> Well it makes a difference, just hard to say if the interface
>>>> improves the life of the drive.
>>>>
>>>>> the 'mean
>>>>> time failure rate' is most certainly relevant. Essentially you pay more
>>>>> for a disk with a longer mean time failure rate, meaning its less likely
>>>>> to fail.
>>>>
>>>> HAHAHAHHA!! Thanks, I needed a good laugh.....
>>>>
>>>> Sorry, Not putting you down, Just the numbers they toss out.
>>>>
>>>> I used to work for a company what wrote software for figuring
>>>> out the MTBF (Mean Time Between Failures) and spent a lot of
>>>> time working with reliability engineers. MTBF is a bit of a
>>>> guess at best.
>>
>>> It's a pity you did not learn what MTBF actually refers to.
>>
>> It's the "average time between failures of a system" that's what
>> it means.
>>
>> I wasn't saying *HOW* it should be used in the big picture.
>>
>>>>
>>>> Okay, Lets compare two seagate drives:
>>>>
>>>> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
>>>>
>>>> MTBF 700,000 hours (79 years!)
>>>>
>>>>
>>>> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
>>>>
>>>> MTBF 1,200,000 hours (136 years!)
>>>>
>>>> From the numbers about you might think the SAS drive is going to
>>>> last twice as long. I don't think either of these drives are going
>>>> to last 50+ years unused in storage let alone in a running system!
>>>> You might see 5-7 years of 24/7 running, at best, before they're
>>>> going to start to drop like flies.
>>>>
>>
>>
>>> That's not what MTBF is intended to measure. You are claiming that MTBF
>>> should equal lifetime and it does not.
>>
>> No, That's not my claim, I know it's not. When you only see the MTBF
>> number it's easy to jump to the idea about how long something might last.
>>
>> I'm claiming the value of the "MTBF" is just about useless. There are
>> much better ways for life cycle analysis/modelling. MTBF is great to
>> toss out but has no real value by itself out of any context.
>>
>> Without knowing how the value for MTBF was calculated you can't know
>> it's usefulness. Was it from a steady failure model like the Mil standards
>> or something else? Whats the data behind the MTBF? How did they
>> get this number? Did they do any real run testing or just run the
>> numbers (ideal temp/operating conditions) that gets the best MTBF number?
>> "Hmmm if we run the drive at a temp of -5C, the calculations say the MTBF
>> is 1,200,000 hours. That's within the listed operating range."
>>
>> I'm not claiming Seagate or any other drive company is lying, cheating or
>> trying to mislead people, they are just putting out the info they have
>> that puts their products in the best light, idea enviroment/best possible
>> results. You might find a paper on how they do their testing but it'll take
>> some digging to figure out how they got their MTBF or MTTF numbers for
>> a drive.
>>
>>> Essentially, MTBF measures the likelihood of a random failure, NOT an
>>> end-of-life failure. Arguably, MTBF is only useful to people who run large
>>> datacenters with many disks -- they can use MTBF to estimate the failure
>>> rate of their drives.
>>
>> No, MTBF is the *AVERAGE* time between failures. That's why I hate
>> seeing it used in marketing and specs sheets. It's not the real average, not
>> even close (for hard drives anyway). It's not real data from years of
>> running the drives, they don't have the time to run the drives for years
>> before sending them to market to get the real numbers. It's just, at best,
>> educated guessing using known data about the parts.
>>
>> If you take a 1000 new drives and run them until each fail the average
>> you get won't be anything like 1,200,000 hours, even if you toss out
>> numbers first 100 failures and only use the 900 longest lasting drives
>> data.
> Again, you show that you don't really understand MTBF. Most drives will
> fail because they reach end-of-life (they wear out). This is irrelevent to
> MTBF.
Something "wearing out" is a failure! Just because you don't repair
the failed part/system doesn't mean it doesn't count as a failure! It
should be called MTTF (Mean Time To Failure) if it's not going to be repaired.
> Instead, if you took 1,200 drives, on average, you would expect one to
> fail every 1000 hours, assuming that you: 1. Ignore early failures and 2.
> swap out the drives before they wear out (without counting these
> swapped-out drives as failures).
No, that's 1000 hrs MTBF/MTTF. With only one failure, that's the only real
data you have. The other 1199 drives haven't failed, so you can't
extend their "not failing" into the MTBF. Failure rate, yes: 1 per 1000 hours.
(That's the problem with a small failure sample size.)
Okay, let's use "your" numbers: 1 failure per 1000hrs, and drives don't
"wear out", they just fail.
So one drive fails after 1000hrs, the next at 2000hrs, the third at 3000hrs, etc.
At that rate, after 10 years you'd have a total of *only* ~88 of the original
1200 fail (8766hrs/year). With less than 10% of the drives failing, you have
a total of 3,916,000hrs of operation for the 88 drives, with an MTBF of 44,500hrs
or 5.08 years. (But are 1100+ drives really going to be running after
10 years?)
Now run the numbers up to 100 failures. That's a total of 5,050,000hrs,
or an MTBF of 50,500hrs or 5.76 years (after running the test for 11.4 years!).
Before you jump on this, remember the other 1100 drives are still
running.... AND I'm using *Your* numbers.
At this point the MTBF from the failure data is still far below the
figure your numbers are supposed to produce. If we had started out with
only 100 drives, it would be the real MTBF for this sample (50,500hrs/5.76 years).
It's going to take 136+ years before all 1200 drives should
fail (at 1 per 1000hrs, or 8.766 per year).
Keep running the numbers and the MTBF might grow up to numbers like
1,200,000 hours, depending on sample size and a flat failure rate.
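As a rough check on the arithmetic above, here is a minimal Python sketch. It
assumes the same simplified schedule (exactly one failure every 1000 hours,
8766 hours per year). The function name mean_time_to_failure is made up for
illustration and is not from any reliability package.

```python
HOURS_PER_YEAR = 8766      # 365.25 days, as used in the post
FAILURE_INTERVAL = 1000    # assumed: exactly one failure every 1000 hours


def mean_time_to_failure(failures):
    """Total and average age at failure of the first `failures` drives.

    Drive k is assumed to fail at k * FAILURE_INTERVAL hours. Drives that
    are still running are ignored, exactly as in the hand calculation.
    """
    total_hours = sum(k * FAILURE_INTERVAL for k in range(1, failures + 1))
    return total_hours, total_hours / failures


for n in (88, 100, 1200):
    total, mean = mean_time_to_failure(n)
    test_years = n * FAILURE_INTERVAL / HOURS_PER_YEAR
    print(f"{n:5d} failures: {total:>11,} hrs total, "
          f"mean {mean:>9,.0f} hrs ({mean / HOURS_PER_YEAR:.2f} yrs), "
          f"test length {test_years:.1f} yrs")
```

Under this schedule the average works out to 600,500 hours once the last of
the 1200 drives has failed, because only the failed drives' hours are counted.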
That's the problem with flat failure rates and the models that use
them: they don't deal with the higher number of failures at the beginning
(infant mortality) and near the end of life (aka the "bathtub" curve
of failure rates).
Now if the drive maker has a 5-year warranty and wants at least 90%
to make it to 5 years, you'd want the MTBF/MTTF to be around 40,000hrs
or 4.56yrs (excluding early-life failures and rising failure rates
with age).
So MTBFs of 1,200,000 are worthless without the info used to get the number.
> For the average user, the lifetime of the drive is more important. I'm not
> aware of drive manufacturers providing this information to consumers,
Well, if you're a big computer maker like Dell/HP/etc, you're going
to want detailed specs on parts before you decide to use them and/or
a warranty that works for the price point you're looking for.
They don't want their name to be hurt because they used a cheap
drive that fails too often.
> however, like you, I believe it can be inferred from the warranties
> provided with the drives.
You've got to figure they've done the math and know how much the
warranties are going to cost per unit and they still expect a profit
over the product life/warranty life.
--
Barry Keeney
Chaos Consulting
email barryk@chaoscon.com
"Rap is Square Dancing gone terribly, terribly Wrong...."
-
Re: SCSI vs SATA hard disks
On Tue, 21 Oct 2008, Barry Keeney wrote:
> Whoever wrote:
>
>
>> On Mon, 20 Oct 2008, Barry Keeney wrote:
>
>>> Whoever wrote:
>>>
>>>
>>>> On Mon, 20 Oct 2008, Barry Keeney wrote:
>>>
>>>>> criten wrote:
>>>>>> Hactar wrote:
>>>>>>> In article <87d4ixbd77.fsf@teufel.hartford-hwp.com>,
>>>>>>> Haines Brown wrote:
>>>>>>>> ebenZEROONE@verizon.net (Hactar) writes:
>>>>>>>>
>>>>>>>>> In article <87hc89bmfm.fsf@teufel.hartford-hwp.com>,
>>>>>>>>> Haines Brown wrote:
>>>>>>>>>> Would a move to a SATA 3.0 Gb/s drive such as the Seagate Barracuda mean
>>>>>>>>>> that I will henceforth have to accept drive unreliability?
>>>>>>>>> _All_ drives are unreliable to some degree. The ultimate in
>>>>>>>>> computer-readable reliability is probably Tyvek punched tape.
>>>>>>>> Yes, but my subjective impression is that there is a very wide
>>>>>>>> difference in reliability. Of the dozen SCSI drives I've used over the
>>>>>>>> years, only one failed on me; reading on line discussions and reviews,
>>>>>>>> it seems that SATA drives fail regularly.
>>>>>>>>
>>>>>>>> I guess my question comes down to, why should one bother these days with
>>>>>>>> the added expense of SCSI hard disks?
>>>>>>>
>>>>>>> It is my impression (which may be false and/or out of date) that the
>>>>>>> instances of drive hardware that are matched with SCSI controllers are the
>>>>>>> more reliable (longer-lasting) ones.
>>>>>
>>>>>> This is a common misconception. The interface is irrelevant,
>>>>>
>>>>> Well it makes a difference, just hard to say if the interface
>>>>> improves the life of the drive.
>>>>>
>>>>>> the 'mean
>>>>>> time failure rate' is most certainly relevant. Essentially you pay more
>>>>>> for a disk with a longer mean time failure rate, meaning its less likely
>>>>>> to fail.
>>>>>
>>>>> HAHAHAHHA!! Thanks, I needed a good laugh.....
>>>>>
>>>>> Sorry, Not putting you down, Just the numbers they toss out.
>>>>>
>>>>> I used to work for a company what wrote software for figuring
>>>>> out the MTBF (Mean Time Between Failures) and spent a lot of
>>>>> time working with reliability engineers. MTBF is a bit of a
>>>>> guess at best.
>>>
>>>> It's a pity you did not learn what MTBF actually refers to.
>>>
>>> It's the "average time between failures of a system" that's what
>>> it means.
>>>
>>> I wasn't saying *HOW* it should be used in the big picture.
>>>
>>>>>
>>>>> Okay, Lets compare two seagate drives:
>>>>>
>>>>> Barracuda 7200.10 SATA 3.0Gb/s 500-GB Hard Drive (ST3500630AS)
>>>>>
>>>>> MTBF 700,000 hours (79 years!)
>>>>>
>>>>>
>>>>> Barracuda ES.2 SAS 3.0-Gb/s 500-GB Hard Drive (ST3500620SS)
>>>>>
>>>>> MTBF 1,200,000 hours (136 years!)
>>>>>
>>>>> From the numbers about you might think the SAS drive is going to
>>>>> last twice as long. I don't think either of these drives are going
>>>>> to last 50+ years unused in storage let alone in a running system!
>>>>> You might see 5-7 years of 24/7 running, at best, before they're
>>>>> going to start to drop like flies.
>>>>>
>>>
>>>
>>>> That's not what MTBF is intended to measure. You are claiming that MTBF
>>>> should equal lifetime and it does not.
>>>
>>> No, That's not my claim, I know it's not. When you only see the MTBF
>>> number it's easy to jump to the idea about how long something might last.
>>>
>>> I'm claiming the value of the "MTBF" is just about useless. There are
>>> much better ways for life cycle analysis/modelling. MTBF is great to
>>> toss out but has no real value by itself out of any context.
>>>
>>> Without knowing how the value for MTBF was calculated you can't know
>>> it's usefulness. Was it from a steady failure model like the Mil standards
>>> or something else? Whats the data behind the MTBF? How did they
>>> get this number? Did they do any real run testing or just run the
>>> numbers (ideal temp/operating conditions) that gets the best MTBF number?
>>> "Hmmm if we run the drive at a temp of -5C, the calculations say the MTBF
>>> is 1,200,000 hours. That's within the listed operating range."
>>>
>>> I'm not claiming Seagate or any other drive company is lying, cheating or
>>> trying to mislead people, they are just putting out the info they have
>>> that puts their products in the best light, idea enviroment/best possible
>>> results. You might find a paper on how they do their testing but it'll take
>>> some digging to figure out how they got their MTBF or MTTF numbers for
>>> a drive.
>>>
>>>> Essentially, MTBF measures the likelihood of a random failure, NOT an
>>>> end-of-life failure. Arguably, MTBF is only useful to people who run large
>>>> datacenters with many disks -- they can use MTBF to estimate the failure
>>>> rate of their drives.
>>>
>>> No, MTBF is the *AVERAGE* time between failures. That's why I hate
>>> seeing it used in marketing and specs sheets. It's not the real average, not
>>> even close (for hard drives anyway). It's not real data from years of
>>> running the drives, they don't have the time to run the drives for years
>>> before sending them to market to get the real numbers. It's just, at best,
>>> educated guessing using known data about the parts.
>>>
>>> If you take a 1000 new drives and run them until each fail the average
>>> you get won't be anything like 1,200,000 hours, even if you toss out
>>> numbers first 100 failures and only use the 900 longest lasting drives
>>> data.
>
>> Again, you show that you don't really understand MTBF. Most drives will
>> fail because they reach end-of-life (they wear out). This is irrelevent to
>> MTBF.
>
> Something "wearing out" is a failure! Just because you don't repair
> the failed part/system doesn't mean is doesn't count as a failure! It
> should be call MTTF (Mean Time To Fail) if it's not going to be repaired.
>
>> Instead, if you took 1,200 drives, on average, you would expect one to
>> fail every 1000 hours, assuming that you: 1. Ignore early failures and 2.
>> swap out the drives before they wear out (without counting these
>> swapped-out drives as failures).
>
> No thats 1000 hrs MTBF/MTTF, with only one failure that's the only real
> data you have. The other 1199 drives haven't failed so you can't
> expand their "not failing" for MTBF. Failure rate yes, 1 per 1000 hours.
> (the problem with a small failure sample size)
>
> Okay, let's use "your" numbers. 1 per 1000hrs and drives don't "wear out"
> just fail.
>
> So one drive fails after 1000hrs, next at 2000hrs, third at 3000hrs, etc.
> At that rate after 10 years you'd have a total of *only* ~88 of the orignal
> 1200 fail (8766hrs/year). With less the 10% of the drives failing, you have
> a total of 3,916,000hrs of operation of the 88 drives with a MTBF of 44,500hrs
> or 5.08 years.(But are 1100+ drives really going to be running after
> 10 years?)
>
> Now run the numbers up to 100 failures. Thats a total of 5,050,000hrs
> or MTBF of 50,500hrs or 5.76years. (after running the test for 11.40 years!)
>
> Before you jump on this, remember the other 1100 drives are still
> running.... AND I'm using *Your* numbers.
Ah, I see the reason for your poor understanding of MTBF, you have poor
reading comprehension skills also. What is it about "2.
swap out the drives before they wear out (without counting these
swapped-out drives as failures)." that you don't understand? In my
scenario, not one of the "other 1100 drives" would be running, because
they would have been swapped out.
>
> At this rate the MTBF is still below what the real data, your
> numbers are generating. If we started out with only a 100 drives
> it would be the real MTBF for this sample. (50,500hrs/5.76years)
>
> It's going to take 136+ years before all 1200 drives should
> fail. (at 1 per 1000hrs or 8.766 per year.)
Again, look up the definition of MTBF. It assumes that you replace or
repair failed units, so there is no time at which "all 1200 drives ...
fail". Instead, they have all been swapped out (because of age, not
necessarily failure), probably many times and at the end of the
experiment, you still have 1200 drives.
One more comment: you might want to study some statistics. In the
experiment you propose, there is an ever-reducing number of drives in the
experiment (as they fail), yet the rate at which drives fail is unchanged.
This seems rather unlikely -- instead, as the number of drives in the
experiment is reduced, the number of drives that fail per month would
also be reduced.
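To illustrate the point, here is a small Python sketch of a constant per-drive
failure rate with no replacement. The 1,200,000 hour MTBF and the pool of 1200
drives come from this thread; the exponential survival model and the function
names are assumptions made for the illustration.

```python
import math

MTBF_HOURS = 1_200_000   # figure quoted earlier in the thread
FLEET_SIZE = 1200


def expected_survivors(hours, fleet=FLEET_SIZE, mtbf=MTBF_HOURS):
    """Expected drives still running after `hours`, with no replacement,
    assuming each drive fails independently at constant rate 1/mtbf."""
    return fleet * math.exp(-hours / mtbf)


def expected_failures(start, end, fleet=FLEET_SIZE, mtbf=MTBF_HOURS):
    """Expected number of failures between `start` and `end` hours."""
    return expected_survivors(start, fleet, mtbf) - expected_survivors(end, fleet, mtbf)


for start in (0, 600_000, 1_200_000):
    n = expected_failures(start, start + 1000)
    print(f"hours {start:>9,} to {start + 1000:>9,}: {n:.3f} expected failures")
```

With a full pool this gives close to one failure per 1000 hours; as the pool
shrinks, the expected count per 1000 hours drops in proportion.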
I will agree with you on one thing though -- I do suspect that MTBF rates
are artificially high.
-
Re: SCSI vs SATA hard disks
In article ,
Whoever wrote:
:
:For the average user, the lifetime of the drive is more important. I'm not
:aware of drive manufacturers providing this information to consumers,
:however, like you, I believe it can be inferred from the warranties
:provided with the drives.
Out of curiosity, I did some calculations based on the SMART data
for Power_On_Hours on a couple of my PATA drives:
Drive A (ST380013A):
ID# ATTRIBUTE_NAME   VALUE WORST THRESH TYPE     UPDATED  RAW_VALUE
  9 Power_On_Hours   058   058   000    Old_age  Always   36796
Looks like 36796 hours might be 42% (100 - 58) of expected life.
(36796 / .42) = 87610 hours, or 10.00 years
Drive B (ST3500630A):
ID# ATTRIBUTE_NAME   VALUE WORST THRESH TYPE     UPDATED  RAW_VALUE
  9 Power_On_Hours   092   092   000    Old_age  Always   7078
(7078 / .08) = 88475 hours, or 10.10 years
Surprisingly consistent, and strictly "FWIW", which might be not much
since I'm making an assumption about the unknown conversion from raw
to normalized values.
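The same back-of-the-envelope estimate as a Python sketch. Like the
calculation above, it assumes the normalized value counts down linearly from
100 to 0 over the drive's expected life, which is a guess. The function name
and the 8760 hours per year are choices made for this example.

```python
HOURS_PER_YEAR = 8760  # 365 days; assumed for this example


def projected_life_hours(raw_hours, normalized_value):
    """Extrapolate total life from SMART attribute 9 (Power_On_Hours).

    Assumes the normalized value falls linearly from 100 to 0 over the
    drive's expected life. That mapping is vendor-specific and unknown
    here, so treat the result as a guess.
    """
    fraction_used = (100 - normalized_value) / 100
    if fraction_used <= 0:
        raise ValueError("normalized value has not dropped below 100 yet")
    return raw_hours / fraction_used


for name, raw, value in (("ST380013A", 36796, 58), ("ST3500630A", 7078, 92)):
    hours = projected_life_hours(raw, value)
    print(f"{name}: {hours:.0f} hours, {hours / HOURS_PER_YEAR:.2f} years")
```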
--
Bob Nichols AT comcast.net I am "RNichols42"
-
Re: SCSI vs SATA hard disks
>Whoever wrote:
>> On Tue, 21 Oct 2008, Barry Keeney wrote:
>>> Whoever wrote:
>>
>> Something "wearing out" is a failure! Just because you don't repair
>> the failed part/system doesn't mean is doesn't count as a failure! It
>> should be call MTTF (Mean Time To Fail) if it's not going to be repaired.
>>
>>> Instead, if you took 1,200 drives, on average, you would expect one to
>>> fail every 1000 hours, assuming that you: 1. Ignore early failures and 2.
>>> swap out the drives before they wear out (without counting these
>>> swapped-out drives as failures).
>>
>> No thats 1000 hrs MTBF/MTTF, with only one failure that's the only real
>> data you have. The other 1199 drives haven't failed so you can't
>> expand their "not failing" for MTBF. Failure rate yes, 1 per 1000 hours.
>> (the problem with a small failure sample size)
>>
>> Okay, let's use "your" numbers. 1 per 1000hrs and drives don't "wear out"
>> just fail.
>>
>> So one drive fails after 1000hrs, next at 2000hrs, third at 3000hrs, etc.
>> At that rate after 10 years you'd have a total of *only* ~88 of the orignal
>> 1200 fail (8766hrs/year). With less the 10% of the drives failing, you have
>> a total of 3,916,000hrs of operation of the 88 drives with a MTBF of 44,500hrs
>> or 5.08 years.(But are 1100+ drives really going to be running after
>> 10 years?)
>>
>> Now run the numbers up to 100 failures. Thats a total of 5,050,000hrs
>> or MTBF of 50,500hrs or 5.76years. (after running the test for 11.40 years!)
>>
>> Before you jump on this, remember the other 1100 drives are still
>> running.... AND I'm using *Your* numbers.
> Ah, I see the reason for your poor understanding of MTBF, you have poor
> reading comprehension skills also. What is it about "2.
> swap out the drives before they wear out (without counting these
> swapped-out drives as failures)." that you don't understand? In my
> scenario, not one of the "other 1100 drives" would be running, because
> they would have been swapped out.
No, it just shows you don't know much about failure analysis.
You don't wait until the product is out with customers to figure
out how long it's going to last. You test BEFORE going to market.
To figure out failure rates you can test or use known data.
If you're testing, you need to run *UNTIL* failure. You can't
get failure data/rates if you're replacing your test samples before
they fail. "We didn't have any fail, they'll last forever!" :^)
If you're not using some sort of stress testing (higher temps/etc)
then you can just run them until failure. You pick a sample size and
run them until you have enough failures to figure out failure rates
and MTBF.
Your scenario might be how some companies run their data centers,
but I haven't seen this. I've seen replacing everything every 3-5 years
with newer tech, and replacing failed parts as needed, as more the norm.
Once you buy the product, you can test how you like, but it
doesn't sound like testing, more like running in a production
environment where you want to prevent failures.
>> At this rate the MTBF is still below what the real data, your
>> numbers are generating. If we started out with only a 100 drives
>> it would be the real MTBF for this sample. (50,500hrs/5.76years)
>>
>> It's going to take 136+ years before all 1200 drives should
>> fail. (at 1 per 1000hrs or 8.766 per year.)
> Again, look up the definition of MTBF. It assumes that you replace or
> repair failed units, so there is no time at which "all 1200 drives ...
> fail". Instead, they have all been swapped out (because of age, not
> necessarily failure), probably many times and at the end of the
> experiment, you still have 1200 drives.
Not in "run to failure testing"; you don't replace the failures.
You continue to run the remaining units until either the time for the
test is over or you reach the number of failures required, depending
on the test parameters. If you replace a failed part, that part's start
time is different and you'll need to keep track of this. If you're running
a 1000 hour test, this replacement will still need to run the 1000 hours,
not whatever hours remain. Otherwise it wouldn't have been stressed the
same as the original units in the sample.
> One more comment, you might want to study some statistics. In the
> experiment you propose, there is an ever reducing number of drives in the
> experiment (as they fail), yet the rate at which drives fail is unchanged.
> This seems rather unlike -- instead, as the number of drives in the
> experiement is reduced, the number of drives that fail per month would
> also be reduced.
Since we haven't run the real-world test, we can only model it using
known info: 1200 drives, fixed failure rate of 1 per 1000hrs.
With fixed failure rate models, the failure rate is constant and the number
of units doesn't matter. You've got 10, one fails per hour, after 5 hours
you'll have 5. Not what you'd see in the real world, true. So you do get
the results I used. Again, a problem with fixed failure rate models.
(Hmm, I keep saying this but you don't seem to notice it....)
In real testing this *shouldn't* be the case. You'd expect to get a
spike of failures near the start, fewer in the middle, and a rising number
when nearing end of life. I *think* we can agree on this?
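For what that shape looks like in numbers, here is a rough Python sketch of a
"bathtub" failure rate built from three pieces: a falling early-life term, a
flat term, and a rising wear-out term. Using Weibull hazard terms is one
common way to model this, chosen here as an assumption; every parameter below
is made up for illustration and none of it is drive vendor data.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape/scale) * (t/scale)**(shape - 1), t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Failures per drive-hour at age t hours (t > 0). Parameters are hypothetical."""
    infant = weibull_hazard(t, shape=0.5, scale=2_000_000)   # falls with age
    flat = 1 / 1_200_000                                      # constant, MTBF-style
    wearout = weibull_hazard(t, shape=5.0, scale=60_000)      # rises with age
    return infant + flat + wearout


for hours in (100, 1_000, 20_000, 45_000, 60_000):
    rate = bathtub_hazard(hours) * 1e6
    print(f"{hours:>6} hrs: {rate:8.2f} failures per million drive-hours")
```

A flat-rate model keeps only the middle term, which is why it misses both the
early spike and the end-of-life rise.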
> I will agree with you on one thing though -- I do suspect that MTBF rates
> are artifically high.
Well there is that.....
--
Barry Keeney
Chaos Consulting
email barryk<@>chaoscon.com
"Rap is Square Dancing gone terribly, terribly Wrong...."