Failure of external HDD's - why doesn't any manufacturer wake up to this? - Storage



  1. Failure of external HDD's - why doesn't any manufacturer wake up to this?

    A couple of years ago I bought 4 external HDD's, Maxtors, to increase
    storage capacity on some machines (and offer portability) in my business.
    After 2 failed in short order I returned the other 2 and swore not to
    touch external drives again.

    A couple of months ago my IT supplier convinced me to try a LaCie 2TB
    external HDD (internally, 4 Hitachi 500GB drives in a RAID 0 array). We were
    using it to store video clips after editing, pending backup to tape.
    Sure enough, it has failed and I am again in the quandary of whether to
    send it back for warranty repair - risking propagation of my sensitive
    files, when the best I can hope for is a new, empty drive. Recovering
    the data is at least $3,500 and I may not get it all. The drive itself
    cost me $2K. ($Australian).

    I'm an engineer and it is my opinion that these units are all
    under-designed, thermally. Ok they have cooling fans but that means
    nothing. In the case of the Maxtor units, the HDD was suspended
    internally on rubber bushes and so there's no mechanical heatsinking. A
    one-inch fan sucked out a little warm air but the drive itself still ran
    hotter than one mounted inside a PC, where the chassis sucks up a fair
    amount of heat. One type I looked at didn't even have cooling fans so
    the casing was effectively a blanket!

    The Lacie drive had been left on (but mostly idle) almost continuously
    for the entire 3 months it was in service, and failed during a cold
    start. That smacks of a thermal stress failure.

    These days I always order my PC's with cooling fans mounted directly on
    the HDD bay and haven't had a failure since. Put your finger near the
    spindle of a HDD that has been running an hour or so and you'll find it
    almost too hot to touch. Friction rises exponentially with temperature
    because it's a positive feedback loop. Heat loss also rises
    exponentially with temp, which stabilises at a point where they are in
    balance - and the better the heat removal, the lower the temp.
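
    The balance point described above can be sketched numerically. This is
    a minimal illustration only: it swaps in a simple linear model (heat
    loss proportional to the temperature difference to ambient) rather than
    the exponential relationship claimed in the post, and the power figure
    and thermal resistances are hypothetical values chosen for the example,
    not measurements of any real enclosure.

```python
# Lumped steady-state thermal model (an assumed simplification, not from the
# thread): the drive settles where heat generated equals heat removed, i.e.
#   temperature = ambient + dissipated power * thermal resistance to ambient.

def steady_state_temp_c(ambient_c: float, power_w: float,
                        r_theta_c_per_w: float) -> float:
    """Temperature (degrees C) at which heat in and heat out are in balance."""
    return ambient_c + power_w * r_theta_c_per_w


# Hypothetical figures for illustration only.
AMBIENT_C = 25.0
DRIVE_POWER_W = 8.0

# Hypothetical thermal resistances (degrees C per watt) for three mountings.
# A lower value means better heat removal.
MOUNTINGS = {
    "rubber bushes, no fan": 4.0,
    "rubber bushes, small fan": 2.5,
    "bolted to chassis, bay fan": 1.0,
}

if __name__ == "__main__":
    for name, r_theta in MOUNTINGS.items():
        temp = steady_state_temp_c(AMBIENT_C, DRIVE_POWER_W, r_theta)
        print(f"{name}: {temp:.1f} C")
```

    Under these assumptions the only thing that changes between the three
    cases is the thermal resistance, which is the point being argued: for
    the same drive and the same room, better heat removal gives a lower
    equilibrium temperature.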

    HDD failure is every computer user's worst nightmare. The temperature
    issues are obvious to even a novice engineer. So why, oh why, do they
    continue to underdesign these things - not just in external units, but
    inside PC's as well? Of course both manufacturers told me "you know we
    don't get many of these back..." (Probably something to do with their
    policy that the faulty unit can't be returned and therefore you have no
    hope of getting your data back, so you're better off sending it to a data
    recovery service).

    Who else has experienced a higher failure rate on external HDD's? Or
    undercooled internals?

    As far as I can tell my only solution for reliable, portable mass storage
    will be to re-engineer a commercial unit to improve its cooling - but
    void its warranty. Who cares, the warranty is useless when it doesn't
    cover your data.





  2. Re: Failure of external HDD's - why doesn't any manufacturer wake up to this?

    richard wrote:

    ....

    > HDD failure is every computer user's worst nightmare. The temperature
    > issues are obvious to even a novice engineer. So why, oh why, do they
    > continue to underdesign these things - not just in external units, but
    > inside PC's as well?


    The disk manufacturers, at least, don't: the environment for which
    their disks are designed is quite publicly specified, and it's hardly
    their fault if those who use the disks don't pay attention to it.

    Failure rates of (S)ATA units have been studied by users as well as
    manufacturers: while the real-world-environment failure rates
    encountered by the former (such as the Internet Archive project and the
    - I'm sad to say now apparently late - Jim Gray at Microsoft) tend to be
    higher than the more-carefully-controlled-environment failure rates
    published by the latter, the differences aren't nearly sufficient to
    support your allegation that they're 'underdesigned' for their use.

    Perhaps you're just unlucky, or unusually hard on your disks, or the
    design of the cases you use sucks, or you're using older disks (the
    newer ones don't tend to get nearly as hot as you claim when properly
    cooled: for the past several years my internal 7200 rpm Seagates have
    idled at around 20 degrees C. below their nominal 55 degree C. maximum,
    according to S.M.A.R.T., with no special cooling arrangements such as
    you describe: the normal influx of air to the front of the drive bays
    caused by the PSU fan plus a single auxiliary 80mm. fan is more than
    sufficient for them).

    Unless the internal air flow misses the drives, you can get a pretty
    good idea of how hot disks in external cases are getting by checking
    whether the exhaust-fan air feels warm (if not, the flow is probably
    adequate to keep the disks cool). You noted that one of your external
    cases didn't have a fan, and just acted as a 'blanket'. Maybe, maybe
    not: some such cases claim to make good thermal contact with the drive
    and conduct heat efficiently to their exteriors (the USB example that I
    happen to have here unfortunately doesn't report S.M.A.R.T. attributes
    via any of the software I have readily available - anyone know of some
    software that might do this?).

    In any event, don't presume to generalize from your own experience -
    especially given the ease with which you could find broader relevant
    information.

    ....

    > Who else has experienced a higher failure rate on external HDD's? Or
    > undercooled internals?


    Undercooled disks certainly don't live as long - the rule of thumb is
    that the life halves for every 15 degree C. rise in operating
    temperature (and likely drops far faster if you exceed the nominal
    maximum). Of course, external drives may also be prone to more physical
    shock during operation than internal drives.
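
    The rule of thumb above can be written as a one-line formula. The
    sketch below assumes the halving applies smoothly to any temperature
    rise (not only whole multiples of 15 degrees) and ignores the faster
    drop-off beyond the nominal maximum; the function name is made up for
    this example.

```python
def relative_life(temp_rise_c: float, halving_step_c: float = 15.0) -> float:
    """Expected life relative to the baseline temperature.

    Implements the rule of thumb that life halves for every
    `halving_step_c` degrees C of extra operating temperature.
    """
    return 0.5 ** (temp_rise_c / halving_step_c)


if __name__ == "__main__":
    for rise in (0, 5, 15, 30):
        print(f"+{rise} C -> {relative_life(rise):.2f} x baseline life")
```

    Read this as a rough scaling only: it says nothing about what the
    baseline life actually is for a given drive.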

    The bottom line is, keep your disks reasonably cool and free from abuse
    and they won't disappoint you.

    >
    > As far as I can tell my only solution for reliable, portable mass storage
    > will be to re-engineer a commercial unit to improve its cooling - but
    > void its warranty.


    Why not just look around until you find a commercial unit that cools its
    disks properly? With external SATA units that should be trivial (no
    problem getting the S.M.A.R.T. information there).

    - bill

  3. Re: Failure of external HDD's - why doesn't any manufacturer wake up to this?

    In article ,
    Bill Todd wrote:
    >richard wrote:
    >
    >...
    >
    >> HDD failure is every computer user's worst nightmare. The temperature
    >> issues are obvious to even a novice engineer. So why, oh why, do they
    >> continue to underdesign these things - not just in external units, but
    >> inside PC's as well?

    >
    >The disk manufacturers, at least, don't: the environment for which
    >their disks are designed is quite publicly specified, and it's hardly
    >their fault if those who use the disks don't pay attention to it.
    >
    >Failure rates of (S)ATA units have been studied by users as well as
    >manufacturers.


    Both large scale users (computer and storage array makers) and disk
    makers have very detailed data on this. However, this data is not
    shared with competitors, nor with consumers.

    Several studies have been published in the open literature:

    > while the real-world-environment failure rates
    >encountered by the former (such as the Internet Archive project and the
    >- I'm sad to say now apparently late - Jim Gray at Microsoft)


    May Jim rest in peace. I hope he died the way he wanted to. At this
    point, that's the best we can hope for.

    In addition to his data, this year's FAST conference (go to
    www.usenix.org and search for FAST2007) had two papers on real-world
    disk failure rates. The first one (by Bianca S. from Carnegie Mellon)
    won the best paper award, and has a lot of information about a variety of
    settings. The second one (from Google) has some very interesting
    information about a 5-year study; unfortunately, it mixes older and
    newer disks.

    The second paper has some astonishing data: namely that disks live
    longer if you don't keep them too cold; 30 or 40 degrees are better
    than 15 or 20. We know that really cold temperatures (5 or 10
    degrees) are bad for disks, but I was quite astonished by this result.
    A very senior person from a disk manufacturer was sitting next to me
    during this talk, and was shaking his head. The above observation
    quite badly violates many things we thought we had known, and might be
    an artifact of mixing different disks that use different housings in
    the same data set. So don't take it too seriously yet.

    Please note that disk lifetime is also affected by other factors,
    such as workload: continuously seeking is bad; continuously writing
    is also bad (as it increases the risk of off-track writes). This is
    particularly true for consumer-grade disks (which is often but not
    always synonymous with IDE/SATA disks); those are typically specified
    for 40-hour-per-week operation, instead of 24x7.

    Disk lifetime (and error rate) is obviously affected by temperature,
    except that the recent Google result above confuses that issue. It is
    also seriously affected by vibration, in particular for consumer-grade
    disks (again, often IDE/SATA), which can't simultaneously servo and
    transfer data. For this reason, it is important to use
    vibration-absorbing disk mounts, low-vibration fans, and isolate disks
    from other vibrating components (such as CD-ROMs and other disks).
    Mechanical shock during operation can be very very bad, so don't kick
    your computer just because your program doesn't compile.

    >Perhaps you're just unlucky, or unusually hard on your disks, or the
    >design of the cases you use sucks, or you're using older disks (the
    >newer ones don't tend to get nearly as hot as you claim when properly
    >cooled: for the past several years my internal 7200 rpm Seagates have
    >idled at around 20 degrees C. below their nominal 55 degree C. maximum,
    >according to S.M.A.R.T., with no special cooling arrangements such as
    >you describe: the normal influx of air to the front of the drive bays
    >caused by the PSU fan plus a single auxiliary 80mm. fan is more than
    >sufficient for them).


    Warning: 10K and 15K RPM SCSI/SAS/FC disks run considerably hotter
    than 5400 and 7200 RPM IDE/SATA disks. This is particularly true for 2.5"
    enterprise-grade disks. All those disks should be equipped with a fan
    that guarantees good airflow.

    >In any event, don't presume to generalize from your own experience -
    >especially given the ease with which you could find broader relevant
    >information.


    If you want to generalize from your experience, your experience had better
    be based on hundreds of thousands of disks. Most home computer users
    (fortunately) don't gather that kind of data.

    --
    The address in the header is invalid for obvious reasons. Please
    reconstruct the address from the information below (look for _).
    Ralph Becker-Szendy _firstname_@lr_dot_los-gatos_dot_ca.us

  4. Re: Failure of external HDD's - why doesn't any manufacturer wake up to this?

    _firstname_@lr_dot_los-gatos_dot_ca.us writes:
    > May Jim rest in peace. I hope he died the way he wanted to. At this
    > point, that's the best we can hope for.


    I prefer that he turn up alive and not too much the worse for wear.
    For those who don't know, his sailing vessel disappeared off the San
    Francisco coast about 2 weeks ago and a search has been unsuccessful
    so far. However, there weren't any really horrible weather
    conditions, known recent pirate attacks in the area, or anything like
    that, so there's some chance he'll be found. People have been rescued
    under such circumstances before, even after many months at sea.

    http://en.wikipedia.org/wiki/James_N._Gray

  5. Re: Failure of external HDD's - why doesn't any manufacturer wake up to this?

    In article ,
    billtodd@metrocast.net opined thusly:
    >


    >Unless the internal air flow misses the drives, you can get a pretty
    >good idea of how hot disks in external cases are getting by checking
    >whether the exhaust-fan air feels warm (if not, the flow is probably
    >adequate to keep the disks cool).


    Without voiding the warranty how can you know? If the air is cool it is
    either missing the drive or doing a good job. If the air is warm it is
    either doing its job...or not, it may still be insufficient to keep the
    bearing and surface temperatures down. But a 1" fan in an enclosed space
    where there's no mechanical heatsinking, intuitively, isn't going to cut it.

    >You noted that one of your external
    >cases didn't have a fan, and just acted as a 'blanket'. Maybe, maybe
    >not: some such cases claim to make good thermal contact with the drive
    >and conduct heat efficiently to their exteriors


    This one had the drive suspended on rubber shock mounts so it really was an
    oven, a perfect way to test for heat failure.

    >In any event, don't presume to generalize from your own experience -
    >especially given the ease with which you could find broader relevant
    >information.


    Well the purpose of my post was to find out if I'm not alone in my
    experiences. Citing the above example, that brand is just going to fail over
    and over - but only I guess for users who give it a duty cycle that's
    light enough to keep its temp down.


    >The bottom line is, keep your disks reasonably cool and free from abuse
    >and they won't disappoint you.
    >
    >>
    >> As far as I can tell my only solution for reliable, portable mass storage
    >> will be to re-engineer a commercial unit to improve its cooling - but
    >> void its warranty.

    >
    >Why not just look around until you find a commercial unit that cools its
    >disks properly? With external SATA units that should be trivial (no
    >problem getting the S.M.A.R.T. information there).


    Any suggestions? How about this S.M.A.R.T. software, does it slow down access
    time? Any recommendations on whose to use?

    thanks for the info.


  6. Re: Failure of external HDD's - why doesn't any manufacturer wake up to this?

    On Feb 16, 11:27 pm, _firstname_@lr_dot_los-gatos_dot_ca.us wrote:
    > In article ,
    > Bill Todd wrote:
    >
    > >richard wrote:

    >
    > >...

    >
    > >> HDD failure is every computer user's worst nightmare. The temperature
    > >> issues are obvious to even a novice engineer. So why, oh why, do they
    > >> continue to underdesign these things - not just in external units, but
    > >> inside PC's as well?

    > ...
    > >In any event, don't presume to generalize from your own experience -
    > >especially given the ease with which you could find broader relevant
    > >information.

    >
    > If you want to generalize from your experience, your experience better
    > be based on hundreds of thousands of disks. Most home computer users
    > (fortunately) don't gather that kind of data.


    Richard's observation that common-or-garden external enclosures are
    underdesigned is likely quite valid. Although one should perhaps spend
    commensurately with the value of one's data, not just on enclosure,
    but also on redundancy.

    It's also usually true that software could make better use of SMART
    data and other early warning signs (I believe this is on Solaris' ZFS
    and Fault Management roadmap).

    http://www.opensolaris.org/os/community/fm/
    http://blogs.sun.com/eschrock/date/20051121
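
    As a rough idea of what "making better use" of such data could look
    like, here is a minimal sketch that flags temperature readings which
    jump well above their recent average. It is not how ZFS or any fault
    manager works; the window size, threshold, and sample readings are
    hypothetical.

```python
from statistics import mean


def temperature_alerts(readings_c, window=5, max_jump_c=5.0):
    """Return indices of readings that exceed the mean of the previous
    `window` readings by more than `max_jump_c` degrees C."""
    alerts = []
    for i in range(window, len(readings_c)):
        baseline = mean(readings_c[i - window:i])
        if readings_c[i] - baseline > max_jump_c:
            alerts.append(i)
    return alerts


if __name__ == "__main__":
    # Hypothetical periodic temperature readings in degrees C.
    history = [35, 36, 35, 36, 35, 36, 44, 45, 37]
    print(temperature_alerts(history))
```

    A trailing average is used instead of a fixed limit so that a drive
    which normally runs warm does not raise alerts all the time, while a
    sudden change (a stalled fan, a blocked vent) still stands out.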

    >
    > --
    > The address in the header is invalid for obvious reasons. Please
    > reconstruct the address from the information below (look for _).
    > Ralph Becker-Szendy _firstname_@lr_dot_los-gatos_dot_ca.us




  7. Re: Failure of external HDD's - why doesn't any manufacturer wake up to this?

    richard wrote:
    > In article ,
    > billtodd@metrocast.net opined thusly:
    >
    >> Unless the internal air flow misses the drives, you can get a pretty
    >> good idea of how hot disks in external cases are getting by checking
    >> whether the exhaust-fan air feels warm (if not, the flow is probably
    >> adequate to keep the disks cool).

    >
    > Without voiding the warranty how can you know? If the air is cool it is
    > either missing the drive or doing a good job. If the air is warm


    Actually, the air shouldn't be more than luke-warm, because then the
    disk would be even warmer. Moving air feels cooler than still air, so
    if the disk is, say, 12 - 14 degrees C. above room temperature (as my
    Seagates seem to tend to run; a couple of WDs that I checked ran a
    little warmer) the exhaust air (at a slightly lower temperature) should
    barely feel warm at all.

    > it is
    > either doing its job...or not, it may still be insufficient to keep the
    > bearing and surface temperatures down. But a 1" fan in an enclosed space
    > where there's no mechanical heatsinking, intuitively, isn't going to cut it.


    Our intuitions differ, then (or perhaps it's the disks we're used to
    using - mine tend to run only slightly warm to the touch).

    ....

    > Well the purpose of my post was to find out if I'm not alone in my
    > experiences. Citing the above example, that brand is just going to fail over
    > and over - but only I guess for users who give it a duty cycle that's
    > light enough to keep its temp down.


    Heavy seek loads are the worst. Video loads tend to use long accesses
    with relatively few seeks: while the disk head still has to follow the
    track, far less heat should be generated (I'd tend to suspect much
    closer to an idle level than to a heavy-seeking level).

    ....

    >> Why not just look around until you find a commercial unit that cools its
    >> disks properly? With external SATA units that should be trivial (no
    >> problem getting the S.M.A.R.T. information there).

    >
    > Any suggestions? How about this S.M.A.R.T. software, does it slow down access
    > time? Any recommendations on whose to use?


    Modern disks include firmware that monitors their own operation and
    health, one of the outputs being their internal temperature. S.M.A.R.T.
    monitoring software just interrogates the disk to get that information
    out of it: there's no overhead at all save at the times you ask the
    disk a question (which shouldn't be that often unless you want to use
    the software to monitor temperatures for unusual changes rather than
    simply check them occasionally).

    I use a small free utility called Dtemp from
    http://private.peterlink.ru/tochinov/ and just start it up once in a
    while to see how things are doing (it reports a lot of other S.M.A.R.T.
    attributes too - Seagate drives are a little strange, since they come
    from the factory with non-zero values for some failing-health
    indicators, according to another S.M.A.R.T. utility from Adenix that I
    use less often). A quick look around turned up some other free
    utilities but none that could work through a USB connection (still
    hoping that someone here knows of one) - nor did the few paid-for
    utilities that I encountered claim to do so.
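
    For anyone who would rather script the check, the sketch below reads
    the temperature attribute from the output of smartctl (part of
    smartmontools, a tool not mentioned elsewhere in this thread). It
    assumes the usual ATA attribute table layout with the raw value in the
    tenth column, and that the drive reports temperature as attribute 194
    or 190. Whether a given USB enclosure passes S.M.A.R.T. commands
    through at all depends on its bridge chip, so the "sat" device type
    may or may not work on any particular unit. The sample table is
    illustrative, not real output from a specific drive.

```python
import re
import subprocess

# Attribute IDs commonly used for drive temperature (assumption: the drive
# follows this convention; not every model does).
TEMPERATURE_IDS = {190, 194}


def parse_temperature_c(attribute_table: str):
    """Extract the temperature in degrees C from `smartctl -A` style text.

    Returns None if no temperature attribute is found.
    """
    for line in attribute_table.splitlines():
        fields = line.split()
        if (len(fields) >= 10 and fields[0].isdigit()
                and int(fields[0]) in TEMPERATURE_IDS):
            match = re.match(r"\d+", fields[9])  # raw value column
            if match:
                return int(match.group())
    return None


def read_temperature_c(device: str, device_type: str = "auto"):
    """Run `smartctl -A -d <device_type> <device>` and parse its output.

    Needs smartmontools installed and usually administrator rights; try
    device_type="sat" for USB enclosures that support passthrough.
    """
    result = subprocess.run(
        ["smartctl", "-A", "-d", device_type, device],
        capture_output=True, text=True, check=False,
    )
    return parse_temperature_c(result.stdout)


# Illustrative sample of the attribute table layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       2160
194 Temperature_Celsius     0x0022   065   058   000    Old_age   Always       -       35 (Min/Max 18/42)
"""

if __name__ == "__main__":
    print(parse_temperature_c(SAMPLE))
```

    The parsing is kept separate from the call to smartctl so it can be
    tried on saved output first, before pointing it at a real device.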

    - bill

  8. Re: Failure of external HDD's - why doesn't any manufacturer wake up to this?

    In article ,

    Thanks for that info Bill. OK since the Maxtor external drive I have on my
    home PC is out of warranty I just opened it up. Drive is suspended on
    neoprene(?) shock mounts. The only metal-to-case contact is via a copper
    earthing strip. Ventilation is a few square cm at the front and perhaps one sq
    cm at the back. Fan? Nil. This is going to cook - esp as it ages.

    billtodd@metrocast.net opined thusly:
    >


    >richard wrote:
    >> In article ,
    >> billtodd@metrocast.net opined thusly:
    >>
    >>> Unless the internal air flow misses the drives, you can get a pretty
    >>> good idea of how hot disks in external cases are getting by checking
    >>> whether the exhaust-fan air feels warm (if not, the flow is probably
    >>> adequate to keep the disks cool).

    >>
    >> Without voiding the warranty how can you know? If the air is cool it is
    >> either missing the drive or doing a good job. If the air is warm

    >
    >Actually, the air shouldn't be more than luke-warm, because then the
    >disk would be even warmer. Moving air feels cooler than still air, so
    >if the disk is, say, 12 - 14 degrees C. above room temperature (as my
    >Seagates seem to tend to run; a couple of WDs that I checked ran a
    >little warmer) the exhaust air (at a slightly lower temperature) should
    >barely feel warm at all.
    >
    > it is
    >> either doing its job...or not, it may still be insufficient to keep the
    >> bearing and surface temperatures down. But a 1" fan in an enclosed space
    >> where there's no mechanical heatsinking, intuitively, isn't going to cut it.

    >
    >Our intuitions differ, then (or perhaps it's the disks we're used to
    >using - mine tend to run only slightly warm to the touch).
    >
    >...
    >
    >> Well the purpose of my post was to find out if I'm not alone in my
    >> experiences. Citing the above example, that brand is just going to fail over
    >> and over - but only I guess for users who give it a duty cycle that's
    >> light enough to keep its temp down.

    >
    >Heavy seek loads are the worst. Video loads tend to use long accesses
    >with relatively few seeks: while the disk head still has to follow the
    >track, far less heat should be generated (I'd tend to suspect much
    >closer to an idle level than to a heavy-seeking level).
    >
    >...
    >
    >>> Why not just look around until you find a commercial unit that cools its
    >>> disks properly? With external SATA units that should be trivial (no
    >>> problem getting the S.M.A.R.T. information there).

    >>
    >> Any suggestions? How about this S.M.A.R.T. software, does it slow down access
    >> time? Any recommendations on whose to use?

    >
    >Modern disks include firmware that monitors their own operation and
    >health, one of the outputs being their internal temperature. S.M.A.R.T.
    >monitoring software just interrogates the disk to get that information
    >out of it: there's no overhead at all save at the times you ask the
    >disk a question (which shouldn't be that often unless you want to use
    >the software to monitor temperatures for unusual changes rather than
    >simply check them occasionally).
    >
    >I use a small free utility called Dtemp from
    >http://private.peterlink.ru/tochinov/ and just start it up once in a
    >while to see how things are doing (it reports a lot of other S.M.A.R.T.
    >attributes too - Seagate drives are a little strange, since they come
    >from the factory with non-zero values for some failing-health
    >indicators, according to another S.M.A.R.T. utility from Adenix that I
    >use less often). A quick look around turned up some other free
    >utilities but none that could work through a USB connection (still
    >hoping that someone here knows of one) - nor did the few paid-for
    >utilities that I encountered claim to do so.
    >
    >- bill


