building on your own a large data storage ... - Hardware


Thread: building on your own a large data storage ...

  1. building on your own a large data storage ...

    Hi,
    ~
    I need to store a really large number of texts and I (could) have a
    number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    to use to somehow build a large and safe (RAID-5?) data store
    ~
    Now I am definitely more of a software person (at least
    occupationally) and this is what I have in mind:
    ~
    * I will have to use standard (and commercially available (meaning
    cheap ;-))) x86-based hardware and open source software
    ~
    * AFAIK you could maximally use 4 hard drives in such boxes
    ~
    * heat dissipation could become a problem with so many hard drives
    ~
    * I need a reliable and stable power supply
    ~
    Should I go for ATA or SATA drives and why?
    ~
    You could use firewire and/or USB cards to plug in that many
    hard drives. Wouldn't it be faster/better using extra ATA PCI cards?
    What else would it entail? How many such cards could Linux take?
    ~
    People in the know use software based RAID. Could you give me links
    to these kinds of discussions?
    ~
    What would be my weak/hotspot points in my kind of design?
    ~
    Any suggestions of the type of boxes/racks I should use?
    ~
    Is this more or less feasible? What am I missing here? Any other
    suggestions, or intelligent posts in which people have discussed these
    issues before? I found two in which people have said a few right
    things and some questionable ones:
    ~
    comp.sys.ibm.pc.hardware.storage: "2 TB storage solution"
    comp.arch.storage: "Homebuilt server (NAS/SAN) vs the prefab ones?
    Peformance"
    ~
    Do you know of any such "do-it-yourself" projects out there?
    ~
    thanks
    lbrtchx


  2. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:

    > Hi,


    Hi

    > ~
    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    > to use to somehow build a large and safe (RAID-5?) data store
    > ~
    > Now I am definitely more of a software person (at least
    > occupationally) and this is what I have in mind:
    > ~
    > * I will have to use standard (and commercially available (meaning
    > cheap ;-))) x86-based hardware and open source software


    Linux + dm-raid5/6
    Solaris 10 x86 + ZFS
    *BSD

    can all handle software RAID on that sort of hardware.
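
    With Linux software RAID (md), for instance, the usual tool is mdadm. A
    minimal sketch (untested here; /dev/sd[b-e] and the mount point are
    placeholders for your own drives and paths):

        # create a 4-drive RAID-5 array and put a filesystem on it
        mdadm --create /dev/md0 --level=5 --raid-devices=4 \
            /dev/sdb /dev/sdc /dev/sdd /dev/sde
        mkfs.ext3 /dev/md0
        mkdir -p /srv/store
        mount /dev/md0 /srv/store

        # watch the initial resync
        cat /proc/mdstat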

    > ~
    > * AFAIK you could maximally use 4 hard drives in such boxes


    Many mobos (so-called ATA "RAID" boards) had 4 PATA ports, so they could
    handle 8 drives in a less than ideal master/slave configuration.

    You can also add extra ports (SATA or PATA) via a PCI card.

    > * heat dissipation could become a problem with so many hard drives


    Use a good case with well placed fans, or use a drive enclosure with fans
    (sort of thing that lets you put 4 drives in 3x5.25inch bays). Have a
    browse at www.kustompcs.co.uk for some case or bay adaptor ideas (they have
    nice pictures).

    > * I need a reliable and stable power supply


    Buy a decent PSU, rated well above what you need. Tagan should be a good
    bet. Hexus.net did a fairly extreme PSU test recently which noted that some
    PSUs were unable to meet their stated supply power - worth a hunt on their
    site for that.

    > Should I got for ATA or SATA drives and why?


    Doesn't matter fundamentally. SATA will give you less wiring grief and you
    can use, say, an SIL3114 based PCI card for 4 cheap SATA-I (not SATA-II)
    ports on a single card.

    > You could use firewire and/or USB cards to plug in that many
    > harddrives. Wouldn't it be faster/better using extra ATA PCI cards?
    > What else would it entail? How many such cards could Linux take?


    Yuk.

    > People in the know use software based RAID. Could you give me links
    > to these kinds of discussions?


    As mentioned above, search on google for "device mapper linux" or "software
    raid linux"

    > What would be my weak/hotspot points in my kind of design?


    You need to configure it right. Don't use the full sector count of each
    disk; back off by about 0.5% - the 500GB disk you buy next year to replace
    your 500GB from this year when it blows may be 100 sectors smaller - I've
    been caught out like that once.
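
    One easy way to do that is to create identically sized partitions a bit
    below the nominal capacity and build the array from those instead of the
    raw drives. Rough sketch only (sizes and device names are placeholders):

        # ~490GB partitions on 500GB drives, leaving headroom for a
        # slightly smaller replacement drive later
        parted -s /dev/sdb mklabel msdos
        parted -s /dev/sdb mkpart primary 1MiB 490GB
        # repeat for sdc, sdd, sde, then build the array from the partitions
        mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[bcde]1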

    > Any suggestions of the type of boxes/racks I should use?


    As above - have a browse and see what you like. Expensive and nice, or
    cheap and cheerful? Rack or non-rack? Easy-to-replace drives, or cheap?

    > Is this more or less feasible? What am I missing here? Any other
    > suggestions? or intelligent posts in which people have discussed these
    > issues before? I found two in which some people have said a few right
    > and some other questionable things:
    > ~
    > comp.sys.ibm.pc.hardware.storage: "2 TB storage solution"
    > comp.arch.storage: "Homebuilt server (NAS/SAN) vs the prefab ones?
    > Peformance"
    > ~
    > Do you know of any such "do-it-yourself" projects out there?


    Ask me in about a month or 2 when I've done my 1.5TB box.

    HTH

    Tim

  3. Re: building on your own a large data storage ...

    > Now consider that you are spending two weeks selecting parts, one week to put everything together and 'make it work' and two hours per week for the next three years getting the damn thing back on line.
    ~
    two weeks selecting parts?
    ~
    one week to put everything together and 'make it work'?
    ~
    I think that is way more time than it needs to be, and I am fine with
    spending two hours per week "petting my box", so to speak
    ~
    > Be smart ...

    I have already learned about so-called "solutions" and "contracts". I
    remember well the guys selling "back-up solutions" to some people I
    worked for, when all that was needed was a well-crafted bash script and
    some intelligent housekeeping of the data
    ~
    I have always liked owning my own ground. Also, I would like to start
    small, and I think I can definitely do it.
    ~
    lbrtchx


  4. Re: building on your own a large data storage ...

    > > * AFAIK you could maximally use 4 hard drives in such boxes
    >
    > Many mobos (so called ATA "RAID") had 4 PATA ports so could handle 8 drives,
    > in a less than ideal master/slave configuration.
    >

    I have read about people's complaints about hardware RAID. I would
    rather go for a software-based system
    ~
    > You can also add extra ports (SATA or PATA) via a PCI card.
    >

    This is what I think I will end up doing: getting extra PCI ATA100
    cards
    ~

    > > * heat dissipation could become a problem with so many hard drives

    >
    > Use a good case with well placed fans, or use a drive enclosure with fans
    > (sort of thing that lets you put 4 drives in 3x5.25inch bays). Have a
    > browse at www.kustompcs.co.uk for some case or bay adaptor ideas (they have
    > nice pictures).

    ~
    I went there but I couldn't see/find a case/bay adaptor for 8 drives
    ~
    >
    > > What would be my weak/hotspot points in my kind of design?

    >
    > You need to configure it right. Don;t use the full sector count of each
    > disk, back off by about 0.5% - the 500GB disk you but next year to replace
    > your 500GB from this year when it blows may be 100 sectors less - I've been
    > caught out like that once.

    ~
    Drives are, at bottom, spinning magnetic platters and they like to be
    cool. Keeping them at around 5 degrees Celsius makes drive failure go
    away almost entirely
    ~
    > > Do you know of any such "do-it-yourself" projects out there?

    >
    > Ask me in about a month or 2 when I've done my 1.5TB box.

    ~
    Well, yeah! Let's go for it
    ~
    lbrtchx


  5. Re: building on your own a large data storage ...

    In comp.sys.ibm.pc.hardware.storage lbrtchx@hotmail.com wrote:
    > Hi,
    > ~
    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    > to use to somehow build a large and safe (RAID-5?) data store


    I would advise RAID5 (or 6) and two independent systems.

    > Now I am definitely more of a software person (at least
    > occupationally) and this is what I have in mind:
    > ~
    > * I will have to use standard (and commercially available (meaning
    > cheap ;-))) x86-based hardware and open source software
    > ~
    > * AFAIK you could maximally use 4 hard drives in such boxes


    No. I have a fileserver with Linux software RAID and 12 disks.
    More would be possible.

    > * heat dissipation could become a problem with so many hard drives


    You need to blow outside air past each drive.

    > * I need a reliable and stable power supply


    I recommend Enermax. Calculate 25-30W per disk (startup power)
    plus 200W for the system.
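
    As a worked example with that rule of thumb, for a 12-disk box like mine:

        # ~30W spin-up per disk plus ~200W for the rest of the system
        echo $(( 12 * 30 + 200 ))    # 560 -> a good 600W+ unit leaves headroom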

    > Should I got for ATA or SATA drives and why?


    SATA, because of the cabling. And you can get SATA controller cards for 8
    or 12 disks.

    > You could use firewire and/or USB cards to plug in that many
    > harddrives. Wouldn't it be faster/better using extra ATA PCI cards?
    > What else would it entail? How many such cards could Linux take?


    FireWire and USB, while a possibility, have visibility issues and
    are dreadfully slow when used in a RAID. Linux can take as many
    cards as the cards themselves can co-exist. In practice you
    are unlikely to need more than four (= 32 drives, if you use
    8-port SATA cards). I have had good experiences with Promise non-RAID cards.

    > People in the know use software based RAID. Could you give me links
    > to these kinds of discussions?


    You can find some here. Best to google this group for "software RAID".

    > What would be my weak/hotspot points in my kind of design?


    RAID can fail. It is not for backups, but for reduced downtime.
    You need at least two independent (different location) copies
    of the data. You also need to check your disks regularly (I run a
    full SMART selftest every 14 days).
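
    A cron entry roughly like this does it (sketch only; the smartctl path
    and the device list are placeholders for your setup):

        # root's crontab: long SMART self-test on every array member,
        # roughly every two weeks
        0 3 1,15 * *  for d in /dev/sd[a-l]; do /usr/sbin/smartctl -t long "$d"; done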

    > Any suggestions of the type of boxes/racks I should use?


    Depends on the number of disks. I have the 12-disk server in
    a Chieftec big-tower.

    > Is this more or less feasible?


    Yes.

    > What am I missing here? Any other
    > suggestions? or intelligent posts in which people have discussed these
    > issues before? I found two in which some people have said a few right
    > and some other questionable things:
    > ~
    > comp.sys.ibm.pc.hardware.storage: "2 TB storage solution"
    > comp.arch.storage: "Homebuilt server (NAS/SAN) vs the prefab ones?
    > Peformance"
    > ~
    > Do you know of any such "do-it-yourself" projects out there?


    As I said, I have a fileserver with 12 disks (about 5TB) and
    RAID5/6 in software running on Linux without issues for some
    years now.

    Arno


  6. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:

    >> > * AFAIK you could maximally use 4 hard drives in such boxes

    >>
    >> Many mobos (so called ATA "RAID") had 4 PATA ports so could handle 8
    >> drives, in a less than ideal master/slave configuration.
    >>

    > I have read about people's complaints with hardware RAID. I would
    > rather go for software-based systems


    Hi

    You don't use the "RAID", which isn't hardware RAID most of the time anyway.
    It's simply that many mobos whose model is called "???-raid" sport 4 PATA
    ports or, more so these days, loads of SATA ports.

    You actually use the PATA ports in non-"RAID" mode.

    Cheers

    Tim

  7. Re: building on your own a large data storage ...


    >> Use a good case with well placed fans, or use a drive enclosure with fans
    >> (sort of thing that lets you put 4 drives in 3x5.25inch bays). Have a
    >> browse at www.kustompcs.co.uk for some case or bay adaptor ideas (they
    >> have nice pictures).

    > ~
    > I went there but I couldn't see/find a case/bay adaptor for 8 drives


    Try again:

    The LianLi PCA16A/B cases have 9 x 5.25" slots. LianLi also sell bay units
    that take up to 4 (IIRC) disks in a 3-bay square housing with a big-assed
    fan built into the front (the case comes with one similar unit). This is
    the case that I have just ordered.

    http://www.kustompcs.co.uk/acatalog/...k_Caddies.html

    also has lots of removable caddies for PATA drives.

    Cheers

    Tim

  8. Re: building on your own a large data storage ...

    > You also understand *how* it works, which helps a lot when/if really bad things happen.
    Do you know of any extensive info/good books about *how* it works?
    ~
    Basically, all I see it as is:
    ~
    * a JBOD with the same kind of disks (each jealously monitored using
    S.M.A.R.T.)
    ~
    * configuration based on RAID5/6 (monitored internally through
    checksums and proper data structures anyway and on a hardware level
    using __________?)
    ~
    * kept in a box with a powerful and stable PSU (probably fed from a
    UPS and monitored using __________?)
    ~
    * and cool (You might go LOL, but I am even planning to do some air
    tubes from an AC onto the case and monitor the cpu temp via Linux
    kernel)
    ~
    I think, as long as there are no spikes and the box is kept cool in a
    vibration-free, controlled environment, there shouldn't be any
    problems. To me it all ultimately reduces to physics, keeping an eye
    on it, and having a mental map of it all
    ~
    lbrtchx


  9. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:
    > Hi,
    > ~
    > I need to store a really large number of texts and I (could) have
    > a number of ATA100 S.M.A.R.T.-compliant hard drives, which
    > I would like to use to somehow build a large and safe (RAID-5?)
    > data store
    [snip]



    Remember that you will be doing the engineering on this, and
    there are a lot of unknowns. I wouldn't do it with data I couldn't
    afford to lose. But I would give this 4 SATA drive enclosure
    by Kingwin some consideration:
    http://www.kingwin.com/kf4000-bk.asp

    *TimDaniels*

  10. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com coughed up some electrons that declared:

    >> You also understand *how* it works, which helps a lot when/if really bad
    >> things happen.

    > Do you know of any extensive info/good books about *how* it works?


    Alas, no - I learnt it all on the job

    > ~
    > All I basically see it is as:
    > ~
    > * a JBOD with the same kind of disks (each jealously monitored using
    > S.M.A.R.T.)
    > ~
    > * configuration based on RAID5/6


    RAID 5 = "waste" one disk, survive if one disk fails (in a set). You can do
    RAID 5 with, technically, 2 disks or more in total, though most
    implementations will start with 3 disks (with 2 disks you might as well do
    RAID 1 for the same effect and less CPU overhead).

    RAID 6 = "waste" 2 disks, but survive if any 2 disks fail concurrently in a
    single set.

    > (monitored internally through
    > checksums and proper data structures anyway and on a hardware level
    > using __________?)


    Not really. The drives of course will do ECC at the lowest level, but
    glitches in your data will be missed. You need to enable a media scan at
    the RAID level that reads each stripe then validates the stripe using its
    own parity data. Linux can do that, but you have to tell it to, via a cron
    job for regular scanning.
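
    With Linux md that boils down to poking the array's sync_action file from
    cron; a sketch (md0 is a placeholder for your array):

        # root's crontab: weekly parity scrub, Sunday 04:00
        0 4 * * 0  echo check > /sys/block/md0/md/sync_action

        # progress and any mismatches found can be checked with
        cat /proc/mdstat
        cat /sys/block/md0/md/mismatch_cnt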

    Solaris 10 ZFS is an exception: it puts checksum data in at a high level
    and has caught flaky hardware that simpler filesystems were happy with, on
    at least one occasion.

    > ~
    > * kept in a box with a powerful and stable PSU (probably fed from an
    > UPS and monitored using __________?)


    APC have stuff that runs under Linux, or did - but APC gear, I think, runs
    OK with open source UPS monitoring tools. I'll have to check; it's been a
    while.
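
    From memory the usual open source tools are apcupsd and NUT. For a
    USB-connected APC unit an apcupsd config is only a few lines; roughly
    like this (values are examples, check the docs for your model):

        # /etc/apcupsd/apcupsd.conf (excerpt)
        UPSCABLE usb
        UPSTYPE usb
        DEVICE
        BATTERYLEVEL 10    # start shutdown at 10% battery left
        MINUTES 5          # ...or at 5 minutes of runtime left

        # then query it with
        apcaccess status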

    > * and cool (You might go LOL, but I am even planning to do some air
    > tubes from an AC onto the case and monitor the cpu temp via Linux
    > kernel)
    > ~
    > I think, as long as there are no spikes and the box is kept cool in a
    > vibration free controlled environment, there shouldn't be any
    > problems. To me it all can be/is ultimately reduced to Physics and
    > keeping an eye on it and a mental map on it all


    That will help, but it doesn't guarantee the disks won't pop. In practice
    it should be fine, though. What I would advise is: don't just build the
    thing and run it; rather, build it, break it (by pulling power to a disk
    drive, the whole box, etc.), and make sure you feel it's robust and that
    you are comfortable putting a new disk into a degraded RAID set.

    Get to know all the tools.
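
    On Linux md, the drill itself is short (sdX/sdY are placeholders - and
    obviously rehearse this on a scratch array, not your data):

        mdadm /dev/md0 --fail /dev/sdX      # simulate a dead drive
        mdadm /dev/md0 --remove /dev/sdX    # drop it from the array
        # swap the physical drive, then:
        mdadm /dev/md0 --add /dev/sdY       # add the replacement
        watch cat /proc/mdstat              # watch the rebuild run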

    Enjoy

    Cheers

    Tim



  11. Re: building on your own a large data storage ...

    On Wed, 04 Jul 2007 04:12:06 -0700, lbrtchx@hotmail.com
    wrote:

    > Hi,
    >~
    > I need to store a really large number of texts and I (could) have a
    >number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    >to use to somehow build a large and safe (RAID-5?) data store
    >~
    > Now I am definitely more of a software person (at least
    >occupationally) and this is what I have in mind:
    >~
    > * I will have to use standard (and commercially available (meaning
    >cheap ;-))) x86-based hardware and open source software
    >~
    > * AFAIK you could maximally use 4 hard drives in such boxes


    Typical lower-end RAID controller cards support only 4
    drives. Decide the budget, capacity, and performance level
    you require. "Texts" doesn't seem too demanding of
    performance, but given enough clients it could become
    more so.



    >~
    > * heat dissipation could become a problem with so many hard drives
    >~
    > * I need a reliable and stable power supply


    As far as heat goes, you merely need a case with ample front
    intake area and a bit of space between each drive. Some cases
    are better designed in this respect, while others leave
    practically no space between drives and would then require
    using only every other slot to provide that space.

    The amount of heat produced is going to be under 50W
    (typically), which is a manageable amount: it just means the
    cooling subsystem of the chassis needs to be slightly better,
    and since the thermal density is low (50W over several
    drives is a lot of area), nothing elaborate is needed.

    Yes, you'll need a reliable and stable power supply, but when
    is that not the case? If constant uptime is important and the
    budget allows, you could go with a redundant PSU, though it
    will also require an accommodating case.

    >~
    > Should I got for ATA or SATA drives and why?


    Doesn't really matter. The performance issues another
    poster mentioned aren't very significant: on a box built for
    this purpose you'd typically have GbE on the southbridge or on
    a special port not sharing PCI bandwidth (on a newer board at
    least - you haven't really detailed exactly what you plan
    on doing yet), and two drives per ATA channel are rarely, if
    ever, bottlenecked by 133MB/s. You might instead
    investigate the RAID controller cards you might want to use
    and go with the drive type they support. For longer-term use
    and maintenance (replacing drives when they fail) over several
    years, a SATA card might make it easier.


    >~
    > You could use firewire and/or USB cards to plug in that many
    >harddrives. Wouldn't it be faster/better using extra ATA PCI cards?


    You don't want firewire or USB, and yes ATA or SATA is
    faster.


    >What else would it entail? How many such cards could Linux take?


    Define the requirements before thinking about # of cards.
    It doesn't seem you will need more than one.



    >~
    > People in the know use software based RAID. Could you give me links
    >to these kinds of discussions?


    Above you mentioned RAID5. Typically that's done by a
    hardware controller due to the processing overhead. If you
    were using RAID1, software RAID could be acceptable and has
    neither of the problems the other poster mentioned; all the
    same factors apply, it merely offloads processing and caching
    to the rest of the system, which on an older, heavily loaded
    system could be a performance penalty. But today's systems are
    pretty fast; even used ones tend to have far more processing
    power than would be required. The main factor is that it
    supports RAID5, so pick a controller card and fix the
    requirements - cost, capacity, and the performance *required*.


    >~
    > What would be my weak/hotspot points in my kind of design?


    The link between these and the client systems. Use GbE
    instead of 100Mb if performance matters at all. Such a
    self-made box will also use more power (typically, though
    careful planning could reduce that some) than a commercially
    made NAS box.


    >~
    > Any suggestions of the type of boxes/racks I should use?


    Depends on where you're putting them.


    >~
    > Is this more or less feasible? What am I missing here? Any other
    >suggestions?


    Just nail down the finer details of your project
    requirements, then build it. Giving it thought is good,
    but it isn't a terribly complex thing to do, and no one way
    of doing it is necessarily going to have a large advantage
    over another, unless you have particular needs you
    haven't mentioned yet.

  12. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote

    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would
    > like to use to somehow build a large and safe (RAID-5?) data store


    > Now I am definitely more of a software person (at least
    > occupationally) and this is what I have in mind:


    > * I will have to use standard (and commercially available (meaning
    > cheap ;-))) x86-based hardware and open source software


    > * AFAIK you could maximally use 4 hard drives in such boxes


    > * heat dissipation could become a problem with so many hard drives


    > * I need a reliable and stable power supply


    > Should I got for ATA or SATA drives


    SATA.

    > and why?


    They have much more of a future, particularly in motherboard support.

    Most new motherboards already come with just one ATA port - room for only two drives.

    > You could use firewire and/or USB cards to plug in that many harddrives.
    > Wouldn't it be faster/better using extra ATA PCI cards?


    It would be even better to use eSATA; you get the full speed and the full
    SMART diagnostic capability that you don't get with FireWire and USB.

    > What else would it entail?


    Basic stuff like a case and power supply
    that can adequately cool that many drives.

    > How many such cards could Linux take?


    As many as you will ever need.

    > People in the know use software based RAID.
    > Could you give me links to these kinds of discussions?


    > What would be my weak/hotspot points in my kind of design?


    Keeping the drives adequately cool.

    > Any suggestions of the type of boxes/racks I should use?


    > Is this more or less feasible?


    Yes.

    > What am I missing here?


    That ATA has passed its use-by date.

    > Any other suggestions?


    Avoid hardware raid.

    > or intelligent posts in which people have discussed these issues before?
    > I found two in which some people have said a few right and some other questionable things:


    > comp.sys.ibm.pc.hardware.storage: "2 TB storage solution"
    > comp.arch.storage: "Homebuilt server (NAS/SAN) vs the prefab ones?
    > Peformance"


    > Do you know of any such "do-it-yourself" projects out there?


    There's plenty outside usenet.



  13. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:
    > Hi,
    > ~
    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    > to use to somehow build a large and safe (RAID-5?) data store


    How safe does 'safe' have to be? For example, RAID won't protect you
    against update errors caused by power failure during a disk write
    operation: only a transaction log (e.g., as in ext3fs, which can IIRC
    be configured to protect data as well as metadata and whose corrective
    updates I think should cascade properly down to the underlying software
    RAID mechanisms) or the equivalent using non-volatile RAM to capture
    updates and guarantee that they eventually will be applied in their
    entirety will do that - otherwise, the RAID will just faithfully
    replicate the error.

    Another way to approach that particular problem (and some others as well
    - e.g., clean-up after an application that doesn't take similar
    precautions is interrupted during its update activity) is to use not
    RAID but frequently-synched asynchronous replication, such that you only
    replicate known-stable-and-correct data.

    If your data is primarily write-once-then-read-only, ext3fs over a
    software RAID may be a good choice - and I think you can find Linux
    driver-level software that will even replicate to a remote
    disaster-tolerant site if need be.
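
    For what it's worth, the ext3 behaviour mentioned above is selected with
    the data= mount option. As a sketch, an fstab line for the array might
    look like this (device and mount point are placeholders):

        # journal file data as well as metadata, at some cost in write speed
        /dev/md0   /srv/store   ext3   data=journal,noatime   0   2

    The default, data=ordered, journals metadata only but still avoids stale
    data turning up in files after a crash.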

    > ~
    > Now I am definitely more of a software person (at least
    > occupationally) and this is what I have in mind:
    > ~
    > * I will have to use standard (and commercially available (meaning
    > cheap ;-))) x86-based hardware and open source software
    > ~
    > * AFAIK you could maximally use 4 hard drives in such boxes


    Until quite recently most motherboards supported up to 4 ATA drives, and
    many recent motherboards support two-to-four SATA drives in addition to
    that. A few slightly less recent motherboards (e.g., some from Asus)
    supported up to 8 ATA drives (the 4 normal ones plus four more on a
    built-in RAID controller which could also just use them as a JBOD to
    supplement the 4 normally-connected drives). Some recent motherboards
    support up to 10 or 12 SATA drives (plus a couple of ATAs if you don't
    need the ATA slots for something else).

    Many standard mid-tower cases have at least 8 bays in which 3.5" drives
    can be mounted (use inexpensive spacer brackets to mount them in 5" bays).

    > ~
    > * heat dissipation could become a problem with so many hard drives


    Not really - they only draw about 10 W apiece while running (a bit less
    while idling, possibly a bit more if seeking aggressively). Just add an
    extra fan or two in the right place in the case and you'll be fine.

    They do draw more power while spinning up, however - as much as 2.8 Amps
    apiece on the 12v line: make sure your power supply can handle that
    brief additional combined load (some SATA drives can be configured for
    staggered spin-up, but I don't know whether typical motherboards support
    this).

    > ~
    > * I need a reliable and stable power supply


    If you really mean that, then you likely need a redundant dual power
    supply configuration, backed by redundant dual UPSs. If you don't
    really mean that, then you're tossing the dice anyway and could perhaps
    get by with a relatively inexpensive unit from a reputable manufacturer
    (previous-generation Ultra 'value-line' 500W PSUs and 'X-finity' 600W
    PSUs - and for that matter Ultra 'Wizard' mid-tower cases - have been
    available free after rebate at Frys.com/outpost.com over the past year,
    though I haven't noticed any in the past couple of months: neither of
    these is the line of Ultra PSUs that had problems in the past, and both
    seem to offer reasonable performance if not pressed too hard); the
    alternative is to pay $100 or more for a single PSU that still can't be
    *guaranteed* not to fail.

    > ~
    > Should I got for ATA or SATA drives and why?


    It's unlikely to matter unless having two ATA drives (master and slave)
    on a single controller won't give you the performance you need (easy
    enough to experiment beforehand with a couple of drives to find out).

    > ~
    > You could use firewire and/or USB cards to plug in that many
    > harddrives.


    Configuring removable drives in a RAID sounds like asking for trouble to
    me, but it can certainly be done. Even with fixed drives you need to
    pay attention so you're sure which drive to replace if one fails.

    > Wouldn't it be faster/better using extra ATA PCI cards?


    USB 2.0 often won't handle more than about 30 MB/sec, which is slower
    than a modern drive can stream data. Firewire is better, but still not
    quite fast enough to keep up in all situations. Of course, neither
    limitation may matter unless you *really* require that level of performance.

    > What else would it entail? How many such cards could Linux take?


    Beats me. Such cards often present themselves as SCSI devices to the
    operating system, so from that standpoint it might be able to handle
    quite a few.

    > ~
    > People in the know use software based RAID. Could you give me links
    > to these kinds of discussions?


    Google might help.

    > ~
    > What would be my weak/hotspot points in my kind of design?


    That completely depends upon what kinds of access you're performing
    ('texts' above suggests smallish and/or low-bandwidth accesses to me,
    but that may not be what you have in mind).

    > ~
    > Any suggestions of the type of boxes/racks I should use?


    Hmmm - up until now I had the impression you were talking about only a
    single box. You'll have to provide a lot more information about what
    you're trying to accomplish if it's far larger-scale than that.

    - bill

  14. Re: building on your own a large data storage ...

    Bill Todd wrote:
    > lbrtchx@hotmail.com wrote:
    >> Hi,


    Whoops - I encountered your post on comp.arch.storage (where it had
    received no responses, since you limited follow-ups) and didn't notice
    the cross-post, so wasted my time reiterating a lot of the good advice
    that you had already received here.

    Anyway, just to cover a couple of additional points:

    1. A recent study found that the temperature sweet spot for (S)ATA
    drives was 30-35 C (failure rates increased modestly at both higher
    *and lower* temperatures), which is relatively easy to achieve with
    conventional cooling fans.

    2. If your data is important enough to merit RAID-6, it's probably
    important enough to require using two sites for disaster-tolerance (or
    at least two separate systems for availability, which also covers other
    single points of failure in individual boxes), in which case using
    RAID-5 (or RAID-1, if performance demands it) at both sites makes more
    sense. I don't know whether ZFS yet supports such operation (nor
    whether it's yet mature enough for you to consider 'safe'): its
    parity-RAID implementation wastes disk utilization, but its data
    integrity guarantees are impressive.

    3. My original impression was that you were talking about a personal,
    single-box system, where the occasional need to replace an internal
    drive would not be a problem. If you're instead talking about
    large-scale deployment you do indeed want to use removable drive trays
    (the cost of which varies quite a bit, and at least partly according to
    their durability).

    - bill

  15. Re: building on your own a large data storage ...


    lbrtchx@hotmail.com wrote in message
    news:1183547526.111134.84700@r34g2000hsd.googlegroups.com...
    > Hi,
    > ~
    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    > to use to somehow build a large and safe (RAID-5?) data store
    [snip]


    Have a look at OpenFiler. It's a Linux-based iSCSI, NFS and SMB appliance.
    It takes about 10 minutes to install and configure. I'm using it as iSCSI
    shared storage (pseudo-SAN) between a couple of ESX Servers and it works
    fine.

    --
    Kwyj.



  16. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote in
    news:1183547526.111134.84700@r34g2000hsd.googlegroups.com:

    [snip]

    > * I will have to use standard (and commercially available (meaning
    > cheap ;-))) x86-based hardware and open source software
    > ~
    > * AFAIK you could maximally use 4 hard drives in such boxes


    [snip]

    You could add controller cards, which would allow you to add extra
    drives: 3 PCI cards with 4 drives per card, plus 4 drives on the
    motherboard, = 16 drives. Use a nice case, and sensible fans, to sort
    out the heat.

    You'd then run something like NASlite from floppy or USB.

    This allows you to either buy many smaller drives (hitting the optimal
    dollar per gigabyte) or buy a few big fat drives and expand storage
    later.

    But, uh, I guess someone will be along to tell me that none of this
    will work with RAID. :-(

  17. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:
    > Hi,
    > ~
    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    > to use to somehow build a large and safe (RAID-5?) data store
    > ~
    > ~
    > Should I got for ATA or SATA drives and why?
    > ~
    > You could use firewire and/or USB cards to plug in that many
    > harddrives. Wouldn't it be faster/better using extra ATA PCI cards?
    > What else would it entail? How many such cards could Linux take?
    > ~
    > People in the know use software based RAID. Could you give me links
    > to these kinds of discussions?
    > ~
    > What would be my weak/hotspot points in my kind of design?
    > ~
    > Any suggestions of the type of boxes/racks I should use?
    > ~


    Have you ever thought of using some older x86 hardware/PC?
    Get all the obsolete pieces of hardware out and put in the HD capacity
    you need.

    I have had very good experience with 'FreeNAS', a FreeBSD-based operating
    system tailored for storage boxes. It is designed to hold the whole
    operating system/firmware on a tiny CF card, and it speaks NFS (essential
    for Linux) as well as FTP and CIFS. Administration is done entirely via a
    web interface, including managing and adding new disks!

    Check the homepage here:

    http://www.freenas.org/

    Best regards,
    Ingo

  18. Re: building on your own a large data storage ...

    bealoid coughed up some electrons that declared:


    > This allows you to either buy many smaller drives (hitting the optimal
    > dollar per gigabyte) or buy a few big fat drives and expand storage
    > later.
    >
    > But, uh, I guess someone will be along to tell me that none of this
    > will work with RAID. :-(


    Au contraire: restriping to add an additional drive into RAID5 (and maybe
    RAID6, I didn't pay attention) is in the newer 2.6 kernels. Not tried it
    though. It will be slow (true of restriping on anything) and, although I
    believe contingency measures are in the code, it's not the sort of
    operation you really want your box to crash in the middle of.
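
    For reference, the reshape is driven by mdadm, along these lines (device
    names are placeholders; as I said I haven't actually tried it, so treat it
    as a sketch and keep backups):

        mdadm /dev/md0 --add /dev/sdf             # new disk goes in as a spare
        mdadm --grow /dev/md0 --raid-devices=5    # reshape from 4 to 5 members
        # (a --backup-file on another disk is a good idea for the critical section)
        resize2fs /dev/md0                        # then grow the ext2/3 filesystem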

    Another good option is to spec an 8-drive system and put 4 drives in now;
    2-3 years later, add 4 new drives, which will probably be 2-3 times bigger
    by then for the same money, migrate the data over, and remove the first 4
    for use as backups or something. That cycle can then be maintained for many
    years, until drives are no longer made with that interface or the system
    is decrepit.

    Tim

  19. Re: building on your own a large data storage ...

    Bill Todd writes:
    >2. If your data is important enough to merit RAID-6, it's probably
    >important enough to require using two sites for disaster-tolerance (or
    >at least two separate systems for availability, which also covers other
    >single points of failure in individual boxes), in which case using
    >RAID-5 (or RAID-1, if performance demands it) at both sites makes more
    >sense.


    On Linux, DRBD gives you such replication, but you need fast
    networking between the two sites (well, how fast depends on the
    required write bandwidth; if it's mostly read-only, slow networking
    is OK too :-).
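
    The DRBD side is a short resource definition plus bringing it up on both
    machines; a rough sketch (hostnames, addresses and devices are
    placeholders):

        # /etc/drbd.conf (excerpt)
        resource r0 {
          protocol A;                  # asynchronous, tolerates a slower link
          on alpha {
            device    /dev/drbd0;
            disk      /dev/md0;
            address   10.0.0.1:7788;
            meta-disk internal;
          }
          on beta {
            device    /dev/drbd0;
            disk      /dev/md0;
            address   10.0.0.2:7788;
            meta-disk internal;
          }
        }

        # on each node:
        drbdadm up r0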

    - anton
    --
    M. Anton Ertl Some things have to be seen to be believed
    anton@mips.complang.tuwien.ac.at Most things have to be believed to be seen
    http://www.complang.tuwien.ac.at/anton/home.html

  20. Re: building on your own a large data storage ...

    > You need to configure it right. Don't use the full sector count of each
    > disk; back off by about 0.5% - the 500GB disk you buy next year to
    > replace your 500GB from this year when it blows may be 100 sectors
    > smaller - I've been caught out like that once.
    ~
    Aren't there any preferable regions on the disk? Should people leave
    the spare space at the beginning, middle or end of it?
    ~
    I recall from when I used to watch ScanDisk running that bad sectors
    tend to appear in clusters
    ~
    Also, I have heard people say that Western Digital drives tend to die
    slowly, while Maxtor ones die fast. To what extent are these
    differences between hard drives fact or myth?
    ~
    Good readings on hard disk failures are:
    ~
    // __ http://labs.google.com/papers/disk_failures.pdf
    ~
    // __ http://www.cs.duke.edu/~justin/papers/tacs04going.pdf
    ~
    // __ http://en.wikipedia.org/wiki/Self-Mo...ing_Technology
    ~
    // __ http://www.usenix.org/events/fast07/...tml/index.html
    ~
    // __ http://arstechnica.com/news.ars/post/20070225-8917.html
    ~
    // __ http://www.gcn.com/print/26_10/44242-1.html
    ~
    Thanks
    lbrtchx

