Thread: building on your own a large data storage ...

  1. Re: building on your own a large data storage ...

    On Wed, 04 Jul 2007 04:12:06 -0700, lbrtchx@hotmail.com wrote:

    > Hi,
    >~
    > I need to store a really large number of texts and I (could) have a
    >number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    >to use to somehow build a large and safe (RAID-5?) data store
    >~


    How safe do you want it? RAID 5 is not the safest RAID, it's
    basically better than nothing, but as RAID goes it's definitely entry
    level.
    Raid 1, Raid 1+0, Raid 5+1, Raid 6, etc. are all safer Raid levels.

    > Now I am definitely more of a software person (at least
    >occupationally) and this is what I have in mind:
    >~
    > * I will have to use standard (and commercially available (meaning
    >cheap ;-))) x86-based hardware and open source software
    >~
    > * AFAIK you could maximally use 4 hard drives in such boxes
    >~


    How big do you need this data store to be? Are we talking hundreds of
    gig's or terabytes? If terabytes, 10's or 100's?

    > * heat dissipation could become a problem with so many hard drives
    >~
    > * I need a reliable and stable power supply
    >~
    > Should I go for ATA or SATA drives and why?


    There may be technical reasons to go with SATA over ATA besides
    performance but personally I like the nicer cabling of SATA. If
    you're looking to cram it all in one PC then that will be a factor.

    >~
    > You could use firewire and/or USB cards to plug in that many
    >harddrives. Wouldn't it be faster/better using extra ATA PCI cards?
    >What else would it entail? How many such cards could Linux take?
    >~


    Most cheap RAID controllers will handle up to 4 drives. You could
    probably get two controllers and use a volume manager to pool the two.
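
    For example, a minimal sketch with Linux LVM (the device names are
    placeholders for wherever your two RAID volumes show up):

        pvcreate /dev/sda /dev/sdb           # tag each RAID volume as an LVM physical volume
        vgcreate storage /dev/sda /dev/sdb   # pool them into one volume group
        lvcreate -L 1500G -n data storage    # carve a logical volume out of the pool
        mkfs.ext3 /dev/storage/data          # put a file system on it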

    > People in the know use software based RAID. Could you give me links
    >to these kinds of discussions?
    >~


    What people are these? No one uses software raid except in rare or
    specialized cases. NetApp uses software raid but they're highly
    specialized. Most everyone else like BlueArc, EMC, HDS, Nexsan,
    Overland, you name it all use hardware raid controllers. Recently Sun
    came out with ZFS which is all software/file system raid but it's
    pretty new and specific to the goals of the file system.
    You can use software to pool multiple controllers together (as in a
    volume manager) but using software to raid is just so 1986.

    > What would be my weak/hotspot points in my kind of design?


    You haven't given a design, just requirements.

    >~
    > Any suggestions of the type of boxes/racks I should use?
    >~
    > Is this more or less feasible? What am I missing here? Any other
    >suggestions? or intelligent posts in which people have discussed these
    >issues before? I found two in which some people have said a few right
    >and some other questionable things:
    >~
    > comp.sys.ibm.pc.hardware.storage: "2 TB storage solution"
    > comp.arch.storage: "Homebuilt server (NAS/SAN) vs the prefab ones?
    >Peformance"
    >~


    For NAS, you could look at the previously mentioned Openfiler or go
    to freenas.org.
    FreeNAS uses software raid, but in its favor, it's free...

    ~F

  2. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com coughed up some electrons that declared:

    >> You need to configure it right. Don't use the full sector count of each
    >> disk, back off by about 0.5% - the 500GB disk you buy next year to
    >> replace your 500GB from this year when it blows may be 100 sectors
    >> less - I've been caught out like that once.
    > ~
    > Aren't there any preferable regions in the disk? Should people leave
    > them at the beginning, middle or end of it?


    No, just leave them at the end, which is the natural thing to do if
    allocating from the start of the disk.

    > ~
    > I recall from when I used to see running scandisc that bad sectors
    > tend to appear in clusters
    > ~


    They may tend to, but they could appear anywhere.

    > Also I have heard people say that Western Digital drives tend to die
    > slowly, while Maxtor ones die fast. To which extent are these
    > differences between hard drives facts or myths?


    Personally I wouldn't touch Maxtor with a bargepole, but they were bought
    out by Seagate, who apparently are closing the Maxtor factories.

    > Good readings on hard disk failures are:
    > ~
    > // __ http://labs.google.com/papers/disk_failures.pdf
    > ~
    > // __ http://www.cs.duke.edu/~justin/papers/tacs04going.pdf
    > ~
    > // __ http://en.wikipedia.org/wiki/Self-Mo...ing_Technology
    > ~
    > // __ http://www.usenix.org/events/fast07/...tml/index.html
    > ~
    > // __ http://arstechnica.com/news.ars/post/20070225-8917.html
    > ~
    > // __ http://www.gcn.com/print/26_10/44242-1.html
    > ~
    > Thanks
    > lbrtchx



  3. Re: building on your own a large data storage ...

    Faeandar wrote:

    ......
    >> People in the know use software based RAID. Could you give me links
    >>to these kinds of discussions?
    >>~

    >
    > What people are these? No one uses software raid except in rare or
    > specialized cases. NetApp uses software raid but they're highly
    > specialized. Most everyone else like BlueArc, EMC, HDS, Nexsan,
    > Overland, you name it all use hardware raid controllers. Recently Sun
    > came out with ZFS which is all software/file system raid but it's
    > pretty new and specific to the goals of the file system.
    > You can use software to pool multiple controllers together (as in a
    > volume manager) but using software to raid is just so 1986.
    >

    We are talking about "consumer raid" here, with cheap chips and maybe
    even "bios raid", which just masquerades as hardware raid, and most of the
    time onboard controllers which die with the mobo.

    Now, a hardware raid goes with its controller - you can't put the drives
    in another box and expect the raid to be recognized again.

    A software raid relies on information stored on the drives, so you can even
    swap them between different linux versions/distributions, as long as the
    controller and raid personality are supported by the kernel. Monitoring a
    linux software raid is a breeze as well (mdadm will send mail on drive
    failures if configured properly), while with hardware raid you rely on
    special drivers.
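
    As a minimal sketch of that monitoring (the mail address and device
    names are placeholders, adjust for your own setup):

        # /etc/mdadm.conf - mdadm mails this address on failure events
        MAILADDR you@example.com
        ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sdd1

        # run the monitor in the background; most distros ship an init
        # script that does this for you
        mdadm --monitor --scan --daemonise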

    --
    vista policy violation: Microsoft optical mouse found penguin patterns
    on mousepad. Partition scan in progress to remove offending
    incompatible products. Reactivate MS software.
    Linux 2.6.17mm,Xorg7.2/nvidia [LinuxCounter#295241,ICQ#4918962]

  4. Re: building on your own a large data storage ...

    Faeandar writes:
    >On Wed, 04 Jul 2007 04:12:06 -0700, lbrtchx@hotmail.com wrote:
    >
    >> Hi,
    >>~
    >> I need to store a really large number of texts and I (could) have a
    >>number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    >>to use to somehow build a large and safe (RAID-5?) data store
    >>~

    >
    >How safe do you want it? RAID 5 is not the safest RAID, it's
    >basically better than nothing, but as RAID goes it's definitely entry
    >level.
    >Raid 1, Raid 1+0, Raid 5+1, Raid 6, etc. are all safer Raid levels.


    RAID 5 is good for surviving one disk failure, and RAID 1 with two
    disks (the usual configuration) is also good for surviving one disk
    failure.

    >> People in the know use software based RAID. Could you give me links
    >>to these kinds of discussions?
    >>~

    >
    >What people are these? No one uses software raid except in rare or
    >specialized cases.


    We use software RAID, and that's all I recommend except in rare and
    specialized cases. If you want to go with hardware or fake RAID, buy
    at least two controllers (or motherboards if you are using an on-board
    controller), so that you can still access the volumes when a
    controller dies.

    - anton
    --
    M. Anton Ertl                      Some things have to be seen to be believed
    anton@mips.complang.tuwien.ac.at   Most things have to be believed to be seen
    http://www.complang.tuwien.ac.at/anton/home.html

  5. Re: building on your own a large data storage ...

    > Ask me in about a month or 2 when I've done my 1.5TB box.
    Well, yeah! Tim, you appear to know better what you are doing and I
    hope you are gonna post your findings/experience back here and/or
    send them to me via email. After thinking about it I think I would
    go for SATA drives, but it all depends on the mobo + case
    ~
    I have given a good chunk of my time to finding out and trying to build
    such a box on my own; what I find a bit tantalizing is how little (true and
    reliable) information you find about something that shouldn't be that
    difficult. Most of what you find is "buy-my-sh!t" kinds of infomercials/
    advertorials.
    ~
    Even the google lab semi-gods complain about it. I spent some time
    reading their "findings" (http://labs.google.com/papers/disk_failures.pdf),
    but found their own paper fishy in the same sense
    ~
    * they did not really publish their info aggregated down to the drive
    manufacturers + model, yet they complain about other people not
    doing the same thing for "confidentiality reasons"
    ~
    * they do mention (I found it somewhat silly) the importance of
    taking into account the app usage patterns and other OS
    characteristics (including the FS used)
    ~
    To me these pieces of information are crucial. If you don't know the
    manufacturer's + model (+ even vintage/production cycles) you can't
    tell how much of the drive internal cache was being used.
    ~
    If you don't know the type of application, OS and even file system
    type (is it a DB with intensive insertions/updates, some read-only
    data or a serial append-only file?) you can't interpret the failures either
    ~
    Also there was a question I posted some time ago on
    comp.os.linux.hardware: "smartmontool + software-based RAID" about the
    positions of hard drives that is still not clear to me. I could
    imagine some of you with more resources out there could test this
    ~
    Something I notice is that some hard drives are placed vertically in
    some computers and horizontally in others, but I think (actually
    it could even be proved) drives (their electro-mechanical parts) are a
    lot less taxed if they spin vertically, but I don't hear anyone
    talking about these kinds of things ...
    ~
    I am talking about the torque needed to reach rotational momentum
    and actually spin the drive.
    ~
    http://en.wikipedia.org/wiki/Torque
    ~
    I would even say that there is a linear relationship (depending on
    the lever) between the torque needed to move a flat cylinder (a
    platter) horizontally and to move it vertically. I am a freaking
    physicist, if you couldn't tell ;-)
    ~
    I wonder how it is manufacturers can say it does not matter. I mean
    I would even consider earth's magnetism to place the drives in certain
    ways
    ~
    They told me that it does not matter (or manufacturers say so) how
    you place the drives, most obviously with modern fluid bearings, but I
    think it physically does. Even though drives are fed with enough power
    to spin in whichever position you place them, the amount of work they
    need to do and therefore the power they consume would not be the same
    ~
    lbrtchx


  6. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com coughed up some electrons that declared:

    >> Ask me in about a month or 2 when I've done my 1.5TB box.

    > Well, yeah! Tim, you appear to know better what you are doing and I
    > hope you are gonna post your findings/experience back here and/or
    > send them to me via email. After thinking about it I think I would
    > go for SATA drives, but it all depends on the mobo + case


    Sure thing - I will. I'm still debating whether to use Solaris 10 + ZFS
    which looks cool, but is a bit alien to me (= I will have to bugger about a
    lot testing and breaking it to feel comfortable with all my data on it) or
    to use Linux RAID which I'm used to.

    I've settled on an ASUS M2N-E mobo with Athlon 64 X2 CPU, partly because the
    board has 6 SATA-II ports (I'll add a couple more with a card to make it 8)
    and partly because that board should run Linux and Solaris 10 equally well
    by all accounts.

    > ~
    > I have given a good chunk of my time to finding out and trying to build
    > such a box on my own; what I find a bit tantalizing is how little (true and
    > reliable) information you find about something that shouldn't be that
    > difficult. Most of what you find is "buy-my-sh!t" kinds of infomercials/
    > advertorials.


    Yep - I agree. I've spent ages researching boards too.

    > Even the google lab semi-gods complain about it. I spent some time
    > reading their "findings" (http://labs.google.com/papers/disk_failures.pdf),
    > but found their own paper fishy in the same sense
    > ~
    > * they did not really publish their info aggregated down to the drive
    > manufacturers + model, yet they complain about other people not
    > doing the same thing for "confidentiality reasons"


    That was annoying.

    > * they do mention (I found it somewhat silly) the importance of
    > taking into account the app usage patterns and other OS
    > characteristics (including the FS used)
    > ~
    > To me these pieces of information are crucial. If you don't know the
    > manufacturer's + model (+ even vintage/production cycles) you can't
    > tell how much of the drive internal cache was being used.
    > ~
    > If you don't know the type of application, OS and even file system
    > type (is it a DB with intensive insertions/updates, some read-only
    > data or a serial append-only file?) you can't interpret the failures either
    > ~
    > Also there was a question I posted some time ago on
    > comp.os.linux.hardware: "smartmontool + software-based RAID" about the
    > positions of hard drives that is still not clear to me. I could
    > imagine some of you with more resources out there could test this
    > ~
    > Something I notice is that some hard drives are placed vertically in
    > some computers and horizontally in others, but I think (actually
    > it could even be proved) drives (their electro-mechanical parts) are a
    > lot less taxed if they spin vertically, but I don't hear anyone
    > talking about these kinds of things ...
    > ~
    > I am talking about the torque needed to reach rotational momentum
    > and actually spin the drive.
    > ~
    > http://en.wikipedia.org/wiki/Torque
    > ~
    > I would even say that there is a linear relationship (depending on
    > the lever) between the torque needed to move a flat cylinder (a
    > platter) horizontally and to move it vertically. I am a freaking
    > physicist, if you couldn't tell ;-)
    > ~
    > I wonder how it is manufacturers can say it does not matter. I mean
    > I would even consider earth's magnetism to place the drives in certain
    > ways
    > ~
    > They told me that it does not matter (or manufacturers say so) how
    > you place the drives, most obviously with modern fluid bearings, but I
    > think it physically does. Even though drives are fed with enough power
    > to spin in whichever position you place them, the amount of work they
    > need to do and therefore the power they consume would not be the same
    > ~
    > lbrtchx


    I've had drives in horizontal and vertical configurations at work (1000's
    of drives in total). Whilst I don't disagree with the above, in practise I
    don't think it actually makes a lot of difference. The buggers will die
    randomly in small numbers occasionally whatever you do.

    I think keeping them reasonably cool is the best bet (but not freezing cold
    according to the Google report).

    Cheers

    Tim

  7. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:
    > Hi,
    > ~
    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    > to use to somehow build a large and safe (RAID-5?) data store


    You don't say how large it needs to be. Raid 5 gives you
    (drives-1)*size in space. So with four drives of 500Gigs each you get
    1.5T (ignoring the difference between manufacturer size and real world
    size). Need more space, go with 750Gig drives. The 1T disks are
    probably too pricey to consider.

    > * I will have to use standard (and commercially available (meaning
    > cheap ;-))) x86-based hardware and open source software
    > ~
    > * heat dissipation could become a problem with so many hard drives
    > ~


    My motherboard supports 12 SATA drives. I'd probably need a few extra
    fans and a bigger power supply, though. If you need more than four
    drives then you are probably beyond the realm of "cheap", but you don't
    necessarily have to go to "bloody expensive".

    > ~
    > What would be my weak/hotspot points in my kind of design?


    You've covered it. Heat is going to be a big factor so you'll need a
    case with lots of cooling capability.

    A quality power supply will be essential.

    With that many drives in the box, drive failure becomes an even bigger
    factor. I'd probably go with two sets of RAID 5 or a RAID 5/1
    combination (if I'm really paranoid), but the latter is a bit costly as
    you use twice as many disks; space = (drives-1)*size/2.

    Consider off-the-shelf solutions, too. There are some decent home
    network quality filers out there. I wouldn't use them in a critical
    business job, but something like that might be good enough for your needs.

    --
    Ogre

  8. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:
    > Even the google lab semi-gods complain about it, I spent some time
    > reading their "findings" (http://labs.google.com/papers/
    > disk_failures.pdf), but found their own paper fishy in the same sense
    > ~
    > * they did not really publish their info aggregated down to the drive
    > manufacturers + model however they complain about other people not
    > doing the same thing for "confidentiality reasons"


    I read through it also and thought they couldn't reveal make/model for
    that reason. OK, I only scanned it quickly...

    > * they do mention (I found it somewhat silly) the importance of
    > taking into account the app usage patterns and other OS
    > characteristics (including the FS used)


    Not silly at all. Drive usage and FS *can* impact reliability. Good
    researchers will eliminate all externalities.

    > To me these pieces of information are crucial. If you don't know the
    > manufacturer's + model (+ even vintage/production cycles) you can't
    > tell how much of the drive internal cache was being used.


    Explain, please. How does the "drive internal cache" being used have
    anything to do with reliability? The only thing I can think of would be
    when data is accessed a second time before that drive cache is flushed.
    I'm not a google engineer, but my guess is that doesn't happen much on
    their servers...

    > Something I notice is that some hard drives are placed vertically in
    > some computers and horizontally in others, but I think (actually
    > it could even be proved) drives (their electro-mechanical parts) are a
    > lot less taxed if they spin vertically, but I don't hear anyone
    > talking about these kinds of things ...


    I would suggest the only measurable difference would be on the bearings.

    > I am talking about the torque needed to reach rotational momentum
    > and actually spin the drive.
    > ~
    > http://en.wikipedia.org/wiki/Torque
    > ~
    > I would even say that there is a linear relationship (depending on
    > the lever) between the torque needed to move a flat cylinder (a
    > platter) horizontally and to move it vertically. I am a freaking
    > physicist, if you couldn't tell ;-)


    No, I couldn't tell. (B.S., Math, Physics, C.S., 1985)

    > I wonder how it is manufacturers can say it does not matter. I mean
    > I would even consider earth's magnetism to place the drives in certain
    > ways


    Besides the strain on the bearings being different, the only other thing
    I can think of would be the aerodynamics of the R/W heads needing to
    compensate for both orientations. Gravity sucks after all...

    > They told me that it does not matter (or manufacturers say so) how
    > you place the drives, most obviously with modern fluid bearings, but I


    Which might obviate my concern about the bearings. I haven't studied
    hard drive manufacturing techniques in some years, since before fluid
    bearings. Thought solid state storage (holographic crystals, 1TB in a 1
    cubic centimeter crystal, access times < 10 ns) would be perfected "any
    day now". 12 years later, we're still using spinning metal. :-(

    > think it physically does. Even though drives are fed with enough power
    > to spin in whichever position you place them, the amount of work they
    > need to do and therefore the power they consume would not be the same


    Are you talking about friction at the bearing or something else? If
    something else explain, please. Assumptions, equations, etc...


    BTW, I'll probably be buying drives and controller(s) for a 2TB RAID5,
    maybe two, for my Mythtv setup sometime this summer. Software RAID
    (mdadm), SATA 500GB (WD5000YS,
    http://www.westerndigital.com/en/pro...38&language=en),
    either ext3 or IBM's JFS, etc...

  9. Re: building on your own a large data storage ...

    >> * I will have to use standard (and commercially available (meaning
    >> cheap ;-))) x86-based hardware and open source software
    >> ~
    >> * heat dissipation could become a problem with so many hard drives
    >> ~

    >
    > My motherboard supports 12 SATA drives. I'd probably need a few extra
    > fans and a bigger power supply, though. If you need more than four
    > drives then you are probably beyond the realm of "cheap", but you don't
    > necessarily have to go to "bloody expensive".
    >
    >> ~
    >> What would be my weak/hotspot points in my kind of design?

    >
    >


    I think that the discussion here is only half of the story:

    it is not too difficult to build a > 1TB storage system/NAS. But the more
    disks used to make up the capacity, the higher the risk of failure!

    So whatever RAID solution is used, it is no substitute for a regular and
    reliable backup! And with these capacities the backup system (tape) will
    be more expensive than the whole box!

    Regards,
    Ingo

  10. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:

    ....

    >> Most (S)ATA drives are specced for a relatively low duty cycle (8 hours per day), while enterprise drives are specced for 24/7 operation

    > ~
    > Hmm! Which means we should not consider SATA drives for enterprise/
    > server 24/7 boxes, right?


    Not unless they're lightly loaded (even though *operating* 24/7).

    > ~
    >>> To me these pieces of information are crucial. If you don't know the
    >>> manufacturer's + model (+ even vintage/production cycles) you can't
    >>> tell how much of the drive internal cache was being used.

    > ~
    >> That's probably almost completely irrelevant (at most, in a
    >> write-intensive environment with disk-level write-back caching turned on
    >> and lots of nominally synchronous small-write activity from the host
    >> that it could not handle in its own write-back cache it might matter some).

    > ~
    > Well, I don't think that having a discussion about something we don't
    > really know (what is actually happening in google's server
    > infrastructure) is worthwhile,


    Speak for yourself: I suspect that I have a fairly good idea of what's
    happening there.

    > but there are a few
    > things I could tell you from a programming point of view.


    That is beginning to seem increasingly unlikely.

    > ~
    > I think google (most) probably uses caches aggressively. Why, do I
    > think so?


    Because you don't understand the effect that a far larger
    file-system-level cache has on the utility of a smallish disk cache (at
    least for read activity)?

    > ~
    > * search engine data are more about reading than writing
    > ~
    > * as people page through the ten thousand or so hits they get they
    > are basically served with a cached (previous and next) page, as a way
    > to speed up search retrieval
    > ~
    > * many people search for the same terms (what is in the news (e.g. the
    > pix of Abu Ghraib Prison abuses of the US gov/military (which by the
    > way google sanitized off their servers)), some celebrities crap "paris
    > hilton in and out of prison" ..., commercial stuff "is the latest ipod
    > out?" ...) over relatively long periods of time
    > ~
    >>> I would even say that there is a linear relationship (depending on
    >>> the lever) between the torque needed to move a flat cylinder (a
    >>> platter) horizontally and to move it vertically.

    > ~
    > Are you questioning Physics or me? What I stated there can be easily
    > proved; there are formulas for the torque needed to rotate a cylinder
    > horizontally and vertically


    And they have nothing to do with the static orientation of the axis of
    rotation (well, I couldn't swear that they don't in the kind of gravity
    gradient one would find in the vicinity of a black hole, but otherwise...).

    > and no, I am not a student.


    You're certainly not much of a physicist.

    - bill

  11. Re: building on your own a large data storage ...

    > > Well, I don't think that having a discussion about something we don't
    > > really know (what is actually happening in google's server
    > > infrastructure) is worthwhile,

    ~
    > Speak for yourself: I suspect that I have a fairly good idea of what's
    > happening there.

    ~
    > > and no, I am not a student.

    ~
    > You're certainly not much of a physicist

    ~
    "not much"? ... OK, Mr. "I-know-what-happens-inside-google-servers"
    define muchNESS for me. You may be right. Ja, ja, ... Also I am
    talking here in this public newsgroup more in a general than
    personally to you
    ~
    I wonder what makes you be so certain about me not being a physicist,
    but hey even though google salary might be definitely enough for you
    to lose a bet on it, I would rather ask you to convince your sugar
    momm[y|ies] (or extensively so) to come clear with "their findings".
    Not every body out there is so technically cynical
    ~
    http://www.usenix.org/events/fast07/...tml/index.html
    ~
    Unfortunately, many aspects of disk failures in real systems are not
    well understood, probably because the owners of such systems are
    reluctant to release failure data or do not gather such data. As a
    result, practitioners usually rely on vendor specified parameters,
    such as mean-time-to-failure (MTTF), to model failure processes,
    although many are skeptical of the accuracy of those models [4,5,33].
    Too much academic and corporate research is based on anecdotes and
    back of the envelope calculations, rather than empirical data [28].
    ~
    [4] J. G. Elerath. AFR: problems of definition, calculation and
    measurement in a commercial environment. In Proc. of the Annual
    Reliability and Maintainability Symposium, 2000.

    [5] J. G. Elerath. Specifying reliability in the disk drive industry:
    No more MTBFs. In Proc. of the Annual Reliability and Maintainability
    Symposium, 2000.

    [33] Jimmy Yang and Feng-Bin Sun. A comprehensive review of hard-disk
    drive reliability. In Proc. of the Annual Reliability and
    Maintainability Symposium, 1999.

    [28] T. Schwarz, M. Baker, S. Bassi, B. Baumgart, W. Flagg, C. van
    Ingen, K. Joste, M. Manasse, and M. Shah. Disk failure investigations
    at the internet archive. In Work-in-Progress session, NASA/IEEE
    Conference on Mass Storage Systems and Technologies (MSST2006), 2006.
    ~
    lbrtchx


  12. Re: building on your own a large data storage ...

    ["Followup-To:" header set to comp.sys.ibm.pc.hardware.storage.]
    On 2007-07-04, lbrtchx@hotmail.com wrote:

    > I need to store a really large number of texts and I (could) have a
    > number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
    > to use to somehow build a large and safe (RAID-5?) data store
    > ~
    > * I will have to use standard (and commercially available (meaning
    > cheap ;-))) x86-based hardware and open source software
    > ~
    > * AFAIK you could maximally use 4 hard drives in such boxes


    On a motherboard with 2 IDE ports, you cannot make a 4-disk
    RAID-5 array because doing I/O on two devices on the same IDE
    port gives poor performance.

    You could make two RAID-1 arrays each having one disk on the
    primary IDE port and one on the secondary IDE port. Performance
    will still suck when you do I/O on both arrays at the same time
    but when one array is idle, the other will work OK.

    This is of course not as good as RAID-5 from a disk space/euro
    POV.

    > Should I go for ATA or SATA drives and why?


    SATA is better because 1) it doesn't have the master/slave
    issues of IDE, i.e. if you have 4 SATA ports on your
    motherboard, you *can* do a 4-disk RAID-5 array and 2)
    motherboards with 8 SATA ports are easy to find.

    > * heat dissipation could become a problem with so many hard drives


    I would not want to do it without adequate ventilation.

    > * I need a reliable and stable power supply


    Fortron FSP-400-60GLN works for me. We have had issues with
    Antec.

    > People in the know use software based RAID. Could you give me links
    > to these kinds of discussions?


    The archives of the linux-raid mailing list (the administration
    tool is called mdadm).
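
    For instance, creating a 4-disk RAID-5 with it goes roughly like this
    (a sketch; the device names are placeholders for your own disks):

        mdadm --create /dev/md0 --level=5 --raid-devices=4 \
              /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
        cat /proc/mdstat          # watch the initial build/resync
        mkfs.ext3 /dev/md0        # or hand /dev/md0 to LVM first
        mdadm --detail /dev/md0   # layout and health of the array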

    > What would be my weak/hotspot points in my kind of design?


    For me, the time was spent on
    - understanding mdadm,
    - understanding the trade-offs (partitioning an array of disks vs.
    making an array of partitions, using LVM or not, optimum
    granularity) and
    - hardware (how to fit 8 or more disks in a PC case with decent
    ventilation).

    > Any suggestions of the type of boxes/racks I should use?


    3ware make 3-disks-in-2-5.25"-spaces trays. They are expensive
    and the fans they use die after about a year. When a fan goes
    bad, the tray helpfully warns you about it by beeping loudly and
    constantly. The fans are not the easiest to find (60 mm or some
    such). Be prepared to hear a lot of beeping.

    I made my own trays. It was a lot of work and they look ugly but
    it was cheap and they do the job. The ventilation is superior to
    commercial trays (a 120 mm fan that moves a lot of air quietly and
    reliably).

    One very important thing about RAID that too many people
    overlook : don't make an N-disk array from N disks of the same
    make and model bought the same day. Our sysadmin at work did and
    both drives on a RAID-1 array failed within days of each
    other...

    --
    André Majorel
    (Counterfeit: elyzekef@sinuous.com abyfiv@demur.com)
    "Duty, honor, country" -- Douglas MacArthur
    "Travail, famille, patrie" -- Philippe Pétain

  13. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:

    ....

    >>> and no, I am not a student.

    > ~
    >> You're certainly not much of a physicist

    > ~
    > "not much"? ... OK, Mr. "I-know-what-happens-inside-google-servers"
    > define muchNESS for me. You may be right.


    Unless you can come up with convincing support for your drivel about
    varying amounts of energy required to spin a cylinder depending upon the
    static orientation of its axis of rotation with respect to the gravity
    field, I suspect most people with any understanding whatsoever of the
    subject will conclude that I *am* right.

    > Ja, ja, ... Also I am
    > talking here in this public newsgroup more in general than
    > personally to you
    > ~
    > I wonder what makes you so certain about me not being a physicist,


    I think I answered that above (if my earlier comment was not
    sufficiently specific for you to understand).

    > but hey, even though google's salary might be definitely enough for you
    > to lose a bet on it, I would rather ask you to convince your sugar
    > momm[y|ies] (or extensively so) to come clean with "their findings".
    > Not everybody out there is so technically cynical


    I'm really not all that cynical - just competent (at least in the
    subjects upon which I choose to venture a strong opinion).

    > ~
    > http://www.usenix.org/events/fast07/...tml/index.html


    Thanks - read it months ago (and understood it, apparently unlike yourself).

    - bill

  14. Re: building on your own a large data storage ...

    Bill Todd wrote:
    > Unless you can come up with convincing support for your drivel about
    > varying amounts of energy required to spin a cylinder depending upon the
    > static orientation of its axis of rotation with respect to the gravity
    > field, I suspect most people with any understanding whatsoever of the
    > subject will conclude that I *am* right.


    Don't chase this one away. I'm really interested in the new torque
    definition and formula.

  15. [OT]Re: building on your own a large data storage ...

    [edit]
    > I was amazed that he was questioning that
    > about gravity/the difference in the torque of a spinning cylinder
    > depending on the axis. From a mechanical point of view there definitely is
    > a difference and to me it is like second nature. Now, I am talking from a
    > mechanical point of view and this is just an illustration and NOT even a
    > representative case but just try to spin a bike's wheel horizontally and
    > vertically and you will notice the difference
    > ~


    I am not a physicist, but doesn't it also depend on the bearings
    which are carrying the weight and how the cylinder (or platters) was
    'designed' to spin?

    In the case of drives I thought they were designed to spin in either the
    vertical or horizontal plane.

    [edit]
    > gaging/measuring the amount of power a drive consumes in both
    > positions will ultimately say the truth
    > ~


    Certainly true. However, there is a point under which a small difference
    becomes essentially negligible for real world operation and that, rather
    than theory, is what most people will be thinking about. A small power
    difference would likely only be significant in huge arrays and you didn't
    say you're trying to do that.

    How did this topic get off on this tangent? It seems OT to me. Although
    interesting, it doesn't seem to have much to do with how to build your own
    large data storage.

    Rodney


  16. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:
    > Kevin Snodgrass:
    >
    >>Bad for the bottom line and all.

    >
    > I would disagree, but I see it is just my opinion. Among many other
    > points, why is it you pay them a dollar amount they can do whatever
    > they want with and you are not supposed to do whatever you want with a
    > freaking hard drive? Why is it that people have class action suits


    Wow, did you even read what I posted? Since Google (in this example)
    buys so many gazillion hard drives, I speculated that they have
    contracts with HDD manufacturers to get better pricing if Google
    promises not to publish data about HDD performance, reliability, speed,
    whatever, that includes Make/Model. If Google wanted to publish that
    info, they most certainly could, if they didn't sign a contract
    preventing such. And paid retail for the drives.

    > against tobacco companies and health care providers, but they leave
    > their data at the expense of "confidentiality agreements" and
    > "contracts"


    It's called the law. Maybe you should try reading up on it.

    If you sign a contract you are bound by the terms of the contract.
    Don't like the terms? Negotiate a better deal or take your business
    elsewhere.

    > and here I am not talking about the "laws" but about what I think is
    > right or not.


    OIC. When you are ready to talk reality and not fantasy, get back to me.

    > Also when you say "bad for the bottom line and all" I think you are
    > grossly overconfident about the "all" part. We, (most) people can be
    > so easily manipulated!!!


    Wow, where the hell did that come from? Did I say anything about
    manipulating you? The phrase "bad for the bottom line and all" is quite
    simple, it is in reference to the MBAs and CxOs tunnel vision on the
    10-Q for the next quarter.

    >>Bill Todd wrote:
    >>
    >>>Unless you can come up with convincing support for your drivel about
    >>>varying amounts of energy required to spin a cylinder depending upon the
    >>>static orientation of its axis of rotation with respect to the gravity
    >>>field, I suspect most people with any understanding whatsoever of the
    >>>subject will conclude that I *am* right.

    >
    >
    >>Don't chase this one away. I'm really interested in the new torque
    >>definition and formula.

    >
    > ~
    > Kevin, give me some time to get back to you on that one. I thought
    > Bill was just giving me sh!t. I was amazed that he was questioning
    > that about gravity/the difference in the torque of a spinning cylinder
    > depending on the axis. From a mechanical point of view there
    > definitely is a difference and to me it is like second nature. Now, I


    I'll be here, waiting.

    Oh, I did just peruse 2 of my college texts (University Physics, 6th Ed.
    Sears, Zemansky, Young; Mechanics 3rd Ed, Symon) and they both give
    torque as I remembered it: the cross product of the Force vector with
    the Radial vector. No mention of the gravity vector.
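
    Spelling that out (my own sketch of the standard argument, not from
    either text): gravity acts at the platter's center of mass, which lies
    on the spin axis, so its torque about the spindle is

        \tau_gravity = \vec{r} \times m\vec{g} = \vec{0}   (since \vec{r} = \vec{0})

    in any static orientation. Spinning up then takes \tau = I\alpha with
    I = (1/2) M R^2 for a uniform disk, and I does not depend on how the
    axis is tilted; only the load on the bearings differs between
    orientations.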

    > am talking from a mechanical point of view and this is just an
    > illustration and NOT even a representative case but just try to spin a
    > bike's wheel horizontally and vertically and you will notice the
    > difference


    No bicycles here. I've got 3 power drills, a Dremel tool and an angle
    grinder (remodeling a bedroom here) handy, would they do?

    > gaging/measuring the amount of power a drive consumes in both
    > positions will ultimately say the truth


    Got a Kill-O-Watt and a couple volt meters. Maybe I'll try that.

    > when I talked about drives' cache I didn't talk properly. yeah, of
    > course the drive's cache is way too little for me to consider it a big
    > factor in the drive's use, but the amount of RAM these applications
    > have and the RAM that databases use internally and the usage patterns
    > of these apps could make a huge difference


    Yes, that cache would all make a huge difference.

    Two pieces of data I would have liked to have seen from the Google
    study, or any other similar study, are gross blocks written and gross
    blocks read. I've seen that data for some drives, I definitely remember
    some SCSI drives having it, but not sure if that is available via SMART
    or not...

    Tried smartctl -a /dev/hdc and didn't find it.
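
    (For what it's worth, some drives do export vendor-specific counters
    along those lines; on drives that report them, attributes with names
    like Total_LBAs_Written/Total_LBAs_Read show up in the attribute table
    from "smartctl -A /dev/hdc", but plenty of drives simply don't have
    them.)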

    > ~
    > lbrtchx
    >


  17. Re: building on your own a large data storage ...

    > Oh, I did just peruse 2 of my college texts (University Physics, 6th Ed. Sears, Zemansky, Young; Mechanics 3rd Ed, Symon) and they both give torque as I remembered it: the cross product of the Force vector with the Radial vector. No mention of the gravity vector.
    ~
    Many school physics books and some of those "for engineers" are very
    simple and basic even about relatively over-simplistic (and
    simplified) Mechanics
    ~
    I think what your books missed is that the angular momentum of a
    solid body is calculated via an integration, and in this integration
    are considered
    ~
    * gravity
    * mass density distribution (pretty much constant for a hdd's
    platter)
    * lever
    * the object's volumetric form as "perceived" by the rotating axis
    ~
    Do a search on "gravitational torque". Some of these links may lead
    you there
    ~
    // __ http://en.wikipedia.org/wiki/Torque
    ~
    // __ http://en.wikipedia.org/wiki/Angular_momentum
    ~
    // __ http://en.wikipedia.org/wiki/Center_of_mass
    ~
    // __ http://physnet.org/home/modules/pdf_modules/m34.pdf
    ~
    // __ http://www.lightandmatter.com/html_b...ch05/ch05.html
    ~
    // __ http://physics.ucsd.edu/students/cou...ems_Week_9.pdf
    ~
    > > gaging/measuring the amount of power a drive consumes in both
    > > positions will ultimately say the truth

    >
    > Got a Kill-O-Watt and a couple volt meters. Maybe I'll try that.

    ~
    I would bet half of my right gut ;-) that you need to do more work,
    and therefore there will ultimately be more power consumption, to spin
    both the hard drive's platter and its magnetic read arm while the
    drive is horizontally placed. How much more I don't know
    ~
    lbrtchx


  18. Re: building on your own a large data storage ...

    lbrtchx@hotmail.com wrote:
    >>Oh, I did just peruse 2 of my college texts (University Physics, 6th Ed. Sears, Zemansky, Young; Mechanics 3rd Ed, Symon) and they both give torque as I remembered it: the cross product of the Force vector with the Radial vector. No mention of the gravity vector.

    >
    > ~
    > Many school physics books and some of those "for engineers" are very
    > simple and basic even about relatively over-simplistic (and
    > simplified) Mechanics


    Ah, so my University Physics books are Physics for Dummies. I see. Did
    I mention Mechanics, by Symon, is a grad school book?

    > I think what your books missed is that the angular momentum of a
    > solid body is calculated via an integration, and in this integration
    > are considered


    When you are dealing with a non-rigid body, you have a point. The
    platter(s) of a hard drive are quite rigid, therefore you do not have a
    point when dealing with hard drives.

    > * gravity


    Please, do explain. I'm still waiting for your explanation of how
    gravity can affect energy requirements for spinning a small aluminum
    disk, inre disk parallel to gravity vector vs. disk perpendicular to
    gravity vector.

    > * mass density distribution (pretty much constant for a hdd's
    > platter)


    Damn well better be. At spin rates of 5400, 7200, 10K and 15K rpm any
    non-constant mass distribution will cause serious issues.

    > * lever


    Constant in this case.

    > * the object's volumetric form as "perceived" by the rotating axis


    Constant in this case.

    > Do a search on "gravitational torque". Some of these links may lead
    > you there


    I already did. Nothing new here...

    >>> gaging/measuring the amount of power a drive consumes in both
    >>> positions will ultimately say the truth
    >>
    >> Got a Kill-O-Watt and a couple volt meters. Maybe I'll try that.

    >
    > ~
    > I would bet half of my right gut ;-) that you need to do more work,


    The phrase is "bet my right nut", as in gonad. I would suggest betting
    something less important, like maybe a box of Krispy Kremes.

    > and therefore there will ultimately be more power consumption, to spin
    > both the hard drive's platter and its magnetic read arm while the
    > drive is horizontally placed. How much more I don't know


    Probably non-existent, but most certainly smaller than a Kill-o-Watt
    (digital), cheap digital multi-meter (digital), or the Voltmeter on my
    engine analyzer (analogue) can resolve.

  19. Re: building on your own a large data storage ...

    > ~
    > > Many school physics books and some of those "for engineers" are very
    > > simple and basic even about relatively over-simplistic (and
    > > simplified) Mechanics

    ~
    > Ah, so my University Physics books are Physics for Dummies. I see. Did I mention Mechanics, by Symon, is a grad school book?

    ~
    I don't know about this particular book, but you could see what I
    meant. Even the wikipedia entry for the explanation of what "torque"
    is was so basic that it read like an odd joke to me
    ~
    > > I think what your books missed is that the angular momentum of a
    > > solid body is calculated via an integration, and in this integration
    > > are considered

    ~
    > When you are dealing with a non-rigid body, you have a point. The
    > platter(s) of a hard drive are quite rigid, therefore you do not have a
    > point when dealing with hard drives.

    ~
    Actually no. There is a form factor that factors (no pun intended)
    into the integral to calculate the angular momentum
    ~
    > > * gravity
    > > * lever
    > > * the object's volumetric form as "perceived" by the rotating axis

    ~
    > Please, do explain. I'm still waiting for your explanation of how
    > gravity can affect energy requirements for spinning a small aluminum
    > disk, inre disk parallel to gravity vector vs. disk perpendicular to
    > gravity vector.

    ~
    I still owe you that one
    ~
    > > Got a Kill-O-Watt and a couple volt meters. Maybe I'll try that.

    ~
    > > I would bet half of my right gut ;-) that you need to do more work,

    ~
    > The phrase is "bet my right nut", as in gonad. I would suggest betting
    > something less important, like maybe a box of Krispy Kremes.

    ~
    ;-)
    ~
    > > and therefore there will ultimately be more power consumption, to spin
    > > both the hard drive's platter and its magnetic read arm while the
    > > drive is horizontally placed. How much more I don't know

    ~
    > Probably non-existent, but most certainly smaller than a Kill-o-Watt
    > (digital), cheap digital multi-meter (digital), or the Voltmeter on my
    > engine analyzer (analogue) can resolve.

    ~
    Hmm! That was interesting to me! How did you make the drive spin?
    Issuing internal assembler code to just make it spin without any rw
    work and keeping its arm parked? Could you do us the favor of letting
    us know the specifics of how exactly you did it?
    ~
    If there is anything you can safely believe in, it is Physics
    ~
    lbrtchx

