Defraging and modern disks - Storage

  1. Defraging and modern disks

    Back when an $800 30MB RLL hard drive was the hot thing, I understood defragging.
    However, I was thinking recently about the difference between the apparent
    layout of a modern drive and the actual configuration. Since the drive
    controller maps the layout the OS thinks it is dealing with to the actual
    cyl/track/sector layout that is physically implemented on the disk, does a
    conventional defragmenting approach make sense? If the apparent relationship
    between two bits of file is not the same as the physical relationship, does
    defragging it really optimize or just potentially move a file piece to an equally
    arbitrary location on the disk?

    Also, on a RAID1 configuration, is there a guarantee that the two disks are
    written in exactly the same way? Is it possible that a hardware RAID1 will put
    the same data in two very different locations on two disks (thereby further
    negating the logic of a conventional defrag algorithm)?

    FWIW, the target platform under consideration is (cough) Windows XP.

    Mike

  2. Re: Defraging and modern disks

    Michael Daly wrote:

    > Back when an $800 30MB RLL hard drive was the hot thing, I understood defragging.


    Yes, it made more sense then.

    > However, I was thinking recently about the difference
    > between the apparent layout of a modern drive and the actual configuration. Since the drive
    > controller maps the layout the OS thinks it is dealing with to the actual cyl/track/sector layout
    > that is physically implemented on the disk, does a conventional defragmenting approach make sense?


    No, it doesn't, but for different reasons.

    The real reason it's pointless now, except in a few very unusual
    situations, is that modern hard drives seek so fast, and modern
    OSs move the heads around a hell of a lot anyway, even when
    doing something as basic as web browsing, because of the
    temporary internet cache. You can get significant fragmentation
    with very large video files in particular, but the speed of
    access to those is completely determined by the frame rate,
    so extra seeks between fragments are irrelevant to playback
    speed. Even when you are editing them, the speed is completely
    dominated by the time required to transcode them, not by
    head seek times.
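
    To put rough numbers on the playback argument, here is a minimal
    sketch. The 12 ms per extra seek, the 500 fragments and the 2-hour
    video are illustrative assumptions, not measurements, and
    extra_seek_seconds is a made-up helper name.

```python
def extra_seek_seconds(fragments, seek_ms=12.0):
    """Extra head-movement time from fragmentation: one seek per additional fragment.

    seek_ms is an assumed average cost per seek (including rotational latency).
    """
    return max(fragments - 1, 0) * seek_ms / 1000.0


playback_seconds = 2 * 60 * 60          # hypothetical 2-hour video
extra = extra_seek_seconds(fragments=500)
share = extra / playback_seconds        # fraction of playback time spent on extra seeks
print(f"extra seek time: {extra:.2f} s ({share:.3%} of playback time)")
```

    Under these assumptions the extra seeks add up to a few seconds
    spread over two hours of playback, which is why they do not show
    up when the read rate is set by the frame rate.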

    The only real advantage with defragged files now is that they
    can be easier to recover if you are stupid enough to not have
    full backups. But with hard drives so cheap now, you have
    to be completely stupid not to have adequate backups.

    > If the apparent relationship between two bits of file is not the same as the physical
    > relationship,


    They are, in the sense that a continuous series of logical blocks
    normally is still a continuous series of physical sectors. The only
    exception is reallocated sectors, which can involve an extra
    head move to the reallocated sector, but modern hard drives
    have so few of those that it's completely academic in reality.
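
    Since a defragmenter only sees logical block addresses, fragmentation
    can be counted at that level. A minimal sketch, assuming a file is
    described by the list of logical blocks it uses, in file order; the
    block numbers and the name count_fragments are made up for illustration.

```python
def count_fragments(lbas):
    """Count runs of consecutive logical block addresses, in file order."""
    fragments = 0
    previous = None
    for lba in lbas:
        # A new fragment starts whenever the next block is not previous + 1.
        if previous is None or lba != previous + 1:
            fragments += 1
        previous = lba
    return fragments


contiguous_file = [1000, 1001, 1002, 1003, 1004, 1005]
fragmented_file = [1000, 1001, 5000, 5001, 200, 201]
print(count_fragments(contiguous_file), count_fragments(fragmented_file))
```

    If consecutive logical blocks normally are consecutive physical sectors,
    reducing this count also reduces the number of head movements needed
    for a linear read.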

    > does defragging it really optimize


    Yes, it still does.

    > or just potentially move a file piece to an equally arbitrary location on the disk?


    Nope, that doesn't happen.

    > Also, on a RAID1 configuration, is there a guarantee that the two disks are written in exactly the
    > same way?


    Not a guarantee, but it's close enough to that.

    > Is it possible that a hardware RAID1 will put the same data in two very different locations on two
    > disks (thereby further negating the logic of a conventional defrag algorithm)?


    The short answer is no.

    > FWIW, the target platform under consideration is (cough) Windows XP.


    The other thing that many defraggers attempt to do is to locate files
    on the hard drive to maximise the speed of access. That's a separate
    issue from minimising the number of fragments each file has. But XP does
    that reorganisation itself, and it mostly only affects boot time anyway.



  3. Re: Defraging and modern disks

    Previously Michael Daly wrote:
    > Back when an $800 30MB RLL hard drive was the hot thing, I understood defragging.
    > However, I was thinking recently about the difference between the apparent
    > layout of a modern drive and the actual configuration. Since the drive
    > controller maps the layout the OS thinks it is dealing with to the actual
    > cyl/track/sector layout that is physically implemented on the disk, does a
    > conventional defragmenting approach make sense? If the apparent relationship
    > between two bits of file is not the same as the physical relationship, does
    > defragging it really optimize or just potentially move a file piece to an equally
    > arbitrary location on the disk?


    > Also, on a RAID1 configuration, is there a guarantee that the two
    > disks are written in exactly the same way? Is it possible that a
    > hardware RAID1 will put the same data in two very different
    > locations on two disks (thereby further negating the logic of a
    > conventional defrag algorithm)?


    > FWIW, the target platform under consideration is (cough) Windows XP.


    Defraggers have always optimized for linear reads. For those it is not
    relevant how the cylinders and heads are linearized. A RAID1 will always
    put the same data into the same logical sector, except for the RAID
    superblock. That is different for each RAID component (disk,
    or with software RAID: partition or file).
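
    A toy model of that RAID1 behaviour, as a sketch only: an in-memory
    mirror where every write lands on the same logical sector of each
    member and only the superblock differs. The class name Raid1Sketch
    and the superblock contents are hypothetical.

```python
class Raid1Sketch:
    """Toy mirror: identical data at identical logical sectors on every member."""

    def __init__(self, members, data_sectors):
        # Each member gets its own sector list and its own (differing) superblock.
        self.data = [[b""] * data_sectors for _ in range(members)]
        self.superblocks = [f"member-{i}".encode() for i in range(members)]

    def write(self, sector, payload):
        # A logical write goes to the same sector number on every member.
        for member in self.data:
            member[sector] = payload

    def read(self, sector, member=0):
        # Any member can serve the read, since the data areas are identical.
        return self.data[member][sector]


mirror = Raid1Sketch(members=2, data_sectors=8)
mirror.write(3, b"file piece")
same_data = mirror.read(3, member=0) == mirror.read(3, member=1)
same_superblock = mirror.superblocks[0] == mirror.superblocks[1]
print(same_data, same_superblock)
```

    In this model a file that is contiguous in logical sectors is
    contiguous on both members, so a defragmenter's view applies equally
    to each copy.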

    In addition, modern filesystems do not require defragmentation
    in many cases.

    Arno

  4. Re: Defraging and modern disks

    Michael Daly wrote:
    > Also, on a RAID1 configuration, is there a guarantee that the two disks
    > are written in exactly the same way? Is it possible that a hardware
    > RAID1 will put the same data in two very different locations on two
    > disks (thereby further negating the logic of a conventional defrag
    > algorithm)?



    There's absolutely no guarantee that the two disks are written exactly
    the same way whether it's software or hardware raid. Hardware RAID has
    no additional insight into the internal organization of its disks
    any more than software RAID does; hardware RAID is just software RAID
    moved into the disk storage array's processor. I've seen various volume
    managers write mirror data in completely opposite ends of each mirror
    disk. The only thing the software cares about is the overall size of the
    volumes, not the specific organization of each volume.

    Yousuf Khan

  5. Re: Defraging and modern disks

    Previously Yousuf Khan wrote:
    > Michael Daly wrote:
    >> Also, on a RAID1 configuration, is there a guarantee that the two disks
    >> are written in exactly the same way? Is it possible that a hardware
    >> RAID1 will put the same data in two very different locations on two
    >> disks (thereby further negating the logic of a conventional defrag
    >> algorithm)?



    > There's absolutely no guarantee that the two disks are written exactly
    > the same way whether it's software or hardware raid. Hardware RAID has
    > no additional insight into the internal organization of its disks
    > any more than software RAID does; hardware RAID is just software RAID
    > moved into the disk storage array's processor. I've seen various volume
    > managers write mirror data in completely opposite ends of each mirror
    > disk. The only thing the software cares about is the overall size of the
    > volumes, not the specific organization of each volume.


    Interesting. I did not know that. Care to name a manager that
    does this?

    Well, Linux software RAID 1 does write exactly the same to all
    disks. Except for the RAID superblock, of course. The RAID superblock
    is placed at the end of the disk/partition/file. The reason is that
    with these two properties you can mount each disk individually and
    unraided. This is one of the design criteria for the RAID-1
    implementation in the kernel and hence reliable.
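
    As an illustration of "superblock at the end", the sketch below
    computes where the superblock would start, assuming the version 0.90
    md metadata layout (a 64 KiB reserved area at the last 64 KiB-aligned
    position on the device). That layout is an assumption made for this
    sketch and is not stated in the post; other metadata versions place
    the superblock elsewhere. The function and constant names are made up.

```python
SECTOR_BYTES = 512
RESERVED_BYTES = 64 * 1024                      # assumed reserved area (0.90 layout)
RESERVED_SECTORS = RESERVED_BYTES // SECTOR_BYTES


def superblock_offset_sectors(device_sectors):
    """Round the device size down to a 64 KiB boundary, then step back one reserved area."""
    return (device_sectors & ~(RESERVED_SECTORS - 1)) - RESERVED_SECTORS


device_sectors = 1_000_000                      # hypothetical device size in sectors
offset = superblock_offset_sectors(device_sectors)
print(f"data occupies sectors 0..{offset - 1}, superblock starts at sector {offset}")
```

    Because the data area starts at sector 0, the beginning of each
    member looks like an ordinary disk, which is what makes mounting a
    single member on its own possible.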

    Arno

  6. Re: Defraging and modern disks

    Arno Wagner wrote:
    > Previously Yousuf Khan wrote:
    >> There's absolutely no guarantee that the two disks are written exactly
    >> the same way whether it's software or hardware raid. Hardware RAID has

    >
    > Interesting. I did not know that. Care to name a manager that
    > does this?


    The ones I'm most familiar with are the ones that run under Solaris,
    which are Solstice Disk Suite and Veritas Volume Manager. They both do it.

    > Well, Linux software RAID 1 does write exactly the same to all
    > disks. Except for the RAID superblock, of course. The RAID superblock
    > is placed at the end of the disk/partition/file. The reason is that
    > with these two properties you can mount each disk individually and
    > unraided. This is one of the design criteria for the RAID-1
    > implementation in the kernel and hence reliable.



    The RAID superblock sounds like the same thing as what they call
    Metadata in Disk Suite, or Private sections in Volume Manager. They just
    maintain the persistent organization data for their respective volume
    management software.

    And these are just the software RAID products. In hardware RAID, you
    have even less control over placement, and the storage array just chooses
    the disks for you.

    Yousuf Khan

  7. Re: Defraging and modern disks

    Previously Yousuf Khan wrote:
    > Arno Wagner wrote:
    >> Previously Yousuf Khan wrote:
    >>> There's absolutely no guarantee that the two disks are written exactly
    >>> the same way whether it's software or hardware raid. Hardware RAID has

    >>
    >> Interesting. I did not know that. Care to name a manager that
    >> does this?


    > The ones I'm most familiar with are the ones that run under Solaris,
    > which are Solstice Disk Suite and Veritas Volume Manager. They both do it.


    >> Well, Linux software RAID 1 does write exactly the same to all
    >> disks. Except for the RAID superblock, of course. The RAID superblock
    >> is placed at the end of the disk/partition/file. The reason is that
    >> with these two properties you can mount each disk individually and
    >> unraided. This is one of the design criteria for the RAID-1
    >> implementation in the kernel and hence reliable.



    > The RAID superblock sounds like the same thing as what they call
    > Metadata in Disk Suite, or Private sections in Volume Manager. They just
    > maintain the persistent organization data for their respective volume
    > management software.


    They are. The smart thing Linux software RAID does is placing
    them at the end, so the beginning looks like an ordinary disk.

    > And these are just the software RAID products. In hardware RAID, you
    > have even less control over placement, and the storage array just chooses
    > the disks for you.


    Agreed. One reason I like software RAID better. I can just plug the
    disks into any other PC in any way, boot some current Linux
    and get at my data. If a hardware controller goes up in smoke,
    no such easy solution.

    Arno

  8. Re: Defraging and modern disks

    Arno Wagner wrote:
    > They are. The smart thing Linux software RAID does is placing
    > them at the end, so the beginning looks like an ordinary disk.


    These other products do the same thing too, usually. The exception
    is when you're converting an existing non-mirrored boot disk to a
    mirrored one. In that case, the software has to put the RAID metadata
    wherever it can, so it often just steals a bit of space from the swap
    partition and puts the RAID metadata there.

    >> And these are just the software RAID products. In hardware RAID, you
    >> have even less control over placement, and the storage array just chooses
    >> the disks for you.

    >
    > Agreed. One reason I like software RAID better. I can just plug the
    > disks into any other PC in any way, boot some current Linux
    > and get at my data. If a hardware controller goes up in smoke,
    > no such easy solution.


    I'd say for mirroring, a good software RAID package is just as good as
    any hardware RAID. It's only when you're doing RAID5 that hardware RAID
    makes a bit of a difference to performance. And yet, hardware RAID5
    still can't compete against software RAID0+1 for maximum performance.

    Yousuf Khan

  9. Re: Defraging and modern disks

    Previously Yousuf Khan wrote:
    > Arno Wagner wrote:
    >> They are. The smart thing Linux software RAID does is placing
    >> them at the end, so the beginning looks like an ordinary disk.


    > These other products do the same thing too, usually. The exception
    > is when you're converting an existing non-mirrored boot disk to a
    > mirrored one. In that case, the software has to put the RAID metadata
    > wherever it can, so it often just steals a bit of space from the swap
    > partition and puts the RAID metadata there.


    >>> And these are just the software RAID products. In hardware RAID, you
    >>> have even less control over placement, and the storage array just chooses
    >>> the disks for you.

    >>
    >> Agreed. One reason I like software RAID better. I can just plug the
    >> disks into any other PC in any way, boot some current Linux
    >> and get at my data. If a hardware controller goes up in smoke,
    >> no such easy solution.


    > I'd say for mirroring, a good software RAID package is just as good as
    > any hardware RAID. It's only when you're doing RAID5 that hardware RAID
    > makes a bit of a difference to performance. And yet, hardware RAID5
    > still can't compete against software RAID0+1 for maximum performance.


    That matches my experience. At least under Linux, software RAID
    is a match for hardware RAID. And it is cheaper, better integrated
    into the system, more flexible and easier to manage.

    Arno

  10. Re: Defraging and modern disks

    Arno Wagner wrote:
    > That matches my experience. At least under Linux, software RAID
    > is a match for hardware RAID. And it is cheaper, better integrated
    > into the system, more flexible and easier to manage.


    Yeah, the only really time-consuming, processor-intensive part of RAID
    is the parity calculation for RAID5 (and its variations). Especially when
    you've lost a disk and you're rebuilding data on the fly from parity. The
    second most intense use of processing power is when you're building new
    parity from write operations. If you are performance constrained, but not
    capacity constrained, then you should always choose RAID1 mirroring over
    RAID5 parity.
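
    The parity work mentioned above comes down to XOR over the blocks
    of a stripe. A minimal sketch, assuming a single stripe with three
    data blocks; the block contents and the helper name xor_blocks are
    made up, and real implementations work on much larger blocks with
    optimized routines.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]       # one stripe across three data disks
parity = xor_blocks(data)                # parity block for the stripe

# Degraded read: the disk holding data[1] is lost; rebuild it from the rest.
rebuilt = xor_blocks([data[0], data[2], parity])

# Small write: new parity from old parity, old data and new data.
new_block = b"ZZZZ"
updated_parity = xor_blocks([parity, data[1], new_block])

print(rebuilt == data[1], updated_parity == xor_blocks([data[0], new_block, data[2]]))
```

    Rebuilding touches every surviving block in the stripe, while a
    small write only needs the old data and old parity, which matches
    the ordering of the two costs described above.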

    Yousuf Khan

  11. Re: Defraging and modern disks

    Previously Yousuf Khan wrote:
    > Arno Wagner wrote:
    >> That matches my experience. At least under Linux, software RAID
    >> is a match for hardware RAID. And it is cheaper, better integrated
    >> into the system, more flexible and easier to manage.


    > Yeah, the only really time-consuming, processor-intensive part of RAID
    > is the parity calculation for RAID5 (and its variations). Especially when
    > you've lost a disk and you're rebuilding data on the fly from parity. The
    > second most intense use of processing power is when you're building new
    > parity from write operations. If you are performance constrained, but not
    > capacity constrained, then you should always choose RAID1 mirroring over
    > RAID5 parity.


    Actually, I found that even RAID6 is not too hard on the CPU
    on a dual-core system when writing. Reading is no problem with a
    non-degraded array, of course.

    Arno
