[9fans] ata drive capabilities
Hi 9fans,
can someone on this list tell me how to interpret the config part of
cpu% cat /dev/sdC0/ctl
inquiry WDC WD1600JB-00REA0
config 427A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 0 lba48always off
I am trying to figure out whether the disk signals the implementation
of the SMART feature set.
Kind regards,
Christian
--
You may use my gpg key for replies:
pub 1024D/47F79788 2005/02/02 Christian Kellermann (C-Keen)
Re: [9fans] ata drive capabilities
You can tell this from your BIOS setup.
On Dec 25, 2007, at 4:40 PM, Christian Kellermann wrote:
> Hi 9fans,
>
> can someone on this list tell me how to interpret the config part of
> cpu% cat /dev/sdC0/ctl
> inquiry WDC WD1600JB-00REA0
> config 427A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16
> rwmctl 0 lba48always off
>
> I am trying to figure out whether the disk signals the implementation
> of the SMART feature set.
>
> Kind regards,
>
> Christian
>
> --
> You may use my gpg key for replies:
> pub 1024D/47F79788 2005/02/02 Christian Kellermann (C-Keen)
Re: [9fans] ata drive capabilities
> Hi 9fans,
>
> can someone on this list tell me how to interpret the config part of
> cpu% cat /dev/sdC0/ctl
> inquiry WDC WD1600JB-00REA0
> config 427A capabilities 2F00 dma 00550020 dmactl 00550020 rwm 16 rwmctl 0 lba48always off
>
> I am trying to figure out whether the disk signals the implementation
> of the SMART feature set.
>
> Kind regards,
in the data returned by identify (packet) device, if bits 15:14 of word 83 read 01
(bit 14 set, bit 15 clear), then smart support is indicated by word 82 bit 0.
otherwise smart isn't supported.
word 49 is the capabilities word and the important bits of the configuration
are:
bit 8 dma support
bit 9 lba support
bits 10,11 iordy configuration
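The bit tests described above can be sketched in a few lines of Python. This is only an illustration, not driver code. It numbers bits from 0 (least significant), as the ATA specification does, so SMART is bit 0 of word 82 and words 82 and 83 are valid when word 83 has bit 14 set and bit 15 clear. It also assumes that the "capabilities 2F00" value in the ctl output is identify word 49. The function names are hypothetical, and the ctl file does not show words 82 and 83, so those have to come from the raw identify data.

```python
# Sketch: decode the ATA IDENTIFY DEVICE words discussed in this message.
# Bits are numbered from 0 (least significant bit).

def decode_capabilities(word49):
    """Return the word 49 capability bits named in the message."""
    return {
        "dma": bool(word49 & (1 << 8)),                    # bit 8: dma supported
        "lba": bool(word49 & (1 << 9)),                    # bit 9: lba supported
        "iordy_may_be_disabled": bool(word49 & (1 << 10)), # bit 10
        "iordy_supported": bool(word49 & (1 << 11)),       # bit 11
    }

def smart_supported(word82, word83):
    """Word 82 bit 0 reports SMART, but only if word 83 bits 15:14 read 01."""
    words_valid = (word83 & 0xC000) == 0x4000
    return words_valid and bool(word82 & 1)

# Assumption: "capabilities 2F00" from the ctl output above is word 49.
caps = decode_capabilities(0x2F00)
print(caps)
```

Under that assumption, 2F00 has bits 8 through 11 set, so the drive reports dma, lba and iordy support.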
the intel/amd sata driver supports smart commands via
echo smartenable >/dev/sdXX/ctl # turn the drive's smart on.
echo smart >/dev/sdXX/ctl # report smart status.
this isn't implemented in the sdata driver, but i think a similar
strategy could be employed. note: smart commands are not dma
commands. also, smart support doesn't imply much about which
commands are supported or much about the return values.
the status report, which says whether the drive is likely to fail, seems the most useful.
bios isn't always helpful in this regard. some bios don't report
smart status. some bios do a smart check on power on and won't
boot with a drive that smart considers suspect. (we have a drive
in the lab that smart declares will fail any minute now. it's been
this way for 2 years.) this can be a big problem if you have a
machine with raid that won't boot due to a drive failure.
(why have a raid if one failure means an unbootable machine?)
- erik
Re: [9fans] ata drive capabilities
On Dec 25, 2007 6:59 PM, erik quanstrom <quanstro@quanstro.net> wrote:
> (we have a drive
> in the lab that smart declares will fail any minute now. it's been
> this way for 2 years.)
From everything I've seen, SMART has zero correlation with real
hardware issues -- confirmed by a discussion with someone at a big
search company. SMART is dumb.
> this can be a big problem if you have a
> machine with raid that won't boot due to a drive failure.
> (why have a raid if one failure means an unbootable machine?)
it makes great ad copy.
ron
Re: [9fans] ata drive capabilities
On Wed Dec 26 01:33:14 EST 2007, rminnich@gmail.com wrote:
> On Dec 25, 2007 6:59 PM, erik quanstrom <quanstro@quanstro.net> wrote:
> > (we have a drive
> > in the lab that smart declares will fail any minute now. it's been
> > this way for 2 years.)
>
> From everything I've seen, SMART has zero correlation with real
> hardware issues -- confirmed by a discussion with someone at a big
> search company. SMART is dumb.
the google paper shows a 40% afr for the first 6 months after some
smart errors appear. (unfortunately they don't do numbers for
a simple smart status.)
from my understanding of how google does things, losing a drive just
means they need to replace it. so it's cheaper to let drives fail.
on the other hand, we have our main filesystem raided on an aoe
appliance. suppose that one of those raids has two disks showing
a smart status of "will fail". in this case i want to know the elevated
risk and i will allocate a spare drive to replace at least one of the
drives.
i guess this is the long way of saying, it all depends on how painful
losing your data might be. if it's painful enough, even a poor tool
like smart is better than nothing.
- erik
Re: [9fans] ata drive capabilities
Thanks for your replies!
The reason I asked is that I am thinking about a couple of methods
to detect failures of my two mirrored disks (by fs) automatically.
How do you check if your disks are still ok? I know I could invest
in a real raid controller and rely on that, but I still like the
idea of being independent of yet another piece of hardware whose
internal formatting of the disk is hidden from me.
Kind regards,
Christian
Re: [9fans] ata drive capabilities
> How do you check if your disks are still ok?
i used to run cmp on the two mirrored partitions to verify weekly
that devfs hadn't missed anything. i would imagine if one of the disks
went south cmp would fail.
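The weekly check described above amounts to reading both halves of the mirror and reporting the first place they disagree. A minimal Python sketch of that idea follows, assuming both partitions can be opened as ordinary files. The function name and the stand-in files are hypothetical; on a real system the two arguments would be the partition paths. A read error on a failing disk would surface as an exception rather than as a reported offset.

```python
import os
import tempfile

def first_difference(path_a, path_b, block_size=1 << 20):
    """Return the byte offset of the first mismatch, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if block_a != block_b:
                # Locate the exact byte inside the mismatching block.
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # One side ended early: they differ at the shorter length.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                return None  # both files ended without a mismatch
            offset += len(block_a)

# Usage with two small stand-in files instead of real partitions.
with tempfile.TemporaryDirectory() as d:
    left = os.path.join(d, "left")
    right = os.path.join(d, "right")
    with open(left, "wb") as f:
        f.write(b"venti" * 100)
    with open(right, "wb") as f:
        f.write(b"venti" * 99 + b"ventX")
    same = first_difference(left, left)
    diff = first_difference(left, right)
print(same, diff)
```

Reading in large blocks keeps the comparison from being dominated by per-call overhead. Like cmp, it also reports a difference when one side is simply shorter than the other.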
Re: [9fans] ata drive capabilities
Christian Kellermann wrote:
> Thanks for your replies!
>
> The reason I asked is that I am thinking about a couple of methods
> to detect failures of my two mirrored disks (by fs) automatically.
> How do you check if your disks are still ok? I know I could invest
The newer high capacity drives all have high "raw read error rate" values but
they're generally all corrected, as indicated by an equivalent value of
"hardware ECC recovered." So this does not seem to really correspond to
anything.
Frankly, the one SMART variable I've seen that seems to always
correspond to impending disk failure is the "reallocated sector count."
Once you see that incrementing, it's time to decommission that disk
for anything other than scratch storage. Such drives are good for
intermediate files of non-linear video editing until they die.
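The rule of thumb above (act once the reallocated sector count starts incrementing) can be expressed as a check over a series of readings taken over time. The sketch below is only an illustration: the function name is hypothetical, and it assumes the raw counter values have already been collected by some other means. Nothing in this thread shows a Plan 9 interface that reports individual SMART attributes.

```python
def reallocation_growing(readings):
    """readings: reallocated sector counts, oldest first.

    Return True if the count increased between any two consecutive readings.
    """
    return any(later > earlier
               for earlier, later in zip(readings, readings[1:]))

# Example: a stable nonzero count is not flagged, a rising one is.
stable = reallocation_growing([5, 5, 5])
rising = reallocation_growing([0, 0, 3])
print(stable, rising)
```

Comparing consecutive readings rather than testing for a nonzero value matches the advice in the message: the trigger is the counter moving, not its absolute level.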
Re: [9fans] ata drive capabilities
> Thanks for your replies!
>
> The reason I asked is that I am thinking about a couple of methods
> to detect failures of my two mirrored disks (by fs) automatically.
> How do you check if your disks are still ok? I know I could invest
> in a real raid controller and rely on that, but I still like the
> idea of being independent of yet another piece of hardware whose
> internal formatting of the disk is hidden from me.
>
> Kind regards,
>
> Christian
i'm not sure you need that. fs(3) already logs i/o errors which should
be as good an indication of trouble as smart. i/o errors have the benefit
of not being drive dependent. the recovery will need to be done by
hand anyway, as fs doesn't have the concept of device state. (there are
some subtle difficulties, too. sometimes drives read an lba correctly
but writes fail.)
for something more automatic, devices would need state and an
online recovery mechanism.
- erik
Re: [9fans] ata drive capabilities
* andrey mirtchovski <mirtchovski@gmail.com> [071226 21:17]:
> > How do you check if your disks are still ok?
>
> i used to run cmp on the two mirrored partitions to verify weekly
> that devfs hadn't missed anything. i would imagine if one of the disks
> went south cmp would fail.
Ah Thanks!
I have fs(3) configured as follows:
cpu% cat /dev/fs/ctl
mirror arenas /dev/sdD0/arenas /dev/sdD1/arenas
mirror isect /dev/sdD0/isect /dev/sdD1/isect
mirror fscfg /dev/sdD0/fscfg /dev/sdD1/fscfg
And my disks have been formatted and partitioned beforehand like this:
cpu% lc sdD0
arenas ctl data fscfg isect plan9 raw
cpu% lc sdD1
arenas ctl data fscfg isect plan9 raw
Still, if I now run cmp against the arenas partitions from the running
system it says they differ, and ls says the same:
cpu% lc -ld sdD*
--rw-r----- S 0 bootes bootes 285779527168 Dec 5 03:37 arenas  --rw-r----- S 0 bootes bootes 285777530368 Dec 5 03:37 arenas
--rw-r----- S 0 bootes bootes            0 Dec 5 03:37 ctl     --rw-r----- S 0 bootes bootes            0 Dec 5 03:37 ctl
--rw-r----- S 0 bootes bootes 300069052416 Dec 5 03:37 data    --rw-r----- S 0 bootes bootes 300069052416 Dec 5 03:37 data
--rw-r----- S 0 bootes bootes          512 Dec 5 03:37 fscfg   --rw-r----- S 0 bootes bootes          512 Dec 5 03:37 fscfg
--rw-r----- S 0 bootes bootes  14288976384 Dec 5 03:37 isect   --rw-r----- S 0 bootes bootes  14288876544 Dec 5 03:37 isect
--rw-r----- S 0 bootes bootes 300068504064 Dec 5 03:37 plan9   --rw-r----- S 0 bootes bootes 300066407424 Dec 5 03:37 plan9
-lrw------- S 0 bootes bootes            0 Dec 5 03:37 raw     -lrw------- S 0 bootes bootes            0 Dec 5 03:37 raw
I cannot see the /dev/fs directory from my server terminal, so I
wonder if fs is even active. But if it is not, I wonder how my venti
tells me it is using /dev/fs/arenas for storage.
I guess that fs(3) is working properly, otherwise it would not show
the correct configuration on a simple bind. But then why do the
disks differ, and would it suffice to dd the contents of sdD0 to
sdD1 using the live cd?
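The sizes in the listing above can be compared directly. The short Python sketch below works out the gap for each partition in bytes and in sectors, assuming 512-byte sectors. The names are hypothetical, and "first" and "second" refer only to the left and right columns of the listing, since the listing itself does not say which disk is which.

```python
SECTOR = 512  # assumed sector size in bytes

# (left column, right column) sizes copied from the listing above.
sizes = {
    "arenas": (285779527168, 285777530368),
    "isect": (14288976384, 14288876544),
    "plan9": (300068504064, 300066407424),
    "data": (300069052416, 300069052416),
}

def size_gaps(sizes, sector=SECTOR):
    """Map each partition name to (byte difference, difference in sectors)."""
    gaps = {}
    for name, (first, second) in sizes.items():
        diff = first - second
        gaps[name] = (diff, diff // sector)
    return gaps

gaps = size_gaps(sizes)
for name, (nbytes, nsectors) in gaps.items():
    print(name, nbytes, nsectors)
```

The data files are the same size, while the plan9 partitions differ by 4095 sectors, split between arenas (3900) and isect (195). That suggests the two disks were partitioned slightly differently. A comparison of whole partitions would then report a difference because of the length mismatch alone, whether or not the bytes they have in common agree.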
Kind regards,
Christian