File system full corrupts grub boot process - Solaris

  1. File system full corrupts grub boot process

    While downloading and installing software on a new laptop running
    Solaris 10 8/07, I inadvertently filled up the root partition. I
    noticed the problem, cleaned up some space and continued w/o any
    apparent problems.

    However, later when I rebooted the system, instead of the grub menu, I
    got a grub commandline. Has anyone else seen this? Repairing grub
    isn't a big deal, but I'm concerned that something else got corrupted
    and will bite me later.

    thanks,
    Reg

  2. Re: File system full corrupts grub boot process

    A bit more information:

    When I started to repair the damage, I discovered that the disk label
    had been trashed. Not really a surprise given the problem, but I'm
    puzzled as to how a write to a file on a mounted filesystem could
    overwrite the label.

    After booting miniroot from the CD, format(1m) was able to recover a
    backup label. After that fsck seems to have fixed things. However, I'm
    still a bit spooked. If this were Linux instead of Solaris I wouldn't
    think too much of it, but I expect Solaris to be immune to such things.

    This is a new Acer 5520 laptop, so I've got all the usual skittishness
    about device drivers and/or flaky hardware.

    Comments or suggestions?

    Reg

  3. Re: File system full corrupts grub boot process

    On Feb 23, 3:00 pm, Reginald Beardsley wrote:
    >
    > When I started to repair the damage, I discovered that the disk label
    > had been trashed. Not really a surprise given the problem, but I'm
    > puzzled as to how a write to a file on a mounted filesystem could
    > overwrite the label.


    This should never happen unless, possibly, you've got some disk layout
    such that a slice is sitting on top of wherever the label is. I
    forget how you arrange that or if it is possible at all, although I
    have heard of it happening.

    I suspect something else trashed the label.

  4. Re: File system full corrupts grub boot process

    Tim Bradshaw wrote:
    > On Feb 23, 3:00 pm, Reginald Beardsley wrote:
    >
    >>When I started to repair the damage, I discovered that the disk label
    >>had been trashed. Not really a surprise given the problem, but I'm
    >>puzzled as to how a write to a file on a mounted filesystem could
    >>overwrite the label.

    >
    >
    > This should never happen unless, possibly, you've got some disk layout
    > such that a slice is sitting on top of wherever the label is. I
    > forget how you arrange that or if it is possible at all, although I
    > have heard of it happening.
    >
    > I suspect something else trashed the label.


    Such as? The layout was created using the installer. It had 3 slices:
    root, swap & export. There was a single primary partition mounted and
    mozilla was downloading a large file into the root slice.

    The download failed and when I investigated I saw the root filesystem
    was full. I deleted stray cruft and restarted the download. After it
    completed I rebooted and discovered the problem.

    I just tried an experiment. I unmounted everything but the root slice
    and did a dd of /dev/zero to a file in the root filesystem. After the
    disk filled up I deleted the file and attempted to reboot. The reboot
    failed w/ a "bad PBR". In this case the Solaris label is intact & the
    PC partition table is intact, but the primary boot block was blown away,
    so grub doesn't even start to load this time.

    The Solaris partition is in cylinders 1 - 10199 w/ the root filesystem
    slice in cylinders 2043-3017 and swap in 3-767.

    installgrub fixed it, but it is a bit unnerving to get different
    results, so I did this again several times while writing this note. For
    these I didn't unmount /export, but the problem never occurred again in
    4 attempts.

    Bad hardware??? It's a new Acer 5520 laptop. On two occasions it
    appeared to panic (went by too fast to read) booting miniroot off the
    installation CD, but then worked fine when I rebooted. I just put it
    down to a slightly off CD-R at the time.

    Reg

  5. Re: File system full corrupts grub boot process

    On Feb 25, 3:04 pm, Reginald Beardsley wrote:

    >
    > Such as? The layout was created using the installer. It had 3 slices:
    > root, swap & export. There was a single primary partition mounted and
    > mozilla was downloading a large file into the root slice.


    No idea. I am reasonably certain, however, that filling / will not
    trash labels or boot blocks in any normal case, or there would be an
    awful lot of broken systems about.

  6. Re: File system full corrupts grub boot process

    In comp.unix.solaris Reginald Beardsley wrote:
    >> This should never happen unless, possibly, you've got some disk layout
    >> such that a slice is sitting on top of wherever the label is. I
    >> forget how you arrange that or if it is possible at all, although I
    >> have heard of it happening.
    >>
    >> I suspect something else trashed the label.

    >
    > Such as?


    A bug? Some other program? Don't know.

    Which "label" are you referring to? (x86 MBR label or Solaris VTOC label)

    > The layout was created using the installer. It had 3 slices:
    > root, swap & export. There was a single primary partition mounted and
    > mozilla was downloading a large file into the root slice.


    Really it doesn't matter. All UFS filesystems skip the first 16 blocks
    of the slice they occupy. Unless the label extends past that point,
    there's no way for normal filesystem writes to affect it.
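
    As a rough illustration of that arithmetic (a sketch only, assuming
    512-byte sectors and taking the "first 16 blocks" figure above at face
    value; the function names and the example slice are hypothetical):

```python
SECTOR_BYTES = 512       # assumed sector size
RESERVED_SECTORS = 16    # "first 16 blocks" that UFS skips, per the post


def ufs_writable_range(slice_start, slice_sectors):
    """Return (first, last) absolute sectors that UFS itself may write."""
    first = slice_start + RESERVED_SECTORS
    last = slice_start + slice_sectors - 1
    return first, last


def can_reach(slice_start, slice_sectors, sector):
    """True if normal filesystem writes in this slice could land on `sector`."""
    first, last = ufs_writable_range(slice_start, slice_sectors)
    return first <= sector <= last


if __name__ == "__main__":
    # Hypothetical slice starting at sector 0 of the partition, 1000 sectors long.
    print(RESERVED_SECTORS * SECTOR_BYTES)   # size of the skipped area in bytes
    print(can_reach(0, 1000, 1))             # a label kept in sector 1
    print(can_reach(0, 1000, 16))            # first sector past the skipped area
```

    Under those assumptions the skipped area is 8 KB, so a label that lives
    in the first few sectors of a slice is out of reach of ordinary file
    writes even when the slice starts at the very beginning of the partition.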

    I've filled many a filesystem (certainly including a great number of
    root filesystems). I've not seen this occur. So I doubt it is as
    simple as a misconfiguration.

    > I just tried an experiment. I unmounted everything but the root slice
    > and did a dd of /dev/zero to a file in the root filesystem. After the
    > disk filled up I deleted the file and attempted to reboot. The reboot
    > failed w/ a "bad PBR". In this case the Solaris label is intact & the
    > PC partition table is intact, but the primary boot block was blown away,
    > so grub doesn't even start to load this time.
    >
    > The Solaris partition is in cylinders 1 - 10199 w/ the root filesystem
    > slice in cylinders 2043-3017 and swap in 3-767.


    I assume those "cylinders" are from the Solaris VTOC, and so they are
    only within the Solaris partition (not referencing the raw disk).

    Unless the MBR is within the Solaris partition, just the act of writing
    to the filesystem can't be doing this. It would have to be related to
    something going off and trashing the boot blocks (either correctly but
    unintentionally, like installgrub, or incorrectly, like a bug).
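
    A quick sanity check of the quoted layout can be sketched like this. It
    assumes the cylinder numbers are relative to the Solaris partition and
    that the label and boot blocks sit in cylinder 0 of that partition
    (both are assumptions, not something confirmed in this thread). The
    range for export was not given, so it is left out.

```python
# Cylinder ranges (inclusive) as quoted in the post; export was not given.
SLICES = {"root": (2043, 3017), "swap": (3, 767)}

# Assumption: label and boot blocks live in cylinder 0 of the partition.
RESERVED_CYLINDERS = {0}


def covers(rng, cylinder):
    """True if the inclusive cylinder range contains `cylinder`."""
    return rng[0] <= cylinder <= rng[1]


def overlap(a, b):
    """True if two inclusive cylinder ranges share at least one cylinder."""
    return a[0] <= b[1] and b[0] <= a[1]


def risky_slices(slices, reserved):
    """Names of slices that sit on top of any reserved cylinder."""
    return sorted(
        name for name, rng in slices.items()
        if any(covers(rng, c) for c in reserved)
    )


if __name__ == "__main__":
    print(risky_slices(SLICES, RESERVED_CYLINDERS))
    print(overlap(SLICES["root"], SLICES["swap"]))
```

    With the numbers quoted, neither root nor swap covers cylinder 0 and the
    two do not overlap each other, which is consistent with the view that
    plain filesystem writes should not have been able to reach the boot area.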

    > installgrub fixed it, but it is a bit unnerving to get different
    > results, so I did this again several times while writing this note. For
    > these I didn't unmount /export, but the problem never occurred again in
    > 4 attempts.


    Darn. If it were reproducible, you could probably track it down
    (perhaps with dtrace watching all disk I/O).

    > Bad hardware??? It's a new Acer 5520 laptop. On two occasions it
    > appeared to panic (went by too fast to read) booting miniroot off the
    > installation CD, but then worked fine when I rebooted. I just put it
    > down to a slightly off CD-R at the time.


    Not impossible. I especially don't like that you can't reproduce the
    problem on demand.

    --
    Darren Dunham ddunham@taos.com
    Senior Technical Consultant TAOS http://www.taos.com/
    Got some Dr Pepper? San Francisco, CA bay area
    < This line left intentionally blank to confuse you. >

  7. Re: File system full corrupts grub boot process

    Reginald Beardsley wrote:
    > A bit more information:
    >
    > When I started to repair the damage, I discovered that the disk label
    > had been trashed. Not really a surprise given the problem, but I'm
    > puzzled as to how a write to a file on a mounted filesystem could
    > overwrite the label.
    >
    > After booting miniroot from the CD, format(1m) was able to recover a
    > backup label. After that fsck seems to have fixed things. However,
    > I'm still a bit spooked. If this were Linux instead of Solaris I
    > wouldn't think too much of it, but I expect Solaris to be immune to
    > such things.
    >
    > This is a new Acer 5520 laptop, so I've got all the usual skittishness
    > about device drivers and/or flaky hardware.
    >
    > Comments or suggestions?
    >
    > Reg

    Hmm, I can confirm it can happen. Less than two weeks ago I had the
    pleasure of it.
    Mine occurred during a patch update via Update Manager. At reboot it
    ended up without a PBR, and I had to use installgrub to get it back.
