Shadow set problem finally solved - VMS


  1. Shadow set problem finally solved

    Well, I have posted before about a shadow set problem that I had: hard
    errors in the error log, only during a shadow set merge or
    ANAL/DISK/SHAD, that were apparently not in blocks allocated to files.

    This is a 2 disk shadowset on a DS10 running 7.3.2.

    I got 2 new disks to replace the old ones.

    I swapped out one disk and merged it in. But I got the same errors on
    the new disk at exactly the same LBAs!

    Then I swapped the other disk and merged it in. I got the same
    errors on it at the same LBAs!

    To be exact, I get a few more errors on the source disk for the merge,
    probably due to retries, but all at the same 4 LBAs on both disks.

    Here's what I think happened: I had one bad disk that had parity errors
    (or a glitch that established some parity errors). Once those errors
    were established, they were just copied from disk to disk every time I
    merged a shadowset.

    One thing I don't know is why ANAL/DISK/SHAD/BLOCKS=FILE hits these
    parity errors, since all other tests indicate that the reported LBAs
    are not in a file. ANAL/DISK/READ reported no errors, and DFU
    indicates that the LBAs are not in a file.

    To get rid of these errors I am in the process of trying this:

    1. Broke a disk out of the shadow set and did an image save and
    compress on it.

    2. Booted on that disk and remerged the shadow set.

    That merge is in progress. I will see if I have any errors on it.

  2. Re: Shadow set problem finally solved

    On Mar 6, 1:24 pm, tadamsmar wrote:
    > [...]
    > That merge is in progress. I will see if I have any errors on it.


    Nope, did not fix the problem. Errors at the same 4 LBAs.

  3. Re: Shadow set problem finally solved

    tadamsmar writes:

    >Well, I have posted about a shadow set problem that I had: Hard
    >errors in the error log only during a shadow set merge or
    >ANAL/DISK/SHAD that were apparently not in blocks allocated to files.
    >
    >This is a 2 disk shadowset on a DS10 running 7.3.2.
    >
    >I got 2 new disks to replace the old ones.
    >
    >I swapped out one disk and merged it in. But I got the same errors on
    >the new disk at exactly the same LBAs!
    >
    >Then I swapped the other disks and merged it in. I got the same
    >errors on it at the same LBAs!
    >
    >To be exact, I get a few more errors on the source disk for the merge,
    >probably due to retries, but all at the same 4 LBAs on both disks.
    >
    >Here's what I think happened: Had one bad disk that had parity errors
    >(or a glitch that established some parity errors). Once those errors
    >were established, they are just copied from disk to disk every time I
    >merged a shadowset.

    That is what shadowing is supposed to do. Say you have a parity error in
    some important file. You'd rather know about that than get random data,
    correct? With shadowing, if there is another member available, the same
    block is read from the other member to attempt to retrieve the data. But
    if there is no other member, the error is returned to the caller.

    Now let's say you start with a 1 member set with a bad block, and you
    add a member. All blocks from the first member are copied to the new
    member. When it reaches the bad block, there is no good data. So the
    corresponding block on the new drive is also marked bad so random data
    will not be returned to the caller as good.

    Now if you remove the original member and add another disk, the copy
    takes place again. This time the block (on the second disk) is seen as
    bad even though there really isn't anything physically wrong with that
    particular block. The "badness" of the block is copied again, and any
    reads of that block on the shadowset will return a parity error since all
    members of the set have it marked as bad.

    How do you get rid of the bad block, esp. if you "know" it's not really
    bad (by replacing all the drives with new ones)? Just write to the block.
    Easier said than done, perhaps, but if the block is in a file that you can
    restore, just $ delete/erase the file and then restore it. When I worked
    on shadowing, I had a little utility that could write (or read) any
    block, or mark a block as bad.
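
    The propagation described in this post can be modelled in a few lines.
    This is only a toy sketch under stated assumptions, not how the shadowing
    driver is implemented: the `Disk` class, its `forced_error` set and the
    `shadow_copy` function are hypothetical names invented for illustration.
    It shows a flag set on one disk following the data through two successive
    copies, and a plain write clearing it.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    """Hypothetical model of a disk: block data plus per-block error flags."""
    nblocks: int
    data: dict = field(default_factory=dict)          # lbn -> bytes
    forced_error: set = field(default_factory=set)    # lbns flagged unreadable

    def read(self, lbn):
        # A flagged block returns an error instead of possibly random data.
        if lbn in self.forced_error:
            raise IOError(f"parity error at LBN {lbn}")
        return self.data.get(lbn, b"\0")

    def write(self, lbn, payload):
        # Assumption of this model: a successful write clears the flag.
        self.data[lbn] = payload
        self.forced_error.discard(lbn)


def shadow_copy(source, target):
    """Copy every block; unreadable source blocks are flagged on the target."""
    for lbn in range(source.nblocks):
        try:
            target.write(lbn, source.read(lbn))
        except IOError:
            target.forced_error.add(lbn)


# One original disk with a flagged block, then two replacement disks.
old = Disk(8)
old.forced_error.add(5)
new1 = Disk(8)
shadow_copy(old, new1)      # first replacement inherits the flag
new2 = Disk(8)
shadow_copy(new1, new2)     # second replacement inherits it from the first
```

    In this model both new disks end up with LBN 5 flagged even though nothing
    is physically wrong with them, and calling `new2.write(5, ...)` removes
    the flag, which matches the "just write to the block" advice.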

  4. Re: Shadow set problem finally solved

    On Mar 6, 2:33 pm, moro...@world.std.spaamtrap.com (Michael Moroney)
    wrote:
    > [...]
    > How do you get rid of the bad block, esp. if you "know" it's not really
    > bad (by replacing all the drives with new ones)? Just write to the block.
    > Easier said than done, perhaps, but if the block is in a file that you can
    > restore, just $ delete/erase the file and then restore it.


    It seems these blocks are not in a file because:

    1. No errors from ANAL/DISK/READ

    2. No errors when I break a disk out of the shadow set and do an image
    backup of the disk.

    But I get parity errors when I run ANAL/DISK/SHADOW/BLOCKS=FILE, which
    implies the blocks are in a file. But when I check the LBAs where the
    errors are reported using DFU, I find that the LBA is not in a file!

    I am not sure how an image backup could fail to encounter an error if
    it is not in a file.

    It is very confusing to me.

    I was thinking about running an INIT/ERASE on one of the disks (broken
    out of the shadowset) and then restoring it from an image backup, then
    booting on that disk, then merging the shadowset.

    But I already tried a restore of a disk. I am not sure how I could
    end up with these parity errors after I restored the disk from a save
    set with no parity error reported. Can you explain that?

    Thanks for any help. I am stumped.

  5. Re: Shadow set problem finally solved

    ANAL/DISK/SHADOW/BLOCK=FILE_SYSTEM reads the disks using big QIOs
    (exactly 127 blocks, as can be seen with SDA> IO tracing). Let's just
    assume that when it hits a parity error, it reports it but incorrectly
    applies the check of whether the bad block read as part of the 127-block
    QIO is really in a file. This could be seen as a software problem and
    reported to HP.

    Volker.
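
    The hypothesis in this post can be illustrated with a little arithmetic.
    The sketch below is an assumption-laden model, not the utility's real
    logic: it assumes the 127-block reads are aligned from LBN 0, and the
    extent list and LBNs are made-up example values. It shows how a bad block
    that is not in any file can still sit inside a 127-block read that also
    covers file blocks.

```python
CHUNK = 127  # blocks per read, per the SDA observation in the post above


def chunk_for(lbn, chunk=CHUNK):
    """Return (first, last) LBN of the read that contains lbn.

    Assumes reads are aligned to multiples of `chunk` starting at LBN 0.
    """
    start = (lbn // chunk) * chunk
    return start, start + chunk - 1


def block_in_file(lbn, extents):
    """True if lbn itself lies inside one of the (start, end) file extents."""
    return any(start <= lbn <= end for start, end in extents)


def chunk_touches_file(lbn, extents, chunk=CHUNK):
    """True if the read containing lbn overlaps any file extent."""
    lo, hi = chunk_for(lbn, chunk)
    return any(start <= hi and end >= lo for start, end in extents)


# Hypothetical layout: one file occupying LBNs 1000..1050, bad block at 1100.
extents = [(1000, 1050)]
bad_lbn = 1100
```

    With these example values the read covering LBN 1100 spans 1016 to 1142,
    so it overlaps the file even though the bad block itself is free space. A
    check applied per read rather than per block would then report the error
    as if it were in a file.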

  6. Re: Shadow set problem finally solved

    In article <9aca2038-4e20-4bb4-850e-4036ca4b82d2@59g2000hsb.googlegroups.com>, tadamsmar writes:
    >{...snip...}
    >Nope, did not fix the problem. Errors at the same 4 lbas.


    Have you ever determined what is in the LBAs in question?

    What file(s) use the LBAs?


    --
    VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)COM

    "Well my son, life is like a beanstalk, isn't it?"

    http://tmesis.com/drat.html

  7. Re: Shadow set problem finally solved

    In article <47d13a1d$0$5632$607ed4bc@cv.net>, VAXman- @SendSpamHere.ORG writes:
    > In article <9aca2038-4e20-4bb4-850e-4036ca4b82d2@59g2000hsb.googlegroups.com>, tadamsmar writes:
    >>{...snip...}
    >>Nope, did not fix the problem. Errors at the same 4 lbas.

    >
    > Have you ever determined what is in the LBAs in question?
    >
    > What file(s) use the LBAs?


    To the OP: just in case you are not aware of it, DFU can do
    it, e.g.,

    $ DFU SEAR SYS$DISK/LBN=1234567

    Disk and File Utilities for OpenVMS DFU V2.4
    Freeware version
    Copyright © 1996 Digital Equipment Corporation

    %DFU-I-SEARCH, Start search on SYS$DISK: (DSA5213

    DSA5213:[DIR1.DIR2.DIR3.DIR4.TAR20031006]USR.;1
    1761292/1761302

    %DFU-I-EOF, End of file INDEXF.SYS, Primary headers : 42828

    %DFU-S-FND , Files found : 1, Size : 1761292/1761302

    --
    George Cornelius cornelius A T eisner D O T decus D O T org
    cornelius A T mayo D O T edu




  8. Re: Shadow set problem finally solved

    tadamsmar writes:
    > I am not sure how an image backup could fail to encounter an error if
    > it is not in a file.


    BACKUP /IMAGE only copies the files on a disk. It doesn't copy the
    empty space.

    BACKUP /PHYSICAL would copy the empty space.
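
    A rough way to picture the difference, as a simplified sketch rather than
    a description of BACKUP internals: a file-oriented copy only reads blocks
    that belong to some file, while a block-for-block copy reads everything.
    The function names, extents and bad LBNs below are hypothetical example
    values.

```python
def blocks_read_by_file_copy(extents):
    """LBNs touched when only allocated (start, end) file extents are read."""
    return {lbn for start, end in extents for lbn in range(start, end + 1)}


def blocks_read_by_physical_copy(total_blocks):
    """LBNs touched when every block on the volume is read."""
    return set(range(total_blocks))


def errors_seen(blocks_read, bad_lbns):
    """Bad blocks that a given read pattern would actually encounter."""
    return sorted(bad_lbns & blocks_read)


# Hypothetical 100-block volume: two files, two bad blocks in free space.
extents = [(0, 9), (20, 39)]
bad_lbns = {15, 50}

file_copy_errors = errors_seen(blocks_read_by_file_copy(extents), bad_lbns)
physical_errors = errors_seen(blocks_read_by_physical_copy(100), bad_lbns)
```

    In this model the file-oriented copy sees no errors at all, while the
    physical copy trips over both bad blocks, which is consistent with an
    image backup completing cleanly when the bad LBAs are in free space.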

  9. Re: Shadow set problem finally solved

    tadamsmar writes:

    >Seems these blocks are not in a file because:


    >1. No errors from ANAL/DISK/READ


    >2. No errors when I break a disk out of the shadow set an do an image
    >backup of the disk.


    >But I get parity errors when I run ANAL/DISK/SHADOW/BLOCKS=FILE that
    >implies the blocks are in
    >a file. But when I check the LBAs where the errors are reported using
    >DFU I find that the LBA is not in a file!


    >I am not sure how an image backup could fail to encounter an error if
    >it is not in a file.


    I can't explain this other than ANAL/DISK/SHADOW/BLOCKS=FILE doesn't
    work as advertised.

    >I was thinking about running an INIT/ERASE on one of the disks (broken
    >out of the shadowset and then restoring it from an image backup, then
    >booting on that disk, then merging the shadowset.


    The INIT/ERASE will overwrite all blocks marked with parity errors that
    are actually good. Try: INIT/ERASE, restore, ANAL/DISK/SHADOW
    in that order before adding the other drive. Should be no errors.
    Repeat ANAL/DISK/SHADOW after adding the other drive.

    >But I already tried a restore of a disk. I am not sure how I could
    >end up with these parity errors after I restored the disk from a save
    >set with no parity error reported. Can you explain that?


    All I can say is the restore never wrote the blocks with parity errors.
    Which makes some sense if they are not in any files, although it is a
    bit odd that none of the restored files moved to those blocks.

  10. Re: Shadow set problem finally solved

    On Thu, 6 Mar 2008 17:09:44 -0800 (PST), tadamsmar
    wrote:


    Maybe if you posted all the messages returned from the ana/disk/read
    someone might be able to pick something out.

  11. Re: Shadow set problem finally solved

    Pete writes:
    > On Thu, 6 Mar 2008 17:09:44 -0800 (PST), tadamsmar
    > wrote:
    >
    >
    > Maybe if you posted all the messages returned from the ana/disk/read
    > someone might be able to pick somethin out.


    This thing really seems to be going nowhere. It sure seems to me that
    this is a standard, pure-vanilla problem of a shadow set having a forced
    error that VMS replicates on every shadow copy, something that's been
    around since Phase I shadowing and RA series drives.

    If that is the case, the solution is to overwrite the block in question.

    You should, just to be thorough, examine [000000]BADBLK.SYS and if it
    has any allocation, check its LBN's with $ DUMP/HEADER/BLOCK=C:0 .

    If your block's not there, and if it's not already in some other file,
    you have to allocate the block somehow.

    I've done it in the past by extending an existing zero-length file with
    some Macro code and an allocation XAB, something that's not too difficult
    once you home in on the correct settings, as in

    ExtendALQ=1
    StartLBN=16578125
    ExtendXAB: $XABALL ALQ=ExtendALQ,AOP=,ALN=LBN,LOC=StartLBN

    But you can also do this:

    $ create/fdl=sys$input JUNK.TMP001
    area 0;allocation 1;contig yes;exact_positioning yes;position logical 16578125


    Note that this seems to work as long as the cluster containing the block
    is available, in which case the starting point may be as much as
    cluster_size - 1 blocks earlier than was requested.

    Once you have the block allocated into a file, say 0 blocks in use of
    N blocks allocated, write to the file, making sure there is enough
    data to overwrite the bad block. Remember that you should write at
    least one cluster's worth of data even though you only asked for one
    block to be allocated.

    After redoing the shadow copy and checking that your errors have, in
    fact, gone away and everything is stable, you can delete the temporary
    file.

    [I haven't done this for a long time, so it's from memory. If this fails,
    init/erase a target drive, then back up to it from your master copy].

    --
    George Cornelius cornelius A T eisner D O T decus D O T org
    cornelius A T mayo D O T edu
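
    The cluster remark in this post comes down to simple arithmetic. The
    sketch below assumes clusters are aligned to multiples of the cluster
    size counted from LBN 0, and it uses a hypothetical cluster size of 4
    together with the example LBN from the post; check the real cluster size
    of your volume before relying on any numbers.

```python
BLOCK_SIZE = 512  # bytes per disk block on VMS


def cluster_extent(lbn, cluster_size):
    """Return (first, last) LBN of the cluster that contains lbn.

    Assumes clusters start at multiples of cluster_size from LBN 0.
    """
    start = lbn - (lbn % cluster_size)
    return start, start + cluster_size - 1


def bytes_to_cover_cluster(cluster_size, block_size=BLOCK_SIZE):
    """Minimum amount of data to write so the whole cluster is overwritten."""
    return cluster_size * block_size


# Example LBN from the post, with a hypothetical cluster size of 4 blocks.
first, last = cluster_extent(16578125, 4)
needed = bytes_to_cover_cluster(4)
```

    Here the allocation would start one block before the requested LBN, at
    16578124, and writing a full cluster of data (2048 bytes in this example)
    is what guarantees the target block is actually overwritten.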

  12. Re: Shadow set problem finally solved

    On Mar 8, 5:22 pm, BEGINcornel...@decuserve.orgEND (George Cornelius)
    wrote:
    > [...]
    > You should, just to be thorough, examine [000000]BADBLK.SYS and if it
    > has any allocation, check its LBN's with $ DUMP/HEADER/BLOCK=C:0 .
    > [...]

    How do I look at BADBLK.SYS for a shadow set? I have broken a member
    out of the shadowset and mounted it foreign before, but I don't get
    any useful info that way. ANAL/MEDIA seems to be completely
    independent of the shadowset algorithm for locating bad blocks. Are
    you sure a *shadow set* uses BADBLK.SYS? If so, how do I read it?

    How about doing this:

    1. Break out a member.
    2. Do a back/image on the member
    3. init/erase the member
    4. restore the image on the member
    5. boot on the member, making it the only member of the shadowset.
    6. merge the other disk into the shadowset

    Would that get rid of all these parity errors?

    It seems a little safer and easier (for me) than trying to write
    blocks to specific LBAs.

    If this procedure would work, do I need any qualifiers on init/erase,
    or do the defaults just reproduce the disk parameters that are
    already there?

  13. Re: Shadow set problem finally solved

    In article <8cyDTduSfM0m@eisner.encompasserve.org>, BEGINcornelius@decuserve.orgEND (George Cornelius) writes:
    >In article <10b5d716-18aa-4705-bec3-dc169725ada0@s19g2000prg.googlegroups.com>, tadamsmar writes:
    >> On Mar 8, 5:22pm, George Cornelius wrote:
    >>> This thing really seems to be going nowhere. It sure seems to me that
    >>> this is a standard, pure-vanilla problem of a shadow set having a forced
    >>> error that VMS replicates on every shadow copy, something that's been
    >>> around since Phase I shadowing and RA series drives.
    >>>
    >>> If that is the case, the solution is to overwrite the block in question.
    >>>
    >>> You should, just to be thorough, examine [000000]BADBLK.SYS and if it
    >>> has any allocation, check its LBN's with $ DUMP/HEADER/BLOCK=C:0 .

    >
    >
    >>> [...] you can also do this:
    >>>
    >>> $ create/fdl=sys$input JUNK.TMP001
    >>> area 0;allocation 1;contig yes;exact_positioning yes;position logical -
    >>> 16578125
    >>>

    >
    >[...]
    >
    >> How do I look at BADBLK.SYS for a shadow set?

    >
    >I supplied a $ DUMP command. But, never mind. I only mentioned it for
    >thoroughness. We already know in your case that the block is not allocated
    >anywhere.


    You *can* set blocks aside in BADBLK.SYS but I'm pretty certain that no
    modern disk drives use BADBLK.SYS.

    --
    VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)COM

    "Well my son, life is like a beanstalk, isn't it?"

    http://tmesis.com/drat.html

  14. Re: Shadow set problem finally solved

    In article <10b5d716-18aa-4705-bec3-dc169725ada0@s19g2000prg.googlegroups.com>, tadamsmar writes:
    > On Mar 8, 5:22pm, George Cornelius wrote:
    >> This thing really seems to be going nowhere. It sure seems to me that
    >> this is a standard, pure-vanilla problem of a shadow set having a forced
    >> error that VMS replicates on every shadow copy, something that's been
    >> around since Phase I shadowing and RA series drives.
    >>
    >> If that is the case, the solution is to overwrite the block in question.
    >>
    >> You should, just to be thorough, examine [000000]BADBLK.SYS and if it
    >> has any allocation, check its LBN's with $ DUMP/HEADER/BLOCK=C:0 .



    >> [...] you can also do this:
    >>
    >> $ create/fdl=sys$input JUNK.TMP001
    >> area 0;allocation 1;contig yes;exact_positioning yes;position logical -
    >> 16578125
    >>


    [...]

    > How do I look at BADBLK.SYS for a shadow set?


    I supplied a $ DUMP command. But, never mind. I only mentioned it for
    thoroughness. We already know in your case that the block is not allocated
    anywhere.

    > How about doing this:
    >
    > 1. Break out a member.
    > 2. Do an back/image on the member
    > 3. init/eras the member
    > 4. restore the image on the member
    > 5. boot on the member, making it the only member of the shadowset.
    > 6. merge the other disk into the shadowset


    Yes, and if not a boot disk the following [after being certain nothing will be
    writing to DSAn] would be more direct:

    $ dismount member2
    $ initialize/erase/system member2: dummy ! No special init command needed
    $ mount/foreign/noassist/override=shadow_membership member2
    $ backup/image/noalias DSAn: member2: ! Default init params are from source vol
    $ dismount DSAn:
    $ dismount member2:
    $ mount/sys DSAn:/shadow=member2:/conf label ! /conf just in case .-2 failed
    $ !

    Even if it is a boot disk you may be able to do the above steps by booting
    from CDROM.

    My temp file technique solves the problem outright. There are no mounts and
    dismounts; nor are there shadow copies.

    Since it uses VMS to allocate the LBN into a temp file, it does not allocate
    the LBN unless it is free. This means you do not have to worry that it can
    destroy data in some random file.

    > Would that get rid of all these parity errors?


    Both approaches do this. One is surgical; the other brute force.

    > It seems a little safer and easier (for me) than trying to write
    > blocks to specific LBAs.


    Go for it. You seem to be a bit over your head here, and a procedure you
    understand is much better than one that confuses you further.

    --
    George Cornelius cornelius A T eisner D O T decus D O T org
    cornelius A T mayo D O T edu

  15. Re: Shadow set problem finally solved

    In article <10b5d716-18aa-4705-bec3-dc169725ada0@s19g2000prg.googlegroups.com>, tadamsmar writes:
    > How about doing this:


    [...]

    > 4. restore the image on the member
    > 5. boot on the member, making it the only member of the shadowset.
    > 6. merge the other disk into the shadowset


    What you will be doing, via an appropriate /SHADOW qualifier in a mount
    command, is _not_ a merge; it is a copy.

    A merge is something the O/S does all by itself to maintain a certain
    kind of internal shadow consistency. For the most part you should
    ignore merge operations [if you can - they slow your system down and
    should be avoided when possible].

    Read The Fine Manual.

    --
    George Cornelius cornelius A T eisner D O T decus D O T org
    cornelius A T mayo D O T edu

  16. Re: Shadow set problem finally solved

    In article <47d8667c$0$5641$607ed4bc@cv.net>, VAXman- @SendSpamHere.ORG writes:
    > In article <8cyDTduSfM0m@eisner.encompasserve.org>, BEGINcornelius@decuserve.orgEND (George Cornelius) writes:


    > You *can* set blocks aside in BADBLK.SYS but I'm pretty certain that no
    > modern disk drives use BADBLK.SYS.


    Well I said I was just being thorough!

    I was once told that all of that infrastructure was obsolete, that DSA
    (sp?) drives' revectoring made it so there were no bad blocks.

    Then SCSI arrived on the scene, and, at least from my point of view, it
    appeared that the whole bad-block issue reappeared. Along with some kind
    of odd "SCSI block scrubber" that would scrub the blocks of a file at
    deletion time if the O/S had marked that file as having unrecovered bad
    blocks (or revectored bad blocks with the forced error flag set).

    Of course, modern SCSI drives revector and presumably never have to have
    anything in BADBLK.SYS .

    As a sidelight: I basically developed this 'allocate the bad blocks'
    technique for exactly the situation you describe - putting blocks with
    known problems into BADBLK.SYS .

    In my case, the known problem was something different: naively assuming
    I could transport drives from HSD05/HSD10 controllers to HSJ40/HSJ50
    series and not have to worry about reinitializing and restoring from
    backup.

    Apparently the more advanced controllers needed larger metadata areas
    and some of my files turned out to have allocations beyond the end of
    the disk.

    Luckily these were static files that could be recovered from a backup,
    but I had to allocate a thousand or so LBN's into BADBLK.SYS to avoid
    VMS using them somewhere else.

    --
    George Cornelius cornelius A T eisner D O T decus D O T org
    cornelius A T mayo D O T edu

  17. Re: Shadow set problem finally solved

    On Mar 12, 9:08 pm, BEGINcornel...@decuserve.orgEND (George Cornelius)
    wrote:
    > In article <10b5d716-18aa-4705-bec3-dc169725a...@s19g2000prg.googlegroups.com>, tadamsmar writes:
    > [...]
    >
    > > How about doing this:
    > >
    > > 1. Break out a member.
    > > 2. Do an back/image on the member
    > > 3. init/eras the member
    > > 4. restore the image on the member
    > > 5. boot on the member, making it the only member of the shadowset.
    > > 6. merge the other disk into the shadowset
    >
    > Yes, and if not a boot disk the following [after being certain nothing will be
    > writing to DSAn] would be more direct:
    >
    > $ dismount member2
    > $ initialize/erase/system member2: dummy ! No special init command needed
    > $ mount/foreign/noassist/override=shadow_membership member2
    > $ backup/image/noalias DSAn: member2: ! Default init params are from source vol
    > $ dismount DSAn:
    > $ dismount member2:
    > $ mount/sys DSAn:/shadow=member2:/conf label ! /conf just in case .-2 failed
    > $ !
    >
    > Even if it is a boot disk you may be able to do the above steps by booting
    > from CDROM.
    >
    > My temp file technique solves the problem outright. There are no mounts and
    > dismounts; nor are there shadow copies.
    >
    > Since it uses VMS to allocate the LBN into a temp file, it does not allocate
    > the LBN unless it is free. This means you do not have to worry that it can
    > destroy data in some random file.
    >
    > > Would that get rid of all these parity errors?
    >
    > Both approaches do this. One is surgical; the other brute force.
    >
    > > It seems a little safer and easier (for me) than trying to write
    > > blocks to specific LBAs.
    >
    > Go for it.


    It worked! I finally got a clean ANAL/DISK/SHADOW!

    >You seem to be a bit over your head here, and a procedure you
    > understand is much better than one that confuses you further.


    Yes. I was a bit nervous about trying to wipe out four blocks even
    with the indications that they were not in a file. And I could have
    made a mistake.

    I already had a command procedure for disk compression and all I had
    to do was add an image/erase to it.

    The one drawback to this procedure is that you have to refrain from
    working on the computer till it completes, or at least be prepared to
    copy your work off to a safe place. The system was up, but the system
    disk was destined to be overwritten.

    If I had just zapped the four blocks, I could have avoided this.

  18. Re: Shadow set problem finally solved

    On Mar 13, 8:33*am, tadamsmar wrote:
    > On Mar 12, 9:08*pm, BEGINcornel...@decuserve.orgEND (George Cornelius)
    > wrote:
    >
    >
    >
    >
    >
    > > In article <10b5d716-18aa-4705-bec3-dc169725a...@s19g2000prg.googlegroups.com>, tadamsmar writes:

    >
    > > > On Mar 8, 5:22pm, George Cornelius wrote:
    > > >> This thing really seems to be going nowhere. It sure seems to me that
    > > >> this is a standard, pure-vanilla problem of a shadow set having a forced
    > > >> error that VMS replicates on every shadow copy, something that's been
    > > >> around since Phase I shadowing and RA series drives.

    >
    > > >> If that is the case, the solution is to overwrite the block in question.

    >
    > > >> You should, just to be thorough, examine [000000]BADBLK.SYS and if it
    > > >> has any allocation, check its LBN's with $ DUMP/HEADER/BLOCK=C:0 .
    > > >> [...] you can also do this:

    >
    > > >> $ create/fdl=sys$input JUNK.TMP001
    > > >> area 0;allocation 1;contig yes;exact_positioning yes;position logical-
    > > >> 16578125
    > > >>

    >
    > > [...]

    >
    > > > How do I look at BADBLK.SYS for a shadow set?

    >
    > > I supplied a $ DUMP command. But, never mind. I only mentioned it for
    > > thoroughness. We already know in your case that the block is not allocated
    > > anywhere.

    >
    > > > How about doing this:

    >
    > > > 1. Break out a member.
    > > > 2. Do a back/image on the member
    > > > 3. init/eras the member
    > > > 4. restore the image on the member
    > > > 5. boot on the member, making it the only member of the shadowset.
    > > > 6. merge the other disk into the shadowset

    >
    > > Yes, and if not a boot disk the following [after being certain nothing will be
    > > writing to DSAn] would be more direct:

    >
    > >  $ dismount member2
    > >  $ initialize/erase/system member2: dummy ! No special init command needed
    > >  $ mount/foreign/noassist/override=shadow_membership member2
    > >  $ backup/image/noalias DSAn: member2: ! Default init params are from source vol
    > >  $ dismount DSAn:
    > >  $ dismount member2:
    > >  $ mount/sys DSAn:/shadow=member2:/conf label ! /conf just in case .-2 failed
    > >  $ !

    >
    > > Even if it is a boot disk you may be able to do the above steps by booting
    > > from CDROM.

    >
    > > My temp file technique solves the problem outright. There are no mounts and
    > > dismounts; nor are there shadow copies.

    >
    > > Since it uses VMS to allocate the LBN into a temp file, it does not allocate
    > > the LBN unless it is free. This means you do not have to worry that it can
    > > destroy data in some random file.

    >
    > > > Would that get rid of all these parity errors?

    >
    > > Both approaches do this. One is surgical; the other brute force.

    >
    > > > It seems a little safer and easier (for me) than trying to write
    > > > blocks to specific LBAs.

    >
    > > Go for it.

    >
    > It worked! I finally got a clean ANAL/DISK/SHADOW!


    Oops, not quite there. It ran clean on the single disk. But when I
    added the other disk to the shadowset it reported the 4 blocks with
    errors. VMS does not want to give these up without a fight.

    I am going to init/erase that disk.

    I am going to get a real sense of accomplishment when I finally see
    those two zeros in the error count column of SHOW DEV D.
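
    As a side note, the same error counts can be read from a command
    procedure instead of by eye. The following is a minimal sketch,
    assuming the virtual unit is named DSA0: (a hypothetical name; use
    whatever SHOW DEV D lists). It relies on the F$GETDVI items ERRCNT and
    SHDW_NEXT_MBR_NAME to walk the members of the set.

    ```
    $ ! DSA0: is an assumed virtual unit name, not taken from this thread.
    $ WRITE SYS$OUTPUT "DSA0: errors: ", F$GETDVI("DSA0:","ERRCNT")
    $ member = F$GETDVI("DSA0:","SHDW_NEXT_MBR_NAME")   ! first member of the set
    $loop:
    $ IF member .EQS. "" THEN GOTO done                 ! null string = no more members
    $ WRITE SYS$OUTPUT member, " errors: ", F$GETDVI(member,"ERRCNT")
    $ member = F$GETDVI(member,"SHDW_NEXT_MBR_NAME")    ! next member, if any
    $ GOTO loop
    $done:
    $ EXIT
    ```

    The counts are cumulative since the device was last brought online, so
    a clean run shows up as numbers that stop increasing, or as zeros after
    a reboot.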

    >
    > >You seem to be a bit over your head here, and a procedure you
    > > understand is much better than one that confuses you further.

    >
    > Yes. I was a bit nervous about trying to wipe out four blocks even
    > with the indications that they were not in a file. And I could have
    > made a mistake.
    >
    > I already had a command procedure for disk compression and all I had
    > to do was add an image/erase to it.
    >
    > The one drawback to this procedure is that you have to refrain from
    > working on the computer till it completes, or at least be prepared to
    > copy your work off to a safe place. The system was up, but the system
    > disk was destined to be overwritten.
    >
    > If I just zapped the four blocks I could have avoided this.



  19. Re: Shadow set problem finally solved

    In article , tadamsmar writes:
    >{...snip...}
    >
    >Oops, not quite there. It ran clean on the single disk. But when I
    >add the other disk to the shadowset it reported the 4 blocks with
    >errors. VMS does not want to give these up without a fight.
    >
    >I am going to init/erase that disk.


    What happens if you shadow and mount an init'd disk (i.e. nothing on it)?

    --
    VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)COM

    "Well my son, life is like a beanstalk, isn't it?"

    http://tmesis.com/drat.html

  20. Re: Shadow set problem finally solved

    On Mar 13, 12:07 pm, VAXman- @SendSpamHere.ORG wrote:
    > In article , tadamsmar writes:
    >
    > >{...snip...}

    >
    > >Oops, not quite there. It ran clean on the single disk. But when I
    > >add the other disk to the shadowset it reported the 4 blocks with
    > >errors. VMS does not want to give these up without a fight.

    >
    > >I am going to init/erase that disk.

    >
    > What happens if you shadow and mount an init'd disk (i.e. nothing on it)?


    I am not sure exactly what you mean. Init without erase? I did not
    try that.

    >
    > --
    > VAXman- A Bored Certified VMS Kernel Mode Hacker   VAXman(at)TMESIS(dot)COM
    >
    >   "Well my son, life is like a beanstalk, isn't it?"
    >
    > http://tmesis.com/drat.html


