rescue damaged tar file - Unix


  1. rescue damaged tar file


    I have a compressed archive that I created using GNU tar on a gentoo
    system. The tar file is about 900 MB compressed and about 2.5 GB
    uncompressed.

    This archive has bad data about 1/3 to 1/2 way through it. I'm trying
    to find a way to get at the data below the damaged part. At the point
    of failure, tar reports:

    tar: skipping to next header
    tar: archive contains obsolescent base-64 headers
    tar: error exit delayed from previous errors

    I used the gzip recovery kit to unpack the file. This worked and gave
    me the 2.5 GB uncompressed file. However, tar will only unpack up to
    the damaged part, and even with --ignore-failed-read, blows up.

    cpio will not work, I tried this on the uncompressed file:

    cpio -F tarfilename -i -v

    The error returned is:

    cpio: standard input is closed: Value too large for defined data type.

    I have googled high and low and cannot find a relevant explanation of
    the error in this context nor any resolution.

    The same message is returned for

    cpio -ird -H tar < tarfilename

    I also tried the recovery kit's option to split the recovered file at
    the point of corruption and save the "good" parts. However, it
    doesn't split the file, it generates a message that says "bad data at
    byte abc" immediately followed by "good data at byte abc" and then
    keeps going, returning a single file.

    Finally, I copied the file to a windows machine and opened it in
    winzip, but winzip only opens it up to the point of corruption.

    If anyone can tell me how I can recover this data, I will be eternally
    grateful. I have looked at the "Advanced Tar Repair" tool, but of
    course, I would like to avoid shelling out that kind of money for a
    one-off. This is the only time I have had this problem in 8 years of
    using tar, and I fully expect I could go that long again.

    Thanks.

    mp

    --
    'cat' is not recognized as an internal or external command,
    operable program or batch file.

  2. Re: rescue damaged tar file

    [followup set to comp.unix.admin only]

    In comp.unix.admin Michael Powe wrote:

    | I have a compressed archive that I created using GNU tar on a gentoo
    | system. The tar file is about 900 MB compressed and about 2.5 GB
    | uncompressed.
    |
    | This archive has bad data about 1/3 to 1/2 way through it. I'm trying
    | to find a way to get at the data below the damaged part. At the point
    | of failure, tar reports:

    [etc]

    Possibly the damage is in a form that affects the relative offset of the
    data, for example the addition or deletion of an amount of data that is
    not a multiple of 512 bytes. What a tar recovery program would have to
    do is look for what appears to be a valid tar header at every possible
    byte offset. Assuming only one point of corruption, this might be
    doable working in reverse from the end of the file toward the front.
    What is the exact number of bytes of the uncompressed copy?

    --
    -----------------------------------------------------------------------------
    | Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
    | (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
    -----------------------------------------------------------------------------

  3. Re: rescue damaged tar file

    >>>>> "phil" == phil-news-nospam writes:

    phil> [followup set to comp.unix.admin only]
    phil> In comp.unix.admin Michael Powe wrote:

    phil> | I have a compressed archive that I created using GNU tar on a
    phil> | gentoo system. The tar file is about 900 MB compressed and
    phil> | about 2.5 GB uncompressed.
    phil> |
    phil> | This archive has bad data about 1/3 to 1/2 way through it. I'm
    phil> | trying to find a way to get at the data below the damaged part.
    phil> | At the point of failure, tar reports:

    phil> [etc]

    phil> Possibly the damage is in a form that affects the relative offset
    phil> of the data, for example the addition or deletion of an amount of
    phil> data that is not a multiple of 512 bytes. What a tar recovery
    phil> program would have to do is look for what appears to be a valid
    phil> tar header at every possible byte offset. Assuming only one point
    phil> of corruption, this might be doable working in reverse from the
    phil> end of the file toward the front. What is the exact number of
    phil> bytes of the uncompressed copy?

    1010083339 2005-12-24 15:44 powem12242005.tar.gz
    2469939203 2005-12-25 20:41 powem12242005.tar.recovered

    The "recovered" file is generated by gzrecover from GNU Recovery Tool Kit
    (http://www.urbanophile.com/arenn/coding/gzrt/gzrt.html). Standard
    gzip will not unpack the file and exits with a CRC error.

    BTW, if I had some pointers on analyzing these tar files, I have some
    knowledge of Java and perl and I'd be willing to spend some time
    trying to "reverse engineer" the file, so to speak. I have no
    experience handling binary files in that manner, though.

    Thanks.

    mp


    --
    Michael Powe michael@trollope.org Waterbury CT
    ENOSIG: signature file is empty

  4. Re: rescue damaged tar file

    Michael Powe wrote:
    >
    > I have a compressed archive that I created using GNU tar on a
    > gentoo system. The tar file is about 900 MB compressed and
    > about 2.5 GB uncompressed.
    >
    > This archive has bad data about 1/3 to 1/2 way through it. I'm
    > trying to find a way to get at the data below the damaged part.
    > At the point of failure, tar reports:


    I think you can forget about it. Modern compressors operate by
    specifying a position and length of previous data to copy as new
    data, using a window of 4k to 64k or more into the old data
    (measured from the point of expansion). Once the data is fouled,
    that reference is gone, and there is no way to recover.

    The same applies to LZW compression, although the mechanism is
    different. There is a remote chance that the compressor has
    detected a poor compression ratio and decided to reinitialize its
    tables, in which case some recovery would be possible. However,
    that mechanism hasn't been used seriously for 20 years or so, due
    to patent problems.

    That is one advantage of the zip format, i.e. each individual
    compressed file stands alone, so an error will only lose one file.
    With the tar format all files are basically concatenated into
    one, and the result may then be compressed.
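
    A quick way to see this fragility for yourself, on scratch data rather
    than the poster's file:

```shell
# Build a healthy gzip stream, stomp on 10 bytes in the middle, and
# watch decompression die at the damaged point.
seq 1 100000 | gzip > demo.gz
dd if=/dev/urandom of=demo.gz bs=1 seek=5000 count=10 conv=notrunc
gzip -t demo.gz || echo "everything past the damage is unrecoverable"
```

    gzip reports a CRC or format error; only data decoded before the
    damage can be trusted, and tools like gzrecover can only make a
    best-effort guess past that point.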

    To avoid this sort of foul up in the future, you would be well
    advised to insist on ECC memory in all your systems.

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at:

  5. Re: rescue damaged tar file


    Michael Powe wrote:
    >
    > I have a compressed archive that I created using GNU tar on a
    > gentoo system. The tar file is about 900 MB compressed and
    > about 2.5 GB uncompressed.
    > This archive has bad data about 1/3 to 1/2 way through it. I'm
    > trying to find a way to get at the data below the damaged part.
    > At the point of failure, tar reports:


    Google gives me hits on 'tar file recovery'.
    The first one I looked at was commercial but seemed to have
    a demo package.

    I'd also try Schilling's 'star':

    http://freshmeat.net/projects/star

  6. Re: rescue damaged tar file


    I have looked at the various 'google' options but without joy. The
    "Advanced Tar Recover" tool failed.

    Thanks.

    mp

  7. Re: rescue damaged tar file

    On Mon, 26 Dec 2005 11:40:26 -0500, Michael Powe wrote:

    >
    > I have looked at the various 'google' options but without joy. The
    > "Advanced Tar Recover" tool failed.


    Best option is to make absolutely certain that you can't get the damaged
    section of the data back - if the reason for the corruption is due to the
    source media (tape or disk say) then there are sometimes options there to
    recover bad data...

    And in the future, don't compress archives of anything critical, just
    because of potential problems like this

    cheers

    Jules


  8. Re: rescue damaged tar file

    Jules wrote (in part):

    > And in the future, don't compress archives of anything critical, just
    > because of potential problems like this
    >

    How do tape drives that do _hardware compression_ get around this problem?

    I imagine they compress one block at a time. On my tape drives, they have a
    2 Megabyte buffer in the drive, and it normally writes 65536-byte blocks to
    the tape. They employ a 4 layer Reed Solomon error detection and correction
    scheme, and use what they call Adaptive Lossless Data Compression (whatever
    that is). I am also not sure what "4 layer Reed Solomon error detection and
    correction" is. I know they do lateral and longitudinal and diagonal parity
    checking of each block. Reed Solomon technique is well known, but I do not
    know what the 4-layer is all about.

    http://www.siam.org/siamnews/mtc/mtc193.htm

    I always have hardware compression on (it is the default for the drive), and
    have never lost anything. But that is 2 data points (2 drives).

    --
    .~. Jean-David Beyer Registered Linux User 85642.
    /V\ PGP-Key: 9A2FC99A Registered Machine 241939.
    /( )\ Shrewsbury, New Jersey http://counter.li.org
    ^^-^^ 16:00:00 up 30 days, 2:32, 5 users, load average: 4.30, 4.23, 3.98

  9. Re: rescue damaged tar file

    Jean-David Beyer wrote:
    > Jules wrote (in part):
    >
    >> And in the future, don't compress archives of anything
    >> critical, just because of potential problems like this
    >>

    > How do tape drives that do _hardware compression_ get around
    > this problem?
    >
    > I imagine they compress one block at a time. On my tape drives,
    > they have a 2 Megabyte buffer in the drive, and it normally
    > writes 65536-byte blocks to the tape. They employ a 4 layer Reed
    > Solomon error detection and correction scheme, and use what they
    > call Adaptive Lossless Data Compression (whatever that is). I am
    > also not sure what "4 layer Reed Solomon error detection and
    > correction" is. I know they do lateral and longitudinal and
    > diagonal parity checking of each block. Reed Solomon technique
    > is well known, but I do not know what the 4-layer is all about.
    >
    > http://www.siam.org/siamnews/mtc/mtc193.htm
    >
    > I always have hardware compression on (it is the default for the
    > drive), and have never lost anything. But that is 2 data points
    > (2 drives).


    You might look into using ARJ for this sort of archival
    compression. It has provisions for generating the extra
    information needed to recover from faulty media. I don't know the
    algorithms used, nor have I really tested the result. It is free
    for personal use, but may not be available for Linux.




  10. Re: rescue damaged tar file

    Chuck F. wrote:
    > Jean-David Beyer wrote:
    >
    >> Jules wrote (in part):
    >>
    >>> And in the future, don't compress archives of anything
    >>> critical, just because of potential problems like this
    >>>

    >> How do tape drives that do _hardware compression_ get around
    >> this problem?
    >>
    >> I imagine they compress one block at a time. On my tape drives,
    >> they have a 2 Megabyte buffer in the drive, and it normally
    >> writes 65536-byte blocks to the tape. They employ a 4 layer Reed
    >> Solomon error detection and correction scheme, and use what they
    >> call Adaptive Lossless Data Compression (whatever that is). I am
    >> also not sure what "4 layer Reed Solomon error detection and
    >> correction" is. I know they do lateral and longitudinal and
    >> diagonal parity checking of each block. Reed Solomon technique
    >> is well known, but I do not know what the 4-layer is all about.
    >>
    >> http://www.siam.org/siamnews/mtc/mtc193.htm
    >>
    >> I always have hardware compression on (it is the default for the
    >> drive), and have never lost anything. But that is 2 data points
    >> (2 drives).

    >
    >
    > You might look into using ARJ for this sort of archival compression. It
    > has provisions for generating the extra information needed to recover
    > from faulty media. I don't know the algorithms used, nor have I really
    > tested the result. It is free for personal use, but may not be
    > available for Linux.
    >
    >
    >

    I looked there, but they do not describe the algorithm used, so I cannot
    tell if it is superior to the hardware algorithm used in my tape drive itself.

    --
    .~. Jean-David Beyer Registered Linux User 85642.
    /V\ PGP-Key: 9A2FC99A Registered Machine 241939.
    /( )\ Shrewsbury, New Jersey http://counter.li.org
    ^^-^^ 08:55:00 up 30 days, 19:27, 5 users, load average: 4.12, 4.18, 4.13

  11. Re: rescue damaged tar file

    On 25 Dec 2005 22:19:22 -0500 Michael Powe wrote:
    |>>>>> "phil" == phil-news-nospam writes:
    |
    | phil> [followup set to comp.unix.admin only]
    | phil> In comp.unix.admin Michael Powe wrote:
    |
    | phil> | I have a compressed archive that I created using GNU tar on a
    | phil> | gentoo system. The tar file is about 900 MB compressed and
    | phil> | about 2.5 GB uncompressed.
    | phil> |
    | phil> | This archive has bad data about 1/3 to 1/2 way through it.
    | phil> | I'm trying to find a way to get at the data below the damaged
    | phil> | part. At the point of failure, tar reports:
    |
    | phil> [etc]
    |
    | phil> Possibly the damage is in a form that affects the relative
    | phil> offset of the data, for example the addition or deletion of an
    | phil> amount of data that is not a multiple of 512 bytes. What a tar
    | phil> recovery program would have to do is look for what appears to
    | phil> be a valid tar header at every possible byte offset. Assuming
    | phil> only one point of corruption, this might be doable working in
    | phil> reverse from the end of the file toward the front. What is the
    | phil> exact number of bytes of the uncompressed copy?
    |
    | 1010083339 2005-12-24 15:44 powem12242005.tar.gz
    | 2469939203 2005-12-25 20:41 powem12242005.tar.recovered
    |
    | The "recovered" file is generated by gzrecover from GNU Recovery Tool Kit
    | (http://www.urbanophile.com/arenn/coding/gzrt/gzrt.html). Standard
    | gzip will not unpack the file and exits with a CRC error.
    |
    | BTW, if I had some pointers on analyzing these tar files, I have some
    | knowledge of Java and perl and I'd be willing to spend some time
    | trying to "reverse engineer" the file, so to speak. I have no
    | experience handling binary files in that manner, though.

    If the damage was done to the compressed file, that could result in the
    uncompressed data being totally corrupt. The recovery tool would be a
    best effort attempt.

    --
    -----------------------------------------------------------------------------
    | Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
    | (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
    -----------------------------------------------------------------------------

  12. Re: rescue damaged tar file

    Michael Powe wrote:
    >
    > phil> | This archive has bad data about 1/3 to 1/2 way through it.
    > phil> I'm trying | to find a way to get at the data below the
    > phil> damaged part. At the point | of failure, tar reports:
    >
    > 1010083339 2005-12-24 15:44 powem12242005.tar.gz
    > 2469939203 2005-12-25 20:41 powem12242005.tar.recovered


    If the damage is in the tar phase, not the compression phase, then
    playing around with:

    dd if=powem12242005.tar.recovered skip=N | tar tvf - | head

    (dd's skip counts 512-byte blocks by default, which conveniently
    matches tar's record size) might help. It's a huge "if", and it would
    take a lot of fishing to locate any header after the corrupt section,
    especially not knowing in advance whether it's the compression that
    broke, in which case there isn't any good point after the corruption.
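
    Since tar headers carry the magic string "ustar" at byte offset 257 of
    each 512-byte header, the fishing can be narrowed with GNU grep. A
    hedged, untested sketch against the file named in this thread:

```shell
# List candidate header offsets via the "ustar" magic, then ask tar to
# list a few entries starting from each candidate.
archive=powem12242005.tar.recovered   # name taken from the thread
grep -aob ustar "$archive" | cut -d: -f1 | while read -r off; do
    start=$(( off - 257 ))            # magic sits 257 bytes into a header
    [ "$start" -ge 0 ] || continue
    echo "== trying offset $start =="
    tail -c +$(( start + 1 )) "$archive" | tar tf - 2>/dev/null | head -3
done
```

    Expect one hit per archive member, so in practice you would only try
    candidates located after the damaged region. (grep -a treats the
    binary as text; on long binary "lines" it can use a lot of memory.)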


  13. Re: rescue damaged tar file

    On 27 Dec 2005 07:57:20 -0800 Doug Freyburger wrote:
    | Michael Powe wrote:
    |>
    |> phil> | This archive has bad data about 1/3 to 1/2 way through it.
    |> phil> I'm trying | to find a way to get at the data below the
    |> phil> damaged part. At the point | of failure, tar reports:
    |>
    |> 1010083339 2005-12-24 15:44 powem12242005.tar.gz
    |> 2469939203 2005-12-25 20:41 powem12242005.tar.recovered
    |
    | If the damage is in the tar phase, not the compression phase, then
    | playing around with:
    |
    | dd if=powem12242005.tar.recovered skip=N | tar tvf - | head
    |
    | might help. It's a huge "if", and it would take a lot of fishing to
    | locate any header after the corrupt section, especially not knowing
    | in advance whether it's the compression that broke, in which case
    | there isn't any good point after the corruption.

    The size of his recovered file, 2469939203 bytes, is: 4824100 * 512 + 3.
    I think the first issue is to find whereabouts those extra 3 bytes are.
    Then increment in 512-byte steps in alignment with the end of the file
    to see where a valid tar header might be found, if the GZ recovery was
    able to re-establish decompression state correctly.
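
    That alignment arithmetic checks out directly in the shell:

```shell
# A valid tar stream is a whole number of 512-byte records, so a nonzero
# remainder suggests the corruption inserted or dropped stray bytes.
echo $(( 2469939203 / 512 ))   # 4824100 full blocks
echo $(( 2469939203 % 512 ))   # 3 stray bytes
```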

    --
    -----------------------------------------------------------------------------
    | Phil Howard KA9WGN | http://linuxhomepage.com/ http://ham.org/ |
    | (first name) at ipal.net | http://phil.ipal.org/ http://ka9wgn.ham.org/ |
    -----------------------------------------------------------------------------

  14. Re: rescue damaged tar file

    In article ,
    Jules wrote:
    >On Mon, 26 Dec 2005 11:40:26 -0500, Michael Powe wrote:
    >
    >>
    >> I have looked at the various 'google' options but without joy. The
    >> "Advanced Tar Recover" tool failed.

    >
    >Best option is to make absolutely certain that you can't get the damaged
    >section of the data back - if the reason for the corruption is due to the
    >source media (tape or disk say) then there are sometimes options there to
    >recover bad data...
    >
    >And in the future, don't compress archives of anything critical, just
    >because of potential problems like this


    Note that bzip2 does block compression (the 'b' refers to its
    block-sorting algorithm): each block is compressed independently, and
    bzip2recover lets you split a bzip2 file into those blocks so that you
    can recover data from any that are not corrupted.
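
    A minimal sketch of that workflow with the stock bzip2 tools (the
    directory name is illustrative):

```shell
# Each bzip2 block (~900 kB of input at -9) compresses independently.
tar cf backup.tar ./important-dir   # illustrative directory
bzip2 -9 backup.tar                 # produces backup.tar.bz2

# After damage: split the stream back into per-block files; undamaged
# pieces (named like rec00001backup.tar.bz2) still decompress cleanly.
bzip2recover backup.tar.bz2
bzip2 -t rec*backup.tar.bz2         # see which pieces survived
```

    So a single bad block costs at most ~900 kB of the archive instead of
    everything past the damage.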

    John
    --
    John DuBois spcecdt@armory.com KC6QKZ/AE http://www.armory.com/~spcecdt/

  15. Re: rescue damaged tar file

    >>>>> "phil-news-nospam" == phil-news-nospam writes:

    phil-news-nospam> On 27 Dec 2005 07:57:20 -0800 Doug Freyburger wrote:
    phil-news-nospam> | Michael Powe wrote:
    phil-news-nospam> |> phil> | This archive has bad data about 1/3 to
    phil-news-nospam> |> phil> | 1/2 way through it. I'm trying to find a
    phil-news-nospam> |> phil> | way to get at the data below the damaged
    phil-news-nospam> |> phil> | part. At the point of failure, tar reports:
    phil-news-nospam> |>
    phil-news-nospam> |> 1010083339 2005-12-24 15:44 powem12242005.tar.gz
    phil-news-nospam> |> 2469939203 2005-12-25 20:41 powem12242005.tar.recovered
    phil-news-nospam> |
    phil-news-nospam> | If the damage is in the tar phase not the
    phil-news-nospam> | compression phase, then playing around with:
    phil-news-nospam> |
    phil-news-nospam> | dd skip=N powem12242005.tar.recovered | tar tvf - | head
    phil-news-nospam> |
    phil-news-nospam> | might help. It's a huge "if" and it would take a
    phil-news-nospam> | lot of fishing to locate any header after the
    phil-news-nospam> | corrupt section. Especially not knowing in advance
    phil-news-nospam> | if it's the compression, so there isn't any good
    phil-news-nospam> | point after the corruption.

    phil-news-nospam> The size of his recovered file, 2469939203 bytes,
    phil-news-nospam> is: 4824100 * 512 + 3. I think the first issue is to
    phil-news-nospam> find whereabouts those extra 3 bytes are. Then
    phil-news-nospam> increment in 512-byte steps in alignment with the
    phil-news-nospam> end of the file to see where a valid tar header
    phil-news-nospam> might be found, if the GZ recovery was able to
    phil-news-nospam> re-establish decompression state correctly.

    How can I tell what a tar header looks like? Where can I find out?
    When I 'head -c 512' or 'tail -c 512' the file, I see a mixture of text from text
    files and binary data, but I don't see anything identifiable.

    (Linux 2.4.31~ellen) [powem] [ /home]
    520 $ --> !head
    head -c 512 powem12242005.tar.recovered
    powem/#_ascp_bcgaw@cgaw.org_b_ahome_acgaw_apublic__html_ alayout.html#0000600000175000001440000000000010343 042116026673 0ustar powemusers00000000000000

    Thanks.

    mp

    --
    Michael Powe michael@trollope.org Naugatuck CT USA

    "The secret to strong security: less reliance on secrets."
    -- Whitfield Diffie

  16. Re: rescue damaged tar file

    Michael Powe wrote:
    >
    > How can I tell what a tar header looks like? Where can I find out?


    man tar. Look in the see also section. man 4 tar
    man file. Look in the see also section. man magic, more /etc/magic.

    This is how UNIX documentation works. It takes a while to get
    comfortable with how the man pages work.

    > When I 'head -c 512' or 'tail -c 512' the file, I see a mixture of text from text
    > files and binary data, but I don't see anything identifiable.


    Yup, straight out of the section 4 doc on it. It needs binary for the
    format, and text because what good is a filename that isn't text?


  17. Re: rescue damaged tar file

    In article <87fyobl8u0.fsf@ellen.trollope.org>,
    >
    >How can I tell what a tar header looks like? Where can I find out?
    >When I 'head -c 512' or 'tail -c 512' the file, I see a mixture of text
    >from text
    >files and binary data, but I don't see anything identifiable.
    >
    >(Linux 2.4.31~ellen) [powem] [ /home]
    > 520 $ --> !head
    >head -c 512 powem12242005.tar.recovered
    >powem/#_ascp_bcgaw@cgaw.org_b_ahome_acgaw_apublic__html_ alayout.html#0000600000175000001440000000000010343 042116026673 0ustar powemusers00000000000000


    In the olden days, man 5 tar would get you an explanation of the tar header.
    This seems to be gone from anything touched by GNU.

    Maybe you can find some other flavor of Unix system to look at. Or read
    the source for GNU tar to get some clues.

    carl

    --
    carl lowenstein marine physical lab u.c. san diego
    clowenst@ucsd.edu

  18. Re: rescue damaged tar file

    In article ,
    Carl Lowenstein wrote:
    >In article <87fyobl8u0.fsf@ellen.trollope.org>,
    >>
    >>How can I tell what a tar header looks like? Where can I find out?
    >>When I 'head -c 512' or 'tail -c 512' the file, I see a mixture of text
    >>from text
    >>files and binary data, but I don't see anything identifiable.
    >>
    >>(Linux 2.4.31~ellen) [powem] [ /home]
    >> 520 $ --> !head
    >>head -c 512 powem12242005.tar.recovered
    >>powem/#_ascp_bcgaw@cgaw.org_b_ahome_acgaw_apublic__html_ alayout.html#0000600000175000001440000000000010343 042116026673 0ustar powemusers00000000000000


    >In the olden days, man 5 tar would get you an explanation of the tar header.
    >This seems to be gone from anything touched by GNU.


    >Maybe you can find some other flavor of Unix system to look at. Or read
    >the source for GNU tar to get some clues.


    man 5 tar on a FreeBSD system will list the tar structures for
    four different tar implementations going back to the original tar,
    ustar [Unix standard tar], gnutar, and one for sparse headers.
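
    The magic field those pages describe sits at byte offset 257 of each
    512-byte header, which you can verify on a scratch archive:

```shell
# Make a throwaway archive and dump its magic/version field; expect
# "ustar" followed by two spaces (GNU format) or a NUL plus "00" (POSIX).
echo hello > demo.txt
tar cf demo.tar demo.txt
dd if=demo.tar bs=1 skip=257 count=8 2>/dev/null | od -c
```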

    Bill
    --
    Bill Vermillion - bv @ wjv . com

  19. Re: rescue damaged tar file

    "Michael Powe" wrote in message
    news:u64pdnebu.fsf@trollope.org

    > I have a compressed archive that I created using GNU tar on a gentoo
    > system. The tar file is about 900 MB compressed and about 2.5 GB
    > uncompressed.


    You will of course have verified that your filesystem is capable of
    sizes > 2 GB?
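
    That can be checked with a POSIX getconf query (the mount point is
    illustrative):

```shell
# FILESIZEBITS is the number of bits used to represent file sizes on
# the filesystem holding the given path; more than 32 means files over
# 2 GB are fine there.
getconf FILESIZEBITS /home
```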

    > This archive has bad data about 1/3 to 1/2 way through it. I'm trying
    > to find a way to get at the data below the damaged part. At the point
    > of failure, tar reports:
    >
    > tar: skipping to next header
    > tar: archive contains obsolescent base-64 headers
    > tar: error exit delayed from previous errors

    ....
    > If anyone can tell me how I can recover this data, I will be eternally
    > grateful.


    You might try modifying the code (e.g., s:/dev/rfd0:/your/tar/filename:)
    in http://paxutils.progiciels-bpi.ca/sh...vaging&index=3 and then
    compiling and running it to see if that works around the corruption.


  20. Re: rescue damaged tar file

    Michael Powe, dom20051225@20:53:25(CET):
    >


    Maybe I'm a bit late but anyway...


    > This archive has bad data about 1/3 to 1/2 way through it. I'm trying
    > to find a way to get at the data below the damaged part.


    > I used the gzip recovery kit to unpack the file. This worked and gave
    > me the 2.5 GB uncompressed file. However, tar will only unpack up to
    > the damaged part, and even with --ignore-failed-read, blows up.


    I once ran into the same problem and found a Perl script on the web
    that made my day. Let's reproduce a session:

    $ tar cf bin.tar /bin/
    tar: Removing leading `/' from member names
    tar: Removing leading `/' from hard link targets
    $ ll bin.tar
    -rw------- 1 hue hue 2836480 20060202:233334+0100 bin.tar
    $ dd if=/dev/urandom of=bin.tar bs=1k seek=1k count=300 conv=notrunc
    300+0 records in
    300+0 records out
    307200 bytes (307 kB) copied, 0,101178 seconds, 3,0 MB/s
    $ tar tf bin.tar
    bin/
    bin/bash
    bin/rbash
    bin/sh
    bin/cat
    bin/chgrp
    bin/chmod
    bin/chown
    bin/cp
    bin/date
    bin/dd
    bin/df
    bin/dir
    tar: Skipping to next header
    tar: Archive contains obsolescent base-64 headers
    tar: Error exit delayed from previous errors
    $ perl ~hue/lang/perl/find_tar_headers.pl bin.tar
    bin.tar:0:bin/:0
    bin.tar:512:bin/bash:2467624
    bin.tar:685056:bin/rbash:0
    bin.tar:685568:bin/sh:0
    bin.tar:686080:bin/cat:40550
    bin.tar:703488:bin/chgrp:100664
    bin.tar:737280:bin/chmod:73560
    bin.tar:768512:bin/chown:105250
    bin.tar:804864:bin/cp:153524
    bin.tar:860672:bin/date:126274
    bin.tar:905728:bin/dd:113410
    bin.tar:945152:bin/df:103344
    bin.tar:980480:bin/dir:226110
    bin.tar:1368064:bin/vdir:226110
    bin.tar:1445888:bin/sleep:33104
    bin.tar:1460736:bin/stty:110140
    [...]
    $ _

    We can see that the offset of the next intact tar header is that of
    vdir, at 1368064. To make dd fast, we convert the offset to kilobytes:

    $ bc
    [...]
    scale=4
    1368064/1024
    1336.0000
    $ dd if=bin.tar of=bin-tail.tar bs=1k skip=1336
    1434+0 records in
    1434+0 records out
    1468416 bytes (1,5 MB) copied, 0,017176 seconds, 85,5 MB/s
    $ file bin-tail.tar
    bin-tail.tar: POSIX tar archive
    $ tar xf bin.tar
    tar: Skipping to next header
    tar: Archive contains obsolescent base-64 headers
    tar: Error exit delayed from previous errors
    $ tar xf bin-tail.tar
    $ rm -f bin/dir ## probably corrupted

    And now we have everything recoverable:

    $ ls bin|wc -l
    83
    $ ls /bin/|wc -l
    95
    $ _

    This is find_tar_headers.pl:

    -----------------------------
    #!/usr/bin/perl -w
    use strict;

    # 99.9% of all credits for this script go
    # to Tore Skjellnes
    # who is the originator.

    my $tarfile;
    my $c;
    my $hit;
    my $header;

    # If you don't get any results, comment out the line below,
    # uncomment the line after it, and retry.
    my @src = (ord('u'),ord('s'),ord('t'),ord('a'),ord('r'),ord(' '),ord(' '),0);
    #my @src = (ord('u'),ord('s'),ord('t'),ord('a'),ord('r'),0,ord('0'),ord('0'));

    die "No tar file given on command line" if $#ARGV != 0;

    $tarfile = $ARGV[0];

    open(IN,$tarfile) or die "Could not open `$tarfile': $!";

    $hit = 0;
    $| = 1;

    # The magic string starts 257 bytes into each 512-byte header.
    seek(IN,257,0) or die "Could not seek forward 257 characters in `$tarfile': $!";

    while (read(IN,$c,1) == 1)
    {
        ($hit = 0, next) unless (ord($c) == $src[$hit]);
        $hit = $hit + 1;
        next unless $hit > $#src;

        # we have a probable header at (pos - 265)!
        my $pos = tell(IN) - 265;
        seek(IN,$pos,0)
            or (warn "Could not seek to position $pos in `$tarfile': $!", next);

        (read(IN,$header,512) == 512)
            or (warn "Could not read 512 byte header at position $pos in `$tarfile': $!",
                seek(IN,$pos+265,0), next);

        my ($name, $mode, $uid, $gid, $size, $mtime, $chksum, $typeflag,
            $linkname, $magic, $version, $uname, $gname,
            $devmajor, $devminor, $prefix)
            = unpack("Z100a8a8a8Z12a12a8a1a100a6a2a32a32a8a8Z155", $header);
        $size = oct $size;   # the size field is octal ASCII, not decimal
        printf("%s:%s:%s:%s\n", $tarfile, $pos, $name, $size);

        $hit = 0;
    }

    close(IN) or warn "Error closing `$tarfile': $!";
    -----------------------------

    Good luck.


    --
    David Serrano
