TAR & GZIP into multiple files - Setup



Thread: TAR & GZIP into multiple files

  1. TAR & GZIP into multiple files

    Hi all,

    Is it possible to tar + gzip a huge directory and split into multiple
    archives ?

    The command I used to use is

    cd /data/ (which has a directory "output" under it)

    tar -czf /backup/data_May2007.tar.gz output

    This output folder contains about a million files, and after
    compression the archive takes about 15 GB.

    I want to split the archive into DVD-sized pieces, but I don't want to
    use up too much extra space doing the operation. Is there a quick
    one-line command that can achieve this? Thanks.


  2. Re: TAR & GZIP into multiple files

    Christopher Ho wrote:
    > Hi all,
    >
    > Is it possible to tar + gzip a huge directory and split into multiple
    > archives ?
    >
    > The command I used to use is
    >
    > cd /data/ (which has a directory "output" under it)
    >
    > tar -czf /backup/data_May2007.tar.gz output
    >
    > This output folder contains about a million files, and after
    > compression the archive takes about 15 GB.
    >
    > I want to split the archive into DVD-sized pieces, but I don't want
    > to use up too much extra space doing the operation. Is there a quick
    > one-line command that can achieve this? Thanks.
    >



    IIRC you can split a tar file with some command or other..yup, the
    'split' command.

    Probably best to do the tar and pipe it to split, then compress the
    resulting pieces..not sure about reassembling a compressed file..it
    SHOULD work, since the pieces just concatenate back together..

    However I'd be more inclined to split the data directory into
    subdirectories and make a tar of each one..
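    The tar-then-split idea can be sketched like this, run here on a
    throwaway directory so it is self-contained (the 512-byte chunk size
    and the file names are placeholders; for DVDs you would use something
    like -b 4300m and the real /data and /backup paths):

    ```shell
    #!/bin/sh
    set -e

    # Build a small scratch tree standing in for /data/output.
    work=$(mktemp -d)
    mkdir -p "$work/output"
    printf 'hello world\n' > "$work/output/file1.txt"
    cd "$work"

    # Archive and compress in one stream, splitting the stream into
    # fixed-size chunks (archive.tgz.aa, archive.tgz.ab, ...).
    tar -czf - output | split -b 512 - archive.tgz.

    # Restore later by concatenating the chunks in order.
    cat archive.tgz.* > restore.tgz
    tar -tzf restore.tgz
    ```

    Since gzip works on a byte stream, the concatenated chunks are exactly
    the original .tgz, so no extra reassembly step is needed beyond cat.
    
    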


  3. Re: TAR & GZIP into multiple files

    On Thu, 12 Jul 2007 15:06:53 +0000, Christopher Ho wrote:

    > Hi all,
    >
    > Is it possible to tar + gzip a huge directory and split into multiple
    > archives ?
    >
    > The command I used to use is
    >
    > cd /data/ (which has a directory "output" under it)
    >
    > tar -czf /backup/data_May2007.tar.gz output
    >
    > This output folder contains about a million files, and after
    > compression the archive takes about 15 GB.
    >
    > I want to split the archive into DVD-sized pieces, but I don't want
    > to use up too much extra space doing the operation. Is there a quick
    > one-line command that can achieve this? Thanks.
    >

    I am not sure about minimizing the space used by intermediate files,
    but one trick is to use the split command (see man split). split
    breaks a large file into blocks of a specified size.

    The individual block files can be written to DVD. The blocks must be
    pasted back together when restoring. Device mapper's "linear" module may
    be useful in virtually pasting the components back together. Without that
    trick, this is the basic idea:

    $ cat /mnt/dvd/part.xxx >>restore.tgz

    (For every block, and in sequence.)

    Notes:
    1. The entire archive may be unusable if any one DVD fails or becomes
    unreadable.

    2. Compute an md5sum for each component written to DVD, so that you
    can verify in the future that the disc is still readable.

    3. Consider adding an encryption layer to system backups.

    4. As far as minimizing space goes, there are probably some tricks
    that can be used to minimize intermediate files. These tricks would
    probably use FIFOs, loop devices, dd, and split. They could work
    because, AIUI, tar and gzip do not require seeking within a file and
    simply work on the stream data. It might not work if loop devices
    cannot be backed by a FIFO, which apparently they can't. I am not sure
    of the solution at this time.
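    Note 2 above can be sketched as follows; the part.* file names and
    contents are made up for the example, and on a real backup you would
    run this over the split pieces before burning and keep the manifest on
    each disc:

    ```shell
    #!/bin/sh
    set -e

    # Stand-in pieces, as produced by split.
    work=$(mktemp -d)
    cd "$work"
    printf 'chunk a' > part.aa
    printf 'chunk b' > part.ab

    # Record a checksum for every piece...
    md5sum part.* > manifest.md5

    # ...and later re-check that each disc is still readable.
    md5sum -c manifest.md5
    ```

    md5sum -c exits non-zero if any piece fails to verify, which makes it
    easy to script periodic checks of the archived discs.
    
    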

    --
    Douglas Mayne



  4. Re: TAR & GZIP into multiple files

    Douglas Mayne wrote:
    > On Thu, 12 Jul 2007 15:06:53 +0000, Christopher Ho wrote:
    >
    >> Hi all,
    >>
    >> Is it possible to tar + gzip a huge directory and split into multiple
    >> archives ?
    >>
    >> The command I used to use is
    >>
    >> cd /data/ (which has a directory "output" under it)
    >>
    >> tar -czf /backup/data_May2007.tar.gz output
    >>
    >> This output folder contains about a million files, and after
    >> compression the archive takes about 15 GB.
    >>
    >> I want to split the archive into DVD-sized pieces, but I don't want
    >> to use up too much extra space doing the operation. Is there a
    >> quick one-line command that can achieve this? Thanks.
    >>

    > I am not sure about minimizing the space used by intermediate files,
    > but one trick is to use the split command (see man split). split
    > breaks a large file into blocks of a specified size.
    >
    > The individual block files can be written to DVD. The blocks must be
    > pasted back together when restoring. Device mapper's "linear" module may
    > be useful in virtually pasting the components back together. Without that
    > trick, this is the basic idea:
    >
    > $ cat /mnt/dvd/part.xxx >>restore.tgz
    >
    > (For every block, and in sequence.)
    >
    > Notes:
    > 1. The entire archive may be unusable if any one DVD fails or becomes
    > unreadable.


    This is a good reason to split the data before tarring..alphabetical
    wildcards can be used to e.g. select files starting with a-g, h-o,
    p-z, etc..
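    A rough sketch of that idea, run on a throwaway tree so it is
    self-contained (the ranges, file names, and paths are illustrative):

    ```shell
    #!/bin/sh
    set -e

    # Scratch stand-ins for /data/output and /backup.
    work=$(mktemp -d)
    mkdir -p "$work/output" "$work/backup"
    touch "$work/output/apple" "$work/output/mango" "$work/output/zebra"
    cd "$work/output"

    # One archive per alphabetical range, so losing one disc
    # only loses the files in that range.
    tar -czf "$work/backup/files_a-g.tar.gz" [a-g]*
    tar -czf "$work/backup/files_h-o.tar.gz" [h-o]*
    tar -czf "$work/backup/files_p-z.tar.gz" [p-z]*
    ```

    One caveat: if a range matches nothing, the glob is passed through
    literally and tar fails, so the ranges need to be chosen to suit the
    actual file names.
    
    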

    >
    > 2. Compute an md5sum for each component written to DVD, so that you
    > can verify in the future that the disc is still readable.
    >
    > 3. Consider adding an encryption layer to system backups.
    >
    > 4. As far as minimizing space goes, there are probably some tricks
    > that can be used to minimize intermediate files. These tricks would
    > probably use FIFOs, loop devices, dd, and split. They could work
    > because, AIUI, tar and gzip do not require seeking within a file and
    > simply work on the stream data. It might not work if loop devices
    > cannot be backed by a FIFO, which apparently they can't. I am not
    > sure of the solution at this time.
    >


    At some level, whether it's RAM paged out to swap or filesystem
    buffers, large parts of the backup process will end up using temporary
    disk space.

    Pipes certainly help..

    Something like:

    cd /data
    find . -iregex '.*' -print0 | tar --null -cvf - -T - | gzip > /backup/image1.gz

    (the regex is just a match-all placeholder here; -print0 and --null
    keep filenames with spaces intact)
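    Run end-to-end on a scratch tree, that pipe looks like this (the
    regex, file names, and paths are placeholders):

    ```shell
    #!/bin/sh
    set -e

    # Scratch stand-ins for /data and /backup.
    work=$(mktemp -d)
    mkdir -p "$work/data/sub" "$work/backup"
    printf 'x\n' > "$work/data/sub/note.txt"
    cd "$work/data"

    # Select files with find, stream them through tar, compress the
    # stream. --null/-print0 keep odd filenames intact.
    find . -iregex '.*\.txt' -print0 \
      | tar --null -cf - -T - \
      | gzip > "$work/backup/image1.gz"
    ```

    Because everything travels through pipes, no uncompressed
    intermediate archive ever lands on disk.
    
    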




  5. Re: TAR & GZIP into multiple files

    On Thursday, 12 July 2007 at 16:06, Christopher Ho wrote:

    > Hi all,
    >
    > Is it possible to tar + gzip a huge directory and split into multiple
    > archives ?
    >


    rar is more appropriate for that job; it can create multi-volume
    archives of a chosen size directly.

  6. Re: TAR & GZIP into multiple files

    On 12 Jul, 16:06, Christopher Ho wrote:
    > Hi all,
    >
    > Is it possible to tar + gzip a huge directory and split into multiple
    > archives ?
    >
    > The command I used to use is
    >
    > cd /data/ (which has a directory "output" under it)
    >
    > tar -czf /backup/data_May2007.tar.gz output


    Think about creating a loop:

    # cd /data
    # for name in *; do tar czvf /backup/data_May2007_"$name".tar.gz "$name"; done

    You can get more complex than that to pick up any dotfiles in /data,
    but you get the idea.
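    Spelled out on a scratch tree so it is self-contained (directory names
    and paths are placeholders; quoting guards against names with spaces):

    ```shell
    #!/bin/sh
    set -e

    # Scratch stand-ins for /data and /backup.
    work=$(mktemp -d)
    mkdir -p "$work/data/alpha" "$work/data/beta" "$work/backup"
    touch "$work/data/alpha/a.txt" "$work/data/beta/b.txt"
    cd "$work/data"

    # One compressed archive per top-level entry under /data.
    for name in *; do
        tar -czf "$work/backup/data_May2007_${name}.tar.gz" "$name"
    done
    ```

    Each subdirectory becomes its own restorable archive, so a single bad
    disc costs only that subdirectory rather than the whole backup.
    
    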

    You can also split the tar.gz into multiple pieces with "split", but I
    don't recommend that: it means having to gather and read back all the
    DVDs to recover anything reliably.

