Rsync compression problem - sometimes ineffective? - Tools

  1. Rsync compression problem - sometimes ineffective?


    I am running rsync 2.6.9-1.el4.rf on both the CentOS 4.4 client and the
    remote server, backing up user data from 2 different clients with the
    following command:

    su - $HOSTID -c 'rsync -azr --timeout=600 --log-file=$DEBUGFILE
    --log-file-format="%o %f %b %l %i" --stats --delete --bwlimit=$BANDWDT
    --rsh="ssh -P ____" $STAGE $TARGET:$TARGETDIR'

    I use the ratio of "bytes sent" to "literal data" from the statistics
    as a rough estimate of how effective compression is (I know "bytes
    sent" includes some overhead). Most days I see reasonable compression,
    such as these entries from our summary, where "MBytes compressed" is
    bytes sent and "MBytes uncompressed" is literal data:

    rsync $HOSTID transferred 46.20 MBytes compressed (210.45 MBytes
    uncompressed)
    52 minutes and 6 seconds
    45.50 kBps
    6,896 files changed out of 81,720 total files (8.44%)

    or

    rsync $HOSTID transferred 543.53 MBytes compressed (3.66 GBytes
    uncompressed)
    2 hours, 16 minutes and 38 seconds
    89.12 kBps
    7,343 files changed out of 79,944 total files (9.19%)

    Some days, I see no evidence of compression, such as this:

    rsync $HOSTID transferred 52.10 MBytes compressed (50.06 MBytes
    uncompressed)
    59 minutes and 48 seconds
    53.98 kBps
    5,350 files changed out of 80,257 total files (6.67%)

    or similarly this:

    rsync $HOSTID transferred 1007.55 MBytes compressed (1004.59 MBytes
    uncompressed)
    3 hours, 38 minutes and 47 seconds
    92.27 kBps
    9,888 files changed out of 79,306 total files (12.47%)
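
For reference, the sent/literal ratio described above can be worked out directly from the four summaries. This is only a sketch of the arithmetic: the figures are copied from the summaries, the 1 GByte = 1024 MBytes conversion is an assumption, and the function name is made up for illustration.

```python
# Figures copied from the four summaries above, as
# (label, MBytes sent, MBytes of literal data).
# Assumption: 1 GByte = 1024 MBytes for the 3.66 GBytes figure.
runs = [
    ("run 1", 46.20, 210.45),
    ("run 2", 543.53, 3.66 * 1024),
    ("run 3", 52.10, 50.06),
    ("run 4", 1007.55, 1004.59),
]


def sent_to_literal_ratio(sent_mb, literal_mb):
    """Bytes sent divided by literal data; below 1.0 suggests compression helped."""
    return sent_mb / literal_mb


for label, sent, literal in runs:
    ratio = sent_to_literal_ratio(sent, literal)
    print(f"{label}: sent/literal = {ratio:.2f}")
```

With these inputs the first two runs come out well below 1.0 and the last two slightly above 1.0, which is the "no evidence of compression" pattern. Because "bytes sent" also carries protocol overhead, a ratio a little over 1.0 is what incompressible data would be expected to produce.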


    My initial thought was that the days with no apparent compression were
    days when most of the changed files were either small (like when
    gzipping a small ASCII file doubles its size) or already compressed.
    So far I haven't been able to confirm this. I'm also not sure the
    small-file logic applies, since rsync compresses data blocks (at least
    as I understand it), and those blocks would be fairly consistent in
    size (I think). Is this general understanding of rsync's compression
    correct?
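
The two hypotheses in the paragraph above (tiny inputs and already compressed inputs) can be checked in isolation with a small experiment. The sketch below uses Python's zlib module as a stand-in for deflate-style compression. It says nothing about how rsync frames or blocks its compressed stream, the sample buffers are invented, and random bytes are used here only as an approximation of already compressed content.

```python
import os
import zlib


def deflate_ratio(data: bytes, level: int = 6) -> float:
    """Compressed size divided by original size for one standalone buffer."""
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical sample inputs, not taken from the backup being discussed.
text_like = b"2008-06-12 13:35:01 user=alice action=login status=ok\n" * 2000
random_like = os.urandom(len(text_like))  # approximates compressed/encrypted data
tiny = b"hello\n"                         # a very small ASCII "file"

for label, buf in [("text", text_like), ("random", random_like), ("tiny", tiny)]:
    print(f"{label:>6}: {len(buf):>7} bytes, ratio {deflate_ratio(buf):.3f}")
```

A ratio above 1.0 means the output is larger than the input, which happens for the tiny buffer (fixed header and checksum overhead) and for the random buffer (nothing to exploit). Running the same function over a sample of the files that changed on a "bad" day would be one way to test the theory.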

    I searched the samba.org list archives first, and then Internet-wide,
    using +rsync +compression +problem, but didn't find any similar posts.
    Less restrictive searches didn't help either. I also didn't see
    anything in the FAQ or the current issues and debugging areas.

    Has anyone seen this sort of behaviour before? Can you offer
    suggestions of additional diagnostics to attempt? What additional
    information might be useful to support my contention that this is
    related to the data being changed on those "uncompressed" days?

    Thanks

    Donald E. Bodle, Jr.
    Sr. Systems Developer
    The Reynolds and Reynolds Co.
    (937) 485-1954

    Are you okay with today, if tomorrow is the end?
    - Superchick (So Bright)


    --
    Please use reply-all for most replies to avoid omitting the mailing list.
    To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
    Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


  2. Re: Rsync compression problem - sometimes ineffective?

    On Thu, 2008-06-12 at 13:35 -0400, Bodle, Donald E wrote:
    > Using "bytes sent"/"literal data" from statistics as a rough estimation
    > (I know there is overhead in the bytes sent) of the effectiveness of
    > compression, most days I see reasonable compression


    > My initial thought was that days of no apparent compression were when
    > the majority of the changed files were small files (like when gzipping a
    > small ASCII file doubles its size) or already compressed files. But so
    > far I haven't been able to confirm this. I'm not sure this logic
    > applies since rsync compresses data blocks (at least as I understand
    > it), and those blocks would be fairly consistent in size (I think). Is
    > this general understanding of rsync's compression correct?


    My guess is that the files are already compressed.

    To see the actual size (compressed if applicable) of the delta rsync is
    sending for each file, use the %b log option, e.g.,
    --out-format='%b %i %n%L' . You can compare those numbers with and
    without compression to see which deltas aren't compressing as well as
    you expect. Unfortunately, %b only seems to work on a run that really
    updates a destination, so you'll have to use a throwaway destination
    (perhaps with --compare-dest to the real one) for the tests; %b ought to
    work in --only-write-batch mode. To investigate why a particular delta
    isn't compressing, you could use rdiff to write the delta to a file and
    then look at the data inside.
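
One way to do the comparison suggested above is to save the --out-format='%b %i %n%L' output of two runs (one with -z, one without) and compare the per-file byte counts. The sketch below assumes each line has the shape "bytes itemized-flags name"; the function names, the 0.9 threshold, and the sample lines are all hypothetical, and real logs may need extra handling (for example, names with leading spaces).

```python
def parse_out_format(lines):
    """Map file name -> bytes transferred, for lines shaped like '%b %i %n%L'."""
    sizes = {}
    for line in lines:
        parts = line.rstrip("\n").split(None, 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip blank lines, stats, anything not in the expected shape
        transferred, _itemized, name = parts
        sizes[name] = int(transferred)
    return sizes


def poorly_compressed(plain, compressed, threshold=0.9):
    """List (name, plain_bytes, z_bytes) where the -z run sent >= threshold of the plain run."""
    rows = []
    for name, plain_bytes in plain.items():
        z_bytes = compressed.get(name)
        if z_bytes is None or plain_bytes == 0:
            continue
        if z_bytes / plain_bytes >= threshold:
            rows.append((name, plain_bytes, z_bytes))
    # Largest deltas first, since they dominate the total bytes sent.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical sample lines; real ones would come from the two test runs.
plain_log = [
    "120000 >f+++++++ docs/report.txt",
    "500000 >f+++++++ media/photo.jpg",
]
z_log = [
    "30000 >f+++++++ docs/report.txt",
    "499000 >f+++++++ media/photo.jpg",
]

for row in poorly_compressed(parse_out_format(plain_log), parse_out_format(z_log)):
    print(row)
```

Files that show up in this list are the candidates for a closer look with rdiff, as described above.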

    Matt
