Rsync efficiency on SQL dump files - Tools

This is a discussion on Rsync efficiency on SQL dump files - Tools ; G'day, I'm backing up production MySQL database servers. The tables are almost all MyISAM. My plan is to use MySQL binary logging and then rsync the binary logs offsite hourly. The binary log files are only appended to, with new ...

+ Reply to Thread
Results 1 to 2 of 2

Thread: Rsync efficiency on SQL dump files

  1. Rsync efficiency on SQL dump files

    G'day,

    I'm backing up production MySQL database servers. The tables are almost
    all MyISAM.

    My plan is to use MySQL binary logging and then rsync the binary logs
    offsite hourly. The binary log files are only appended to, with new
    queries logged at the end of the file. So I'm assuming the rsync
    algorithm will be highly effective at reducing the bandwith required to
    transfer these files.

    My question is regarding MySQL dump files. From one backup to the next,
    I'd estimate that the MySQL data will be 90% consistent. So the dump
    file will have 10% new data. That new data will be scattered at random
    points through the file.

    Typically I'd pipe the output from mysqldump into gzip. Then I'd copy
    the gzipped file offsite. I wonder if saving the file uncompressed, and
    then using rsync to copy the file offsite will be more efficient.

    Any advice? Will rsync be effective at finding the 90% consistency
    between the new dump file and the old dump file? Will it be able to
    transfer only the 10% of the file that has changed?

    I'd welcome any other advice or feedback on my proposed setup.

    Cheers - Callum.

    PS> Apologies if this double posted. I tried it a few days ago but
    haven't seen it on the list yet.



    --
    Please use reply-all for most replies to avoid omitting the mailing list.
    To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
    Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


  2. Re: Rsync efficiency on SQL dump files

    On Wed 01 Oct 2008, Callum Macdonald wrote:
    >
    > I'm backing up production MySQL database servers. The tables are almost
    > all MyISAM.
    >
    > My plan is to use MySQL binary logging and then rsync the binary logs
    > offsite hourly. The binary log files are only appended to, with new
    > queries logged at the end of the file. So I'm assuming the rsync
    > algorithm will be highly effective at reducing the bandwith required to
    > transfer these files.


    I'd recommend using --append --inplace with these binlogs (not with most
    other data!).


    > My question is regarding MySQL dump files. From one backup to the next,
    > I'd estimate that the MySQL data will be 90% consistent. So the dump
    > file will have 10% new data. That new data will be scattered at random
    > points through the file.
    >
    > Typically I'd pipe the output from mysqldump into gzip. Then I'd copy
    > the gzipped file offsite. I wonder if saving the file uncompressed, and
    > then using rsync to copy the file offsite will be more efficient.
    >
    > Any advice? Will rsync be effective at finding the 90% consistency
    > between the new dump file and the old dump file? Will it be able to
    > transfer only the 10% of the file that has changed?


    That depends on how large the files are (which affects the block size
    rsync uses) and how close together the changes are.

    Also consider using --order-by-primary with mysqldump, which will at
    least ensure that the order of rows won't change, which may help.
    However, if you database is very large, this may impact the dump time
    too much.

    Your gzip may also have a --rsyncable option which optimizes the
    compressed file so that small changes in the original lead to smaller
    changes in the compressed result, at a cost of a slightly larger
    compressed file. Experimenting with different combinations of things
    will show the best strategy in your particular case.

    > PS> Apologies if this double posted. I tried it a few days ago but
    > haven't seen it on the list yet.


    Didn't see it either


    Paul Slootman
    --
    Please use reply-all for most replies to avoid omitting the mailing list.
    To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
    Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


+ Reply to Thread