memory usage in rsync 3.0.3 -- how much RAM should I have to transfer 13 million files?


  1. memory usage in rsync 3.0.3 -- how much RAM should I have to transfer 13 million files?

    Hi. I am trying to recursively rsync a directory containing 13 million files.

    Right now this is killing my server, in terms of memory usage.

    I've upgraded from rsync 2.6.9 to 3.0.3 on both ends, but memory usage
    is still too high. I killed the rsync process when it reached 256 MB
    in size.

    I only have 1 GB of RAM in this server.

    We've planned an outage to upgrade it to 3 GB, but will that be enough?

    How much memory will rsync use? I didn't specify any of the
    switches that disable incremental recursion.

    Thanks,
    Aleksey
    --
    Please use reply-all for most replies to avoid omitting the mailing list.
    To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
    Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


  2. Re: memory usage in rsync 3.0.3 -- how much RAM should I have to transfer 13 million files?

    Aleksey Tsalolikhin wrote:

    > I've upgraded from rsync 2.6.9 to 3.0.3 on both ends, but memory usage
    > is still too high.

    Why should rsync 3's memory usage depend on the number of files? Does it
    keep files it already knows should not be transferred in memory?

    If not, then maybe we should hold back rsync's very useful, very
    speed-productive read-ahead of the file list. If we see that the "todo
    list" is piling up, maybe we should hold off on continuing the scan
    until the backlog gets smaller.

    Yes, I know, this is the typical case of someone sitting on the fence,
    hardly ever doing anything useful for the project, and dispensing
    "invaluable" advice. Fact is, I need this. If Wayne doesn't do it, I
    will get around to it eventually. The problem is that the key word
    here is "eventually".

    Shachar


  3. Re: memory usage in rsync 3.0.3 -- how much RAM should I have to transfer 13 million files?

    On Tue, Aug 12, 2008 at 11:46:11AM -0700, Aleksey Tsalolikhin wrote:
    > How much memory will rsync use? I didn't specify any of the
    > switches that disable incremental recursion.


    It depends on your options, and possibly on the maximum number of files
    in a directory. I've seen a recursive scan use about 20MB for a huge
    set of files, which is not much more memory than a typical bash or zsh
    process uses.

    You should make extra sure that you're not disabling incremental
    recursion by specifying at least one -v option and checking that rsync
    says that it is sending/receiving an incremental file list at the start
    (not the older "building file list" or "receiving file list" messages).
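
The check described above can be automated when rsync runs from a script. The following is a minimal sketch, assuming the output of an `rsync -v` run has been captured as a list of lines; the function name is hypothetical, and the message strings are the ones quoted in this thread.

```python
def uses_incremental_recursion(output_lines):
    """Classify captured `rsync -v` output by its file-list message.

    Returns True if an "incremental file list" message is seen,
    False if one of the older messages is seen first,
    and None if no file-list message appears at all.
    """
    for line in output_lines:
        text = line.strip().lower()
        # "sending incremental file list" / "receiving incremental file list"
        if "incremental file list" in text:
            return True
        # Older, non-incremental messages named in the post above.
        if text.startswith("building file list") or text.startswith("receiving file list"):
            return False
    return None
```

Returning None for unrecognized output keeps the caller from mistaking a quiet run (no -v) for a non-incremental one.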

    If it is doing an incremental scan, it still processes the immediate
    contents of every individual directory as a unit (and has at least one
    directory of read-ahead in memory), so if you have really huge numbers
    of files in your directories, that will increase the maximum memory
    used.
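
To make the reasoning above concrete, here is a rough back-of-the-envelope sketch. It assumes the file list held in memory is the current directory plus a fixed number of read-ahead directories, taken in scan order. The function names, the sliding-window model, and the default of 100 bytes per entry are all illustrative assumptions, not figures measured from rsync 3.0.3. Since rsync keeps "at least" one directory of read-ahead, treat the result as a lower bound.

```python
def peak_entries(dir_sizes, read_ahead=1, incremental=True):
    """Estimate the largest number of file entries held at once.

    dir_sizes: files per directory, in the order they are scanned.
    read_ahead: directories held in addition to the current one (assumed).
    """
    if not incremental:
        # Without incremental recursion the whole list is built up front.
        return sum(dir_sizes)
    window = read_ahead + 1
    return max(
        (sum(dir_sizes[i:i + window]) for i in range(len(dir_sizes))),
        default=0,
    )


def estimate_bytes(entries, bytes_per_entry=100):
    """Convert an entry count to bytes; bytes_per_entry is a placeholder."""
    return entries * bytes_per_entry


if __name__ == "__main__":
    sizes = [10_000, 10_000, 500]
    print(peak_entries(sizes), estimate_bytes(peak_entries(sizes)))
```

Real usage also depends on the options in effect and on path lengths, so a measurement on the actual tree is more trustworthy than this model.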

    Other options will also affect memory use, such as -H which makes rsync
    search for hard-link matches over the whole of the hierarchy, and that
    can really bloat things, particularly if a large percentage of the files
    in the transfer have more than one link. (Since you didn't cite your
    options, I can't be more specific.)

    ...wayne..


  4. Re: memory usage in rsync 3.0.3 -- how much RAM should I have to transfer 13 million files?

    Thank you, Wayne. My options are:

    --server --sender --numeric-ids --perms --owner --group -D --links
    --hard-links --times --block-size=2048 --recursive . /

    We don't have hard links, AFAIK.
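
Because the cost of -H grows with the share of files that have more than one link, it can be worth verifying the "no hard links" assumption before deciding whether to keep --hard-links. The following is a minimal sketch that walks a tree and counts regular files whose link count exceeds one; the function name is hypothetical, and on 13 million files the walk itself will take a while.

```python
import os
import stat


def count_multiply_linked_files(root):
    """Count regular files under root that have more than one hard link."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are not followed.
            st = os.lstat(os.path.join(dirpath, name))
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                count += 1
    return count


if __name__ == "__main__":
    import sys
    print(count_multiply_linked_files(sys.argv[1] if len(sys.argv) > 1 else "."))
```

A result of zero suggests -H can be dropped without changing what gets transferred.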

    I am archiving 2 months of data, and then I will try doing another
    rsync run, and I'll add the -v option as you suggested.


    Best,
    Aleksey


  5. Re: memory usage in rsync 3.0.3 -- how much RAM should I have to transfer 13 million files?

    To answer my own question, my rsync process has grown to 1 GB
    resident in memory and seems to be holding steady at that size as it
    keeps chugging along.

    Boy am I glad incremental recursion has been added - thanks!! I
    would never have been able to do this rsync otherwise.

    Aleksey

    On 8/14/08, Aleksey Tsalolikhin wrote:
    > Hi. Let's say I have 10,000 files per directory. If I understood
    > Wayne, rsync builds a list for all the files in the current dir, plus
    > another list for the directory being read-ahead.
    >
    > So how much memory should rsync use, for 20,000 files?
    >
    > I did double-check with -v, and it says "receiving incremental file
    > list" at the start of the rsync session.
    >
    > I've also removed the -H option.
    >
    > My options now are:
    >
    > --numeric-ids
    > --perms
    > --owner
    > --group
    > -D
    > --links
    > --times
    > --block-size=2048
    > --recursive
    > -v
    > --ignore-times
    >
    > What should I expect in terms of rsync's memory usage, please?
    >
    > Best,
    > Aleksey
    >


