File Locking - Tools



Thread: File Locking

  1. File Locking

    On Mon 18 Aug 2008, [EMAIL PROTECTED] wrote:
    >
    > >> The way I solved this problem for a data-mirroring system was to use a
    > >> small wrapper script that ensures only one invocation of rsync is ever
    > >> running at one time. This proved to be a robust solution for our situation.

    >
    > >> --Kyle

    >
    > Thanks for the idea but I sometimes need more than one copy to run at the
    > same time so that won't work. File locking is about the only option I can
    > see.


    >> If you need to run more than one rsync over the same tree at the same
    >> time, you may need to rethink your tree layout and/or your approach to
    >> whatever problem you're trying to fix. Running two rsyncs sequentially
    >> will usually be faster than two concurrent ones, as that might cause
    >> disk thrashing (the heads are continually seeking to and from where the
    >> respective rsync processes are working).


    >> Paul Slootman


    Thanks again, but that is still not what I am looking for. Perhaps more info is needed.

    I have an identical set of directories at two locations. When a file is added to one location (I'll call it the source side), I want to run a script that picks up that file and copies it to the other location (the destination side). Simple enough.

    However, I want to schedule the script to run, say, every 15 minutes. That way, if a file is put on the source side, the script will pick it up and begin copying it. However, if the file is a few hundred MB, it might take longer than 15 minutes to copy. So the next time the script runs, I need rsync to skip that file, since it is still being copied by the first run, and move on to the next file. The same thing might happen again during the following run.

    In other words, I can't wait until the first run has completed the large copy to begin copying additional files. I want to start a second, third, fourth, etc copy that begins working on any additional files that may have been placed on the source side.


    Thank you all again for the input, but it still looks to me like I need some type of file locking. Again, all other input is welcome because there might be a better way.

    - Kyle
    --
    Please use reply-all for most replies to avoid omitting the mailing list.
    To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
    Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


  2. Re: File Locking

    Another Kyle wrote:
    > However, I want to schedule the script to run, say every 15 minutes.
    > That way if a file is put on the source side, the script will pick it up
    > and begin copying it. However, if the file is a few hundred MB, it might
    > take longer than 15 minutes to copy it.
    >
    > In other words, I can't wait until the first run has completed the large
    > copy to begin copying additional files.


    If you can't copy a large file within a fifteen minute time span, you've
    implied that you are working over a low-bandwidth link; if that's the case,
    why would you want to start another rsync that would compete with the already
    running rsync for bandwidth? Two rsync processes sharing a single slow
    uplink are not going to finish any sooner than the same two transfers run
    in series. If anything, two or more simultaneous rsync processes (and,
    given what you've described, "more" sounds likely) are likely to be
    *slower* than running rsync repeatedly in series.

    This sounds more and more like my own situation, which again, I solved by
    ensuring that only one rsync was running at any one time; I invoke the
    wrapper script every sixty seconds, and it takes less than 20 seconds to
    inspect the directory tree for changes and abort in the absence of files
    that need to be transferred.
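
The wrapper described above was not posted to the thread, so the following is only a minimal sketch of the general idea (one rsync at a time, later invocations exit immediately), written in Python with an advisory flock() lock. The lock path, the function names, and the command passed in are all assumptions made for illustration, not the poster's actual script.

```python
import fcntl
import subprocess

LOCK_PATH = "/tmp/mirror-rsync.lock"  # hypothetical lock file location


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "a")
    try:
        # LOCK_NB makes the call fail at once instead of waiting for the holder.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_exclusive(cmd, lock_path=LOCK_PATH, runner=subprocess.call):
    """Run cmd unless another instance holds the lock; return None if skipped."""
    lock = try_lock(lock_path)
    if lock is None:
        return None  # a previous run is still copying
    try:
        return runner(cmd)
    finally:
        lock.close()  # closing the file releases the lock
```

A cron job could call something like `run_exclusive(["rsync", "-a", "/srv/source/", "remote:/srv/dest/"])` every minute (both paths are placeholders). Because the kernel drops the lock when the process exits, a crashed run should not leave a stale lock behind.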

    --Kyle


  3. Re: File Locking

    On Mon, 2008-08-18 at 10:36 -0500, lists@trcintl.com wrote:
    > I have an identical set of directories at two locations. When a file
    > is added to one location, I'll call it the source side, I want to run
    > a script that picks up that file and copies it to the other location,
    > say the destination side. Simple enough.
    >
    > However, I want to schedule the script to run, say every 15 minutes.
    > That way if a file is put on the source side, the script will pick it
    > up and begin copying it. However, if the file is a few hundred MB, it
    > might take longer than 15 minutes to copy it. So the next time the
    > script runs, I need rsync to skip that file since it is still being
    > copied from the first run and move to the next file. That same thing
    > might be repeated during the next run.
    >
    > In other words, I can't wait until the first run has completed the
    > large copy to begin copying additional files. I want to start a
    > second, third, fourth, etc copy that begins working on any additional
    > files that may have been placed on the source side.


    Fixing the problem with locking is trickier than it might appear.
    Suppose two large files A and B are added to the source. The script
    runs and starts copying A; the rsync generator works ahead and tells the
    sender that B also needs a transfer. The generator shouldn't lock B at
    this point, because that would force B to wait for A, defeating the
    purpose of using multiple concurrent rsyncs. 15 minutes later, a second
    instance of the script runs, skips A, and starts copying B. When the
    first rsync sender finishes A, it needs to know to skip B even though
    the generator has requested a transfer, and even if the second instance
    has exited (and released any lock?).

    I'm thinking that it may be easier to use one rsync run per file and
    have the script keep track of what is working on what.
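
One way the "one rsync run per file" idea might be sketched is shown below, in Python. Each file gets its own lock file, so a later instance skips whatever an earlier instance is still copying, and no rsync ever looks ahead to a second file. The function names, the lock directory, and the `runner` parameter are assumptions for illustration; starting one rsync per file also adds overhead that may matter on large trees.

```python
import fcntl
import hashlib
import os
import subprocess


def claim(lock_dir, rel_path):
    """Try to lock one file's marker; return the open handle, or None if busy."""
    os.makedirs(lock_dir, exist_ok=True)
    name = hashlib.sha256(rel_path.encode()).hexdigest() + ".lock"
    handle = open(os.path.join(lock_dir, name), "a")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def copy_pending(source_root, dest, lock_dir, runner=subprocess.call):
    """Run one rsync per file under source_root; return the files skipped as busy.

    lock_dir is assumed to live outside source_root so lock files are not copied.
    """
    skipped = []
    for dirpath, _dirs, files in os.walk(source_root):
        for filename in sorted(files):
            rel_path = os.path.relpath(os.path.join(dirpath, filename), source_root)
            handle = claim(lock_dir, rel_path)
            if handle is None:
                skipped.append(rel_path)  # another instance is copying this file
                continue
            try:
                # --relative recreates rel_path's directories on the destination.
                runner(["rsync", "-a", "--relative", "--", rel_path, dest],
                       cwd=source_root)
            finally:
                handle.close()  # releases this file's lock
    return skipped
```

Files that are already up to date still get an rsync invocation here; rsync's normal checks should make those runs short, but a real script would probably want a cheaper way to decide which files are pending.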

    If the goal is primarily to avoid having small files wait behind large
    files, another approach would be to have multiple periodic rsync jobs,
    each of which deals with files in a different size range (using
    --min-size and --max-size). At most one instance of each job would run
    at a time.
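
As a rough illustration of the size-range approach, the Python sketch below builds one rsync command per size band from a list of byte boundaries. The helper name and the example paths are hypothetical, and it assumes --min-size and --max-size accept plain byte counts; each resulting command would still need its own "at most one instance" guard.

```python
def size_range_jobs(boundaries, source, dest):
    """Build one rsync command per size band.

    boundaries is an ascending list of byte counts; a file of exactly
    boundaries[i] bytes falls into the band that starts at that boundary.
    """
    edges = [None] + list(boundaries) + [None]
    jobs = []
    for low, high in zip(edges, edges[1:]):
        cmd = ["rsync", "-a"]
        if low is not None:
            cmd.append("--min-size=%d" % low)  # skip files smaller than low
        if high is not None:
            cmd.append("--max-size=%d" % (high - 1))  # skip files of high bytes or more
        cmd += [source, dest]
        jobs.append(cmd)
    return jobs
```

For example, `size_range_jobs([1048576], "/srv/source/", "remote:/srv/dest/")` yields two commands: one limited to files under 1 MiB and one for files of 1 MiB and larger, so small files never queue behind a large transfer.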

    Matt


