Tuesday 20 March 2012

gigasync - Tool that enables rsync to mirror enormous directory trees.

rsync (http://samba.org/rsync/) has a couple of issues with mirroring large (> 100K) directory trees:

rsync's memory usage is directly proportional to the number of files in a tree. Large directories take a large amount of RAM.
rsync can recover from previous failures, but always determines the files to transfer up-front. If the connection fails before that determination can be made, no forward progress in the mirror can occur.
The solution? Chop up the workload by using Perl to recurse the directory tree, building smallish lists of files to transfer with rsync. Most of the time these small lists of files transfer fine, but if one fails, the script can look for that specific failure and retry that set a couple of times before giving up.
http://matthew.mceachen.us/geek/gigasync/gigasync.pod.html
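
The batching idea described above can be sketched roughly as follows. This is not gigasync's actual code (gigasync is a Perl script); it is a minimal Python illustration, and the names `batches`, `run_rsync`, `sync_batch`, and `mirror`, the default batch size of 1000, and the retry count of 3 are all assumptions made for the example. It also retries on any non-zero rsync exit code, whereas gigasync looks for a specific failure.

```python
import os
import subprocess
import tempfile


def batches(root, batch_size=1000):
    """Yield lists of file paths (relative to root), at most batch_size each."""
    batch = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            batch.append(os.path.relpath(os.path.join(dirpath, name), root))
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch


def run_rsync(list_path, src, dest):
    """Run rsync on one file list and return its exit code."""
    # -a (archive mode) and --files-from are standard rsync options.
    cmd = ["rsync", "-a", "--files-from=" + list_path, src, dest]
    return subprocess.call(cmd)


def sync_batch(batch, src, dest, retries=3, runner=run_rsync):
    """Transfer one batch, trying up to `retries` times. True on success."""
    # Write the batch to a temporary list file, one relative path per line.
    with tempfile.NamedTemporaryFile("w", suffix=".list", delete=False) as f:
        f.write("\n".join(batch) + "\n")
        list_path = f.name
    try:
        for _attempt in range(retries):
            # Assumption: any non-zero exit code is treated as retryable.
            if runner(list_path, src, dest) == 0:
                return True
        return False
    finally:
        os.unlink(list_path)


def mirror(src, dest, batch_size=1000, retries=3, runner=run_rsync):
    """Mirror src to dest batch by batch; return the batches that failed."""
    failed = []
    for batch in batches(src, batch_size):
        if not sync_batch(batch, src, dest, retries, runner):
            failed.append(batch)
    return failed
```

Because each rsync invocation only sees one small list, its memory use stays bounded by the batch size, and a dropped connection costs at most one batch of work. Calling something like `mirror("/data/src/", "user@host:/data/dest/")` (hypothetical paths) would return the batches that still failed after all retries.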

download at 
