Andrew Hodgson
2006-Jun-14 18:07 UTC
Rsyncing a very large directory tree (over 50,000 files)
Hi,

I need to rsync a very large directory tree (over 50,000 files). This is not a regular job: after the initial rsync is done, I can do a nightly rsync and only a few files will change.

Is there anything I need to be aware of before doing this? I started the script this morning, but it was still building the file list after around 15 minutes. Is it better to do it using several points, so that once I have the structure on the other machine I can do the whole tree in one go? Will the complete file list need to be sent across each time I run the program?

Also, I wish to exclude some directories, as they contain DBF files which are being manipulated on a regular basis (!). However, I wish to exclude some pathnames but not others under subdirectories. For example:

- /bin/
+ /binbackup/bin/
- /binbackup/**

Is that the correct way of doing this? I want to exclude the bin directory and anything in the binbackup directory apart from the bin directory within.

Any suggestions?

Thanks.
Andrew.
Matt McCutchen
2006-Jun-14 18:24 UTC
Rsyncing a very large directory tree (over 50,000 files)
On Wed, 2006-06-14 at 19:07 +0100, Andrew Hodgson wrote:
> Is there anything I need to be aware of before doing this? I started
> the script this morning, but it was still building the file list after
> around 15 minutes. Is it better to do it using several points, then
> when I have the structure on the other machine I can then do the whole
> tree in one go? Will the complete file list need to be sent across
> each time I run the program?

Yes, rsync will send the complete file list each time it runs. It seems odd to me that building the file list would take 15 minutes; when I back up the system partition of my computer (300,000 files) rsync takes perhaps 5 minutes to build the file list. I don't think using several points would be better or worse than doing it all at once, just more complicated.

> - /bin/
> + /binbackup/bin/
> - /binbackup/**
> I want to exclude the bin directory and anything in the binbackup
> directory apart from the bin directory within.

The third filter will match /binbackup/bin/foo and cause it to be excluded, which isn't what you want. You can do this:

- /bin/
+ /binbackup/bin/
- /binbackup/*

That way, the third filter does not exclude /binbackup/bin/foo. /binbackup/bar/baz is technically included, but since /binbackup/bar is excluded, rsync never even goes inside it. Or you can do this:

- /bin/
+ /binbackup/bin/***
- /binbackup/**

The second filter overrides the third one and causes /binbackup/bin and everything inside to be included.

Matt
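The first-match-wins behaviour described above can be sketched in Python. This is only a simplified model for reasoning about the three rule sets, not rsync's actual matching code: it assumes every pattern is anchored with a leading slash, handles only `*`, `**`, a trailing `/` and a trailing `/***`, and all function names and the sample tree are made up for illustration.

```python
import re

def glob_to_regex(text):
    """Translate '*' and '**' wildcards into regex syntax (simplified)."""
    out, i = [], 0
    while i < len(text):
        if text.startswith("**", i):
            out.append(".*")        # '**' may cross '/' boundaries
            i += 2
        elif text[i] == "*":
            out.append("[^/]*")     # '*' stops at '/'
            i += 1
        else:
            out.append(re.escape(text[i]))
            i += 1
    return "".join(out)

def rule_matches(pattern, path, is_dir):
    """Check one anchored pattern (leading '/') against a path."""
    if pattern.endswith("/***"):
        # 'dir/***' matches the directory itself and everything below it
        return re.fullmatch(glob_to_regex(pattern[:-4]) + "(/.*)?", path) is not None
    if pattern.endswith("/"):
        # A trailing slash restricts the pattern to directories
        return is_dir and re.fullmatch(glob_to_regex(pattern[:-1]), path) is not None
    return re.fullmatch(glob_to_regex(pattern), path) is not None

def included(path, is_dir, rules):
    """First matching rule wins; paths that match no rule are included."""
    for action, pattern in rules:
        if rule_matches(pattern, path, is_dir):
            return action == "+"
    return True

def transferred(tree, rules):
    """tree: (path, is_dir) pairs with parents listed before children."""
    pruned, kept = [], []
    for path, is_dir in tree:
        if any(path.startswith(d + "/") for d in pruned):
            continue                # inside an excluded directory: never visited
        if included(path, is_dir, rules):
            kept.append(path)
        elif is_dir:
            pruned.append(path)
    return kept

# Hypothetical sample tree, relative to the transfer root
TREE = [
    ("/bin", True), ("/bin/ls", False),
    ("/binbackup", True),
    ("/binbackup/bar", True), ("/binbackup/bar/baz", False),
    ("/binbackup/bin", True), ("/binbackup/bin/foo", False),
    ("/data.txt", False),
]

ORIGINAL = [("-", "/bin/"), ("+", "/binbackup/bin/"), ("-", "/binbackup/**")]
FIX_STAR = [("-", "/bin/"), ("+", "/binbackup/bin/"), ("-", "/binbackup/*")]
FIX_TRIPLE_STAR = [("-", "/bin/"), ("+", "/binbackup/bin/***"), ("-", "/binbackup/**")]

for name, rules in [("original", ORIGINAL), ("star", FIX_STAR), ("triple-star", FIX_TRIPLE_STAR)]:
    print(name, transferred(TREE, rules))
```

Under this model the original rules keep the directory /binbackup/bin but drop /binbackup/bin/foo, while both corrected rule sets keep the file. In every case /binbackup/bar is excluded, so /binbackup/bar/baz is never examined.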
Jamie Lokier
2006-Jun-14 18:43 UTC
Rsyncing a very large directory tree (over 50,000 files)
Matt McCutchen wrote:
> Yes, rsync will send the complete file list each time it runs. It seems
> odd to me that building the file list would take 15 minutes; when I back
> up the system partition of my computer (300,000 files) rsync takes
> perhaps 5 minutes to build the file list.

That surely depends on the computer and your disk.

My laptop disk has about 2,000,000 files, and it takes longer than 15 minutes to build the list when doing a backup with rsync. Also I have to do two rsync runs, as it can run out of memory if I do the whole tree with one. (192MB RAM).

-- Jamie
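Splitting one large transfer into several runs, as mentioned above, could look roughly like the following Python sketch. It only builds the command lines and does not run them. The helper name, the `/data` source root, the `backuphost:/backup/` destination and the subdirectory names are all hypothetical, and `-a` is just one plausible option set.

```python
import posixpath

def split_rsync_commands(src_root, dest, subdirs, extra_args=("-a",)):
    """Return one rsync argument list per top-level subdirectory.

    Hypothetical helper, not part of rsync. Each command copies
    src_root/<subdir> into dest, so each run only has to build the
    file list for part of the tree.
    """
    commands = []
    for name in sorted(subdirs):
        # No trailing slash on the source: rsync copies the directory itself
        source = posixpath.join(src_root, name)
        commands.append(["rsync", *extra_args, source, dest])
    return commands

# Assumed example values, for illustration only
COMMANDS = split_rsync_commands("/data", "backuphost:/backup/", ["projects", "home"])
for argv in COMMANDS:
    print(" ".join(argv))
```

Files sitting directly in the source root are not covered by per-subdirectory runs and would need a separate pass.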
Andrew Hodgson
2006-Jun-14 22:35 UTC
Rsyncing a very large directory tree (over 50,000 files)
-----Original Message-----
From: Jamie Lokier [mailto:jamie@shareable.org]
Sent: 14 June 2006 19:43
To: Matt McCutchen
Cc: Andrew Hodgson; rsync@lists.samba.org
Subject: Re: Rsyncing a very large directory tree (over 50,000 files)

> Matt McCutchen wrote:
>> Yes, rsync will send the complete file list each time it runs. It seems
>> odd to me that building the file list would take 15 minutes; when I back
>> up the system partition of my computer (300,000 files) rsync takes
>> perhaps 5 minutes to build the file list.
>
> That surely depends on the computer and your disk.
>
> My laptop disk has about 2,000,000 files, and it takes longer than 15
> minutes to build the list when doing a backup with rsync. Also I have
> to do two rsync runs, as it can run out of memory if I do the whole
> tree with one. (192MB RAM).

Hardware shouldn't be an issue, but I was a bit worried about it reaching some limit somewhere and not doing something correctly. I will try again when at work tomorrow.

Andrew.