filecmp.dircmp performance
Peter Otten
__peter__ at web.de
Sat Jan 8 11:28:56 EST 2011
dads wrote:
> I'm creating a one way sync program, it's to automate backing up data
> over the wan from our shops to a server at head office. It uses
> filecmp.dircmp() but the performance seems poor to me.
>
> for x in dc.diff_files:
>     srcfp = os.path.join(src, x)
>     self.fn777(srcfp)
>     if os.path.isfile(srcfp):
>         try:
>             shutil.copy2(srcfp, dst)
>             self.lg.add_diffiles(src, x)
>         except Exception, e:
>             self.lg.add_errors(e)
>
> I tested it at a store which is only around 50 miles away on a 10Mbps
> line, the directory has 59 files that are under 100KB. When it gets to
> dc.diff_files it takes 15mins to complete. Looking at the filecmp.py
> it's only using os.stat, it seems excessively long.
As a baseline it would be interesting to see how long it takes to copy those
59 files using system tools.
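
If system tools are not convenient, a rough baseline can also be taken from Python itself. The following is only a minimal sketch, assuming a flat source directory and modern Python 3; the function name time_copy is made up for illustration and is not part of the original program. It copies every regular file with shutil.copy2 and reports how long the whole run took, with no comparison step involved.

```python
import os
import shutil
import time


def time_copy(src_dir, dst_dir):
    """Copy every regular file in src_dir to dst_dir and time the whole run.

    Hypothetical helper for a baseline measurement; subdirectories are skipped.
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = 0
    start = time.perf_counter()
    for name in sorted(os.listdir(src_dir)):
        src_path = os.path.join(src_dir, name)
        if os.path.isfile(src_path):
            # copy2 copies the data plus metadata such as the modification time
            shutil.copy2(src_path, dst_dir)
            copied += 1
    elapsed = time.perf_counter() - start
    return copied, elapsed
```

Comparing the elapsed time of such a plain copy with the time spent in the dc.diff_files loop should show whether the comparison or the transfer dominates.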
However, there are efficient tools out there that work hard to reduce the
traffic over the net, which is likely to be the bottleneck. I suggest that
you have a look at
http://en.wikipedia.org/wiki/Rsync
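
If you want to keep the surrounding program in Python, one option is to hand the transfer to rsync via subprocess. This is a sketch only, assuming rsync is installed on both ends and the destination is reachable (for example over ssh); the function names and any host or path you pass in are placeholders, not something taken from your setup. The flags used are -a (archive mode, preserves times and permissions) and -z (compress data in transit).

```python
import subprocess


def build_rsync_command(src_dir, dest):
    """Return the argument list for a one-way rsync of src_dir into dest."""
    # A trailing slash on the source makes rsync copy the directory's
    # contents rather than the directory itself.
    source = src_dir.rstrip("/") + "/"
    return ["rsync", "-a", "-z", source, dest]


def run_rsync(src_dir, dest):
    """Run rsync and return its exit status (0 means success)."""
    return subprocess.call(build_rsync_command(src_dir, dest))
```

Building the command as a list avoids shell quoting problems with unusual file names, and the exit status can be fed into whatever logging the program already does.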