[Module] pysync 1.2
Donovan Baarda
abo@minkirri.apana.org.au
Fri, 02 Mar 2001 12:14:10 +1100 (EST)
pysync 1.2
----------
A Python implementation of the rsync algorithm.
This is a demonstration implementation of the rsync algorithm in Python.
It is not fast and is not optimised. The primary aim is to provide a
simple example implementation of the algorithm for reference, so code
clarity is more important than performance. Ideas have been liberally
taken from libhsync, xdelta and rsync.
Release 1.2 introduced the new zlib-like API, allowing for incremental
calculation of deltas and application of patches. The comments at the top of
pysync.py explain it all:
# Low level API signature calculation
sig=calcsig(oldfile)
# Low level API rsync style incremental delta calc from sig and newdata
delta=rdeltaobj(sig)
# or for xdelta style incremental delta calc from oldfile and newdata
# delta=xdeltaobj(oldfile)
incdelta=delta.calcdelta(newdata)
:
incdelta=delta.flush()
# Low level API applying incremental delta to oldfile to get newdata
patch=patchobj(oldfile)
newdata=patch.calcpatch(incdelta)
:
The rdeltaobj.flush() method supports R_SYNC_FLUSH and R_FINISH flush modes
that behave the same as their zlib equivalents. Next on the TODO list is
incremental signature calculation, and further cleanups. Eventually I plan to
create an md4sum module and move the rolling checksum stuff into C code.
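For readers unfamiliar with the zlib flush modes that R_SYNC_FLUSH and
R_FINISH are modelled on, here is a minimal sketch using Python's standard
zlib module. It shows only the zlib side of the analogy; the helper names
compress_incrementally and decompress_incrementally are made up for this
illustration and are not part of pysync.

```python
import zlib


def compress_incrementally(chunks):
    """Feed chunks to a zlib compressor, sync-flushing after each one."""
    comp = zlib.compressobj()
    pieces = []
    for chunk in chunks:
        pieces.append(comp.compress(chunk))
        # Z_SYNC_FLUSH: push out everything consumed so far so the receiver
        # can decode it immediately; the stream stays open for more input.
        pieces.append(comp.flush(zlib.Z_SYNC_FLUSH))
    # Z_FINISH: terminate the stream; no more input may follow.
    pieces.append(comp.flush(zlib.Z_FINISH))
    return pieces


def decompress_incrementally(pieces):
    """Decode the pieces one at a time, as a receiver would."""
    decomp = zlib.decompressobj()
    data = b"".join(decomp.decompress(piece) for piece in pieces)
    return data + decomp.flush()
```

The point of the sync flush is that a receiver holding only the output
produced up to that flush can already recover all the input fed so far,
which is the behaviour the delta object's flush() is described as copying.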
The performance has been marginally hurt by this new API. Interestingly, the
Python profiler shows that most of the time is wasted performing string copies
when taking slices from input buffers, not actually doing the rsync. This
suggests that significant performance increases might be achievable by
rearranging things a bit, rather than moving Python code into C.
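To illustrate the slicing cost in general terms: in current Python, slicing
a bytes object copies the data, while slicing a memoryview does not. The
sketch below is an assumption about how such copies could be avoided today,
not a description of what pysync does (memoryview did not exist when this
release was written), and iter_blocks is a hypothetical helper name.

```python
def iter_blocks(data, size):
    """Yield fixed-size windows over data without copying the bytes."""
    view = memoryview(data)
    for start in range(0, len(view), size):
        # Slicing a memoryview creates a new view onto the same buffer;
        # slicing a bytes object here would copy `size` bytes each time.
        yield view[start:start + size]
```

Each yielded block can be passed to code that accepts the buffer protocol,
and converted with bytes(block) only where a real copy is needed.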
I have also added a pysync-test.py script for thorough formal testing of
pysync. It generates/reuses random test files that make pysync really work
hard, verifying that it behaves as it should.
Incidentally, release 1.2 also fixed a rather embarrassing bug introduced in
release 0.9's adler32.py that corrupted the rolling checksums, resulting in
heaps of missed matches. This bug caused seriously bad performance and very
large deltas.
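To show why a corrupted rolling checksum leads to missed matches, here is a
minimal sketch of a rolling weak checksum in the style described for rsync
(two 16-bit running sums). It is not the code from pysync's adler32.py; the
names weak_checksum and roll and the modulus are assumptions chosen for the
illustration.

```python
MOD = 1 << 16  # assumed modulus for both running sums


def weak_checksum(block):
    """Compute the (a, b) sums for a block from scratch."""
    a = b = 0
    n = len(block)
    for i, byte in enumerate(block):
        a += byte            # plain sum of the bytes
        b += (n - i) * byte  # position-weighted sum; first byte weighs n
    return a % MOD, b % MOD


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one position: drop out_byte, add in_byte."""
    a = (a - out_byte + in_byte) % MOD
    b = (b - n * out_byte + a) % MOD
    return a, b
```

The rolled values must equal what weak_checksum would return for the new
window. If the update drifts from that, the checksums computed while
scanning the new data stop agreeing with the signature of the old file, so
identical blocks are no longer recognised and are sent as literal data.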
URL: http://freshmeat.net/projects/pysync/
Download: ftp://minkirri.apana.org.au/pub/python/pysync/pysync-1.2.tar.bz2
License: LGPL
Categories: Encryption/Encoding
Donovan Baarda (abo@minkirri.apana.org.au)
http://sourceforge.net/users/abo/
--
ABO: finger abo@minkirri.apana.org.au for more information.