[Module] pysync 1.2

Donovan Baarda abo@minkirri.apana.org.au
Fri, 02 Mar 2001 12:14:10 +1100 (EST)


                               pysync 1.2
                               ----------

A Python implementation of the rsync algorithm.

This is a demonstration implementation of the rsync algorithm in Python.
It is not fast and is not optimised. The primary aim is to provide a
simple example implementation of the algorithm for reference, so code
clarity is more important than performance. Ideas have been liberally
taken from libhsync, xdelta and rsync.

Release 1.2 introduces a new zlib-like API that allows deltas to be
calculated, and patches to be applied, incrementally. The comments at the
top of pysync.py explain it all:

# Low level API signature calculation 
sig=calcsig(oldfile) 

# Low level API rsync style incremental delta calc from sig and newdata 
delta=rdeltaobj(sig) 
# or for xdelta style incremental delta calc from oldfile and newdata 
# delta=xdeltaobj(oldfile) 
incdelta=delta.calcdelta(newdata) 
: 
incdelta=delta.flush() 

# Low level API applying incremental delta to oldfile to get newdata 
patch=patchobj(oldfile) 
newdata=patch.calcpatch(incdelta) 
: 
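
For readers who have not met the signature/delta/patch pipeline before, the
following is a minimal self-contained sketch of the idea. It is NOT pysync's
code: the names toy_calcsig, toy_calcdelta and toy_patch, the tiny block size,
the instruction format, and the use of SHA-256 as the strong hash are all
assumptions made for illustration. It also recomputes the weak checksum at
every offset instead of rolling it, and works on whole buffers rather than
incrementally.

```python
import hashlib
import zlib

BLOCK = 4  # tiny block size, chosen only to keep the example readable


def toy_calcsig(old, block=BLOCK):
    """Map weak checksum -> [(strong hash, block index)] for each full block."""
    sig = {}
    for idx in range(len(old) // block):
        chunk = old[idx * block:(idx + 1) * block]
        strong = hashlib.sha256(chunk).digest()  # stand-in strong hash
        sig.setdefault(zlib.adler32(chunk), []).append((strong, idx))
    return sig


def toy_calcdelta(sig, new, block=BLOCK):
    """Return ('copy', block_index) and ('insert', bytes) instructions."""
    delta, literal, pos = [], bytearray(), 0
    while pos < len(new):
        chunk = new[pos:pos + block]
        match = None
        if len(chunk) == block:
            # Weak checksum first; confirm candidates with the strong hash.
            candidates = sig.get(zlib.adler32(chunk), ())
            if candidates:
                strong = hashlib.sha256(chunk).digest()
                for cand_hash, idx in candidates:
                    if cand_hash == strong:
                        match = idx
                        break
        if match is None:
            literal.append(new[pos])  # no match here: emit one literal byte
            pos += 1
        else:
            if literal:
                delta.append(('insert', bytes(literal)))
                literal = bytearray()
            delta.append(('copy', match))
            pos += block
    if literal:
        delta.append(('insert', bytes(literal)))
    return delta


def toy_patch(old, delta, block=BLOCK):
    """Rebuild the new data from the old data plus the delta instructions."""
    out = bytearray()
    for op, arg in delta:
        if op == 'copy':
            out += old[arg * block:(arg + 1) * block]
        else:
            out += arg
    return bytes(out)


old = b"abcdefghijkl"
new = b"abcdXXefghijkl"
delta = toy_calcdelta(toy_calcsig(old), new)
rebuilt = toy_patch(old, delta)
```

Only the two inserted bytes travel as literal data here; everything else is
expressed as references to blocks the receiver already holds.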

The rdeltaobj.flush() method supports R_SYNC_FLUSH and R_FINISH flush modes
that behave the same as their zlib equivalents. Next on the TODO list is
incremental signature calculation, and further cleanups. Eventually I plan to
create an md4sum module and move the rolling checksum stuff into C code.
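
As a reminder of what the zlib equivalents of those flush modes do, here is a
short sketch using only the standard zlib module (not pysync). Z_SYNC_FLUSH
forces out everything fed so far, so the receiving side can recover it
immediately, while Z_FINISH ends the stream. The chunk contents and variable
names are made up for the example; that R_SYNC_FLUSH and R_FINISH map onto
this pattern is taken from the description above, not verified against the
pysync source.

```python
import zlib

chunks = [b"first chunk, ", b"second chunk, ", b"third chunk"]

comp = zlib.compressobj()
decomp = zlib.decompressobj()

received = b""
progress = []  # bytes recovered by the receiver after each sync flush

for chunk in chunks:
    out = comp.compress(chunk)            # may return little or nothing yet
    out += comp.flush(zlib.Z_SYNC_FLUSH)  # force out all pending output
    received += decomp.decompress(out)
    progress.append(len(received))

# Z_FINISH terminates the stream; the compressor cannot be fed after this.
received += decomp.decompress(comp.flush(zlib.Z_FINISH))
```

After each sync flush the receiver holds exactly the data sent so far, which
is the property an incremental delta stream needs as well.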

The performance has been marginally hurt by this new API. Interestingly, the
Python profiler shows that most of the time is wasted performing string copies
when taking slices from input buffers, not actually doing the rsync. This
suggests that significant performance increases might be achievable by
rearranging things a bit, rather than moving Python code into C.
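
To illustrate the slice-copy cost mentioned above: in present-day Python,
slicing a bytes object creates a new object and copies the data, whereas
slicing a memoryview only creates a new view onto the same buffer. The sketch
below is an assumption about how the copies could be avoided, not a
description of what pysync does; the helper name windows is hypothetical.

```python
import zlib


def windows(buf, size):
    """Yield every size-byte window of buf as a view, without copying."""
    view = memoryview(buf)
    for start in range(len(buf) - size + 1):
        yield view[start:start + size]


data = bytes(range(256)) * 4

copied = data[16:48]             # new bytes object: 32 bytes are copied
viewed = memoryview(data)[16:48]  # no copy: refers to the original buffer

# Functions that accept the buffer protocol can consume views directly.
weak_from_copy = zlib.adler32(copied)
weak_from_view = zlib.adler32(viewed)

window_count = sum(1 for _ in windows(data, 32))
```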

I have also added a pysync-test.py script for thorough formal testing of 
pysync. It generates/reuses random test files that make pysync really work 
hard, verifying that it behaves as it should. 

Incidentally, release 1.2 also fixed a rather embarrassing bug introduced in
release 0.9's adler32.py that corrupted the rolling checksums, resulting in
heaps of missed matches. This bug caused seriously bad performance and very
large deltas.
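
A rolling checksum bug of this kind is easy to catch by comparing every rolled
value against a checksum computed from scratch. The sketch below rolls a
standard Adler-32 over a fixed-size window and is checked against
zlib.adler32; it assumes the plain Adler-32 definition and is not taken from
pysync's adler32.py, whose details may differ. The function names are
hypothetical.

```python
import zlib

MOD = 65521  # largest prime below 2**16, as used by Adler-32


def adler_parts(window):
    """Compute the two Adler-32 halves (a, b) of a window from scratch."""
    a, b = 1, 0
    for byte in window:
        a = (a + byte) % MOD
        b = (b + a) % MOD
    return a, b


def roll(a, b, out_byte, in_byte, size):
    """Slide the window one byte: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % MOD
    # b loses size copies of out_byte and gains the sum of the new window.
    b = (b - size * out_byte + a - 1) % MOD
    return a, b


def rolling_checksums(data, size):
    """Yield the Adler-32 of every size-byte window of data."""
    a, b = adler_parts(data[:size])
    yield (b << 16) | a
    for i in range(size, len(data)):
        a, b = roll(a, b, data[i - size], data[i], size)
        yield (b << 16) | a


data = bytes((i * 37 + 11) % 256 for i in range(200))
size = 16
rolled = list(rolling_checksums(data, size))
expected = [zlib.adler32(data[i:i + size])
            for i in range(len(data) - size + 1)]
```

If the rolled values ever drift from the recomputed ones, block matches are
silently missed, which is consistent with the symptoms described above.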

       URL:  http://freshmeat.net/projects/pysync/
  Download:  ftp://minkirri.apana.org.au/pub/python/pysync/pysync-1.2.tar.bz2

   License:  LGPL

  Categories:  Encryption/Encoding

Donovan Baarda (abo@minkirri.apana.org.au)
http://sourceforge.net/users/abo/

--
ABO: finger abo@minkirri.apana.org.au for more information.