[Tutor] ftp synch

Danny Yoo dyoo@hkn.eecs.berkeley.edu
Fri, 12 Apr 2002 13:46:32 -0700 (PDT)


On Fri, 12 Apr 2002, Carlo Bifulco wrote:

> Hi folks,
> first of all thanks for the nice tutoring. Just hanging around has been very
> instructive.
> I have a question concerning the ftplib python module.
> I wrote a small application which synchronizes a local and a remote
> directory through FTP (basically it uploads files absent on the server dir
> but present in the local directory and downloads files present in the remote
> directory and absent in the local dir). Unfortunately I haven't found a good
> way to distinguish files by looking at the modification time. I would like
> both directories to share the latest version of each file. I'm having a
> problem with the fact that the 2 machines have different system clocks. Is
> there an easy way to solve this?
> Thanks for any help,
> Carlo Bifulco


Would it be possible to have a "checksum" file on your FTP server?
Python comes with a nice 'md5' module that you can use to quickly compute
a short signature of a file's contents.  For example:

###
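>>> import md5
>>> def digestFile(f):
...     m = md5.new()
...     while 1:
...         chunk = f.read(1024)     # read 1 KB at a time to keep memory use small
...         if not chunk: break      # an empty read means end of file
...         m.update(chunk)
...     return m.hexdigest()
...
>>> myfile = open("/usr/share/dict/words", "rb")   # binary mode: hash the raw bytes
>>> digestFile(myfile)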
'c649076690e43c051de2fd58f91eac3d'
###

This 'digest' is a signature of the contents of a file, and will change
radically if even a single byte of the file is different.
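
To get a feel for how sensitive the digest is, here is a small sketch.  It
uses hashlib, which took over from the standalone 'md5' module in later
Python releases.  The helper name digest_bytes and the two sample strings are
made up for illustration.

```python
import hashlib

def digest_bytes(data):
    """Return the hex MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()

original = b"hello world\n"
edited = b"hello World\n"   # only one character differs

d1 = digest_bytes(original)
d2 = digest_bytes(edited)

# Count how many of the 32 hex digits happen to agree position by position.
same = sum(1 for a, b in zip(d1, d2) if a == b)

print(d1)
print(d2)
print("matching hex digits:", same, "of", len(d1))
```

The two digests should have very little in common, even though the inputs
are nearly identical.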


You can set up a scheduled cron job on your FTP server that will
periodically update the signatures of your files into a separate signature
file.  These signatures are very small, so they should be very easy to
download.  Later, when you're updating a file, you can see if the
signatures of your local file and the server file match up.  If their
signatures do match, it's very likely that nothing's changed.
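
Here is one rough way the whole scheme could fit together; treat it as a
sketch and not a finished tool.  The signature file name "SIGNATURES", the
"digest  name" line format, and all of the function names are assumptions
made up for this example.  write_signatures() is what the cron job on the
server might run, fetch_signatures() expects an already logged-in
ftplib.FTP object, and changed_files() does the comparison on your side.

```python
import hashlib
import os

def digest_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at path, read in chunks."""
    m = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            m.update(chunk)
    return m.hexdigest()

def write_signatures(directory, sig_path):
    """Server side: write one 'digest  name' line per regular file."""
    sig_abs = os.path.abspath(sig_path)
    with open(sig_path, "w") as out:
        for name in sorted(os.listdir(directory)):
            full = os.path.join(directory, name)
            # Skip subdirectories and the signature file itself.
            if os.path.isfile(full) and os.path.abspath(full) != sig_abs:
                out.write("%s  %s\n" % (digest_file(full), name))

def parse_signatures(lines):
    """Turn 'digest  name' lines into a {name: digest} dictionary."""
    sigs = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if line:
            digest, name = line.split("  ", 1)
            sigs[name] = digest
    return sigs

def fetch_signatures(ftp, remote_name="SIGNATURES"):
    """Client side: download the signature file over an open FTP session."""
    lines = []
    ftp.retrlines("RETR " + remote_name, lines.append)
    return parse_signatures(lines)

def changed_files(directory, remote_sigs):
    """Return names present on both sides whose contents differ."""
    changed = []
    for name, remote_digest in sorted(remote_sigs.items()):
        local = os.path.join(directory, name)
        if os.path.isfile(local) and digest_file(local) != remote_digest:
            changed.append(name)
    return changed
```

Keep in mind that a mismatch only tells you the two copies differ.  It does
not say which copy is newer, so your program still has to decide which one
wins.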


For more information on these string-signature "hash" functions, you can
see:

    http://www.python.org/doc/lib/module-md5.html
    http://www.python.org/doc/lib/module-sha.html
    http://www.python.org/doc/lib/module-hmac.html


Good luck to you!