binary file compare...
steven at REMOVE.THIS.cybersource.com.au
Wed Apr 15 11:03:16 CEST 2009
On Wed, 15 Apr 2009 07:54:20 +0200, Martin wrote:
>> Perhaps I'm being dim, but how else are you going to decide if two
>> files are the same unless you compare the bytes in the files?
> I'd say checksums, just about every download relies on checksums to
> verify you do have indeed the same file.
The checksum does look at every byte in each file. Checksumming isn't a
way to avoid looking at each byte of the two files, it is a way of
mapping all the bytes to a single number.
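To make that concrete, here is a minimal sketch of a file checksum. It assumes CRC-32 from the standard zlib module as the checksum; the function name and chunk size are my own choices for illustration, not anything from this thread. Note that the loop has to read the whole file before it can return anything.

```python
import zlib

def file_crc32(path, chunk_size=64 * 1024):
    """Return the CRC-32 of a file as an unsigned 32-bit integer.

    Hypothetical helper for illustration: every byte of the file is
    read and fed through the checksum before a result is available.
    """
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file reached, nothing was skipped
            crc = zlib.crc32(chunk, crc)  # fold this chunk into the running value
    return crc & 0xFFFFFFFF
```

Comparing two files this way means running that loop over both of them in full, even if they already differ in the first byte.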
>> You could hash them and compare the hashes, but that's a lot more work
>> than just comparing the two byte streams.
> hashing is not exactly much work, in its simplest form it's 2 lines per
Hashing is a *lot* more work than just comparing two bytes. The MD5
checksum has been specifically designed to be fast and compact, and the
algorithm is still complicated:
The reference implementation is here:
SHA-1 is even more complicated still:
Just because *calling* some checksum function is easy doesn't make the
checksum function itself simple. They do a LOT more work than just a
simple comparison between bytes, and that's totally unnecessary work if
you are making a one-off comparison of two local files.
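For comparison, a direct comparison of two local files might look something like the sketch below. The function name and chunk size are assumptions of mine for illustration. It does nothing per byte beyond an equality test, and it can stop at the first difference instead of reading both files to the end.

```python
import os

def files_equal(path_a, path_b, chunk_size=64 * 1024):
    """Return True if the two files have identical contents.

    Hypothetical helper for illustration: a plain chunk-by-chunk
    equality test with no hashing involved.
    """
    # Files of different sizes cannot be equal, so no bytes need reading.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                return False  # stop at the first chunk that differs
            if not a:
                return True   # both files hit end of file together
```

The standard library's filecmp.cmp(path_a, path_b, shallow=False) does a similar content comparison, so in practice you may not need to write this yourself.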