md5 and large files
jepler at unpythonic.net
Sun Oct 17 18:55:06 CEST 2004
It seems likely that 2 files would have the same 4k "preamble".
For instance, a unix tar file containing a 16k "file1" and then a 1k
"file2" would have the same leading bytes as a unix tar file containing
a 16k "file1" and a 1k "file3", and therefore the md5sum over the first
4k would match. (these two tar files would also have the same byte
If all pages on some website begin
the initial 4k might match, too.
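
The tar case above can be checked with a short sketch. The file names follow the example in the text; the file contents, the make_tar helper and the 4096-byte cutoff are assumptions made for illustration. It builds both archives in memory and compares the md5 of the first 4k with the md5 of the whole archive.

```python
import hashlib
import io
import tarfile


def make_tar(members):
    """Build an in-memory tar archive from (name, data) pairs (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name)  # mtime, uid and gid default to 0
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# Made-up contents: a shared 16k "file1", then a different 1k second member.
file1 = b"a" * 16 * 1024
tar_a = make_tar([("file1", file1), ("file2", b"b" * 1024)])
tar_b = make_tar([("file1", file1), ("file3", b"c" * 1024)])

# Hash only the leading 4k, then the whole archive.
prefix_a = hashlib.md5(tar_a[:4096]).hexdigest()
prefix_b = hashlib.md5(tar_b[:4096]).hexdigest()
full_a = hashlib.md5(tar_a).hexdigest()
full_b = hashlib.md5(tar_b).hexdigest()

print("first 4k digests equal:", prefix_a == prefix_b)
print("full digests equal:    ", full_a == full_b)
```

The first 4k covers only the header and the start of "file1", so the prefix digests agree even though the archives differ further in.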
But anyway, if s1 != s2, then the odds that hash(s1) == hash(s2) should
be small, and that shouldn't depend on the length of the strings.
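
As a rough illustration of that last point, here is a sketch that hashes pairs of strings differing only in their final byte, at a few lengths. The lengths and the digests_differ helper are assumptions chosen for the example, not anything from the original thread. For ordinary, non-adversarial inputs like these, the digests should differ at every length.

```python
import hashlib


def digests_differ(length):
    """Return True if two strings of the given length, differing only in
    their last byte, produce different md5 digests (hypothetical helper)."""
    s1 = b"x" * length
    s2 = b"x" * (length - 1) + b"y"
    return hashlib.md5(s1).digest() != hashlib.md5(s2).digest()


# Arbitrary sample lengths: one byte, 4k, and 1 MiB.
results = {n: digests_differ(n) for n in (1, 4096, 1 << 20)}
for n, differ in results.items():
    print(n, "bytes: digests differ ->", differ)
```

This only samples a handful of pairs, so it illustrates the claim rather than proving it.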