sorting with expensive compares?
steve at holdenweb.com
Fri Dec 23 19:42:04 CET 2005
bonono at gmail.com wrote:
> Dan Stromberg wrote:
>>I've been using the following compare function, which in short checks, in order:
>>1) device number
>>2) inode number
>>3) file length
>>4) the beginning of the file
>>5) an md5 hash of the entire file
>>6) the entire file
> Why would #5 not be enough as an indicator that the files are identical?
Because it doesn't guarantee that the files are identical. It indicates,
to a very high degree of probability (particularly when the file lengths
are equal), that the two files are the same, but it doesn't guarantee it.
Technically there are an infinite number of inputs that can produce the
same md5 hash, since the digest is a fixed 128 bits while the set of
possible inputs is unbounded.
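
To make the point concrete, here is a rough sketch of the staged check, written as an equality test rather than the ordering compare function Dan describes. It is not his code: the names same_content and md5_of, the 4096-byte prefix, and the use of modern Python 3 with filecmp for the final step are all my assumptions. The thing to notice is that a hash mismatch proves the files differ, while a hash match only earns the pair a full byte-by-byte comparison.

```python
import filecmp
import hashlib
import os


def md5_of(path, chunk_size=1 << 16):
    """Return the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b, prefix_len=4096):
    """Hypothetical helper: True only if both files hold identical bytes."""
    stat_a, stat_b = os.stat(path_a), os.stat(path_b)

    # Steps 1-2: same device and inode means the same underlying file.
    # (Assumes the platform reports meaningful, nonzero inode numbers.)
    if stat_a.st_ino != 0 and (
        (stat_a.st_dev, stat_a.st_ino) == (stat_b.st_dev, stat_b.st_ino)
    ):
        return True

    # Step 3: files of different length can never be identical.
    if stat_a.st_size != stat_b.st_size:
        return False

    # Step 4: a cheap look at the beginning of each file.
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        if file_a.read(prefix_len) != file_b.read(prefix_len):
            return False

    # Step 5: differing hashes prove the contents differ.
    if md5_of(path_a) != md5_of(path_b):
        return False

    # Step 6: equal hashes are only strong evidence, so compare every byte.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Dropping step 6 would make this faster, but the answer would then be "almost certainly identical" rather than "identical", which is exactly the distinction above.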
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/