BUG? sha-module returns same crc for different files

Thomas Weholt thomas at cintra.no
Sun Sep 17 03:34:40 CEST 2000

In article <3d7l8drbd7.fsf at kronos.cnri.reston.va.us>, Andrew Kuchling
<akuchlin at mems-exchange.org> wrote: 
> "Thomas Weholt" <thomas at cintra.no> writes:
>> K:\KimIglinsky-0101-1.jpg 9486845232ae19c8fc1f9dc10d65ae2f4ac4d95e 158275
>> K:\ShirleyMallman-1216-1.jpg 9486845232ae19c8fc1f9dc10d65ae2f4ac4d95e 161972
>> CRC1 == CRC2 : 1
>> Size1 == Size2: 0
>> The output clearly says the size is different, but the crc is the same.
> Fascinating.  I'll bet that the problem is that you're not opening the 
> files in binary mode, so the .read() is hitting an EOF (byte 26) early
> in both files, and this prefix is the same.  You can check this by doing
> 'data1=open(filename1).read() ; data2=...' and then comparing data1 and
> data2.  
> In that case, the fix is to use open(filename1, 'rb'). 
> --amk

Well, that didn't change much. :-<

d1 = open(filename, 'r').read()
d2 = open(filename, 'rb').read()

len(d1) == len(d2) returns true, so it looks to me as if both modes
read the same amount of data, and d1 == d2 is true too.
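
One way to check the data itself, rather than only the lengths, is to read
both files in binary mode and look for the first byte at which they differ.
A minimal sketch, assuming a modern Python; the helper name
first_difference is hypothetical and not part of any library:

```python
def first_difference(path_a, path_b):
    """Return the offset of the first differing byte, or None if identical."""
    # Binary mode, so no newline translation or early EOF handling applies.
    with open(path_a, 'rb') as fa, open(path_b, 'rb') as fb:
        data_a = fa.read()
        data_b = fb.read()
    # Walk the common prefix and report the first mismatch.
    for offset, (a, b) in enumerate(zip(data_a, data_b)):
        if a != b:
            return offset
    # Same prefix but different lengths: they differ where the shorter ends.
    if len(data_a) != len(data_b):
        return min(len(data_a), len(data_b))
    return None
```

If this returns None for two files whose sizes differ on disk, the reads
are being cut short somewhere before the hash is computed.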

If this doesn't work I'll have to find another way of telling if files 
are equal. This is pretty vital for my project and I thought sha 
should be the module for this. Does anybody have any other
tips on how this can be done, or another way of using the
sha-module to test whether two files are equal?
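
A minimal sketch of one way to do this, assuming a modern Python where
hashlib.sha1 takes the place of the old sha module; the helper names
file_sha1 and same_content are made up for illustration. Each file is
opened in binary mode and fed to the hash in chunks, and the sizes are
compared first, since files of different sizes cannot be equal:

```python
import hashlib
import os

def file_sha1(path, chunk_size=65536):
    """Return the SHA-1 hex digest of a file, read in binary mode."""
    digest = hashlib.sha1()
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """Compare two files by size first, then by SHA-1 digest."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return file_sha1(path_a) == file_sha1(path_b)
```

If only a yes/no answer is needed, the standard library's
filecmp.cmp(path_a, path_b, shallow=False) compares the file contents
directly, without hashing.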

In desperate need of tips and help. Thanks so far.

