[Tutor] hashlib weirdness
Terry Carroll
carroll at tjc.com
Mon Apr 2 19:45:38 CEST 2007
On 30 Mar 2007, Greg Perry wrote:
> Here's one that has me stumped.
>
> I am writing a forensic analysis tool that takes either a file or a
> directory as input, then calculates a hash digest based on the contents
> of each file.
>
> I have created an instance of the hashlib class:
>
> m = hashlib.md5()
>
> I then load in a file in binary mode:
>
> f = open("c:\python25\python.exe", "rb")
>
> According to the docs, the hashlib update function will update the hash
> object with the string arg. So:
>
> m.update(f.read())
> m.hexdigest()
>
> The md5 hash is not correct for the file.
Odd. It's correct for me:
In Python:
>>> import hashlib
>>> m = hashlib.md5()
>>> f = open("c:\python25\python.exe", "rb")
>>> m.update(f.read())
>>> m.hexdigest()
'7e7c8ae25d268636a3794f16c0c21d7c'
Now, check against the md5 as calculated by the md5sum utility:
>md5sum c:\Python25\python.exe
\7e7c8ae25d268636a3794f16c0c21d7c *c:\\Python25\\python.exe
> f.seek(0)
> hashlib.md5(f.read()).hexdigest()
No difference here:
>>> f.close()
>>> f = open("c:\python25\python.exe", "rb")
>>> hashlib.md5(f.read()).hexdigest()
'7e7c8ae25d268636a3794f16c0c21d7c'