[Tutor] hashlib weirdness

Greg Perry gregp at liveammo.com
Sat Mar 31 02:17:00 CEST 2007


Here's one that has me stumped.

I am writing a forensic analysis tool that takes either a file or a directory as input, then calculates a hash digest based on the contents of each file.
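
For reference, a minimal sketch of the file-or-directory step described above. The names md5_of_file and md5_of_path and the 64 KiB chunk size are hypothetical choices for illustration, not taken from the tool itself. Reading in chunks is only one option; it keeps memory use flat for large files.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of one file, read in fixed-size chunks."""
    m = hashlib.md5()  # a fresh hash object for every file
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes means end of file
                break
            m.update(chunk)
    return m.hexdigest()


def md5_of_path(path):
    """Map every file at or under path (a file or a directory) to its digest."""
    if os.path.isfile(path):
        return {path: md5_of_file(path)}
    digests = {}
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            digests[full] = md5_of_file(full)
    return digests
```

Because each call to md5_of_file builds its own hash object, no state carries over from one file to the next.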

I have created an md5 hash object using the hashlib module:

m = hashlib.md5()

I then load in a file in binary mode:

f = open(r"c:\python25\python.exe", "rb")

According to the docs, the hash object's update() method feeds the string argument into the hash. So:

m.update(f.read())
m.hexdigest()

The md5 hash is not correct for the file.

However, this works:

f.seek(0)
hashlib.md5(f.read()).hexdigest()

Why the difference, I wonder?
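
One way to narrow this down is a small experiment. It rests on assumptions that the snippets above do not confirm: that the object m may already have been given data, or that f may already have been read to the end (the f.seek(0) in the working version hints at the latter). The byte strings and variable names are made up for illustration.

```python
import hashlib
import io

data = b"example file contents"  # stand-in for the real file's bytes

# One-shot form: the constructor hashes exactly the data it is given.
one_shot = hashlib.md5(data).hexdigest()

# Incremental form on a brand-new object gives the same digest.
fresh = hashlib.md5()
fresh.update(data)
same = fresh.hexdigest() == one_shot

# update() is cumulative: a reused object hashes its earlier input too,
# so its digest covers b"earlier input" + data rather than data alone.
reused = hashlib.md5()
reused.update(b"earlier input")
reused.update(data)
differs = reused.hexdigest() != one_shot
concatenated = reused.hexdigest() == hashlib.md5(b"earlier input" + data).hexdigest()

# A file object that has already been read returns nothing on the next
# read() until it is rewound with seek(0).
f = io.BytesIO(data)  # in-memory stand-in for the opened file
f.read()
leftover = f.read()   # empty: the position is at end of file
f.seek(0)
rewound = f.read()    # the full contents again
```

If either assumption holds in the real session, the update() version would have hashed something other than the file's full contents, which would explain the mismatch.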

Thanks in advance.



