binary file comparison with the md5 module

Christian Reyes christian at
Wed Jun 13 20:22:01 CEST 2001

After some more research I have discovered the very handy "filecmp" module.
Problem solved.
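
A minimal sketch of the filecmp approach, assuming a content comparison is wanted rather than a quick check of file metadata. The names files_match and write_temp are hypothetical helpers made up for this illustration; only filecmp.cmp comes from the standard library.

```python
import filecmp
import os
import tempfile


def files_match(path_a, path_b):
    # shallow=False makes filecmp compare the actual file contents
    # instead of returning True as soon as the os.stat() signatures
    # (file type, size, modification time) are equal.
    return filecmp.cmp(path_a, path_b, shallow=False)


def write_temp(data):
    # Hypothetical demo helper: write bytes to a temp file, return its path.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


a = write_temp(b"RIFF\x1a\x00\x01")
b = write_temp(b"RIFF\x1a\x00\x01")
c = write_temp(b"RIFF\x1a\x00\x02")
print(files_match(a, b))  # True: identical bytes
print(files_match(a, c))  # False: last byte differs
for p in (a, b, c):
    os.remove(p)
```

With the default shallow=True, two files whose stat signatures happen to match are reported equal without being read, so shallow=False is the safer choice when the data itself has to match.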

"Christian Reyes" <christian at> wrote in message
news:9g8ahr$s6t$1 at
> I'm trying to write a script that takes two binary files and returns whether
> or not their data matches completely.
> One of my peers suggested that an efficient way to do this would be to run
> the md5 algorithm on each file and then compare the resultant output.
> md5 returns a unique 128-bit checksum of its input, so this should
> theoretically work.
> The problem I'm having is with reading the binary file in as a string.
> I tried opening the file with the built-in Python open command, and then
> reading the contents of the file into a buffer.  But I think my problem is
> that when I read the binary file into a buffer, the contents get tweaked
> somehow.  I would expect the print statement to give me some huge string of
> gibberish, but instead what I get is 'RIFFnap', regardless of the size of the
> file.  I'll try to read in a 5 meg file and all I get when I try to print
> the buffer is some variation of 'RIFFxxx' (where xxx is any arbitrary set of
> 3 characters).
> >>> x = open('d:\\binary.wav')
> >>> buf =
> >>> print buf
> 'RIFFnap'
> Anyway, if any of you have a better suggestion for me, I'd really appreciate
> it.
> Basically all I'm looking for is an efficient method of comparing binary
> data files.
> Thanks for your time,
> christian
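
On the truncated read in the quoted message: a likely cause (an assumption, since the full session is not shown) is that the file was opened without the 'b' flag. On Windows, text mode treats a 0x1A byte as end of file, which would explain a read stopping a few bytes into a WAV header. Below is a minimal sketch of the md5 idea with the file opened in binary mode and read in chunks. It uses hashlib.md5 from current Python in place of the old md5 module, and md5_of_file and same_md5 are hypothetical names.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=64 * 1024):
    # "rb" is the important part: binary mode returns the bytes unmodified.
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large file is never held in memory whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_md5(path_a, path_b):
    # Equal digests make equal contents very likely, not certain.
    return md5_of_file(path_a) == md5_of_file(path_b)


# Demo: a file containing a 0x1A byte is read in full in binary mode.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"RIFFnap\x1a" + b"\x00" * 1000)
with open(path, "rb") as f:
    print(len(f.read()))  # 1008
os.remove(path)
```

An MD5 digest is 128 bits, so distinct files can collide. For a plain equality check, a direct byte-by-byte comparison reads no more data than hashing both files and can stop at the first difference.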
