md5 and large files

Nelson Minar nelson at
Mon Oct 18 03:07:25 CEST 2004

Brad Tilley <rtilley at> writes:
> I would like to verify that the files are not corrupt so what's the
> most efficient way to calculate md5 sums on 4GB files? The machine
> doing the calculations is a small desktop with 256MB of RAM.

If all you want to do is verify that a file is not corrupt, MD5 is the
wrong algorithm to use. Use something fast like crc32.
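
As a rough sketch of the crc32 route in Python, assuming a modern Python 3
and the standard zlib module: read the file in fixed-size chunks and feed
each one into a running CRC, so memory use stays at one chunk no matter how
big the file is. The name file_crc32 and the 1 MB chunk size are my own
choices for illustration, not anything from the original question.

```python
import zlib

def file_crc32(path, chunk_size=1024 * 1024):
    """Return the CRC-32 of a file, reading it in fixed-size chunks."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Pass the running value back in so the CRC covers the whole file.
            crc = zlib.crc32(chunk, crc)
    # Mask to an unsigned 32-bit value for a consistent result.
    return crc & 0xFFFFFFFF
```

Compare the value computed on the copy with the one computed on the
original; a mismatch means the copy is corrupt.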

If you're worried about corruption anywhere in the file, then testing
the first 4k isn't going to help you very much.
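
If you do stay with MD5, the same chunked pattern covers the whole file
rather than just a prefix. This is only a sketch, assuming Python 3 and the
standard hashlib module; file_md5 and the chunk size are hypothetical names
and values picked for the example.

```python
import hashlib

def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a whole file, read chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files that agree in their first 4k but differ later will produce
different digests here, which is what a prefix-only check would miss.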

If you really need it to be efficient, don't use Python. Use a native
program like md5sum or sum or something.

If this is your homework, you'll learn a lot more by figuring it out
yourself.