Efficient MD5 (or similar) hashes

Erik Max Francis max at alcyone.com
Sun Dec 7 20:21:04 EST 2003


Kamus of Kadizhar wrote:

> I want to check the integrity of the files after transfer.  I can check
> the obvious - date, file size - quickly, but what if I want an MD5 hash?
>
> From reading the python docs, md5 reads the entire file as a string.
> That's not practical on a 1 GB file that's network mounted.

Python's md5 module just accepts strings through update(), and you can
call that as many times as you like; the driving code certainly doesn't
have to read the whole file in at once.  Just read it a chunk at a time:

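	import md5

	CHUNK_SIZE = 65536  # bytes per read; any reasonable size works

	# theFile is assumed to be already open in binary mode ('rb')
	hasher = md5.new()
	while True:
	    chunk = theFile.read(CHUNK_SIZE)
	    if not chunk:  # an empty read means end of file
	        break
	    hasher.update(chunk)
	theHash = hasher.hexdigest()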
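
For anyone on a current Python, where the standalone md5 module is gone,
the same chunked loop can be sketched with hashlib.  This is only a
sketch under assumptions: the function name file_md5 and the 64 KB
default chunk size are made up for illustration and are not from the
original post.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # assumed buffer size; tune for your network mount


def file_md5(path, chunk_size=CHUNK_SIZE):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    file_md5 is a hypothetical helper name, not part of any library.
    """
    hasher = hashlib.md5()
    # Binary mode so the bytes are hashed exactly as stored.
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```

Only one chunk is held in memory at a time, so the size of the file
doesn't matter.  Feeding the hasher in pieces gives the same digest as
hashing the whole content in one call.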

-- 
 __ Erik Max Francis && max at alcyone.com && http://www.alcyone.com/max/
/  \ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
\__/ Be able to be alone. Lose not the advantage of solitude.
    -- Sir Thomas Browne



