Does hashlib support a file mode?

Chris Rebert clp2 at
Wed Jul 6 08:44:28 CEST 2011

On Tue, Jul 5, 2011 at 10:54 PM, Phlip <phlip2005 at> wrote:
> Pythonistas:
> Consider this hashing code:
>  import hashlib
>  file = open(path)
>  m = hashlib.md5()
>  m.update(file.read())
>  digest = m.hexdigest()
>  file.close()
> If the file were huge, the file.read() would allocate a big string and
> thrash memory. (Yes, in 2011 that's still a problem, because these
> files could be movies and whatnot.)
> So if I do the stream trick - read one byte, update one byte, in a
> loop, then I'm essentially dragging that movie thru 8 bits of a 64 bit
> CPU. So that's the same problem; it would still be slow.
> So now I try this:
>  sum = os.popen('sha256sum %r' % path).read()
> Those of you who like to lie awake at night thinking of new ways to
> flame abusers of 'eval()' may have a good vent, there.

Indeed (*eyelid twitch*). That one-liner is arguably better written as:
sum = subprocess.check_output(['sha256sum', path])
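
Note that check_output() hands back the tool's whole output line (the digest followed by the file name) as bytes, so you would still need to pick the digest out of it. A rough sketch, assuming a GNU-style sha256sum is on the PATH; the helper name sha256sum_external is made up for illustration:

```python
import subprocess


def sha256sum_external(path):
    """Return the SHA-256 hex digest of `path` as reported by the sha256sum tool.

    Hypothetical helper; assumes a GNU-style `sha256sum` executable is on PATH.
    """
    # Passing a list (no shell) means the path is never parsed by a shell,
    # so no quoting tricks such as '%r' are needed.
    output = subprocess.check_output(["sha256sum", path])
    # The output looks like b"<hex digest>  <file name>\n"; keep the first field.
    return output.split()[0].decode("ascii")
```

This still spawns a process per file and ties you to an external tool, so it is a fix for the quoting problem rather than a general recommendation.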

> Does hashlib have a file-ready mode, to hide the streaming inside some
> clever DMA operations?

Barring undocumented voodoo, no, it doesn't appear to. You could
always read from the file in suitably large chunks instead (rather
than byte-by-byte, which is indeed ridiculous); see
io.DEFAULT_BUFFER_SIZE and/or the os.stat() trick referenced therein
and/or the block_size attribute of hash objects.
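
Something along these lines should do, as a minimal sketch rather than a tuned implementation. The helper name hash_file and the default chunk size (a multiple of io.DEFAULT_BUFFER_SIZE) are my own arbitrary choices, not anything hashlib prescribes:

```python
import hashlib
import io


def hash_file(path, algorithm="md5", chunk_size=io.DEFAULT_BUFFER_SIZE * 16):
    """Return the hex digest of the file at `path`, reading it in chunks.

    Hypothetical helper; `chunk_size` is an arbitrary default, tune as needed.
    """
    h = hashlib.new(algorithm)
    # Binary mode, so the hash sees the raw bytes of the file.
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read(chunk_size)
        # until it returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Only one chunk is held in memory at a time, and feeding the hash object piecewise via update() gives the same digest as hashing the whole content in one call.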

