Erik Bray embray at stsci.edu
Tue May 24 18:08:04 EDT 2011

On 05/24/2011 03:47 PM, Ole Streicher wrote:
> Hi,
> is there a method in pyfits to calc or verify the MD5 checksum (keyword
> DATAMD5 in the primary header) for a HDUList? As far as I understand,
> the writeto() method just calcs some other checksums.
> I only found some method that calcs this for a file on the disk; however
> this uses the internal hdu._datLoc and hdu._datSpan methods, and I would
> like to apply it also to a newly created pyfits.HDUList (or to one that
> is changed after reading). As far as I understand, it just takes the
> MD5sum of all data for all HDUs from _datLoc to _datLoc+_datSpan. How
> would I do that "legally" with the different HDU types?
> Best regards
> Ole

I don't think pyfits has anything built in for handling MD5 sums.  Is 
there some particular standard this relates to, such that it would be a 
good feature to add?

At any rate, in the meantime you can use hashlib to generate a checksum 
on hdu.data--no need to use any internal attributes:

import hashlib
# Hash the array's bytes directly through the buffer interface
md5sum = hashlib.md5(hdu.data)
hdu.header['DATAMD5'] = md5sum.hexdigest()

My version of hashlib seems to handle objects that implement the buffer 
interface (e.g. numpy arrays) efficiently, so this should be pretty 
fast.
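
To cover every HDU in an HDUList rather than a single one, the same idea
can be extended by feeding each data buffer into one running hash. This is
only a rough sketch: data_md5 is a hypothetical helper, not part of pyfits,
and it assumes each item is either None (an HDU with no data) or a
contiguous buffer-like object. It hashes the in-memory bytes, so the result
will not necessarily match a checksum computed over the bytes in a file on
disk, where FITS data is stored big-endian and padded to whole blocks.

```python
import hashlib


def data_md5(buffers):
    """Return the MD5 hex digest over a sequence of buffer-like objects.

    Hypothetical helper: entries that are None are skipped, all other
    entries are fed to a single hash in the order given.
    """
    md5 = hashlib.md5()
    for buf in buffers:
        if buf is None:
            # e.g. a header-only HDU
            continue
        # update() accepts any contiguous object exposing the buffer
        # interface; a non-contiguous numpy array would need
        # numpy.ascontiguousarray(buf) first.
        md5.update(buf)
    return md5.hexdigest()
```

Assuming hdulist is an open HDUList, this could be called as
data_md5(hdu.data for hdu in hdulist).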


