[AstroPy] DATAMD5 calculation

Erik Bray embray at stsci.edu
Tue May 24 18:08:04 EDT 2011

On 05/24/2011 03:47 PM, Ole Streicher wrote:
> Hi,
> is there a method in pyfits to calc or verify the MD5 checksum (keyword
> DATAMD5 in the primary header) for a HDUList? As far as I understand,
> the writeto() method just calcs some other checksums.
> I only found some method that calcs this for a file on the disk; however
> this uses the internal hdu._datLoc and hdu._datSpan methods, and I would
> like to apply it also to a newly created pyfits.HDUList (or to one that
> is changed after reading). As far as I understand, it just takes the
> MD5sum of all data for all HDUs from _datLoc to _datLoc+_datSpan. How
> would I do that "legally" with the different HDU types?
> Best regards
> Ole

I don't think pyfits has anything built in for handling MD5 sums.  Is 
there some particular standard this relates to, such that it would be a 
good feature to add?

At any rate, in the meantime you can use hashlib to generate a checksum 
on hdu.data--no need to use any internal attributes:

import hashlib
md5sum = hashlib.md5()
md5sum.update(hdu.data)  # hash the raw bytes of the data array
hdu.header['DATAMD5'] = md5sum.hexdigest()

My version of hashlib seems to know how to efficiently handle objects 
that implement the buffer interface (e.g. numpy arrays), so this should 
be pretty fast.
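
Since the question was about a single MD5 sum over the data of all HDUs, here is a minimal sketch of how that could look. The helper name md5_of_buffers is made up for illustration and is not part of pyfits; it only assumes that each item exposes the buffer interface and is contiguous in memory.

```python
import hashlib

def md5_of_buffers(buffers):
    """Return the hex MD5 digest of several buffer objects, hashed in order.

    Hypothetical helper, not a pyfits function. Each item must expose a
    contiguous buffer (bytes, bytearray, a C-contiguous numpy array, ...).
    """
    md5sum = hashlib.md5()
    for buf in buffers:
        # Feeding the pieces one after another gives the same digest as
        # hashing their concatenation, without building one big copy.
        md5sum.update(buf)
    return md5sum.hexdigest()
```

With an HDUList this would presumably be called as md5_of_buffers(hdu.data for hdu in hdulist if hdu.data is not None). Note that this hashes the in-memory bytes only, so the result may differ from a sum computed over the file on disk if the byte order differs or if the file-based method includes block padding.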

