[AstroPy] super-high-speed parsing of large FITS files

Thu Mar 2 18:59:05 EST 2017

Hi Stuart,
> 
> We are choosing a file format for a high speed photometer. One proposal is a MEF file. Frame rates are kHz, and there are five detectors so each FITS file may have many thousands of HDUs.
> 
> One of our requirements is to be able to reduce the data in real time, whilst the data is being written. This means retrieving a given HDU from such a file in a msec or less. It would be OK if finding an initial HDU corresponding to a given frame number is slower than this, if we can grab *subsequent* blocks of 5 HDUs (1 per detector) in < 1 msec.
> 
> So I am wondering:
> 
>  a) is this even possible in principle, with low-level C-code?
>  b) Can it be done via some un-sanctioned use of the private functions in the astropy FITS library?
> 
If I understood correctly that you are not yet fully settled on using FITS as a format, it might be
worth looking into HDF5 instead. You may find some ideas about potential performance and optimisation
possibilities in the PyTables documentation:
http://www.pytables.org/usersguide/optimization.html

Unfortunately, not all of these (in particular the Blosc compression library, as far as I know) are available
with the h5py library used in astropy, but perhaps writing a direct interface to PyTables could still be an option.
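
To make the idea a bit more concrete, here is a minimal sketch of the kind of layout I have in mind, written with h5py and NumPy. The dataset name "frames", the 32x32 window size and the helper functions are all made up for illustration; only the five detectors come from your description. Each frame number maps to one chunk holding the block of 5 detector images, so fetching a block is a single indexed read rather than a walk through thousands of HDUs. Whether this actually gets below 1 msec on your hardware is something you would have to measure.

```python
import os
import tempfile

import h5py
import numpy as np

N_DET = 5        # detectors per frame (from the original question)
NY, NX = 32, 32  # hypothetical detector window size, for illustration only


def create_file(path):
    """Create a file with one extendable dataset, chunked one frame at a time."""
    f = h5py.File(path, "w")
    f.create_dataset(
        "frames",                        # hypothetical dataset name
        shape=(0, N_DET, NY, NX),        # starts empty
        maxshape=(None, N_DET, NY, NX),  # unlimited along the frame axis
        chunks=(1, N_DET, NY, NX),       # one chunk = one block of 5 images
        dtype="uint16",
    )
    return f


def append_frame(f, block):
    """Append one (N_DET, NY, NX) block and return its frame number."""
    dset = f["frames"]
    n = dset.shape[0]
    dset.resize(n + 1, axis=0)
    dset[n] = block
    return n


def read_frame(f, frame_no):
    """Return the block of N_DET images stored for the given frame number."""
    return f["frames"][frame_no]


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "photometer.h5")
    f = create_file(path)
    for i in range(10):
        append_frame(f, np.full((N_DET, NY, NX), i, dtype="uint16"))
    print(read_frame(f, 7).shape)
    f.close()
```

This sketch writes and reads through the same file handle. For a separate reduction process reading while the acquisition is still writing, HDF5's single-writer/multiple-reader (SWMR) mode is the mechanism intended for that, but I have not shown it here.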

Cheers,
				Derek