[AstroPy] super-high-speed parsing of large FITS files

Paul Hirst phirst at gemini.edu
Thu Mar 2 18:36:17 EST 2017


Hi Stuart,

A couple of quick thoughts:

I'd consider storing the data as a cube per detector with time as the third
axis, so one FITS extension per detector rather than one per 2D frame. Then
you can just mmap the cube (or potentially chunks of it in the time axis)
and I think that would be about as efficient as it gets.
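
A minimal sketch of that layout, assuming astropy and numpy are available.
The detector count, cube dimensions, file name and the helper names
write_cubes and read_frame are all made up for illustration, and the cubes
are written in one go rather than as the data arrive. With time as the
third FITS axis, numpy sees it as axis 0, so each 2D frame is one
contiguous slice of the memory-mapped cube:

```python
import os
import tempfile

import numpy as np
from astropy.io import fits

# Illustrative sizes only; the real detector geometry is not known here.
N_DET, N_FRAMES, NY, NX = 5, 100, 32, 32


def write_cubes(path):
    """Write one image extension per detector, each a (time, y, x) cube."""
    hdus = [fits.PrimaryHDU()]
    for det in range(N_DET):
        cube = np.zeros((N_FRAMES, NY, NX), dtype=np.float32)
        # Tag every pixel with its frame number plus a per-detector offset.
        cube += np.arange(N_FRAMES, dtype=np.float32)[:, None, None] + 1000 * det
        hdus.append(fits.ImageHDU(data=cube, name=f"DET{det}"))
    fits.HDUList(hdus).writeto(path, overwrite=True)


def read_frame(hdul, k):
    """Return frame k from every detector as slices of the mmapped cubes."""
    return [hdul[det + 1].data[k] for det in range(N_DET)]


path = os.path.join(tempfile.mkdtemp(), "cubes.fits")
write_cubes(path)

with fits.open(path, memmap=True) as hdul:
    # Copy the slices so they remain valid after the file is closed.
    frames = [np.array(f) for f in read_frame(hdul, 42)]
```

Only the pages backing the requested frames should need to be read from
disk, which is the point of mapping the cube rather than loading it.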

The disadvantage is that you can't simply append data to the file as it
comes in: you'd need to know a priori how much data was going to arrive so
that you could allocate an appropriately sized block of data for each
extension. Alternatively, you could go with a separate file for each
detector, which seems less elegant but solves that problem.
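
To illustrate the pre-allocation idea, here is a rough sketch for the
one-file-per-detector case, using only numpy and the standard library. It
assumes 16-bit signed pixels and a frame count known in advance; the
helper names card and preallocate, the file name and the sizes are
hypothetical. The header is hand-built only to keep the example
self-contained (in practice you would let a FITS library generate it).
The file is created at its final size, then a writer fills frames in place
through one memory map while a reader uses a second map on the same file:

```python
import os
import tempfile

import numpy as np

BLOCK = 2880  # FITS headers and data are padded to 2880-byte blocks


def card(key, value):
    """Format one 80-character fixed-format header card."""
    return f"{key:<8}= {value:>20}".ljust(80)


def preallocate(path, n_frames, ny, nx):
    """Create a FITS file holding an all-zero (n_frames, ny, nx) int16 cube.

    Returns the byte offset at which the data area starts.
    """
    cards = [
        card("SIMPLE", "T"),
        card("BITPIX", 16),
        card("NAXIS", 3),
        card("NAXIS1", nx),
        card("NAXIS2", ny),
        card("NAXIS3", n_frames),  # time is the third (slowest) FITS axis
        "END".ljust(80),
    ]
    header = "".join(cards).encode("ascii")
    header += b" " * (-len(header) % BLOCK)
    data_bytes = n_frames * ny * nx * 2
    total = len(header) + data_bytes + (-data_bytes % BLOCK)
    with open(path, "wb") as f:
        f.write(header)
        # Extend the file to its final size by writing the last byte.
        f.seek(total - 1)
        f.write(b"\0")
    return len(header)


shape = (1000, 16, 16)  # illustrative: 1000 frames of 16x16 pixels
path = os.path.join(tempfile.mkdtemp(), "det0.fits")
offset = preallocate(path, *shape)

# Writer side: fill frame 7 in place. FITS integers are big-endian.
writer = np.memmap(path, dtype=">i2", mode="r+", offset=offset, shape=shape)
writer[7] = 123
writer.flush()

# Reader side: a separate read-only map over the same file.
reader = np.memmap(path, dtype=">i2", mode="r", offset=offset, shape=shape)
frame7 = np.array(reader[7])
frame8 = np.array(reader[8])  # not yet written, so still zeros
```

The reader has no way to tell an unwritten frame from a genuinely zero
one, so some out-of-band signal for the last frame written would be needed.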

Just a sanity check here: I don't know how many pixels you have, but it
sounds like the I/O and storage hardware would need some thought,
especially if you want to read the data back to a data processing machine
while it is being written. If you process it on the same machine that is
writing it, you can probably arrange for the kernel filesystem cache to
help out significantly with that, especially if you mmap the data.

Regards,
 Paul


On Thu, Mar 2, 2017 at 1:01 PM, Stuart P Littlefair <
s.littlefair at sheffield.ac.uk> wrote:

> Perhaps not the right place to ask this, but the official FITS mailing
> lists are overrun with spam.
>
> We are choosing a file format for a high speed photometer. One proposal is
> a MEF file. Frame rates are kHz, and there are five detectors so each FITS
> file may have many thousands of HDUs.
>
> One of our requirements is to be able to reduce the data in real time,
> whilst the data is being written. This means retrieving a given HDU from
> such a file in a msec or less. It would be OK if finding an initial HDU
> corresponding to a given frame number is slower than this, if we can grab
> *subsequent* blocks of 5 HDUs (1 per detector) in < 1 msec.
>
> So I am wondering:
>
>  a) Is this even possible in principle, with low-level C code?
>  b) Can it be done via some unsanctioned use of the private functions in
> the astropy FITS library?
>
> Many thanks in advance to anyone who can help
>
> --
> Stuart Littlefair
>
> -------------------------------------------------------
> Dept. of Physics & Astronomy,
> Univ. of Sheffield, Sheffield, S3 7RH.
>
> email: S.Littlefair at sheffield.ac.uk
> phone: +44 114 2224525
>