[AstroPy] writing FITS files fast

Ivan Zolotukhin ivan.zolotukhin at gmail.com
Sun Apr 19 17:40:16 EDT 2015


Hi,

I need to write a FITS file from a web application in Python, and I
need to do it fast. When I write the file using the astropy.io.fits
module (v0.3.x), it takes several seconds for a table with 300+ columns
and only 20 rows. Here's the hotshot profiler output of the relevant
piece of code:

         8506345 function calls (8480504 primitive calls) in 18.801 seconds

   Ordered by: internal time, call count
   List reduced from 2343 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    10635    5.409    0.001   13.575    0.001 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:953(__getattr__)
  3660493    4.787    0.000    8.172    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:996(__getitem__)
  3650946    3.386    0.000    3.386    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/util.py:589(_is_int)
     2099    0.393    0.000    0.393    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/header.py:1779(_updateindices)
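
For anyone who wants to produce a comparable listing, here is a minimal sketch using the standard-library cProfile and pstats modules instead of hotshot (which exists only in Python 2). The build_rows function is a hypothetical stand-in for the real table-writing code, and profile_call is a helper name invented for this example.

```python
import cProfile
import io
import pstats


def build_rows(n_rows=20, n_cols=300):
    """Hypothetical stand-in workload: a small, wide table as nested lists."""
    return [[float(r * c) for c in range(n_cols)] for r in range(n_rows)]


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by internal time, then call count; list at most 20 entries.
    stats.sort_stats("time", "calls").print_stats(20)
    return result, out.getvalue()


rows, report = profile_call(build_rows)
print(report)
```

Replacing build_rows with the function that builds and writes the table should give a report sorted the same way as the listings in this message.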

With the more recent astropy version 1.0.2 the situation is somewhat
better, but it is still significantly slower than I need:

         1153615 function calls (1138771 primitive calls) in 4.364 seconds

   Ordered by: internal time, call count
   List reduced from 736 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2099    0.419    0.000    0.425    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/header.py:1808(_updateindices)
      346    0.312    0.001    0.852    0.002 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:1245(__getattr__)
   124176    0.275    0.000    0.406    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:1285(__getitem__)
      327    0.215    0.001    0.222    0.001 /usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py:58(execute)
      357    0.201    0.001    0.293    0.001 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/header.py:1399(index)
   137557    0.183    0.000    0.183    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/column.py:421(__get__)
   124226    0.132    0.000    0.132    0.000 /usr/local/lib/python2.7/dist-packages/astropy/io/fits/util.py:906(_is_int)

My code is very simple, however: it just creates the necessary set of
columns (astropy.table.MaskedColumn objects), casting data from Python
lists coming from the database to numpy arrays. Are there ways to skip
checks like _is_int() when the input array datatype has already been
enforced?

When the code is rewritten to use the fitsio module
(https://pypi.python.org/pypi/fitsio/), the execution time is
significantly better:

         129409 function calls (128693 primitive calls) in 0.811 seconds

   Ordered by: internal time, call count
   List reduced from 289 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      327    0.218    0.001    0.221    0.001 /usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py:58(execute)
        4    0.083    0.021    0.083    0.021 /usr/local/lib/python2.7/dist-packages/fitsio/fitslib.py:1311(_update_info)

...but fitsio does not seem to support 64-bit integers, which is
another requirement I have.

What's the fastest solution for writing FITS files that's available on
the Python market? Am I missing something in the astropy
configuration?

--
With best regards,
 Ivan


