[AstroPy] question about fpack and astropy.io.fits

Erik Bray embray at stsci.edu
Fri Oct 10 10:46:50 EDT 2014


On 10/09/2014 08:01 PM, Octavi Fors wrote:
> Thanks a lot Erik for your detailed reply.
>
> Your answer allows me to focus my questions/concerns about astropy.io.fits and
> fpack <http://heasarc.nasa.gov/fitsio/fpack/> in my initial message:
>
>     In other words, the convention states how compressed FITS files should be
>     read and written, while fpack is simply a program that knows how to write
>     files with that convention.  There's no sense (I can think of) in which it
>     makes sense for a general purpose FITS library (like astropy.io.fits) to
>     support fpack itself. It does however support the same compression scheme
>     and can read files compressed by fpack.  And likewise can write compressed
>     files that can be read by funpack.
>
>
> 1-by "it supports the same compression scheme" do you mean that astropy.io.fits
> reads/writes tile compressed FITS images in the same *exact* way (in terms of
> compression ratio and checksum file) as fpack does from command line (of course
> using the same set of parameters)?

Yes, for the most part they are using the same code.  The checksum should be the 
same for the data itself, but may not be identical for the whole file due to 
possible minor differences in the header, such as the order in which keywords 
are written, but those are trivial differences.  If using the FITS checksum 
convention this means that the CHECKSUM *might* not be identical, but the 
DATASUM should be.

Another gotcha to be aware of when comparing files is that when using
subtractive dithering a random seed is used by default to initialize the
random number generator.  To avoid that for testing purposes, make sure to pass
the dither_seed=-1 argument into the CompImageHDU initializer (this is covered
in more detail in the API docs I linked to).  This uses a checksum of the tile
data to determine the random seed, and so will always be the same for the same
file.

> 2-my other concern is speed when reading/writing tile-compressed FITS images.
> Is astropy.io.fits performance comparable to that of the fpack command line?
> Our example is 6576x4384pix images, with a tile size of 100x100pix.

That would be something for you to determine in your particular application.
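
A rough way to measure it is to time both paths on a representative file.  The
sketch below is a generic timing helper in plain Python; best_of is a
hypothetical name, and the workload shown is only a stand-in.  In practice you
would pass a callable that reads your file with astropy.io.fits (for example
fits.getdata on a placeholder path such as "image.fits.fz") and another that
runs funpack through subprocess.

```python
import time


def best_of(func, repeats=5):
    """Return the shortest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def stand_in_workload():
    """Placeholder for the real read; replace with the operation to measure."""
    return sum(i * i for i in range(100_000))


elapsed = best_of(stand_in_workload, repeats=3)
print(f"best of 3: {elapsed:.4f} s")
```

Taking the minimum over a few repeats reduces the influence of disk caching
and other transient load, though the first (cold-cache) read may be the number
that matters for a pipeline touching each file once.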

> 3-one of our pipeline's most I/O-intensive operations is to extract small
> (500x500pix) subimages from the big ones (6576x4384pix) for ~10^3 targets
> (RA,DEC) x ~10^6 exposures = ~10^9 requests.
>
> In the fpack website <http://heasarc.nasa.gov/fitsio/fpack/> I've seen these two
> features:
>
>     -Each HDU of a multi-extension FITS file is compressed separately, so it is
>     not necessary to uncompress the entire file to read a single image in a
>     multi-extension file.
>
>
>     -Dividing the image into tiles before compression enables faster access to
>     small subsections of the image.
>
>
> I know it's a basic question, but...: are the compressed tiles stored into
> separate HDUs?
>
> In other words, if my big (6576x4384pix) images are tile-compressed with fpack
> (or with astropy.io.fits if possible) and properly multi-extension HDU arranged,
> could astropy.io.fits (or pyfits) support the first of the two fpack features
> mentioned above?

These aren't features of fpack so much as just how the Tiled Image convention 
works.  I strongly recommend reading the documentation for the convention if 
doing any significant work with it (even if just using fpack to do the work):

http://fits.gsfc.nasa.gov/registry/tilecompression/tilecompression2.3.pdf

> That would really improve the reading performance for the subimage extraction process.

One thing that astropy.io.fits/PyFITS does not currently support is
decompressing one tile at a time when reading slices of an image.  Instead it
decompresses the entire image at once (partly defeating the benefit of
tiling, though tiling still helps things at the CFITSIO level).
That's on the TODO list, but I haven't had time to work on that feature
myself.  Patches are welcome.
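
To give a feel for what per-tile decompression could save with the numbers
quoted above, here is a small sketch in plain Python.  It is not astropy.io.fits
code: tiles_for_cutout is a made-up helper, and the cutout position is an
arbitrary example.  It only counts which 100x100 tiles a 500x500 cutout
overlaps in a 6576x4384 image.

```python
import math


def tiles_for_cutout(x0, y0, width, height, tile_w, tile_h):
    """Return (column, row) indices of the tiles overlapped by a cutout.

    Pixel coordinates are 0-based; the cutout covers
    [x0, x0 + width) by [y0, y0 + height).
    """
    first_col = x0 // tile_w
    last_col = (x0 + width - 1) // tile_w
    first_row = y0 // tile_h
    last_row = (y0 + height - 1) // tile_h
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


image_w, image_h = 6576, 4384
tile = 100

# Partial tiles at the image edges still count as tiles.
total_tiles = math.ceil(image_w / tile) * math.ceil(image_h / tile)

# Arbitrary, unaligned cutout position chosen for illustration.
needed = tiles_for_cutout(1234, 2345, 500, 500, tile, tile)

print(total_tiles)   # 2904
print(len(needed))   # 36
```

An unaligned 500x500 cutout touches at most 6x6 = 36 of the 2904 tiles, a bit
over 1% of the image, which is the work a tile-at-a-time reader would need to
do compared with decompressing everything.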

Erik

P.S. on the matter of reading slices of the compressed image, the fitsio library 
might currently do better: https://github.com/esheldon/fitsio  though I haven't 
tried it for that use case.

> On Thu, Oct 9, 2014 at 4:24 PM, Erik Bray <embray at stsci.edu
> <mailto:embray at stsci.edu>> wrote:
>
>     Hi Octavi,
>
>     On 10/08/2014 11:06 PM, Octavi Fors wrote:
>     > Hello everyone,
>     >
>     > this might sound like a newbie question, but do you know if the fpack
>     > compression package is supported by astropy.io.fits version 0.4.2?
>
>     It depends on what you mean, exactly.  It would help to add some clarification
>     about what *fpack* is--apologies if this isn't new to you but it's worth
>     mentioning for anyone else reading.
>
>     fpack itself (and its counterpart funpack (which I always enjoy reading "fun
>     pack")) are just command-line programs that write new FITS files with one or
>     more of the HDUs compressed according to the FITS Tile Compression
>     convention--or decompress them as the case may be.
>
>     In other words, the convention states how compressed FITS files should be read
>     and written, while fpack is simply a program that knows how to write files with
>     that convention.  There's no sense (I can think of) in which it makes sense for
>     a general purpose FITS library (like astropy.io.fits) to support fpack itself.
>     It does however support the same compression scheme and can read files
>     compressed by fpack.  And likewise can write compressed files that can be read
>     by funpack.
>
>     > In
>     >http://docs.astropy.org/en/stable/io/fits/usage/unfamiliar.html#compressed-image-data
>     > the module "astropy.io.fits.compression" is mentioned to access compressed HDUs,
>     > but it doesn't say if fpack is supported and how should this package be
>     > specified in fits.open method.
>
>     The docs you linked to at
>
>     http://astropy.readthedocs.org/en/stable/io/fits/usage/unfamiliar.html#compressed-image-data
>
>     could probably be expanded on with more examples, but as they indicate files
>     containing compressed HDUs are handled transparently.  The compressed HDUs are
>     accessed the same way as normal uncompressed IMAGE HDUs.  The data is
>     automatically decompressed when it is read.  fpack has no role to play in that.
>
>
>     > My images are 16 and -32bit, and I would like to make use of a tiling
>     > pattern (FZTILE='(n,m)', see fpack manual
>     > <http://heasarc.gsfc.nasa.gov/FTP/software/fitsio/c/docs/fpackguide.pdf>).
>     > For the -32bit images, any advice about noise-sensitive scaling (SCALE_FACTOR?)
>     > and subtractive dithering to achieve high compression ratios while preserving
>     > the scientific content of data would be more than welcome.
>
>     The FZTILE keyword, as well as the other "Fpack compression directive keywords"
>     listed in section 3.3 of the fpack guide are not standard FITS keywords and are
>     only understood by the fpack program itself.
>
>     These can be used, for example, in a pipeline where raw data is written out
>     uncompressed in a FITS file.  The file is later processed through fpack which
>     will compress any HDUs marked with these keywords according to the indicated
>     settings.  The end result of that is a new HDU compatible with the tile
>     compression convention, and keywords like FZTILE become ZTILEn, or for example
>     FZALGOR becomes ZCMPTYPE.
>
>     In other words, those FZ keywords indicate how the HDU *should* be compressed
>     when passed through fpack.  The keywords on the actually compressed HDU
>     (ZCMPTYPE, ZTILEn, etc.) indicate how the HDU *was* compressed.
>
>     The FZ* keywords have no effect in astropy.io.fits (though it might not be a bad
>     idea to create a facility that understands them, for creating compressed HDUs
>     from existing uncompressed HDUs).  Instead, to create a compressed HDU you just
>     create a CompImageHDU object as demonstrated here:
>
>     http://astropy.readthedocs.org/en/stable/io/fits/usage/unfamiliar.html#creating-a-compressed-image-hdu
>
>     This works the same as other HDU types in the library, though it does take
>     several optional keyword arguments that are described in more detail in the
>     API docs:
>
>     http://astropy.readthedocs.org/en/stable/io/fits/api/images.html#astropy.io.fits.CompImageHDU
>
>     For example, to specify a tiling, use
>
>       >>> hdu = fits.CompImageHDU(data=data, header=header, tile_size=(n, m))
>
>     Subtractive dithering is used by default where applicable.  It otherwise
>     supports all the features of the convention except perhaps the ZMASKCMP option
>     (not for any particular technical reason--it's definitely something that can and
>     should be done, though I haven't had any requests for it either...)
>
>     As to how to find a balance between compression quality and data quality I think
>     others with more experience may wish to weigh in on that, though I will point
>     out that documentation for the tiled image convention has some useful tips
>     on that:
>
>     http://fits.gsfc.nasa.gov/registry/tilecompression/tilecompression2.3.pdf
>
>
>     Hope that helps,
>     Erik
>     _______________________________________________
>     AstroPy mailing list
>     AstroPy at scipy.org <mailto:AstroPy at scipy.org>
>     http://mail.scipy.org/mailman/listinfo/astropy
>
>
>
>
> --
> Octavi Fors
> Postdoctoral Research Associate
> Department of Physics and Astronomy
> The University of North Carolina at Chapel Hill
> CB #3255, #157 Phillips Hall
> Chapel Hill, NC 27599
> Office: (919) 962-3606
> Fax:    (919) 962-0480
>
>
>



