# [AstroPy] Data Analysis and Programming in Python

Warren J. Hack hack at stsci.edu
Thu Apr 9 11:08:33 EDT 2009

```
Hi,

I would recommend that you take a look at Chapter 1 of the 'Interactive
Data Analysis' tutorial (pydatatut-2.pdf) for a much simpler discussion
of numpy arrays and their available operations.

http://stsdas.stsci.edu/perry/pydatatut.pdf

The actual Numpy manual itself is really designed as a reference book
for hard-core programmers, not necessarily a tutorial for end users.
This leads to a very confusing, detail-oriented description which all
but the most dedicated programmers find difficult to use when learning
how to work with numpy arrays.  The tutorial by Perry and Robert really
tries to introduce the user to numpy arrays and how to work with them on
a practical basis.

My earlier suggestion regarding writing separate functions to return
numpy arrays comes from reviewing Megan's example and from your interest
in working with multiple image formats.  Each function could take as
input the name of the image and return a list of numpy arrays, with one
numpy array for each color (to handle 24-bit color images as well as
simple gray-scale images in the same manner).  For example,

import time
import Image, numpy, pyfits

def jpg2array(filename):
    # Return a list of 2-D numpy arrays, one per color band (R, G, B)
    image = Image.open(filename)
    xsize,ysize=image.size
    r,g,b=image.split()
    rdata=r.getdata() #data is now an array of length ysize*xsize
    gdata=g.getdata()
    bdata=b.getdata()

    npr=numpy.reshape(rdata,(ysize,xsize))
    npg=numpy.reshape(gdata,(ysize,xsize))
    npb=numpy.reshape(bdata,(ysize,xsize))

    return [npr,npg,npb]

def write_array(array,output,**keywords):
    # Write one array to a FITS file, adding each keyword to the primary header
    phdu = pyfits.PrimaryHDU(data=array)
    for kw in keywords.keys():
        phdu.header.update(kw,keywords[kw])
    phdu.writeto(output)

filename = 'hs-2009-14-a-web.jpg'
keywords = {'LATOBS':"32:11:56",'LONGOBS':"110:56",'TIME':time.ctime()}
if '.jpg' in filename:
    arrays = jpg2array(filename)
    array_names = ['red.fits','green.fits','blue.fits']
elif '.hdf' in filename:
    arrays = hdf2array(filename) # hdf2array() still needs to be written
    array_names = ['hdf_array.fits']
for arr,outname in zip(arrays,array_names):
    write_array(arr,outname,**keywords)

These two functions should allow you to read JPG images as numpy arrays
and then write each array out to a separate FITS file using PyFITS.  The
sample above reproduces what Megan suggested, but uses modular functions
and demonstrates how to handle multiple formats in a manageable manner.
All you would need to do is define the functions that translate the
image formats from your hardware into numpy arrays, and plug them into
this example to write the arrays out as FITS files.  You could also pass
the numpy arrays along to whatever plotting routine you want to use, but
at least this condenses the work of generating the numpy arrays and FITS
files into a couple of manageable functions to be included in the rest
of your code.

Cheers,
Warren

Wayne Watson wrote:
> Hi, a couple of things. Is there something a little more compact than
> the numPy manual in its descriptions? Probably the first 2x3 example
> was enough. Roughly page 19, maybe 26. It seems like the manual goes a
> bit overboard. The ndarray description goes on for pages. It looks
> like the whole ndarray story goes on to page 86.
>
> When you say write a jpg2numpy() function, it should produce what? A
> numpy array? Something like this I suppose:
>
>     image = Image.open('hs-2009-14-a-web.jpg')
>     #image.show()
>     xsize,ysize=image.size
>     r,g,b=image.split()
>     rdata=r.getdata() #data is now an array of length ysize*xsize
>     gdata=g.getdata()
>     bdata=b.getdata()
>     ...
>
> Then use pyfits to put the data into the proper places.
>
> I find this mail-list a bit different. It looks like posts are being
> made from outside (no reference to astropy at scipy.org), or sometimes
> have multiple e-mail addresses. Is there some protocol here to always
> post something back to the list? Are there rules regarding text vs
> html format?
>
> Warren J. Hack wrote:
>> Hello again,
>>
>> I noticed that Megan (a co-worker) provided you with something that
>> covers the primary aspects of your problem; namely, reading an image
>> in and writing it out as a FITS file.  The basic step that you need
>> to sort out is simply how to convert your native data format into a
>> numpy array.  Megan's example shows how you can do this for jpg
>> files, but it goes for any non-FITS format. I would actually
>> recommend writing a separate function to perform the translation from
>> each image format you want/need to support into numpy (jpg2numpy(),
>> hdf2numpy(),...), then write a generic function to write out the
>> numpy array(s) to FITS files.  The numpy format can then be used to
>> view the image directly in matplotlib or DS9 (through numdisplay)
>> depending on what you need. The display through matplotlib can be
>> easily integrated into another GUI if needed, whereas DS9 would need
>> to be started up as a separate process. The rest of the thread
>> covers how to display the data.
>>
>> If you need any more help, please don't hesitate to ask me directly.
>>
>> Cheers,
>> Warren
>>
>>
>>
>> Wayne Watson wrote:
>>> Hi, Warren.   I'm picking up on the part of the thread above that
>>> you branched off on here, because I'd like to highlight and clarify
>>> what I'm after without having it buried in that long thread.
>>>
>>> Well, I took a peek at the data analysis pdf file. My impression is
>>> that this is for people who want to do interactive data analysis,
>>> and probably know little about Python. What I'm doing is modifying
>>> 2000+ lines of Python code to add new features that will do analysis
>>> of meteor images. To collect images, it uses a special hardware box.
>>> Ultimately, the user sees these images in a non-standard file
>>> format, which only the app understands.   I would like to, for one
>>> of many things, allow users to convert these images to fits. Suppose
>>> I do that and stop there. It _might_ be possible to convince the
>>> users to use the data analysis program. It's doubtful though, since
>>> they are pretty much neophytes. They do understand how to operate
>>> the program in the present form, which basically provides no
>>> analytic capabilities. If I can add fits and analytic features to
>>> the program, there is a good chance they will use them. I think
>>> sending them off to the data analysis tool here would not work. Not
>>> yet anyway.
>>>
>>> To put it simply, I'm looking for some capability to add features
>>> that will allow the program to both read and write fits format
>>> files. Adding analytic capabilities is important, but not the point
>>> here.  It appears that, at least, one can use pyfits to read fits
>>> files, and one can embed this code in a Python program.  I posted
>>> the coding aspects of doing this in a thread above, and finally got
>>> someone to respond with code that looks like:
>>>
>>>    from matplotlib import pyplot as plt
>>>    import pyfits
>>>
>>>    image = pyfits.getdata('mpl51.fits')
>>>
>>>    fig = plt.figure()
>>>    ax = fig.add_subplot(111)
>>>    ax.imshow(image)
>>>    fig.canvas.draw()
>>>    plt.show()
>>>
>>> This is close to what I'm looking for. Right now I'm lacking a way to
>>> read, say, a jpg, fits, or gif file, and convert it to fits. I'm
>>> also lacking a way to convert the internal image format to fits. The
>>> latter is very important.
>>>
>>>
>>
>
> --
>
>              (121.01 Deg. W, 39.26 Deg. N) GMT-8 hr std. time)
>
>           "Less than all cannot satisfy Man." -- William Blake
>

--
Warren J. Hack
Science Software Branch
Space Telescope Science Institute
3700 San Martin Drive
Baltimore MD, 21218
(410) 338-4943

```
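
The message above selects a reader with `'.jpg' in filename`, which is a substring test: it would also match a name like `frame.jpg.hdf` and would miss `FRAME.JPG`. One possible way to keep the "one reader function per format" idea while avoiding those cases is to dispatch on the file extension through a small lookup table. The sketch below uses only the standard library; `READERS`, `register_reader`, `read_image` and `output_names` are hypothetical names that do not appear in the original thread, and the `jpg2array` body is a stand-in that returns tiny placeholder planes instead of calling PIL.

```python
import os

# Hypothetical registry: maps a lower-case file extension to a reader function.
READERS = {}


def register_reader(*extensions):
    """Decorator that registers a reader function for one or more extensions."""
    def decorator(func):
        for ext in extensions:
            READERS[ext.lower()] = func
        return func
    return decorator


@register_reader('.jpg', '.jpeg')
def jpg2array(filename):
    # Stand-in for the PIL-based reader in the message: it returns three
    # tiny 2x2 placeholder planes so the sketch runs without PIL or numpy.
    plane = [[0, 0], [0, 0]]
    return [('red', plane), ('green', plane), ('blue', plane)]


def read_image(filename):
    """Pick the reader by extension and return its list of (name, plane) pairs."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        reader = READERS[ext]
    except KeyError:
        raise ValueError("no reader registered for %r files" % ext)
    return reader(filename)


def output_names(filename, planes):
    """Build one FITS output name per plane from the input file's base name."""
    base = os.path.splitext(os.path.basename(filename))[0]
    return ["%s_%s.fits" % (base, name) for name, _ in planes]


planes = read_image('hs-2009-14-a-web.jpg')
names = output_names('hs-2009-14-a-web.jpg', planes)
print(names)
```

Supporting another format under this layout would mean writing one more decorated reader (for example an `hdf2array` for `.hdf` files) while the writing loop stays unchanged. Unsupported extensions raise a `ValueError` rather than falling through with `arrays` undefined.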