[Tutor] plotting several datasets and calling data from afar

Tue Mar 27 09:13:02 CEST 2012

See my comments below; they are interspersed in the code, so you don't need to read all the way to the end.

<snipped some text; to keep it digestible>

> >    I am trying to set up a code to do some plotting and before I get too far I wanted to ask some structure questions.  Basically I want to tell python to read 2 datasets, plot them on the same scale on the same x-y axis, read a third dataset and match the name from the first dataset, then label certain values from the third... complicating matters is that all these data are part of much, much larger sets in separate files, the paths look like:
> > pathway1/namered.dat
> > pathway2/nameblue.dat
> > matchingfile.txt
> >
> > so I do fopen on the matchingfile, read it with asciitable, and then I have a column in that file called 'name' and a column called 'M', I sort the file, return a subset that is interesting, and get name1, name2, etc for every subset.  I want to make a plot that looks like:
> >
> > plot pathway1/namered.dat and pathway2/nameblue.dat with label 'M' for every value in the subset name1. Each row[i] I need to assign to a separate window so that I get a multiplot with a shared x-axis, and stacking my plots up the y-axis.  I do have multiplot working and I know how to plot 'M' for each subset.
> >
> > The conceptual trouble has come in, how do I match 'name' variable of my subset 'name1' with the plot I want to do for pathway1/namered.dat and pathway2/nameblue.dat... the key feature that is the same is the 'name' variable, but in one instance I have to match the 'name'+'red.dat' and in the other the 'name'+'blue.dat'
> 
> It's not 100% clear to me what you precisely want to do, but here are a few possibilities:
> 
> - use a dictionary. Assign each dataset to a dictionary with the name as the key (or the name + color). Eg, dataset['name1red'], dataset['name1blue'], dataset['name2red'] etc. Each value in this dictionary is a dataset read by asciitable.
>  the (big) disadvantage is that you would read every single *.dat file beforehand
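
One way to keep the dictionary idea without reading every *.dat file beforehand is to store only the paths, and read a file the first time its key is requested. What follows is a minimal sketch, not a finished implementation. It assumes the 'name'+'red.dat' / 'name'+'blue.dat' naming and the pathway1/pathway2 directories from the question; build_path_index, LazyDatasets and the reader argument are hypothetical names, and reader stands for whatever function reads one file (asciitable.read, for instance).

```python
import os


def build_path_index(names, red_dir='pathway1', blue_dir='pathway2'):
    """Map (name, color) -> file path. No file is opened here."""
    index = {}
    for name in names:
        index[(name, 'red')] = os.path.join(red_dir, name + 'red.dat')
        index[(name, 'blue')] = os.path.join(blue_dir, name + 'blue.dat')
    return index


class LazyDatasets(object):
    """Dictionary-like access that reads each file only on first use."""

    def __init__(self, index, reader):
        self._index = index    # (name, color) -> path
        self._reader = reader  # ASSUMPTION: callable taking a path
        self._cache = {}       # (name, color) -> data already read

    def __getitem__(self, key):
        if key not in self._cache:
            self._cache[key] = self._reader(self._index[key])
        return self._cache[key]
```

With this, datasets = LazyDatasets(build_path_index(subset_names), asciitable.read) would let you write datasets[('name1', 'red')] and datasets[('name1', 'blue')] for each name in the subset, and only the files you actually ask for are read.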

>    The dictionaries look a bit like the right idea, but in the end I was able to manipulate my input table so they aren't quite necessary.  The code I have now, which partially works, is as follows:
> ----------------------------
> #!/usr/bin/python
> 

<snipped lots of imports>

> #File from Read_All
> x=open('LowZ_joinAll')
> 
> dat=asciitable.read(x,Reader=asciitable.NoHeader, fill_values=['--','-999.99'])
> #gives dat file where filenames are first two columns
> 
> ###############################
> bluefilename1=dat['col1']
> filename1=dat['col2']  
> 
> #other stuff I need
> 
> #Ra/Dec in decimal radians  
> Radeg1=dat[ 'col6']*180./math.pi   #ra-drad  
> Decdeg1=dat['col7']*180./math.pi    #dec-drad 
> Vmag=dat['col8']        
> Mag=dat['col15']     
> EW1=dat['col16']      
> EW2=dat['col17']      
> EW3=dat['col18']  
> #Horizontal Branch Estimate
> VHB=18.0
> EWn = (0.5*abs(EW1)) + abs(EW2) + (0.6*abs(EW3))
> # NEED ABS VALUE FOR FORMULA
> FEHn = -2.66 + 0.42*(EWn + 0.64*(Vmag - VHB))   
> EW1_G=dat['col23']    
> EW2_G=dat['col24']    
> EW3_G=dat['col25']
> EWg = (0.5*abs(EW1_G)) + abs(EW2_G) + (0.6*abs(EW3_G))
> FEHg = -2.66 + 0.42*(EWg + 0.64*(Vmag - VHB))
> #use 0.15-0.2 dex as standard error -- R. Ibata
> FEHerror=0.2  
> #corrected velocity  
> Vhel=dat['col37']          
> V_err=dat['col38']  
> m_H=dat['col74']
> alpha_Fe=dat['col75'] 
> microturb=dat['col76'] 
> Vrot=dat['col77']
> Chisq=dat['col78'] 
> RVorig=dat['col79']  
> RVcorr=dat['col80'] 
> Heliocentric_RV=RVorig+RVcorr
> 
> #now if I want to make plots I have to access paths
> 
> #example, trying with one element
> path1r='../Core2dfdr-red-sorted/'+filename1[1]
> path1b='../Core2dfdr-blue/'+bluefilename1[1]
> 
> #and use a title in the plot 
> title1='Data Fe/Hn='+str(FEHn[1])+' Fe/Hg='+str(FEHg[1])
> 
> for i in xrange(len(Radeg1)):
>     if i<=5:
> #subset1
>         print 'filename= ',filename1[i],'  ',bluefilename1[i],'  FEHn= ',FEHn[i],'  FEHg= ',FEHg[i]
> #multiplot1
>         fig1, (ax1, ax2, ax3, ax4, ax5) = plt.subplots(5, sharex=True)
>         ax1.plot(path1r,'--ro')
>         ax1.plot(path1b,'--bo')

Here's where you go wrong.
You're feeding plot() a path, not data. plot() takes arrays of x and y (float) values, plus optional format arguments; unlike gnuplot, for example, it cannot take a filename as input.
That is also the cause of the ValueError you get below: plot() sees a string (the filename) and tries to interpret it as an array of y values (if x is missing, it assumes x is [0, 1, 2, …]), and converting that string to floats fails.
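
The failure can be reproduced without matplotlib. This small sketch uses the builtin float() as a stand-in for the NumPy conversion shown in the traceback (that substitution is an assumption made to keep the example dependency-free); the path is the one built in the script.

```python
# The string that was handed to plot() in place of the y values.
path = '../Core2dfdr-red-sorted/100604F2_1red.0063.fits'

try:
    float(path)  # a filename is not a number, so this raises
    message = None
except ValueError as err:
    message = str(err)  # begins with "could not convert string to float"
```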

So, you'll first have to read the file, extract the data, then plot those data.
pyfits is the module you want to use for reading your data.
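
A minimal sketch of that read-then-plot order, with one panel per row and a shared x-axis. It rests on assumptions: that each FITS file holds a single 1-D spectrum in its primary HDU, and that the directories are the ones from your script. load_spectrum, plot_pair and plot_subset are hypothetical helper names. pyfits.getdata() is used for reading; on newer installations astropy.io.fits provides the same getdata() call.

```python
import os

# ASSUMPTION: same directories as in the script above.
RED_DIR = '../Core2dfdr-red-sorted/'
BLUE_DIR = '../Core2dfdr-blue/'


def load_spectrum(path):
    """Return the array stored in the primary HDU of a FITS file.

    ASSUMPTION: the file holds one 1-D spectrum in its primary HDU.
    """
    import pyfits  # imported here so the other helpers work without it
    return pyfits.getdata(path)


def plot_pair(ax, red_name, blue_name, title, load=load_spectrum):
    """Read one red/blue pair and draw both on the same axes."""
    red = load(os.path.join(RED_DIR, red_name))
    blue = load(os.path.join(BLUE_DIR, blue_name))
    ax.plot(red, '--ro')   # y values only; x defaults to 0, 1, 2, ...
    ax.plot(blue, '--bo')
    ax.set_title(title)


def plot_subset(rows, load=load_spectrum):
    """Stack one panel per row, all sharing the x-axis.

    rows: list of (red_name, blue_name, feh_n, feh_g) tuples.
    """
    import matplotlib.pyplot as plt
    fig, axes = plt.subplots(len(rows), sharex=True, squeeze=False)
    for ax, (red_name, blue_name, feh_n, feh_g) in zip(axes[:, 0], rows):
        # Formatting the numbers makes the title a plain string.
        title = 'Data Fe/Hn=%.2f Fe/Hg=%.2f' % (feh_n, feh_g)
        plot_pair(ax, red_name, blue_name, title, load=load)
    return fig
```

For the first subset you could then call plot_subset(list(zip(filename1[:5], bluefilename1[:5], FEHn[:5], FEHg[:5]))) followed by plt.show(), which gives one figure with five stacked panels instead of a new figure on every pass through the loop.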

Hope that gets you further,

  Evert

>         plt.show()
> #subset2
> #    if i>5 and i<=10:
> #        print 'filename= ',filename1[i],'  FEHn= ',FEHn[i],'  FEHg= ',FEHg[i]
> 
> .....
> this is where it breaks down.
>    The path name it cannot follow, and the title is not fully a string.  Since I will want to do this for many subsets I would prefer to loop but I don't see a good way to do that.  If I try with just one element I get the error:
> 
> filename=  100604F2_1red.0063.fits    100604F2_1blueb.0063.fits   FEHn=  -3.16295   FEHg=  -3.216626
> Traceback (most recent call last):
>   File "plot_LowZ_targets.py", line 232, in <module>
>     ax1.plot(path1r,'--ro')
>   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/matplotlib/axes.py", line 3849, in plot
>     self.add_line(line)
>   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/matplotlib/axes.py", line 1443, in add_line
>     self._update_line_limits(line)
>   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/matplotlib/axes.py", line 1451, in _update_line_limits
>     p = line.get_path()
>   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/matplotlib/lines.py", line 644, in get_path
>     self.recache()
>   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/matplotlib/lines.py", line 401, in recache
>     y = np.asarray(yconv, np.float_)
>   File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/numpy/core/numeric.py", line 235, in asarray
>     return array(a, dtype, copy=False, order=order)
> ValueError: could not convert string to float: ../Core2dfdr-red-sorted/100604F2_2red.0063.fits
> 
> ..........
> Note that the print statement I put in works, so the correct values are called.  I just haven't assigned in a way that python can read, and look at the path.  Trying to make my path variables strings doesn't seem to work either, so I'm a little lost.  Thanks everyone, and yes, small world Evert!
> ~Elaina
> 
> 
> -- 
> PhD Candidate
> Department of Physics and Astronomy
> Faculty of Science
> Macquarie University
> North Ryde, NSW 2109, Australia