[AstroPy] ESA Summer of Code in Space 2013
Thøger Emil Rivera-Thorsen
thoger.emil at gmail.com
Tue Jun 18 19:26:57 EDT 2013
On 18-06-2013 22:53, Joe Harrington wrote:
> Just to interject a few quick thoughts into this discussion...
>
> Knowing little of the situation on the ground WRT spectral line fitting,
> it does seem to me that it's been done a million times. If any of those
> implementations is decent, OSS, and compiled, why not just wrap it?
It's been done a million times, but I don't know of any one solution
that people are generally happy with.
Sherpa is maybe the best shot, but it comes either as part of the
enormous CXC software package, or in a good but poorly supported
stand-alone version with long-standing installation bugs etc. The GUI is
limited to the interactive capabilities of Matplotlib, the procedural
CLI interface is very easy to use from a console but not well fit for
scripting/programming, and the OO interface is not too well documented
and seemed generally obscure to me.
I've personally used it, but given the problems in connection to
installing new versions or on new computers, I have finally gone over to
LMfit. I have to do more myself, but I know that it works.
IRAF/PyRAF has finally come in a manageable package, but by IRAF
standards, "manageable" means that it comes embedded in its own Python
distribution; and IRAF/PyRAF is not exactly Pythonic in its usage and
interface.
Another problem is that when you "just" wrap compiled routines in e.g. C
or Fortran, compiler and library dependency issues usually show up in
great numbers. The hard number crunching can sometimes be wrapped - and
has been, in e.g. mpmath - but anything more than that often relies on
archaic libraries hard-coded to need old versions not in the standard
package manager of the major OSs, etc.
I think the right way ahead is to build it on well-tried and
well-supported, actively developed packages and libraries, that should
protect against the old dependency walls and endless install failures.
> Probably the GUI would want to be in Python, but there's a lot of other
> stuff under the hood that would be time-consuming and likely slower to
> re-(re-re-re-)write. At the very least, you might save some time by
> studying what existing codes do and what people like about them.
I agree insofar as, if something reliable already exists, by all means
use it. But "sometimes frugality cheats wisdom", as the Swedish saying
goes: putting in extra effort now can save a lot of work later.
As far as I can see, what makes sense to reuse is number crunching code.
On top of this comes another layer of infrastructure, and yet another
layer of (G)UI.
The infrastructure is the model classes that keep track of free, tied
and bound parameters, initial and best-fit values etc., and the way they
keep track of each other.
This is where I think an extra effort could really pay off in the long
run: it is the hard part, and if Done Right™, it can really
make a difference. This means that the classes for this should either be
existing astropy classes or inherit heavily from them,
and in general, the astropy framework should be used with as little
modification as possible.
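As a rough sketch of the kind of bookkeeping such an infrastructure layer has to do, here is a plain-Python example. The class names `Parameter` and `ParameterSet` and all their methods are hypothetical illustrations; they are not astropy, lmfit or Sherpa API, and a real implementation would presumably build on the astropy classes instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Parameter:
    """One model parameter and its fitting state (hypothetical class)."""
    value: float
    lower: float = float("-inf")   # lower bound
    upper: float = float("inf")    # upper bound
    frozen: bool = False           # frozen parameters are not varied
    # A tie computes this parameter's value from the other parameters.
    tie: Optional[Callable[["ParameterSet"], float]] = None


class ParameterSet:
    """Keeps track of free, tied and bound parameters by name."""

    def __init__(self) -> None:
        self._params: Dict[str, Parameter] = {}

    def add(self, name: str, value: float, **kwargs) -> None:
        self._params[name] = Parameter(value, **kwargs)

    def __getitem__(self, name: str) -> float:
        p = self._params[name]
        # A tied parameter always derives its value from the others.
        return p.tie(self) if p.tie is not None else p.value

    def free_names(self) -> List[str]:
        """Names of the parameters a fitter is allowed to vary."""
        return [n for n, p in self._params.items()
                if not p.frozen and p.tie is None]

    def set_free(self, values) -> None:
        """Write a fitter's trial vector back, clipping to the bounds."""
        for name, v in zip(self.free_names(), values):
            p = self._params[name]
            p.value = min(max(v, p.lower), p.upper)


pars = ParameterSet()
pars.add("center_a", 6562.8, lower=6550.0, upper=6575.0)
pars.add("width_a", 2.0, lower=0.1)
pars.add("amp_a", 1.0, frozen=True)
# The second component shares the width of the first one.
pars.add("width_b", 0.0, tie=lambda ps: ps["width_a"])

print(pars.free_names())
pars.set_free([6580.0, 1.5])   # the center gets clipped to its upper bound
print(pars["center_a"], pars["width_b"])
```

The point of the design is that the fitter only ever sees the vector of free values, while ties and bounds are resolved in one place.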
> Jon Slavin> - robust fitting routines that return error bars on fitted parameters
>
> Regarding the reporting of fits with errors, as many of us have
> painfully learned, minimizers don't give these, they only pretend to.
> Since that's often not good enough, it would be nice to see a relatively
> plug-n-play MCMC (e.g., DE-) put forward as a fitting package. It would
> have to evaluate its own distribution, give errors, and otherwise
> behave a bit like lmfit. Yes, there are subtleties to Markov Chains,
> but this is also true of minimizers. Getting something out there and
> inviting people to improve it could produce something usable in a few
> iterations. My group can contribute some DE-MCMC code that someone
> could adapt.
Sounds absolutely great. Sherpa has similar routines and they are really
strong workhorses, so possibly some of their code could be studied too.
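To illustrate how a sampler yields error bars directly from the sampled distribution, here is a minimal sketch of a plain random-walk Metropolis sampler for a single parameter. It is not DE-MCMC and not the code offered above or anything taken from Sherpa; the data values, the known noise level `sigma` and all function names are made up for the illustration.

```python
import math
import random
import statistics


def log_likelihood(mu, data, sigma):
    """Gaussian log-likelihood (up to a constant) of a constant model mu."""
    return -0.5 * sum(((y - mu) / sigma) ** 2 for y in data)


def metropolis(data, sigma, start, step, n_steps, seed=0):
    """Plain random-walk Metropolis sampler for one parameter."""
    rng = random.Random(seed)
    current = start
    current_logl = log_likelihood(current, data, sigma)
    chain = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step)
        proposal_logl = log_likelihood(proposal, data, sigma)
        # Accept with probability min(1, L_proposal / L_current).
        accept_prob = math.exp(min(0.0, proposal_logl - current_logl))
        if rng.random() < accept_prob:
            current, current_logl = proposal, proposal_logl
        chain.append(current)
    return chain


# Made-up measurements of one quantity with known noise sigma = 0.3.
data = [10.2, 9.7, 10.5, 9.9, 10.1, 10.4, 9.6, 10.0]
chain = metropolis(data, sigma=0.3, start=9.0, step=0.2, n_steps=20000)

posterior = chain[2000:]                 # discard burn-in
mean_mu = statistics.mean(posterior)     # best estimate
err_mu = statistics.stdev(posterior)     # 1-sigma error bar from the chain
print(mean_mu, err_mu)
```

For this toy case the answer is known analytically (the sample mean, with error sigma/sqrt(n)), which makes it a convenient sanity check for a sampler before trusting it on real line fits.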
> To Jon's list of requirements, I'd add:
>
> - Able to use a GUI for user-cue input OR take such input from a text
> file
> - Able to write such a text file from the GUI user-cue input, for
> subsequent runs (and add to it from a second run, etc.)
I personally think that the ideal GUI can, at any (or almost any)
moment, take both command line input and point-and-click input,
as I have tried to do in the GUI I wrote about earlier. And when it can
be given commands during runtime, of course it can also be scripted.
> One thing that isn't clear to me from the discussion is whether the
> scope is merely to identify the center and width vs. pixel number to get
> a wavelength solution or calculate Doppler shifts, or whether to do the
> whole job of reading multiple line lists, broadening the lines, and
> fitting all the lines to the data, returning column densities and
> temperatures vs. depth. In other words, will it calibrate the spectrum
> or reproduce it with a model?
I don't think that there is a general consensus on this, but in my view,
the question must be which tasks make sense to do from the command line
and which ones will be made significantly easier by means of a GUI, and
then implementing them in that order. I always shy away from software
that makes me set everything through point-and-click, file choosing
dialogs etc., when it could be done more easily from the CLI.
Reading a line list from a file should be a no-brainer from the command
line, and possibly, a good
fitting package would provide a few convenience functions for loading
data and line lists into the appropriate data structures.
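A convenience function of this kind could look roughly like the sketch below. The file format (one "label wavelength" pair per row, `#` for comments) and the function name `read_line_list` are assumptions made for the example, not an existing standard or API, and the wavelengths are only illustrative.

```python
import io
from typing import Dict, TextIO


def read_line_list(handle: TextIO) -> Dict[str, float]:
    """Parse 'label wavelength' rows; '#' starts a comment."""
    lines = {}
    for raw in handle:
        text = raw.split("#", 1)[0].strip()
        if not text:
            continue  # skip blank and comment-only rows
        # Split on the last run of whitespace so labels may contain spaces.
        label, wavelength = text.rsplit(None, 1)
        lines[label] = float(wavelength)
    return lines


# An in-memory stand-in for a file opened with open("lines.txt").
example = io.StringIO(
    "# label   rest wavelength [Angstrom]\n"
    "H_alpha   6562.80\n"
    "H_beta    4861.33\n"
    "[O III]   5006.84   # label contains a space\n"
)
line_list = read_line_list(example)
print(line_list)
```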
I think that the package should start with the simpler models first, and
then move on to the more complicated stuff later.
This is another reason why I think that a well designed infrastructure
layer is very important: once it has been made properly
modular, it is easy to write additional modules and plug them in. In
that way, if someone comes up with a smart way to implement a very
sophisticated model, we can plug it in to the existing infrastructure
and offer it as an option. One example where this is relevant is ISM
absorption lines: does one just want to fit them with quick but not too
accurate Gaussians or Lorentzians, or with a
"geometrically" (width, depth, etc.) defined Voigt profile, or with a
full-fledged Voigt profile in terms of N and b and z? Proper modularity
could mean that the latter could be plugged in to the existing
infrastructure relatively simply once this is built (I have made a
half-finished, and not yet successful, attempt at this myself).
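One possible shape for such modularity is a registry of profile functions that the rest of the code looks up by name. The sketch below is only an illustration: the names `PROFILES`, `register` and `evaluate` are hypothetical, and only the two simple profiles are implemented. A physical Voigt profile in N, b and z would be registered in the same way, but is not attempted here.

```python
import math
from typing import Callable, Dict

# Registry mapping a profile name to f(x, amplitude, center, width).
PROFILES: Dict[str, Callable[[float, float, float, float], float]] = {}


def register(name):
    """Decorator that plugs a profile function into the registry."""
    def decorator(func):
        PROFILES[name] = func
        return func
    return decorator


@register("gaussian")
def gaussian(x, amplitude, center, sigma):
    """Gaussian with peak height `amplitude` and standard deviation `sigma`."""
    return amplitude * math.exp(-0.5 * ((x - center) / sigma) ** 2)


@register("lorentzian")
def lorentzian(x, amplitude, center, gamma):
    """Lorentzian with peak height `amplitude` and half width at half maximum `gamma`."""
    return amplitude * gamma ** 2 / ((x - center) ** 2 + gamma ** 2)


def evaluate(components, x):
    """Sum components given as (profile_name, amplitude, center, width)."""
    return sum(PROFILES[name](x, amp, cen, wid)
               for name, amp, cen, wid in components)


model = [("gaussian", 1.0, 5000.0, 2.0), ("lorentzian", 0.5, 5010.0, 1.0)]
print(evaluate(model, 5000.0))
```

The fitting and GUI code then only need to know the component tuples, so adding a new profile does not touch either of them.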
What I have implemented in my half-baked GUI so far:
- Load FITS or ASCII file into a 2D-spectrum class (as CLI function).
- Extract any continuous group of rows into a properly weighted 1D
spectrum (interactive, can easily get a CLI convenience function).
- Interactively model line profiles of up to 10 components each, and
immediately fit them to data using lmfit as backend (have a half
finished Sherpa backend too).
- Save the model for all transitions for all rows of potentially
several spectra in one big, flat, human-readable ASCII file, which during
runtime is handled as a pandas DataFrame. This of course also means that
all pandas and numpy operations are available for it, and the graphical
representation gets updated accordingly.
- Assign one-letter labels (represented as colors in the plot) to each
peak. Can be done both the CLI and the GUI way.
- Set/edit wavelength range(s) to include in the fit and which ones to
ignore, CLI and GUI.
Things on my current wish list:
- Proper continuum modeling (only takes an additive constant right now).
- instrument convolution of model
- more line profiles (Lorentzian, Voigt, etc.). The tricky part
is the GUI.
- more convenience functions and CLI options.
- seamlessly interchangeable velocity and wavelength representation on
the dispersion axis
- more sophisticated parameter handling (freeze/thaw, tie etc.)
On the other hand, I think that things like e.g. stacking, flux
calibration, and other clearly calibration phase tasks are better left to
other kinds of software.
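For the velocity/wavelength item on the wish list, the conversion itself is small; the work is in keeping the two representations in sync. A minimal sketch, assuming the non-relativistic Doppler formula v = c (λ - λ0) / λ0 is accurate enough (the function names and the example wavelengths are made up for the illustration):

```python
C_KM_S = 299792.458  # speed of light in km/s


def wavelength_to_velocity(wavelength, rest_wavelength):
    """Velocity offset in km/s relative to the rest wavelength (non-relativistic)."""
    return C_KM_S * (wavelength - rest_wavelength) / rest_wavelength


def velocity_to_wavelength(velocity, rest_wavelength):
    """Inverse of wavelength_to_velocity; velocity in km/s."""
    return rest_wavelength * (1.0 + velocity / C_KM_S)


rest = 6562.80  # reference wavelength in Angstrom
v = wavelength_to_velocity(6565.0, rest)
print(v)
print(velocity_to_wavelength(v, rest))  # round trip back to the input
```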
Cheers,
Emil
> --jh--