[AstroPy] Python version of IDL Astron library

Perry Greenfield perry at stsci.edu
Wed Sep 8 17:53:11 EDT 2004


<x-flowed>
On Sep 8, 2004, at 10:49 AM, Joe Harrington wrote:

> Hi all,
>
> A Python version of the IDLastro library is a great idea, and crucial
> for the success of Python in the astronomy community.  The main issues
> raised so far:
>
> underlying packages (numeric vs. Numarray, plotting)
> Python, C, C++, FORTRAN?
> closeness of implementation to IDL library
>
> to which I'll add (interspersed):
>
> documentation
> code conversion
> coordination
>
> UNDERLYING PACKAGES
>
> There isn't much of a standard in the IDL library, except for
> consistently following the IDL doc header format.  However, Python's
> richness and the fact that this won't be a first implementation by a
> few people opens us up to greater risk than the IDL developers faced.
> I think that we should develop a set of standards.  At SciPy '04, we
> came up with the following:
>
> 1. Numarray for all new code, and to teach to new users.  New docs
>    teach only Numarray.  Soon Numarray vs. numeric will be a run-time
>    choice, and hopefully in a year or two there won't be a compelling
>    reason to use numeric over Numarray.  Right now the issues are
>    mainly performance issues for small arrays (less than 1000
>    elements), which matter more to mathematicians and biologists than
>    to astronomers.
>
I'll agree with that (no surprise).

> 2. Matplotlib for all plotting.  Try it.  You'll like it!  It has a
>    matlab-like interface, but that's just a front end to the full OO
>    plotting functionality.  You don't have to learn the matlab-like
>    interface if you don't want to.  It's interactive so you can adjust
>    your axis scaling with the mouse, etc.
>
I'll just add that it needs some more documentation. John Hunter is
developing a user guide which looks to be more than half done now.
There are still some rough edges but it is pretty functional at this
point. Nothing comes close to it in terms of cross-platform and
cross-GUI support.

> 3. Ipython.  This is a wonderful and 100% compatible improvement to
>    the interactive parser.  It doesn't affect programming at all, so
>    it's not very relevant to the current topic.
>
True, but it is really a cool tool that all should be aware of.

> 4. All of the above to be provided in binary packages as well as
>    source packages and tarballs.  Each currently-popular platform must
>    be supported by its native packager:
> 	Fedora		RPM/YUM
> 	Debian		Debian Package Manager/APT
> 	Sun		tar+checkinstall?  or is there something better now?
> 	PC		I forget the name, but there is one now
> 	MAC		I forget the name, but there is one now
>
> 5. A community effort to (re)write all needed documentation.
>
> 6. A community effort to create an excellent web site at scipy.org,
>    with (links to) discipline-specific pages that collect, organize,
>    rate, and distribute topical application software.  The goal here
>    is to proactively take the user to where he or she needs to go, not
>    just to throw a lot of information around and baffle people.
>
Well, it raises the question of whether it should all be hosted at
scipy.org or not. I think of an astron equivalent as being primarily
astronomy-oriented in scope; to the extent that scipy.org will have
areas for specialties such as astronomy, hosting it there makes sense
(something we should pursue with them). If not, we could have an
astropy site for astron-type tools and leave scipy.org for general
science and engineering tools.

Would you volunteer to manage the top-level scipy web page?
>
> PYTHON/C/C++/FORTRAN
>
> Here's how this affects AstroPy (or PyAstro, or whatever Paul wants to
> call it).  First, the choice of what underlying code to use (pure
> Python, C, C++, FORTRAN) should be the developer's, so long as that
> person can get it working in scipy_distutils on all platforms.  I was
> told at the conference that Python, C, and C++ are all easy to get
> working, but that FORTRAN is problematic.  However, there was a
> presentation at the conference of a new build utility (it doesn't have
> a name, though I jokingly called it "makeover") that claims to have
> cataloged all the options for every FORTRAN compiler in existence,
> including Borland's older ones.  So, that may become the new standard,
> or that code may be borrowed by distutils.  Stay tuned on that one.
>
Since there is a lot of good Fortran code, this would be very nice.
Much of scipy is based on Fortran code (which is also the source of
most of the build and installation hassles).

> There are strong benefits to writing in a compiled language, the main
> one being portability to later interactive languages.  Python will not
> last forever.  A good run for an interpreted language is 15 years.  In
> contrast, FORTRAN and C are 2-3 times as old and going strong.  Python
> may grow and last longer, or may not.  When we move on to the next
> great thing, we will again go through an agonizing process of
> rewriting and conversion.  The stuff that gets rewritten first will be
> the stuff that's just wrappers around C, C++, and FORTRAN.  Wouldn't
> it be great if that's nearly everything?  If you look at SciPy now,
> the majority of it is wrapped compiled libraries that implement
> practically all of the basic numerical routines.  Imagine where we'd
> be if we had to *write* that stuff in Python, rather than wrap it.
> Imagine how hard the next switch in interactive languages would be if
> we did that.
>
> There is also a strong argument against rewriting code that already
> works.  We have a lot of work to do, so if you have working compiled
> code, please just wrap it.  If you are writing new code, it might make
> sense for large, monolithic algorithms to be written in C or C++ and
> to be wrapped for Python.  Smaller routines (the majority, by number
> at least) are probably best done in Python directly.  These won't be
> hard to redo in the next language.  In any case, if you don't want to
> learn distutils and won't be providing distutils-building code, please
> stick to Python, or hook up with a distutils hacker.
>
On the other hand, I'd argue that things that are currently written in
IDL are probably best mapped to Python, given that the effort to write
them in C or another compiled language is going to be substantially
larger. It's not clear to me that a library that could be done just as
well (i.e., run efficiently) in Python should be written in C. Sure, it
could then be used with other languages, including future scripting
languages, but if it entails 5 times as much work, one always has to
ask if the tradeoff is worth it. Often it won't be (for much the same
reason it wasn't for IDL).

I'll argue that Python will be around considerably longer than 15 years
(it is almost that old now, and Perl certainly is). But it won't last
forever. It may be that there will be good tools for translating Python
into new languages in the future, though.

> DOCS
>
> Documentation is a strong suit of IDL's and we need to be as good.  I
> am not aware of a standard doc header for Python.  If there is one, we
> should use it.  If not, I suggest we essentially copy IDL's, with some
> modifications to rationalize it and to make it work for Python.  We
> also need a code that will extract the doc pages from a package and
> will collect the docs, turn them into HTML and PDF, and put them in a
> searchable database.  This code will be central to SciPy as a whole so
> upstream coordination with the soon-to-be-born doc effort would be
> great.
>
Epydoc looked promising. It was presented at pycon (link to paper:
http://www.python.org/pycon/dc2004/papers/37/epydoc.pdf ). We haven't
used it but it looked good as the first thing to try.
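
As a rough illustration of what a doc header in that style might look
like, here is a minimal sketch assuming Epydoc's epytext field markup
(@param, @type, @return, @rtype). The function itself is a made-up
example, not part of any existing library:

```python
import math


def airmass(zenith_angle_deg):
    """Return the plane-parallel airmass, sec(z), for a zenith angle.

    @param zenith_angle_deg: zenith angle in degrees, 0 <= angle < 90
    @type zenith_angle_deg: float
    @return: the airmass, 1 / cos(z)
    @rtype: float
    """
    # Hypothetical example routine; only the docstring layout matters here.
    return 1.0 / math.cos(math.radians(zenith_angle_deg))
```

The fields stay readable as plain text in the source, which is what a
documentation extractor would then turn into HTML or PDF.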

While we are speaking of such things, I'd suggest use of doctest as
the default testing framework. It's true that some kinds of tests are
better handled using unittest, but for simplicity and transparency, it's
hard to beat doctest.
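
To show what that looks like in practice, here is a minimal sketch
using the standard-library doctest module. The function deg_to_hms is a
hypothetical example written for this illustration:

```python
import doctest


def deg_to_hms(ra_deg):
    """Convert right ascension in degrees to (hours, minutes, seconds).

    The examples below double as tests: doctest runs each ">>>" line
    and compares the result with the text that follows it.

    >>> deg_to_hms(180.0)
    (12, 0, 0.0)
    >>> deg_to_hms(15.0)
    (1, 0, 0.0)
    """
    # 24 hours per 360 degrees is 240 seconds of time per degree.
    total_seconds = (ra_deg % 360.0) * 240.0
    hours = int(total_seconds // 3600)
    minutes = int((total_seconds - hours * 3600) // 60)
    seconds = total_seconds - hours * 3600 - minutes * 60
    return hours, minutes, seconds


if __name__ == "__main__":
    # Silent on success; prints a report for any failing example.
    doctest.testmod()
```

The tests live in the docstring, so the documentation and the test
suite cannot drift apart without a test failing.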

> The source language of docs is another issue.  It has to be open and
> functional on all platforms.  It has to handle simple markup and
> inline figures.  It has to produce PDF and HTML.  MSWord is out.
> Research Structured Text or LaTeX are my votes.  Some like LyX and
> TeXmacs.  We'll see what the community wants.
>
Some have suggested OpenOffice as a portable and free alternative.
Did you mean reStructuredText? That's generally good, but has problems
handling equations well, and for this application that seems like a
killer. LaTeX is clumsy, but does equations well.

> SYNTACTIC CLOSENESS
>
> Syntactic closeness to the IDL library would be nice but isn't crucial
> from my perspective.  Where things are reasonable, keep them.  Where
> they are not (what does "sxpar" mean, and to whom?), make them so.  We
> could provide a compatibility wrapper for fans of sxpar.
>
I'd agree (and a dual interface probably makes sense for some things).
As for the FITS examples, I'd argue we would be doing a disservice by
trying to retain compatibility. I'd say the IDL interface is now
generally harder to use than PyFITS.
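
For the cases where a dual interface does make sense, it could be as
simple as a thin alias. This is only a sketch: get_keyword is a
hypothetical name, the header is a plain dict standing in for a real
FITS header (not the PyFITS API), and the alias copies only the name
sxpar, not the full behavior of the IDL routine:

```python
def get_keyword(header, name, default=None):
    """Return the value of a header keyword, or `default` if absent."""
    return header.get(name.upper(), default)


def sxpar(header, name):
    """IDL-style alias for get_keyword, kept for Astron library users."""
    return get_keyword(header, name)


# A plain dict used as a stand-in header for this illustration.
header = {"NAXIS": 2, "NAXIS1": 512, "NAXIS2": 512, "OBJECT": "M31"}

print(get_keyword(header, "object"))
print(sxpar(header, "NAXIS1"))
```

New code would use the descriptive name, while converted IDL code could
keep calling sxpar until it is cleaned up.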

> CODE CONVERTER
>
> One idea that has been tossed around is a code converter.  There are
> two approaches to this, which I call the 80% solution and the 100%
> solution.  The 80% solution converts all the procedural code on a
> line-by-line basis.  It would do the gruntwork, and would ensure that
> array access, etc., is converted correctly (otherwise subtle bugs will
> creep in).  The developer would still have some gruntwork to do to get
> it to actually work, but not a lot, particularly if the code was
> simple and didn't depend on a lot of IDL library routines.  The 100%
> solution would be 100 times harder to write.  It would convert all the
> code, including the OO parts, and would also implement the IDL
> library, generally in terms of a set of wrapper routines that called
> SciPy routines.  It would actually be a re-implementation of IDL.  I
> claim this is unnecessary and not even very desirable.  We want people
> to switch to Python and to contribute new code to the community.  I
> have no personal interest in providing them a free IDL.
>
> Anyway, if we had a converter, this entire project, as well as the
> project of getting astronomers to switch, would be *much* easier!  It
> would be safest from a legal point of view for the converter authors
> NEVER to have run IDL nor to have read RSI's manuals.  The IDL license
> prohibits reverse-engineering, and most of us have agreed to it.  I
> don't believe it's reverse engineering to write a code converter from
> a language that you know into another language that you know that has
> the same capability.  However, I'm not a lawyer, and I don't have the
> means to fight that battle in court.  IDL is simple enough that
> someone who doesn't know it can get a commercial book on IDL, such as
> Gumley's Practical IDL Programming, that provides all the information
> that's needed.  Any of us experienced IDL users can then publish a FAQ
> on the web that answers any questions the programmers might have,
> without our actually writing or looking at any code.  This should keep
> the whole effort legally above reproach.
>
I guess I would like to try the brute-force approach first, to get a
little experience, before tackling code conversion for this. Besides,
I've used IDL previously. (Does the fact that a company or university
has purchased a license mean that all who work for it, whether or not
they have used it, have effectively agreed to the license?)

A converter certainly would be useful for persuading IDL users, since
they could more easily convert their code.
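
To make the "array access" concern concrete, here is a small sketch of
two things any hand or automatic conversion has to get right: IDL
subscripts are written column first, and IDL subscript ranges include
their end point. The helper names are hypothetical, and nested lists
stand in for a real array package:

```python
def idl_subscript_to_python(idl_indices):
    """Reverse subscript order: IDL a[col, row] is Python a[row][col]."""
    return tuple(reversed(idl_indices))


def idl_range_to_slice(first, last):
    """IDL a[first:last] includes `last`; a Python slice excludes its end."""
    return slice(first, last + 1)


# 2 rows by 3 columns here; IDL would describe this as a 3 x 2 array.
image = [[10, 11, 12],
         [20, 21, 22]]

# IDL: image[2, 0] (column 2, row 0)
row, col = idl_subscript_to_python((2, 0))
print(image[row][col])

# IDL: image[0:1, 1] (columns 0 through 1 of row 1)
print(image[1][idl_range_to_slice(0, 1)])
```

Both languages count from zero, so mistakes of this kind do not raise
errors; they silently select the wrong elements.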

> COORDINATION
>
> Finally, to address coordination issues, I think community testing and
> review is key.  Let's ask people to post their development plans in
> advance for community design review and to update a public-read CVS
> often.  Let's have public review and testing of all new code before
> accepting it into the library.  That way we'll address
> interoperability issues and catch bugs early.  If we're a successful
> community project, Paul's greatest contribution will be the
> coordination of the effort and the moderation of the review process,
> rather than the bits of code he produces himself.
>
> As always, I welcome your comments.

Paul mentioned that you or someone suggested that the mailing list move
to a scipy-supported one (same name? scipy-astro?). I'd say we should do
that as soon as possible to get the features of Mailman (archives in
particular).

Perry

_________________________________________________
AstroPy mailing list     -      astropy at stsci.edu
http://www.astro.washington.edu/owen/AstroPy.html
</x-flowed>

