Standard Forth versus Python: a case study

Wed Oct 11 16:38:51 EDT 2006

I realized that I have a little job on the table that is a fine test of 
the Python versus Standard Forth code availability and reusability issue.

Note that I have little experience with either Python or Standard Forth 
(but I have much experience with a very nonstandard Forth). I've noodled 
around a bit with both gforth and Python, but I've never done a serious 
application in either. In my heart, I'm more of a Forth fan: Python is a 
bit too much of a black box for my taste. But in the end, I have work to 
get done.

The problem:

I have a bunch of image files in FITS format. For each raster row in 
each file, I need to determine the median pixel value and subtract it 
from all of the pixels in that row, and then write out the results as 
new FITS files.
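
Stripped of the FITS I/O, the per-row operation described above can be 
sketched in a few lines of Python. This is only an illustration, not the 
program I ended up with: the function name subtract_row_medians is made 
up here, the image is assumed to be a plain list of rows, and the median 
comes from the standard library's statistics module rather than from any 
of the packages discussed below.

```python
from statistics import median

def subtract_row_medians(image):
    """Return a new image with each row's median subtracted from that row.

    `image` is assumed to be a list of raster rows, each a list of numbers.
    (Hypothetical helper for illustration; reading and writing FITS is omitted.)
    """
    result = []
    for row in image:
        m = median(row)  # exact median of this raster row
        result.append([pixel - m for pixel in row])
    return result

# Tiny stand-in for real image data: two rows of five pixels each.
image = [
    [10, 12, 11, 50, 9],
    [3, 3, 4, 5, 7],
]
cleaned = subtract_row_medians(image)
print(cleaned)
```

The real job is just this loop wrapped in code that reads each FITS file 
into rows and writes the result back out.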

This is a real problem I need to solve, not a made-up toy problem. I was 
originally thinking of solving it in C (I know where to get the pieces 
in that language), but it seemed like a good test problem for the Python 
versus Forth issue.

I wanted to import FITS reading/writing, array manipulation, and median 
determination. With those in hand, the solution should be pretty easy.

So, first look for a median function in Python. A little googling finds:

http://www.astro.cornell.edu/staff/loredo/statpy/

Wow! This is good stuff! An embarrassment of riches here! There are even 
several FITS modules, and I wasn't even looking for those yet. And just 
for further gratification, the page's author is an old student of mine 
(but I'll try not to let this influence my attitude). So, I followed the 
link to:

http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html

From there, I downloaded stats.py, and the two other modules the page 
says it requires, and installed them in my site-packages directory. Then 
"from stats import median" got me a function to approximately determine 
the median of a list. It just worked. The approximation is good enough 
for my purposes here.
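
To give a feel for what an approximate median can look like, here is a 
rough sketch of one common approach: bin the values into a histogram and 
interpolate inside the bin where the running count crosses the halfway 
mark. I am not claiming this is how stats.py does it; the function name 
binned_median and the numbins parameter are my own assumptions for the 
illustration. The error is bounded by roughly one bin width.

```python
from statistics import median

def binned_median(values, numbins=1000):
    """Approximate the median of `values` from a histogram with `numbins` bins.

    Hypothetical illustration of a binned estimate; not taken from stats.py.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return float(lo)  # all values identical, nothing to bin
    width = (hi - lo) / numbins
    counts = [0] * numbins
    for v in values:
        # Clamp so the maximum value falls in the last bin.
        i = min(int((v - lo) / width), numbins - 1)
        counts[i] += 1
    half = len(values) / 2.0
    cumulative = 0
    for i, c in enumerate(counts):
        if cumulative + c >= half:
            # Linear interpolation inside the bin holding the halfway point.
            return lo + width * (i + (half - cumulative) / c)
        cumulative += c

data = list(range(101))
print(binned_median(data, 10), median(data))
```

More bins bring the estimate closer to the exact value, at the cost of a 
larger histogram.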

PyFITS required a little more resourcefulness, in part because STScI's 
ftp server was down yesterday, but I got it installed too. It helps that 
when something is missing, the error message gives you a module name. It 
needs the numarray module, so I got array manipulation as a side effect.

I haven't finished the program, but I've tried out the pieces and all 
looks well here.

OK, now for Forth. Googling for "forth dup swap median" easily found:

http://www.taygeta.com/fsl/library/find.seq

At first blush, this looked really good for Forth. The search zeroed in 
on just what I wanted, no extras. The algorithm is better than the one 
in the Python stats module: it gives exact results, so there's no need 
to check that an approximation is good enough. But then, the 
disappointment came.

What do you do with this file? It documents the words it depends on, but 
not where to get them. I'm looking at a significant effort to assemble 
the pieces here, an effort I didn't suffer through with Python. So, my 
first question was: "Is it worth it?"

The answer came from searching for FITS support in Forth. If it exists 
in public, it must be really well hidden. That's a "show stopper", so 
there was no point in pursuing the Forth approach further.

In the end, it was like comparing a muzzle-loading sharpshooter's rifle 
with a machine gun: Forth got off one really good shot, but Python just 
mowed the problems down.

The advocates of the idea that Standard Forth has been a boon to code 
reusability seem mostly to be people with large private libraries of 
Forth legacy code. No doubt to them it really has been a boon. But I 
think this little experiment shows that for the rest of us, Python has a 
published base of reusable code that puts Forth to shame.

-- 
John Doty, Noqsi Aerospace, Ltd.
--
Specialization is for robots.