[Somewhat Off Topic] AI Contest

Tim Peters tim.one at home.com
Tue Aug 7 01:37:50 EDT 2001


[Alex Martelli]
> The 'automatic' parallelizers needed so much hand-holding and source
> tweaking to work halfway-right that it was much less work to structure
> the parallel-computation aspects explicitly (well, on a FEW big CPU's,
> at least; I had no real experience with arrays of a LOT of
> middling CPU's).

[François Pinard]
> My own experience was with Cray FORTRAN.  It is true that there were
> many functions you could use to have better control over how things were
> generated.  Your whole DO loops needed to be rather simple if you wanted
> them to be turned automatically into a few vector instructions.

CFT was, for most of its life, in fact limited to vectorizing loops with a
single basic block -- it gave up if they were fancier than that.  CFT77
relaxed those restrictions over the years.  I worked on both, and my
clearest memory of the first CFT77 release is of the bitching and bitching
we endured because compile-time had dropped from CFT's 200,000 lines/minute
to CFT77's 100,000.  Never mind that it was doing global optimization while
its predecessor didn't; it was just unbearable that recompiling the world
should take 6 minutes instead of 3.  That's the last time I took users
seriously <wink>.
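
To make "single basic block" concrete, here is a minimal sketch of the loop
shapes involved.  It is written in Python only because this is the Python
list; plain Python vectorizes nothing, and the function names (scale_add,
clip_branchy, clip_select) are hypothetical illustrations, not anything from
CFT or CFT77.  The first loop body is straight-line code, the second has a
branch and so spans more than one basic block, and the third folds the
branch into a select so the body is straight-line again.

```python
# Illustrative only: hypothetical names, and no claim about what any
# particular compiler release actually accepted.

def scale_add(a, x, y):
    """Loop body is one basic block: straight-line code, no branches."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out


def clip_branchy(x, limit):
    """Loop body contains an if/else, so it is more than one basic block."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        if x[i] > limit:
            out[i] = limit
        else:
            out[i] = x[i]
    return out


def clip_select(x, limit):
    """Same result, with the branch expressed as a select (min)."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = min(x[i], limit)
    return out
```

Rewriting the second form into the third by hand is the kind of source
tweaking a user would do to coax one more loop past a single-basic-block
restriction.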

After years of writing higher-powered vectorizers and parallelizers at Kuck
and Associates, Michael Wolfe returned to academia.  He was shocked to
discover that virtually *none* of his university's code auto-parallelized
when he first tried it.  From that he deduced an obvious <wink> truth:  the
real gain achieved thru auto-vectorizing/parallelizing compilers was that
they taught users *how* to write code clearly worth vectorizing and
parallelizing.  Indeed, nothing made a geek happier than staying up all
night dreaming up ways to get one more "loop vectorized" msg out of the
compiler!

The long-term thriving businesses in this field didn't even try to perform
magic, and Intel has taken much the same approach with processor "multimedia
extensions" (newspeak for feeble little vector registers <0.5 wink>):  if
you want the speed enough to call their highly hand-optimized libraries to
get it, you can have it.  Else you're out of luck unless you wrestle with
VTune and assembler yourself.

downright-pythonic-in-its-practical-wisdom-ly y'rs  - tim




