[Numpy-discussion] Optimization question for ufuncs

Travis Oliphant oliphant at ee.byu.edu
Fri Feb 4 14:23:08 EST 2005

I've been thinking lately about ufuncs and I would love to hear the 
opinion of others.

I like what numarray has done with the temporary buffer ideas so that 
full copies are never made if they are just going to be thrown away.   
This has led to other thoughts about possible improvements to the ufunc 
object to support "ufunc chaining" so that array operations on 
expressions don't have to create any temporary copies (using buffers 
instead) --- I think I remember the numarray guys thinking along these 
lines as well.
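
As a rough illustration of the chaining idea, the sketch below evaluates
out = a*b + c one block at a time, so the intermediate product a*b only
ever occupies a small fixed-size buffer rather than a full-length
temporary array. This is only an assumption about what such chaining
could look like, not how Numeric or numarray implement it; the name
mul_add_chained and the 256-element CHAIN_BLOCK size are made up for the
example.

```c
#include <stddef.h>

/* Hypothetical buffer size, chosen only for illustration. */
enum { CHAIN_BLOCK = 256 };

/* Computes out[i] = a[i] * b[i] + c[i] for contiguous double arrays.
 * The intermediate a*b is held in a small stack buffer, one block at
 * a time, instead of a temporary array of length n. */
static void mul_add_chained(const double *a, const double *b,
                            const double *c, double *out, size_t n)
{
    double tmp[CHAIN_BLOCK];

    for (size_t start = 0; start < n; start += CHAIN_BLOCK) {
        size_t len = n - start;
        if (len > CHAIN_BLOCK)
            len = CHAIN_BLOCK;

        /* First "ufunc": multiply into the buffer. */
        for (size_t i = 0; i < len; i++)
            tmp[i] = a[start + i] * b[start + i];

        /* Second "ufunc": add, reading the buffer. */
        for (size_t i = 0; i < len; i++)
            out[start + i] = tmp[i] + c[start + i];
    }
}
```

The two inner loops are kept separate on purpose, to mirror two
independent ufunc loops chained through a buffer rather than one fused
loop.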

Regardless, there is always an inner for loop (for each type) that 
performs the requested operation.   The question I have is whether to 
assume unit strides for the inner loop.  The current Numeric ufunc inner 
loops allow for discontiguous memory to be accessed during the loop 
(non-unit strides).   I'm not sure what numarray does; I think it only 
allows unit strides and uses temporary buffers to support 
discontiguous arrays.
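
To make the distinction concrete, here is a minimal sketch in C of the
two styles of inner loop for a double-precision add. It is not taken
from Numeric or numarray; all function names and signatures are
hypothetical, and strides are assumed to be given in bytes. The gather
helper shows the kind of copy into a temporary buffer that a
unit-stride-only design would need for discontiguous input.

```c
#include <assert.h>
#include <stddef.h>

/* (1) Strided inner loop: each operand advances by its own byte
 * stride, so discontiguous memory can be walked directly. */
static void add_strided(const char *a, ptrdiff_t sa,
                        const char *b, ptrdiff_t sb,
                        char *out, ptrdiff_t so, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        *(double *)out = *(const double *)a + *(const double *)b;
        a += sa;
        b += sb;
        out += so;
    }
}

/* (2) Unit-stride inner loop: operands must be contiguous. */
static void add_contig(const double *a, const double *b,
                       double *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

/* Copies n strided doubles into a contiguous buffer so that
 * loop (2) can be used on discontiguous input. */
static void gather(const char *src, ptrdiff_t stride,
                   double *dst, size_t n)
{
    assert(n == 0 || (src != NULL && dst != NULL));
    for (size_t i = 0; i < n; i++, src += stride)
        dst[i] = *(const double *)src;
}
```

Loop (2) tells the compiler that elements are adjacent, which can make
vectorization easier, but it pays for an extra copy whenever the input
is discontiguous. Which effect dominates is the open question here.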

Is this requirement for unit strides on the inner loop a good one?  Does 
it allow faster code to be compiled?   Is it part of the reason that 
numarray is a little faster on large arrays?

I am not an optimization expert, though I've read a bit as of late.  I'm 
just wondering what the experts on this list think about unit strides 
versus non-unit strides on the inner loop.


Travis O.
