[Numpy-discussion] NumPy re-factoring project

Pauli Virtanen pav at iki.fi
Fri Jun 11 04:17:40 EDT 2010


Thu, 10 Jun 2010 23:56:56 +0200, Sturla Molden wrote:
[clip]
> Also about array iterators in NumPy's C base (i.e. for doing something
> along an axis): we don't need those. There is a different way of coding
> which leads to faster code.
> 
> 1. Collect an array of pointers to each subarray (e.g. using
> std::vector<dtype*> or dtype**)
> 2. Dispatch on the pointer array...

This is actually what the current ufunc code does.

The innermost dimension is handled via the ufunc loop, which is a simple 
for loop with constant-size step and is given a number of iterations. The 
array iterator objects are used only for stepping through the outer 
dimensions. That is, it essentially steps through your dtype** array, 
without explicitly constructing it.

An abstraction for iteration through an array is useful, also for 	
building the dtype** array, so we should probably retain them.

For multidimensional arrays, you run here into the optimization problem 
of choosing the order of axes so that the memory access pattern makes a 
maximally efficient use of the cache. Currently, this optimization 
heuristic is really simple, and makes the wrong choice even in the simple 
2D case.

We could in principle use OpenMP to parallelize the outer loop, as long 
as the inner ufunc part is implemented in C, and does not go back into 
Python. But if the problem is memory-bound, it is not clear that 
parallelization helps.

Another task for optimization would perhaps be to implement specialized 
loops for each operation, currently we do one function indirection per 
iteration which probably costs something. But again, it may be that the 
operations are already bounded by memory speed, and this would not 
improve performance.

-- 
Pauli Virtanen




More information about the NumPy-Discussion mailing list