[Numpy-discussion] Optimization suggestion sought

Enzo Michelangeli enzomich at gmail.com
Sat Jan 1 06:23:37 EST 2011


----- Original Message ----- 
From: "Robert Bradshaw" <robertwb at math.washington.edu>
Sent: Wednesday, December 29, 2010 4:47 PM

[...]
>> Regarding Justin's suggestion, before trying Cython (which, according to
>> http://wiki.cython.org/tutorials/numpy , seems to require a bit of work to
>> handle numpy arrays properly)
>
> Cython doesn't have to be that complicated. For your example, you just
> have to unroll the vectorization (and account for the fact that the
> result is mutated in place, which was your original goal).

Thanks, but full de-vectorization forces one to give up any use of BLAS (I
suppose numpy relies on its routines for array products). In my tests, the
speed is more or less the same as that of the original pure-numpy code (which
may be made less memory-hungry with the chunking suggested by Josef).

Instead, it would be nice to have a native function able to evaluate
arbitrary numpy expressions without converting the intermediate results into
Python objects (a sort of "better weave.blitz", able to understand slicing,
broadcasting rules etc.). That would give us the best of both worlds: code
execution at BLAS speeds, and no unnecessary conversions or temporary
variable allocations. Such a "numpy calculator" could also be a simple
interpreter, avoiding the complexities and site dependencies that come with
using a C compiler: it should build temporary C data structures for the input
parameters, call the relevant C ATLAS/BLAS/LAPACK functions in the right
order (possibly allocating temporary C arrays), and convert only the final
result back to a Python object.
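
To make the idea a bit more concrete, here is a rough sketch of what such an
interpreter might look like. It is only an approximation written in Python on
top of NumPy's existing out= arguments, not the native C evaluator described
above. The names OPS and run, and the (dest, op, left, right) instruction
format, are made up for illustration. It assumes that destination buffers
have the right shape and dtype, that a "dot" destination is C-contiguous and
never aliases its inputs, and that np.dot reaches BLAS only when NumPy was
built against one.

```python
import numpy as np

# Hypothetical opcode table (not part of NumPy): every routine listed here
# accepts an `out=` argument, so results can be written into existing buffers.
OPS = {"add": np.add, "sub": np.subtract, "mul": np.multiply, "dot": np.dot}


def run(program, env):
    """Evaluate a list of (dest, op, left, right) instructions.

    `env` maps names to arrays. If `dest` already names an array in `env`,
    the result is written into it via out=; otherwise the result array is
    allocated by NumPy and stored under that name.
    """
    result = None
    for dest, op, left, right in program:
        func = OPS[op]
        if dest in env:
            # Reuse the existing buffer instead of creating a temporary.
            result = func(env[left], env[right], out=env[dest])
        else:
            # First use of this name: let NumPy allocate the array.
            result = env[dest] = func(env[left], env[right])
    return result


# Example: compute (A.B + c)**2 with a single preallocated scratch buffer.
A = np.arange(12.0).reshape(4, 3)
B = np.arange(15.0).reshape(3, 5)
c = np.arange(5.0)  # broadcast across the rows of A.B

env = {"A": A, "B": B, "c": c, "t0": np.empty((4, 5))}
program = [
    ("t0", "dot", "A", "B"),    # t0 = A.B, written into the buffer
    ("t0", "add", "t0", "c"),   # t0 = t0 + c, in place, with broadcasting
    ("t0", "mul", "t0", "t0"),  # t0 = t0 * t0, in place
]
result = run(program, env)
```

Only one scratch array is allocated for the whole expression here, whereas
the plain expression (A.dot(B) + c) ** 2 creates a new array at each step.
A real implementation would still have to parse the expression and plan the
buffers itself, which this sketch leaves to the caller.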

Enzo



