NumPy loop efficiency

Fri Mar 30 08:03:34 EST 2001

At 14:11 30/03/01 +0200, Konrad Hinsen wrote:
>Something like that (but not entirely) is possible but little known:
>
>      Numeric.add(a, b, c)
>
>does the same as
>
>      c = a + b
>
>but storing the result in the preallocated array c (which must of
>course have the right dimensions). And c can well be a or b, in which
>case the data is overwritten. By using only these functions, many
>temporary arrays can be avoided, but at the cost of producing
>unreadable code.
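
For reference, a minimal sketch of the same idea, assuming NumPy (the successor to Numeric) rather than Numeric itself. There the preallocated result array is passed through the `out` argument of `numpy.add`; the array values are made up for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])
c = np.empty_like(a)          # preallocated result array with the right shape

result = np.add(a, b, out=c)  # writes a + b into c; no new array is created

# The output may also be one of the operands, in which case it is overwritten.
np.add(a, b, out=a)
```

Here `result` is the very same object as `c`, and after the second call `a` holds the sum.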

Another take on it would be to use in-place operations, such as:

z = a + b
z += c
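
A small check of this, assuming NumPy arrays (I have not verified it against Numeric): the first statement allocates one array, and the in-place addition then reuses that array instead of creating a second one. The name `first_result` is only there to test object identity.

```python
import numpy as np

a = np.ones(4)
b = np.full(4, 2.0)
c = np.full(4, 3.0)

z = a + b          # one new array is allocated for the sum
first_result = z   # second reference to that same array
z += c             # in-place add: the existing array is updated

reused = z is first_result  # the name z still refers to the first array
```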

Anyway, I'm thinking about a more general approach without the need to 
implement full-fledged support for lazy evaluation. Take this example:

z = a + b + c

It is evaluated this way:

i1 = a + b
i2 = i1 + c
z  = i2

Two temporary arrays get allocated, and the second one is returned as the 
result of the operation. This is quite expensive in both time and memory 
(especially for big arrays), and it pushes people toward less readable 
code to squeeze out more performance. The question is, who generates the 
temporary arrays?
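
To make the evaluation order visible, here is a toy sketch in plain Python. `Traced` is a hypothetical list-backed class invented for this illustration, not Numeric's array type; it only logs each result object that its `__add__` method creates.

```python
class Traced:
    """Hypothetical list-backed vector that logs each result its + allocates."""

    allocations = []  # names of the results created by __add__, in order

    def __init__(self, data, name):
        self.data = list(data)
        self.name = name

    def __add__(self, other):
        # Every binary + builds a brand-new result object here.
        result = Traced(
            [x + y for x, y in zip(self.data, other.data)],
            "(" + self.name + " + " + other.name + ")",
        )
        Traced.allocations.append(result.name)
        return result


a = Traced([1, 2], "a")
b = Traced([10, 20], "b")
c = Traced([100, 200], "c")

z = a + b + c  # evaluated left to right as (a + b) + c
```

After this runs, `Traced.allocations` lists two results: the intermediate sum of `a` and `b`, and the final object that `z` refers to.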

Logic tells me that the temporary array is generated inside __add__() - the 
method that does the operation on the array. Some optimization would be 
possible, using this technique:

1) when creating an array to store temporary results, flag the array as 
"temporary";
2) when doing some operation that takes two arrays, if one of them is 
temporary, do the operation in-place;
3) when assigning a temporary array to any other variable, reset the 
"temporary" flag.
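
A rough sketch of the three steps in plain Python, under loud assumptions: `Vec` is a hypothetical list-backed class, not Numeric's array, and since Python has no assignment hook, step (3) is emulated by an explicit `keep()` call that the programmer must write by hand.

```python
class Vec:
    """Hypothetical list-backed vector; not Numeric's or NumPy's array type."""

    allocations = 0  # number of result buffers created by __add__

    def __init__(self, data, temporary=False):
        self.data = list(data)
        self.temporary = temporary

    def __add__(self, other):
        # Step 2: if either operand is temporary, accumulate into its storage.
        for target, source in ((self, other), (other, self)):
            if target.temporary:
                for i, value in enumerate(source.data):
                    target.data[i] += value
                return target
        # Step 1: otherwise allocate a result and flag it as temporary.
        Vec.allocations += 1
        summed = [x + y for x, y in zip(self.data, other.data)]
        return Vec(summed, temporary=True)

    def keep(self):
        # Step 3 stand-in: clear the flag by hand, because no hook runs on "=".
        self.temporary = False
        return self


a = Vec([1, 2, 3])
b = Vec([10, 20, 30])
c = Vec([100, 200, 300])

z = (a + b + c).keep()  # a + b allocates once; adding c reuses that buffer
```

With the flag in place, the whole expression costs one allocation instead of two. Forgetting `keep()` would leave `z` flagged as temporary, so a later `z + a` would silently overwrite `z`. That is exactly the case the assignment hook is meant to cover.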

The critical operation above is step (3). My proposal is to use a hook that 
gets called on assignment (the = operator). As far as I know, this hook 
does not exist today, but its inclusion would allow for a number of 
optimizations similar to this one.

Carlos Ribeiro