[Numpy-discussion] In-place operations

Tue Sep 12 10:51:03 EDT 2006

Hello Pierre,

On Tue, 12 Sep 2006 at 09:52 -0400, Pierre Thibault wrote:
> Hi,
> 
> I would like to have information on the best techniques to do in-place
> calculations and to minimize temporary array creations. To me this
> seems to be very important whenever the arrays become very large.
> 
> I already know about the ufunc in-place functionality (which is great).
> 
> More specifically, here are examples that occurred in my code
> 
> 1) FFTs:  Let A and B be two large arrays, already allocated. I want
> the fft of A to be stored in B. If I just type B = fft(A), there is a
> temporary array creation, right? Is it possible to avoid that?

Well, in a sense a temporary array is created, but it is immediately
bound to B, so in the end it is not really temporary: it is a new
object with a name. The array that B previously referenced is freed
(unless there is another reference to it).

> 
> 2) Function output: In general, I think the same thing happens with
> functions like
> 
> def f1(array_in):
>    array_out = # something using array_in
>    return array_out
> 
> Then, if B is already allocated, writing B = f1(A) involves again a
> temporary array creation

Again, it depends on what you understand by 'temporary'.

> 
> I thought instead of doing something like
> 
> def f2(array_in, array_out):
>   array_out[:] = # something
>   # Is this good practice?
> 
> and call f2(A,B).
> 
> If I understand well, this still requires a temporary array creation.
> Is there another way of doing that (apart from actually looping
> through the indices of A and B)?

Here I'd say that yes, you are creating a truly temporary object and
then assigning it element by element to B.
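
One way to avoid that temporary is to let the ufuncs write straight into
the output array. What follows is only a minimal sketch, assuming a
modern NumPy and a made-up formula (2*A + 1) standing in for the
"something" in f2; the real computation may not decompose this neatly.

```python
import numpy as np

def f2(array_in, array_out):
    # array_out[:] = 2.0 * array_in + 1.0 would first build the whole
    # right-hand side in temporary arrays and then copy it into array_out.
    np.multiply(array_in, 2.0, out=array_out)  # result written directly into array_out
    array_out += 1.0                           # in-place add, no new array allocated
    return array_out

A = np.linspace(0.0, 1.0, 5)
B = np.empty_like(A)   # B is allocated once, up front
result = f2(A, B)      # result is the very same object as B
```

This only works when every step of the calculation has an in-place or
out= form; for functions without one, the temporary is still needed.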

> 
> I know that the use of f2 has one clear advantage: it makes sure that
> whatever was in B is discarded. With f1, the following could happen:
> 
> A # contains something
> B # contains something
> C = B
> B = f1(A)
> 
> C still contains whatever was in B. This could be what you wanted, but
> if you consider C just as a reference to B, this is not good.

This is not good if you want to get rid of the object pointed to by B,
but in general, this is considered a nice feature of Python.
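
The difference between the two cases can be checked directly. This is a
small sketch assuming a modern NumPy, with a toy f1 (doubling its input)
invented purely for illustration:

```python
import numpy as np

def f1(array_in):
    # Hypothetical stand-in for "something using array_in".
    return array_in * 2.0          # always returns a new array

A = np.ones(3)
B = np.zeros(3)
C = B                              # C and B are two names for one array

B = f1(A)                          # rebinds the name B only
c_after_rebind = C.copy()          # C still holds the old contents (zeros)

B = C                              # point B back at the shared array
B[:] = f1(A)                       # overwrites the shared buffer in place
```

After the rebinding, C keeps the old data; after the slice assignment,
C sees the new values because it still names the same array as B.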

> 
> I guess these considerations are not standard python problems because
> you expect python to take care of memory issues. With big arrays in
> scientific computations, I feel the question is more relevant. I might
> be wrong...

If you are worried about wasting memory, just get familiar with this
rule: an object stays in memory only while something references it (for
example, a variable bound to it). When the object is no longer
referenced, its memory is freed and becomes available for later reuse.
With this,

B = fft(A)

will create a new object (the FFT of A), the object pointed to by B will
be freed (if there is not any other reference to it) and the new object
will be bound to B.

If what you want is to avoid having the three objects (namely A, old B
and new B) in memory at the same time, you can do something like:

del B   # deletes reference to object pointed to by B
B = fft(A)  # B gets bound to new FFT object
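
To convince yourself that the old array really is gone before the FFT
runs, you can watch it through a weak reference. This is just a sketch:
it assumes a modern NumPy (np.fft.fft) and CPython's reference counting,
which frees an object as soon as its last reference disappears; other
interpreters may free it later.

```python
import weakref
import numpy as np

A = np.arange(8, dtype=float)
B = np.zeros(8, dtype=complex)
old_B = weakref.ref(B)               # observes the array without keeping it alive

del B                                # drop the only strong reference
freed_before_fft = old_B() is None   # on CPython the array is already freed here

B = np.fft.fft(A)                    # only A and the new result are alive now
```

Note that this fails silently if another name still refers to the old
array: del removes a reference, not the object itself.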

HTH,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"