[Numpy-discussion] Array and vector initialization good practice

Anne Archibald peridot.faceted at gmail.com
Wed Jun 18 11:34:50 EDT 2008


2008/6/18 Dinesh B Vadhia <dineshbvadhia at hotmail.com>:
> Say, there are two initialized numpy arrays:

It is somewhat misleading to think of numpy arrays as "initialized" or
"uninitialized". All numpy arrays are dynamically allocated, and under
many circumstances they are given their values on creation.

> A = numpy.blah
> B = numpy.blah.blah
>
> With, C = A*d + B*e which doesn't have to be initialized beforehand (as A
> and B are and so are d and e)

What actually happens in the evaluation of this expression is that a
temporary array (say, T1) is created containing A*d, another (say T2)
is created containing B*e, and then a new array is created containing
T1+T2. The name "C" is set to point to this freshly-allocated array.
Since nothing references T1 and T2 anymore, they are freed.
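
To make those steps concrete, here is a rough sketch that spells the
expression out with the temporaries named explicitly. The array sizes and
values are made up for illustration; T1 and T2 are just the temporaries
described above, which numpy normally creates and discards for you.

```python
import numpy as np

n = 5                          # illustrative size only
A = np.arange(n, dtype=float)  # stand-ins for the poster's A and B
B = np.ones(n)
d = np.full(n, 2.0)
e = np.full(n, 3.0)

T1 = A * d                     # first temporary
T2 = B * e                     # second temporary
C = T1 + T2                    # freshly allocated result; "C" now names it
del T1, T2                     # drop our names for the temporaries

C_direct = A * d + B * e       # the same computation as a single expression
```

The explicit version and the one-line version do the same work; the only
difference is whether the temporaries ever get names.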

> Now, place this in a function ie.
>
> def havingFun(d, e):
>     C = A*d + B*e
>     return C
>
> The main program calls havingFun(d, e) continuously with new vectors d and e
> each time (A and B do not change).
>
> Assume that A, B, C, d, e are all very large (>100,000 rows and/or
> columns).
>
> In the first call to havingFun, C is created.  In the next call, does
> Python/Numpy/Scipy have C available to overwrite it or does it create a new
> C (having deleted the previous C)?

The short answer is that yes, it allocates a new C.

> Wrt to efficiency and performance does it matter if C is initialized
> beforehand or not?

If you had first done
C = np.zeros(100000)
then when you write
C = A*d + B*e
the name "C" is made to point to the freshly-allocated array rather
than the old array (which is still full of zeros). If nothing else
points to the array full of zeros, it is deleted. So "initializing" C
in this way does nothing but waste time.
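
A small sketch of that rebinding, assuming tiny made-up arrays. The extra
name "old" is hypothetical; it exists only so the original array stays alive
and can be inspected after "C" is rebound.

```python
import numpy as np

C = np.zeros(4)
old = C                  # second reference, so the zeros array is not freed

A = np.ones(4)
B = np.ones(4)
C = A * 2.0 + B * 3.0    # rebinds the name "C"; does not write into the zeros

print(C is old)          # the two names now refer to different arrays
print(old)               # the original array was never touched
```

Without the extra reference, the array of zeros would simply be discarded as
soon as "C" is rebound.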

Memory allocation is a very fast operation - just a handful of CPU
cycles under normal circumstances. Generally it's better to write your
program without worrying about the creation of temporary arrays; only
if
(a) the program is too slow, and
(b) you have determined that it is the creation of temporary arrays that
is making it slow
is it worth rewriting your program to reduce their use.
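
One crude way to check point (b) is to time the pieces separately. The
sketch below uses the standard-library timeit module on made-up random data
of an arbitrary size. It is only a rough indication (np.empty may not touch
the memory it reserves, for instance), and profiling your real program is
more reliable.

```python
import timeit
import numpy as np

n = 100_000                    # arbitrary size chosen for illustration
rng = np.random.default_rng(0)
A, B, d, e = (rng.random(n) for _ in range(4))

# Time spent only obtaining an array of the right size.
t_alloc = timeit.timeit(lambda: np.empty(n), number=200)

# Time spent on the whole expression, temporaries included.
t_expr = timeit.timeit(lambda: A * d + B * e, number=200)

print("allocation only: %.6f s" % t_alloc)
print("full expression: %.6f s" % t_expr)
```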

There *is* a way to reduce the generation of temporary arrays, but it
does not always make your program faster. It certainly makes it harder
to read. The last time I went through and did this to my program -
which allocated several hundred megabytes worth of arrays - it didn't
make the slightest difference to the runtime. But in case it helps,
here it is:

Most numpy element-wise calculations are carried out by so-called
"ufuncs", which I think stands for "universal functions". These come
with a certain amount of machinery to handle broadcasting, and they
are invoked both directly, as in "np.sin(x)" and "np.arctan2(x,y)" and
implicitly as in "-x" and "x+y", which get translated to
"np.negative(x)" and "np.add(x,y)" under the hood. Among the machinery
they support is _output arguments_:

In [6]: x = np.linspace(0,2*np.pi,9)

In [7]: y = np.zeros(9)

In [8]: np.sin(x,y)
Out[8]:
array([  0.00000000e+00,   7.07106781e-01,   1.00000000e+00,
         7.07106781e-01,   1.22460635e-16,  -7.07106781e-01,
        -1.00000000e+00,  -7.07106781e-01,  -2.44921271e-16])

In [9]: y
Out[9]:
array([  0.00000000e+00,   7.07106781e-01,   1.00000000e+00,
         7.07106781e-01,   1.22460635e-16,  -7.07106781e-01,
        -1.00000000e+00,  -7.07106781e-01,  -2.44921271e-16])

Regrettably, they do not accept keyword arguments ("out=y") but if you
give a ufunc more arguments than it normally takes, the last one is an
output argument, indicating that the results are stored in the
existing array rather than a newly-allocated one. Some other numpy
functions (but not all of them) can accept an output argument as well.
Thus if for some reason you want to reduce memory allocation, you can
rewrite your program, so that
C = A*d+B*e
becomes
C = np.empty(n)
T = np.empty(n)
np.multiply(B,e,T)
np.multiply(A,d,C)
np.add(C,T,C)

You should be careful with this, because I have found that reducing
temporaries can *increase* the memory usage of my programs, since it
involves keeping around arrays that would normally be deleted and
reallocated only when needed. But once in a while, it may speed things
up.
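
For the repeated-call situation in the original question, the rewrite above
might be packaged roughly as follows. This is only a sketch: the function
name having_fun_inplace and the convention of passing the buffers C and T in
from the caller are my own choices for illustration, and the sizes and values
are made up.

```python
import numpy as np

def having_fun_inplace(A, B, d, e, C, T):
    """Compute A*d + B*e into C, using T as scratch space."""
    np.multiply(B, e, T)   # T <- B*e
    np.multiply(A, d, C)   # C <- A*d
    np.add(C, T, C)        # C <- C + T
    return C

n = 6
A = np.arange(n, dtype=float)
B = np.ones(n)
C = np.empty(n)            # allocated once, reused on every call
T = np.empty(n)

for k in range(3):
    d = np.full(n, float(k))
    e = np.ones(n)
    result = having_fun_inplace(A, B, d, e, C, T)
```

Note that every call overwrites the previous result, so the caller has to
copy C if an earlier value is still needed. That is exactly the sort of
bookkeeping that makes this style harder to read.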

Anne

P.S. There are other approaches to reducing temporaries; numexpr was
created for this explicit purpose, so you might look into it if
you're interested. -A


