[Numpy-discussion] Array concatenation performance

Anne Archibald aarchiba at physics.mcgill.ca
Thu Jul 15 15:12:28 EDT 2010


On 15 July 2010 13:38, Sturla Molden <sturla at molden.no> wrote:
> Sorry for the previous mispost.
>
> This thread reminds me of something I've thought about for a while: Would
> NumPy benefit from an np.ndarraylist subclass of np.ndarray, that has an
> O(1) amortized append like Python lists? (Other methods of Python lists
> (pop, extend) would be worth considering as well.) Or will we get the
> same performance by combining a Python list and ndarray?

This idea, an appendable ndarray, has been discussed before; the
conclusion was that yes, it's sometimes useful, and in fact I think
there was code written for it. The idea was an ndarray subclass that
allowed appending through realloc and simple indexing, but made copies
when slices were taken (so that realloc wouldn't move the memory out
from under it). It could be "frozen" to an ordinary ndarray when
appending was done.
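
For illustration, here is a rough sketch of what such a container might
look like. It is not the code from that earlier discussion, which I have
not reproduced here; AppendableArray and its methods are hypothetical
names, it is a plain wrapper class rather than a true ndarray subclass,
it handles only the one-dimensional case, and it grows by allocating a
larger buffer and copying rather than by calling realloc:

```python
import numpy as np


class AppendableArray:
    """Hypothetical 1-D growable buffer; not part of NumPy."""

    def __init__(self, dtype=float, capacity=16):
        # Keep at least one slot so that doubling always grows the buffer.
        self._data = np.empty(max(1, capacity), dtype=dtype)
        self._size = 0

    def append(self, value):
        if self._size == self._data.shape[0]:
            # Double the capacity so that appends are O(1) amortized.
            grown = np.empty(2 * self._data.shape[0], dtype=self._data.dtype)
            grown[:self._size] = self._data[:self._size]
            self._data = grown
        self._data[self._size] = value
        self._size += 1

    def __len__(self):
        return self._size

    def __getitem__(self, index):
        # Index only the filled part; copy slices so that a later
        # reallocation cannot change or invalidate them.
        result = self._data[:self._size][index]
        return result.copy() if isinstance(result, np.ndarray) else result

    def freeze(self):
        # Return an ordinary ndarray holding only the filled part.
        return self._data[:self._size].copy()


buf = AppendableArray(dtype=np.int64, capacity=2)
for i in range(10):
    buf.append(i)
frozen = buf.freeze()
```

Doubling the buffer is the same trick Python lists use: each element is
copied only a bounded number of times on average, at the cost of up to
twice the memory while the array is still growing.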

To answer the OP's question, np.array is a do-what-I-mean function
that examines its argument and deduces the shape, size, and dtype for
the new array that it constructs. For example, if you pass it a Python
list, it must walk through the list and examine the objects to find
the numeric dtype that contains them (e.g. integer, real, or complex);
if there are any peculiar objects in there it will construct an object
array. (I don't know whether it tries int(), float(), and complex() on
every object or whether it examines their type information.) In any
case, all this imposes some bookkeeping overhead that is unnecessary
for np.concatenate. For three-dimensional arrays you might try
np.dstack, by the way, or you can concatenate along a new axis (not
that reshaping is usually expensive):

In [1]: a = np.random.randn(4,5)

In [2]: b = a[np.newaxis,...]

In [3]: np.concatenate([b,b,b], axis=0).shape
Out[3]: (3, 4, 5)
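
To make the two points above concrete, here is a small sketch (the
variable names are just for illustration). It shows np.array choosing a
dtype from the contents of a list, and the difference in where the new
axis ends up with np.dstack versus concatenating along a new leading
axis:

```python
import numpy as np

# np.array inspects the list elements to pick a dtype that holds them all.
ints = np.array([1, 2, 3])          # integer dtype
floats = np.array([1, 2.5, 3])      # one float promotes everything to float
cplx = np.array([1, 2.5, 3j])       # one complex promotes to complex
objs = np.array([1, None])          # no numeric dtype fits: object array

a = np.random.randn(4, 5)

# np.dstack puts the new axis last (the "depth" axis).
depth_stacked = np.dstack([a, a, a])

# Concatenating along a new leading axis puts it first instead.
b = a[np.newaxis, ...]
front_stacked = np.concatenate([b, b, b], axis=0)

print(ints.dtype.kind, floats.dtype.kind, cplx.dtype.kind, objs.dtype.kind)
print(depth_stacked.shape, front_stacked.shape)
```

So np.dstack gives shape (4, 5, 3) here while the np.newaxis approach
gives (3, 4, 5); which one you want depends on how you plan to index
the result.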

Anne

> Sturla
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>


