[Numpy-discussion] numpy.concatenate slower than slice copying

Chris Colbert sccolbert at gmail.com
Tue Aug 17 17:42:54 EDT 2010


Yes, concatenate is doing other work under the covers. In short, it supports
concatenating a list of arbitrary Python sequences into an array, and it
checks each element of the input sequence to ensure it is valid to concatenate.
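
A minimal sketch of the two approaches, assuming 2-D inputs that all share the
same number of columns. The helper name stack_rows_preallocated is made up for
illustration and is not part of numpy; it only mirrors the slice-copy loop from
the quoted message, and makes no claim about relative timings on your machine.

```python
import numpy as np

def stack_rows_preallocated(blocks):
    """Stack 2-D blocks vertically by copying each into one preallocated array.

    Hypothetical helper: assumes every block has the same number of columns
    and, unlike numpy.concatenate, does no shape validation of its own.
    """
    blocks = [np.asarray(b) for b in blocks]
    total_rows = sum(b.shape[0] for b in blocks)
    dim = blocks[0].shape[1]
    out = np.empty((total_rows, dim), dtype=np.result_type(*blocks))
    pos = 0
    for b in blocks:
        step = b.shape[0]
        out[pos:pos + step] = b  # plain slice assignment, one copy per block
        pos += step
    return out

# concatenate accepts arbitrary Python sequences, not only ndarrays:
# here a nested list, an ndarray, and a nested tuple.
mixed = np.concatenate(([[1, 2]], np.array([[3, 4]]), ((5, 6),)))

data = [np.ones((3, 4)), np.zeros((2, 4))]
stacked = stack_rows_preallocated(data)
```

For inputs that are already well-formed arrays, both routes should produce the
same result; the slice-copy version simply skips the per-element conversion and
checking.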

On Tue, Aug 17, 2010 at 9:03 AM, Zbyszek Szmek <zbyszek at in.waw.pl> wrote:

> Hi,
> this is a problem which came up when trying to replace a hand-written
> array concatenation with a call to numpy.vstack:
> for some array sizes,
>
>   numpy.vstack(data)
>
> runs > 20% longer than a loop like
>
>   alldata = numpy.empty((tlen, dim))
>   pos = 0
>   for x in data:
>       step = x.shape[0]
>       alldata[pos:pos+step] = x
>       pos += step
>
> (example script attached)
>
> $ python del_cum3.py numpy 10000 10000 1 10
> problem size: (10000x10000) x 1 = 10^8
> 0.816s <------------------------------- numpy.concatenate of 10 arrays
> 10000x10000
>
> $ python del_cum3.py concat 10000 10000 1 10
> problem size: (10000x10000) x 1 = 10^8
> 0.642s <------------------------------- slice manipulation giving the same
> result
>
> When the array size is reduced to 100x100 or so, the computation time goes
> to 0,
> so it seems that the dtype and dimension checking is negligible.
> Does numpy.concatenate do some extra work?
>
> Thanks for any pointers,
> Zbyszek
>
> PS. Architecture is amd64.
>    python2.6, numpy 1.3.0
>    or
>    python3.1, numpy 2.0.0.dev / trunk at 8510
>    give the same result.
>