[Numpy-discussion] Array concatenation performance

John Porter jporter at cambridgesys.com
Thu Jul 15 12:23:20 EDT 2010


OK, except that vstack doesn't seem to work for 2d arrays (without a
reshape), which is what I'm actually after.
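
For the 2d case, a minimal sketch of one way around it, assuming the goal
is a (3, rows, cols) result: vstack joins along the existing first axis,
so each input can be given a new leading axis before concatenating. The
array names and shapes below are made up for illustration only.

```python
import numpy as np

# Illustrative 2d inputs; the shapes are made up for this example.
a = np.arange(6).reshape(2, 3)
b = a + 10
c = a + 20

# vstack joins along the existing first axis: result is (6, 3), not (3, 2, 3).
stacked_rows = np.vstack((a, b, c))

# Giving each input a new leading axis lets concatenate build (3, 2, 3).
stacked_new_axis = np.concatenate([x[np.newaxis] for x in (a, b, c)])

# np.array([a, b, c]) produces the same values and shape.
reference = np.array([a, b, c])
```

Whether this keeps the speed advantage seen in the 1d timings would need
to be measured for the array sizes actually in use.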

The difference between the numpy.concatenate version and the numpy.array
version is fairly impressive though: I get a factor of more than 50x. It
would be nice to know why.
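
To make the comparison reproducible outside IPython, here is a rough
sketch using the standard timeit module. The helper names (build_with_*,
best_time) are hypothetical, and the numbers will depend on the machine
and the NumPy version, so the ratio may not match the one quoted here.

```python
import timeit

import numpy as np


def build_with_array(a):
    # Build the (3, n) result from a Python list of arrays.
    return np.array([a, a, a])


def build_with_empty(a):
    # Preallocate, then copy each row in. Using a.dtype here is an
    # assumption; the quoted session used np.empty's default dtype.
    out = np.empty((3,) + a.shape, dtype=a.dtype)
    out[0] = a
    out[1] = a
    out[2] = a
    return out


def build_with_concatenate(a):
    # Join end to end, then reshape into three rows.
    return np.concatenate([a, a, a]).reshape(3, -1)


def best_time(func, a, repeat=3, number=10):
    # Best per-call time in seconds over `repeat` runs of `number` calls.
    timings = timeit.repeat(lambda: func(a), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    a = np.arange(1000 * 1000)
    for func in (build_with_array, build_with_empty, build_with_concatenate):
        print("%-24s %8.2f ms" % (func.__name__, best_time(func, a) * 1e3))
```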

On Thu, Jul 15, 2010 at 4:15 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
> On Thu, Jul 15, 2010 at 11:05 AM, John Porter <jporter at cambridgesys.com> wrote:
>> You're right - I screwed up the timing for the one that works...
>> It does seem to be faster.
>>
>> I've always just built arrays using nx.array([]) in the past though
>> and was surprised that it performs so badly.
>>
>>
>> On Thu, Jul 15, 2010 at 2:41 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>>> On Thu, Jul 15, 2010 at 5:54 AM, John Porter <jporter at cambridgesys.com> wrote:
>>>> Has anyone got any advice about array creation? I've been using numpy
>>>> for a long time and have just noticed something unexpected about array
>>>> concatenation.
>>>>
>>>> It seems that using numpy.array([a,b,c]) is around 20 times slower
>>>> than creating an empty array and adding the individual elements.
>>>>
>>>> Other things that don't work well either:
>>>>    numpy.concatenate([a,b,c]).reshape(3,-1)
>>>>    numpy.concatenate([[a],[b],[c]])
>>>>
>>>> Is there a better way to efficiently create the array?
>>>>
>>>
>>> What was your timing for concatenate?  It wins for me given the shape of a.
>>>
>>> In [1]: import numpy as np
>>>
>>> In [2]: a = np.arange(1000*1000)
>>>
>>> In [3]: timeit b0 = np.array([a,a,a])
>>> 1 loops, best of 3: 216 ms per loop
>>>
>>> In [4]: timeit b1 = np.empty(((3,)+a.shape)); b1[0]=a;b1[1]=a;b1[2]=a
>>> 100 loops, best of 3: 19.3 ms per loop
>>>
>>> In [5]: timeit b2 = np.c_[a,a,a].T
>>> 10 loops, best of 3: 30.5 ms per loop
>>>
>>> In [6]: timeit b3 = np.concatenate([a,a,a]).reshape(3,-1)
>>> 100 loops, best of 3: 9.33 ms per loop
>>>
>
> One more.
>
> In [26]: timeit b4 = np.vstack((a,a,a))
> 100 loops, best of 3: 9.46 ms per loop
>
> Skipper
