[Numpy-discussion] Array concatenation performance

Skipper Seabold jsseabold at gmail.com
Thu Jul 15 12:33:02 EDT 2010


On Thu, Jul 15, 2010 at 12:23 PM, John Porter <jporter at cambridgesys.com> wrote:
> ok - except that vstack doesn't seem to work for 2d arrays (without a
> reshape) which is what I'm actually after.
>

Ah, then you might want hstack.  There are also column_stack and
row_stack if you need to go that route.
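
As a rough illustration of how these differ on 2d input, here is a
minimal sketch. The arrays x and y are made up for the example and
are not from your code.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)   # hypothetical 2-D input, shape (2, 3)
y = x + 10                       # second array of the same shape

v = np.vstack((x, y))            # joins along axis 0 (more rows)
h = np.hstack((x, y))            # joins along axis 1 (more columns)
c = np.column_stack((x, y))      # for 2-D inputs this matches hstack
n = np.array([x, y])             # adds a new leading axis instead

for name, arr in [("vstack", v), ("hstack", h),
                  ("column_stack", c), ("array", n)]:
    print(name, arr.shape)
```

Which one is right depends on whether you want a longer 2d array or a
new leading axis.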

> The difference between the numpy.concatenate version and numpy.array is fairly
> impressive though, I get a factor of > 50x. It would be nice to know why.
>

Sorry, I don't have any deep insight here.  There is probably just
overhead in the array creation.  Consider what happens if you try to
use hstack and company on lists:

In [1]: import numpy as np

In [2]: a = np.arange(1000*1000)

In [3]: b = a.tolist()

In [4]: timeit b0 = np.array((a,a,a))
1 loops, best of 3: 217 ms per loop

In [5]: timeit b1 = np.vstack((b,b,b))
1 loops, best of 3: 380 ms per loop
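
If you want to repeat the comparison outside IPython, something like
the following should work. It is only a sketch: the array is smaller
than the one above to keep the run short, the labels and the
`candidates` dict are just names I picked, and the numbers will depend
on your machine and NumPy version.

```python
import timeit
import numpy as np

a = np.arange(100 * 1000)   # smaller than the session above, to keep runs short
b = a.tolist()              # the same data as a plain Python list

# Hypothetical labels; each callable builds a (3, N) array.
candidates = {
    "np.array on arrays": lambda: np.array((a, a, a)),
    "np.vstack on arrays": lambda: np.vstack((a, a, a)),
    "np.array on lists": lambda: np.array((b, b, b)),
    "np.vstack on lists": lambda: np.vstack((b, b, b)),
}

results = {}
for label, func in candidates.items():
    # best of 3 repeats, 5 calls per repeat, reported per call
    best = min(timeit.repeat(func, number=5, repeat=3)) / 5
    results[label] = best
    print("%-22s %.3f ms per call" % (label, best * 1e3))

# Every variant produces the same shape; only the cost differs.
shapes = {label: func().shape for label, func in candidates.items()}
```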

Skipper


> On Thu, Jul 15, 2010 at 4:15 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>> On Thu, Jul 15, 2010 at 11:05 AM, John Porter <jporter at cambridgesys.com> wrote:
>>> You're right - I screwed up the timing for the one that works...
>>> It does seem to be faster.
>>>
>>> I've always just built arrays using nx.array([]) in the past though
>>> and was surprised that it performs so badly.
>>>
>>>
>>> On Thu, Jul 15, 2010 at 2:41 PM, Skipper Seabold <jsseabold at gmail.com> wrote:
>>>> On Thu, Jul 15, 2010 at 5:54 AM, John Porter <jporter at cambridgesys.com> wrote:
>>>>> Has anyone got any advice about array creation? I've been using numpy
>>>>> for a long time and have just noticed something unexpected about array
>>>>> concatenation.
>>>>>
>>>>> It seems that using numpy.array([a,b,c]) is around 20 times slower
>>>>> than creating an empty array and adding the individual elements.
>>>>>
>>>>> Other things that don't work well either:
>>>>>    numpy.concatenate([a,b,c]).reshape(3,-1)
>>>>>    numpy.concatenate([[a],[b],[c]])
>>>>>
>>>>> Is there a better way to efficiently create the array?
>>>>>
>>>>
>>>> What was your timing for concatenate?  It wins for me given the shape of a.
>>>>
>>>> In [1]: import numpy as np
>>>>
>>>> In [2]: a = np.arange(1000*1000)
>>>>
>>>> In [3]: timeit b0 = np.array([a,a,a])
>>>> 1 loops, best of 3: 216 ms per loop
>>>>
>>>> In [4]: timeit b1 = np.empty(((3,)+a.shape)); b1[0]=a;b1[1]=a;b1[2]=a
>>>> 100 loops, best of 3: 19.3 ms per loop
>>>>
>>>> In [5]: timeit b2 = np.c_[a,a,a].T
>>>> 10 loops, best of 3: 30.5 ms per loop
>>>>
>>>> In [6]: timeit b3 = np.concatenate([a,a,a]).reshape(3,-1)
>>>> 100 loops, best of 3: 9.33 ms per loop
>>>>
>>
>> One more.
>>
>> In [26]: timeit b4 = np.vstack((a,a,a))
>> 100 loops, best of 3: 9.46 ms per loop
>>
>> Skipper
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>