Has anyone got any advice about array creation? I've been using numpy for a long time and have just noticed something unexpected about array concatenation. It seems that using numpy.array([a, b, c]) is around 20 times slower than creating an empty array and assigning the individual elements. Other things that don't seem to work well either:

numpy.concatenate([a, b, c]).reshape(3, -1)
numpy.concatenate([[a], [b], [c]])

Is there a better way to efficiently create the array? See the following snippet:

---------------------------------------
import time
import numpy as nx

print 'numpy version', nx.version.version

t = time.time()
# test array
a = nx.arange(1000*1000)
print 'a ', time.time() - t

t = time.time()
# create array in the normal way..
b0 = nx.array([a, a, a])
print 'b0', time.time() - t

t = time.time()
# create using an empty array and filling the rows
b1 = nx.empty((3,) + a.shape)
b1[0] = a
b1[1] = a
b1[2] = a
print 'b1', time.time() - t

print 'equal', nx.all(b0 == b1)
-----------------------------------------

Produces the output:

numpy version 1.3.0
a  0.0019519329071
b0 0.286643981934
b1 0.0116579532623
equal True
On Thu, Jul 15, 2010 at 5:54 AM, John Porter <jporter@cambridgesys.com> wrote:
What was your timing for concatenate? It wins for me given the shape of a.

In [1]: import numpy as np

In [2]: a = np.arange(1000*1000)

In [3]: timeit b0 = np.array([a,a,a])
1 loops, best of 3: 216 ms per loop

In [4]: timeit b1 = np.empty(((3,)+a.shape)); b1[0]=a;b1[1]=a;b1[2]=a
100 loops, best of 3: 19.3 ms per loop

In [5]: timeit b2 = np.c_[a,a,a].T
10 loops, best of 3: 30.5 ms per loop

In [6]: timeit b3 = np.concatenate([a,a,a]).reshape(3,-1)
100 loops, best of 3: 9.33 ms per loop

Skipper
You're right - I screwed up the timing for the one that works... It does seem to be faster. I've always just built arrays using nx.array([...]) in the past, though, and was surprised that it performs so badly.

On Thu, Jul 15, 2010 at 2:41 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
OK - except that vstack doesn't seem to work for 2d arrays (without a reshape), which is what I'm actually after. The difference between the numpy.concatenate version and numpy.array is fairly impressive though; I get a factor of > 50x. It would be nice to know why.

On Thu, Jul 15, 2010 at 4:15 PM, Skipper Seabold <jsseabold@gmail.com> wrote:
On Thu, Jul 15, 2010 at 12:23 PM, John Porter <jporter@cambridgesys.com> wrote:
OK - except that vstack doesn't seem to work for 2d arrays (without a reshape), which is what I'm actually after.
Ah, then you might want hstack. There is also a column_stack and row_stack if you need to go that route.
The difference between the numpy.concatenate version and numpy.array is fairly impressive though; I get a factor of > 50x. It would be nice to know why.
Sorry, I don't have any deep insight here. There is probably just overhead in the array creation. Consider if you try to use hstack and company on lists:

In [1]: import numpy as np

In [2]: a = np.arange(1000*1000)

In [3]: b = a.tolist()

In [4]: timeit b0 = np.array((a,a,a))
1 loops, best of 3: 217 ms per loop

In [5]: timeit b1 = np.vstack((b,b,b))
1 loops, best of 3: 380 ms per loop

Skipper
Sorry for the previous mispost.

This thread reminds me of something I've thought about for a while: would NumPy benefit from an np.ndarraylist subclass of np.ndarray that has an O(1) amortized append like Python lists? (Other methods of Python lists, such as pop and extend, would be worth considering as well.) Or would we get the same performance by combining a Python list and an ndarray?

Sturla
On Thu, Jul 15, 2010 at 12:38 PM, Sturla Molden <sturla@molden.no> wrote:
Another idea that I always thought was interesting comes from the C++ standard library: the .reserve() call on the vector class, which allocates the requested capacity up front without changing the logical length. It is useful when you have a decent idea of the expected size of the array but still have to grow it iteratively. I don't know how well that would fit into this context, but I thought I ought to mention it.

Ben Root
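A minimal sketch of what a reserve()-style buffer might look like on top of a plain ndarray -- this is not a NumPy feature, and the class and method names here are made up for illustration:

---------------------------------------
import numpy as np

class ReservedBuffer(object):
    """Hypothetical reserve-then-append 1-d buffer (not part of NumPy)."""

    def __init__(self, capacity, dtype=np.float64):
        # reserve the full expected capacity up front, like vector::reserve()
        self._data = np.empty(capacity, dtype=dtype)
        self._n = 0  # logical length starts at zero

    def reserve(self, capacity):
        # grow the reserved capacity without changing the logical length
        if capacity > self._data.size:
            bigger = np.empty(capacity, dtype=self._data.dtype)
            bigger[:self._n] = self._data[:self._n]
            self._data = bigger

    def append(self, value):
        if self._n == self._data.size:
            self.reserve(2 * self._data.size)  # guess exhausted: fall back to doubling
        self._data[self._n] = value
        self._n += 1

    def toarray(self):
        # copy so the caller never holds a view of the internal buffer
        return self._data[:self._n].copy()

buf = ReservedBuffer(capacity=1000)
for x in range(10):
    buf.append(0.5 * x)
print(buf.toarray())
---------------------------------------

Appends stay cheap as long as the initial size guess holds; only when it runs out does the buffer have to reallocate and copy.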
On 15 July 2010 13:38, Sturla Molden <sturla@molden.no> wrote:
This idea, an appendable ndarray, has been discussed before; the conclusion was that yes, it's sometimes useful, and in fact I think there was code written for it. The idea was an ndarray subclass that allowed appending through realloc and simple indexing, but made copies when slices were taken (so that realloc wouldn't move the memory out from under it). It could be "frozen" to an ordinary ndarray when appending was done.

To answer the OP's question, np.array is a do-what-I-mean function that examines its argument and deduces the shape, size, and dtype for the new array that it constructs. For example, if you pass it a Python list, it must walk through the list and examine the objects to find the numeric dtype that contains them (e.g. integer, real, or complex); if there are any peculiar objects in there it will construct an object array. (I don't know whether it tries int(), float(), and complex() on every object or whether it examines their type information.) In any case, all this imposes some bookkeeping overhead that is unnecessary for np.concatenate.

For three-dimensional arrays you might try np.dstack, by the way, or you can concatenate along a new axis (not that reshaping is usually expensive):

In [1]: a = np.random.randn(4,5)

In [2]: b = a[np.newaxis,...]

In [3]: np.concatenate([b,b,b], axis=0).shape
Out[3]: (3, 4, 5)

Anne
Sturla Molden wrote:
Would NumPy benefit from an np.ndarraylist subclass of np.ndarray that has an O(1) amortized append like Python lists?

Yes, it would.
I don't think so. In general, the advice is to accumulate in a Python list, then turn it into an array, and this works pretty well. However, Python lists hold Python objects, so it's a bit inefficient, at least in terms of memory use.

I wrote such an accumulating array a while back. It uses a numpy array internally to store the data -- oversized at first, and expanded as needed with ndarray.resize(). The result was that it was a bit slower than the list approach in most cases -- though it should use less memory, and it had an advantage in accumulating more complex numeric data types.

I've thought about trying to write it in C or Cython rather than Python -- I think the extra function call overhead on indexing and all may be fairly expensive. There may be other ways to optimize it as well. It's also only 1-d, though it wouldn't be hard to make it n-d, but expandable only along the first dimension.

I've enclosed my current draft, along with some code to test and help profile it. Consider it a prototype, but I've used it successfully for a few things.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker@noaa.gov
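Not the enclosed draft itself, but a minimal sketch of the kind of accumulator described above (an oversized internal buffer grown in place, with copies rather than views handed back), assuming a 1/4 over-allocation rule and made-up names:

---------------------------------------
import numpy as np

class Accumulator(object):
    """Sketch of a 1-d accumulating array: an oversized internal buffer,
    expanded in place, with copies (never views) returned to the caller."""

    def __init__(self, dtype=np.float64):
        self._buf = np.empty(16, dtype=dtype)
        self._n = 0

    def append(self, value):
        if self._n == self._buf.size:
            # over-allocate by 1/4; refcheck=False is safe here only
            # because no views of _buf are ever handed out
            self._buf.resize(self._buf.size + self._buf.size // 4 + 1,
                             refcheck=False)
        self._buf[self._n] = value
        self._n += 1

    def __len__(self):
        return self._n

    def toarray(self):
        # compact to an ordinary ndarray when accumulation is done
        return self._buf[:self._n].copy()
---------------------------------------

It is used like a list: append in a loop, then call toarray() once at the end to get an ordinary contiguous ndarray.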
Christopher Barker wrote:
However, Python lists hold Python objects, so it's a bit inefficient, at least in terms of memory use.
I guess that is mainly in terms of memory use, as the Python (scalar) objects must be created for the call to append. np.array([...]) can also be inefficient, as Anne explained yesterday, but an appendable ndarray would not have that problem. But I don't know how important it is to save a few milliseconds worth of cpu time for this. At least not enough to convince me to spend my time on it.

The memory issue is more important though. But arrays with O(1) amortized append are memory inefficient on their own. So it might make more sense to keep a list of equal-length short buffers (e.g. ndarrays), and flatten them into one contiguous buffer when needed. At least that is what I have been doing when needing to gradually build up an ndarray (see the sketch after this message). The worst-case memory use will be like this, amortizing the Python object overhead for large N:

list of ndarrays:
  N*elsize when building up
  2*N*elsize when compacting
  N*elsize when used after compacting

appendable array:
  2*N*elsize when building up
  2*N*elsize when used before compacting
  3*N*elsize when compacting
  N*elsize when used after compacting

Personally I am not convinced about an appendable array, neither speed- nor memory-wise; I just wanted to hear what other people think. And there is one more thing that makes me sceptical: we cannot use realloc() to grow an appendable array. That is because it would segfault if there are array views in use (the views might suddenly get a dangling data pointer). And prohibiting realloc(), the performance of an appendable array might not be fantastic (still O(1) on average though). But as Anne said, this has probably been discussed in detail before.

Sturla
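A minimal sketch of the list-of-buffers approach described above; the chunk size and all names are arbitrary, and compaction happens only when toarray() is called:

---------------------------------------
import numpy as np

class ChunkedBuffer(object):
    """Accumulate into equal-length ndarray chunks held in a Python list,
    and only build one contiguous array when asked (illustrative sketch)."""

    def __init__(self, chunksize=4096, dtype=np.float64):
        self._chunksize = chunksize
        self._dtype = dtype
        self._full = []                                   # completed chunks
        self._current = np.empty(chunksize, dtype=dtype)  # chunk being filled
        self._fill = 0

    def append(self, value):
        if self._fill == self._chunksize:
            self._full.append(self._current)
            self._current = np.empty(self._chunksize, dtype=self._dtype)
            self._fill = 0
        self._current[self._fill] = value
        self._fill += 1

    def toarray(self):
        # the compaction step: memory use briefly reaches roughly 2*N*elsize
        return np.concatenate(self._full + [self._current[:self._fill]])
---------------------------------------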
Sturla Molden wrote:
But I don't know how important it is to save a few milliseconds worth of cpu time for this.

Not very.
The memory issue is more important though.

I agree.
But arrays with O(1) amortized append are memory inefficient on their own.
Well, yes, but only by the amount that you buffer them -- I've seen reference to methods that allocate twice as much memory as needed on each expansion, so that would be a maximum of a 2X hit. In poking around the Python source, it looks like lists and array.arrays over-allocate by 1/16 (once the arrays are large, anyway). My code defaults to over-allocating by 1/4, and in my very limited testing I didn't see a performance gain from allocating more than that. So it's a maximum 25% memory hit -- and it's user-configurable. The short counting script after this message gives a rough feel for the trade-off.
So it might make more sense to keep a list of equal-length short buffers (e.g. ndarrays), and flatten them into one contiguous buffer when needed.

I did that in a C extension I wrote a while back for reading big text files, and I found that the size of the buffer hardly mattered to performance. In my newer code, I found it a lot easier to over-allocate, and even if you over-allocate by only 1/10 or 1/16, performance is still pretty good.
Personally I am not convinced about an appendable array, neither speed- nor memory-wise; I just wanted to hear what other people think.
I still think it's a good idea -- though clearly not important enough for me to have gone further with my code (though I am using it now). For me, the reasons are memory and the ability to deal natively with numpy data types that don't map directly to Python types. I think there could be a real benefit to loadtxt() for instance.
We cannot use realloc() to grow an appendable array. That is because it would segfault if there are array views in use (the views might suddenly get a dangling data pointer).

I used ndarray.resize(), which I'm pretty sure uses realloc(). That's why I didn't subclass ndarray, but rather kept an ndarray as the internal buffer, which never gets a view taken on it: when slicing, a copy is returned. But you're right, this limits the utility, as it's not a first-class citizen of numpy -- you still use it like you do a list: accumulate in it, then convert to a regular numpy array for further processing.

I'd love to see someone with more C chops than me pick this up -- but Sturla's right, it's hard to find the motivation for the relatively small gains.

-Chris
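The counting script referred to above -- a rough way to see how the over-allocation fraction trades the number of buffer grows against slack space when appending a million items; the starting capacity of 16 is an arbitrary assumption:

---------------------------------------
def count_grows(n_items, overalloc):
    """Count how many times a buffer must grow to hold n_items,
    growing by the fraction `overalloc` each time it fills up."""
    capacity, grows = 16, 0
    for filled in range(n_items):
        if filled == capacity:
            capacity = int(capacity * (1.0 + overalloc)) + 1
            grows += 1
    return grows

# doubling vs. the 1/4 and 1/16 over-allocation mentioned above
for overalloc in (1.0, 0.25, 0.0625):
    print('%-7s %d grows' % (overalloc, count_grows(1000 * 1000, overalloc)))
---------------------------------------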
participants (7)
- Anne Archibald
- Benjamin Root
- Christopher Barker
- Fabrice Silva
- John Porter
- Skipper Seabold
- Sturla Molden