Mailman 3 flatten() without copy - is this possible? - NumPy-Discussion - python.org


flatten() without copy - is this possible?


dmitrey

1 Jun 2007

5:05 p.m.

Hi all. In "NumPy for Matlab Users" I read

y = x.flatten(1)

turn array into vector (note that this forces a copy)

Is there any way to do the trick without copying? What are the problems here? It is just another way of indexing the array elements...

Thx, D.


Anne Archibald

4 Jun 2007

12:13 a.m.

On 01/06/07, dmitrey wrote:

y = x.flatten(1)

turn array into vector (note that this forces a copy)

Is there any way to do the trick without copying? What are the problems here? It is just another way of indexing the array elements...

It is sometimes possible to flatten an array without copying and sometimes not.

For numpy, a vector is a single block of memory in which elements of uniform type are spaced at a uniform distance. This last point is the key: the distance is called the "stride", and it need not be the same as the size of an element (so arange(10)[::3] can be created without a copy).

A multidimensional array simply has several strides, one for each dimension. Thus ones((10,10,10)) simply keeps track of the stride for a row, the stride for a column, and the stride for a layer. If you want to transpose two axes, the data is not copied; instead the strides are simply exchanged. Under normal circumstances one need not care what the strides are or how the cells are laid out in memory, as numpy hides that from normal users.

What about flattening an array? It should turn an array into a vector, that is, take an array with n different strides and lengths and create a single array with a single stride and length. The order of the resulting elements needs to be specified; numpy normally defaults to "C order", which means that A[3,4,5] and A[3,4,6] are adjacent in the resulting array but A[3,4,5] and A[4,4,5] are not. (Note that this is a logical operation; the organization of the underlying array is irrelevant for the result.)

If you want to ensure that no copy is made, you need to ensure that the stride between elements of the array you're flattening is always the same. Taking a 10-by-10-by-10 array A, the spacing between A[3,4,5] and A[3,4,6] needs to be the same as the spacing between A[3,4,6] and A[3,4,7]. This is automatic. But the spacing also needs to be the same as the spacing between A[3,4,9] and A[3,5,0]. This is not automatic, and often does not occur. In such cases numpy must make a copy to ensure that the resulting array is uniformly strided.

What cases *don't* require a copy? Well, let's look at some examples:

    A = ones((10,10,10))
    reshape(A,(-1,))              # No copy needed
    reshape(A[:,:,:5],(-1,))      # Copy needed
    reshape(A[:,:,::2],(-1,))     # No copy needed
    reshape(A[:,::2,:],(-1,))     # Copy needed
    reshape(A[:5,:,:],(-1,))      # No copy needed
    reshape(A.transpose(),(-1,))  # Copy needed

Note that none of the reindexing operations require a copy, but some of the reshapes do.

It turns out to be nontrivial to detect all the cases where a copy can be avoided while reshaping, and IIRC numpy misses some (old versions of numpy almost always copied). But a freshly-created array is normally guaranteed to be reshapable without a copy.

If you want to try reshaping an array without a copy, you can try assigning to .shape:

    In [3]: A = ones((10,10,10))[:,:5,:]

    In [4]: A.shape = (-1,)
    ---------------------------------------------------------------------------
    Traceback (most recent call last)

    /home/peridot/physics-projects/pulsed-flux/writings/<ipython console> in <module>()

    : incompatible shape for a non-contiguous array

and

    In [7]: A = ones((10,10,10))[:5,:,:]

    In [8]: A.shape = (-1,)

Anne
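The stride argument above can be checked directly. The following is only a sketch, assuming a recent NumPy imported as np and the default float64 dtype (8 bytes per element). The helper name flattens_without_copy is made up for this illustration, and np.shares_memory is used as a stand-in for "no copy was made".

```python
import numpy as np

def flattens_without_copy(a):
    """Hypothetical helper: True if a C-order flatten of `a` is a view."""
    return np.shares_memory(a, a.reshape(-1))

A = np.ones((10, 10, 10))        # float64, so each element is 8 bytes

# Strides are in bytes, one per dimension.
print(A.strides)                 # (800, 80, 8)
print(A[:, :, ::2].strides)      # (800, 80, 16): 5 * 16 == 80, still uniform
print(A[:, ::2, :].strides)      # (800, 160, 8): 10 * 8 != 160, not uniform
print(A.transpose().strides)     # (8, 80, 800): axes exchanged, data untouched

# The six cases from the message above.
cases = {
    "A": A,
    "A[:,:,:5]": A[:, :, :5],
    "A[:,:,::2]": A[:, :, ::2],
    "A[:,::2,:]": A[:, ::2, :],
    "A[:5,:,:]": A[:5, :, :],
    "A.transpose()": A.transpose(),
}
for label, view in cases.items():
    print(label, "view" if flattens_without_copy(view) else "copy")
```

Whether a particular NumPy version finds every copy-free case is an implementation detail, as noted above, so the printed result is what that version does rather than a guarantee.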


dmitrey

5 Jun 2007

10:36 p.m.

Thank you, but all your examples deal with 3-dimensional arrays, and I still don't understand: is it possible somehow for 2-dimensional arrays or not?

D.


Charles R Harris

6 Jun 2007

12:21 a.m.

On 6/5/07, dmitrey wrote:

Thank you, but all your examples deal with 3-dimensional arrays, and I still don't understand: is it possible somehow for 2-dimensional arrays or not? D.

There is nothing special about the number of dimensions; all arrays have the same methods.

<snip>

Chuck


Anne Archibald

9 Jun 2007

6:52 a.m.

On 05/06/07, Charles R Harris wrote:

On 6/5/07, dmitrey wrote:

...
Thank you, but all your examples deal with 3-dimensional arrays, and I still don't understand: is it possible somehow for 2-dimensional arrays or not? D.

There is nothing special about the number of dimensions; all arrays have the same methods.

Of course. But he was asking whether the examples I was giving, of arrays that could and couldn't be flattened, would work in 2D. There is nothing special about 3D; there are 2D matrices that can be flattened and 2D matrices that can't. Think about the matrix in terms of strides and lengths specifying how the elements are laid out in memory and things should become much clearer.

I suspect the numpy book (which is not expensive) does a better job of explaining it.

Anne
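For the 2D case, the same reasoning can be tried on a small matrix. This is a minimal sketch, assuming a recent NumPy imported as np and float64 elements. The names M and is_view_when_flattened are invented for the illustration, and np.shares_memory again stands in for "no copy".

```python
import numpy as np

def is_view_when_flattened(m, order="C"):
    """Hypothetical helper: True if flattening `m` in `order` avoids a copy."""
    return np.shares_memory(m, m.reshape(-1, order=order))

M = np.arange(12.0).reshape(3, 4)    # 3-by-4 float64 matrix in C order
print(M.strides)                     # (32, 8): 4 elements of 8 bytes per row

# 2D matrices that can be flattened without a copy:
print(is_view_when_flattened(M))           # whole matrix
print(is_view_when_flattened(M[:2, :]))    # leading rows stay one even run
print(is_view_when_flattened(M[:, ::2]))   # strides (32, 16): 2 * 16 == 32

# 2D matrices that cannot:
print(is_view_when_flattened(M[:, :2]))    # gap at the end of every row
print(is_view_when_flattened(M.T))         # C-order walk over swapped strides

# The transpose is evenly spaced if it is walked in Fortran (column) order.
print(is_view_when_flattened(M.T, order="F"))
```

The pattern matches the 3D examples: what matters is whether one step size covers every element in the requested order, not how many dimensions there are.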


Andrew Jaffe

4 Jun 2007

7:47 p.m.

dmitrey wrote:

hi all. in the numpy for matlab users I read

y = x.flatten(1)

turn array into vector (note that this forces a copy)

Is there any way to do the trick without copying? What are the problems here? It is just another way of indexing the array elements...

One important question is whether you actually need the new vector, or whether you just want a flat index into the array; if the latter, you can always [I think] use x.flat[one_d_index]. (But note that y=x.flat gives an iterator, not a new array.)

Andrew
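The difference between flat indexing and building a new vector can be sketched as follows, assuming a recent NumPy imported as np. The array x and the values written into it are arbitrary choices for the example. ravel() is not mentioned in the thread; it is included here as the usual "view if possible" counterpart to flatten().

```python
import numpy as np

x = np.arange(6).reshape(2, 3)       # [[0, 1, 2], [3, 4, 5]]

# x.flat is a flat iterator over x, not a new array; indexing it
# reads or writes x itself, so no copy is involved.
print(type(x.flat).__name__)         # flatiter
x.flat[4] = 40                       # flat index 4 is x[1, 1] in C order
print(x[1, 1])                       # 40

# ravel() returns a view when the layout allows it, so writes reach x.
r = x.ravel()
r[0] = -1
print(x[0, 0])                       # -1

# flatten() always returns a copy, so writes to it leave x untouched.
f = x.flatten()
f[1] = 99
print(x[0, 1])                       # 1
```

If only element access by a one-dimensional index is needed, x.flat avoids the question of copying altogether; a separate vector is only required when something downstream needs an actual array.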
