On 5/23/07, Albert Strasheim <fullung@gmail.com> wrote:
Hello all

On Wed, 23 May 2007, Anne Archibald wrote:

> On 23/05/07, Albert Strasheim <fullung@gmail.com> wrote:
>
> > Consider the following example:
>
> First a comment: almost nobody needs to care how the data is stored
> internally. Try to avoid looking at the flags unless you're
> interfacing with a C library. The nice feature of numpy is that it
> hides all that junk - strides, contiguous storage, iteration, what
> have you - so that you don't have to deal with it.

As luck would have it, I am interfacing with a C library.

> > Is it correct that the F_CONTIGUOUS flag is set in the case of the fancy
> > indexed x? I'm running NumPy 1.0.3.dev3792 here.
>
> Numpy arrays are always stored in contiguous blocks of memory with
> uniform strides. The "CONTIGUOUS" flag actually means something
> totally different, which is unfortunate, but in any case, "fancy
> indexing" can't be done as a simple reindexing operation. It must make
> a copy of the array. So what you're seeing is the flags of a fresh new
> array, created from scratch (and numpy always creates arrays in C
> order internally, though that is an implementation detail you should
> not rely on).

If you are correct that this is in fact a fresh new array, I really
don't understand where the values of these flags come from. To recap:

In [19]: x = N.zeros((3,2))

In [20]: x.flags
Out[20]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

In [21]: x[:,[1,0]].flags
Out[21]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

So since x and x[:,[1,0]] are both new arrays, shouldn't their flags be
identical? I'd expect at least C_CONTIGUOUS and OWNDATA to be True.

The contiguous flags refer to how the data is laid out in memory. In this
case it appears that fancy indexing creates the new array one column at a
time, storing the result's first column and then its second, so that the
new array is indeed F_CONTIGUOUS. Assuming I correctly understand the
behaviour of the order argument to tostring, which is debatable, that is
indeed what happens.

In [28]: x = arange(6, dtype=int8).reshape(3,2)

In [29]: x
Out[29]:
array([[0, 1],
       [2, 3],
       [4, 5]], dtype=int8)

In [30]: y = x[:,[1,0]]

In [31]: y
Out[31]:
array([[1, 0],
       [3, 2],
       [5, 4]], dtype=int8)

In [32]: x.tostring('A')
Out[32]: '\x00\x01\x02\x03\x04\x05'

In [33]: y.tostring('A')
Out[33]: '\x01\x03\x05\x00\x02\x04'
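
Since you are handing the data to a C library, here is a rough sketch of how
I would check the layout and force row-major order before passing the buffer
along. It assumes a recent NumPy, where tobytes is the current name for
tostring. The variable names are only for illustration, and I have not
asserted the flags of the fancy-indexed result, since those may depend on the
NumPy version.

```python
import numpy as np

x = np.arange(6, dtype=np.int8).reshape(3, 2)
y = x[:, [1, 0]]  # fancy indexing returns a copy, not a view of x

# The result does not share any memory with the original array.
is_copy = not np.shares_memory(x, y)

# Strides give the number of bytes to step along each axis.
x_strides = x.strides  # (2, 1): row-major 3x2 array of 1-byte items
y_strides = y.strides  # depends on how the copy was laid out

# order='A' dumps in Fortran order if the array is F-contiguous
# (and not C-contiguous), otherwise in C order.
raw = y.tobytes(order="A")

# For a C library that expects row-major data, request a C-contiguous
# array explicitly; this copies only when the layout requires it.
c_ready = np.ascontiguousarray(y)

print(y.flags["C_CONTIGUOUS"], y.flags["F_CONTIGUOUS"], y_strides)
print(c_ready.flags["C_CONTIGUOUS"], c_ready.strides)
```

Whatever the flags of y turn out to be, c_ready holds the same values in
row-major order, so its buffer is safe to pass to code that assumes C layout.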

Chuck