[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

Tue Apr 2 14:04:05 EDT 2013

Hi,

On Tue, Apr 2, 2013 at 12:29 PM, Chris Barker - NOAA Federal
<chris.barker at noaa.gov> wrote:
> On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>> Thank you for the compliment, it's more enjoyable than other potential
>> explanations of my confusion (sigh).
>>
>> But, I don't think that is the explanation.
>
> well, the core explanation is these are difficult and intertwined
> concepts...And yes, better names and better docs can help.
>
>> Last, as soon as we came to the distinction between index order and
>> memory layout, it was clear.
>>
>> We all agreed that this was an important distinction that would
>> improve numpy if we made it.
>
> yup.
>
>> I think you agree that there is potential for confusion, and there
>> doesn't seem any reason to continue with that confusion if we can come
>> up with a clearer name.
>
> well, changing an API is not to be taken lightly -- we are not
> discussing how we'd do it if we were to start from fresh here. So any
> change should make things enough better that it is worth dealing with
> the process of the change.

Yes, for sure.  I was only trying to point out that we are not talking
about breaking backwards compatibility.

>> So here is a compromise proposal.
>
>> * Preferring the names 'c-style' and 'f-style' for the indexing order
>> case (ravel, reshape, flatiter)
>
>> * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
>> backwards-compatibility problem.
>
> seems reasonable enough -- though even with the backward
> compatibility, users will be faced with many, many older examples and
> docs that use 'C' and 'F', while the new ones refer to the new names
> -- might this be cause for even more confusion (at least for a few
> years...)

I doubt it would be 'even more' confusion.  They would only have to
read the docstrings to work out what is meant, and I believe, with
better names, they'd be less likely to fall into the traps I fell
into, at least.

> leaving me with an equivocal +0 on that ....
>
> another thought:
>
> """
> Definition: np.ravel(a, order='C')
>
> A 1-D array, containing the elements of the input, is returned.  A copy is
> made only if needed.
>
> Parameters
> ----------
> a : array_like
>     Input array.  The elements in ``a`` are read in the order specified by
>     `order`, and packed as a 1-D array.
> order : {'C','F', 'A', 'K'}, optional
>     The elements of ``a`` are read in this order. 'C' means to view
>     the elements in C (row-major) order. 'F' means to view the elements
>     in Fortran (column-major) order. 'A' means to view the elements
>     in 'F' order if a is Fortran contiguous, 'C' order otherwise.
>     'K' means to view the elements in the order they occur in memory,
>     except for reversing the data when strides are negative.
>     By default, 'C' order is used.
> """
>
> Does ravel need to support the 'A' and 'K' options? It's kind of an
> advanced use, and really more suited to .view(), perhaps?
>
> What I'm getting at is that this version of ravel() conflates the two
> concepts: virtual ordering and memory ordering in one function --
> maybe they should be considered as two different functions altogether
> -- I think that would make for less confusion.

I think it would conceal the confusion only.   If we don't have 'A'
and 'K' in there, it allows us to keep the dream of a world where 'C'
only refers to index ordering, but *only for this docstring*.   As
soon as somebody does ``np.array(arr, order='C')`` they will find
themselves in conceptual trouble again.
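
To make the two meanings concrete, here is a minimal sketch.  The array
names and the 2x3 example are only illustrative assumptions, not anything
from the docstring.  It shows 'C' / 'F' acting as an index order in
``np.ravel`` and as a memory layout in ``np.array``:

```python
import numpy as np

# Same values in two memory layouts (names are illustrative only).
c_arr = np.arange(6).reshape(2, 3)   # C-contiguous: [[0, 1, 2], [3, 4, 5]]
f_arr = np.array(c_arr, order='F')   # Fortran-contiguous copy, same values

# In np.array, order= chooses the memory layout; indexing is unchanged.
layout_changed = f_arr.flags.f_contiguous and not f_arr.flags.c_contiguous
values_same = np.array_equal(c_arr, f_arr)

# In np.ravel, 'C' and 'F' choose the index order used to read the
# elements, whatever the memory layout of the input happens to be.
ravel_c = [np.ravel(x, order='C').tolist() for x in (c_arr, f_arr)]
ravel_f = [np.ravel(x, order='F').tolist() for x in (c_arr, f_arr)]

# 'A' and 'K', by contrast, give results that depend on the memory layout.
ravel_a = [np.ravel(x, order='A').tolist() for x in (c_arr, f_arr)]
ravel_k = [np.ravel(x, order='K').tolist() for x in (c_arr, f_arr)]

print(ravel_c)  # identical for both layouts
print(ravel_f)  # identical for both layouts
print(ravel_a)  # differs between the two layouts
print(ravel_k)  # differs between the two layouts
```

So the same letter 'C' answers "in what order do I walk the indices?" in
one call and "how are the bytes laid out?" in the other.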

Cheers,

Matthew