[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

Chris Barker - NOAA Federal chris.barker at noaa.gov
Tue Apr 2 12:29:15 EDT 2013


On Mon, Apr 1, 2013 at 10:15 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Thank you for the compliment, it's more enjoyable than other potential
> explanations of my confusion (sigh).
>
> But, I don't think that is the explanation.

well, the core explanation is that these are difficult and intertwined
concepts... And yes, better names and better docs can help.

> Last, as soon as we came to the distinction between index order and
> memory layout, it was clear.
>
> We all agreed that this was an important distinction that would
> improve numpy if we made it.

yup.

> I think you agree that there is potential for confusion, and there
> doesn't seem any reason to continue with that confusion if we can come
> up with a clearer name.

well, changing an API is not to be taken lightly -- we are not
discussing how we'd do it if we were starting fresh here. So any
change should make things enough better that it is worth dealing with
the process of the change.

> So here is a compromise proposal.

> * Preferring the names 'c-style' and 'f-style' for the indexing order
> case (ravel, reshape, flatiter)

> * Leaving 'C' and 'F' as functional shortcuts, so there is no possible
> backwards-compatibility problem.

seems reasonable enough -- though even with the backward
compatibility, users will be faced with many, many older examples and
docs that use 'C' and 'F', while the new ones refer to the new names
-- might this be cause for even more confusion (at least for a few
years...)?

leaving me with an equivocal +0 on that ....

another thought:

"""
Definition: np.ravel(a, order='C')

A 1-D array, containing the elements of the input, is returned.  A copy is
made only if needed.

Parameters
----------
a : array_like
    Input array.  The elements in ``a`` are read in the order specified by
    `order`, and packed as a 1-D array.
order : {'C','F', 'A', 'K'}, optional
    The elements of ``a`` are read in this order. 'C' means to view
    the elements in C (row-major) order. 'F' means to view the elements
    in Fortran (column-major) order. 'A' means to view the elements
    in 'F' order if a is Fortran contiguous, 'C' order otherwise.
    'K' means to view the elements in the order they occur in memory,
    except for reversing the data when strides are negative.
    By default, 'C' order is used.
"""

Does ravel need to support the 'A' and 'K' options? It's kind of an
advanced use, and really more suited to .view(), perhaps?

What I'm getting at is that this version of ravel() conflates the two
concepts: virtual ordering and memory ordering in one function --
maybe they should be considered as two different functions altogether
-- I think that would make for less confusion.
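To make the conflation concrete, here is a minimal sketch (the array
names are just for illustration, and it assumes a reasonably current
numpy). 'C' and 'F' give the same result whatever the memory layout of
the input, while 'A' and 'K' give results that depend on it:

```python
import numpy as np

# Same values, two different memory layouts.
a_c = np.arange(6).reshape(2, 3)   # C-contiguous: [[0 1 2], [3 4 5]]
a_f = np.asfortranarray(a_c)       # Fortran-contiguous copy, same values

# 'C' and 'F' are index orders: the memory layout does not matter.
c_from_c = a_c.ravel(order='C')    # [0 1 2 3 4 5]
c_from_f = a_f.ravel(order='C')    # [0 1 2 3 4 5]
f_from_c = a_c.ravel(order='F')    # [0 3 1 4 2 5]
f_from_f = a_f.ravel(order='F')    # [0 3 1 4 2 5]

# 'A' and 'K' look at the memory layout, so the two inputs differ.
a_from_c = a_c.ravel(order='A')    # [0 1 2 3 4 5]
a_from_f = a_f.ravel(order='A')    # [0 3 1 4 2 5]
k_from_c = a_c.ravel(order='K')    # [0 1 2 3 4 5]
k_from_f = a_f.ravel(order='K')    # [0 3 1 4 2 5]
```

So with 'C' or 'F' two arrays that compare equal always ravel to the
same thing; with 'A' or 'K' they may not.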

Éric Depagne wrote:
> 'row-first' and 'column-first' (or anything else that may be more explicit) ?

I like more explicit, but 'row-first' and 'column-first' have two
issues: 1) what about higher-dimensional arrays? and 2) the "row" and
"column" convention is only that -- a convention. I guess it's the
way numpy prints, which gives it some meaning, but there are times
when arrays are ordered (col, row) rather than (row, col) (PIL uses
that format, for instance).
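On the higher-dimension point, a small sketch of what the two orders
actually mean for a 3-D array (the shape is arbitrary, picked only for
illustration): 'C' means the last index varies fastest and 'F' means
the first index varies fastest, with no rows or columns involved.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # a[i, j, k] == 12*i + 4*j + k

flat_c = a.ravel(order='C')          # last index (k) varies fastest
flat_f = a.ravel(order='F')          # first index (i) varies fastest

first_c = flat_c[:4].tolist()        # [0, 1, 2, 3]
first_f = flat_f[:4].tolist()        # [0, 12, 4, 16]

# The (i, j, k) indices visited first in each order.
idx_c = [tuple(int(n) for n in np.unravel_index(p, a.shape, order='C'))
         for p in range(3)]          # [(0, 0, 0), (0, 0, 1), (0, 0, 2)]
idx_f = [tuple(int(n) for n in np.unravel_index(p, a.shape, order='F'))
         for p in range(3)]          # [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
```

"Last index fastest" and "first index fastest" carry over to any
number of dimensions, which 'row-first' and 'column-first' don't.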

I like the Z and N, and maybe even if they aren't used as flag names,
they could be used in the docstring -- nice and ascii safe....

Nathaniel wrote:
> To see this, note that semantically it would be perfectly possible for .reshape() to
> take *two* order= arguments: one to specify the coordinate space mapping (2),
> and the other to specify the desired memory layout used by the result array (1). Of
> course we shouldn't actually do this, because in the unlikely event that someone
> actually wanted both of these they could just call asarray() on the output of
> reshape().

exactly -- my point about keeping the raveling with "virtual order"
separate from raveling with memory order -- it's really not critical
that you can do both with one function call.
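A sketch of the two-step version Nathaniel describes, assuming the
current behavior of reshape() and asarray() (variable names are only
illustrative): the order= passed to reshape() picks the coordinate
mapping, and a separate asarray() call picks the memory layout of the
result.

```python
import numpy as np

a = np.arange(6)

# Step 1, coordinate mapping: fill a (2, 3) array in Fortran index order.
b = a.reshape((2, 3), order='F')     # [[0 2 4], [1 3 5]]

# Step 2, memory layout: ask for a C-contiguous array with the same values.
c = np.asarray(b, order='C')

same_values = np.array_equal(b, c)   # True: only the layout may differ
c_is_c_contiguous = c.flags['C_CONTIGUOUS']
```

Two calls, two concepts, and neither call has to know about the other.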

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


