[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

Matthew Brett matthew.brett at gmail.com
Tue Apr 2 17:21:18 EDT 2013


Hi,

On Tue, Apr 2, 2013 at 2:44 PM, Nathaniel Smith <njs at pobox.com> wrote:
> On Tue, Apr 2, 2013 at 6:59 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>> On Tue, Apr 2, 2013 at 7:32 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>> Maybe we should go through and rename "order" to something more descriptive
>>> in each case, so we'd have
>>>   a.reshape(..., index_order="C")
>>>   a.copy(memory_order="F")
>>> etc.?
>>
>> That seems like a good idea.  If you are proposing it, I am "+1".
>
> Well, I'm just throwing it out there as an idea, but if people like
> it, nothing better turns up, and someone implements it, then I'm not
> going to say no...

I would certainly be happy to implement it if there were some agreement
that it was the right way to go.

>>> This way if you just bumped into these while reading code, it would still be
>>> immediately obvious that they were dealing with totally different concepts.
>>> Compare to reading along without the docs and seeing
>>>   a.reshape(..., order="Z")
>>>   a.copy(order="C")
>>> That'd just leave me even more baffled than the current system -- I'd start
>>> thinking that "Z" and "C" somehow were different options for the same order=
>>> option, so they must somehow mean ways of ordering elements?
>>
>> I don't think you'd be more baffled than the current system, which, as
>> you say, conflates two orthogonal concepts.  Rather, I think it would
>> cause the user to stop, as they should, and consider what concept
>> order is using in this case.
>>
>> I don't find it difficult to explain this:
>>
>> There are two different but related concepts of 'order'
>>
>> 1) The memory layout of the array
>> 2) The index ordering used to unravel the array
>>
>> If you see 'Z' or 'N' for 'order' - that refers to index ordering.
>> If you see 'C' or 'F' for 'order' - that refers to memory layout.
>
> Sure, you can write it down like this, but compare to this system:
>
> If you see 'Z' or 'N' for 'order' - that refers to memory ordering.
> If you see 'C' or 'F' for 'order' - that refers to index layout.
>
> Now suppose I forget which system we actually use -- how do you
> remember which system is which? It's totally arbitrary.

I don't think it is completely arbitrary. 'Z' / 'N' come from the
process of getting elements from a 2D array in a certain order ('Z'
traces along rows first, 'N' traces along columns first), and the
C / F memory layouts correspond to exactly what C and Fortran do
(whereas in C and Fortran themselves, the concept of index order
cannot be separated from memory order).
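
To make the distinction concrete, here is a small sketch using the
current API, where both concepts share the name 'order'. It assumes a
recent NumPy; 'Z' and 'N' are only names proposed in this thread and
are not accepted by NumPy, so the sketch uses "C" and "F" throughout.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C-contiguous in memory by default

# Memory layout: copy(order="F") changes how the elements are stored,
# but indexing the copy still gives the same values.
f = a.copy(order="F")
same_values = np.array_equal(a, f)                       # True
layouts = (bool(a.flags["C_CONTIGUOUS"]),
           bool(f.flags["F_CONTIGUOUS"]))                # (True, True)

# Index ordering: ravel(order=...) chooses the order in which elements
# are read out by index, whatever the memory layout happens to be.
c_walk_of_f = f.ravel(order="C")   # last index varies fastest
f_walk_of_a = a.ravel(order="F")   # first index varies fastest

print(same_values, layouts)
print(c_walk_of_f)   # [0 1 2 3 4 5]
print(f_walk_of_a)   # [0 3 1 4 2 5]
```

The F-layout copy unravels "C-wards" to the same sequence as the
original, which is the sense in which the two meanings are orthogonal.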

> Now I have
> even more things to remember. And I'm certainly not going to work out
> this distinction just from seeing these used once or twice in someone
> else's code.

The extra things you have to remember are a) that there is a
distinction (and this is good) and b) which of the two concepts
'Z' and 'C' each refer to.  I think the benefit from a) is much
greater than the small load from b).

> This is like observing that if I say "go North" then it's ambiguous
> about whether I want you to drive or walk, and concluding that we need
> new words for the directions depending on what sort of vehicle you
> use. So "go North" means drive North, "go htuoS" means walk North,
> etc. Totally silly. Makes much more sense to have one set of words for
> directions, and then make clear from context what the directions are
> used for -- "drive North", "walk North". Or "iterate C-wards", "store
> F-wards".
>
> "C" and "Z" mean exactly the same thing -- they describe a way of
> unraveling a cube into a straight line. The difference is what we do
> with the resulting straight line. That's why I'm suggesting that the
> distinction should be made in the name of the argument.

Could you unpack that for the 'ravel' docstring?  Because these
options all refer to the way of unraveling and not the memory layout
that results.
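
As a rough illustration of that point (a sketch assuming current
NumPy behavior, not proposed docstring text): the 1-D result of ravel
has only one possible contiguous layout, so 'order' there can only be
choosing the sequence in which elements are read.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

r_c = a.ravel(order="C")   # read with the last index varying fastest
r_f = a.ravel(order="F")   # read with the first index varying fastest

# A contiguous 1-D array is both C- and F-contiguous, so the keyword
# cannot be describing the memory layout of the result.
flags_c = (bool(r_c.flags["C_CONTIGUOUS"]), bool(r_c.flags["F_CONTIGUOUS"]))
flags_f = (bool(r_f.flags["C_CONTIGUOUS"]), bool(r_f.flags["F_CONTIGUOUS"]))

# reshape uses the same index-order meaning: read in F order, then
# fill the new shape in F order.
b = a.reshape((3, 2), order="F")

print(r_c, flags_c)   # [0 1 2 3 4 5] (True, True)
print(r_f, flags_f)   # [0 3 1 4 2 5] (True, True)
print(b.tolist())     # [[0, 4], [3, 2], [1, 5]]
```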

Cheers,

Matthew


