[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

Matthew Brett matthew.brett at gmail.com
Tue Apr 2 21:09:30 EDT 2013


On Tue, Apr 2, 2013 at 7:09 PM,  <josef.pktd at gmail.com> wrote:
> On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>>>> This is like observing that if I say "go North" then it's ambiguous
>>>> about whether I want you to drive or walk, and concluding that we need
>>>> new words for the directions depending on what sort of vehicle you
>>>> use. So "go North" means drive North, "go htuoS" means walk North,
>>>> etc. Totally silly. Makes much more sense to have one set of words for
>>>> directions, and then make clear from context what the directions are
>>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store
>>>> F-wards".
>>>> "C" and "Z" mean exactly the same thing -- they describe a way of
>>>> unraveling a cube into a straight line. The difference is what we do
>>>> with the resulting straight line. That's why I'm suggesting that the
>>>> distinction should be made in the name of the argument.
>>> Could you unpack that for the 'ravel' docstring?  Because these
>>> options all refer to the way of unraveling and not the memory layout
>>> that results.
>> Z/C/row-major/whatever-you-want-to-call-it is a general strategy
>> for converting between a 1-dim representation and a n-dim
>> representation. In the case of memory storage, the 1-dim
>> representation is the flat space of pointer arithmetic. In the case of
>> ravel, the 1-dim representation is the flat space of a 1-dim indexed
>> array. But the 1-dim-to-n-dim part is the same in both cases.
>> I think that's why you're seeing people baffled by your proposal -- to
>> them the "C" refers to this general strategy, and what's different is
>> the context where it gets applied. So giving the same strategy two
>> different names is silly; if anything it's the contexts that should
>> have different names.
> And once we get into memory optimization (and avoiding copies and
> preserving contiguity), it is necessary to keep both orders in mind:
> is the memory order "F", and am I iterating/raveling in "F" order
> (or slicing columns)?
> I think having two separate keywords gives the impression we can
> choose two different things at the same time.

I guess it would not make sense to do this:

np.ravel(a, index_order='C', memory_order='F')

It could make sense to do this:

np.reshape(a, (3, 4), index_order='F', memory_order='F')

but that just points out the inherent confusion between the uses of
'order', and in this case, the fact that you can only do:

np.reshape(a, (3, 4), index_order='F')

correctly distinguishes between the meanings.
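
To make the two meanings concrete, here is a rough sketch using only the existing single `order` keyword. The `index_order` and `memory_order` names above are proposals from this thread, not part of the NumPy API, so they do not appear in the code. The variable names are just for illustration.

```python
import numpy as np

a = np.arange(12)

# "Index order": how the 12 values are laid onto the (3, 4) grid.
c_idx = np.reshape(a, (3, 4), order='C')  # last index varies fastest
f_idx = np.reshape(a, (3, 4), order='F')  # first index varies fastest

print(c_idx[0])  # first row is [0 1 2 3]
print(f_idx[0])  # first row is [0 3 6 9]

# "Memory order": how a given set of values is stored.
# f_mem has the same values as c_idx, but a column-major layout.
f_mem = np.asfortranarray(c_idx)
same_values = np.array_equal(f_mem, c_idx)
is_f_layout = f_mem.flags['F_CONTIGUOUS'] and not f_mem.flags['C_CONTIGUOUS']

# Raveling follows the requested index order, whatever the memory layout.
ravel_c_from_f_mem = np.ravel(f_mem, order='C')
ravel_c_from_c_mem = np.ravel(c_idx, order='C')
ravels_agree = np.array_equal(ravel_c_from_f_mem, ravel_c_from_c_mem)
print(same_values, is_f_layout, ravels_agree)
```

So `order='F'` in `reshape` and `ravel` changes which element lands at which index, while `asfortranarray` changes only the storage and leaves every index pointing at the same value.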


