[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

josef.pktd at gmail.com josef.pktd at gmail.com
Wed Apr 3 08:19:23 EDT 2013


On Tue, Apr 2, 2013 at 9:09 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
> Hi,
>
> On Tue, Apr 2, 2013 at 7:09 PM,  <josef.pktd at gmail.com> wrote:
>> On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>>>>> This is like observing that if I say "go North" then it's ambiguous
>>>>> about whether I want you to drive or walk, and concluding that we need
>>>>> new words for the directions depending on what sort of vehicle you
>>>>> use. So "go North" means drive North, "go htuoS" means walk North,
>>>>> etc. Totally silly. Makes much more sense to have one set of words for
>>>>> directions, and then make clear from context what the directions are
>>>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store
>>>>> F-wards".
>>>>>
>>>>> "C" and "Z" mean exactly the same thing -- they describe a way of
>>>>> unraveling a cube into a straight line. The difference is what we do
>>>>> with the resulting straight line. That's why I'm suggesting that the
>>>>> distinction should be made in the name of the argument.
>>>>
>>>> Could you unpack that for the 'ravel' docstring?  Because these
>>>> options all refer to the way of unraveling and not the memory layout
>>>> that results.
>>>
>>> Z/C/row-major/whatever-you-want-to-call-it is a general strategy
>>> for converting between a 1-dim representation and an n-dim
>>> representation. In the case of memory storage, the 1-dim
>>> representation is the flat space of pointer arithmetic. In the case of
>>> ravel, the 1-dim representation is the flat space of a 1-dim indexed
>>> array. But the 1-dim-to-n-dim part is the same in both cases.
>>>
>>> I think that's why you're seeing people baffled by your proposal -- to
>>> them the "C" refers to this general strategy, and what's different is
>>> the context where it gets applied. So giving the same strategy two
>>> different names is silly; if anything it's the contexts that should
>>> have different names.
>>
>> And once we get into memory optimization (and avoiding copies and
>> preserving contiguity), it is necessary to keep both orders in mind:
>> is the memory order "F", and am I iterating/raveling in "F" order
>> (or slicing columns)?
>>
>> I think having two separate keywords gives the impression we can
>> choose two different things at the same time.
>
> I guess it could not make sense to do this:
>
> np.ravel(a, index_order='C', memory_order='F')
>
> It could make sense to do this:
>
> np.reshape(a, (3,4), index_order='F', memory_order='F')
>
> but that just points out the inherent confusion between the uses of
> 'order', and in this case, the fact that you can only do:
>
> np.reshape(a, (3, 4), index_order='F')
>
> correctly distinguishes between the meanings.

So, if index_order and memory_order are never in the same function,
then the context should be enough. It was always enough for me.

np.reshape(a, (3,4), index_order='F', memory_order='F')
really hurts my head because you mix a function that operates on
views, indexing and shapes with memory creation (or I have
no idea what memory_order should do in this case).

np.asarray(a.reshape(3, 4, order="F"), order="F")
or the example here
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asfortranarray.html?highlight=asfortranarray#numpy.asfortranarray
http://docs.scipy.org/doc/numpy/reference/generated/numpy.asarray.html
keeps functions with index_order and functions with memory_order
nicely separated.
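
A rough sketch of that separation, using only the existing `order` keyword
(the array `a` is made up for illustration; nothing here is a proposed API):

```python
import numpy as np

a = np.arange(12)

# Index order: fill the (3, 4) shape column by column.
b = a.reshape(3, 4, order="F")

# Memory order: ask for Fortran-contiguous storage of the result.
c = np.asarray(b, order="F")

print(b[:, 0])                    # the first column holds a[0], a[1], a[2]
print(c.flags["F_CONTIGUOUS"])    # memory layout of the final array
```

The first call only decides which element lands at which index; the second
only decides how the result is laid out in memory.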

(It might be useful but very confusing to add memory_order to every function
that creates a view if possible and a copy if necessary: "If you have to make
a copy, then I want F memory order, otherwise give me a view".
But I cannot find a candidate function right now, except for ravel and reshape;
see the first notes in
docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html
)
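
To illustrate the "view if possible, copy if necessary" behaviour with ravel
(a small sketch; the array is made up, and `np.shares_memory` is only used
here to check whether a copy was made):

```python
import numpy as np

a = np.arange(12).reshape(3, 4)    # C-contiguous in memory

r_c = a.ravel(order="C")   # index order matches the memory order
r_f = a.ravel(order="F")   # index order differs from the memory order

print(np.shares_memory(a, r_c))   # True: a view was possible
print(np.shares_memory(a, r_f))   # False: a copy was necessary
```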

----
a day later (haven't changed my mind):

isn't specifying "index order" in the Parameter section enough as an
explanation?

something like:

```
def ravel

Parameters

order :
   index order how the array is stacked into a 1d array.
   F means we stack by columns (fortran order, first index first),
   C means we stack by rows (c-order, last index first)
```
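
A quick check of that reading of `order` (a sketch with a made-up array): the
result of ravel depends only on the index order that is requested, not on how
the input happens to be stored in memory.

```python
import numpy as np

a_c = np.array([[1, 2, 3], [4, 5, 6]], order="C")   # stored row by row
a_f = np.asfortranarray(a_c)                        # same values, stored column by column

for arr in (a_c, a_f):
    print(arr.ravel(order="C"))   # [1 2 3 4 5 6]  stacked by rows
    print(arr.ravel(order="F"))   # [1 4 2 5 3 6]  stacked by columns
```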

most array *creation* functions explicitly mention memory layout in
the docstring
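
For comparison, a small sketch of `order` in a creation function, where it
only affects memory layout (the strides shown assume the default float64
dtype, 8 bytes per element):

```python
import numpy as np

z_c = np.zeros((3, 4), order="C")
z_f = np.zeros((3, 4), order="F")

# Same shape and same values under indexing; only the strides differ.
print(z_c.strides, z_c.flags["C_CONTIGUOUS"])   # (32, 8) True
print(z_f.strides, z_f.flags["F_CONTIGUOUS"])   # (8, 24) True
```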

Josef

>
> Best,
>
> Matthew
