[Numpy-discussion] Raveling, reshape order keyword unnecessarily confuses index and memory ordering

josef.pktd at gmail.com josef.pktd at gmail.com
Tue Apr 2 20:02:54 EDT 2013


On Tue, Apr 2, 2013 at 7:09 PM,  <josef.pktd at gmail.com> wrote:
> On Tue, Apr 2, 2013 at 5:52 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> On Tue, Apr 2, 2013 at 10:21 PM, Matthew Brett <matthew.brett at gmail.com> wrote:
>>>> This is like observing that if I say "go North" then it's ambiguous
>>>> about whether I want you to drive or walk, and concluding that we need
>>>> new words for the directions depending on what sort of vehicle you
>>>> use. So "go North" means drive North, "go htuoS" means walk North,
>>>> etc. Totally silly. Makes much more sense to have one set of words for
>>>> directions, and then make clear from context what the directions are
>>>> used for -- "drive North", "walk North". Or "iterate C-wards", "store
>>>> F-wards".
>>>>
>>>> "C" and "Z" mean exactly the same thing -- they describe a way of
>>>> unraveling a cube into a straight line. The difference is what we do
>>>> with the resulting straight line. That's why I'm suggesting that the
>>>> distinction should be made in the name of the argument.
>>>
>>> Could you unpack that for the 'ravel' docstring?  Because these
>>> options all refer to the way of unraveling and not the memory layout
>>> that results.
>>
>> Z/C/row-major/whatever-you-want-to-call-it is a general strategy
>> for converting between a 1-dim representation and an n-dim
>> representation. In the case of memory storage, the 1-dim
>> representation is the flat space of pointer arithmetic. In the case of
>> ravel, the 1-dim representation is the flat space of a 1-dim indexed
>> array. But the 1-dim-to-n-dim part is the same in both cases.
>>
>> I think that's why you're seeing people baffled by your proposal -- to
>> them the "C" refers to this general strategy, and what's different is
>> the context where it gets applied. So giving the same strategy two
>> different names is silly; if anything it's the contexts that should
>> have different names.
>
> And once we get into memory optimization (and avoiding copies and
> preserving contiguity), it is necessary to keep both orders in mind:
> is the memory order "F", and am I iterating/raveling in "F" order
> (or slicing columns)?
>
> I think having two separate keywords gives the impression we can
> choose two different things at the same time.
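
To make the "both orders" point concrete, here is a minimal sketch (the
array and variable names are made up for illustration, and the view/copy
behavior is what I would expect from current numpy, not a guarantee).
The ravel order decides which values come out in which position; the
memory order only decides whether numpy can hand back a view or has to
copy.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # C-contiguous in memory
f = np.asfortranarray(a)         # same values, F-contiguous in memory

# The index order of the result depends only on the `order` argument,
# not on how the input happens to be stored.
ravel_c_from_c = a.ravel(order="C")
ravel_c_from_f = f.ravel(order="C")
ravel_f_from_c = a.ravel(order="F")
ravel_f_from_f = f.ravel(order="F")

print(ravel_c_from_f)   # [0 1 2 3 4 5]
print(ravel_f_from_c)   # [0 3 1 4 2 5]

# The memory order shows up in whether a copy was needed.
print(np.shares_memory(a, ravel_c_from_c))   # True  (view)
print(np.shares_memory(f, ravel_c_from_f))   # False (copy)
print(np.shares_memory(f, ravel_f_from_f))   # True  (view)
```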

As an aside (math):
numpy.flatten made it into the Wikipedia page
http://en.wikipedia.org/wiki/Vectorization_%28mathematics%29#Programming_language
(along with how it differs from R and Matlab/Octave,
but the page doesn't mention that order="F" gives the same behavior
as the math convention and the other languages)

and the corresponding code in statsmodels (tools for vector
autoregressive models by Wes)
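
A small sketch of the order="F" point (the matrices are arbitrary
illustrative values, and this is not the statsmodels code): flattening
with order="F" stacks the columns, which is the vec() operator of the
math convention, so the usual identity vec(A X B) = (B' kron A) vec(X)
holds with it.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
X = np.array([[0, 1],
              [2, 3]])
B = np.array([[1, 0],
              [1, 1]])

vec_default = X.flatten()            # row by row (numpy default, order="C")
vec_math = X.flatten(order="F")      # column by column, like vec(X)

print(vec_default)   # [0 1 2 3]
print(vec_math)      # [0 2 1 3]

# vec(A X B) == kron(B.T, A) vec(X), using column stacking throughout
lhs = (A @ X @ B).flatten(order="F")
rhs = np.kron(B.T, A) @ vec_math
print(np.array_equal(lhs, rhs))      # True
```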

Josef
baffled?

>
> Josef
>
>
>>
>> -n
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion


