[Numpy-discussion] BOF notes: Fernando's proposal: NumPy ndarray with named axes

Thu Jul 8 11:55:00 EDT 2010

Joshua Holbrook writes:

> On Thu, Jul 8, 2010 at 3:13 AM, Lluís <xscript at gmx.net> wrote:
>> Rob Speer writes:
>> 
>>>>>> arr.country.named('Netherlands').year.named(2010)
>>>>>> arr.country.named('Spain').year.named(slice(1994, 2010))
>>>>>> arr.year.named(2006).country[0:2]
>> 
>> This looks too verbose to me.
>> 
>> As axes always have a total order, I'd go for the most compact representation
>> (assuming 'country' is the first axis, and 'year' the second one):
>> 
>>   arr['Netherlands','2010']
>>   arr['Spain','1994':'2010']
>>   arr[0:2,'2006']
>> 
[...]
>> 
>> Thus, we can use something in the middle:
>> 
>>   arr[0,1]
>>   arr.names['Netherlands',2010] # I'd rather go for 'names' instead of 'ticks'
>>   arr.country['Spain'].year[1994:2010]
>> 
[...]
>> arr['Netherlands','2010']

> Isn't this the __getitem__ action we were trying to avoid?

Sorry, I only joined the naming discussion just now, so I'm not aware of much
of the earlier discussion beyond this thread.

I assumed that 'arr[...]' is not a desired syntax for name lookups because of
a possible performance loss.

That's why I think 'arr.names[...]' might be a good compromise. Use 'arr[]' for
the standard integer-based indexing, and 'arr.names[]' for the fancy mixed
integer+string indexing.
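
To make the split concrete, here is a minimal sketch of what I have in mind. It
is not an implementation of any existing proposal: 'NamedArray', '_NameIndexer'
and the 'labels' argument are hypothetical names made up for illustration, and
it only handles scalar names, one per axis.

```python
import numpy as np


class NamedArray:
    """Hypothetical wrapper: integer indexing via [], name lookup via .names[]."""

    def __init__(self, data, labels):
        self.data = np.asarray(data)
        # labels: one list of string names per axis, in axis order
        self._lookup = [{name: i for i, name in enumerate(axis)} for axis in labels]

    def __getitem__(self, key):
        # Standard integer-based indexing: passed straight to the ndarray,
        # so no name handling is paid for on this path
        return self.data[key]

    @property
    def names(self):
        return _NameIndexer(self)


class _NameIndexer:
    """Hypothetical helper returned by NamedArray.names."""

    def __init__(self, arr):
        self._arr = arr

    def __getitem__(self, key):
        if not isinstance(key, tuple):
            key = (key,)
        # Translate each string name into its integer position on that axis
        index = tuple(self._arr._lookup[axis][k] for axis, k in enumerate(key))
        return self._arr.data[index]


arr = NamedArray(
    [[1, 2], [3, 4]],
    labels=[["Netherlands", "Spain"], ["2009", "2010"]],
)
print(arr[0, 1])                          # positional indexing
print(arr.names["Netherlands", "2010"])   # same element, looked up by name
```

The point of the design is that the extra dictionary lookups happen only when
the user explicitly asks for them through 'names'.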

My opinion is that integer names/ticks should not be allowed (so the example
above would be arr.names['Netherlands','2010']), so that the user can mix
"real" indexes with names. I'm not sure whether this mix makes sense, but I'd
try to eliminate "unnecessary" typing as much as possible.
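
A rough sketch of how the mixing could be resolved if names are always strings:
anything that is a 'str' is treated as a name, anything else as a plain
position. The functions 'resolve' and 'names_getitem' are hypothetical helpers,
and I'm assuming name slices keep Python's half-open convention (whether the
end name should be inclusive is a separate question).

```python
import numpy as np


def resolve(key, labels):
    """Translate one per-axis key into a positional index (hypothetical helper).

    Strings are looked up as names; everything else is taken as a position.
    """
    def pos(k):
        return labels.index(k) if isinstance(k, str) else k

    if isinstance(key, slice):
        # Slice bounds may be names, positions, or None
        return slice(pos(key.start), pos(key.stop), key.step)
    return pos(key)


def names_getitem(data, axis_labels, key):
    """What a hypothetical arr.names[key] could do with a mixed key."""
    if not isinstance(key, tuple):
        key = (key,)
    return data[tuple(resolve(k, axis_labels[i]) for i, k in enumerate(key))]


countries = ["Netherlands", "Spain", "France"]
years = [str(y) for y in range(1994, 2011)]
labels = [countries, years]
data = np.arange(len(countries) * len(years)).reshape(len(countries), len(years))

a = names_getitem(data, labels, ("Netherlands", "2010"))        # names only
b = names_getitem(data, labels, (slice(0, 2), "2006"))          # position + name
c = names_getitem(data, labels, ("Spain", slice("1994", "2010")))  # name slice
```

If integer names were allowed, a key like 2006 could mean either the name or
the position, and this dispatch on type would no longer work.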

Read you,
     Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth