[Numpy-discussion] Rank-0 arrays - reprise

Sun Jan 6 13:36:04 EST 2013

On 01/06/2013 05:52 PM, Nathaniel Smith wrote:
> On Sun, Jan 6, 2013 at 10:35 AM, Dag Sverre Seljebotn
> <d.s.seljebotn at astro.uio.no> wrote:
>> I should have been more precise: I like the proposal, but also believe
>> the additional complexity introduced has significant costs that must be
>> considered.
>>
>>    a) Making += behave differently for readonly arrays should be
>> carefully considered. If I have a 10 GB read-only array, I prefer an
>> error to a copy for +=. (One could use an ISSCALAR flag instead that
>> only affected +=...)
>
> Yes, definitely we would need to nail down the exact semantics here.
> My feeling is that we should start by seeing if we can come up
> with a set of coherent rules for read-only arrays that does what we
> want before we add an ACT_LIKE_OLD_SCALARS flag, but either way is
> viable. (Or we could start with a PRETEND_TO_BE_SCALAR flag and then
> gradually migrate away from it.)

Sounds like a good plan.
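
For reference on point a), here is a minimal sketch of how += behaves on
a read-only array in NumPy as it stands today, i.e. the behaviour I would
want to keep for large arrays. The helper name try_inplace_add is made up
for illustration, and the small array stands in for the 10 GB one. None
of this reflects the proposed rank-0 semantics.

```python
import numpy as np

# Small stand-in for a large read-only array.
big = np.arange(5)
big.setflags(write=False)


def try_inplace_add(arr):
    """Hypothetical helper: report whether `arr += 1` succeeds."""
    try:
        arr += 1  # in-place add writes into arr's own buffer
    except ValueError:
        # Current NumPy refuses to write into a read-only output array.
        return "error"
    return "ok"


result = try_inplace_add(big)

# Scalar indexing today returns an immutable NumPy scalar, so += simply
# rebinds the local name and leaves the array untouched.
s = big[0]
s += 1
```

Here the in-place add raises rather than silently copying, and the array
contents are unchanged afterwards.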

>
>>    b) Things seem simpler since "indexing away the last index" is no
>> longer a special case, it is always true for a.ndim > 0 that "a[i]" is a
>> new array such that
>>
>> a[i].ndim == a.ndim - 1
>>
>> But in exchange, a new special-case is introduced since READONLY is only
>> set when ndim becomes 0, so it doesn't really help with the learning
>> curve IMO.
>
> Yes, indexing with a scalar (as opposed to slicing or fancy-indexing)
> remains a special case just like now. And not just because the result
> is read-only -- it also returns a copy, not a view.
>
> I don't think the comparison to the a[i] special-case is very useful,
> really. Scalar indexing and the wacky one-dimensional indexing thing
> where a[i] -> a[i, ...] (unless a is one-dimensional) would still be
> different in general, even aside from the READONLY part, because the
> one-dimensional indexing thing only applies to one-dimensional
> indexes. For a 3-d array,
>    a[i, j]
> gives an error; it's not the same as a[i, j, ...]. And while I
> understand why numpy does what it does for len() and __getitem__(int)
> on multi-dimensional arrays (it's to make multi-dimensional arrays act
> more like list-of-lists), this is IMO a confusing special case that we
> might be better off without, and in any case shouldn't be used as a
> guide for how to make the rest of the indexing system work.

Removing the single-index special case would be great. I see people 
doing stuff like a[i][j][k] all the time, just because that's what they 
tried first when they came to NumPy and then the habit sticks for years. 
OTOH, that means that it might have to stay for backwards compatibility
reasons.
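
To illustrate the habit, a small sketch with current NumPy (the array
shape and index values are arbitrary choices for the example): chained
single-index lookups and one multi-index lookup select the same element,
but the chained form builds an intermediate array at each step.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
i, j, k = 1, 2, 3

# Chained form: a[i] is a 2-d view, a[i][j] a 1-d view, then one element.
chained = a[i][j][k]

# Single indexing operation selecting the same element.
direct = a[i, j, k]

# The single-index special case drops exactly one dimension.
ndim_after_one_index = a[i].ndim

# a[i] shares its data with a (it is a view, not a copy).
is_view = np.shares_memory(a[i], a)
```

Both forms give the same value, so the chained spelling never fails
loudly, which is probably why the habit sticks.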

Dag Sverre