[Numpy-discussion] in the NA discussion, what can we agree on?

T J tjhnson at gmail.com
Fri Nov 4 16:22:18 EDT 2011


On Fri, Nov 4, 2011 at 1:03 PM, Gary Strangman
<strang at nmr.mgh.harvard.edu> wrote:

>
> To push this forward a bit, can I propose that IGNORE behave as:   PnC
>>
>> >>> x = np.array([1, 2, 3])
>> >>> y = np.array([10, 20, 30])
>> >>> ignore(x[2])
>> >>> x
>> [1, IGNORED(2), 3]
>> >>> x + 2
>> [3, IGNORED(4), 5]
>> >>> x + y
>> [11, IGNORED(22), 33]
>> >>> z = x.sum()
>> >>> z
>> IGNORED(6)
>> >>> unignore(z)
>> >>> z
>> 6
>> >>> x.sum(skipIGNORED=True)
>> 4
>>
>>
> In my mind, IGNORED items should be skipped by default (i.e., skipIGNORED
> seems redundant ... isn't that what ignoring is all about?). Thus I might
> instead suggest the opposite (default) behavior at the end:
>
>
> >>> x = np.array([1, 2, 3])
> >>> y = np.array([10, 20, 30])
> >>> ignore(x[2])
> >>> x
> [1, IGNORED(2), 3]
> >>> x + 2
> [3, IGNORED(4), 5]
> >>> x + y
> [11, IGNORED(22), 33]
> >>> z = x.sum()
> >>> z
> 4
> >>> unignore(x).sum()
> 6
> >>> x.sum(keepIGNORED=True)
> 6
>
> (Obviously all the syntax is totally up for debate.)
>
>

I agree that it would be ideal if the default were to skip IGNORED values,
but that behavior seems inconsistent with its propagation properties (such
as when adding arrays with IGNORED values).  To illustrate, when we did
"x+2", we were stating that:

IGNORED(2) + 2 == IGNORED(4)

which means that we propagated the IGNORED value.  If we were to skip them
by default, then we'd have:

IGNORED(2) + 2 == 2

To be consistent, then it seems we also should have had:

>>> x + 2
[3, 2, 5]

which I think we can agree is not desirable.  What this seems to come
down to is that we want different behavior for reductions: IGNORED data
should propagate in every operation except a reduction, where we want to
skip over it.
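
For comparison, the existing numpy.ma module already makes this split:
masked entries propagate through elementwise operations and are skipped by
reductions.  A minimal sketch follows, with the caveat that numpy.ma is only
an analogy here and not the IGNORED mechanism under discussion (the variable
names are mine):

```python
import numpy as np

# Mask the middle element, standing in for IGNORED(2).
x = np.ma.array([1, 2, 3], mask=[False, True, False])
y = np.array([10, 20, 30])

# Elementwise operations propagate the mask to the result.
shifted = x + 2
summed = x + y

# Reductions skip the masked element.
total = x.sum()

# The masked value is hidden, not destroyed: the raw data still sums to 6.
raw_total = x.data.sum()

print(shifted.tolist())  # masked entries appear as None
print(summed.tolist())
print(total, raw_total)
```

Here the elementwise results stay masked in the middle position, while the
reduction returns 4 and the unmasked data still sums to 6.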

I don't know if there is a well-defined way to distinguish reductions from
the other operations.  Would it hold for generalized ufuncs?  Would it hold
for other functions which might return arrays instead of scalars?
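
One way to probe these questions is to look at how numpy.ma treats
operations that combine elements but still return arrays.  This is a sketch
against numpy.ma only, as an analogy; it says nothing about how IGNORED
would or should be implemented:

```python
import numpy as np

a = np.ma.array([[1, 2], [3, 4]], mask=[[False, True], [False, True]])

# Reduction along an axis returns an array; masked entries are skipped,
# and a column with nothing left to sum comes back masked.
col_sums = a.sum(axis=0)

# cumsum is not a reduction, yet it accumulates: masked entries count as 0
# in the running total, and their positions stay masked in the result.
v = np.ma.array([1, 2, 3], mask=[False, True, False])
running = v.cumsum()

print(col_sums.tolist())
print(running.tolist())
```

So even within numpy.ma, an accumulation such as cumsum both skips and
propagates, which suggests the line between reductions and everything else
is not entirely clean.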