[Numpy-discussion] Behavior of nan{max, min} and nanarg{max, min} for all-nan slices.

Thu Oct 3 16:38:25 EDT 2013

On Thu, Oct 3, 2013 at 8:40 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
>
> On Thu, Oct 3, 2013 at 1:11 PM, Nathaniel Smith <njs at pobox.com> wrote:
>>
>> On Thu, Oct 3, 2013 at 7:59 PM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> > <snip>
>> >
>> >>
>> >> Please, no. It's another thing to remember and another way to shoot
>> >> yourself in the foot and introduce casual bugs.
>> >>
>> >> FWIW, my vote is to raise an error or return a nan, which will likely
>> >> eventually raise an error. If I have all nans, it's usually the case
>> >> that something's off, and I'd like to know sooner rather than later.
>> >>
>> >
>> > Here is what I have currently implemented. First, define an AllNanError
>> >
>> > class AllNanError(ValueError):
>> >     def __init__(self, msg, result):
>> >         ValueError.__init__(self, msg)
>> >         self.result = result
>> >
>> > For nanmax/nanmin/nanargmax/nanargmin this error is raised for an
>> > all-nan axis, and the result is attached. The exception can then be
>> > caught and the result examined. A ValueError is what amax and amin
>> > raise for empty arrays.
>> >
>> > For nanmax/nanmin the result for an empty slice is nan. For
>> > nanargmax/nanargmin the result for an empty slice is -1, which is
>> > easier to read and remember than intp.min. A ValueError is what argmin
>> > and argmax currently raise for empty arrays. Note that both of these
>> > functions can give wrong results if they contain some min/max values
>> > respectively. That is an old bug and I haven't fixed it.
>> >
>> > The nanmean/nanvar/nanstd functions currently raise a warning for
>> > all-nan slices and the result for such is nan. These could also be
>> > made to raise an error.
>> >
>> > Thoughts?
>>
>> Is this intended for 1.8 or master?
>
> I was thinking both. The nanarg* functions are changing behavior anyway, so
> might as well get it all done in 1.8. I also think there will need to be an
> rc2 in any case.

This is obviously a complicated and contentious enough issue that I
think for 1.8 we need to punt rather than try to force something out
under time pressure. We obviously need an rc2 anyway with all the
other stuff that's come in, but for 1.8 I'm going to suggest again we
go with:
- leave nanmax/nanmin as is
- make nanargmax/nanargmin just raise a simple ValueError on all-nans
(so the same as 1.7 for operations on multiple subarrays; the only
difference would be that for 1d operations 1.8 would raise where 1.7
returned nan).
I'm not saying we should stick with this forever, but it solves the
immediate problem that started this whole mess, and I can't see how
anyone could object to it as a temporary solution -- it's basically
the same as what 1.7 does. And it clears the deck for whatever
cleverer thing we come up with in master.
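
To make the proposed 1.8 behavior concrete, here is a minimal sketch of a
nanargmax that raises a plain ValueError whenever any slice along the
reduction axis is all-nan. The name nanargmax_sketch and the error message
are hypothetical, chosen for illustration; this is not the NumPy
implementation. The tie handling for -inf is an addition of this sketch,
included so that it does not reproduce the old min/max-value bug mentioned
in the quoted message.

```python
import numpy as np


def nanargmax_sketch(a, axis=None):
    """Hypothetical illustration: argmax ignoring nans, ValueError on all-nan."""
    a = np.asarray(a, dtype=float)
    if axis is None:
        # Treat the whole array as one flat slice.
        a = a.ravel()
        axis = 0
    mask = np.isnan(a)
    # Any slice made up entirely of nans (or empty) is an error.
    if np.any(np.all(mask, axis=axis)):
        raise ValueError("All-NaN slice encountered")
    # Replace nans with -inf so they cannot win the argmax.
    filled = np.where(mask, -np.inf, a)
    idx = np.argmax(filled, axis=axis)
    # If the data itself contains -inf, the argmax may land on a nan
    # position; in that case fall back to the first non-nan position.
    picked = np.take_along_axis(mask, np.expand_dims(idx, axis), axis=axis)
    picked_nan = picked.squeeze(axis)
    first_valid = np.argmax(~mask, axis=axis)
    return np.where(picked_nan, first_valid, idx)
```

With this sketch, a 1d input such as [nan, nan] raises instead of returning
nan, which is the one difference from 1.7 noted above; multi-slice inputs
with an all-nan row or column raise as well.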

Does that make sense? I'm not going to outright say -1 on doing
anything else for 1.8, but the longer this drags on the more tempting
it becomes, just because who can really evaluate a more interesting
proposal with the release breathing down their neck?

> The current non-nan-aware mean/var/std only raise warnings on insufficient
> degrees of freedom and return nan. That's a bit of a change, they used to
> return nans, negative numbers, and other things in that situation.

Seems reasonable to me.
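
For reference, the warn-and-return-nan behavior described in the quoted
text can be observed with a short check like the one below. This is a
sketch assuming a NumPy version that behaves as the quote describes (1.8
or later); the variable names are illustrative only.

```python
import warnings

import numpy as np

# One sample with ddof=1 leaves zero degrees of freedom.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure no warning is suppressed
    result = np.var([1.0], ddof=1)

# The result is nan, and at least one RuntimeWarning was issued.
print(np.isnan(result))
print(any(issubclass(w.category, RuntimeWarning) for w in caught))
```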

-n