[Numpy-discussion] 0/0 == 0?

Fri Oct 3 19:40:59 EDT 2014

On Sat, Oct 4, 2014 at 12:21 AM, Nathaniel Smith <njs at pobox.com> wrote:
> On Fri, Oct 3, 2014 at 8:12 AM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Fri, Oct 3, 2014 at 4:29 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>> On Fri, Oct 3, 2014 at 3:20 AM, Charles R Harris
>>> <charlesr.harris at gmail.com> wrote:
>>>>
>>>> On Thu, Oct 2, 2014 at 7:06 PM, Benjamin Root <ben.root at ou.edu> wrote:
>>>>>
>>>>> Out[1] has an integer divided by an integer, and you can't represent nan
>>>>> as an integer. Perhaps something weird was happening with type promotion
>>>>> between versions?
>>>>
>>>>
>>>> Also note that in python3 the '/' operator does float rather than integer
>>>> division.
>>>>
>>>>>>> np.array(0) / np.array(0)
>>>> __main__:1: RuntimeWarning: invalid value encountered in true_divide
>>>> nan
>>>
>>> Floor division still acts the same though:
>>>
>>>>>> np.array(0) // np.array(0)
>>> __main__:1: RuntimeWarning: divide by zero encountered in floor_divide
>>> 0
>>>
>>> The seterr warning system makes a lot of sense for IEEE754 floats,
>>> which are specifically designed so that 0/0 has a unique well-defined
>>> answer. For ints though this seems really broken to me. 0 / 0 = 0 is
>>> just the wrong answer. It would be nice if we had something reasonable
>>> to return, but we don't, and I'd rather raise an error than return the
>>> wrong answer.
>>
>> Well, actually, that's the really nice thing about seterr for ints!
>> CPUs have hardware floating point exception flags to work with. We had
>> to build one for ints. If you want an error, you can get an error. *I*
>> don't want an error, and I don't have to have one!
>
> Sure, that's fine for corner cases of integer computation that have
> well-defined outputs, like wraparound. But it doesn't make sense for
> divide-by-zero.
>
> The key thing about the IEEE754 exception design is that it gives you
> the option of either raising an error immediately or else letting it
> propagate through the computation as a nan until you reach an
> appropriate place to handle it.
>
> With ints we don't have nan, so we don't have the second option. Our
> options are either raise an error immediately, or else return some
> nonsense value that will just cause you to get some meaningless
> result, with no way to detect or recover from this situation. (Why
> don't we define 0 / 0 == -72? It would make just as much sense.)
>
> The second option is terrible enough that I kinda don't believe you
> when you say you want it. Maybe I'm missing something but...

I fix the values after-the-fact because one *can* detect and recover
from this situation with just a smidgen of forethought.

<not-real-code>

mask = (denominator == 0)
x = numerator // denominator
# We don't care about the masked cases. Fill them with a value that
# will be harmless/ignored downstream. Here, it's 0. It might be something
# else in other contexts.
x[mask] = 0

</not-real-code>
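
A runnable sketch of the pattern above, for readers who want to try it. The helper name `safe_floor_divide` and its `fill` parameter are made up for illustration and are not part of NumPy. It uses `np.where` rather than in-place assignment so that scalars and broadcast inputs also work, and it silences both the `divide` and `invalid` categories so float inputs stay quiet too.

```python
import numpy as np


def safe_floor_divide(numerator, denominator, fill=0):
    """Floor-divide, replacing results where the denominator is zero.

    Hypothetical helper for illustration; `fill` is whatever value is
    harmless or ignored downstream in the caller's context.
    """
    numerator = np.asarray(numerator)
    denominator = np.asarray(denominator)
    mask = (denominator == 0)
    # Silence the warning locally; the masked entries are overwritten
    # below, so whatever the division put there never escapes.
    with np.errstate(divide="ignore", invalid="ignore"):
        x = numerator // denominator
    return np.where(mask, fill, x)


result = safe_floor_divide([6, 7, 0], [3, 0, 0])
print(result.tolist())  # [2, 0, 0]
```

The mask is computed before the division, which is the "smidgen of forethought": the placeholder result is never relied on, only replaced.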

> Even more egregiously, numpy currently treats the integer
> divide-by-zero case identically with the floating-point one -- so if
> you want 0 / 0 to be an error (as you have to if you care about
> getting correct results), then you have to make 0.0 / 0.0 an error as
> well.

If you would like to introduce a separate `integer_divide` setting for
errstate() and make it raise by default, I'd be marginally okay with
that. In the above pattern, I'd be wrapping it with an errstate()
context manager anyways to silence the warning, so silencing the
default exception would be just as easy. However, nothing else in
errstate() raises by default, so this would be the odd special case.
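
For reference, a small sketch of how the existing `divide` setting is shared between integer and floating-point division by zero. The helper `raises_fpe` is hypothetical and only wraps calls that already exist (`np.errstate`, `FloatingPointError`). Note that `0.0 / 0.0` is reported under the separate `invalid` category, so it is not affected by `divide="raise"` alone.

```python
import numpy as np


def raises_fpe(func):
    """Return True if func() raises FloatingPointError under divide='raise'.

    Hypothetical helper for illustration. 'invalid' is ignored so that
    only the 'divide' category can trigger an exception here.
    """
    with np.errstate(divide="raise", invalid="ignore"):
        try:
            func()
        except FloatingPointError:
            return True
    return False


# Integer 0 // 0 is reported as a divide-by-zero in floor_divide.
int_case = raises_fpe(lambda: np.array([0]) // np.array([0]))
# Float 1.0 / 0.0 is reported under the same 'divide' category.
float_case = raises_fpe(lambda: np.array([1.0]) / np.array([0.0]))
# Float 0.0 / 0.0 is reported as 'invalid', not 'divide'.
nan_case = raises_fpe(lambda: np.array([0.0]) / np.array([0.0]))

print(int_case, float_case, nan_case)  # True True False
```

So turning integer division by zero into an error today also turns float `x / 0.0` into an error, which is the coupling a separate setting would remove.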

-- 
Robert Kern