[Numpy-discussion] 0/0 == 0?

Robert Kern robert.kern at gmail.com
Sat Oct 4 04:41:43 EDT 2014


On Sat, Oct 4, 2014 at 2:17 AM, Nathaniel Smith <njs at pobox.com> wrote:
> On Sat, Oct 4, 2014 at 12:40 AM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Sat, Oct 4, 2014 at 12:21 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>> On Fri, Oct 3, 2014 at 8:12 AM, Robert Kern <robert.kern at gmail.com> wrote:
>>>> On Fri, Oct 3, 2014 at 4:29 AM, Nathaniel Smith <njs at pobox.com> wrote:
>>>>> On Fri, Oct 3, 2014 at 3:20 AM, Charles R Harris
>>>>> <charlesr.harris at gmail.com> wrote:
>>>>>>
>>>>>> On Thu, Oct 2, 2014 at 7:06 PM, Benjamin Root <ben.root at ou.edu> wrote:
>>>>>>>
>>>>>>> Out[1] has an integer divided by an integer, and you can't represent nan
>>>>>>> as an integer. Perhaps something weird was happening with type promotion
>>>>>>> between versions?
>>>>>>
>>>>>>
>>>>>> Also note that in python3 the '/' operator does float rather than integer
>>>>>> division.
>>>>>>
>>>>>> >>> np.array(0) / np.array(0)
>>>>>> __main__:1: RuntimeWarning: invalid value encountered in true_divide
>>>>>> nan
>>>>>
>>>>> Floor division still acts the same though:
>>>>>
>>>>> >>> np.array(0) // np.array(0)
>>>>> __main__:1: RuntimeWarning: divide by zero encountered in floor_divide
>>>>> 0
>>>>>
>>>>> The seterr warning system makes a lot of sense for IEEE754 floats,
>>>>> which are specifically designed so that 0/0 has a unique well-defined
>>>>> answer. For ints though this seems really broken to me. 0 / 0 = 0 is
>>>>> just the wrong answer. It would be nice if we had something reasonable
>>>>> to return, but we don't, and I'd rather raise an error than return the
>>>>> wrong answer.
>>>>
>>>> Well, actually, that's the really nice thing about seterr for ints!
>>>> CPUs have hardware floating point exception flags to work with. We had
>>>> to build one for ints. If you want an error, you can get an error. *I*
>>>> don't want an error, and I don't have to have one!
>>>
>>> Sure, that's fine for integer-computation corner cases that have
>>> well-defined outputs, like wraparound. But it doesn't make sense for
>>> divide-by-zero.
>>>
>>> The key thing about the IEEE754 exception design is that it gives you
>>> the option of either raising an error immediately or else letting it
>>> propagate through the computation as a nan until you reach an
>>> appropriate place to handle it.
>>>
>>> With ints we don't have nan, so we don't have the second option. Our
>>> options are either raise an error immediately, or else return some
>>> nonsense value that will just cause you to get some meaningless
>>> result, with no way to detect or recover from this situation. (Why
>>> don't we define 0 / 0 == -72? It would make just as much sense.)
>>>
>>> The second option is terrible enough that I kinda don't believe you
>>> when you say you want it. Maybe I'm missing something but...
>>
>> I fix the values after-the-fact because one *can* detect and recover
>> from this situation with just a smidgen of forethought.
>>
>> <not-real-code>
>>
>> mask = (denominator == 0)
>> x = numerator // denominator
>> # We don't care about the masked cases. Fill them with a value that
>> # will be harmless/ignored downstream. Here, it's 0. It might be something
>> # else in other contexts.
>> x[mask] = 0
>>
>> </not-real-code>
>
> I don't find this argument very convincing, except as an argument for
> having a deprecation period. In the unusual case where this is what
> you want, it's trivial and more explicit to write it directly --
> bringing errstate into it is just Rube Goldbergian. E.g.:
>
> mask = (denominator == 0)
> x = np.floor_divide(numerator, denominator, where=~mask)
> x[mask] = 0

In any case, controlling the errstate() is important because the
division is often buried inside a function written by someone who
made assumptions about the inputs. Even if I had remembered that we
had recently added the `where=` keyword, I would still run into places
where I needed to silence the error. This is an old, long-established
pattern, not a Rube Goldbergian contraption. It's how all of numpy.ma
works for domained operations like division.
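
To make the buried-division case concrete, here is a minimal sketch of
the mask-then-fix pattern wrapped in errstate(). The names legacy_ratio
and safe_ratio are hypothetical, made up for illustration, and the
sketch assumes integer arrays of the same shape (at least 1-D):

```python
import numpy as np


def legacy_ratio(numerator, denominator):
    # Hypothetical stand-in for a function written by someone else, who
    # assumed the denominator is never zero. We cannot add where= here.
    return numerator // denominator


def safe_ratio(numerator, denominator, fill=0):
    # Record where the quotient will be meaningless before dividing.
    mask = (denominator == 0)
    # Silence the report only around the call we cannot edit. "invalid"
    # is included so the same wrapper also covers float 0.0 / 0.0.
    with np.errstate(divide="ignore", invalid="ignore"):
        x = legacy_ratio(numerator, denominator)
    # Overwrite the masked entries with a value that is harmless or
    # ignored downstream. Here it is 0; other contexts may want another.
    x[mask] = fill
    return x


numerator = np.array([4, 7, 0])
denominator = np.array([2, 0, 0])
result = safe_ratio(numerator, denominator)
```

The errstate() block is scoped to the one call, so whatever error
handling the caller has configured applies again as soon as the block
exits.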

>>> Even more egregiously, numpy currently treats the integer
>>> divide-by-zero case identically with the floating-point one -- so if
>>> you want 0 / 0 to be an error (as you have to if you care about
>>> getting correct results), then you have to make 0.0 / 0.0 an error as
>>> well.
>>
>> If you would like to introduce a separate `integer_divide` setting for
>> errstate() and make it raise by default, I'd be marginally okay with
>> that. In the above pattern, I'd be wrapping it with an errstate()
>> context manager anyways to silence the warning, so silencing the
>> default exception would be just as easy. However, nothing else in
>> errstate() raises by default, so this would be the odd special case.
>
> In a perfect world (which may or may not match the actual world) I
> actually would prefer integer wraparound to be treated as its own
> category (instead of being lumped with float overflow-to-inf), and to
> raise by default (with the option to enable it explicitly). Unexpected
> inf's give you correct results in some cases and obviously-broken
> results in others; unexpected wraparound tends to produce silent bugs
> in basically all integer-using code [1]; lumping them together is
> pretty suboptimal. But that's a whole 'nother discussion...

Well, it's up to you to make a concrete proposal. You have the
opportunity to make the world you want.

-- 
Robert Kern


