[Python-ideas] checking for identity before comparing built-in objects

Mon Oct 8 18:47:48 CEST 2012

On Sun, Oct 7, 2012 at 8:46 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> On Sun, Oct 7, 2012 at 11:09 PM, Rob Cliffe <rob.cliffe at btinternet.com> wrote:
>> Couldn't each NAN when generated contain something that identified it
>> uniquely, so that different NANs would always compare as not equal, but any
>> given NAN would compare equal to itself?
>
> If we take this route and try to distinguish NaNs with different
> payload, I am sure you will want to distinguish between -0.0 and 0.0
> as well.  The latter would violate transitivity in -0.0 == 0 == 0.0.
>
> The only sensible thing to do with NaNs is either to treat them all
> equal (the Eiffel way) or to stick to IEEE default.
>
> I don't think NaN behavior in Python is a result of a deliberate
> decision to implement IEEE 754.

Oh, it was. It was very deliberate. Like in many other areas of
Python, I refused to invent new rules when there was existing behavior
elsewhere that I could borrow and with which I had no reason to
quibble. (And in the case of floating point behavior, there really is
no alternate authority to choose from besides IEEE 754. Languages that
disagree with it do not make an authority.)

Even if I *did* have reasons to quibble with the NaN behavior (there
were no NaNs on the mainframe where I learned programming, so they
were as new and weird to me as they are to today's novices), Tim
Peters, who has implemented numerical libraries for Fortran compilers
in a past life and is an absolute authority on floating point,
convinced me to follow IEEE 754 as closely as I could.

> If that was the case, why does 0.0/0.0 not produce NaN?

Easy. It was an earlier behavior, from the days when IEEE 754
hardware did not yet rule the world, and Python didn't have much of an
opinion on float behavior at all -- it just did whatever the platform
did. Infinities and NaNs were not on my radar (I hadn't met Tim yet
:-). However division by zero (which is not just a float but also an
int behavior) was something that I just had to address, so I made the
runtime check for it and raise an exception. When we became more
formal about this, we considered changing this but decided that the
ZeroDivisionError was more user-friendly than silently propagating
NaNs everywhere, given the typical use of Python. (I suppose we could
make it optional, and IIRC that's what Decimal does -- but for floats
we don't have a well-developed numerical context concept yet.)
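
A minimal sketch of the contrast described above, assuming a current
Python 3 interpreter: float division by zero always raises, while the
decimal module lets a context decide whether the same conditions trap
or quietly return Infinity/NaN. The helper name float_div is
hypothetical and exists only for this illustration.

```python
from decimal import Decimal, DivisionByZero, InvalidOperation, localcontext


def float_div(a, b):
    """Hypothetical helper: report what float division does with these operands."""
    try:
        return a / b
    except ZeroDivisionError:
        return "ZeroDivisionError"


# Floats: the runtime checks for a zero divisor and raises in both cases.
float_zero_result = float_div(0.0, 0.0)
float_one_result = float_div(1.0, 0.0)

# Decimal: with the traps switched off, the signals are recorded as flags
# and a special value is returned instead of an exception being raised.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = False
    ctx.traps[InvalidOperation] = False
    dec_one = Decimal(1) / Decimal(0)    # infinite result
    dec_zero = Decimal(0) / Decimal(0)   # NaN result

# Decimal: with the trap switched on, the same division raises.
with localcontext() as ctx:
    ctx.traps[DivisionByZero] = True
    try:
        Decimal(1) / Decimal(0)
        trapped = False
    except DivisionByZero:
        trapped = True

print(float_zero_result, float_one_result, dec_one, dec_zero, trapped)
```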

> Similarly, Python's math library does not produce
> infinities where an IEEE 754 compliant library should:
>
>>>> math.log(0.0)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: math domain error

Again, this mostly comes from backward compatibility with the math
module's origins (and it is as old as Python itself, again predating
its use of IEEE 754). AFAIK Tim went over the math library very
carefully and cleaned up what he could, so he probably thought about
this as well. Also, IIUC the IEEE library prescribes exceptions as
well as return values; e.g. "man 3 log" on my OSX computer says that
log(0) returns -inf as well as raising a divide-by-zero exception. So I
think this is probably compliant with the standard -- one can decide
to ignore the exceptions in certain contexts and honor them in others.
(Probably even the 1/0 behavior can be defended this way.)
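
One way to picture "honor the exception here, ignore it there" is the
sketch below. It assumes a current Python 3 where math.log(0.0) raises
ValueError; ieee_log is a hypothetical wrapper, not part of the math
module, that returns the special values a C library would return
instead of raising.

```python
import math


def ieee_log(x):
    """Hypothetical wrapper: natural log that returns special values, never raises."""
    if math.isnan(x):
        return x              # NaN propagates unchanged
    if x == 0.0:
        return -math.inf      # the divide-by-zero case: log(0) is -inf
    if x < 0.0:
        return math.nan       # the invalid-operation case: log of a negative
    return math.log(x)


# Honoring the exception: this is what the math module does.
try:
    math.log(0.0)
    stdlib_raised = False
except ValueError:
    stdlib_raised = True

# Ignoring the exception: the wrapper hands back the special value.
quiet_result = ieee_log(0.0)

print(stdlib_raised, quiet_result)
```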

> Some other operations behave inconsistently:
>
>>>> 2 * 10.**308
> inf
>
> but
>>>> 10.**309
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: (34, 'Result too large')

Probably the same. IEEE 754 may be more complex than you think!

> I think non-reflexivity of nan in Python is an accidental feature.

It is not.

> Python's float type was not designed with NaN in mind and until
> recently, it was relatively difficult to create a nan in pure python.

And when we did add NaN and Inf we thought about the issues carefully.

> It is also not true that IEEE 754 requires that nan == nan is false.
> IEEE 754 does not define operator '==' (nor does it define boolean
> false).  Instead, IEEE defines a comparison operation that can have
> one of four results: >, <, =, or unordered.  The standard does require
> that NaN compares unordered with anything including itself, but it
> does not follow that a language that defines an == operator with
> boolean results must define it so that nan == nan is false.

Are you proposing changes again? Because it sure sounds like you are
unhappy with the status quo and will not take an explanation, however
authoritative it is.

Given a language with 6 comparison operators like Python (and most have them),
they have to be mapped to the IEEE comparison *somehow*, and I believe
we chose one of the most logical translations imaginable (given that
nobody likes == and != raising exceptions).
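
The translation can be written out as a table. The sketch below is one
reading of it, not a quote from the standard or from CPython's source:
ieee_relation and TRUE_FOR are hypothetical names, and the table
assumes that "unordered" makes every operator false except !=. The
final loop checks that assumption against what Python's float
operators actually return.

```python
import math
import operator


def ieee_relation(a, b):
    """Hypothetical helper: the four-way IEEE result for two floats."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"
    if a < b:
        return "<"
    if a > b:
        return ">"
    return "="


# For each Python operator, the IEEE relations for which it is true.
TRUE_FOR = {
    "==": {"="},
    "!=": {"<", ">", "unordered"},
    "<": {"<"},
    "<=": {"<", "="},
    ">": {">"},
    ">=": {">", "="},
}

PY_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def mapped(op, a, b):
    """Evaluate operator `op` through the table instead of natively."""
    return ieee_relation(a, b) in TRUE_FOR[op]


nan = float("nan")
samples = [nan, -math.inf, -0.0, 0.0, 1.0, math.inf]

# Collect every case where the table disagrees with the built-in operator.
mismatches = [
    (op, a, b)
    for op in TRUE_FOR
    for a in samples
    for b in samples
    if mapped(op, a, b) != PY_OPS[op](a, b)
]

print(mismatches)
```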

-- 
--Guido van Rossum (python.org/~guido)