[Python-ideas] checking for identity before comparing built-in objects

Tue Oct 9 03:31:40 CEST 2012

On Mon, Oct 8, 2012 at 5:17 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> Alexander, while I might have chosen to make nan == nan True, I consider it
> a near tossup with no happy resolution and would not change it now.

While I did suggest changing the result of nan == nan two years ago,
<http://mail.python.org/pipermail/python-ideas/2010-March/006945.html>,
I am not suggesting it now.  Here I am merely trying to understand to
what extent Python's float implements IEEE 754 and why in some
cases Python's behavior deviates from the standard while in the case
of nan == nan, IEEE 754 is taken as gospel.

> Guido's
> explanation is pretty clear: he went with the IEEE standard as interpreted
> for Python by Tim Peters.

It would be helpful if that interpretation were clearly written down
somewhere.  Without a written document, this interpretation seems
apocryphal to me.

Earlier in this thread, Guido wrote: "I am not aware of an update to
the standard."  To the best of my knowledge IEEE Std 754 was last
updated in 2008.  I don't think the differences between 1985 and 2008
revisions matter much for this discussion, but since I am going to
refer to chapter and verse, I will start by citing the document that I
will use:

IEEE Std 754(TM)-2008
(Revision of IEEE Std 754-1985)
IEEE Standard for Floating-Point Arithmetic
Approved 12 June 2008
IEEE-SA Standards Board

(AFAICT, the main difference between 754-2008 and 754-1985 is that the
former includes decimal floats added in 854-1987.)

Now, let me put my language lawyer hat on and compare Python's
floating-point implementations to the IEEE 754-2008 standard.  Here
are the relevant clauses:

3. Floating-point formats
4. Attributes and rounding
5. Operations
6. Infinity, NaNs, and sign bit
7. Default exception handling
8. Alternate exception handling attributes
9. Recommended operations
10. Expression evaluation
11. Reproducible floating-point results

Clause 3 (Floating-point formats) defines five formats: 3 binary and 2
decimal.  Python supports a superset of decimal formats and a single
binary format.  Section 3.1.2 (Conformance) contains the following
provision: "A programming environment conforms to this standard, in a
particular radix, by implementing one or more of the basic formats of
that radix as both a supported arithmetic format and a supported
interchange format."  I would say Python is conforming to Clause 3.
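
The single binary format can be inspected at run time.  A minimal
sketch, assuming a platform where the C double behind Python's float
is IEEE binary64 (true of the mainstream platforms I know of, but not
guaranteed by the language):

```python
import struct
import sys

# sys.float_info describes the C double underlying Python's float.
info = sys.float_info
print(info.radix)     # radix of the exponent; 2 for a binary format
print(info.mant_dig)  # precision in bits; 53 for binary64
print(info.max_exp)   # 1024 for binary64

# As an interchange format, binary64 occupies 8 bytes.
width = struct.calcsize("<d")
print(width)
```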

Clause 4 (Attributes and rounding) is supported only by Decimal
through contexts: "For attribute specification, the implementation
shall provide language-defined means, such as compiler directives, to
specify a constant value for the attribute parameter for all standard
operations in a block; the scope of the attribute value is the block
with which it is associated."  I believe Decimal is mostly conforming,
but float is not conforming at all.
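
As an illustration of what "a constant value for the attribute ... in
a block" looks like for Decimal, here is a small sketch using
decimal.localcontext().  The precision and rounding mode chosen are
arbitrary, and the sketch assumes the default context is otherwise
unmodified:

```python
from decimal import Decimal, localcontext, ROUND_DOWN

with localcontext() as ctx:
    # These attributes apply to every operation inside the block only.
    ctx.prec = 3
    ctx.rounding = ROUND_DOWN
    inside = Decimal(2) / Decimal(3)

# Outside the block the previous context is restored.
outside = Decimal(2) / Decimal(3)

print(inside)   # 0.666
print(outside)  # many more digits, rounded under the outer context
```

Nothing comparable exists for float: there is no way to ask for a
different rounding direction for the float operations in a block.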

Clause 5 requires that "[a]ll conforming implementations of this
standard shall provide the operations listed in this clause for all
supported arithmetic formats, except as stated below."  In other
words, a language standard that claims conformance with IEEE 754 must
provide all operations unless the standard states otherwise.  Let's
try to map IEEE 754 required operations to Python float operations.

5.3.1 General operations

sourceFormat roundToIntegralTiesToEven(source)
sourceFormat roundToIntegralTiesToAway(source)
sourceFormat roundToIntegralTowardZero(source)
sourceFormat roundToIntegralTowardPositive(source)
sourceFormat roundToIntegralTowardNegative(source)
sourceFormat roundToIntegralExact(source)

Python only provides float.__trunc__ which implements
roundToIntegralTowardZero.  (The builtin round() belongs to a
different category because it changes format from double to int.)
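
Since round() hands back an int, a float-returning
roundToIntegralTiesToEven has to be built on top of it.  A minimal
sketch follows; the helper name is mine, not anything in the stdlib,
and it assumes Python 3 semantics for round():

```python
import math

def round_to_integral_ties_to_even(x):
    """Hypothetical helper: round x to an integral value, staying in float."""
    if math.isnan(x) or math.isinf(x):
        return x  # round() would raise on these inputs
    # round(x) yields an int using ties-to-even; copysign restores the
    # sign of zero that is lost on the way through int.
    return math.copysign(float(round(x)), x)

print(round_to_integral_ties_to_even(2.5))   # 2.0
print(round_to_integral_ties_to_even(3.5))   # 4.0
print(round_to_integral_ties_to_even(-0.4))  # -0.0
```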

sourceFormat nextUp(source)
sourceFormat nextDown(source)

I don't think these are available for Python floats.
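
For the curious, nextUp can be emulated by stepping the bit pattern
of the double.  This is only a sketch under the assumption that float
is IEEE binary64; next_up is a hypothetical helper of mine, and
nextDown(x) would be -next_up(-x):

```python
import math
import struct

def next_up(x):
    """Hypothetical helper: the least double that compares greater than x."""
    if math.isnan(x) or x == float("inf"):
        return x
    if x == 0.0:
        # Covers both +0.0 and -0.0: smallest positive subnormal.
        return struct.unpack("<d", struct.pack("<q", 1))[0]
    # Reinterpret the double as a signed 64-bit integer.
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    # For positive x the next double has the next larger pattern; for
    # negative x the magnitude must shrink, so the pattern decreases.
    bits += 1 if x > 0 else -1
    return struct.unpack("<d", struct.pack("<q", bits))[0]

print(next_up(1.0) - 1.0)  # one ulp of 1.0, i.e. 2**-52
```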

sourceFormat remainder(source, source) - float.__mod__

Not fully conforming.  For example, the standard requires
remainder(-2.0, 1.0) to return -0.0, but in Python 3.3:

>>> -2.0 % 1.0
0.0

On the other hand,

>>> math.fmod(-2.0, 1.0)
-0.0
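
Because 0.0 == -0.0, the difference is easy to miss in a program.  The
sketch below uses math.copysign to expose the sign bit (sign_of is
just an illustrative helper).  It also shows a caveat: fmod truncates
the quotient, while the IEEE remainder rounds it to nearest, so the
two agree in the case above but not in general.

```python
import math

def sign_of(x):
    # copysign transfers the sign bit, so -0.0 and 0.0 are told apart.
    return math.copysign(1.0, x)

mod_result = -2.0 % 1.0
fmod_result = math.fmod(-2.0, 1.0)
print(sign_of(mod_result))   # 1.0: the zero takes the divisor's sign
print(sign_of(fmod_result))  # -1.0: the zero takes the dividend's sign

# 5/3 rounds to 2, so the IEEE remainder would be 5 - 3*2 = -1.0,
# but fmod truncates the quotient to 1 and returns 5 - 3*1.
truncated = math.fmod(5.0, 3.0)
print(truncated)             # 2.0
```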

sourceFormat minNum(source, source)
sourceFormat maxNum(source, source)
sourceFormat minNumMag(source, source)
sourceFormat maxNumMag(source, source)

I don't think these are available for Python floats.
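
The builtin min() and max() are not substitutes: with a NaN operand
their result depends on argument order.  Below is a sketch of what a
minNum could look like; min_num is a hypothetical name, and signaling
NaNs and the ordering of signed zeros are ignored:

```python
import math

nan = float("nan")

# The builtin keeps whichever argument it saw first when the
# comparison with NaN comes out False.
print(min(nan, 1.0))  # nan
print(min(1.0, nan))  # 1.0

def min_num(x, y):
    """Hypothetical minNum: a single quiet NaN operand is ignored."""
    if math.isnan(x):
        return y
    if math.isnan(y):
        return x
    return min(x, y)

print(min_num(nan, 1.0))  # 1.0
print(min_num(1.0, nan))  # 1.0
```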

5.3.3 logBFormat operations

I don't think these are available for Python floats.
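
That said, math.frexp and math.ldexp appear to come close to logB and
scaleB.  The sketch below is my own mapping, not something the
documentation promises; log_b and scale_b are hypothetical names, and
log_b handles only finite nonzero arguments:

```python
import math

def log_b(x):
    """Hypothetical logB for finite nonzero x: floor(log2(|x|))."""
    # frexp returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1,
    # so the exponent of the leading bit is e - 1.
    mantissa, exponent = math.frexp(x)
    return exponent - 1

def scale_b(x, n):
    """Hypothetical scaleB: x * 2**n computed without intermediate rounding."""
    return math.ldexp(x, n)

print(log_b(8.0))         # 3
print(log_b(0.75))        # -1
print(scale_b(1.5, 4))    # 24.0
```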

5.4.1 Arithmetic operations

formatOf-addition(source1, source2) - float.__add__
formatOf-subtraction(source1, source2) - float.__sub__
formatOf-multiplication(source1, source2) - float.__mul__
formatOf-division(source1, source2) - float.__truediv__
formatOf-squareRoot(source1) - math.sqrt
formatOf-fusedMultiplyAdd(source1, source2, source3) - missing
formatOf-convertFromInt(int) - float.__new__

With the exception of fusedMultiplyAdd, Python float is conforming.
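
To see why fusedMultiplyAdd cannot be spelled x*y + z, here is a
sketch that emulates the single rounding with exact rational
arithmetic.  fma_sketch is a hypothetical helper; it assumes finite
inputs and does not reproduce the sign of an exact zero result:

```python
from fractions import Fraction

def fma_sketch(x, y, z):
    """Hypothetical helper: x*y + z computed exactly, then rounded once."""
    exact = Fraction(x) * Fraction(y) + Fraction(z)  # no rounding here
    return float(exact)                              # the only rounding

x = 1.0 + 2.0 ** -30
y = 1.0 - 2.0 ** -30
z = -1.0

# The exact product is 1 - 2**-60, which rounds to 1.0 as a double,
# so the two-step computation loses the whole result.
two_roundings = x * y + z
one_rounding = fma_sketch(x, y, z)
print(two_roundings)  # 0.0
print(one_rounding)   # -2**-60
```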

intFormatOf-convertToIntegerTiesToEven(source)
intFormatOf-convertToIntegerTowardZero(source)
intFormatOf-convertToIntegerTowardPositive(source)
intFormatOf-convertToIntegerTowardNegative(source)
intFormatOf-convertToIntegerTiesToAway(source)
intFormatOf-convertToIntegerExactTiesToEven(source)
intFormatOf-convertToIntegerExactTowardZero(source)
intFormatOf-convertToIntegerExactTowardPositive(source)
intFormatOf-convertToIntegerExactTowardNegative(source)
intFormatOf-convertToIntegerExactTiesToAway(source)

Python has a single builtin round().
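
For comparison, here is how a few of the rounding directions behave
with round() and the math module functions in Python 3, where all of
them return an int.  The pairing with the IEEE names is my own
reading; I know of no stdlib spelling for the ties-to-away variant or
for the inexact-signaling "Exact" family.

```python
import math

x = -2.5
ties_to_even = round(x)          # convertToIntegerTiesToEven
toward_zero = math.trunc(x)      # convertToIntegerTowardZero
toward_positive = math.ceil(x)   # convertToIntegerTowardPositive
toward_negative = math.floor(x)  # convertToIntegerTowardNegative

print(ties_to_even, toward_zero, toward_positive, toward_negative)
# -2 -2 -2 -3
```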

5.5.1 Sign bit operations

sourceFormat copy(source) - float.__pos__
sourceFormat negate(source) - float.__neg__
sourceFormat abs(source) - float.__abs__
sourceFormat copySign(source, source) - math.copysign

Python float is conforming.
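
A quick check on a negative zero, where the sign bit is the only thing
that changes.  math.copysign(1.0, v) is used here merely to make the
sign bit of v visible:

```python
import math

x = -0.0
copied = +x       # copy: sign bit preserved
negated = -x      # negate: sign bit flipped
absolute = abs(x) # abs: sign bit cleared
signed = math.copysign(3.0, x)  # copySign: magnitude 3.0, sign of x

print(math.copysign(1.0, copied))    # -1.0
print(math.copysign(1.0, negated))   # 1.0
print(math.copysign(1.0, absolute))  # 1.0
print(signed)                        # -3.0
```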

Now we are getting close to the issue at hand:
"""
5.6.1 Comparisons
Implementations shall provide the following comparison operations, for
all supported floating-point operands of the same radix in arithmetic
formats:

boolean compareQuietEqual(source1, source2)
boolean compareQuietNotEqual(source1, source2)
boolean compareSignalingEqual(source1, source2)
boolean compareSignalingGreater(source1, source2)
boolean compareSignalingGreaterEqual(source1, source2)
boolean compareSignalingLess(source1, source2)
boolean compareSignalingLessEqual(source1, source2)
boolean compareSignalingNotEqual(source1, source2)
boolean compareSignalingNotGreater(source1, source2)
boolean compareSignalingLessUnordered(source1, source2)
boolean compareSignalingNotLess(source1, source2)
boolean compareSignalingGreaterUnordered(source1, source2)
boolean compareQuietGreater(source1, source2)
boolean compareQuietGreaterEqual(source1, source2)
boolean compareQuietLess(source1, source2)
boolean compareQuietLessEqual(source1, source2)
boolean compareQuietUnordered(source1, source2)
boolean compareQuietNotGreater(source1, source2)
boolean compareQuietLessUnordered(source1, source2)
boolean compareQuietNotLess(source1, source2)
boolean compareQuietGreaterUnordered(source1, source2)
boolean compareQuietOrdered(source1, source2).
"""

Signaling comparisons are missing.  Ordered/Unordered comparisons are
missing.  Note that the standard does not require any particular
spelling for operations.  "In this standard, operations are written as
named functions; in a specific programming environment they might be
represented by operators, or by families of format-specific functions,
or by operations or functions whose names might differ from those in
this standard."  (Sec. 5.1)  It would be perfectly conforming for
Python to spell compareSignalingEqual() as '==' and
compareQuietEqual() as math.eq() or even
ieee754_2008.compareQuietEqual().  The choice that Python made was not
dictated by the standard.  (As I have shown above, Python's %
operation does not implement a conforming IEEE 754 remainder(), but
math.fmod() seems to fill the gap, at least for the sign of a zero
result.)
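
To illustrate that the spelling is a free choice, some of the missing
quiet predicates could be written as plain functions on top of what
float already offers.  The names below are hypothetical, chosen by me
for this sketch; nothing like them exists in the math module:

```python
import math

def compare_quiet_unordered(x, y):
    # Two operands are unordered when at least one of them is a NaN.
    return math.isnan(x) or math.isnan(y)

def compare_quiet_ordered(x, y):
    return not compare_quiet_unordered(x, y)

def compare_quiet_not_greater(x, y):
    # True when x <= y or when the operands are unordered.
    return not (x > y)

def compare_quiet_less_unordered(x, y):
    # True when x < y or when the operands are unordered.
    return not (x >= y)

nan = float("nan")
print(compare_quiet_unordered(nan, 1.0))    # True
print(compare_quiet_not_greater(nan, 1.0))  # True
print(nan <= 1.0)                           # False
```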

This post is already too long, so I'll leave Clauses 6-11 for another
time.  "IEEE 754 may be more complex than you think!" (GvR, earlier in
this thread.)  I hope I have already made the case that Python's float does
not conform to IEEE 754 and that IEEE 754 does not require an
operation spelled "==" or "float.__eq__" to return False when
comparing two NaNs.  The standard requires support for 22 comparison
operations, but Python's float supports around six.  On top of that,
Python has an operation that has no analogue in IEEE 754 - the "is"
comparison.  This is why the IEEE 754 standard does not help in answering
the main question in this thread: should (x is y) imply (x == y)?  We
need to formulate a rationale for breaking this implication without a
reference to IEEE 754 or Tim's interpretation thereof.
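
To make the question concrete, here is how the implication currently
breaks for floats while containers quietly assume it.  The variable
names are mine; the behavior shown is that of CPython 3:

```python
nan = float("nan")
other_nan = float("nan")  # a distinct NaN object

same_object = nan is nan          # True
equal_to_itself = nan == nan      # False: (x is y) does not imply (x == y)

# Containers check identity before equality, so they disagree with ==.
found_same = nan in [nan]                  # True
found_other = other_nan in [nan]           # False
lists_same = [nan] == [nan]                # True
lists_other = [nan] == [other_nan]         # False

print(same_object, equal_to_itself)
print(found_same, found_other, lists_same, lists_other)
```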

Language-lawyerly yours,

Alexander Belopolsky