On Tue, Oct 9, 2012 at 12:43 PM, Guido van Rossum email@example.com wrote:
> No, that's not what I meant -- maybe my turn of phrase "invoking IEEE" was confusing. The first part is what I meant: "Python cannot have a rule that x is y implies x == y because that would preclude implementing float.__eq__ as IEEE 754 equality comparison." The second half should be: "And we have already (independently from all this) decided that we want to implement float.__eq__ as IEEE 754 equality comparison." I'm sure a logician could rearrange the words a bit and make it look more logical.
I'll have a go. It's a lot longer, though :)
When designing their floating point support, language designers must choose between two mutually exclusive options:
1. IEEE754 compliant floating point comparison where NaN != NaN, *even if* they're the same object
2. The invariant that "x is y" implies "x == y"
The idea behind following the IEEE754 model is that mathematics is a *value based system*. There is only really one NaN, just as there is only one 4 (or 5, or any other specific value). The idea of a number having an identity distinct from its value simply doesn't exist. Thus, when modelling mathematics in an object system, it makes sense to say that *object identity is irrelevant, and only value matters*.
This is the approach Python has chosen: for *numeric* operations, including comparisons, object identity is irrelevant to the maximum extent that is practical. Thus "x = float('nan'); assert x != x" holds for *exactly the same reason* that "x = 10e50; y = 10e50; assert x == y" holds.
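To make the two assertions above concrete, here is a minimal sketch. The variable names are just for illustration, and the second float is built from a string only so that it is constructed separately from the first:

```python
import math

x = float("nan")
assert x is x            # the very same object...
assert x != x            # ...still compares unequal to itself
assert not (x == x)
assert math.isnan(x)     # the reliable way to test for NaN

a = 10e50
b = float("10e50")       # constructed separately, same value
assert a == b            # equal by value; identity plays no part
```

In both halves the comparison looks only at the value. Whether the two operands happen to be the same object never enters into it.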
However, when it comes to containers, being able to assume that "x is y" implies "x == y" has an immense practical benefit in terms of being able to implement a large number of non-trivial optimisations. Thus the Python language definition explicitly allows containers to make that assumption, *even though it is known not to be universally true*.
This hybrid model means that even though "'x is y' implies 'x == y'" is not true in the general case, container implementations may still *assume it to be true*. In particular, the containers defined in the standard library reference are *required* to make this assumption.
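A short sketch of what that assumption looks like from the outside, using the builtin containers as they behave in CPython (the names nan and other_nan are mine, and the example relies on each float("nan") call returning a new object, which is true of CPython):

```python
nan = float("nan")
other_nan = float("nan")   # a second, distinct NaN object

assert nan != nan                 # plain comparison: never equal

# Containers check identity first, so the same object is always "found".
assert nan in [nan]
assert [nan] == [nan]
assert [nan].count(nan) == 1
assert len({nan, nan}) == 1
assert {nan: "value"}[nan] == "value"

# A different NaN object gets no identity shortcut, so == decides.
assert other_nan not in [nan]
assert [nan] != [other_nan]
assert len({nan, other_nan}) == 2
```

So the containers stay internally consistent (you can always find the object you just put in), at the cost of disagreeing with a bare "==" when the element is a NaN.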
This does mean that certain invariants about containers don't hold in the presence of NaN values (for example, "x in c" no longer guarantees that some element of c compares equal to x). This is mostly a theoretical concern, but in those cases where it *does* matter, the appropriate solution is to implement a custom container type that handles NaN values correctly.
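As a rough idea of what such a custom container might look like, here is a hypothetical list subclass (StrictEqList is a name I made up, not anything in the standard library) that drops the identity shortcut for membership tests and counting. It is only a sketch: it deliberately leaves index(), remove() and list-to-list comparison alone, and a real implementation would need to decide what to do about those too.

```python
class StrictEqList(list):
    """Hypothetical list whose membership tests rely on == alone."""

    def __contains__(self, item):
        # Compare by value only; never short-circuit on identity.
        return any(element == item for element in self)

    def count(self, item):
        # Count only the elements that compare equal under ==.
        return sum(1 for element in self if element == item)


nan = float("nan")
plain = [1.0, nan]
strict = StrictEqList([1.0, nan])

assert nan in plain              # builtin list: identity shortcut applies
assert nan not in strict         # strict list: NaN equals nothing
assert strict.count(nan) == 0
assert 1.0 in strict
assert strict.count(1.0) == 1
```

The trade-off is the one described above: every lookup now pays for a real "==" call on each element, which is exactly the cost the identity shortcut lets the builtin containers avoid.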
It's perhaps worth including a section explaining this somewhere in the language reference. It's not an accident that Python behaves the way it does, and spelling out the rationale can help implementors correctly interpret the rest of the language spec.