[Python-Dev] PyObject_RichCompareBool identity shortcut

Thu Apr 28 05:14:38 CEST 2011

On Wed, Apr 27, 2011 at 9:28 AM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
>
> On Apr 27, 2011, at 7:53 AM, Guido van Rossum wrote:
>
>> Maybe we should just call off the odd NaN comparison behavior?
>
> I'm reluctant to suggest changing such enshrined behavior.

No doubt there would be some problems; probably more for decimals than
for floats.

> ISTM, the current state of affairs is reasonable.

Hardly; when I picked the NaN behavior I knew the IEEE std prescribed
it but had never seen any code that used this.

> Exotic objects are allowed to generate exotic behaviors
> but consumers of those objects are free to ignore some
> of those behaviors by making reasonable assumptions
> about how an object should behave.

I'd say that the various issues and inconsistencies brought up (e.g. x
in A even though no a in A equals x) make it clear that one ignores
NaN's exoticness at one's peril.
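
A minimal sketch of the inconsistency mentioned above, assuming CPython's
container behavior (membership tests check identity before equality). The
names A, contains, any_equal and fresh_contains are illustrative only:

```python
# One NaN object stored in a list.
nan = float("nan")
A = [nan]

# Membership uses the identity shortcut, so the very same object is "in" A...
contains = nan in A

# ...even though no element of A compares equal to it.
any_equal = any(a == nan for a in A)

# A different NaN object is not found, since identity fails and == is False.
fresh_contains = float("nan") in A

print(contains, any_equal, fresh_contains)
```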

> It's possible to make objects where the __hash__ doesn't
> correspond to __eq__; they just won't behave well with
> hash tables.

That's not the same thing at all. Such an object would violate a rule
of the language (although one that Python cannot strictly enforce) and
it would always be considered a bug. Currently NaN is not violating
any language rules -- it is just violating users' intuition, in a much
worse way than Inf does. (All in all, Inf behaves pretty intuitively,
at least for someone who was awake during at least a few high school
math classes. NaN is not discussed there. :-)
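
For contrast, here is a rough sketch of the kind of rule-violating object
being discussed: equal instances that hash differently. The class BadKey and
its fake_hash parameter are hypothetical, made up purely for illustration:

```python
class BadKey:
    """Hypothetical class whose __hash__ deliberately disagrees with __eq__."""

    def __init__(self, value, fake_hash):
        self.value = value
        self.fake_hash = fake_hash

    def __eq__(self, other):
        return isinstance(other, BadKey) and self.value == other.value

    def __hash__(self):
        # Bug on purpose: equal objects may return different hashes.
        return self.fake_hash


a = BadKey(1, fake_hash=1)
b = BadKey(1, fake_hash=2)

equal = (a == b)        # the two objects compare equal
found = b in {a}        # but the set lookup misses, because the hashes differ

print(equal, found)
```

This would be considered a bug in BadKey, whereas a NaN in a container
breaks no rule of the language.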

> Likewise, it's possible for a sequence to
> define a __len__ that is different from its true length; it
> just won't behave well with the various pieces of code
> that assume collections are equal if the lengths are unequal.

(you probably meant "are never equal")

Again, typically a bug.

> All of this seems reasonable to me.

Given the IEEE std and Python's history, it's defensible and hard to
change, but still, I find reasonable too strong a word for the
situation.

I expect that if 15 years or so ago I had decided to ignore the
IEEE std and declare that object identity always implies equality it
would have seemed quite reasonable as well... The rule could be
something like "the == operator first checks for identity and if left
and right are the same object, the answer is True without calling the
object's __eq__ method; similarly the != would always return False
when an object is compared to itself". We wouldn't change the
inequalities, nor the outcome if a NaN is compared to another NaN (not
the same object). But we would extend the special case for object
identity from containers to all == and != operators. (Currently it
seems that all NaNs have a hash() of 0. That hasn't hurt anyone so
far.)
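
The hypothetical rule could be sketched in pure Python roughly as follows.
The helper names proposed_eq and proposed_ne are assumptions for
illustration; this is not how == is implemented, only a model of the rule
described above:

```python
def proposed_eq(left, right):
    """Model of the hypothetical ==: identity implies equality."""
    if left is right:
        return True          # __eq__ is never consulted for the same object
    return left == right     # otherwise fall back to today's behavior


def proposed_ne(left, right):
    """Model of the hypothetical !=: an object is never unequal to itself."""
    if left is right:
        return False
    return left != right


nan = float("nan")
other_nan = float("nan")     # a distinct NaN object

print(nan == nan)                    # today's behavior
print(proposed_eq(nan, nan))         # same object under the proposed rule
print(proposed_eq(nan, other_nan))   # different NaN objects: unchanged
print(proposed_ne(nan, nan))
print(nan < nan, nan >= nan)         # inequalities would not change
```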

Doing this in 3.3 would, alas, be a huge undertaking -- I expect that
there are tons of unittests that depend either on the current NaN
behavior or on x == x calling x.__eq__(x). Plus the decimal unittests
would be affected. Perhaps somebody could try?

-- 
--Guido van Rossum (python.org/~guido)