On 08/10/2012 19:39, Guido van Rossum wrote:
Does this mean that the following behaviour of lists is a bug?
>
> x=float('NAN')
> [x]==[x], [x]<=[x], [x]>=[x]
(True, True, True)
No. That's a special case in the comparisons for sequences.
[Now that I'm back at a real keyboard I can elaborate...]
This applies to all container comparisons: without the rule that if
two contained items reference the same object they are to be
considered equal without calling their __eq__, containers couldn't
take the shortcut that a container is always equal to itself (i.e. c1
is c2 => c1 == c2). Without this shortcut, container comparisons would
be much more expensive: any time a large container was compared to
itself, it would be forced to recursively compare all the contained
items. You might say that it has to do this anyway when comparing to a
container that is not itself, but if the answer is "unequal" the
comparison can stop as soon as two unequal items are found, whereas if
the answer is "equal" you end up comparing all items. For two
different containers there is no possible shortcut, but comparing a
container to itself is quite common and really does deserve the
shortcut. We discussed this in the past and always came to the same
conclusion: despite the rules for NaN, the shortcut for containers is
required. A similar shortcut exists for 'x in [x]' BTW.
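
A minimal sketch of the behaviour described above, assuming CPython, where each call to float('nan') returns a new object. The variable names are illustrative only and not part of the original thread:

```python
nan = float('nan')

# A NaN never compares equal to anything, including itself.
nan_equals_itself = (nan == nan)

# Inside a list, two references to the same object are treated as
# equal without consulting __eq__, so these comparisons succeed.
same_object_eq = ([nan] == [nan])
same_object_le = ([nan] <= [nan])

# Two distinct NaN objects get no identity shortcut, so the
# comparison falls back to __eq__ and fails.
distinct_objects_eq = ([float('nan')] == [float('nan')])

# The 'in' operator applies the same identity-first rule.
same_object_in = nan in [nan]
distinct_object_in = float('nan') in [nan]
```

Here only the comparisons involving the very same NaN object come out true; the ones involving separately created NaN objects do not.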
Thank you for elaborating, I was going to ask what the justification for the
special case was.
You have explained why
x=float('NAN'); A=[x]; A==A
True
but not as far as I can see why
x=float('NAN'); A=[x]; B=[x]; A==B, [x]==[x]
(True, True)
where neither of the results is comparing a container to itself.
It's so that when the container is iterating over pairs of elements it
can check for item identity (a simple pointer comparison) first, which
makes a pretty big difference in speed.
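
The per-item check can be approximated in pure Python as follows. This is only a sketch of the idea, not the actual C implementation; items_equal, list_equal and Noisy are hypothetical names introduced for illustration:

```python
class Noisy:
    """Hypothetical item type that counts calls to __eq__ and never claims equality."""
    calls = 0

    def __eq__(self, other):
        Noisy.calls += 1
        return False


def items_equal(a, b):
    # Identity first (a cheap pointer comparison); only when the two
    # references differ do we pay for a call to __eq__.
    return a is b or a == b


def list_equal(left, right):
    # Sketch of element-wise list equality using the identity-first rule.
    if len(left) != len(right):
        return False
    return all(items_equal(a, b) for a, b in zip(left, right))


n = Noisy()
sketch_result = list_equal([n], [n])   # identity short-circuits, __eq__ not called
builtin_result = ([n] == [n])          # the built-in list comparison agrees
calls_after_identity = Noisy.calls
```

Because both lists hold the same object, neither the sketch nor the built-in comparison ever reaches Noisy.__eq__, even though the two lists themselves are distinct objects.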
--
--Guido van Rossum (python.org/~guido)