[Python-Dev] == on object tests identity in 3.x
Andreas Maier
andreas.r.maier at gmx.de
Fri Jul 11 16:04:35 CEST 2014
On 09.07.2014 03:48, Raymond Hettinger wrote:
>
> On Jul 7, 2014, at 4:37 PM, Andreas Maier <andreas.r.maier at gmx.de> wrote:
>
>> I do not really buy into the arguments that try to show how identity and value are somehow the same. They are not, not even in Python.
>>
>> The argument I can absolutely buy into is that the implementation cannot be changed within a major release. So the real question is how we document it.
>
> Once every few years, someone discovers IEEE-754, learns that NaNs
> aren't supposed to be equal to themselves and becomes inspired
> to open an old debate about whether the wreck Python in a effort
> to make the world safe for NaNs. And somewhere along the way,
> people forget that practicality beats purity.
>
> Here are a few thoughts on the subject that may or may not add
> a little clarity ;-)
>
> * Python already has IEEE-754 compliant NaNs:
>
> assert float('NaN') != float('NaN')
>
> * Python already has the ability to filter-out NaNs:
>
> [x for x in container if not math.isnan(x)]
>
> * In the numeric world, the most common use of NaNs is for
> missing data (much like we usually use None). The property
> of not being equal to itself is primarily useful in
> low level code optimized to run a calculation to completion
> without running frequent checks for invalid results
> (much like #N/A is used in MS Excel).
>
> * Python also lets containers establish their own invariants
> to establish correctness, improve performance, and make it
> possible to reason about our programs:
>
> for x in c:
>     assert x in c
>
> * Containers like dicts and sets have always used the rule
> that identity implies equality. That is central to their
> implementation. In particular, the check of interned
> string keys relies on identity to bypass a slow
> character-by-character comparison to verify equality.
>
> * Traditionally, a relation R is considered an equality
> relation if it is reflexive, symmetric, and transitive:
>
> R(x, x) -> True
> R(x, y) -> R(y, x)
> R(x, y) ^ R(y, z) -> R(x, z)
>
> * Knowingly or not, programs tend to assume that all of those
> hold. Test suites in particular assume that if you put
> something in a container, then assertIn() will pass.
>
> * Here are some examples of cases where non-reflexive objects
> would jeopardize the pragmatism of being able to reason
> about the correctness of programs:
>
> s = SomeSet()
> s.add(x)
> assert x in s
>
> s.remove(x) # See collections.abc.Set.remove
> assert not s
>
> s.clear() # See collections.abc.Set.clear
> assert not s
>
> * What the above code does is up to the implementer of the
> container. If you use the Set ABC, you can choose to
> implement __contains__() and discard() to use straight
> equality or identity-implies-equality. Nothing prevents
> you from making containers that are hard to reason about.
>
> * The builtin containers make the choice for
> identity-implies-equality so that it is easier to build fast,
> correct code. For the most part, this has worked out great
> (dictionaries in particular have had identity checks built in
> for almost twenty years).
>
> * Years ago, there was a debate about whether to add an __is__()
> method to allow overriding the is-operator. The push for the
> change was the "pure" notion that "all operators should be
> customizable". However, the idea was rejected based on the
> "practical" notions that it would wreck our ability to reason
> about code, that it would slow down all code that used identity checks,
> that library modules (ours and third-party) already made
> deep assumptions about what "is" means, and that people would
> shoot themselves in the foot with hard to find bugs.
>
> Personally, I see no need to make the same mistake by removing
> the identity-implies-equality rule from the built-in containers.
> There's no need to upset the apple cart for nearly zero benefit.
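For reference, the behaviors listed in the quoted points can be checked with a short sketch. It assumes CPython 3.x. The class StrictEqualityBag and all variable names are made up for illustration and are not part of any library.

```python
import math

nan = float("nan")

# IEEE-754 behavior: a NaN is not equal to itself.
nan_equals_itself = (nan == nan)

# Built-in containers check identity before equality, so the same NaN
# object is still found in the container that holds it.
found_in_list = nan in [nan]
found_in_set = nan in {nan}
found_in_dict = nan in {nan: "missing"}

# A different NaN object is not found: neither identity nor == matches.
other_nan_found = float("nan") in [nan]

# Filtering NaNs out explicitly with math.isnan().
data = [1.0, nan, 2.5]
cleaned = [x for x in data if not math.isnan(x)]


class StrictEqualityBag:
    """Hypothetical container whose membership test uses == only."""

    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, value):
        # No identity shortcut here, only straight equality.
        return any(value == item for item in self._items)

    def __iter__(self):
        return iter(self._items)


bag = StrictEqualityBag([nan])

# The "for x in c: assert x in c" invariant, evaluated without asserting.
invariant_holds = all(x in bag for x in bag)
```

With the built-in containers the membership checks succeed for the same NaN object, while the hypothetical straight-equality container breaks the `x in c` invariant for it.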
Containers delegate the equal comparison on the container to their
elements; they do not apply identity-based comparison to their elements.
At least that is the externally visible behavior.
Only the default comparison behavior implemented on type object follows
the identity-implies-equality rule.
As part of my doc patch, I will upload an extension to the
test_compare.py test suite, which tests all built-in containers with
values whose order differs from the identity order, and it shows that the
value order and equality win over identity, if implemented.
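To illustrate the kind of check I mean (this is only a minimal sketch, not the actual test extension), here is a hypothetical value type named Version that implements its own equality and ordering. The class and variable names are assumptions made for this example.

```python
class Version:
    """Hypothetical value type that compares by its number, not by identity."""

    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number == other.number

    def __hash__(self):
        return hash(self.number)

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


a, b = Version(1), Version(1)

# Two distinct objects that carry the same value.
distinct_objects = a is not b

# Container comparison delegates to the elements' __eq__.
lists_equal = [a] == [b]
tuples_equal = (a,) == (b,)
dicts_equal = {"k": a} == {"k": b}
member_by_value = a in [b]

# Ordering of lists delegates to the elements' __lt__.
list_order = [Version(1)] < [Version(2)]
```

All of these results follow the value comparison of the elements, even though the elements are different objects.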
>
> IMO, the proposed quest for purity is misguided.
> There are many practical reasons to let the builtin
> containers continue to work as they do now.
As I said, I can accept compatibility reasons. Plus, the argument
brought up by Benjamin about the desire for the
identity-implies-equality rule as a default, with no corresponding rule
for order comparison (and I added both to the doc patch).
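A small sketch of that default behavior in Python 3, using a hypothetical class Plain that defines no comparison methods of its own (the names are made up for this example):

```python
class Plain:
    """Hypothetical type that inherits the default comparison from object."""


x, y = Plain(), Plain()

# Default equality is identity-based: an object equals only itself.
equal_to_self = (x == x)
equal_to_other = (x == y)

# There is no default ordering: < raises TypeError for such objects.
try:
    x < y
    ordering_supported = True
except TypeError:
    ordering_supported = False
```

So the default gives identity-based equality, but no identity-based order.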
Andy