[Python-Dev] == on object tests identity in 3.x

Raymond Hettinger raymond.hettinger at gmail.com
Wed Jul 9 03:48:17 CEST 2014


On Jul 7, 2014, at 4:37 PM, Andreas Maier <andreas.r.maier at gmx.de> wrote:

> I do not really buy into the arguments that try to show how identity and value are somehow the same. They are not, not even in Python.
> 
> The argument I can absolutely buy into is that the implementation cannot be changed within a major release. So the real question is how we document it.

Once every few years, someone discovers IEEE-754, learns that NaNs
aren't supposed to be equal to themselves, and becomes inspired
to open an old debate about whether to wreck Python in an effort
to make the world safe for NaNs.  And somewhere along the way,
people forget that practicality beats purity.

Here are a few thoughts on the subject that may or may not add
a little clarity ;-)

* Python already has IEEE-754 compliant NaNs:

       assert float('NaN') != float('NaN')

* Python already has the ability to filter out NaNs:

       [x for x in container if not math.isnan(x)]

* In the numeric world, the most common use of NaNs is for
  missing data (much like we usually use None).  The property
  of not being equal to itself is primarily useful in
  low-level code optimized to run a calculation to completion
  without running frequent checks for invalid results
  (much like #N/A is used in MS Excel).

* Python also lets containers establish their own invariants
  to establish correctness, improve performance, and make it
  possible to reason about our programs:

           for x in c:
               assert x in c

* Containers like dicts and sets have always used the rule
  that identity implies equality.  That is central to their
  implementation.  In particular, the check of interned
  string keys relies on identity to bypass a slow
  character-by-character comparison to verify equality.

* Traditionally, a relation R is considered an equality
  relation if it is reflexive, symmetric, and transitive:

      R(x, x) -> True
      R(x, y) -> R(y, x)
      R(x, y) ^ R(y, z) -> R(x, z)

* Knowingly or not, programs tend to assume that all of those
  hold.  Test suites in particular assume that if you put
  something in a container, assertIn() will pass.

* Here are some examples of cases where non-reflexive objects
  would undermine our practical ability to reason about the
  correctness of programs:

      s = SomeSet()
      s.add(x)
      assert x in s

      s.remove(x)        # See collections.abc.MutableSet.remove
      assert not s

      s.clear()          # See collections.abc.MutableSet.clear
      assert not s

* What the above code does is up to the implementer of the
  container.  If you use the MutableSet ABC, you can choose to
  implement __contains__() and discard() to use straight
  equality or identity-implies-equality.  Nothing prevents
  you from making containers that are hard to reason about.

* The builtin containers make the choice for identity-implies-
  equality so that it is easier to build fast, correct code.
  For the most part, this has worked out great (dictionaries
  in particular have had identity checks built in for almost
  twenty years).

* Years ago, there was a debate about whether to add an __is__()
  method to allow overriding the is-operator.  The push for the
  change was the "pure" notion that "all operators should be
  customizable".  However, the idea was rejected based on the
  "practical" notions that it would wreck our ability to reason
  about code, that it would slow down all code that used identity
  checks, that library modules (ours and third-party) already made
  deep assumptions about what "is" means, and that people would
  shoot themselves in the foot with hard-to-find bugs.
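
As a rough illustration of the points above, here is a minimal sketch
of how NaNs behave inside the built-in containers.  The variable
names (nan, other, items, cleaned) are made up for the example; the
behaviour shown is that of the built-in list, set, and dict in
CPython 3.x.

```python
import math

nan = float('nan')

# IEEE-754 behaviour: a NaN is not equal to itself.
assert nan != nan

# Built-in containers check identity before equality, so the
# invariant "for x in c: assert x in c" still holds with a NaN inside.
items = [1.0, nan, 2.0]
assert all(x in items for x in items)
assert nan in {nan}
assert {nan: 'missing'}[nan] == 'missing'

# A different NaN object is neither identical nor equal to the
# stored one, so it is not found.
other = float('nan')
assert other not in items

# Filtering NaNs out by value works independently of all this.
cleaned = [x for x in items if not math.isnan(x)]
assert cleaned == [1.0, 2.0]
```

The identity shortcut is what lets the same NaN object be found
again, while math.isnan() remains the tool for asking whether a
value is a NaN at all.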

Personally, I see no need to make the same mistake by removing
the identity-implies-equality rule from the built-in containers.
There's no need to upset the apple cart for nearly zero benefit.
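
To make the trade-off concrete, here is a sketch of a container that
does drop the rule.  StrictEqSet is a hypothetical name; it is built
on collections.abc.MutableSet, backed by a plain list, and compares
with == only.  It is an illustration under those assumptions, not a
proposal and not anything in the standard library.

```python
from collections.abc import MutableSet


class StrictEqSet(MutableSet):
    """Hypothetical list-backed set that uses only ==, never identity."""

    def __init__(self):
        self._items = []

    def __contains__(self, value):
        # An explicit == loop avoids the identity shortcut that
        # list.__contains__ would take.
        return any(value == item for item in self._items)

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, value):
        if value not in self:
            self._items.append(value)

    def discard(self, value):
        # Keep every item that does not compare equal to value.
        self._items = [item for item in self._items if not (item == value)]


nan = float('nan')
s = StrictEqSet()
s.add(nan)

found = nan in s      # the NaN we just added is not reported as a member
s.discard(nan)
leftover = len(s)     # discard() could not match it, so it is still there
```

With a NaN stored, membership fails and discard() cannot remove it,
so the invariants listed earlier no longer hold for this container.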

IMO, the proposed quest for purity is misguided.
There are many practical reasons to let the builtin
containers continue to work as they do now.


Raymond 