[pypy-dev] x is y <=> id(x)==id(y)

Armin Rigo arigo at tunes.org
Sun May 5 11:59:44 CEST 2013

```
Hi all,

I'm wondering again about some "bug" reports that are not bugs: people
misusing "is" to compare two immutable objects.  The current situation
in PyPy is that "is" works like "==" for ints, longs, floats and
complexes.  It does not for strs, unicodes or tuples.  Now of course
someone on python-dev was (indirectly) complaining that ``x is ' '``
works in CPython, because single-character strings are cached there,
but does not work in PyPy.  I'm sure someone else has been bitten by
writing ``x is ()`` in CPython, where the empty tuple is also cached.

(Fwiw I think that there is a design flaw somewhere in Python, to
allow "1 is 1" to be executed without any error but also without any
well-defined result...)

Can we fix it once and for all?  Solution 1 would be to make "is" work
like "==" for strs, unicodes and tuples too.  It's annoying because of
id: if we want ``x is y`` for equal huge strings x and y, but still
want ``id(x)==id(y)``, then we have to compute ``id(some_string)`` in a
rather slow way, producing a huge number.  The same for tuples: if we
always want ``(1, 2) is (1, 2)`` then we need to compute
``id(some_tuple)`` recursively, which can also lead to huge numbers.
In fact such a definition can explode the memory: ``a = (); for i in
range(100): a = (a, a); id(a)`` would likely need a number with about
2**100 digits.

Solution 2 would be to add these hacks only for the cases that CPython
caches: I think by now we're only missing empty or single-char strs
and unicodes, and the empty tuple.

Solution 3 would be to drop half of the rule, keeping only
``id(x)==id(y) => x is y``.  This would be the easiest, as we could
remove the complicated computations already done for longs or floats
or complexes.  We'd clearly document it as a difference from CPython.
The question is what kind of code might break if we drop the other
half, ``x is y => id(x)==id(y)``.

A bientôt,

Armin.
```
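
The memory blow-up described for nested tuples can be illustrated with a rough sketch. It assumes a naive scheme in which a content-derived id has to encode every node of the tuple; the helper names `nested_pair` and `content_id_size` and the one-unit-per-node cost model are hypothetical and are not taken from PyPy.

```python
def nested_pair(depth):
    """Build the tuple from the email: start with () and wrap it `depth` times."""
    a = ()
    for _ in range(depth):
        a = (a, a)
    return a


def content_id_size(obj, _memo=None):
    """Count the units a naive content-derived id would have to encode.

    Hypothetical cost model: one unit per tuple node and one per leaf.
    Shared sub-tuples are memoised by id() so that *counting* stays cheap,
    even though the encoded id itself would not be.
    """
    if _memo is None:
        _memo = {}
    key = id(obj)
    if key not in _memo:
        if isinstance(obj, tuple):
            _memo[key] = 1 + sum(content_id_size(item, _memo) for item in obj)
        else:
            _memo[key] = 1
    return _memo[key]


if __name__ == "__main__":
    for depth in (0, 1, 2, 10, 100):
        print(depth, content_id_size(nested_pair(depth)))
```

Under this model the count for `nested_pair(depth)` is `2**(depth + 1) - 1`, while the tuple itself occupies only `depth + 1` objects in memory because every level shares its child.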
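
The two halves of the rule discussed under Solution 3 can be written out as separate checks. This is only a sketch: the function name `identity_rules` is hypothetical, and the code merely states each implication for one pair of live objects; it does not show how PyPy implements `is` or `id`.

```python
def identity_rules(x, y):
    """Return (forward, backward) for the two halves of the rule.

    forward:  x is y          implies  id(x) == id(y)
    backward: id(x) == id(y)  implies  x is y
    """
    same_object = x is y
    same_id = id(x) == id(y)
    forward = (not same_object) or same_id    # the half Solution 3 would drop
    backward = (not same_id) or same_object   # the half Solution 3 would keep
    return forward, backward


if __name__ == "__main__":
    a = [1, 2]
    print(identity_rules(a, a))            # same object
    print(identity_rules([1, 2], [1, 2]))  # equal but distinct lists
```

Code that relies on the forward half, for example by using `id(x)` as a dictionary key to find an object that is later tested with `is`, is the kind of code that could be affected if that half were dropped.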