[pypy-dev] x is y <=> id(x)==id(y)

Armin Rigo arigo at tunes.org
Sun May 5 21:41:54 CEST 2013


Hi all,

On Sun, May 5, 2013 at 9:16 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>> It's true what you're saying, but we consistently see bug reports
>> about people comparing ints or strings with is and complaining that
>> they work fine on cpython, but not on pypy.
>
> Then their code is buggy, not PyPy. But you know that :-)

This is precisely what this thread is about: such "buggy" code that
uses "is" to compare two immutable objects.  At this point, the
question is not "would it cause any trouble in existing programs to
say that 'x is not y' when CPython in the same program says that 'x is
y'?", because we know that the answer to that is "yes".

We already found a perfectly reasonable fix for "small" objects: two
equal ints are always "is"-identical and have the same id() in PyPy.
This is a nice way to solve the above problem.  If anything it creates
the opposite problem: some code that works on PyPy might not work on
CPython.  If PyPy becomes widely used, CPython will then have to care
about that too, and we'll end up with a well-defined meaning of
"is" on immutable objects :-)

But we're not (yet) using the same idea on *all* types of immutable
objects.  So what we're concerned about now is whether it could be
implemented efficiently: the answer could be "yes, if we forget about
strictly enforcing 'x is y <=> id(x) == id(y)'".  So, the question:
although doing so is documented to be wrong, would it actually cause
any trouble to relax this requirement?
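
To make the requirement concrete, here is a minimal sketch of a check
for it.  The helper name identity_consistent and the sample values are
made up for illustration; an implementation that strictly enforces the
equivalence leaves the violations list empty, while a relaxed one could
in principle report pairs here.

```python
import itertools


def identity_consistent(x, y):
    """Return True when 'x is y' and 'id(x) == id(y)' agree for this pair."""
    return (x is y) == (id(x) == id(y))


# The sample objects are kept alive in a list so that their ids stay valid
# for the whole duration of the check.
samples = [0, 10 ** 20, 1.5, "", ".", "pypy", (1, 2), None]

violations = [
    (x, y)
    for x, y in itertools.product(samples, repeat=2)
    if not identity_consistent(x, y)
]

print("violations:", violations)
```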

> a = b = X   # regardless of what X is
> mylist = [a, None]
> assert mylist[0] is a
> assert mylist[0] is b
>
> both assertions must pass, no matter what X is, whether mutable or
> immutable.

I *think* that in this case the assertions cannot fail in PyPy either.
If X is a string, then we get as "mylist[0]" an object that is a
different W_StringObject but internally contains the same
RPython-level string, and as such (because we tweaked "is") they
compare "is"-identical.  But that seems like a problem waiting to
happen: if in the future we use a list strategy for lists
of single characters, then W_StringObjects containing single
characters will be rebuilt out of an RPython list of characters, and
will not be "is"-identical under our current definition.
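
The difference between the two cases can be sketched with a toy model
in plain Python.  This is not PyPy's actual code: RStr, WStr and is_w
are hypothetical stand-ins for an RPython-level string, a
W_StringObject wrapper, and the tweaked "is", under the assumption
that "is" compares the identity of the wrapped low-level strings.

```python
class RStr:
    """Hypothetical stand-in for an RPython-level string."""

    def __init__(self, chars):
        self.chars = chars


class WStr:
    """Hypothetical stand-in for an app-level wrapper like W_StringObject."""

    def __init__(self, rstr):
        self.rstr = rstr


def is_w(a, b):
    """Toy 'is': two wrappers are identical if they share one low-level string."""
    return a.rstr is b.rstr


original = WStr(RStr("."))

# Reading back from a list that kept the low-level string: a different
# wrapper, but the same RStr inside.
rewrapped = WStr(original.rstr)

# Character-list strategy: only the character itself is stored, so reading
# it back builds a new low-level string with equal contents.
stored_chars = [original.rstr.chars]
rebuilt = WStr(RStr(stored_chars[0]))

print(is_w(original, rewrapped))  # True
print(is_w(original, rebuilt))    # False
```

In this model the second result is False even though both wrappers
hold the same character, which is the situation described above.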

In addition, the problem right now is about code like ``if x[5] is
'.': ...`` which happens to work as expected on CPython, but not on
PyPy.  In PyPy's case the two strings x[5] and '.' are using different
RPython-level strings.
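
For reference, a minimal sketch of the portable spelling of that test.
The helper has_dot_at and the sample string are made up for
illustration; the point is only that "==" compares values, while the
outcome of "is" on two equal strings depends on the implementation.

```python
def has_dot_at(text, index):
    """Portable check: compare the character by value, not by identity."""
    return text[index] == "."


x = "hello.world"
dot = "."

print(has_dot_at(x, 5))  # True on any implementation
print(x[5] is dot)       # implementation-dependent; do not rely on it
```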


A bientôt,

Armin.

