[Pythonmac-SIG] string ids

Chris Rebert pythonmac at rebertia.com
Tue Dec 30 02:29:17 CET 2008


On Mon, Dec 29, 2008 at 11:46 AM, Feat <jf at ai.univ-paris8.fr> wrote:
> At 11:30 -0500 29/12/08, David Hostetler wrote:
>>The 'weird' results you were seeing when using 'is' were really just the Python interpreter lifting up its skirts a bit and (inadvertently perhaps) revealing when it had shared the memory storage for a string literal and when it hadn't.
>
> Yes: thank you. Is there any way to predict whether the values are going to be shared or not? In fact, the ids are the same as long as I do not use a for statement over a sequence containing the values; when I do use one, it sometimes picks the values up from the context and other times creates new ones.

I think CPython only automatically interns certain strings: the empty
string, single characters, and string literals in source code that
look like identifiers. These rules are implementation details and can
change between versions, so you shouldn't worry about them or rely on
them. As explained below, you should almost always use == and not
`is`; thus, comparing strings for pointer equality almost never comes
up in practice, and the 'sharing' rules don't matter much at all.
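
To make the point concrete, here is a minimal sketch in Python 3 syntax
(an assumption; in the Python 2 of this thread, intern() is a builtin
rather than sys.intern). It builds two equal strings at runtime, shows
that == is the reliable test, and shows that explicit interning is the
only way to get a guarantee about identity. The variable names are
purely illustrative.

```python
import sys

# Build two equal strings at runtime, so they are not compile-time
# constants that the interpreter might have chosen to share.
a = "".join(["string", " ", "ids"])
b = "".join(["string", " ", "ids"])

print(a == b)   # True: the values are equal
print(a is b)   # implementation-dependent; never rely on this result

# Explicit interning returns one canonical object per distinct value,
# so identity comparison becomes meaningful for the interned results.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True
```

Even with explicit interning available, == remains the right tool for
comparing string values in ordinary code.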

> I had hoped I could tell my students how to discriminate between the two cases. They are not C programmers yet, but they're fully aware of the difference between the eq() and equal() Lisp predicates, eq() being totally reliable for pointer comparison.

The difference between `is` and == is exactly the same as between eq()
and equal(), pointer equality versus value equality.
`is` is used extremely rarely, apart from comparisons with None, where
`if foo is None:` is the idiomatic form. Unless you're
comparing with None or have a very good reason, always use ==. It's
similar to how one should always use .equals() rather than == with
objects in Java.
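
One reason `is None` is preferred over `== None` is that a class can
override ==, while `is` cannot be overridden. The sketch below is
written in Python 3 syntax; AlwaysEqual and describe are hypothetical
names invented for illustration, not anything from the thread.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()
print(obj == None)  # True: the overridden __eq__ answers for the object
print(obj is None)  # False: identity cannot be overridden


def describe(value=None):
    # `is None` distinguishes "nothing passed" from falsy values like 0.
    if value is None:
        return "nothing given"
    return "got %r" % (value,)


print(describe())   # nothing given
print(describe(0))  # got 0
```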

If you want to give your students a more straightforward example to
demonstrate the difference, use a normal, non-builtin object instead
of a string so that the interpreter doesn't do any unexpected magical
sharing. For example, write a simple Point class (one that defines
__eq__ to compare coordinates; without it, == falls back to identity)
and show how:

a = Point(1, 2)
b = Point(1, 2)
c = a
print a == b #---> True
print a is b, b is a #---> False False
print a is c, c is a #---> True True
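
The snippet above leaves the Point class itself undefined. A minimal
self-contained version might look like the sketch below. It uses
Python 3 syntax (print as a function), and the particular __eq__
implementation is an assumption about what a "simple Point class"
would contain.

```python
class Point:
    """Illustrative 2-D point with value-based equality."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Compare coordinates; defer to the other operand for foreign types.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


a = Point(1, 2)
b = Point(1, 2)   # a distinct object with the same coordinates
c = a             # a second name for the same object

print(a == b)            # True
print(a is b, b is a)    # False False
print(a is c, c is a)    # True True
print(id(a) == id(c))    # True: `is` compares object identity
```

If __eq__ were removed, a == b would print False, because the default
== for user-defined classes compares identity.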

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com

