Re: [Tutor] <var:data> assignment

Magnus Lycka magnus at thinkware.se
Tue May 11 20:25:56 EDT 2004


I wrote:
> > If you know that you use a particular string often, or need operations
> > on it to be faster, e.g. to speed up dictionary access with this string
> > as a key, you can force Python to intern it. (You can only do this with
> > strings.)
..
> > I'm not sure exactly what algorithm Python uses to decide which objects
> > to intern automagically.

Danny responded:
> The word "intern" really should apply to strings; I don't think intern()
> works on arbitrary objects.

I just wrote that! But I guess I am a bit sloppy when I use the same 
word for basically the same thing being done on integers as on strings. 
God forbid using the name of builtin functions for any other purpose! ;) 
Do you have a better term for this sharing of objects?

> As an optimization hack, the integers in the half-open interval
> 
>     [-5, 100)
> 
> are created in advance and are kept alive in the Python runtime, so that a
> request for a small integer is quickly fulfilled by dipping into this
> "small integer" pool.

Aha. Do you know how it's done with short strings? (Don't tell me
that strings such as "Hello" are created in advance by Python! :)
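
The small integer pool Danny describes can be observed from Python itself.
What follows is only a rough sketch, assuming CPython; the helper name
shares_identity is made up for illustration, and the exact bounds of the
pool are an implementation detail that has changed between versions. The
integers are built from strings at run time so that the compiler cannot
simply merge two equal constants.

```python
import sys


def shares_identity(n):
    """Build the integer n twice at run time and report whether both
    results are the very same object (hypothetical helper)."""
    a = int(str(n))  # parsed at run time, not a compile-time constant
    b = int(str(n))
    return a is b


if __name__ == "__main__":
    print("implementation:", sys.implementation.name)
    # A value inside the small integer pool: on CPython both names
    # should end up bound to the one pre-created object.
    print("42      shared:", shares_identity(42))
    # A value well outside the pool: typically two separate objects,
    # but nothing in the language guarantees either outcome.
    print("1000000 shared:", shares_identity(10**6))
```

Either way, a == b holds for both values; only object identity differs, which
is why "is" should not be used to compare numbers.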

As I look further, it seems that it has nothing to do with size
as I thought, but that all strings that are valid Python identifiers
are interned. 

The manual says that "Normally, the names used in Python programs are 
automatically interned, and the dictionaries used to hold module, class 
or instance attributes have interned keys."

It's obviously more extensive than that. All string *literals* that 
could possibly be names used in Python programs seem to get interned.

Note that it only happens automatically with string literals, not with 
strings that are the result of an expression (as far as I can see).

>>> s1 = 'Hexllpwioutpreoiuwptriwpwreioptwr243523452345345___'
>>> s2 = 'Hexllpwioutpreoiuwptriwpwreioptwr243523452345345___'
>>> s1 is s2
True
>>> s1 = 'a.'
>>> s2 = 'a.'
>>> s1 is s2
False
>>> s1 = 'z'*2
>>> s2 = 'z'*2
>>> s1 is s2
False
>>> s1 = 'z' + 'a'
>>> s2 = 'z' + 'a'
>>> s1 is s2
False
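
For strings built at run time, interning can be requested explicitly. Here
is a minimal sketch, assuming Python 3, where the builtin intern() has moved
to sys.intern(); the helper name build is made up for illustration. Note that
the automatic behaviour shown in the session above depends on the interpreter
version, so the True/False results there may differ on other releases. Only
the explicit call is something to rely on.

```python
import sys


def build(parts):
    """Assemble a string at run time so that it is not a literal
    (hypothetical helper)."""
    return "".join(parts)


s1 = build(["a", "."])
s2 = build(["a", "."])

# The values are equal; whether s1 and s2 are the same object is left
# to the implementation, so it is not checked here.
print(s1 == s2)

# sys.intern() returns one canonical object per distinct string value,
# so both calls hand back the same object.
i1 = sys.intern(s1)
i2 = sys.intern(s2)
print(i1 is i2)

# Interned strings work as ordinary dictionary keys.
table = {i1: "found"}
print(table[i2])
```

The dictionary lookup works with or without interning; the point of interning
the key is only that equal keys can be recognised by identity before any
character-by-character comparison is needed.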

> > Python never interns mutable objects. Why?
> 
> Aliasing reasons.  If strings were mutable, then something like:
> 
> ### Pseudocode
> word1 = intern("hello")
> word2 = intern("hello")
> word2[1] = 'a'
> ###
> 
> would wreak havoc: what would we expect word1 to contain, "hello" or
> "hallo"?

Yes, yes I know. This was intended as a pedagogical question for 
the "pupil", not for the "professor"! ;)
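
For the "pupil", the aliasing problem can be shown with an object that really
is mutable. This is only an illustration using a list of characters in place
of the imaginary mutable string from the pseudocode above; the two names are
bound to one shared object, which is what interning would do.

```python
# One list object, reachable through two names, standing in for a
# hypothetical "interned" mutable string.
word1 = list("hello")
word2 = word1          # no copy is made; both names refer to the same list

word2[1] = "a"         # change the object through one name...

# ...and the change is visible through the other name as well.
print("".join(word1))  # hallo
print(word1 is word2)  # True
```

With immutable objects such as strings and integers, this surprise cannot
happen, so sharing them is safe.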

-- 
Magnus Lycka, Thinkware AB
Alvans vag 99, SE-907 50 UMEA, SWEDEN
phone: int+46 70 582 80 65, fax: int+46 70 612 80 65
http://www.thinkware.se/  mailto:magnus at thinkware.se


