[Python-Dev] Alternative implementation of string interning

Oren Tirosh oren-py-d@hishome.net
Tue, 2 Jul 2002 08:27:38 +0300


On Mon, Jul 01, 2002 at 05:12:31PM -0400, Tim Peters wrote:
> [Oren Tirosh, on <http://python.org/sf/576101>]
> > ...
> > Interned strings are no longer immortal.  They die when their refcnt
> > reaches 0 just like any other object.
> 
> This may be a problem.  Code now can rely on that id(some_interned_string)
> stays the same across the life of a run.

That would require code that stores the id of an object without keeping a 
reference to the object itself.  It would also require that no other piece 
of Python or C code holds a reference to that object, and yet that its 
identity somehow remains significant.  I find that extremely hard to imagine.
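
To make the scenario concrete, here is a minimal sketch of the only pattern
that could notice mortal interned strings: keeping the id and nothing else.
The helper name make_interned is hypothetical, and sys.intern is the
Python 3 spelling of what was the builtin intern() when this was written.

```python
import sys

def make_interned(parts):
    # Hypothetical helper: build the string at run time so it is not
    # simply a compile-time constant, then intern it.
    return sys.intern("".join(parts))

s = make_interned(["sp", "am"])
saved_id = id(s)            # fragile pattern: remember only the id

# While any reference to s is alive, interning an equal string
# returns the very same object, so the saved id still matches.
same = make_interned(["sp", "am"])
assert same is s
assert id(same) == saved_id

# Only if *every* reference were dropped could a mortal interned
# string be freed; a later intern of "spam" might then get a new id.
# Nothing is asserted about that case because it is not guaranteed
# either way.
```

As long as anything holds a reference to the string, its id stays stable,
which is the case for essentially all code that cares about identity.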

> > Can anyone explain why they were implemented with a pointer in the first
> > place? Barry?
...
> and PyObject_SetAttr() was changed to make spam_str what you called an
> "indirectly interned" string by magic.  This was (or at least Guido thought
> it was <wink>) an important optimization at the time.

I see.  As far as I can tell, it isn't any more.


Now for something a bit more radical:

Why not make interned strings a type?  <type 'istr'> could be an 
un-subclassable subclass of string.  intern would just be an alias for this 
type.  No two istr instances are equal unless they are identical.  I guess 
PyString_CheckExact would need to be changed to accept either String or 
InternedString.
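
A rough pure-Python sketch of the idea, for illustration only.  The real
thing would live in C; the class body, the weak-value table and the use of
__init_subclass__ to block subclassing are all assumptions of this sketch
and not part of the patch.

```python
import weakref

class istr(str):
    """Hypothetical interned-string type: one instance per value."""

    # Weak values, so an istr dies when its last reference goes away.
    _table = weakref.WeakValueDictionary()

    def __new__(cls, value=""):
        key = str(value)                 # plain str used as the table key
        existing = cls._table.get(key)
        if existing is not None:         # "" is falsy, so test against None
            return existing
        obj = super().__new__(cls, key)
        cls._table[key] = obj
        return obj

    def __init_subclass__(cls, **kwargs):
        # Runs when someone tries to subclass istr; refuse.
        raise TypeError("istr cannot be subclassed")

a = istr("".join(["sp", "am"]))
b = istr("spam")
assert a is b          # equal values always yield the identical instance
assert a == "spam"     # still compares equal to an ordinary string
```

Because construction always returns the canonical instance, two istr
objects with the same value cannot exist at once, so equality between
istr instances reduces to identity.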

	Oren