[Python-Dev] Alternative implementation of string interning
Oren Tirosh
oren-py-d@hishome.net
Tue, 2 Jul 2002 08:27:38 +0300
On Mon, Jul 01, 2002 at 05:12:31PM -0400, Tim Peters wrote:
> [Oren Tirosh, on <http://python.org/sf/576101>]
> > ...
> > Interned strings are no longer immortal. They die when their refcnt
> > reaches 0 just like any other object.
>
> This may be a problem. Code now can rely on that id(some_interned_string)
> stays the same across the life of a run.
This requires code that stores the id of an object without keeping a
reference to the object itself. It also requires that no other piece of
Python or C code keeps a reference to that object, and yet that its identity
somehow remains significant. I find that extremely hard to imagine.
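For concreteness, here is a minimal sketch of the kind of code that would be affected. It is written against current Python 3, so it uses sys.intern rather than the intern builtin, and the names seen_ids, remember and was_seen are hypothetical, made up for this illustration.

```python
import sys

# Hypothetical registry: it remembers identities but keeps no references.
seen_ids = set()

def remember(name):
    s = sys.intern(name)
    seen_ids.add(id(s))  # only the integer is stored; s is released on return

def was_seen(name):
    # Trustworthy only while something else keeps the interned string alive.
    return id(sys.intern(name)) in seen_ids
```

If some other code holds a reference to the interned string, the id stays valid and was_seen() behaves as expected. Only when nothing at all holds the string can a mortal interned string be freed between the two calls, and then the stored integer no longer identifies anything.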
> > Can anyone explain why they were implemented with a pointer in the first
> > place? Barry?
...
> and PyObject_SetAttr() was changed to make spam_str what you called an
> "indirectly interned" string by magic. This was (or at least Guido thought
> it was <wink>) an important optimization at the time.
I see. As far as I can tell, it isn't any more.
Now for something a bit more radical:
Why not make interned strings a type? <type 'istr'> could be an
un-subclassable subclass of string. intern would just be an alias for this
type. No two istr instances are equal unless they are identical. I guess
PyString_CheckExact would need to be changed to accept either String or
InternedString.
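A rough pure-Python sketch of the idea, not the proposed C implementation: it assumes a weak-value table keyed by string contents, so that instances stay mortal, and it blocks subclassing with __init_subclass__. The _table attribute and the error message are illustrative choices only.

```python
import weakref

class istr(str):
    """Sketch of an interned-string type: equal contents imply the same object."""

    # Maps string contents to the single live istr with those contents.
    # Values are held weakly, so an istr dies when its last reference goes away.
    _table = weakref.WeakValueDictionary()

    def __new__(cls, value=""):
        key = str(value)
        existing = cls._table.get(key)
        if existing is not None:
            return existing
        obj = super().__new__(cls, key)
        cls._table[key] = obj
        return obj

    def __init_subclass__(cls, **kwargs):
        # Makes the type un-subclassable.
        raise TypeError("istr cannot be subclassed")
```

With this, istr("spam") is istr("sp" + "am") holds, so comparing two istr instances reduces to an identity check, while an istr still compares equal to an ordinary string with the same contents.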
Oren