[Python-Dev] Alternative implementation of interning

Guido van Rossum guido@python.org
Fri, 16 Aug 2002 12:44:22 -0400


> On Thu, Aug 15, 2002 at 12:46:25PM -0400, Tim Peters wrote:
> > As the only person to have posted an example relying on this behavior, it's
> > OK by me if that example breaks -- it was made up just to illustrate the
> > possibility and raise a caution flag.  I don't take it seriously.

[Oren]
> In Python it's easier to just use the string so there is no real incentive 
> to use the id.  I would say that making the result of the intern() builtin
> mortal is probably safe.

OK, there seems to be consensus on this one.

> The problem is in C extension modules. In C there is an incentive to rely
> on the immortality of interned strings because it makes the code simpler.
> There was an example of this in the Mac import code. PyString_InternInPlace 
> should probably create immortal interned strings for backward compatibility 
> (and be deprecated, of course)

But the vast majority of C code does *not* depend on this.  I'd rather
keep PyString_InternInPlace(), so we don't have to change all call
locations, only the very rare ones that rely on this (Martin found
another two).

Maybe we can even detect the abusing cases by putting a test in
PyString_InternInPlace() like this:

if (s->ob_refcnt == 1) {
    PyErr_Warn(PyExc_DeprecationWarning,
               "interning won't keep your string alive");
    PyErr_Clear(); /* In case the warning was an error, ignore it */
    Py_INCREF(s); /* Make s immortal */
}
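
For illustration only, here is a toy model of that check that compiles
without the Python headers.  ToyString and toy_intern_check() are
hypothetical stand-ins, not the real object struct or the real API, and
the warning is simply printed to stderr rather than going through
PyErr_Warn().  The sketch only shows the bookkeeping: a refcount of 1 is
treated as a sign that the caller may be relying on interning to keep
the string alive, so an extra reference is added and never released.

```c
#include <assert.h>
#include <stdio.h>

/* Toy stand-in for a reference-counted string; NOT the CPython struct. */
typedef struct {
    int refcnt;
    const char *text;
} ToyString;

/* Hypothetical helper mirroring the proposed test.
   Returns 1 if the string was made immortal, 0 otherwise. */
static int toy_intern_check(ToyString *s)
{
    assert(s != NULL && s->refcnt >= 1);
    if (s->refcnt == 1) {
        fprintf(stderr,
                "warning: interning won't keep your string alive\n");
        s->refcnt++;  /* extra reference, never released: immortal */
        return 1;
    }
    return 0;
}
```

A string that already has other owners (refcount above 1) passes
through untouched, so only the suspicious callers pay the cost.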

> Maybe PyString_Intern should be renamed to PyString_InternReference to
> make it more obvious that it modifies the pointer "in place".

The perfect name for that API already exists: PyString_InternInPlace(). :-)
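
What "in place" means here can be sketched with plain C strings.  This
is a toy analogue, not the real implementation: toy_table and
toy_intern_in_place() are hypothetical names, the table is a small
fixed-size array instead of a dict, and no reference counting is
modelled.  The point is only that the function takes the address of the
caller's pointer and may overwrite it with the canonical copy.

```c
#include <assert.h>
#include <string.h>

#define TOY_TABLE_SIZE 16

/* Toy intern table: one canonical pointer per distinct string seen. */
static const char *toy_table[TOY_TABLE_SIZE];
static int toy_count = 0;

/* Hypothetical analogue of an "intern in place" call: *p is replaced
   by the canonical pointer if an equal string was interned before,
   otherwise *p itself becomes the canonical pointer. */
static void toy_intern_in_place(const char **p)
{
    int i;

    assert(p != NULL && *p != NULL);
    for (i = 0; i < toy_count; i++) {
        if (strcmp(toy_table[i], *p) == 0) {
            *p = toy_table[i];  /* the caller's pointer is modified */
            return;
        }
    }
    assert(toy_count < TOY_TABLE_SIZE);
    toy_table[toy_count++] = *p;
}
```

After two equal strings have been passed through the function, the
caller can compare them by pointer instead of calling strcmp().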

--Guido van Rossum (home page: http://www.python.org/~guido/)