[Python-Dev] Re: Alternative implementation of string interning

Tim Peters tim.one@comcast.net
Thu, 04 Jul 2002 02:40:41 -0400


[Timothy Delaney]
> The #1 most important consideration here is backwards
> compatibility IMO.  Whilst I would be personally unaffected by
> this change (allowing interned strings to be collected), we've
> already had examples of people and code that would be.

Have we?  I posted an example I made up -- I've written and seen code
*close* to that, but not close enough to actually break if interned strings
were to get collected.  I also saw Jack's interned-string refcount abuse in
an isolated part of the core Mac support code, but breaking core code never
counts because we have 100% control over the core (if interned strings were
to get collected, we'd fiddle the Mac code for the same release, and nobody
would be the wiser).  I don't recall hearing about anything else here, and I
don't know of anything else.

Any subsystem that can waste an unbounded amount of memory is a potential
cause of user headaches.  I don't like immortal interned strings, and I
don't like the unbounded int or float free lists either.  It's also not good
that pymalloc never returns arenas to the system, although at least that was
carefully designed so that arenas not in use can become and stay paged out
(e.g., it doesn't periodically "tickle" them as part of general
bookkeeping -- when they're unused by the user, they're also untouched by
pymalloc).

So far, I don't know of any real loss that would occur as a result of
reclaiming unreferenced interned strings.
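
For illustration only, here is a minimal sketch of what "reclaiming
unreferenced interned strings" could look like, modeled in pure Python
rather than in the interpreter's C code.  Everything in it is an
assumption made for the example: `InternedStr`, `_table` and
`mortal_intern` are hypothetical names, and the weak-value table merely
stands in for whatever mechanism the core would actually use.  A str
subclass is used because plain str objects cannot be weakly referenced.

```python
import gc
import weakref


class InternedStr(str):
    """str subclass; plain str objects cannot be weakly referenced."""


# Hypothetical intern table: values are held weakly, so an entry
# disappears once nothing else references the interned object.
_table = weakref.WeakValueDictionary()


def mortal_intern(text):
    """Return the canonical InternedStr for text, creating it if needed."""
    canonical = _table.get(text)
    if canonical is None:
        canonical = InternedStr(text)
        _table[text] = canonical
    return canonical


a = mortal_intern("spam")
b = mortal_intern("spam")
same_object = a is b      # both names share one canonical object

del a, b                  # drop the last strong references
gc.collect()              # not required under CPython refcounting; harmless
reclaimed = "spam" not in _table   # the table no longer pins the string
```

In this model the only code that could notice the difference is code
that remembers something about an interned object (its id(), say)
without holding a reference to the object itself, and then expects a
later intern of the same text to return that very object.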