[Python-Dev] Interning string subtype instances
"Martin v. Löwis"
martin at v.loewis.de
Tue Feb 13 11:18:32 CET 2007
Hrvoje Nikšić wrote:
> Another reason is that Python's interning mechanism is much better than
> such a simple implementation: it stores the interned state directly in
> the PyStringObject structure, so you can find out that a string is
> already interned without looking it up in the dictionary. This
> information can be (and is) used by both Python core and by C extensions.
> Another advantage is that, as of recently, interned strings can be
> garbage collected, which is typically not true of simple replacements
> (although it could probably be emulated by using weak references, it's
> not trivial.)
OTOH, in an application that needs unique strings, you normally know
what the scope is (i.e. where the strings come from, and when they
aren't used anymore).
For example, in XML parsing, pyexpat supports an interning dictionary.
It puts all element and attribute names into it (but not element content,
which typically isn't likely to be repeated). It starts with a fresh
dictionary before parsing starts, and releases the dictionary when
parsing is done.
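
A rough sketch of that pattern, assuming a current Python 3 where
xml.parsers.expat.ParserCreate accepts an intern keyword argument taking a
dictionary. The helper name parse_element_names and the sample document are
made up for illustration; the dictionary is returned only so it can be
inspected, and it goes away as soon as the caller drops it.

```python
import xml.parsers.expat


def parse_element_names(xml_text):
    """Return (element names in document order, interning dictionary)."""
    interned = {}  # fresh dictionary, scoped to this one parse
    names = []

    # Assumption: ParserCreate takes the interning dictionary via `intern`.
    parser = xml.parsers.expat.ParserCreate(intern=interned)
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)

    return names, interned


names, interned = parse_element_names(
    "<list><item>x</item><item>y</item></list>"
)

# Repeated element names should come back as one shared string object.
same_object = names[1] is names[2]
```

Nothing here touches the global interning table, so the unique strings
live exactly as long as the application keeps the dictionary around.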
Regards,
Martin