[XML-SIG] string interning

Alexandre Fayolle Alexandre.Fayolle@logilab.fr
Tue, 11 Sep 2001 11:58:09 +0200 (CEST)


On Tue, 11 Sep 2001, Alexandre Fayolle wrote:

> Hello,
> 
> I was browsing in the source code, and came across this in
> xml.sax.handler :
> 
> feature_string_interning = "http://xml.org/sax/features/string-interning"
> # true: All element names, prefixes, attribute names, Namespace URIs, and
> #       local names are interned using the built-in intern function.
> # false: Names are not necessarily interned, although they may be (default).
> # access: (parsing) read-only; (not parsing) read/write
> 
> what does 'interning' mean in this context ?

Since SAX was designed with Java in mind, I took a look at the Java API
documentation for class String, which defines an intern() method:
http://java.sun.com/j2se/1.3/docs/api/java/lang/String.html#intern()

What intern() does is look in a pool of unique Strings for an object
that equals() the current String and return it; if there is none, the
current String is added to the pool and returned. One can then discard
the original String (to be garbage collected later) and use the returned
one. This can save memory (because fewer duplicate objects are kept
around by the parser) and time (because one can use the == operator to
test whether element names are equal, the equivalent of Python's 'is'
test in this context).
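
To make the Python side of this concrete, here is a minimal sketch. It
assumes Python 3, where the built-in intern function mentioned in the
handler comment lives in the sys module as sys.intern(); the names parts,
a, b, ia and ib are only for illustration.

```python
import sys

# Build two equal strings at run time, so that they start out as
# separate objects rather than one shared literal.
parts = ["ele", "ment"]
a = "".join(parts)
b = "".join(parts)

print(a == b)    # compares the characters
print(a is b)    # compares object identity

# sys.intern() returns the pooled copy of an equal string, adding the
# argument to the pool if no equal string is there yet.
ia = sys.intern(a)
ib = sys.intern(b)

print(ia == a)   # interning does not change the contents
print(ia is ib)  # both names now refer to the single pooled object
```

Once both names come from the pool, an identity test is enough to tell
whether they are equal, which is the speed-up described above.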

This, I think, clarifies the issue. Please correct me if I'm wrong. 
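
For what it is worth, here is a rough sketch of how the feature could be
switched on from Python. It assumes Python 3 and that make_parser()
returns the default expat-based reader, which I believe accepts this
feature; the NameCollector class is hypothetical and only there to record
the element names.

```python
import io
from xml.sax import make_parser
from xml.sax.handler import ContentHandler, feature_string_interning


class NameCollector(ContentHandler):
    """Hypothetical handler that just records element names."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


parser = make_parser()
# The feature is read/write only while the parser is not parsing,
# so it has to be set before parse() is called.
parser.setFeature(feature_string_interning, True)

collector = NameCollector()
parser.setContentHandler(collector)
parser.parse(io.BytesIO(b"<root><item/><item/></root>"))

print(collector.names)
# With interning on, the two "item" names are expected to be one object.
print(collector.names[1] is collector.names[2])
```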

Alexandre Fayolle
-- 
LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
Narval, the first software agent available as free software (GPL).