[XML-SIG] Major upcoming DOM changes in CVS

Greg Stein gstein@lyra.org
Thu, 18 Mar 1999 14:33:49 -0800

Andrew M. Kuchling wrote:
> Greg Stein writes:
> >post-DOM-construction walk of the DOM tree). Each element can define any
> >number of prefixes, so my stack is a list (one item per element depth)
> >of dictionaries (prefix to URI mapping). When an element "starts", I
>         I could do that, keeping a dictionary on each _nodeData
> instance.  Finding the namespace for a given prefix is then
> proportional to the height of the DOM tree, because you have to start
> at the node and scan back toward the root.  A common operation is
> likely to be "find attribute X in namespace with URI Y", and that
> would be terribly slow; scan back until you find a namespace
> declaration with URI Y, and then check for an attribute with that
> prefix.  That's O(height of tree * # of attributes), but I can't think
> of a better way.

As I mentioned: key the attributes by (URI, name) pairs. A lookup is
then a single dictionary access, regardless of the height of the tree.
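
A minimal sketch of that keying scheme, for illustration only. The
Element class and its method names are hypothetical and are not the
API of the DOM code in CVS; the "about" and "id" attributes are made
up for the example.

```python
# Hypothetical sketch: attributes keyed by (namespace URI, local name).
# Prefixes do not appear anywhere in the in-memory structure.

class Element:
    def __init__(self, uri, name):
        self.uri = uri            # namespace URI, or None for no namespace
        self.name = name          # local name, without any prefix
        self.attributes = {}      # {(uri, local_name): value}

    def set_attribute(self, uri, name, value):
        self.attributes[(uri, name)] = value

    def get_attribute(self, uri, name, default=None):
        # One dictionary access; no walk back toward the root.
        return self.attributes.get((uri, name), default)


RDF = "http://www.w3.org/RDF"

elem = Element(RDF, "Description")
elem.set_attribute(RDF, "about", "http://www.lyra.org/")
elem.set_attribute(None, "id", "n1")

about = elem.get_attribute(RDF, "about")
```

"Find attribute X in namespace with URI Y" becomes
elem.get_attribute(Y, X), with no dependence on which prefix the
source document happened to use.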

Remember: the prefixes are only important when you parse or "render" the
XML. While you're operating on the DOM, those prefixes are
meaningless/bogus. In fact, I might posit that preparing the prefixes is
the job of the XmlWriter class (and the toxml() method is no longer as
useful).

>         It would obviously be better to store a cumulative map on each
> node, reducing the height-of-tree factor to a constant, but I'm
> frightened of that approach, fearing it'll make changing the tree
> expensive or difficult, since you'd either have to recompute the maps
> on an entire subtree every time you change an attribute or move
> something around (expensive), or use smart updating to save CPU time
> (difficult, and potentially a source of bugs from complicated updating
> logic).

The map between prefix and URI is only used at parse/render time. In
this case, I think the idea of a cumulative map works great.
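
For the parse side, a cumulative map could look roughly like the
sketch below. NamespaceScope and its methods are hypothetical names,
and the parser is assumed to hand over each element's raw attributes
as a dictionary. Each stack entry is the complete prefix-to-URI map in
effect at that depth, so resolving a name never scans the ancestors.

```python
# Hypothetical sketch: a stack of cumulative prefix -> URI maps that is
# alive only while parsing. Names are resolved to (URI, local name)
# pairs before they are stored in the DOM.

class NamespaceScope:
    def __init__(self):
        self.stack = [{}]         # one cumulative map per open element

    def start_element(self, attrs):
        scope = dict(self.stack[-1])      # inherit the parent's mappings
        for name, value in attrs.items():
            if name == "xmlns":
                scope[""] = value         # default namespace
            elif name.startswith("xmlns:"):
                scope[name[len("xmlns:"):]] = value
        self.stack.append(scope)

    def end_element(self):
        self.stack.pop()

    def resolve(self, qname, is_attribute=False):
        prefix, sep, local = qname.partition(":")
        if not sep:
            # Unprefixed attributes are in no namespace; unprefixed
            # elements take the default namespace, if one is in scope.
            uri = None if is_attribute else self.stack[-1].get("")
            return (uri, qname)
        # An undeclared prefix raises KeyError in this sketch.
        return (self.stack[-1][prefix], local)


scope = NamespaceScope()
scope.start_element({"xmlns:rdf": "http://www.w3.org/RDF"})
element_key = scope.resolve("rdf:Description")
attribute_key = scope.resolve("rdf:about", is_attribute=True)
scope.end_element()
```

Copying the parent's map on every start tag costs a little at parse
time, but nothing has to be kept in sync afterwards because the maps
are discarded once the tree is built.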

Normally, I associate a set of all (used) namespaces with the document.
It becomes very easy to know ahead of time how many namespaces there
are, define prefixes as ns%d, declare them on the document element, and
then use them throughout the doc.
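
A rough sketch of that approach, under some assumptions: the Element
class, collect_uris(), assign_prefixes() and write() are hypothetical
names, the tree holds elements only (no text nodes), and names are
stored as (URI, name) pairs.

```python
# Hypothetical sketch: gather every namespace URI used in the tree,
# name the prefixes ns0, ns1, ..., and declare them all on the
# document element.
from xml.sax.saxutils import quoteattr


class Element:
    def __init__(self, uri, name, attributes=None, children=None):
        self.uri = uri
        self.name = name
        self.attributes = attributes or {}    # {(uri, name): value}
        self.children = children or []


def collect_uris(elem, found=None):
    """Return the URIs used in the subtree, in first-seen order."""
    if found is None:
        found = []
    for uri in [elem.uri] + [u for (u, _name) in elem.attributes]:
        if uri is not None and uri not in found:
            found.append(uri)
    for child in elem.children:
        collect_uris(child, found)
    return found


def assign_prefixes(uris):
    return {uri: "ns%d" % i for i, uri in enumerate(uris)}


def qname(prefixes, uri, name):
    return name if uri is None else "%s:%s" % (prefixes[uri], name)


def write(elem, prefixes, is_root=True):
    tag = qname(prefixes, elem.uri, elem.name)
    parts = ["<" + tag]
    if is_root:
        # All declarations go on the document element, once.
        for uri, prefix in prefixes.items():
            parts.append(" xmlns:%s=%s" % (prefix, quoteattr(uri)))
    for (uri, name), value in elem.attributes.items():
        parts.append(" %s=%s" % (qname(prefixes, uri, name), quoteattr(value)))
    if not elem.children:
        return "".join(parts) + "/>"
    parts.append(">")
    for child in elem.children:
        parts.append(write(child, prefixes, is_root=False))
    parts.append("</%s>" % tag)
    return "".join(parts)


doc = Element("urn:a", "a", children=[Element("urn:b", "b")])
xml_text = write(doc, assign_prefixes(collect_uris(doc)))
```

The extra pass over the tree is the price of knowing the full set of
namespaces before the first byte is written.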

It would be a bit harder for the DOM to do this, but (logically) a DOM
has a set of namespaces. If the XmlWriter knew those ahead of time, then
the generation part would be easy. The alternative is to simply insert
xmlns: declarations everywhere, as they occur (and reuse parent
URI<->prefix mappings).
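
That alternative might be sketched as follows. Again the Element class
and write() are hypothetical, and the tree is assumed to hold elements
only. A URI-to-prefix map is passed down the recursion; a declaration
is emitted only where a URI is not already in scope from an ancestor.

```python
# Hypothetical sketch: declare each namespace where it first occurs on
# a path from the root, and reuse mappings inherited from ancestors.
from xml.sax.saxutils import quoteattr


class Element:
    def __init__(self, uri, name, attributes=None, children=None):
        self.uri = uri
        self.name = name
        self.attributes = attributes or {}    # {(uri, name): value}
        self.children = children or []


def write(elem, inherited=None, counter=None):
    if counter is None:
        counter = [0]                 # shared so prefixes never collide
    scope = dict(inherited or {})     # copy: siblings must not see ours
    declared = []                     # URIs first declared on this element

    def qname(uri, name):
        if uri is None:
            return name
        if uri not in scope:
            scope[uri] = "ns%d" % counter[0]
            counter[0] += 1
            declared.append(uri)
        return "%s:%s" % (scope[uri], name)

    tag = qname(elem.uri, elem.name)
    attrs = ["%s=%s" % (qname(u, n), quoteattr(v))
             for (u, n), v in elem.attributes.items()]
    decls = ["xmlns:%s=%s" % (scope[u], quoteattr(u)) for u in declared]
    head = " ".join([tag] + decls + attrs)
    if not elem.children:
        return "<%s/>" % head
    body = "".join(write(child, scope, counter) for child in elem.children)
    return "<%s>%s</%s>" % (head, body, tag)


inner = Element("urn:a", "c")
doc = Element("urn:a", "a", children=[Element("urn:b", "b", children=[inner])])
xml_text = write(doc)
```

This needs no pass over the tree beforehand, but a namespace used in
two sibling subtrees gets declared twice, once in each.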

>         In a recent xml-dev posting, David Megginson mentioned that
> some implementors are turning the element names into longer,
> "URI-prefix tagName" strings, like "http://www.w3.org/RDF rdf".  This
> is apparently of dubious legality, but it gets their job done.  I
> think it's an ugly hack, myself...

If they actually remap the element name... yes, I'd say that is a hack.
However, the URI/name pair is, by definition, the actual value of the
element or attribute. We Pythoneers can easily deal with the pair, so we
don't need to resort to the hacks.


Greg Stein, http://www.lyra.org/