[XML-SIG] Implementing 'live'-ness in a 4DOM fork.

Sun Aug 17 22:39:33 EDT 2003

I want to implement live collections in a forked 4DOM.  OK, 'want' is
perhaps the wrong word :-(((  This is for use in a JavaScript/DOM/Python
binding I'm writing.

Does anybody have advice on the simplest approach?

I don't think I care about performance -- I'm only doing HTML.

I had thought that the simplest way would be to put a wrapper around
NamedNodeMap, NodeList and HTMLCollection (and HTMLOptionsCollection,
which doesn't actually exist in 4DOM -- see below) that simply recomputes
everything every time an attribute is looked up, then delegates to the
standard 4DOM implementation.  However, these interfaces represent mutable
collections, which is going to mess up that approach: there are methods
like NamedNodeMap.removeNamedItem and attributes like
HTMLOptionsCollection.length (which is *not* readonly).
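To make the recompute-on-every-lookup idea concrete, here is a minimal
sketch of the read side only.  Everything in it is hypothetical: the
class name RecomputingList, the `compute` callable and the toy list
standing in for a node tree are illustration, not 4DOM code.

```python
class RecomputingList:
    """Read-only view that rebuilds its contents on every access.

    `compute` is a hypothetical zero-argument callable returning the
    current list of nodes; nothing here is 4DOM API.
    """

    def __init__(self, compute):
        self._compute = compute

    @property
    def length(self):
        # Recomputed each time, so it can never go stale.
        return len(self._compute())

    def item(self, index):
        # DOM-style item(): None when the index is out of range.
        nodes = self._compute()
        return nodes[index] if 0 <= index < len(nodes) else None

    def __len__(self):
        return self.length


# Stand-in "tree": a plain list of tag names instead of real nodes.
tree = ["p", "div", "p"]
paragraphs = RecomputingList(lambda: [t for t in tree if t == "p"])

before = paragraphs.length
tree.append("p")             # mutate the tree after creating the view
after = paragraphs.length    # the view picks the change up unprompted
```

This handles reads only.  A mutator has nowhere to put its change unless
it writes to the tree itself, which is exactly where the approach gets
awkward.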

Worse, I presume that when you remove an item from a NamedNodeMap, the
node tree gets updated.  I'm damned if I can tell from the standard,
though.  All it says is:

| NodeList and NamedNodeMap objects in the DOM are live; that is,
| changes to the underlying document structure are reflected in all
| relevant NodeList and NamedNodeMap objects. For example, if a DOM
| user gets a NodeList object containing the children of an Element,
| then subsequently adds more children to that element (or removes
| children, or modifies them), those changes are automatically
| reflected in the NodeList, without further action on the user's
| part. Likewise, changes to a Node in the tree are reflected in all
| references to that Node in NodeList and NamedNodeMap objects.

That tells me nothing about what happens when you remove a node from a
NamedNodeMap.  Still, I suppose there is no other sane interpretation than
the one I take above -- otherwise, what would happen if you made changes
from *both* the node tree *and* a collection reflecting that tree?
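For what it's worth, the standard library's xml.dom.minidom (in current
Python versions) behaves according to that interpretation.  The snippet
below is only an observation about minidom; it says nothing about 4DOM,
and it is not evidence of what the spec mandates.

```python
from xml.dom.minidom import parseString

doc = parseString('<a href="x" id="y"><b/></a>')
elem = doc.documentElement

# Remove an attribute through the NamedNodeMap, then ask the element.
attrs = elem.attributes
removed = attrs.removeNamedItem("href")
still_there = elem.hasAttribute("href")

# Change the tree, then ask a NodeList obtained beforehand.
children = elem.childNodes
count_before = children.length
elem.appendChild(doc.createElement("c"))
count_after = children.length
```

Here the element no longer reports the attribute once it has been removed
via the map, and the previously obtained NodeList sees the appended
child.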

Given that two-way syncing of state is required between these collections
and the node tree, I can't see how to do this nicely with mutation events,
because you'd get infinite loops without special precautions.  In fact,
that might be the least of the problems -- using mutation events to do
this looks pretty painful anyway, even without this two-way syncing
requirement!
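The kind of special precaution I have in mind is a re-entrancy flag.
This is a toy sketch under my own assumptions: Tree, SyncedMap and the
listener mechanism are made up for illustration and are not the DOM
mutation event API.

```python
class Tree:
    """Toy stand-in for a node tree that notifies listeners on change."""

    def __init__(self, attrs):
        self.attrs = dict(attrs)
        self.listeners = []

    def remove_attr(self, name):
        self.attrs.pop(name, None)
        for listener in self.listeners:
            listener(name)           # the "mutation event"


class SyncedMap:
    """Toy collection that keeps its own copy and syncs both ways."""

    def __init__(self, tree):
        self.tree = tree
        self.data = dict(tree.attrs)
        self._syncing = False        # guard against our own echo
        tree.listeners.append(self._on_tree_removed)

    def remove(self, name):
        if self._syncing:
            return                   # already handling this change
        self._syncing = True
        try:
            self.data.pop(name, None)
            self.tree.remove_attr(name)   # fires the event back at us
        finally:
            self._syncing = False

    def _on_tree_removed(self, name):
        # Without the guard in remove(), this call and remove_attr()
        # would keep invoking each other until the stack overflows.
        self.remove(name)


tree = Tree({"href": "x", "id": "y"})
synced = SyncedMap(tree)
synced.remove("href")       # change starts at the collection
tree.remove_attr("id")      # change starts at the tree
```

Even in this toy, every mutator and every event type needs the same
treatment, which is why it looks painful to me.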

I suppose the simplest thing is to reimplement the collections so they
hold no state of their own and go back to the node tree on every access
(which will require changes to every object that can create a collection,
too).  Slow, but I don't care.
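Roughly what I mean, as a sketch with made-up names (this toy Element
and AttrView are hypothetical, and plain strings stand in for Attr
nodes):

```python
class Element:
    """Toy element: the tree owns the only copy of the attribute state."""

    def __init__(self, **attrs):
        self._attrs = dict(attrs)

    @property
    def attributes(self):
        # A fresh view each time; it is just a pointer back to us.
        return AttrView(self)


class AttrView:
    """NamedNodeMap-like view that keeps no state beyond its owner."""

    def __init__(self, owner):
        self._owner = owner

    @property
    def length(self):
        return len(self._owner._attrs)

    def getNamedItem(self, name):
        return self._owner._attrs.get(name)

    def removeNamedItem(self, name):
        # The change goes straight to the tree, so there is nothing to
        # sync.  (A real DOM raises NOT_FOUND_ERR for a missing name;
        # this toy just lets KeyError propagate.)
        return self._owner._attrs.pop(name)


elem = Element(href="x", id="y")
view_a = elem.attributes
view_b = elem.attributes
view_a.removeNamedItem("href")   # view_b sees this with no extra work
```

Both views stay consistent because neither has anything of its own to
get out of date.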

BTW, in previous posts here about this, I notice that people have claimed
that only getElementsByTagName is affected (because it has to construct a
NodeList that doesn't directly reflect the node tree structure).  Looking
at the source, though, I see that NodeList, NamedNodeMap and
HTMLCollection all call the UserList / UserDict constructors, which copy
whatever they are given into a new list or a new dict.  Unless I missed
something, that means none of these collections is live.  I don't know
whether people here consider this a feature or a bug, but it could be
trivially 'fixed' by binding the existing list or dict to self.data
instead of passing it to the constructor.
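The difference is easy to demonstrate outside 4DOM.  The two classes
below are hypothetical, and I'm using the modern collections.UserList
spelling rather than the old UserList module, but the copying behaviour
is the point:

```python
from collections import UserList


class CopyingNodeList(UserList):
    def __init__(self, nodes):
        # The constructor copies `nodes` into a brand-new list.
        UserList.__init__(self, nodes)


class SharingNodeList(UserList):
    def __init__(self, nodes):
        # Bind to the caller's list instead of copying it.
        self.data = nodes


children = ["a", "b"]
copying = CopyingNodeList(children)
sharing = SharingNodeList(children)

children.append("c")    # the "tree" changes after both were created
```

The sharing version only stays live as long as the owner mutates that
same list in place; if the owner ever rebinds its list to a new object,
the link is lost.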

Also BTW, is anyone still using the HTML DOM?  I posted a PyXML bug on SF
(782470) about HTMLDocument.getElementsByName a while ago -- it appears
quite broken, and no-one seems worried.  I also noticed that
HTMLOptionsCollection is missing -- is that deliberate?

John