[XML-SIG] DOM API

Fredrik Lundh fredrik@pythonware.com
Tue, 20 Apr 1999 11:40:32 +0200


Greg wrote:
> On Mon, 19 Apr 1999, Paul Prescod wrote:
> > I'm going to propose instead a light-weight DOM subset. I would rather not
> > require PyXML users to memorize two different APIs depending on whether
> > they're doing light-weight work or heavy-weight work.

the downside with Paul's line of reasoning is that it makes it
impossible to come up with something that is light-weight
also from the CPU's perspective...  not good.

> euh... I can definitely state that in the applications that I've been
> working with, that PIs are bogus, but namespaces are absolutely required.
> (that's how my code came to be!)

as far as I can tell, *all* upcoming XML standards use namespaces.
for a layman like me, they're pretty much part of the standard, so
having them in the core API is a good thing...

...

> Case in point: I wrote a first draft davlib.py against the DOM. Damn it
> was a serious bitch to simply extract the CDATA contents of an element!
> Moreover, it was also a total bitch to simply say "give me the child
> elements". Of course, that didn't work since the DOM insisted on returning
> a list of a mix of CDATA and elements.
> 
> The whole notion of mixing "node types" in a list is completely bogus if
> you want direct simplicity in a model.

well, our internal coreXML system returns a list consisting of Element
instances and plain old strings (for CDATA).  the Element class has
helpers to deal with elements that contain only strings, and elements
that contain only child elements.  most code uses these helpers, and
so automatically flags "bad" XML documents.
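
(a minimal sketch of that kind of model, not the actual coreXML code:
the method names gettext and getelements are made up for illustration,
and ignoring whitespace-only strings between child elements is an
assumption.)

```python
class Element:
    """Hypothetical element: children are Element instances or plain strings."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = list(children or [])

    def gettext(self):
        # helper for elements that should contain only character data;
        # raises if a child element shows up (a "bad" document)
        for child in self.children:
            if not isinstance(child, str):
                raise ValueError("<%s> contains child elements" % self.tag)
        return "".join(self.children)

    def getelements(self):
        # helper for elements that should contain only child elements;
        # whitespace-only strings are skipped (assumption), other text raises
        result = []
        for child in self.children:
            if isinstance(child, str):
                if child.strip():
                    raise ValueError("<%s> contains character data" % self.tag)
            else:
                result.append(child)
        return result


title = Element("title", ["hello ", "world"])
item = Element("item", ["\n  ", title, "\n"])
print(title.gettext())
print([child.tag for child in item.getelements()])
```

calling item.gettext() here would raise ValueError, since item mixes
in a child element.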

I'm not yet convinced that your solution is easier to use -- but I might
change my mind...  just give me some time to think about it.

> It is one of my biggest problems with the DOM thing. Some yahoos
> over in the XML DOM world want all this nifty OO crap, yet they
> have built something that is hardly usable in a practical application.

> IMO, the XML DOM model is a neat theoretical expression of OO
> modelling of an XML document. For all practical purposes, it is
> nearly useless.

Am I the only one who thinks this year's W3C April Fools' joke
was really scary...

> I mean hey: does anybody actually use the DOM to *generate* XML?
> Screw that -- I use "print". I can't imagine generating XML using the DOM.
> Complicated and processing intensive.

...

as an aside, here's an excerpt from Garnet, using our light-weight
XML builder...  root is a parent element, package is an "archive
handler" that takes care of "external entities" (if XML had been
designed by real programmers, it would have supported binary
data from the start ;-)

    def dump(self, root, package=None):
        stack = root.addelement("stack")
        if self.pcs:
            stack.addelement("pcs", self.pcs.tag)
        for i in self.stack:
            item = stack.addelement("item")
            title = i.gettitle()
            if title:
                item.addelement("title", title)
            extent = string.join(map(str, i.getextent()))
            item.addelement("extent", extent)
            i.dump(item, package)

doing this with print statements is quite a bit more error
prone.  this model is also interface-driven -- there's nothing
in here that deals directly with the file format.
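
(to make the builder idea concrete, here's a minimal sketch of an
element class with an addelement method.  this is not the Garnet
builder: the names BuilderElement and tostring, and the reading of
addelement's second argument as text content, are assumptions.)

```python
from xml.sax.saxutils import escape


class BuilderElement:
    """Hypothetical light-weight builder node."""

    def __init__(self, tag, text=None):
        self.tag = tag
        self.text = text
        self.children = []

    def addelement(self, tag, text=None):
        # create a child, attach it, and return it so callers can nest further
        child = BuilderElement(tag, text)
        self.children.append(child)
        return child

    def tostring(self):
        # escaping lives here, in one place, so callers never
        # have to hand-quote "<" or "&" themselves
        body = escape(self.text) if self.text is not None else ""
        body += "".join(child.tostring() for child in self.children)
        return "<%s>%s</%s>" % (self.tag, body, self.tag)


root = BuilderElement("root")
stack = root.addelement("stack")
item = stack.addelement("item")
item.addelement("title", "a < b")
print(root.tostring())
```

the calling code only ever sees addelement; quoting and tag matching
are handled by the serializer, which is where print-based generation
tends to go wrong.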

...

I want something really light-weight, and highly pythonish, and I
don't care in the slightest about TLA compatibility.  the "qp" API is
pretty close to what I want, but I think I can make it even simpler.
more on that later.

Cheers /F