Pure Python XML parser

Wed Oct 8 10:09:20 EDT 2003

Uche Ogbuji wrote:
> After all, DOM pretty much sucks.

Uche,

You're the second person within a week to express that sentiment,
after the EffBot. I was going to make some points in reply to <f/> at
the time, but held back. Now that it's come up again, I thought I'd
chip in my €0,02.

I mostly agree: the DOM sucks. But only when it is used for purposes
for which it was never designed.

DOM was designed to be a simple object model for documents, one that
could be easily manipulated from scripting languages. Most
importantly, it was designed for use on the client side only, i.e. in
the browser. So although the DOM is a *huge* memory hog, this is not a
problem in its natural environment, the browser, where only one DOM is
created per document being viewed.

The problems come with DOM when people start doing stupid things with
it, like trying to use it in server applications, and creating
multiple DOMs to service every single client. IMHO, people are asking
for trouble when they do that, and I have little sympathy for them.
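
To make the server-side point concrete, here is a minimal sketch. It
uses xml.dom.minidom and xml.sax from the Python standard library,
which is an assumption on my part: neither module is named in this
thread, and both sit on top of expat rather than being pure Python
parsers. SAMPLE, count_items_dom, ItemCounter and count_items_sax are
made-up names for illustration only. The DOM version builds the whole
tree before it can answer anything, while the SAX version keeps only a
counter.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, standing in for one client's request.
SAMPLE = b"<doc><item>a</item><item>b</item><item>c</item></doc>"


def count_items_dom(data):
    """Count <item> elements by building a full DOM tree first."""
    document = minidom.parseString(data)  # whole document held in memory
    try:
        return len(document.getElementsByTagName("item"))
    finally:
        document.unlink()  # break reference cycles so the tree can be freed


class ItemCounter(xml.sax.ContentHandler):
    """Streaming handler: the only state kept is a running count."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_items_sax(data):
    """Count <item> elements without ever building a tree."""
    handler = ItemCounter()
    xml.sax.parseString(data, handler)
    return handler.count


if __name__ == "__main__":
    print(count_items_dom(SAMPLE), count_items_sax(SAMPLE))
```

On a document this small the difference is invisible; it only starts
to matter when a server does the DOM version once per client, per
request, on documents of real size.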

I feel that DOM has gotten bad press because of these kinds of
misuse. I still believe that DOM is great for what it was designed
for: non-server, script-based, simple document manipulation.

regards,

-- 
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan:              http://xhaus.com/mailto/alan

Uche Ogbuji wrote:
> 
> jjl at pobox.com (John J. Lee) wrote in message news:<873ce5q707.fsf at pobox.com>...
> > Peter Hansen <peter at engcorp.com> writes:
> >
> > > "Ellinghaus, Lance" wrote:
> > > >
> > > > Does anyone have a pure python XML parser?
> > >
> > > http://www.fourthought.com (I *think* it's pure XML...)
> >
> > I guess you're referring to 4DOM.  Fourthought no longer maintain
> > that.  It's part of PyXML now.
> >
> > I was about to say that 4DOM isn't up-to-date with the DOM spec, but
> > now I come to think of it, I'm not certain whether the XML part is or
> > is not up-to-date.  What I am sure about is that the HTML DOM part
> > isn't (it's level 2, but based on the proposed spec. of September
> > 2000).
> >
> > Whatever, 4DOM certainly isn't as standards-compliant as pxdom.
> 
> I haven't used 4DOM for years.  There are certainly better
> alternatives.
> 
> But I think this discussion is off topic.  The OP was asking about
> *parsers*, not DOM implementations.  For all we know, he's not even
> interested in DOM at all.  After all, DOM pretty much sucks.
> 
> The only pure Python parser I know of is xmlproc, which is part of
> PyXML.
> 
> http://pyxml.sourceforge.net/
> 
> --Uche
> http://uche.ogbuji.net
