So utterly confused w/ various XML libraries

Martin v. Loewis martin at v.loewis.de
Mon Aug 5 17:29:39 EDT 2002


Robb Shecter <rs at onsitetech.com> writes:

> - Which ones are parts of which

xml.dom.minidom and xml.sax.expatreader are part of both Python 2.x
and PyXML.

xmlproc, sgmlop, 4DOM, and a number of other libraries are part of
PyXML (see PyXML release announcement for a longer list).

xml.xpath and xml.xslt are part of 4Suite and of PyXML 0.7+.

pDomlette, cDomlette, xlink, xpointer, Ods, Rdf, and others are part
of 4Suite (see 4Suite release announcement for details).

Various other libraries exist in isolation.

> - Which ones are drop-in replacements for the standard distributions

PyXML is a drop-in replacement.

> - Which ones are known to be (speed) improvements over the standard
> dist.

All those packages improve the standard dist, either in functionality,
completeness, or speed.

> In my case, we're doing SOAP programming (Using Activestate's
> distribution of PyXML 0.7.)  But receiving even trivial data is too
> slow to be useful, and the profiler shows all the time spent in
> "PyExpat.py".

Ah, so you are building a 4DOM tree. 4DOM is more complete than other
DOM implementations, but also slower. Using SAX instead of building a
DOM tree is faster, and using the plain parser API directly is faster
yet. Of course, you trade convenience for speed.
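
To make the trade-off concrete, here is a minimal sketch that counts
elements three ways: with a DOM tree, with SAX, and with the plain
expat parser API. It assumes a current Python 3 standard library
(not the PyXML/4DOM setup discussed here), and the sample document
and the function names are made up for illustration.

```python
import xml.dom.minidom
import xml.parsers.expat
import xml.sax

# Hypothetical sample document, for illustration only.
DOC = b"<order><item>apple</item><item>pear</item></order>"


def count_with_dom(data):
    # DOM: the whole tree is built in memory, then queried.
    tree = xml.dom.minidom.parseString(data)
    return len(tree.getElementsByTagName("item"))


class ItemCounter(xml.sax.ContentHandler):
    # SAX: no tree is built; the handler is called once per event.
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_with_sax(data):
    handler = ItemCounter()
    xml.sax.parseString(data, handler)
    return handler.count


def count_with_expat(data):
    # Plain parser API: callbacks are set directly on the expat parser,
    # skipping the SAX driver layer.
    count = 0

    def start_element(name, attrs):
        nonlocal count
        if name == "item":
            count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.Parse(data, True)
    return count
```

All three functions return the same count; the difference is how much
machinery sits between the parser and your code, and how much of the
bookkeeping you have to do yourself.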

> And I've seen various pieces of advice like, "Go get
> sgmlop/minidom/cdomlette".  But I don't know how PyXML relates to
> any of these, or PyExpat, or even why I need it...

minidom/cDomlette are indeed faster DOM implementations than 4DOM, but
they also lack features. minidom does not have traversal, ranges, or
events. cDomlette has none of these either, and in addition does not
support all modifications of the tree or some of the DOM Core
functions.
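
If you only need to walk a minidom tree, the missing traversal
interfaces can often be replaced by a few lines of your own. The
following is a rough sketch, again assuming a current Python 3
standard library; iter_elements is a hypothetical helper, not
something minidom provides.

```python
from xml.dom import Node, minidom


def iter_elements(node):
    """Yield element nodes below `node` in document order.

    A hand-rolled stand-in for a DOM traversal interface.
    """
    for child in node.childNodes:
        if child.nodeType == Node.ELEMENT_NODE:
            yield child
            # Descend into the element before moving to its siblings.
            yield from iter_elements(child)


doc = minidom.parseString("<a><b/><c><d/></c></a>")
names = [element.tagName for element in iter_elements(doc)]
```

This only covers a plain depth-first walk over elements; filters,
ranges, and events would still need 4DOM or your own code.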

sgmlop is an XML parser that does not create a tree at all, but it is
quite fast.

Regards,
Martin


