[XML-SIG] Performance question
Henry S. Thompson
06 Nov 2002 13:59:16 +0000
"Fred L. Drake, Jr." <firstname.lastname@example.org> writes:
> Henry S. Thompson writes:
> > If you want _another_ factor of 10, go to PyLTXML. The report below
> > is from Python 2.2.1 on RedHat Linux 7.2 using PyXML 0.8.1 and
> > PyLTXML-1.3-2.
> Wow! That's fast!
> > I used Fred's driver, added two new functions to test bit-level and
> > tree-level access via PyLTXML.
> > parser performance test
> > 100 parses took 3.88 seconds, or 0.04 seconds/parse
> > 100 parses took 0.25 seconds, or 0.00 seconds/parse
> > 100 parses took 0.02 seconds, or 0.00 seconds/parse
> > 100 parses took 0.03 seconds, or 0.00 seconds/parse
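A report in this format could come from a driver shaped roughly like the sketch below. This is not Fred's actual driver: time_parser, report and SAMPLE are names invented for illustration, the code is modern Python rather than 2.2, and the standard library's xml.dom.minidom stands in for the parsers measured above.

```python
import time
import xml.dom.minidom

# Hypothetical sample document; the real test file is not shown in this thread.
SAMPLE = b"<doc><item id='1'>text</item><!-- note --><item id='2'/></doc>"


def time_parser(parse, data, count=100):
    """Call parse(data) `count` times; return (total_seconds, seconds_per_parse)."""
    start = time.perf_counter()
    for _ in range(count):
        parse(data)
    total = time.perf_counter() - start
    return total, total / count


def report(total, per_parse, count=100):
    """Format one result line in the same style as the report quoted above."""
    return "%d parses took %.2f seconds, or %.2f seconds/parse" % (
        count, total, per_parse)


if __name__ == "__main__":
    # minidom is only a stand-in parser for this sketch.
    total, per_parse = time_parser(xml.dom.minidom.parseString, SAMPLE)
    print(report(total, per_parse))
```

With a two-decimal format like this, any per-parse time under 5 ms prints as 0.00, so the total column is the one to compare.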
> > The first measurement is the original 4DOM DOM builder, the second is
> > the expatbuilder, the third is PyLTXML returning the whole tree, the
> > fourth is PyLTXML returning every bit (start tag, end tag, text). I
> > guess the tree is faster because it's slightly lazy wrt Python
> > structures, i.e. only the root is in Python form as returned, the rest
> > gets converted from the native C structs as you walk the Python tree.
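The lazy conversion described here can be illustrated with a small sketch. It is not PyLTXML code: LazyNode and NATIVE_TREE are hypothetical, and nested tuples stand in for the native C structs. Only the root is wrapped when the tree is returned; children are wrapped the first time they are asked for.

```python
# Stand-in for a tree held in native structures: (tag, [children]) tuples.
NATIVE_TREE = ("doc", [("section", [("para", []), ("para", [])]),
                       ("section", [])])


class LazyNode:
    """Wraps one native node; converts its children only on first access."""

    conversions = 0  # number of native nodes wrapped so far

    def __init__(self, native):
        LazyNode.conversions += 1
        self.tag = native[0]
        self._native_children = native[1]
        self._children = None  # nothing converted yet

    @property
    def children(self):
        if self._children is None:
            # Convert this level only, and cache the result as a tuple.
            self._children = tuple(
                LazyNode(child) for child in self._native_children)
        return self._children


root = LazyNode(NATIVE_TREE)  # only the root has been converted here
```

A benchmark that parses and never walks the tree pays for one conversion per document, which would be consistent with the tree-level figure coming out slightly ahead of the bit-level one.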
> So is the resulting object compliant (or at least close) to the Python
> DOM, as defined in the Python Library Reference?
> (Lazy building of structures is fine, of course, since that's
> implementation.) If it doesn't support the DOM API, does it support
> something with an equivalent model and functionality?
I believe so -- our model actually _predates_ the DOM, and we've never
had the time/resources to roll it forward, but it was of course
solving the same problem.
The documentation lists the following Python object types:
These correspond to the xml.dom objects as follows, I think:
FileType       DOMImplementation Objects
ItemType       Node Objects
python tuple   NodeList Objects
DoctypeType    DocumentType Objects
FileType       Document Objects
ItemType       Element Objects
not exposed    Attr Objects
not exposed    NamedNodeMap Objects
OOBType        Comment Objects
ItemType       Text and CDATASection Objects
OOBType        ProcessingInstruction Objects
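For readers less familiar with the xml.dom side of this correspondence, here is a minimal sketch using the standard library's xml.dom.minidom. It shows only the DOM objects named above and says nothing about the PyLTXML API; SAMPLE and describe are names invented for illustration.

```python
import xml.dom.minidom
from xml.dom import Node

# Hypothetical sample with one of each child kind listed in the table.
SAMPLE = "<root a='1'>text<!-- c --><?target data?><child/></root>"

# Map DOM nodeType constants to the object names used in the library reference.
NODE_KINDS = {
    Node.DOCUMENT_NODE: "Document",
    Node.ELEMENT_NODE: "Element",
    Node.TEXT_NODE: "Text",
    Node.COMMENT_NODE: "Comment",
    Node.PROCESSING_INSTRUCTION_NODE: "ProcessingInstruction",
}


def describe(node):
    """Return the DOM object kind for a node, or "other" if not listed."""
    return NODE_KINDS.get(node.nodeType, "other")


doc = xml.dom.minidom.parseString(SAMPLE)
root = doc.documentElement
kinds = [describe(n) for n in root.childNodes]  # childNodes is a NodeList
attrs = root.attributes                         # a NamedNodeMap of Attr nodes
```

In the DOM, attributes are reached through a NamedNodeMap of Attr nodes; those are the two rows marked "not exposed" on the PyLTXML side.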
The details are in the documentation which comes with the source
distribution, which uses distutils and is GPL-click-wrapped at
To avoid hassle, you'll want the source and the appropriate binary
distribution at a minimum -- actually _building_ the extension
requires an LT XML installation as well.
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
W3C Fellow 1999--2002, part-time member of W3C Team
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: email@example.com
[mail really from me _always_ has this .sig -- mail without it is forged spam]