[XML-SIG] Performance question

Fred L. Drake, Jr. fdrake@acm.org
Tue, 5 Nov 2002 09:23:52 -0500

Henry S. Thompson writes:
 > If you want _another_ factor of 10, go to PyLTXML.  The report below
 > is from Python 2.2.1 on RedHat Linux 7.2 using PyXML 0.8.1 and
 > PyLTXML-1.3-2.

Wow!  That's fast!

 > I used Fred's driver and added two new functions to test bit-level and
 > tree-level access via PyLTXML.
 >
 > parser performance test
 > 100 parses took 3.88 seconds, or 0.04 seconds/parse
 > 100 parses took 0.25 seconds, or 0.00 seconds/parse
 > 100 parses took 0.02 seconds, or 0.00 seconds/parse
 > 100 parses took 0.03 seconds, or 0.00 seconds/parse
 >
 > The first measurement is the original 4DOM DOM builder, the second is
 > the expatbuilder, the third is PyLTXML returning the whole tree, the
 > fourth is PyLTXML returning every bit (start tag, end tag, text).  I
 > guess the tree is faster because it's slightly lazy wrt Python
 > structures, i.e. only the root is in Python form as returned, the rest
 > gets converted from the native C structs as you walk the Python tree.
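
The driver script itself is not reproduced in this message. As a rough
sketch only, a harness that produces reports in the format quoted above
could look like the following. The sample document and the function names
are assumptions made for illustration, and xml.dom.minidom stands in for
the builders being compared (it parses through expatbuilder by default).
The per-parse figure is printed to four decimal places so that the faster
parsers do not all round to 0.00.

```python
import time
import xml.dom.minidom

# Hypothetical sample input; the document used in the original test is not shown.
SAMPLE = "<doc>" + "<item n='1'>text</item>" * 50 + "</doc>"

def dom_parse(s):
    # minidom.parseString delegates to xml.dom.expatbuilder when no parser is given.
    return xml.dom.minidom.parseString(s)

def time_parses(parse, s, count=100):
    """Call parse(s) count times and return the total elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        parse(s)
    return time.perf_counter() - start

def report(parse, s, count=100):
    """Print and return one report line in the style of the quoted output."""
    total = time_parses(parse, s, count)
    line = "%d parses took %.2f seconds, or %.4f seconds/parse" % (
        count, total, total / count)
    print(line)
    return line

if __name__ == "__main__":
    print("parser performance test")
    report(dom_parse, SAMPLE)
```

Any other parse function that takes a string can be passed to report() in
place of dom_parse to compare it on the same input.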

So is the resulting object compliant with (or at least close to) the
Python DOM API, as defined in the Python Library Reference?  (Lazy
building of structures is fine, of course, since that's an
implementation detail.)  If it doesn't support the DOM API, does it
support something with an equivalent model and functionality?
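
To make the question concrete, here is a small sketch of the kind of
access the Python DOM API provides, written against xml.dom.minidom from
the standard library.  A tree from another parser would count as "close"
if code like this ran against it unchanged.  The sample document and the
helper name summarize are made up for illustration; nothing here says
whether PyLTXML's tree actually supports these calls.

```python
import xml.dom.minidom

# Hypothetical sample document, for illustration only.
SAMPLE = "<doc><item n='1'>one</item><item n='2'>two</item></doc>"

def summarize(document):
    """Collect (attribute, text) pairs for each <item> using only DOM API calls."""
    root = document.documentElement
    pairs = []
    for elem in root.getElementsByTagName("item"):
        # Join the text nodes that are direct children of this element.
        text = "".join(
            node.data for node in elem.childNodes
            if node.nodeType == node.TEXT_NODE
        )
        pairs.append((elem.getAttribute("n"), text))
    return root.tagName, pairs

doc = xml.dom.minidom.parseString(SAMPLE)
print(summarize(doc))
```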

 > Here are the additions I made to Fred's version of the script:
 >
 > def allBits(s):
 >   # Read every bit (start tag, end tag, text) and discard it.
 >   f=PyLTXML.OpenString(s,PyLTXML.NSL_read|PyLTXML.NSL_read_namespaces)
 >   b=PyLTXML.GetNextBit(f)
 >   while b:
 >     b=PyLTXML.GetNextBit(f)
 >   PyLTXML.Close(f)
 >
 > def itemParse(s):
 >   # Skip to the first start tag, then have PyLTXML build the tree for it.
 >   f=PyLTXML.OpenString(s,PyLTXML.NSL_read|PyLTXML.NSL_read_namespaces)
 >   b=PyLTXML.GetNextBit(f)
 >   while b.type!='start':
 >     b=PyLTXML.GetNextBit(f)
 >   d=PyLTXML.ItemParse(f,b.item)
 >   PyLTXML.Close(f)
 >   return d

Ouch!  Very inscrutable code... at least to me.  I must confess that
I've not had time to dig into the LTXML API (C or Python), though I've
stashed a copy of the documentation on my desk somewhere, meaning to
get to it.
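
For readers in the same position, the shape of the two quoted functions
can be approximated with xml.dom.pulldom from the standard library.  This
is only a loose analogue under stated assumptions, not the PyLTXML API:
pulldom events stand in for LTXML "bits", expandNode stands in for
ItemParse, and the sample document and function names are invented for
illustration.

```python
import xml.dom.pulldom as pulldom

# Hypothetical sample document, for illustration only.
SAMPLE = "<doc><item>one</item><item>two</item></doc>"

def all_events(s):
    """Consume every parse event, much as allBits consumes every bit.

    Returns the number of events seen.
    """
    count = 0
    for event, node in pulldom.parseString(s):
        count += 1
    return count

def first_element_tree(s):
    """Skip to the first start tag, then build the full subtree under it."""
    events = pulldom.parseString(s)
    for event, node in events:
        if event == pulldom.START_ELEMENT:
            # expandNode pulls the remaining events for this element
            # and attaches its descendants to the node.
            events.expandNode(node)
            return node
    return None

print(all_events(SAMPLE))
print(first_element_tree(SAMPLE).toxml())
```

Unlike the lazy conversion described for PyLTXML, expandNode builds the
whole subtree as Python objects up front.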


Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation