[XML-SIG] Future plans

Laurent Szyster l.szyster@ibm.net
Mon, 20 Dec 1999 18:48:21 +0100


"Andrew M. Kuchling" wrote:
> 
> Laurent Szyster writes:
> >"Andrew M. Kuchling" wrote:
> >>   that was terribly slow; building a DOM tree of "The Winter's
> >>   Tale" now takes around 25 seconds, not 80.  (4DOM takes around
> >>   12 seconds for the same job.)
> >On what kind of computer (CPU model, clock speed)?
> 
> My 266MHz Linux box.  The code still needs more looking at; I'm not
> sure where the bottleneck is at the moment: the DOM, SAX, PyExpat, ...?

Most probably the DOM, surely not PyExpat.

On my 233MHz Linux box (HP NetServer E45), using PyExpat, a modified
qp_xml.py core with additional layers for Namespace and XPath support,
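my figures are (the last two columns give each time as a percentage
of the 25 secs for the DOM and the 12 secs for 4DOM quoted above):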

                                                 DOM     4DOM
  ------------------------------------------------------------
  PyExpat with no callback       0.37 secs       1.48%   3.08%
  base qp_xml like parser        1.64 secs (0)   6.56%  13.68%
  Enhanced pythonic DOM          1.97 secs (1)   7.88%  16.42%
  XML Namespace support added    2.55 secs (2)  10.20%  21.25%
  XPath capable DOM              4.64 secs (3)  18.56%  38.67%

(0) experience showed that node object instantiation makes the
    biggest performance difference between this parser and
    PyExpat with no callback.
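
A rough way to see the cost described in (0) is to time expat twice over the same input, once with no callback and once with callbacks that instantiate one object per element. This is only a sketch under assumptions: the Node class and the synthetic document are made up for the illustration, and it uses the xml.parsers.expat module of current Python rather than the parser measured above, so the absolute figures will not match the table.

```python
import time
import xml.parsers.expat


class Node:
    """Hypothetical minimal node: a name, attributes, children and text."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = attrs
        self.children = []
        self.text = []


def parse_bare(data):
    """Run expat over the document with no callbacks registered."""
    parser = xml.parsers.expat.ParserCreate()
    parser.Parse(data, True)


def parse_tree(data):
    """Run expat again, instantiating one Node per element."""
    root = Node(None, {})
    stack = [root]

    def start(name, attrs):
        node = Node(name, attrs)
        stack[-1].children.append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(text):
        stack[-1].text.append(text)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(data, True)
    return root.children[0]


def timed(func, data):
    """Return (elapsed seconds, result) for one call of func(data)."""
    started = time.perf_counter()
    result = func(data)
    return time.perf_counter() - started, result


# Synthetic stand-in for a play: many small elements with text.
document = "<PLAY>" + "<LINE n='1'>some words</LINE>" * 20000 + "</PLAY>"

bare_secs, _ = timed(parse_bare, document)
tree_secs, tree = timed(parse_tree, document)
print("no callback: %.3f secs" % bare_secs)
print("node tree:   %.3f secs" % tree_secs)
```

The difference between the two timings approximates the callback and object-creation overhead; it says nothing about the extra layers in (1) to (3).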

(1) each element object instance's __dict__ is modified so that
    you can access the nth occurrence of its child 'type' as

      element.type[i]

    or its attribute 'name' as

      element.name
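
The access pattern in (1) could be sketched as below. This is an assumption about how such an interface might be built, not the actual code: the Element class, its append method and the sample data are all hypothetical.

```python
class Element:
    """Hypothetical element whose children and attributes live in __dict__."""

    def __init__(self, type_, attributes=None):
        self._type = type_
        # Each XML attribute becomes a plain instance attribute: element.name
        self.__dict__.update(attributes or {})

    def append(self, child):
        # Children are grouped in a list per element type: element.type[i]
        self.__dict__.setdefault(child._type, []).append(child)
        return child


# Made-up sample data, only to exercise the interface.
play = Element("PLAY", {"lang": "en"})
act = play.append(Element("ACT", {"number": "1"}))
act.append(Element("SCENE", {"title": "first scene"}))
act.append(Element("SCENE", {"title": "second scene"}))

print(play.lang)                   # attribute 'lang' of PLAY
print(play.ACT[0].number)          # attribute of the first ACT child
print(play.ACT[0].SCENE[1].title)  # second SCENE child of that ACT
```

One limit of this simple version: an attribute and a child element type with the same name would collide in __dict__, so a real implementation would need a rule for that case.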

(2) namespace lookup: maps element types to class instances
    based on the namespace info and applies functions for the
    namespace's attributes (obviously, win_tale.xml only tests the
    overhead of no namespace processing ;-)
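
A minimal sketch of the lookup in (2), assuming a table keyed by (namespace URI, local name) that selects the class to instantiate. The Element and Price classes, the CLASSES table and the urn:example:shop URI are invented for the example; the functions applied to namespace attributes are left out.

```python
import xml.parsers.expat


class Element:
    """Default class for element types with no registered namespace class."""

    def __init__(self, uri, local, attrs):
        self.uri, self.local, self.attrs = uri, local, attrs
        self.children = []


class Price(Element):
    """Hypothetical class bound to one namespaced element type."""


# (namespace URI, local name) -> class; the URI is made up for the example.
CLASSES = {("urn:example:shop", "price"): Price}


def split(name):
    """Split expat's 'uri local' names; un-namespaced names have no URI."""
    uri, _, local = name.rpartition(" ")
    return (uri or None), local


def parse(data):
    """Build a tree, choosing each node's class by namespace lookup."""
    root = Element(None, None, {})
    stack = [root]

    def start(name, attrs):
        key = split(name)
        cls = CLASSES.get(key, Element)  # the per-element namespace lookup
        node = cls(key[0], key[1], attrs)
        stack[-1].children.append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    # With a separator set, expat reports names as 'uri<sep>localname'.
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.Parse(data, True)
    return root.children[0]


doc = parse('<item xmlns:s="urn:example:shop"><s:price>10</s:price><note/></item>')
print(type(doc).__name__, [type(child).__name__ for child in doc.children])
```

For a document without namespaces every lookup falls through to the default class, which is the "overhead of no namespace processing" case.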

(3) builds additional kjSet and kjGraph data structures for fast
    XPath operations on the DOM (note: performance degrades once the
    number of elements and attributes exceeds around 5,000).
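
The idea behind (3) can be illustrated with plain dict and set objects standing in for kjSet and kjGraph. This is a guess at the general technique (index nodes by type, keep a parent-to-children graph, answer path steps with set operations), not the actual data layout; the Index class and the node ids are hypothetical.

```python
from collections import defaultdict


class Index:
    """Stand-in for the kjSet/kjGraph structures, built from dict and set."""

    def __init__(self):
        self.by_type = defaultdict(set)   # element type -> node ids
        self.children = defaultdict(set)  # parent id -> child ids (a graph)

    def add(self, node_id, type_, parent_id=None):
        """Register one node while the tree is being built."""
        self.by_type[type_].add(node_id)
        if parent_id is not None:
            self.children[parent_id].add(node_id)

    def child_step(self, parents, type_):
        """Evaluate one 'parent/type' path step using set operations only."""
        wanted = self.by_type.get(type_, set())
        found = set()
        for parent in parents:
            found |= self.children.get(parent, set()) & wanted
        return found


# Made-up node ids for a tiny tree: PLAY(0) -> ACT(1, 2) -> SCENE(3, 4).
index = Index()
index.add(0, "PLAY")
index.add(1, "ACT", 0)
index.add(2, "ACT", 0)
index.add(3, "SCENE", 1)
index.add(4, "SCENE", 2)
index.add(5, "TITLE", 1)

acts = index.child_step(index.by_type["PLAY"], "ACT")
scenes = index.child_step(acts, "SCENE")
print(sorted(acts), sorted(scenes))
```

Building such an index costs time and memory for every element, which is consistent with the extra parse time in the table.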


Laurent Szyster