[XML-SIG] the faster way to get a dom.

uche.ogbuji@fourthought.com uche.ogbuji@fourthought.com
Mon, 16 Oct 2000 09:24:38 -0600


> I am wondering what is the fastest way (as in speed of processing) to get a
> DOM.  Below is the way I've been doing it, but lately I've had to deal
> with very large XML documents and I am wondering if there is a way to improve
> speed.
> 
> from xml.dom import ext
> from xml.dom.ext.reader import Sax
> 
> def parseXml(s,ownerDocument=None):
>     "parse and return doc"
>     doc = Sax.FromXml(s,ownerDocument)
>     ext.StripXml(doc)
>     return doc

There is a lot of overhead in 4DOM's SAX reader.  We've cut some out, and we 
wonder whether we'll soon reach the point of diminishing returns in 
optimizing it.

Maybe it's time for a C-level DOM builder.  We have one for cDomlette, a tiny 
DOM written entirely in C (with Python interface, of course) which comes with 
4Suite.  It would take quite some effort to scale it up to the full 4DOM, 
though.

Are the large documents such that a subset of the DOM would suffice for your 
use?  If so, have a look at cDomlette in 4Suite.
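
To make the "subset of the DOM" idea concrete, here is a rough sketch of a 
tiny tree builder driven directly by expat callbacks, with no SAX layer in 
between.  This is not cDomlette's API, and not how 4DOM works internally; the 
Node class and build_tree function are hypothetical names made up for the 
illustration.  It keeps only element names, attributes, and text, and it drops 
whitespace-only text, roughly what the StripXml call in the question is after.

```python
import xml.parsers.expat


class Node:
    """Hypothetical minimal element node (not a 4DOM or cDomlette class)."""
    __slots__ = ("name", "attrs", "children")

    def __init__(self, name, attrs):
        self.name = name        # element tag name
        self.attrs = attrs      # dict of attribute name -> value
        self.children = []      # child Node objects and text strings, in order


def build_tree(data):
    """Parse an XML string (or bytes) and return the root Node."""
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True   # deliver contiguous character data in one call
    stack = []                  # currently open elements, innermost last
    result = []                 # holds the root once it has been seen

    def start_element(name, attrs):
        node = Node(name, attrs)
        if stack:
            stack[-1].children.append(node)
        else:
            result.append(node)
        stack.append(node)

    def end_element(name):
        stack.pop()

    def char_data(text):
        # Skip whitespace-only text so the tree stays small.
        if stack and text.strip():
            stack[-1].children.append(text)

    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(data, True)    # True marks this as the final chunk
    return result[0]
```

Something like root = build_tree(s) then gives a plain Python tree to walk via 
root.children.  Whether this is fast enough for very large documents is 
something you would have to measure; the point is only that a reduced node 
model leaves much less work to do per parse event.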


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python