Why is xml.dom.minidom so slow?

Fredrik Lundh fredrik at pythonware.com
Thu Jan 2 22:50:13 CET 2003

Bjorn Pettersen wrote:

> All I'm doing boils down to:
>   response = rf.nextResponse()
>   dom = parseString(response)
> in a loop. Am I doing something wrong? Is there a faster way when all I
> need is a traversable tree structure as the result?

as a general rule, XML toolkits that try to implement the DOM specification
in pure Python are incredibly slow and bloated.

on random XML data, minidom can easily gobble up a kilobyte or two for
each element.  in one of my benchmarks, it used about 50 bytes of object
memory for each input character:


creating all those objects takes time...
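
a rough way to check numbers like these on your own data is sketched below.  this is not the benchmark referred to above: the sample document is made up, the helper name measure_minidom is hypothetical, and it assumes a modern Python 3 where the tracemalloc module is available.

```python
import time
import tracemalloc
from xml.dom.minidom import parseString

def measure_minidom(xml_text):
    """Return (seconds, peak_bytes_per_char) for parsing xml_text with minidom."""
    # time one parse without tracing, since tracemalloc slows allocation down
    start = time.perf_counter()
    dom = parseString(xml_text)
    elapsed = time.perf_counter() - start
    dom.unlink()

    # parse again with tracing enabled to see how much memory the tree needs
    tracemalloc.start()
    dom = parseString(xml_text)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    dom.unlink()

    return elapsed, peak / len(xml_text)

# synthetic sample document (an assumption, not the original test data)
sample = "<root>" + "<item id='1'>text</item>" * 1000 + "</root>"
seconds, bytes_per_char = measure_minidom(sample)
print("%.3f s, %.0f bytes per input character" % (seconds, bytes_per_char))
```

the figures depend heavily on the Python version and on the shape of the document, so treat them as relative indicators only.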

toolkits that use a more pythonic api also tend to be more efficient; for
example, the pure python version of my elementtree module is typically
3-5 times faster than minidom, and uses less than half the memory:
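
as an illustration of the api (not a benchmark), here is a minimal sketch of parsing a response string into a traversable tree.  it assumes the xml.etree.ElementTree module that ships with later Python versions (2.5 and up); the response text, the tag names, and the walk helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# stand-in for one rf.nextResponse() result (made-up data)
response = "<reply><status code='200'>ok</status><item>a</item><item>b</item></reply>"

root = ET.fromstring(response)  # returns the root Element

# elements behave like lists of children, with dict-like attribute access
status = root.find("status")
items = [item.text for item in root.findall("item")]

def walk(elem, depth=0):
    """Yield (depth, tag, attributes, text) for elem and all its descendants."""
    yield depth, elem.tag, elem.attrib, (elem.text or "").strip()
    for child in elem:
        yield from walk(child, depth + 1)

for depth, tag, attrib, text in walk(root):
    print("  " * depth + tag, attrib, text)
```

in the loop from the question, ET.fromstring(response) would take the place of parseString(response).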


you may be able to reach 10x with SAX-style custom code using pyexpat
(or sgmlop) directly...
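
a sketch of what such custom code might look like, using the expat bindings (xml.parsers.expat) to build a bare-bones tree of dictionaries.  the node layout and the function name parse_to_tree are assumptions made for the example, and no speedup is claimed for this particular code.

```python
from xml.parsers import expat

def parse_to_tree(xml_text):
    """Parse xml_text with expat callbacks into nested dicts (hypothetical layout)."""
    # dummy top node, so the handlers always have a parent to attach to
    top = {"tag": None, "attrs": {}, "children": [], "text": ""}
    stack = [top]

    def start(tag, attrs):
        node = {"tag": tag, "attrs": attrs, "children": [], "text": ""}
        stack[-1]["children"].append(node)
        stack.append(node)

    def end(tag):
        stack.pop()

    def chars(data):
        # expat may deliver text in several chunks, so accumulate it
        stack[-1]["text"] += data

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return top["children"][0]

tree = parse_to_tree("<a x='1'><b>hi</b><b>yo</b></a>")
print(tree["tag"], tree["attrs"], [c["text"] for c in tree["children"]])
```

if you only need a few fields from each response, the handlers can pick them out directly and skip building a tree at all.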


...but to be on the safe side, I'd go for a C parser/tree builder.  the following
two are about as fast as anything can be:


(unfortunately, the C version of elementtree isn't yet ready for public

