[lxml-dev] Callgrind tests

Hello everyone, another one for the archives.

I did a few tests with Callgrind and KCachegrind (if you don't know KCachegrind, install it, you'll love it), as I suspected the XPath wrapper had become slow due to the global function registries. What I found was:

1) libxml2 performance is heavily bound by malloc calls (not sure if Callgrind influences this). The XPath implementation is so incredibly fast that the registration of the /builtin/ XPath functions (xmlXPathRegisterAllFunctions) and the related hash table creation (two xmlHashCreate calls per XPath context) were the major bottlenecks in my tests. The overhead added by lxml itself was negligible.

2) String formatting in Python was the other problem. The major bottleneck in tree setup in bench.py was the Python code that builds the element names from loop variables (PyString_Format). In other words, the bottleneck was /outside/ the tested code this time.

So the major result is that, for the tested parts, lxml's performance is mainly bound by two factors: Python and libxml2. I guess I can safely assume that the code parts I checked are too small an issue to merit any further optimization efforts.

Have fun,
Stefan
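[Editor's note: finding 2) can be illustrated with a minimal Python sketch. The function names, the cache, and the name pattern below are made up for illustration; bench.py's actual code differs.]

```python
import timeit

def names_formatted(n):
    # Builds each element name with a formatting call per iteration,
    # roughly one PyString_Format call each in the old CPython runtime.
    return ["element-%d-%d" % (i, i % 10) for i in range(n)]

_cache = {}

def names_precomputed(n):
    # Builds the name list once and reuses it on later calls, moving the
    # formatting cost out of the measured loop.
    if n not in _cache:
        _cache[n] = names_formatted(n)
    return _cache[n]

# Timing both shows that repeated formatting, not the tree code under
# test, can dominate such a benchmark loop.
t_fmt = timeit.timeit(lambda: names_formatted(1000), number=100)
t_pre = timeit.timeit(lambda: names_precomputed(1000), number=100)
```

Precomputing the names outside the timed loop is one way to make a benchmark like bench.py measure the tree code rather than Python's string formatting.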

Stefan Behnel wrote: [snip]
I'm not sure what you mean about string formatting... The bottleneck being outside the tested code means that lxml makes calls to the Python API and that this is costing time, right? So, one possibility to speed up lxml is for it to call the Python API less often.

One possibility towards optimization would be getting rid of the need to decode UTF-8 (libxml2) to Python unicode all the time (or to plain Python strings if they're ASCII). This could be done by caching the Python unicode/strings somehow. I discussed this a long time ago with Daniel Veillard and he mentioned extending the string dictionary in libxml2 so it could carry another payload field, which could be our string. If that payload is already there, we can simply return it instead of regenerating it.

Regards,
Martijn
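[Editor's note: a rough Python-level sketch of the caching idea. The real payload would live inside libxml2's C string dictionary; the cache dict and function name here are assumptions for illustration only.]

```python
# Map the raw UTF-8 bytes coming from libxml2 to an already-decoded
# Python string, so each distinct name is decoded only once.
_string_cache = {}

def cached_decode(utf8_bytes):
    try:
        return _string_cache[utf8_bytes]
    except KeyError:
        try:
            # Pure-ASCII input: the "plain string" case mentioned above.
            value = utf8_bytes.decode('ascii')
        except UnicodeDecodeError:
            # Otherwise decode to a real unicode string.
            value = utf8_bytes.decode('utf-8')
        _string_cache[utf8_bytes] = value
        return value
```

On a cache hit, the same string object is handed back without any decoding, which is exactly the effect the proposed libxml2 payload field would give.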

Martijn Faassen schrieb:
No, actually I meant that it is the /calling/ code that becomes the bottleneck, not the code I tested. The API and everything behind it was plenty fast for the test. I know that's not the best evaluation (it doesn't really test the code itself), but it shows that the code I tested is so fast that programs using it will most likely have their performance problems elsewhere.
I didn't test unicode conversion at all. char*->String conversion did not seem to be a problem; I guess that's mainly a strlen, a memcpy and a C-object instantiation, all of which should be pretty fast.
I already thought about something like that when I went through my optimizations. When you look at some of the benchmark results, you will see that cElementTree is about 3 times faster for the element.text and element.tag benchmarks. As FL pointed out, that's due to exactly this optimization: build Python strings only once.

The main problem, however, is that lxml uses properties here. In my benchmarks, these turned out to produce more overhead than the actual conversion afterwards (I tested that by returning a constant string). Apart from that, there is no other spot left for performance tweaking in that part of lxml - I checked. :)

Stefan
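[Editor's note: the property-overhead observation can be reproduced with a small sketch. The class names are invented; the constant-string property body corresponds to the test described above.]

```python
import timeit

class PlainAttribute:
    def __init__(self):
        self.tag = 'root'

class ViaProperty:
    @property
    def tag(self):
        # Returning a constant string isolates the property-call
        # overhead from any actual char* -> string conversion cost.
        return 'root'

a, p = PlainAttribute(), ViaProperty()
t_attr = timeit.timeit(lambda: a.tag, number=200000)
t_prop = timeit.timeit(lambda: p.tag, number=200000)
```

Comparing t_attr and t_prop shows the descriptor-call cost that a property adds on every access, independent of what its getter actually does.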

Hi Martijn,

Martijn Faassen wrote:
I figured there is one place where implementing caching is cheap: element.tag. So I decided to add a _tag attribute to _Element. It is set to None at initialisation and to the tag value when the property is set. Getting the property then tests for None and returns the value if it was set before. Since we ensure at most one proxy element per node, this should not bring in any inconsistencies.

According to callgrind, the speedup is close to 95% for each subsequent call to element.tag after the first one - as long as the Python reference to the element persists. Obvious drawback: the first access is a little slower now, so accessing the tag names of tons of different objects will suffer. But that's somewhat acceptable - in that case, we really hit the Python string building overhead anyway.

It's both in the trunk and 0.9.x (ok, perhaps that isn't really the best name, agreed).

Stefan

participants (2)
- Martijn Faassen
- Stefan Behnel