[cross-posting this from lxml-dev] jholg@gmx.de, 05.11.2009 22:05:
Why would '/xs:schema/xs:complexType//xs:element[@name="equity"]/@type' [takes 103.62 seconds in libxml2 2.6.32] perform drastically slower than '//xs:element[@name="equity"]/@type' [takes 0.096 seconds] ?
Yes, that's surprising, especially when you see the absolute times. I ran it through callgrind, and the problem is that the first case produces separate node set results, one for each parent element that was found and searched. I.e., the evaluation works more or less like this: '/xs:schema/xs:complexType//xs:element[@name="equity"]/@type' -> walk through all "xs:schema" root elements (1 result) -> walk through all "xs:complexType" children (many found) -> for each result, search matching "xs:element" descendants -> for each match, select the "type" attribute (1 result) -> collect all matched attributes in a node-set -> merge the set of results and sort them into document order It's the last operation, merging and sorting large sets of results, that makes this extremely slow - it takes 92% of the evaluation time in my tests (using libxml2 2.7.5). It's much faster to traverse the document in a single step, and just select single attributes from it, that can quickly be appended to the node set. I imagine that this step could actually be optimised away in many cases (like the case above, where results are guaranteed to be found in doc order), so I guess it's just in there to avoid too much special casing. But it seriously kills the performance here. The sorting algorithm is an unstable shell-sort, whose exponential runtime I expect to be the key problem here. I would assume that in most cases, the partial node-sets are already in doc-order, so optimising away the sorting in favour of appending one node-set to the other whenever doc order is guaranteed to be preserved would drastically drop the runtime of the longer expression. Stefan