[lxml-dev] Uche's OT-benchmark on lxml and (c)ElementTree
Hi all, I just ran a slight variant (doesn't print, builds list) of Uche's OT benchmark on * ElementTree 1.2.6 * cElementTree 1.0.5 * lxml (trunk/SVN 27065, timings for 0.9.2 are similar) The code I used is attached, it runs against the 3.3M ot.xml file from http://www.ibiblio.org/bosak/xml/eg/religion.2.00.xml.zip Here's the (somewhat bogus) output on my machine (bench, result, time): bench_ET 120 1.59 bench_cET 120 0.31 bench_lxml_findall 120 0.32 bench_lxml_xpath 120 0.33 bench_lxml_xpath_all 120 0.26 I should note that lxml's findall() uses ElementTree's ElementPath implementation. Here are the real numbers from timeit: # python -m timeit -s "from otbench import *" "bench_ET()" 10 loops, best of 3: 1.36 sec per loop # python -m timeit -s "from otbench import *" "bench_cET()" 10 loops, best of 3: 260 msec per loop # python -m timeit -s "from otbench import *" "bench_lxml_findall()" 10 loops, best of 3: 295 msec per loop # python -m timeit -s "from otbench import *" "bench_lxml_xpath()" 10 loops, best of 3: 261 msec per loop # python -m timeit -s "from otbench import *" "bench_lxml_xpath_all()" 10 loops, best of 3: 214 msec per loop I think it's pretty interesting how close the timings of cET.findall() and lxml.xpath() are. Regards, Stefan
Stefan Behnel wrote:
I just ran a slight variant (doesn't print, builds list) of Uche's OT benchmark on
And another difference is that you don't actually measure the overhead of the Python interpret startup, correct? :)
* ElementTree 1.2.6 * cElementTree 1.0.5 * lxml (trunk/SVN 27065, timings for 0.9.2 are similar)
The code I used is attached, it runs against the 3.3M ot.xml file from http://www.ibiblio.org/bosak/xml/eg/religion.2.00.xml.zip
[snip]
I think it's pretty interesting how close the timings of cET.findall() and lxml.xpath() are.
Cool. Impressive for cET.findall(), as it's using a Python implementation of the search algorithm - the same one as used by ElementTree, last I checked. I'm pleasantly surprised lxml_findall() now appears to have reached parity with cET in this test; it used to be that cET was quite a bit faster in my old measurements. Can you identify which tuning effort had this effect, or this due to a slightly different benchmark? Last I did a similar check we were still half the speed: http://faassen.n--tree.net/blog/view/weblog/2005/01/17/0 Heh, the Uche quote on that page has been proven wrong, right? :) This reminds me, I was talking to someone who was interested in getting a function that would just give the first xpath result by the way - equivalent to find(), I think. Often you know you find just one thing, and you want it to return that, instead of having to grap things from the resulting list of nodes. Of course underneath it'd still do the same search, so this would be purely a convenience API, not a performance gain. Regards, Martijn
Hi Martijn, Martijn Faassen wrote:
This reminds me, I was talking to someone who was interested in getting a function that would just give the first xpath result by the way - equivalent to find(), I think. Often you know you find just one thing, and you want it to return that, instead of having to grap things from the resulting list of nodes. Of course underneath it'd still do the same search, so this would be purely a convenience API, not a performance gain.
Yeah, sadly, libxml2 doesn't have any routines for stopping XPath once a result has been matched. But then, XPath is pretty complex, how do you actually know when you have a result that will be passed through to the caller? But then, that's basically what _elementpath does, too: find everything and return the first result. I wouldn't mind having something like xpath1() to return the first hit only. On the other hand, are there any substantial differences in the ElementTree Path syntax with respect to an XPath subset that should keep us from having findall() etc. call XPath directly? Stefan
On Thu, 2006-05-11 at 12:59 +0200, Stefan Behnel wrote:
Hi Martijn,
Martijn Faassen wrote:
This reminds me, I was talking to someone who was interested in getting a function that would just give the first xpath result by the way - equivalent to find(), I think. Often you know you find just one thing, and you want it to return that, instead of having to grap things from the resulting list of nodes. Of course underneath it'd still do the same search, so this would be purely a convenience API, not a performance gain.
Yeah, sadly, libxml2 doesn't have any routines for stopping XPath once a result has been matched. But then, XPath is pretty complex, how do you actually know when you have a result that will be passed through to the caller?
But then, that's basically what _elementpath does, too: find everything and return the first result. I wouldn't mind having something like xpath1() to return the first hit only.
On the other hand, are there any substantial differences in the ElementTree Path syntax with respect to an XPath subset that should keep us from having findall() etc. call XPath directly?
I've already implemented it some time ago in branch xpath-find. you can take a look. there were no failed tests, as far as I remember.
Hi Andrey, Andrey Tatarinov schrieb:
On Thu, 2006-05-11 at 12:59 +0200, Stefan Behnel wrote:
are there any substantial differences in the ElementTree Path syntax with respect to an XPath subset that should keep us from having findall() etc. call XPath directly?
I've already implemented it some time ago in branch xpath-find.
you can take a look.
Right, the Clark notation. From what I see in your implementation, that's basically the same as the etree.ETXPath wrapper for the XPath class. So I'll check if we can replace it and throw _elementpath.py out. Depends on the performance, though. ETXPath also uses RegExps to split up the path expression.
there were no failed tests, as far as I remember.
Yeah, well, that's not the surest thing for something as complex as XPath. We don't test expressions, just the API. Maybe the ET self-tests help out here. Thanks for the hint, Stefan
Hi again, Stefan Behnel wrote:
Andrey Tatarinov wrote:
On Thu, 2006-05-11 at 12:59 +0200, Stefan Behnel wrote:
are there any substantial differences in the ElementTree Path syntax with respect to an XPath subset that should keep us from having findall() etc. call XPath directly? I've already implemented it some time ago in branch xpath-find.
you can take a look.
Right, the Clark notation. From what I see in your implementation, that's basically the same as the etree.ETXPath wrapper for the XPath class.
So I'll check if we can replace it and throw _elementpath.py out. Depends on the performance, though. ETXPath also uses RegExps to split up the path expression.
there were no failed tests, as far as I remember.
Yeah, well, that's not the surest thing for something as complex as XPath. We don't test expressions, just the API. Maybe the ET self-tests help out here.
... and they do: ------------------------------------------ ********************************************************************** File "/home/me/source/Python/lxml/lxml-HEAD/selftest.py", line 224, in selftest.bad_find Failed example: elem.findall("/tag") Expected: Traceback (most recent call last): SyntaxError: cannot use absolute path on element Got: [] ********************************************************************** File "/home/me/source/Python/lxml/lxml-HEAD/selftest.py", line 227, in selftest.bad_find Failed example: elem.findall("../tag") Expected: Traceback (most recent call last): SyntaxError: unsupported path syntax (..) Got: [] ********************************************************************** File "/home/me/source/Python/lxml/lxml-HEAD/selftest.py", line 230, in selftest.bad_find Failed example: elem.findall("section//") Expected: Traceback (most recent call last): SyntaxError: path cannot end with // Got: Traceback (most recent call last): File "/home/me/source/Python/lxml/lxml-HEAD/src/doctest.py", line 1256, in __run compileflags, 1) in test.globs File "<doctest selftest.bad_find[3]>", line 1, in ? elem.findall("section//") File "etree.pyx", line 909, in etree._Element.findall File "xpath.pxi", line 215, in etree.ETXPath.__init__ File "xpath.pxi", line 171, in etree.XPath.__init__ File "xpath.pxi", line 65, in etree.XPathEvaluatorBase._raise_parse_error XPathSyntaxError: Error in xpath expression. ********************************************************************** File "/home/me/source/Python/lxml/lxml-HEAD/selftest.py", line 233, in selftest.bad_find Failed example: elem.findall("tag[tag]") Expected: Traceback (most recent call last): SyntaxError: expected path separator ([) Got: [] ********************************************************************** 1 items had failures: 4 of 5 in selftest.bad_find ------------------------------------------ That's a pretty good rate... :) Hmm, everything else passes, but those are gonna be hard to emulate without a real pre-parser (which would cost us performance for compat-only). XPath is just too powerful to strip it down easily... Guess we should just leave it as is for now. Calling _elementpath is sufficiently fast by now, writing our own parser is not worth it. Stefan
Stefan Behnel wrote: [snip]
Guess we should just leave it as is for now. Calling _elementpath is sufficiently fast by now, writing our own parser is not worth it.
Right, I was going to answer in this thread with the same conclusion I drew before, which was exactly yours. Trying to emulate the .find behavior on top of XPath is not worth doing in my opinion. There are the parsing issues you mention, and the danger of introducing subtle incompatibilities (I'd need to look up the old thread to check what things I found out then). It's probably more worthwhile to invest that energy in speeding up find by writing a native implementation. :) Regards, Martijn
Hi Martijn, Martijn Faassen wrote:
It's probably more worthwhile to invest that energy in speeding up find by writing a native implementation. :)
Possibly. Although, well, I guess this is fast enough for now: # python -m timeit -s "from otbench import *" "bench_lxml_findall()" 10 loops, best of 3: 252 msec per loop # python -m timeit -s "from otbench import *" "bench_cET()" 10 loops, best of 3: 258 msec per loop :) Just for the records: cET 1.0.5 vs. lxml trunk, SVN 27133. (hint: I rewrote the ElementDepthFirstIterator returned by getiterator() to support tag-filtering directly.) Have fun, Stefan
Stefan Behnel wrote:
Hi Martijn,
Martijn Faassen wrote:
It's probably more worthwhile to invest that energy in speeding up find by writing a native implementation. :)
Possibly. Although, well, I guess this is fast enough for now:
# python -m timeit -s "from otbench import *" "bench_lxml_findall()" 10 loops, best of 3: 252 msec per loop
# python -m timeit -s "from otbench import *" "bench_cET()" 10 loops, best of 3: 258 msec per loop
:)
Just for the records: cET 1.0.5 vs. lxml trunk, SVN 27133.
Way cool! Yes, definitely sounds fast enough for now. :) Regards, Martijn
Hi Martijn, Martijn Faassen wrote:
Stefan Behnel wrote:
I just ran a slight variant (doesn't print, builds list) of Uche's OT benchmark on
And another difference is that you don't actually measure the overhead of the Python interpret startup, correct? :)
Uh, well, was that supposed to be in there, too? :)
I'm pleasantly surprised lxml_findall() now appears to have reached parity with cET in this test; it used to be that cET was quite a bit faster in my old measurements. Can you identify which tuning effort had this effect, or this due to a slightly different benchmark? Last I did a similar check we were still half the speed:
I'm not quite sure. I changed a lot of bits everywhere, in the XPath code, the proxy code and the Element creation code. Guess it was a mixture of all of them. There's quite a bit of fast-paths in there now that make a difference when you ask for a lot of elements. BTW, note that there is even an element class lookup for ns/name involved in each element creation. Thus, the tests would yield similar timings with custom per-tag Python classes for elements. That's another thing ElementTree can't give you.
http://faassen.n--tree.net/blog/view/weblog/2005/01/17/0
Heh, the Uche quote on that page has been proven wrong, right? :)
Totally. What does that guy know anything about, anyway? :] (uh, he's not listening, is he?) :) I'd actually like to see something about lxml on "xml.com". lxml has been in a pretty usable state for quite a while now and is even nearing feature-completeness. Maybe we should just get out 1.0 and send Uche a friendly mail. *wink* Stefan
Stefan Behnel wrote:
Martijn Faassen wrote:
Stefan Behnel wrote:
I just ran a slight variant (doesn't print, builds list) of Uche's OT benchmark on
And another difference is that you don't actually measure the overhead of the Python interpret startup, correct? :)
Uh, well, was that supposed to be in there, too? :)
It was in Uche's published benchmark if I recall correctly. Fredrik and I slightly disagreed. :)
I'm pleasantly surprised lxml_findall() now appears to have reached parity with cET in this test; it used to be that cET was quite a bit faster in my old measurements. Can you identify which tuning effort had this effect, or this due to a slightly different benchmark? Last I did a similar check we were still half the speed:
I'm not quite sure. I changed a lot of bits everywhere, in the XPath code, the proxy code and the Element creation code. Guess it was a mixture of all of them. There's quite a bit of fast-paths in there now that make a difference when you ask for a lot of elements.
.findall() does ask for lots of elements, so that might be helping then. Pretty good - I didn't expect there was such a gain to be made now, and we've got cElementTree parity now for this performance measurement. Perhaps we should collect all this benchmarking and check them, and then write some article for the lxml website... It'd be a bit of work to make that a solid piece of text, of course. People do pick up on benchmark figures in a rather lazy way sometimes. Last year I as I was developing lxml I honestly said when it was slower than cElementTree in my limited measurements, and I saw that referred to later as "According to Martijn Faassen, libxml might not be that fast with python anyway". For future google searchers: I didn't actually say that! lxml (and libxml2) is plenty fast with Python! So, with a benchmark page on the lxml site, we might get "Stefan Behnel says lxml is faster than anything all the time!" in people's heads instead. Note for future google searchers: Stefan never actually said that! I just made it up! But, lxml is plenty fast with Python! :)
BTW, note that there is even an element class lookup for ns/name involved in each element creation. Thus, the tests would yield similar timings with custom per-tag Python classes for elements. That's another thing ElementTree can't give you.
http://faassen.n--tree.net/blog/view/weblog/2005/01/17/0
Heh, the Uche quote on that page has been proven wrong, right? :)
Totally. What does that guy know anything about, anyway? :] (uh, he's not listening, is he?) :)
:) I like lots of what Uche's done, it's just we had a silly debate about benchmarks. Benchmarks unfortunately tend to invite such discussions.
I'd actually like to see something about lxml on "xml.com". lxml has been in a pretty usable state for quite a while now and is even nearing feature-completeness. Maybe we should just get out 1.0 and send Uche a friendly mail. *wink*
Yes. Unfortunately xml.com slightly changed its focus since last year, which appears to be less Python-related articles. Still, it's worth a shot contacting him. Regards, Martijn
Hi Martijn, Martijn Faassen wrote:
Perhaps we should collect all this benchmarking and check them, and then write some article for the lxml website... It'd be a bit of work to make that a solid piece of text, of course.
Sure, getting numbers is easy. Making sense out of them and making them understandable to others is the trick. You could play with bench.py a bit (it's really easy) and see what other parts of the API would be interesting to test. That way, we'd get a broader idea about what is competitive or faster and what isn't. I wrote many of the benchmarks to see how badly lxml behaves in the spots where I knew it would be bad. So, few of them show where it excels.
So, with a benchmark page on the lxml site, we might get "Stefan Behnel says lxml is faster than anything all the time!" in people's heads
Sure. That's true anyway, right? ;) Stefan
Hi Martijn, Martijn Faassen wrote:
Cool. Impressive for cET.findall(), as it's using a Python implementation of the search algorithm - the same one as used by ElementTree, last I checked.
Actually that's not really surprising. cET has the Python objects readily available and only accesses them. We have to generate them on the fly. And ElementPath is simple enough to be fast. That's like comparing a spoon with a swiss knife.
I'm pleasantly surprised lxml_findall() now appears to have reached parity with cET in this test; it used to be that cET was quite a bit faster in my old measurements. Can you identify which tuning effort had this effect, or this due to a slightly different benchmark?
I figured out what it was. _elementpyth.py calls element.getiterator(). lxml originally collected all children in a list to emulate that. Now it has a real iterator implementation. It could even be faster as _elementpath.py nicely asks for the tag it looks for. Currently, this filters /behind/ the iterator, so all elements are still generated (and we're still close to parity!). Maybe I should check what difference it makes if we filter in plain C... Stefan
participants (3)
-
Andrey Tatarinov
-
Martijn Faassen
-
Stefan Behnel