[pypy-dev] Bringing Cython and PyPy closer together

Sun Feb 19 21:55:08 CET 2012

Stefan Behnel, 18.02.2012 11:20:
> Amaury Forgeot d'Arc, 18.02.2012 10:08:
>> I made some modifications to pypy, cython and lxml,
>> and now I can compile and install cython, lxml, and they seem to work!
>>
>> For example::
>>     html = etree.Element("html")
>>     body = etree.SubElement(html, "body")
>>     body.text = "TEXT"
>>     br = etree.SubElement(body, "br")
>>     br.tail = "TAIL"
>>     html.xpath("//text()")
>>
>> Here are the changes I made, some parts are really hacks and should be
>> polished:
>> lxml: http://paste.pocoo.org/show/552903/
> 
> The weakref changes are really unfortunate as they appear in one of the
> most performance critical spots of lxml's API: on-the-fly proxy creation.

To give an idea of how much overhead there is, here's a micro-benchmark.

First, parsing:

$ python2.7 -m timeit -s 'import lxml.etree as et' \
     'et.parse("input3.xml")'
10 loops, best of 3: 136 msec per loop

$ pypy -m timeit -s 'import lxml.etree as et' \
     'et.parse("input3.xml")'
10 loops, best of 3: 127 msec per loop

I have no idea why PyPy is faster here. There really isn't any interaction
with the core during XML parsing, certainly nothing that would account for
some 7% of the runtime. It may well be a build or benchmarking artifact on
my side. Anyway, parsing is clearly in the same ballpark for both.

However, when it comes to element proxy instantiation (collecting all
elements in the XML tree here as a worst-case example), there is a clear
disadvantage for PyPy:

$ python2.7 -m timeit -s 'import lxml.etree as et; \
     el=et.parse("input3.xml").getroot()'   'list(el.iter())'
10 loops, best of 3: 84 msec per loop

$ pypy -m timeit -s 'import lxml.etree as et; \
     el=et.parse("input3.xml").getroot()'   'list(el.iter())'
10 loops, best of 3: 1.29 sec per loop
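
For anyone who wants to repeat the two measurements from a script rather
than from the shell, here is a minimal sketch. It assumes a synthetic
document built by a hypothetical make_xml() helper as a stand-in for
input3.xml (which isn't attached), and it falls back to the stdlib
ElementTree when lxml isn't importable, so the absolute numbers will not
match the ones above.

```python
import io
import timeit

try:
    import lxml.etree as et               # what the measurements above use
except ImportError:
    import xml.etree.ElementTree as et    # stdlib fallback so the sketch still runs


def make_xml(n_items=1000):
    """Build a synthetic document (assumption: stand-in for input3.xml)."""
    items = "".join(
        "<item id='%d'><name>n%d</name></item>" % (i, i) for i in range(n_items)
    )
    return ("<root>%s</root>" % items).encode("utf-8")


def bench(data, repeat=3, number=10):
    """Return best-of-`repeat` seconds per loop for parsing and full iteration."""
    root = et.parse(io.BytesIO(data)).getroot()
    t_parse = min(timeit.repeat(
        lambda: et.parse(io.BytesIO(data)), repeat=repeat, number=number)) / number
    # list(root.iter()) touches every element, i.e. the worst case for proxies
    t_iter = min(timeit.repeat(
        lambda: list(root.iter()), repeat=repeat, number=number)) / number
    return t_parse, t_iter


if __name__ == "__main__":
    t_parse, t_iter = bench(make_xml())
    print("parse:        %.3f msec per loop" % (t_parse * 1e3))
    print("list(iter()): %.3f msec per loop" % (t_iter * 1e3))
```

Running the same script under both interpreters gives the two pairs of
timings to compare.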

That's a slowdown of roughly 15x (1.29 sec vs. 84 msec), about the same
factor that you got. This may or may not matter to applications, though,
because there are many tools in lxml that allow
users to be very selective about which proxies they want to see
instantiated, and to otherwise let a lot of functionality execute in C. So
applications may get away with a performance hit below that factor in
practice. What certainly matters for applications is to get the feature set
of lxml within PyPy.
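
To illustrate what "selective" means here, a small sketch. The tiny
document and the helper names are made up for the example. As far as I
understand it, a tag filter passed to iter() is matched before a proxy is
created, and an XPath count() is evaluated without handing any elements
back to Python. The xpath() call exists in lxml only, so the sketch guards
it for the stdlib fallback.

```python
import io

try:
    import lxml.etree as et
except ImportError:
    import xml.etree.ElementTree as et    # fallback; has no xpath() method

# Made-up example document: 7 elements in total, 3 of them are <b>.
XML = b"<root><a><b/><b/></a><c><b/></c><d/></root>"


def count_all(root):
    """Worst case: every element in the tree is handed to Python."""
    return len(list(root.iter()))


def count_tag(root, tag):
    """Selective: only elements with a matching tag are returned."""
    return len(list(root.iter(tag)))


def count_tag_xpath(root, tag):
    """lxml only: the count is computed by XPath, the result is a plain number."""
    if not hasattr(root, "xpath"):
        return None
    return int(root.xpath("count(//%s)" % tag))


if __name__ == "__main__":
    root = et.parse(io.BytesIO(XML)).getroot()
    print(count_all(root), count_tag(root, "b"), count_tag_xpath(root, "b"))
```

How much of the factor an application actually sees then depends on how
many elements it really needs as Python objects.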

Stefan