[pypy-dev] Scrapy fails in PyPy

Maciej Fijalkowski fijall at gmail.com
Thu Dec 13 09:13:30 CET 2012


On Thu, Dec 13, 2012 at 9:35 AM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Maciej Fijalkowski, 12.12.2012 20:10:
>> On Wed, Dec 12, 2012 at 7:06 PM, Joe Hillenbrand wrote:
>>> I was able to fix the issue with scrapy.
>>>
>>> https://github.com/joehillen/scrapy/commit/8778af5c5be50a5d746751352f8d710d1f24681c
>>>
>>> Unfortunately, scrapy takes twice as long in PyPy as in CPython. I suspect
>>> this is because lxml is twice as slow in PyPy vs CPython, which I found in
>>> lxml's benchmarks.
>>>
>>> Should lxml be added to the set of speed tests?
>>
>> No. lxml uses cpyext (CPython extension compatibility), which is and
>> will forever be slow.
>
> Well, I don't think it would be hard for any PyPy core developer to make it
> twice as fast. Shouldn't be more than a day's work.
>
> Stefan

I'm not so sure; we won't know until someone tries it. What
optimizations did you have in mind?

For what it's worth, cpyext is not twice as slow; lxml is. cpyext
itself is likely 10-20x slower or so. I presume lowering the overhead
would not automatically make lxml twice as fast, since it's doing quite
a lot of other work.
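
To put rough numbers on that reasoning, here is a minimal sketch of an
Amdahl-style estimate. The function name and the 50% overhead share are
made up for illustration; nobody has measured how much of lxml's runtime
on PyPy is actually cpyext overhead.

```python
def overall_speedup(overhead_fraction, overhead_speedup):
    """Estimate the whole-program speedup when only one part gets faster.

    overhead_fraction: share of total runtime spent in the part being
        optimized (hypothetical here, not a measured value).
    overhead_speedup: how many times faster that part becomes.
    """
    if not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("overhead_fraction must be between 0 and 1")
    if overhead_speedup <= 0:
        raise ValueError("overhead_speedup must be positive")
    # The untouched work keeps its cost; only the overhead part shrinks.
    remaining = (1.0 - overhead_fraction) + overhead_fraction / overhead_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Assumed split: half the runtime is overhead, and it becomes 10x faster.
    print(round(overall_speedup(0.5, 10.0), 2))
```

Under those assumed numbers the overall gain stays below 2x, because
the other half of the work is untouched. The real split would have to
come from profiling.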

Anyway, without trying we don't really know.

Cheers,
fijal

