[pypy-dev] HTML Parser?

Fri Feb 22 11:19:34 CET 2013

On Fri, Feb 22, 2013 at 8:39 AM, Joe Hillenbrand <joehillen at gmail.com> wrote:
> Great to hear! I just got it working with scrapy. Unfortunately there wasn't
> any speedup.
>
> A normal crawl in CPython takes:
> real    1m32.238s
> user    0m56.576s
> sys    0m1.208s
>
> In PyPy:
> real    1m54.098s
> user    1m18.105s
> sys    0m1.372s
>
> Thanks for all your hard work.
>
> -Joe

lxml-cffi is known to be slower than normal lxml. You'll probably get
speedups once you start doing non-trivial logic in Python. For what
it's worth, cffi is still missing a lot of trivial optimizations (and
one non-trivial one), so there is a lot of room for improvement.
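
To see where the time goes in a crawl, it can help to time the parsing
step and the pure-Python logic separately. What follows is only a rough
sketch under some assumptions: the standard library's html.parser stands
in for lxml so that it runs without extra dependencies, and the names
LinkCollector, parse_links, score_links and timed are made up for
illustration (they are not part of scrapy, lxml or cffi).

```python
import time
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical stand-in for the real parser: collects href values."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def parse_links(html):
    # Parsing phase: with lxml-cffi this is the part that crosses into C.
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def score_links(links):
    # Stand-in for "non-trivial logic in Python": count links per host.
    counts = {}
    for link in links:
        host = link.split("/")[2] if "://" in link else ""
        counts[host] = counts.get(host, 0) + 1
    return counts


def timed(func, *args):
    # Return the function's result together with the elapsed wall time.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Synthetic input; a real test would use pages from the actual crawl.
html = "".join(
    '<p><a href="http://example.com/page/%d">x</a></p>' % i
    for i in range(1000)
)

links, parse_seconds = timed(parse_links, html)
counts, logic_seconds = timed(score_links, links)
print("parse: %.4fs  logic: %.4fs" % (parse_seconds, logic_seconds))
```

Running the same script under CPython and PyPy and comparing the two
numbers should show whether the parsing phase or the Python-side logic
dominates for a given workload.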