Hi Amaury,
first of all, thank you for the effort that you put into this. I'm not
entirely happy about having a separate fork of the code base, but I'm sure
(some?) PyPy users will appreciate it.
Amaury Forgeot d'Arc, 14.06.2013 23:20:
> I almost finished a port of lxml that uses cffi bindings instead of Cython.
> The goal is to run it with PyPy, which support for the CPython API is known
> to be very slow and somewhat incomplete.
> cffi on the other hand is much more friendly to PyPy's JIT.
>
> The code is there: https://github.com/amauryfa/lxml in the "cffi" branch.
> Note that it works directly from the source tree, no need to compile.
> You have to use a recent PyPy, ensure that the "lxml" directory can be seen
> from your PYTHONPATH, and that you have libxml development packages.
>
> The conversion -- from .pyx files to pure Python -- turned out to be quite
> straightforward.
Most of it, it seems. From what I gathered, there are at least some major
adaptations that seem to have required some thought. The transformation is
certainly nothing that could be automated.
> And the result is interesting IMO:
>
> - Impact on the existing code is minimal (in __init__.py), of course there
> is a lot of duplication.
Not sure what you mean. Which __init__.py where?
> - Only 5 tests fail in the whole suite (if you except objectify, which is
> hard to support correctly on a non-refcounting gc)
Could you explain what the problem is here? lxml.objectify is certainly
quite tightly coupled with lxml.etree, but it mainly uses standard features
of the API, mainly custom element classes.
> - Performance is on par with the version installed on my laptop, except
> from some catastrophic benchmarks, probably because of a missed
> optimization. I did not try to optimize anything, correctness first!
Sure.
> My plan is to issue a pull request soon.
Is your goal to reintegrate this with mainline lxml? This will certainly
require more work. And I wouldn't be happy about the additional maintenance
overhead just to keep the additional cffi sources up-to-date.
> Some work remains, though, and I need your help here:
> - adapt setup.py to the new system
Sounds like you'd better write a completely separate one. Most of what lxml
is doing at install time is configuring the build. A cffi version shouldn't
need that.
> - fix the remaining tests
> - Is there a way to reduce the code duplication between .pyx and .py files?
Not easily. The Cython code merges everything into one module using
includes, but it uses Cython syntax. Maybe submodules could be reused.
Also, much of the implementation could be reduced to plain Python syntax,
but that makes it more verbose. Maybe still less verbose than duplicated
source files, but still too verbose to maintain IMHO. Plus, the functional
changes for cffi would still need some major special casing.
So, my current opinion on this is: I certainly won't do the work to merge
it, and it's currently unclear to me what a good way to merge this would
be. If it can be done without overly degrading maintenance, I'll consider
it, but otherwise, having it in a fork sounds simpler for now.