[pypy-dev] Scrapy fails in PyPy

Maciej Fijalkowski fijall at gmail.com
Fri Dec 14 21:04:12 CET 2012


On Fri, Dec 14, 2012 at 9:44 PM, Stefan Behnel <stefan_ml at behnel.de> wrote:
> Alex Gaynor, 13.12.2012 08:43:
>> Out of curiosity Stefan, if we had an alternate C-API with similar methods
>> (e.g. PyPyList_Append or so), but different signatures and memory model,
>> how hard do you think it would be to have Cython support this?
>
> Impossible to say in that generality. If it's only about exchanging C
> functions, it should be doable, but if it has an impact on Cython's type
> system, it might turn into a horrible mess to come up with something that
> works in both CPython and PyPy.
>
> Also note that Cython knows a lot about reference counting internally. If
> that alternate C-API requires substantial changes to the way references are
> maintained in the C code, that would mean some work.
>
> Also note that the amount of Cython code out there that uses explicit C-API
> calls for one reason or another is most likely rather large.
>
> All in all, I'm not a fan of that one big revolution that will make
> everything beautiful, fast and shiny, but that will never happen, really. I
> prefer small steps that make things work.
>
> Stefan

I don't want to be a naysayer here, but supporting the CPython C API
is a mess. I don't think there is a way to make it nice and shiny (no
matter what) or a way to make incremental improvements that lead
anywhere good.

That said, I agree that exposing a different C API does not solve
much, while adding a burden for the maintainers of both Cython and
PyPy, so I'm generally against the idea.

What can be done is keeping refcounts on Python objects and then
growing a few fields for keeping the C stuff forever. I can even think
of a scheme that would do it, with a bit of a mess. This would
require storing an extra field on all objects, though I can think of a
scheme where this is done only when cpyext is invoked for the first
time.

If we have a special pointer, we can allocate an object in the old
generation that's tied to the original object. It has a refcount,
where 0 means it goes away. Since it's not movable, you can take a
pointer to it and pass it to C. It's also a root, but a special kind
of root: during collection, refcount == 0 means it dies. The two
objects have references to each other, so:

* The PyPy object keeps the C object alive for the entire lifetime of
the PyPy object.

* The C object keeps the PyPy object alive as long as its refcount is
not 0 at collection time.
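
The two rules above can be sketched as a toy model in Python. This is
only an illustration of the liveness rule, not PyPy's GC or cpyext
code: the names PyPyObject, CShadow, get_shadow, incref, decref and
collect are all hypothetical, and the model ignores references between
PyPy objects (an object survives only if it is an ordinary root or its
C-side object has a nonzero refcount).

```python
class PyPyObject:
    """Toy stand-in for a GC-managed PyPy object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        # The "extra field": link to the non-movable C-side object,
        # filled in lazily the first time the object is handed to C.
        self.c_shadow = None


class CShadow:
    """Toy stand-in for the non-movable old-generation object passed to C."""

    def __init__(self, pypy_obj):
        self.pypy_obj = pypy_obj  # back-reference to the PyPy object
        self.refcount = 0


def get_shadow(obj):
    """Return the C-side object for obj, allocating it on first use."""
    if obj.c_shadow is None:
        obj.c_shadow = CShadow(obj)
    return obj.c_shadow


def incref(shadow):
    shadow.refcount += 1


def decref(shadow):
    assert shadow.refcount > 0, "refcount underflow"
    shadow.refcount -= 1


def collect(heap, roots):
    """Return the objects in heap that survive one collection.

    A C-side object acts as a special root: it keeps its PyPy object
    alive only while its refcount is nonzero. A surviving PyPy object
    keeps its C-side object alive through the c_shadow field.
    """
    rooted = {id(obj) for obj in roots}
    survivors = []
    for obj in heap:
        shadow = obj.c_shadow
        held_by_c = shadow is not None and shadow.refcount > 0
        if id(obj) in rooted or held_by_c:
            survivors.append(obj)
    return survivors


# a is an ordinary root, b is held only by C, c was released by C.
a, b, c = PyPyObject("a"), PyPyObject("b"), PyPyObject("c")
sb = get_shadow(b)
incref(sb)
sc = get_shadow(c)
incref(sc)
decref(sc)

survivors = collect([a, b, c], roots=[a])
names = [obj.name for obj in survivors]
```

In this sketch b survives purely because C still holds a reference to
it, while c dies together with its C-side object once the refcount
has dropped back to 0 and nothing else roots it.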

What we get:

* Simple refcounting (can be implemented in C as macros even)

* Lack of dictionaries

What we lose:

* We need to implement it (which takes time), and it requires a few
GC complications.

Cheers,
fijal

