On Wed, Mar 26, 2014, at 21:17, Kevin Modzelewski wrote:
On Wed, Mar 26, 2014 at 1:52 PM, Benjamin Peterson
There are several reasons. Two of the most important are: 1) PyPy's internal representation of objects is different from CPython's, so a conversion cost must be paid every time objects pass between pure Python and C. Unlike with CPython, extensions running on PyPy can't poke around directly in data structures. Macros like PyList_SET_ITEM have to become function calls.
Hmm, interesting... I'm not sure I follow, though, why calling PyList_SET_ITEM on a PyPy list can't know about the PyPy object representation. Again, I understand how it's not necessarily going to be as fast as pure-Python code, but I don't understand why PyList_SET_ITEM on PyPy needs to be slower than on CPython. Is it because PyPy uses more complicated internal representations, expecting the overhead to be elided by the JIT?
Let's continue with the list example. PyPy lists use an array as the underlying data structure like CPython, but the similarity stops there. You can't just have random C code putting things in PyPy lists. The internal representation of the list might be unwrapped integers, not pointers to int objects like CPython lists. There also need to be GC barriers. The larger picture is that building a robust CPython compatibility layer is difficult and error-prone compared to the solution of rewriting C extensions in Python (possibly with cffi).
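To make the "unwrapped integers" point concrete, here is a rough toy sketch in plain Python. It is not PyPy's actual implementation; the ToyList class and its "int"/"object" strategy names are made up for illustration. The list keeps raw machine integers in a typed array until something that isn't a machine-sized int shows up, and only then switches to a list of object references. C code that assumed an array of object pointers would be reading the wrong thing in the first state.

```python
from array import array


class ToyList:
    """Toy list that stores small ints unboxed until a non-int is added.

    Hypothetical illustration only; not how PyPy is actually written.
    """

    def __init__(self):
        self._strategy = "int"
        # Raw signed 64-bit integers, no per-element object pointers.
        self._storage = array("q")

    @property
    def strategy(self):
        return self._strategy

    def append(self, item):
        if self._strategy == "int":
            if type(item) is int and -2**63 <= item < 2**63:
                self._storage.append(item)
                return
            # Representation switch: box every stored integer into a
            # generic list of object references before adding the item.
            self._storage = list(self._storage)
            self._strategy = "object"
        self._storage.append(item)

    def __getitem__(self, index):
        return self._storage[index]

    def __len__(self):
        return len(self._storage)


numbers = ToyList()
numbers.append(1)
numbers.append(2)
print(numbers.strategy)   # still the unboxed representation
numbers.append("three")
print(numbers.strategy)   # storage has been converted to object references
```

A setter that writes straight into the backing store, the way the PyList_SET_ITEM macro does on CPython, has no single layout to rely on here, which is one reason it ends up as a function call.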
Also, I'm assuming that CPyExt gets to do a recompilation of the extension module;
2) Bridging the gap between PyPy's GC and CPython's ref counting
requires a lot of bookkeeping.
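As a very rough picture of the kind of bookkeeping meant here, consider the toy model below. It is an assumption-laden sketch, not cpyext's real code: ToyBridge, to_c, from_c and decref are hypothetical names, and an integer handle stands in for a PyObject pointer. Every object handed to C gets a table entry with a reference count, and the entry keeps the object alive on the GC side until C drops its last reference.

```python
class ToyBridge:
    """Toy table linking GC-managed objects to refcounted handles.

    Hypothetical illustration only; not cpyext's actual design.
    """

    def __init__(self):
        # handle -> [object, refcount]; holding the object here keeps it
        # alive for as long as C code owns at least one reference.
        self._proxies = {}

    def to_c(self, obj):
        """Hand an object to C: create or reuse an entry, take a reference."""
        handle = id(obj)  # stands in for a PyObject pointer
        entry = self._proxies.setdefault(handle, [obj, 0])
        entry[1] += 1
        return handle

    def from_c(self, handle):
        """Map a handle coming back from C to the original object."""
        return self._proxies[handle][0]

    def decref(self, handle):
        """C drops one reference; forget the entry when none remain."""
        entry = self._proxies[handle]
        entry[1] -= 1
        if entry[1] == 0:
            del self._proxies[handle]

    def live_proxies(self):
        return len(self._proxies)


bridge = ToyBridge()
value = object()
first = bridge.to_c(value)
second = bridge.to_c(value)      # same object, same handle, refcount 2
bridge.decref(first)
print(bridge.live_proxies())     # entry survives: one reference is left
bridge.decref(second)
print(bridge.live_proxies())     # last reference gone, entry removed
```

Even this toy version needs a table lookup on every crossing in either direction, which hints at where the overhead comes from.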
From a personal standpoint, I'm also curious about how much of this overhead is fundamental, and how much could be alleviated with (potentially significant) implementation effort. I know PyPy has a precise GC, but I wonder if using a conservative GC could change the situation dramatically if you were able to hook the extension module's allocator and switch it to using the conservative GC. That's my plan, at least, which is one of the reasons I've been asking about the issues that PyPy has been running into: I'd like to know how much of it will be applicable.
Conservative GCs are evil and slow. :) I don't know what you mean by the "extension module's allocator". That's a fairly global thing.