[pypy-dev] What can Cython do for PyPy?

Stefan Behnel stefan_ml at behnel.de
Thu Aug 12 11:25:18 CEST 2010


Maciej Fijalkowski, 12.08.2010 10:05:
> On Thu, Aug 12, 2010 at 8:49 AM, Stefan Behnel wrote:
>> there has recently been a move towards a .NET/IronPython port of Cython,
>> mostly driven by the need for a fast NumPy port. During the related
>> discussion, the question came up how much it would take to let Cython also
>> target other runtimes, including PyPy.
>>
>> Given that PyPy already has a CPython C-API compatibility layer, I doubt
>> that it would be hard to enable that. With my limited knowledge about the
>> internals of that layer, I guess the question thus becomes: is there
>> anything Cython could do to the C code it generates that would make the
>> Cython generated extension modules run faster/better/safer on PyPy than
>> they would currently? I never tried to make a Cython module actually run on
>> PyPy (simply because I don't use PyPy), but I have my doubts that they'd
>> run perfectly out of the box. While generally portable, I'm pretty sure the
>> C code relies on some specific internals of CPython that PyPy can't easily
>> (or efficiently) provide.
>
> The CPython extension compatibility layer is in alpha at best. I heavily
> doubt that anything would run out of the box. However, this is a
> CPython compatibility layer anyway, it's not meant to be used as a long
> term solution. First of all, it's inefficient (and it's unclear if it
> will ever be efficient)

If you only use it to call into non-trivial Cython code (e.g. some heavy 
calculations on NumPy tables), the call overhead should be mostly 
negligible, maybe even close to that in CPython. You could even provide 
some kind of fast-path to 'cpdef' functions (i.e. functions that are 
callable from both C and Python) and 'api' functions (which are currently 
exported at the module API level using the PyCapsule mechanism). That would 
reduce the call overhead to that of a C call.
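
To illustrate the two call paths, here is a minimal sketch in plain C. It 
does not use the real CPython or Cython API: 'boxed_float', 'c_api_entry', 
'exported' and 'lookup_scale' are hypothetical names, and the table only 
loosely stands in for an export mechanism that carries a name, a signature 
and a function pointer. The generic wrapper boxes and unboxes its 
arguments, whereas the looked-up pointer is called like any other C function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical boxed value standing in for a Python-level object. */
typedef struct { double value; } boxed_float;

/* Plain C implementation, comparable to the C side of a 'cpdef' function. */
static double scale(double x, double factor) { return x * factor; }

/* Generic entry point: unboxes the arguments, calls the C function and
   boxes the result again. This is the slow path a runtime would take. */
static boxed_float scale_wrapper(const boxed_float *args, size_t nargs) {
    boxed_float result = { 0.0 };
    if (nargs == 2)
        result.value = scale(args[0].value, args[1].value);
    return result;
}

/* Hypothetical export record: name, C signature and function pointer. */
typedef struct {
    const char *name;
    const char *signature;
    void (*func)(void);
} c_api_entry;

static const c_api_entry exported[] = {
    { "scale", "double (double, double)", (void (*)(void))scale },
};

typedef double (*scale_fn)(double, double);

/* Fast path: resolve the pointer once, checking the signature string,
   then call it directly without any boxing. Returns NULL on mismatch. */
static scale_fn lookup_scale(const char *name, const char *signature) {
    assert(name != NULL && signature != NULL);
    for (size_t i = 0; i < sizeof(exported) / sizeof(exported[0]); i++) {
        if (strcmp(exported[i].name, name) == 0 &&
            strcmp(exported[i].signature, signature) == 0)
            return (scale_fn)exported[i].func;
    }
    return NULL;
}
```

A caller would resolve the pointer once, e.g. with 
lookup_scale("scale", "double (double, double)"), and then pay only the 
cost of a C call on every later invocation.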

Then, a lot of Cython code doesn't do much ref-counting and the like but 
simply runs in plain C. So, often enough, there won't be that much overhead 
involved in the code itself either, especially in tight loops where users 
prune away all CPython interaction anyway.


> but it's also unjittable. This means that to the JIT, a CPython
> extension is like a black box which should not be touched.

Well, unless both sides learn about each other, that is. It won't 
necessarily impact the JIT, but then again, a JIT usually won't have a 
noticeable impact on the performance of Cython code anyway.


> Also, several concepts, like refcounting, are completely alien to PyPy
> and emulated.

Sure. That's why I asked if there is anything that Cython can help to 
improve here. For example, the code it generates for INCREF/DECREF 
operations is not only configurable at the C preprocessor level.
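
As a rough idea of what preprocessor-level configuration means here, the 
following C sketch routes all reference counting through two macros that a 
build flag can swap out. Everything in it is hypothetical: 'toy_object', 
'TOY_INCREF', 'TOY_DECREF' and the 'TOY_RUNTIME_HAS_GC' flag are made up 
for illustration and are neither Cython's actual macros nor anything PyPy 
defines. Whether a no-op variant would be safe on a given runtime is an 
open question, not something this sketch settles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object header with an explicit reference count. */
typedef struct { long refcnt; } toy_object;

#ifdef TOY_RUNTIME_HAS_GC
  /* Assumed alternative: the runtime tracks lifetimes itself, so the
     macros evaluate their argument and do nothing else. */
  #define TOY_INCREF(o) ((void)(o))
  #define TOY_DECREF(o) ((void)(o))
#else
  /* Default: classic reference counting. */
  static void toy_incref(toy_object *o) { o->refcnt++; }
  static void toy_decref(toy_object *o) {
      assert(o->refcnt > 0);  /* catch unbalanced releases early */
      o->refcnt--;
  }
  #define TOY_INCREF(o) toy_incref(o)
  #define TOY_DECREF(o) toy_decref(o)
#endif

/* "Generated" code only ever uses the macros, so the same source can be
   compiled for either kind of runtime. Returns the count seen while the
   extra reference is held. */
static long borrow_and_release(toy_object *o) {
    TOY_INCREF(o);
    long seen = o->refcnt;
    TOY_DECREF(o);
    return seen;
}
```

Compiled without the flag, the function temporarily raises the count by 
one; compiled with -DTOY_RUNTIME_HAS_GC, the count is never touched.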


> For example for numpy, I think a rewrite is necessary to make it fast
> (and as experiments have shown, it's possible to make it really fast),
> so I would not worry about using cython for speeding things up.

This isn't only about making things fast by rewriting them. It is also 
about accessing and reusing existing code in a new environment. Cython is 
becoming increasingly popular in the numerics community, and a lot of 
Cython code is being written as we speak, not only in the SciPy/NumPy 
environment. People even find it attractive enough to start rewriting their 
CPython extension modules (most often library wrappers) from C in Cython, 
both for performance and TCO reasons.


> There is another use case for Cython: providing access to C
> libraries. This is a somewhat harder question and I don't have a good
> answer for that, but maybe the CPython compatibility layer would be good
> enough in this case? I can't see how Cython can produce "native" C
> code instead of CPython C code without some major effort.

Native (standalone) C code isn't the goal, just something that adapts well 
to what PyPy can provide as a CPython compatibility layer. If Cython 
modules work across independent Python implementations, that would be by 
far the simplest way to make lots of them available cross-platform, thus 
making it a lot simpler to switch between different implementations.

Stefan



