[Cython] [GSoC] Python backend for Cython using PyPy's FFI

Carl Witty carl.witty at gmail.com
Thu Apr 7 18:53:24 CEST 2011


On Thu, Apr 7, 2011 at 9:06 AM, Dag Sverre Seljebotn
<d.s.seljebotn at astro.uio.no> wrote:
> On 04/07/2011 05:01 PM, Romain Guillebert wrote:
>>
>> Hi
>>
>> I proposed the Summer of Code project regarding the Python backend for
>> Cython.
>>
>> As I said in my proposal, this would translate Cython code to Python +
>> FFI code (I don't know yet if it will use ctypes or something specific
>> to PyPy). PyPy's ctypes is now really fast and this will allow people to
>> port their Cython code to PyPy.
>>
>> For the moment I've been mostly in touch with the PyPy people and they
>> seem happy with my proposal.
>>
>> Of course I'm available for questions.
>
> Disclaimer: I haven't read the proposal (don't have access yet but will
> soon). So perhaps the below is redundant.
>
> This seems similar to Carl Witty's port of Cython to .NET/IronPython. An
> important insight from that project is that Cython code does NOT specify an
> ABI, only an API which requires a C compiler to make sense. That is, many
> wrapped C libraries have plenty of macros, we only require partial
> definitions of structs, we only require approximate typedefs, and so on.
>
> In the .NET port, the consequence was that the original idea of generating
> C# code (with FFI specifications) was dropped, and one instead went with
> C++/CLI (which is a proper C++ compiler that really understands the C side
> on an API level, in addition to giving access to the .NET runtime).
>
> There are two ways around this:
>
>  a) In addition to Python code, generate C code that can take (the
> friendliest) APIs and probe for the ABIs (such as, for instance, getting the
> offset of each struct field from the base pointer). Of course, this must
> really be rerun for each platform/build of the wrapped library.
>
> Essentially, you'd use Cython to generate C code that, in a target build,
> would generate Python code...
>
>  b) Create a subset of the Cython language ("RCython" :-)), where you
> require explicit ABIs (essentially this means either disallowing "cdef
> extern from ...", or creating some new form of it). Most Cython extensions I
> know about would not work with this though, so there would need to be
> porting in each case. Ideally one should then have a similar mode for
> Cython+CPython so that one can debug with CPython as well.

Note that a) is not sufficient in general -- it doesn't handle macros
that expand into code, like errno and putc().  There's another option
I considered:

c) Given the API specification in the Cython file, generate C code
that wraps that API with a known ABI.  So for:

cdef extern from "<errno.h>":
    int errno

you would generate a C file something like:

#include <errno.h>

void _write_errno(int newval) {
  errno = newval;
}

int _read_errno(void) {
  return errno;
}

and for

cdef extern from "<sys/types.h>":
    ctypedef int ino_t

cdef extern from "<sys/stat.h>":
    cdef struct stat:
        ino_t st_ino

you would generate (in part):

#include <sys/types.h>
#include <sys/stat.h>

long long _read_struct_stat_st_ino(struct stat *ptr) {
    return ptr->st_ino;
}

void _write_struct_stat_st_ino(struct stat *ptr, long long newval) {
    ptr->st_ino = newval;
}

(Of course, you'd want to add more name mangling to these examples.)
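
For illustration, here is a minimal sketch of how the Python side might
bind to the errno wrappers above through ctypes. The helper name
bind_errno_wrappers and the library file name in the usage note are
assumptions made up for this example; nothing here is existing Cython
output.

```python
import ctypes


def bind_errno_wrappers(lib):
    """Attach ctypes signatures to the two errno wrappers found in `lib`.

    `lib` is assumed to be a ctypes library object exposing the
    _read_errno and _write_errno functions from the generated C file.
    """
    # int _read_errno(void)
    lib._read_errno.argtypes = []
    lib._read_errno.restype = ctypes.c_int
    # void _write_errno(int newval)
    lib._write_errno.argtypes = [ctypes.c_int]
    lib._write_errno.restype = None
    return lib
```

With the wrapper file compiled into a shared library (say
./errno_wrap.so, a hypothetical name), one would call
bind_errno_wrappers(ctypes.CDLL("./errno_wrap.so")) and then go through
lib._read_errno() and lib._write_errno(...) instead of touching the
macro directly.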

Note that I use "long long" for st_ino even though the Cython code
declared st_ino as int; that's because Cython generates code that
would work even if st_ino were really "long long", and some modules
would probably break if the wrappers used the types declared in Cython.
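
The generation step for c) could be sketched roughly as below. This is
only an illustration: gen_field_accessors is a hypothetical helper, not
part of Cython, and it widens every integer field to "long long" as in
the st_ino example above.

```python
def gen_field_accessors(struct_name, field_name, header, c_type="long long"):
    """Return C source for a read/write accessor pair for one struct field.

    The accessors use `c_type` in their signatures, so the Python side
    sees one fixed ABI whatever the header's real field type is.
    """
    base = f"struct_{struct_name}_{field_name}"
    return (
        f"#include {header}\n"
        f"\n"
        f"{c_type} _read_{base}(struct {struct_name} *ptr) {{\n"
        f"    return ptr->{field_name};\n"
        f"}}\n"
        f"\n"
        f"void _write_{base}(struct {struct_name} *ptr, {c_type} newval) {{\n"
        f"    ptr->{field_name} = newval;\n"
        f"}}\n"
    )


# Example: regenerate the st_ino accessors shown earlier.
source = gen_field_accessors("stat", "st_ino", "<sys/stat.h>")
print(source)
```

The C compiler then does the conversion between "long long" and the
real field type, which is exactly the part a pure FFI description
cannot know.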

Also, you could combine a), b), and c).  For example, use a) to
determine struct sizes, type sizes, and field offsets; use b) when
you're not worried about macros; and use c) (perhaps triggered by a
new annotation in the Cython source) when you want to handle arbitrary
APIs that may be implemented with macros.
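
As a rough sketch of the a) part, the generator could emit a small C
probe that prints the layout as Python assignments when it is compiled
and run on the target platform. gen_probe and the naming of the printed
variables are assumptions made up for this example.

```python
def gen_probe(header, struct_name, fields):
    """Return C source for a probe that prints a struct's layout as Python."""
    lines = [
        "#include <stddef.h>",
        "#include <stdio.h>",
        f"#include {header}",
        "",
        "int main(void) {",
        # Total size of the struct on this platform/build.
        f'    printf("{struct_name}_size = %zu\\n", sizeof(struct {struct_name}));',
    ]
    for field in fields:
        # Offset of the field from the base pointer.
        lines.append(
            f'    printf("{struct_name}_{field}_offset = %zu\\n", '
            f"offsetof(struct {struct_name}, {field}));"
        )
        # Size of the field; sizeof does not evaluate the null dereference.
        lines.append(
            f'    printf("{struct_name}_{field}_size = %zu\\n", '
            f"sizeof(((struct {struct_name} *)0)->{field}));"
        )
    lines += ["    return 0;", "}"]
    return "\n".join(lines) + "\n"


probe = gen_probe("<sys/stat.h>", "stat", ["st_ino"])
print(probe)
```

The probe has to be rebuilt and rerun for each platform and each build
of the wrapped library, as Dag points out, and it still says nothing
about names that are really macros; those would go through c).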

Carl

