[Cython] Hello

Stefan Behnel stefan_ml at behnel.de
Mon Jan 27 08:02:04 EST 2020


John Skaller2 schrieb am 27.01.20 um 06:56:
> Hi! I have just built Cython but haven’t used it yet.
> 
> I want to check I understand what it does whilst examining the sources in the repository.
> Please let me know if I have it wrong!
> 
> Given a Python file, Cython parses it, and translates it to the equivalent C, which can
> then be compiled to binary, thereby bypassing the overhead of interpreting bytecode,
> but still executing the same accesses to the CPython run time API that the interpreter would.

Not the same, just equivalent operations. Cython often generates code that
bypasses the "usual" C-API calls.
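
To make the "overhead of interpreting bytecode" concrete, here is a minimal
sketch using only the standard library's dis module. The function add() is a
made-up example, not something from Cython. It lists the instructions that the
interpreter steps through for one line of source, which is the dispatch work
that compiled code avoids. The exact instruction names differ between CPython
versions.

```python
import dis


def add(a, b):
    # One line of Python source, several bytecode instructions to interpret.
    return a + b


# Collect the names of the instructions the interpreter executes for add().
opnames = [instr.opname for instr in dis.Bytecode(add)]
print(opnames)
```

The addition itself is a single instruction in that list; the rest is loading
the operands and returning the result.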


> So there may be a small speed improvement, or maybe not (because it misses optimisations
> the interpreter might be able to spot).

Yes, it goes both ways.

Cython generates many fast-paths that the interpreter doesn't have or
cannot easily provide, so there is a high chance that compiled code is
faster than interpreted code. The speed improvement is usually somewhere
around 30% without type annotations, but highly dependent on your actual code.

OTOH, recent CPython versions added some internal infrastructure that
speeds up important operations, which isn't always easy to use from
external tools like Cython. So there are a few cases where the runtime has
an inherent advantage, although it doesn't always show much in comparison.
And we keep fighting back. :)


> The binary will typically be a C extension module
> that can be loaded and operate the same way as the original Python.
> 
> Now, Cython is an extension of Python which allows some extra stuff, including
> type annotations, and other directives related to integration with C. These can be
> used to facilitate integration with external C libraries directly, and mapping into
> Python, as if the code were written in C, only we’re using a Python like language
> representing a subset of C instead of C.

It's really mostly about data types (and Cython's mixed Python/C type
system). Python allows you to do a lot of seemingly different things in the
same syntactic constructs, and Cython does the same when it generates the C
code by adapting it to the data types that the source operates on.
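
A small sketch of what "the same syntactic construct" means in practice, in
plain Python. The function name combine() is hypothetical and only serves as
an illustration. The single expression a + b performs integer addition, float
addition, string concatenation or list concatenation, depending on the
operands. The interpreter decides at run time; a compiler that knows the types
has the chance to emit code for just the one case that applies.

```python
def combine(a, b):
    # The same "+" syntax maps to a different operation per operand type.
    return a + b


results = [
    combine(1, 2),        # integer addition
    combine(1.5, 2.0),    # floating point addition
    combine("ab", "cd"),  # string concatenation
    combine([1], [2]),    # list concatenation
]
print(results)
```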


> Additionally, the compiler recognises the type annotations, and can reduce or
> eliminate run time type checks, improving performance, or even replacing
> common constructions in Python with much faster ones that do the same job
> “closer to the metal”.

Yes.


> To make this work, the CPython API itself is represented in a set of *.pxd files
> found in the repository in Includes/cpython

No. :)

These files are only for end users to allow importing pre-declared parts of
the CPython C-API for their own use (in case they feel like it). Cython
itself does not use or need them, but if they help you…


> splitting the logic of the compiler
> roughly into two parts: the front and back end. The front end groks Python
> and Cython code whilst the back end generates the actual C.

Back to a Yes.


> Just FYI, I’m the developer of a programming language, Felix, which is a C++ code
> generator. You can think of it as a meta-programming language for C++ with a
> proper type system. Felix binds C/C++ code with statements like:
> 
> 	type PyObject = "PyObject*";
> 	fun add: PyObject * PyObject -> PyObject = "Py_AddLong($1)";
> 
> and can use the bindings like:
> 	
> 	var a : PyObject = ….
> 	var b: PyObject = ...
> 	var sum = add (a,b);
> 
> so in some ways it’s doing the same kind of job as Cython, except it isn’t
> specialised to bind to Python, it can bind to anything written in C or C++.
> Including the Python API as illustrated.

As Greg's question hinted, Cython knows a lot about the object reference
counting that CPython uses for garbage collection. That's one of the main
reasons why people prefer it over using the CPython C-API directly. Writing
correct code in the latter is quite difficult and requires a lot of
discipline. Cython gives you that for free.
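
For readers who have not seen the reference counting from the outside, here is
a rough sketch in plain Python using sys.getrefcount() from the standard
library. The variable names are made up for the illustration. Every new
reference to an object raises its count and every dropped reference lowers it.
In hand-written C-API code, each of those steps is an explicit Py_INCREF or
Py_DECREF that you must not forget or duplicate.

```python
import sys

obj = object()                      # a fresh object with one named reference
before = sys.getrefcount(obj)       # absolute value includes a temporary reference

alias = obj                         # taking a second reference raises the count
after_alias = sys.getrefcount(obj)

del alias                           # dropping it lowers the count again
after_del = sys.getrefcount(obj)

print(after_alias - before, after_del - before)
```

Only the differences are meaningful here, since getrefcount() itself holds a
temporary reference to its argument while it runs.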


> One possible future goal is to replace NumPy with something much better.

You're not the first. :) Look at other projects like Pythran, Numba,
Theano, numexpr, …

Pythran actually integrates with Cython's type system to generate C++ code
from NumPy expressions.


> I also have code written in Python that I might translate to C using Cython.
> It would be kind of interesting to use Cython to generate C, and then create
> bindings to that C in Felix, so instead of calling the Python C API, we call
> the Cython generated API instead, allowing people to write libraries for
> Felix in Cython instead of C or Felix. However that’s a more major integration task.
> You’d want Cython to generate the Felix bindings, or at least output meta-data
> that would allow them to be generated easily .. such as .. a *.pxd file !!

I would encourage you to generate Cython code instead of C/C++ directly, if
you want to interact with Python. There's no need to create yet another
"generator for C-API calls that isn't as good as Cython".

Even if you need to mix in C/C++ code, Cython will allow you to do that,
e.g. via verbatim code sections.

http://docs.cython.org/en/latest/src/userguide/external_C_code.html#including-verbatim-c-code

Stefan

