[Cython] Add a Pythran backend for Numpy operation

Stefan Behnel stefan_ml at behnel.de
Sat May 6 12:00:09 EDT 2017


Hi all!

Adrien Guinet implemented a Pythran backend for NumPy array operations in
Cython.

https://github.com/cython/cython/pull/1607

I'm forwarding his explanations to cython-users because I really like this
feature and would like feedback from more people, to make sure it works as
is for more than one group of users. Is anyone interested in testing it?

Thanks!

Stefan



Adrien Guinet schrieb am 15.02.2017 um 22:15:
> Hello everyone!
> 
> I've been working for quite some time on using Pythran as a backend for the
> Numpy operations that Cython can generate. The associated PR on GitHub can be
> found here: https://github.com/cython/cython/pull/1607. This work has been
> sponsored by the OpenDreamKit project
> (https://github.com/OpenDreamKit/OpenDreamKit/).
> 
> First of all, the Pythran project
> (https://github.com/serge-sans-paille/pythran) is a compiler from a subset of
> Python to C++ that aims at optimizing "scientific" Python code. It also
> provides a full C++ implementation of a large part of the Numpy API. One
> advantage of this implementation is that it supports expression templates and
> SIMD instructions (partially thanks to Boost.SIMD [1]).
> 
> One limitation of Cython's current Numpy support is that it relies on the
> original Numpy Python module for a lot of computations. The overall idea is to
> replace these calls with the Numpy implementation provided by the Pythran
> project.
> 
> In this mail, I'll discuss the various choices that have been made and why,
> along with some implementation details. Then I'll show some benchmarks to
> illustrate the potential improvements, which is the point of all this in the
> end :)
> 
> Pythran limitations
> -------------------
> 
> The Pythran Numpy implementation has some limitations:
> 
> * array "views" are not supported. That means that arrays must be stored in
>   contiguous memory. Both Fortran- and C-style layouts are supported.
> * the endianness of the integers must be the same as that of the target
>   architecture (note that Cython has the same limitation)
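
To make the two limitations above concrete, here is a minimal sketch using only the Python standard library. It is not the check that Cython or Pythran actually performs: the helper names `is_contiguous` and `has_native_byteorder` are hypothetical, and `array`/`memoryview` merely stand in for Numpy buffers.

```python
import sys
from array import array

def is_contiguous(buf):
    """True if the buffer occupies one unbroken block of memory."""
    return memoryview(buf).contiguous

def has_native_byteorder(fmt):
    """True if a struct-style format string uses the host byte order."""
    prefix = fmt[:1]
    if prefix == "<":
        return sys.byteorder == "little"
    if prefix in (">", "!"):
        return sys.byteorder == "big"
    return True  # "@", "=" or no prefix all mean native byte order

data = memoryview(array("d", [1.0, 2.0, 3.0, 4.0]))
strided = data[::2]  # every second element: a non-contiguous "view"

print(is_contiguous(data), is_contiguous(strided))
print(has_native_byteorder(data.format))
```

A strided slice such as `data[::2]` is the kind of buffer that would have to take the fallback path.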
> 
> That's why we did two things:
> 
> * the Pythran backend must be explicitly requested by the user, either by
>   passing the --np-pythran flag to the Cython compiler or by passing the
>   "np_pythran" flag to the cythonize call (for distutils)
> * in function arguments, Numpy buffers are replaced by fused types so that we
>   can fall back in case of unsupported buffers. More on this below.
> 
> Implementation choices and details within Cython
> ------------------------------------------------
> 
> a) PythranExpr
> 
> We defined a new type in PyrexTypes.py that represents a Pythran buffer or
> expression. A Pythran expression is associated with a Pythran expression
> template, whose C++ type can be something like "decltype(a+b)". We compose
> every expression/function call like this, which allows us to use Pythran's
> expression template mechanism.
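
For readers unfamiliar with expression templates, the idea can be sketched as a run-time analogue in plain Python. Pythran does this at compile time through C++ types, so the following only illustrates the principle: operators build an expression tree, and the data is traversed once at the end without intermediate arrays. All class and function names here (`Expr`, `Leaf`, `BinOp`, `Map`, `evaluate`) are hypothetical.

```python
import math

class Expr:
    """Base class: operators build a tree instead of computing a result."""
    def __add__(self, other):
        return BinOp(lambda x, y: x + y, self, other)

    def __mul__(self, other):
        return BinOp(lambda x, y: x * y, self, other)

class Leaf(Expr):
    """Wraps concrete data."""
    def __init__(self, values):
        self.values = values

    def __len__(self):
        return len(self.values)

    def item(self, i):
        return self.values[i]

class BinOp(Expr):
    """Deferred element-wise binary operation."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def __len__(self):
        return len(self.left)

    def item(self, i):
        return self.op(self.left.item(i), self.right.item(i))

class Map(Expr):
    """Deferred element-wise function application."""
    def __init__(self, fn, arg):
        self.fn, self.arg = fn, arg

    def __len__(self):
        return len(self.arg)

    def item(self, i):
        return self.fn(self.arg.item(i))

def sqrt(expr):
    return Map(math.sqrt, expr)

def evaluate(expr):
    """Single pass over the data; no intermediate arrays are allocated."""
    return [expr.item(i) for i in range(len(expr))]

a = Leaf([3.0, 0.0])
b = Leaf([4.0, 2.0])
lazy = sqrt(a * a + b * b)  # only a tree is built here
result = evaluate(lazy)     # the whole expression is computed per element
print(result)
```

In the C++ case, the type of `lazy` would play the role of something like "decltype(a+b)": it encodes the whole expression, which lets the compiler fuse the loops.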
> 
> We also chose to let the C++ compiler deduce the final type of every
> expression and emit errors if something goes wrong. This avoids having to
> rewrite in Python all the (potentially implicit) conversion rules that can
> apply in a C/C++ program, which would be error-prone. The disadvantage is that
> it may produce non-trivial error messages for the end user.
> 
> b) Fused types for function arguments
> 
> As Pythran has some limitations regarding the Numpy buffers it can support, we
> chose to replace Numpy buffer arguments with a fused type that can be either a
> Pythran buffer or the original Numpy buffer. Which of the two types is used is
> decided according to the limitations described above.
> 
> This allows a fallback to the original Cython implementation in case of an
> unsupported buffer type.
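
The fallback behaviour described above can be pictured with a small dispatch sketch in plain Python. This is not what Cython generates: `scale_fast`, `scale_fallback` and `scale` are hypothetical stand-ins for the two specializations of a fused-type function, and the contiguity test is an assumed simplification of the real selection criteria.

```python
from array import array

def scale_fast(view, factor):
    """Stand-in for the Pythran-backed specialization."""
    return "pythran", [x * factor for x in view.tolist()]

def scale_fallback(view, factor):
    """Stand-in for the original Numpy-buffer specialization."""
    return "numpy", [x * factor for x in view.tolist()]

def scale(buf, factor):
    """Pick a specialization based on what the buffer looks like."""
    view = memoryview(buf)
    impl = scale_fast if view.contiguous else scale_fallback
    return impl(view, factor)

values = array("d", [1.0, 2.0, 3.0, 4.0])
print(scale(values, 2.0))                    # contiguous buffer
print(scale(memoryview(values)[::2], 2.0))   # strided view
```

Both paths return the same kind of result; only the implementation chosen differs, which is what makes the fallback transparent to the caller.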
> 
> Tests
> -----
> 
> A flag has been added to the runtests.py script. If provided with a path to a
> Pythran installation, it will run the C++ tests in "Pythran" mode. This allows
> reusing the whole Cython test suite.
> 
> Benchmark
> ---------
> 
> The whole point of this is to get better performance.
> 
> Here is a simple benchmark of what this mode can achieve, using this Cython code:
> 
> import numpy
> cimport numpy
> 
> def sqrt_sum(numpy.ndarray[numpy.float_t, ndim=1] a,
>              numpy.ndarray[numpy.float_t, ndim=1] b):
>     return numpy.sqrt(numpy.sqrt(a*a+b*b))
> 
> On my computer (Core i7-6700HQ), this gives these results, with an array of
> 100000000 32-bit floats as input:
> 
> - for the classical Cython version: 960ms
> - for the Cython version using the Pythran backend: 761ms
> - for the Cython version using the Pythran backend using SIMD instructions: 243ms
> 
> which is a speedup of ~3.9x over the classical Cython version when using SIMD
> instructions.
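
The speedup figures follow directly from the timings listed above. As a quick check, with dictionary keys that are just labels chosen for this sketch:

```python
# Timings in milliseconds, as reported in the benchmark above.
timings_ms = {"cython": 960, "pythran": 761, "pythran_simd": 243}

baseline = timings_ms["cython"]
speedups = {name: baseline / t for name, t in timings_ms.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

The Pythran backend without SIMD comes out at about 1.26x over classical Cython, and the SIMD build at about 3.95x.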
> 
> Documentation
> -------------
> 
> I put an example of how to use this with distutils in the documentation. It
> could be put elsewhere if needed, or formatted differently.
> 
> 
> I'd be happy to discuss the various choices made here, and the implementation
> details.
> 
> Thanks everyone!
> 
> [1]: https://github.com/NumScale/boost.simd


