[pypy-dev] Questions on the pypy+numpy project
Stefan Behnel
stefan_ml at behnel.de
Mon Oct 17 12:11:48 CEST 2011
David Cournapeau, 17.10.2011 00:01:
> On Sun, Oct 16, 2011 at 10:20 PM, Ian Ozsvald wrote:
>> how big is the scipy ecosystem beyond numpy? What's the rough line
>> count for Python, C, Fortran etc that depends on numpy?
>
> The ecosystem is pretty big. There are at least on the order of
> a hundred packages that depend directly on numpy and scipy.
>
> For scipy alone, the raw count is around 150k-300k LOC (it is a bit
> hard to estimate because we include some swig-generated code that I
> have ignored here, and some code duplication to deal with distutils
> insanity). There is around 80k LOC of fortran alone in there.
>
> More and more scientific code uses cython for speed or just for
> interfacing with C (and recently C++). Other tools have been used for
> similar reasons (f2py, in particular, to automatically wrap fortran
> and C).
and fwrap nowadays, which also generates glue code for talking to Fortran
from Cython code, through a thin C code wrapper (AFAIK).
> f2py at least is quite tightly coupled to numpy C API. I know
> there is work for a pypy-friendly backend for cython, but I don't know
> where things are there.
It's, erm, resting. The GSoC is over, the code hasn't been merged into
mainline yet, lacks support for some recent Cython language features and is
not in a state that would allow building anything major with it right away.
It's based on ctypes, so it suffers from the same problems as ctypes,
namely API/ABI inconsistencies beyond those that "ctypes_configure" can
handle. In particular, things like talking to C macros will at least
require additional C glue code to be generated, which doesn't currently
happen. What works is stripping the Cython-specific syntax off the code
and mapping "regular" C code interactions to corresponding ctypes calls. So,
some things work as is, everything else needs more work. Helping hands
and funding are welcome.
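To illustrate the kind of mapping described above, here is a minimal
hand-written sketch, not output of the actual backend. It assumes a
POSIX-like system where the C library can be loaded, and the wrapper name
c_strlen is made up for the example. It shows a plain C function call,
which Cython would declare with "cdef extern", expressed through ctypes
instead:

```python
import ctypes
import ctypes.util

# Assumption: POSIX-like platform. Locate the C library, or fall back to the
# symbols already loaded into the current process.
_libc_name = ctypes.util.find_library("c")
libc = ctypes.CDLL(_libc_name) if _libc_name else ctypes.CDLL(None)

# Cython declaration this corresponds to (shown for comparison only):
#     cdef extern from "string.h":
#         size_t strlen(const char *s)
#
# With ctypes, the signature has to be restated by hand at runtime. Nothing
# checks it against the C header, which is where API/ABI mismatches creep in.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: call the C library's strlen() through ctypes."""
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"pypy"))
```

This only works because strlen is an exported symbol of the shared
library. A name that exists only as a preprocessor macro has no such
symbol, so ctypes cannot look it up, and a small piece of compiled C glue
would be needed to expose it.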
That being said, I still think it's a promising approach, and it would be
very interesting for PyPy to support Cython code (in one way or another).
Cython certainly has a good standing in the Scientific Python community
these days. If PyPy wants to enter as well, it will have to show that it
can easily and efficiently interface with the huge amount of existing
scientific code out there, be it C, C++, Fortran, Cython or whatever. And
rewriting the code or even just the wrappers for Yet Another Python
Implementation is not a scalable solution to that problem.
> I would like to see less C boilerplate code in scipy, and more cython
> usage (which generates faster code and is much more maintainable); this
> can also benefit pypy, if only for making the scipy code less
> dependent on CPython details.
And by making the implementation essentially Python. That way, it can much
more easily be ported to other Python platforms, especially PyPy, than if
you have to start by reverse engineering even the exact wrapper signature
from C code.
Stefan