[capi-sig] Replacing Python C API with CFFI
Hello!
Summary:
It looks like fixing the Python C API is a lot of work and might fail, so I was wondering if it would be better to put effort into reducing the C API by making more use of CFFI within the Python standard library instead?
Motivation:
It feels like the C API is currently very broad because it provides a second interface layer to a large chunk of the Python object space (in addition to the one visible to Python code).
I think it was Armin Rigo's observation when writing CFFI that there is a better API that already exists -- i.e. C types and function signatures.
If we're going to do a lot of work, would it not be better to push instead for moving towards CFFI as the interface between C and Python?
I'm not sure whether this is orthogonal to Victor's current proposal or not -- it might just be one route to achieving it.
Advantages over the current route:
There's a clearer direction for people to head in, so it's easier to contribute.
A lot of work has already been done (e.g. it works for PyPy, a lot of libraries already exist that use it).
Maybe Cython is an equally good option to CFFI in this scenario -- I don't know Cython though, so I have no deep opinion on that.
Schiavo Simon
Hi,
2018-07-31 18:54 GMT+02:00 Simon Cross <hodgestar@gmail.com>:
It looks like fixing the Python C API is a lot of work and might fail, so I was wondering if it would be better to put effort into reducing the C API by making more use of CFFI within the Python standard library instead?
I don't really care about the usage of the C API inside the Python stdlib. I'm even fine with the stdlib using private APIs. I'm more worried about all third-party C extensions.
I never wrote any cffi binding, but it seems that it makes application startup much slower when the binding has not been compiled yet. In that case, cffi requires a C compiler and maybe also build dependencies (like header files). cffi itself also pulls in the pycparser dependency, which is a non-trivial Python module.
If you ignore the compilation part, distributing a binary can also be an issue: you have to distribute one binary per platform.
Again, I never really used cffi, so maybe these are not issues in practice.
Victor
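[Editorial aside, to make the compile-at-import concern above concrete: only cffi's "API mode" (`ffi.set_source()` + `ffi.compile()`) needs a C compiler; its "ABI mode" declares C signatures at runtime and loads a prebuilt shared library directly. Since cffi may not be installed, the stdlib's ctypes, which works the same ABI-level way, can sketch the idea; the library lookup below assumes a POSIX-style libm.]

```python
# A minimal stdlib-only sketch of ABI-level FFI (the approach cffi's
# "ABI mode" also uses): declare the C signature in Python and call a
# shared library directly, with no C compiler involved at import time.
import ctypes
import ctypes.util

# Locate the C math library; the name differs per platform, and on some
# systems the math functions live in libc itself.
libm_path = ctypes.util.find_library("m") or ctypes.util.find_library("c")
libm = ctypes.CDLL(libm_path)

# Declaring argtypes/restype is the ctypes analogue of cffi's cdef():
# without it, ctypes would pass Python ints and misread the result.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(2.0))  # → 1.4142135623730951
```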
Replying to myself a bit here:
More abstractly, I think it might be possible to define a tiny core API that describes conventions for calling back and forth between Python and C, and then break the C API apart into chunks (modules?) that build helpers for specific functionality on top of that tiny core of calling conventions.
CFFI is much slower than CPython's C-API:
https://mail.python.org/pipermail/python-dev/2013-December/130772.html
CPython's API is actually quite fast in comparison to FFIs of various languages.
The only FFI I know of that *achieves* CPython's speed despite being garbage collected is OCaml's. So garbage collection certainly isn't a panacea either -- probably the contrary.
Comparisons to state-of-the-art Lisp implementations like SBCL would be interesting.
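[Editorial aside: the relative speeds above are machine- and version-dependent; the rough micro-benchmark below is an illustration of the claim, not the linked benchmark. It compares `math.sqrt`, a C function exposed through CPython's C-API, against the same libm routine reached through a generic FFI layer, with the stdlib's ctypes standing in for an FFI.]

```python
import ctypes
import ctypes.util
import math
import timeit

# Reach libm's sqrt through an FFI layer (ctypes) rather than a
# C-API extension module (math).
libm = ctypes.CDLL(ctypes.util.find_library("m") or ctypes.util.find_library("c"))
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

N = 100_000
t_capi = timeit.timeit("f(2.0)", globals={"f": math.sqrt}, number=N)
t_ffi = timeit.timeit("f(2.0)", globals={"f": libm.sqrt}, number=N)
print(f"C-API extension: {t_capi:.4f}s  ctypes FFI: {t_ffi:.4f}s")
```

On CPython the FFI path typically loses, since every call still crosses the same boxed-object boundary plus an extra marshalling layer.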
On Wed, Aug 1, 2018 at 7:19 PM, <skrah@bytereef.org> wrote:
CFFI is much slower than CPython's C-API:
https://mail.python.org/pipermail/python-dev/2013-December/130772.html
Woot.
It's a bit unsurprising that CFFI is slower on CPython because it works on top of the C API (so it's an extra layer).
Under PyPy it does better because it's more of a first class citizen.
I think these benchmarks support the general idea that changing the C API does open some doors to better performance (which is why PyPy folks wrote CFFI)?
Schiavo Simon
Simon Cross wrote:
I think these benchmarks support the general idea that changing the C API does open some doors to better performance (which is why PyPy folks wrote CFFI)?
I wonder. The benchmarks show that the C-API is very fast, which historically is one of the reasons for Python's success.
In my experience the theoretical maximum by eliminating ceval.c and inlining everything is below 2x, something around 1.6x.
So speedups would have to be done in libpython itself, but I'm not exactly sure what is being proposed.
Stefan Krah
Simon Cross wrote on 2018-08-01 at 21:37:
I think these benchmarks support the general idea that changing the C API does open some doors to better performance (which is why PyPy folks wrote CFFI)?
Initially, they half-heartedly recommended ctypes as a way to write native code wrappers for both CPython and PyPy, and then they wrote CFFI because a) it matches their JIT better than ctypes and was easier to implement and optimise for them than the complex ctypes library, and b) they saw the FFI of luajit [1], which is much smarter and easier to use than ctypes, so they took it as a model.
The fact that there is a CFFI for CPython is mostly due to their wish to make people write native code wrappers that work on both systems, while being fast on PyPy. And because they are nice people who like writing cool software. :) But mostly, really, to keep people from writing low-level C-API code that they would then have to support in their cpyext emulation. So, actually very similar to Victor's intentions. The more high-level the code is that people write, the easier it becomes to change the foundation under their feet.
Stefan
Simon Cross wrote on 2018-08-01 at 21:37:
On Wed, Aug 1, 2018 at 7:19 PM, <skrah@bytereef.org> wrote:
CFFI is much slower than CPython's C-API:
https://mail.python.org/pipermail/python-dev/2013-December/130772.html
It's a bit unsurprising that CFFI is slower on CPython because it works on top of the C API (so it's an extra layer).
Under PyPy it does better because it's more of a first class citizen.
It's faster under PyPy because their JIT can understand and optimise it. Or, more specifically, its usage in jitted code, so you get unboxed native calls instead of Python API calls. CPython does not have a JIT, that's why it's slower there. Doesn't really have anything to do with the C-API as such.
Stefan
On Wed, Aug 1, 2018 at 10:33 PM, Stefan Behnel <python_capi@behnel.de> wrote:
It's faster under PyPy because their JIT can understand and optimise it. Or, more specifically, its usage in jitted code, so you get unboxed native calls instead of Python API calls. CPython does not have a JIT, that's why it's slower there. Doesn't really have anything to do with the C-API as such.
/me nods.
I think it was this kind of dynamic optimization that Victor was hoping might one day make it into CPython (and was blocked by the current complex C-API).
On Wed, Aug 1, 2018 at 2:20 PM, Simon Cross <hodgestar@gmail.com> wrote:
On Wed, Aug 1, 2018 at 10:33 PM, Stefan Behnel <python_capi@behnel.de> wrote:
It's faster under PyPy because their JIT can understand and optimise it. Or, more specifically, its usage in jitted code, so you get unboxed native calls instead of Python API calls. CPython does not have a JIT, that's why it's slower there. Doesn't really have anything to do with the C-API as such.
/me nods.
I think it was this kind of dynamic optimization that Victor was hoping might one day make it into CPython (and was blocked by the current complex C-API).
There's nothing about the C-API that prevents a JIT (in fact numba can do exactly that, though it's more focused on operating exclusively on unboxed types), and such a JIT could optimize CFFI calls just as well. But as long as the default mode is interpreted, always operating on boxed objects, it will be slower.
On Thu, Aug 2, 2018 at 10:16 AM, Robert Bradshaw <robertwb@math.washington.edu> wrote:
There's nothing about the C-API that prevents a JIT (in fact numba can do exactly that, though it's more focused on operating exclusively on unboxed types), and such a JIT could optimize CFFI calls just as well. But as long as the default mode is interpreted, always operating on boxed objects, it will be slower.
Indeed, and as long as the C-API works with Python objects, a JIT has to pass actual boxed Python objects to code written against the C-API, which makes improving the performance of calls across the boundary between the "interpreter" and C-API-using code somewhat hopeless.
Simon Cross wrote on 2018-08-02 at 11:39:
On Thu, Aug 2, 2018 at 10:16 AM, Robert Bradshaw wrote:
There's nothing about the C-API that prevents a JIT (in fact numba can do exactly that, though it's more focused on operating exclusively on unboxed types), and such a JIT could optimize CFFI calls just as well. But as long as the default mode is interpreted, always operating on boxed objects, it will be slower.
Indeed, and as long as the C-API works with Python objects, a JIT has to pass actual boxed Python objects to code written against the C-API, which makes improving the performance of calls across the boundary between the "interpreter" and C-API-using code somewhat hopeless.
As long as we are just talking about calling such code, this is not true. A JIT compiler can be made to understand that the other side is natively implemented and use that information to call the underlying native function directly. PyPy's CFFI implementation does that. And there are discussions starting on providing a dedicated C call protocol for this as part of the C-API.
Stefan
I hope that one day, CPython will get its JIT compiler as well :-D (There are already multiple JIT compilers in forks of CPython.)
Victor
2018-08-01 23:20 GMT+02:00 Simon Cross <hodgestar@gmail.com>:
On Wed, Aug 1, 2018 at 10:33 PM, Stefan Behnel <python_capi@behnel.de> wrote:
It's faster under PyPy because their JIT can understand and optimise it. Or, more specifically, its usage in jitted code, so you get unboxed native calls instead of Python API calls. CPython does not have a JIT, that's why it's slower there. Doesn't really have anything to do with the C-API as such.
/me nods.
I think it was this kind of dynamic optimization that Victor was hoping might one day make it into CPython (and was blocked by the current complex C-API).
participants (5)
- Robert Bradshaw
- Simon Cross
- skrah@bytereef.org
- Stefan Behnel
- Victor Stinner