[Python-Dev] Let's change to C API!

Victor Stinner vstinner at redhat.com
Tue Jul 31 09:34:05 EDT 2018


Antoine: would you mind subscribing to the capi-sig mailing list? As
expected, there are many interesting points discussed here, but I would
like to move all C API discussions to capi-sig. I only continue on
python-dev since you started here (and ignored my request to start
discussing my idea on capi-sig :-)).

2018-07-31 13:55 GMT+02:00 Antoine Pitrou <solipsis at pitrou.net>:
>> I understood that PyPy succeeded to become at least 2x faster than
>> CPython by stopping to use reference counting internally.
>
> "I understood that"... where did you get it from? :-)

I'm quite sure that PyPy developers told me that, but I don't recall
who or when.

I don't think that PyPy became 5x faster just because of a single
change. But I understand that to be able to implement some
optimizations, you first have to remove constraints caused by a design
choice like reference counting.

For example, PyPy uses different memory allocators depending on the
scope and the lifetime of an object. I'm not sure that you can
implement such an optimization if you are stuck with reference counting.
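
To make the allocator point concrete, here is a rough sketch of a
bump-pointer "nursery" for short-lived objects. It is not PyPy's
implementation; every name in it (toy_nursery, toy_nursery_alloc,
toy_nursery_reset, TOY_NURSERY_SIZE) is made up for illustration.
Objects are never freed one by one, the whole region is recycled at
once, which is hard to reconcile with freeing each object the moment
its reference count drops to zero.

```c
/* Toy model only: not PyPy's allocator, all names are hypothetical. */
#include <stddef.h>

#define TOY_NURSERY_SIZE 4096

typedef struct {
    unsigned char buf[TOY_NURSERY_SIZE]; /* region for short-lived objects */
    size_t used;                         /* bytes handed out so far */
} toy_nursery;

/* Allocation is a bounds check plus a pointer bump. Sizes are rounded
 * up to 8 bytes to keep the sketch simple; a real allocator would be
 * more careful about alignment. Returns NULL when the region is full. */
void *toy_nursery_alloc(toy_nursery *n, size_t size)
{
    size_t rounded;
    void *p;

    if (size > TOY_NURSERY_SIZE)
        return NULL;
    rounded = (size + 7u) & ~(size_t)7u;
    if (rounded > TOY_NURSERY_SIZE - n->used)
        return NULL;
    p = n->buf + n->used;
    n->used += rounded;
    return p;
}

/* There is no per-object free: the whole region is recycled at once,
 * for example after surviving objects have been moved elsewhere. */
void toy_nursery_reset(toy_nursery *n)
{
    n->used = 0;
}
```

In this sketch, two 10-byte allocations end up 16 bytes apart, and
after a reset the first address is handed out again.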


> So I think that we should ask what the ABI differences between debug
> and non-debug builds are.

The debug build is one use case. Another use case for OS vendors is to
compile a C extension once (e.g. on Python 3.6) and use it on multiple
Python versions (3.7, 3.8, etc.).


> AFAIK, the two main ones are Py_TRACE_REFS and Py_REF_DEBUG.  Are there
> any others?

No idea.

> Honestly, I don't think Py_TRACE_REFS is useful.  I don't remember
> any bug being discovered thanks to it.  Py_REF_DEBUG is much more
> useful.  The main ABI issue with Py_REF_DEBUG is not object structure
> (it doesn't change object structure), it's when a non-debug extension
> steals a reference (or calls a reference-stealing C API function),
> because then increments and decrements are unbalanced.

About Py_REF_DEBUG: the _Py_RefTotal counter is updated at each
INCREF/DECREF. _Py_RefTotal is a popular feature of the debug build, and
I'm not sure how we can update it without replacing the Py_INCREF/DECREF
macros with function calls.
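
A toy model of the problem, in case it helps. This is not CPython code:
toy_object, toy_ref_total, TOY_INCREF, TOY_REF_DEBUG and the toy_*
functions are hypothetical stand-ins for PyObject, _Py_RefTotal,
Py_INCREF, Py_REF_DEBUG and function-call variants of INCREF/DECREF.
The macro is expanded when the extension is compiled, so an extension
built without the debug flag never touches the total. A function call
lets the runtime decide what happens.

```c
/* Toy model only: none of these names are CPython's real API. */
#include <assert.h>
#include <stddef.h>

typedef struct {
    ptrdiff_t refcnt;               /* stand-in for ob_refcnt */
} toy_object;

/* Stand-in for _Py_RefTotal, owned by the "runtime". */
static ptrdiff_t toy_ref_total = 0;

/* Macro style: the increment is inlined into the extension when it is
 * compiled. Without TOY_REF_DEBUG the total is never updated, whatever
 * runtime the extension is later loaded into. */
#ifdef TOY_REF_DEBUG
#define TOY_INCREF(op) (toy_ref_total++, (op)->refcnt++)
#else
#define TOY_INCREF(op) ((op)->refcnt++)
#endif

/* Function style: the extension only emits a call, so a debug runtime
 * can keep its total accurate. */
void toy_incref(toy_object *op)
{
    toy_ref_total++;
    op->refcnt++;
}

void toy_decref(toy_object *op)
{
    toy_ref_total--;
    op->refcnt--;
}

ptrdiff_t toy_get_ref_total(void)
{
    return toy_ref_total;
}

/* Mixes both styles the way a non-debug extension on a debug runtime
 * would: one inlined increment, one decrement done by the runtime.
 * Returns how far the total has drifted. */
ptrdiff_t toy_demo_drift(void)
{
    toy_object obj = { 1 };
    ptrdiff_t before = toy_ref_total;

    TOY_INCREF(&obj);               /* inlined: total not updated */
    toy_decref(&obj);               /* runtime call: total decremented */

    assert(obj.refcnt == 1);        /* the object itself is balanced */
    return toy_ref_total - before;  /* but the total is off by one */
}
```

Built without TOY_REF_DEBUG, toy_demo_drift() reports a drift of -1:
the object's own count is balanced but the global total is not, which
is the kind of imbalance Antoine describes above.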

I'm OK with removing/deprecating the Py_TRACE_REFS feature if nobody uses it.


> OS vendors seem to be doing a fine job AFAICT.  And if I want a recent
> Python I just download Miniconda/Anaconda.

Is it used in production to deploy services? Or is it used more by
developers? I have never used Anaconda.


> cffi is a ctypes replacement.  It's nice when you want to bind with
> foreign C code, not if you want tight interaction with CPython objects.

I have been told that cffi is a different way to do the same thing.
Instead of writing C code with the C API glue, only write C code, and
then write a cffi binding for it.

But I have never used Cython or cffi, so I'm not sure which one is the
most appropriate depending on the use case.


> I think you don't realize that the C API is *already* annoying.  People
> started with it mostly because there wasn't a better alternative at the
> time.  You don't need to make it more annoying than it already is ;-)
>
> Replacing existing C extensions with something else is entirely a
> developer time/effort problem, not an attractivity problem.  And I'm
> not sure that porting a C extension to a new C API is more reasonable
> than porting to Cython entirely.

Do you think that it's doable to port numpy to Cython? It's made of
255K lines of C code. A major "rewrite" of such a large code base is
very difficult since people want to push new things in parallel. Or is
it maybe possible to do it incrementally?


> It's just that I disagree that removing the C API will make CPython 2x
> faster.

How can we make CPython 2x faster? Why has everybody, except PyPy,
failed to do that?


Victor

