[Python-Dev] Let's change to C API!
Antoine Pitrou
solipsis at pitrou.net
Tue Jul 31 07:55:45 EDT 2018
On Tue, 31 Jul 2018 12:51:23 +0200
Victor Stinner <vstinner at redhat.com> wrote:
> 2018-07-31 8:58 GMT+02:00 Antoine Pitrou <solipsis at pitrou.net>:
> > What exactly in the C API made it slow or non-promising?
> >
> >> The C API requires that your implementations make almost all the same
> >> design choices that CPython made 25 years ago (C structures, memory
> >> allocators, reference counting, specific GC implementation, GIL, etc.).
> >
> > Yes, but those choices are not necessarily bad.
>
> I understood that PyPy succeeded in becoming at least 2x faster than
> CPython by no longer using reference counting internally.
"I understood that"... where did you get it from? :-)
> I also want to make the debug build usable.
So I think that we should ask what the ABI differences between debug
and non-debug builds are.
AFAIK, the two main ones are Py_TRACE_REFS and Py_REF_DEBUG. Are there
any others?
Honestly, I don't think Py_TRACE_REFS is useful. I don't remember
any bug being discovered thanks to it. Py_REF_DEBUG is much more
useful. The main ABI issue with Py_REF_DEBUG is not object structure
(it doesn't change object structure), it's when a non-debug extension
steals a reference (or calls a reference-stealing C API function),
because then increments and decrements are unbalanced.
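To make the imbalance concrete, here is a toy model in Python (not CPython code). It assumes that a Py_REF_DEBUG build keeps a global total of references next to each per-object count, and that an extension compiled without Py_REF_DEBUG updates only the per-object count. The names ToyObject, ToyDebugRuntime, steal_into_slot and nondebug_incref are hypothetical and exist only for this sketch.

```python
class ToyObject:
    """Stand-in for a PyObject: only carries a reference count."""
    def __init__(self):
        self.refcnt = 1


class ToyDebugRuntime:
    """Stand-in for a debug interpreter that also tracks a global total."""
    def __init__(self):
        self.ref_total = 0

    def new_object(self):
        # Creating an object counts as one reference in the global total.
        self.ref_total += 1
        return ToyObject()

    def incref(self, obj):
        obj.refcnt += 1
        self.ref_total += 1

    def decref(self, obj):
        obj.refcnt -= 1
        self.ref_total -= 1

    def steal_into_slot(self, slot, obj):
        # Reference-stealing call: takes ownership without an incref.
        slot.append(obj)

    def clear_slot(self, slot):
        # The runtime later releases the references it owns.
        for obj in slot:
            self.decref(obj)
        slot.clear()


def nondebug_incref(obj):
    # Assumption: the non-debug extension bumps the per-object count
    # but knows nothing about the global total.
    obj.refcnt += 1


rt = ToyDebugRuntime()
obj = rt.new_object()          # refcnt 1, total 1
nondebug_incref(obj)           # refcnt 2, total still 1
slot = []
rt.steal_into_slot(slot, obj)  # ownership of one reference moves to the runtime
rt.clear_slot(slot)            # debug decref: refcnt 1, total 0

print(obj.refcnt, rt.ref_total)
```

At the end the object is still alive with one reference, yet the global total claims there are none: the increment happened on the non-debug side and the matching decrement on the debug side.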
> I also want to allow OS vendors to provide multiple Python versions
> per OS release: *reduce* the maintenance burden, obviously it will
> still mean more work. It's a tradeoff depending on the lifetime of
> your OS and the pressure of customers to get the newest Python :-) FYI
> Red Hat already provides recent development tools on top of RHEL (and
> CentOS and Fedora) because customers are asking for that. We don't
> work for free :-)
OS vendors seem to be doing a fine job AFAICT. And if I want a recent
Python I just download Miniconda/Anaconda.
> I also want to see more alternative implementations of Python! I
> would like to see RustPython succeed!
As long as RustPython gets 10 commits a year, it has no chance of being
a functional Python implementation, let alone a successful one. AFAICS
it's just a toy project.
> > and one where I think Stefan is right that we
> > should push people towards Cython and alternatives, rather than direct
> > use of the C API (which people often fail to use correctly, in my
> > experience).
>
> Don't get me wrong: my intent is not to replace Cython. Even if PyPy
> is pushing cffi hard, many C extensions still use the C API.
cffi is a ctypes replacement. It's nice when you want to bind with
foreign C code, not if you want tight interaction with CPython objects.
> Maybe if the C API becomes more annoying and requires developers to
> adapt their old code base for the "new C API", some of them will
> consider using Cython, cffi or something else instead :-D
I think you don't realize that the C API is *already* annoying. People
started with it mostly because there wasn't a better alternative at the
time. You don't need to make it more annoying than it already is ;-)
Replacing existing C extensions with something else is entirely a
developer time/effort problem, not an attractiveness problem. And I'm
not sure that porting a C extension to a new C API is more reasonable
than porting to Cython entirely.
> Do you think that it's wrong to promise that a smaller C API without
> implementation details will make it easier to *experiment* with
> optimizations?
I don't think it's wrong. Though as long as CPython itself uses the
internal C API, you'll still have a *lot* of code to change before you
can even launch a functional interpreter and standard library...
It's just that I disagree that removing the C API will make CPython 2x
faster.
Actually, important modern optimizations for dynamic languages (such as
inlining, type specialization, inline caches, object unboxing) don't
seem to depend on the C API at all.
> >> I have to confess that helping Larry is part of my overall plan.
> >
> > Which is why I'd like to see Larry chime in here.
>
> I already talked a little bit with Larry about my plan, but he wasn't
> sure that my plan is enough to be able to stop reference counting
> internally and move to a different garbage collector. I'm only sure
> that it's possible to keep using reference counting for the C API,
> since there are solutions for that (ex: maintain a hash table
> PyObject* => reference count).
Theoretically possible, but the cost of reference counting will go
through the roof if you start using a hash table.
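A rough sketch of why, as a toy model in Python rather than C: with an inline count, an incref is one field update on the object, while a side table needs a hash computation and a table lookup for every incref and decref. SideTableRefcounts is a hypothetical name, and id(obj) stands in for the PyObject* address used as the key.

```python
class InlineObject:
    """Object that carries its own reference count, as CPython objects do."""
    __slots__ = ("refcnt",)

    def __init__(self):
        self.refcnt = 1


def inline_incref(obj):
    # One field update, no lookup.
    obj.refcnt += 1


class SideTableRefcounts:
    """Reference counts kept outside the objects, keyed by address."""
    def __init__(self):
        self._counts = {}

    def incref(self, obj):
        key = id(obj)  # stand-in for the PyObject* address
        # Hash the key, probe the table, then store the new count.
        self._counts[key] = self._counts.get(key, 0) + 1

    def decref(self, obj):
        """Drop one reference; return True when the count reaches zero."""
        key = id(obj)
        remaining = self._counts[key] - 1
        if remaining == 0:
            del self._counts[key]
            return True
        self._counts[key] = remaining
        return False

    def count(self, obj):
        return self._counts.get(id(obj), 0)


table = SideTableRefcounts()
payload = object()
table.incref(payload)
table.incref(payload)
first_drop = table.decref(payload)   # one reference left
second_drop = table.decref(payload)  # count reaches zero, entry removed
```

The sketch makes no timing claim; it only shows the extra work per operation. A real multithreaded implementation would also need to synchronize access to the table.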
> Honestly, right now, I'm only convinced of two things:
>
> * Larry's implementation is very complex and so I doubt that he is going
> to succeed. I'm talking about solutions to maintain optimized reference
> counting in multithreaded applications. Like his idea of "logs" of
> reference counters.
Well, you know, *any* solution is going to be very complex. Switching
to a full GC for a runtime (CPython) which can allocate hundreds of
thousands of objects per second will require a lot of optimization work
as well.
> * We have to change the C API: it causes trouble for *everybody*.
> Nobody spoke up because changing the C API is a giant project and it
> breaks backward compatibility. But I'm not sure that all victims
> of the C API are aware that their issues are caused by the design of
> the current C API.
I fully agree that the C API is not very nice to play with. The
diversity of calling / error return conventions is one annoyance.
Borrowed references and reference stealing are another. Getting
reference counting right on all code paths is often delicate.
So I'm all for sanitizing the C API, and slowly deprecating old
patterns. And I think we should push people towards Cython for most
current uses of the C API.
Regards
Antoine.