[Python-Dev] BDFL-Delegate appointments for several PEPs
mark at hotpy.org
Sat Mar 30 12:30:58 EDT 2019
On 27/03/2019 1:50 pm, Petr Viktorin wrote:
> On Sun, Mar 24, 2019 at 4:22 PM Mark Shannon <mark at hotpy.org> wrote:
>> Hi Petr,
>> Regarding PEPs 576 and 580.
>> Over the new year, I did a thorough analysis of possible calling
>> conventions for use in the CPython ecosystem and came up with a new
>> PEP.
>> The draft can be found here:
>> I was hoping to profile a branch with the various experimental changes
>> cherry-picked together, but don't seem to have found the time :(
>> I'd like to have a testable branch before formally submitting the PEP,
>> but I thought you should be aware of the PEP.
> Hello Mark,
> Thank you for letting me know! I wish I knew of this back in January,
> when you committed the first draft. This is unfair to the competing
> PEP, which is ready and was waiting for the new governance. We have
> lost three months that could be spent pondering the ideas in the
I realize this is less than ideal. I had planned to publish this in
December, but life intervened. Nothing bad, just too busy.
> Do you think you will find the time to piece things together? Is there
> anything that you already know should be changed?
I've submitted the final PEP and minimal implementation
> Do you have any comments on [Jeroen's comparison]?
It is rather out of date, but I have two comments.
1. `_PyObject_FastCallKeywords()` is used as an example of a call in
CPython. It is an internal implementation detail and not a common path.
2. The claim that PEP 580 allows "certain optimizations because other
code can make assumptions" is flawed. In general, the caller cannot make
assumptions about the callee or vice-versa. Python is a dynamic language.
> The pre-PEP is simpler than PEP 580, because it solves simpler issues.
The fundamental issue being addressed is the same, and it is this:
Currently third-party C code can either be called quickly or have access
to the callable object, not both. Both PEPs address this.
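
A rough sketch of that trade-off, modelled in Python rather than C. All
names here (`ExtensionCallable`, `call_via_tp_call`, `call_via_fastcall`)
are hypothetical and only stand in for the two existing conventions; this
is not the CPython API.

```python
# Hypothetical Python model of the two existing C-level conventions.

class ExtensionCallable:
    """Stands in for a callable implemented in third-party C code."""

    def __init__(self, name, bound_self):
        self.name = name
        self.bound_self = bound_self

    @staticmethod
    def tp_call(callable_obj, args, kwargs):
        # tp_call-style: sees the callable, but needs a tuple and a dict.
        return (callable_obj.name, args, kwargs)

    @staticmethod
    def fast_func(self_arg, args, nargs):
        # FASTCALL-style: gets the raw vector, but only `self`, not the callable.
        return (self_arg, list(args[:nargs]))


def call_via_tp_call(callable_obj, args, kwargs):
    packed_args = tuple(args)      # allocation on every call
    packed_kwargs = dict(kwargs)   # allocation on every call
    return callable_obj.tp_call(callable_obj, packed_args, packed_kwargs)


def call_via_fastcall(callable_obj, args):
    # No packing, but the callable object itself is not passed through.
    return callable_obj.fast_func(callable_obj.bound_self, args, len(args))


f = ExtensionCallable("f", bound_self="module")
slow_result = call_via_tp_call(f, [1, 2], {"k": 3})
fast_result = call_via_fastcall(f, [1, 2])
```

In this model the slow path can see `f.name` and the fast path cannot,
which is the gap both PEPs aim to close.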
> I'll need to confirm that it won't paint us into a corner -- that
> there's a way to address all the issues in PEP 579 in the future.
PEP 579 is mainly a list of supposed flaws with the
The general thrust of PEP 579 seems to be that builtin-functions and
builtin-methods should be more flexible and extensible than they are. I
don't agree. If you want different behaviour, then use a different
object. Don't try to cram all this extra behaviour into a pre-existing
However, if we assume that we are talking about callables implemented in
C, in general, then there are 3 key issues covered by PEP 579.
1. Inspection and documentation; it is hard for extensions to have
docstrings and signatures. Worth addressing, but completely orthogonal
to PEP 590.
2. Extensibility and performance; extensions should have the power of
Python functions without suffering slow calls. Allowing the C code
access to the callable object is a general solution to this problem.
Both PEP 580 and PEP 590 do this.
3. Exposing the underlying implementation and signature of the C code,
so that optimisers can avoid unnecessary boxing. This may be worth
doing, but until we have an adaptive optimiser capable of exploiting
this information, this is premature. Neither PEP 580 nor PEP 590
explicitly allows or prevents this.
> The pre-PEP claims speedups of 2% in initial experiments, with
> expected overall performance gain of 4% for the standard benchmark
> suite. That's pretty big.
That's because there is a lot of code around calls in CPython, and it
has grown in a rather haphazard fashion. Victor's work to add the
"FASTCALL" protocol has helped. PEP 590 seeks to formalise and extend
that, so that it can be used more consistently and efficiently.
> As far as I can see, PEP 580 claims not much improvement in CPython,
> but rather large improvements for extensions (Mistune with Cython).
Calls to and from extension code are slow because they have to use the
`tp_call` calling convention (or lose access to the callable object).
With a calling convention that does not have any special cases,
extensions can be as fast as builtin functions. Both PEP 580 and PEP 590
attempt to do this, but PEP 590 is more efficient.
> The pre-PEP has a complication around offsetting arguments by 1 to
> allow bound methods to forward calls cheaply. I fear that this optimizes
> for current usage with its limitations.
It's optimising for the common case, while allowing the less common.
Bound methods and classes need to add one additional argument. Other
rarer cases, like `partial`, may need to allocate memory, but can still
add or remove any number of arguments.
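
The offset idea can be modelled in a few lines of Python. This is only a
sketch under assumptions: the classes `Function`, `BoundMethod` and
`Partial`, and the `spare_slot` flag, are hypothetical names, and a Python
list stands in for the C argument vector.

```python
# Hypothetical model: the caller may leave one writable slot in front of
# the arguments, so a bound method can prepend `self` without allocating.

class Function:
    def __init__(self, impl):
        self.impl = impl

    def vectorcall(self, stack, start, nargs, spare_slot):
        return self.impl(*stack[start:start + nargs])


class BoundMethod:
    def __init__(self, func, instance):
        self.func = func
        self.instance = instance

    def vectorcall(self, stack, start, nargs, spare_slot):
        if spare_slot:
            # Common case: borrow the slot before the arguments, then restore it.
            saved = stack[start - 1]
            stack[start - 1] = self.instance
            try:
                return self.func.vectorcall(stack, start - 1, nargs + 1, False)
            finally:
                stack[start - 1] = saved
        # No spare slot offered: fall back to building a new vector.
        new_stack = [self.instance] + stack[start:start + nargs]
        return self.func.vectorcall(new_stack, 0, nargs + 1, False)


class Partial:
    def __init__(self, func, *frozen):
        self.func = func
        self.frozen = frozen

    def vectorcall(self, stack, start, nargs, spare_slot):
        # Rarer case: adds any number of arguments, so it allocates.
        new_stack = list(self.frozen) + stack[start:start + nargs]
        return self.func.vectorcall(new_stack, 0, len(new_stack), False)


triple = Function(lambda a, b, c: (a, b, c))
caller_stack = ["spare", 2, 3]
method_result = BoundMethod(triple, 1).vectorcall(caller_stack, 1, 2, True)
partial_result = Partial(triple, 1, 2).vectorcall([3], 0, 1, False)
```

The bound method leaves `caller_stack` exactly as it found it, so the
caller never observes the borrowed slot.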
> PEP 580's cc_parent allows bound methods to have access to the class,
> and through that, the module object where they are defined and the
> corresponding module state. To support this, vector calls would need a
> two-argument offset.
Not true. The first argument in the vector call is the callable itself.
Through it, any callable can access its class, its module or any
other object it wants.
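
As an illustration only, here is a Python model of a function that
receives the callable as its first argument. The attributes
`defining_class` and `module`, and the `state` dictionary, are
assumptions made for this sketch, not fields defined by either PEP.

```python
from types import SimpleNamespace


def method_impl(callable_obj, args, nargs):
    """Model of a C function whose first parameter is the callable itself."""
    # Everything else is reachable from the callable; no extra slot is needed.
    defining_class = callable_obj.defining_class
    module_state = callable_obj.module.state
    return (defining_class.__name__, module_state["counter"] + args[0])


class Widget:
    pass


module = SimpleNamespace(state={"counter": 10})
method = SimpleNamespace(defining_class=Widget, module=module, impl=method_impl)

result = method.impl(method, [5], 1)
```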
> (That seems to illustrate the main difference between the motivations
> of the two PEPs: one focuses on extensibility; the other on optimizing
> existing use cases.)
I'll reiterate that PEP 590 is more general than PEP 580 and that once
the callable's code has access to the callable object (as both PEPs
allow) then anything is possible. You can't get more extensible than
> The pre-PEP's "any third-party class implementing the new call
> interface will not be usable as a base class" looks quite limiting.
PEP 580 has the same limitation for the same reasons. The limitation is
necessary for correctness if an object supports calls via `__call__` and
through another calling convention.
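
A small Python model of why that matters. The `vectorcall` method and
the `call` helper below are hypothetical stand-ins for a second, faster
calling convention and for an interpreter that prefers it.

```python
class FastBase:
    """Supports calls both via __call__ and via a separate fast path."""

    def __call__(self, *args):
        return ("base", args)

    def vectorcall(self, args):
        # Fast path: must always agree with __call__.
        return ("base", tuple(args))


class Sub(FastBase):
    # The subclass overrides __call__ but inherits the fast path unchanged.
    def __call__(self, *args):
        return ("sub", args)


def call(obj, *args):
    """Model of a caller that uses the fast path whenever it is present."""
    fast = getattr(obj, "vectorcall", None)
    if fast is not None:
        return fast(list(args))
    return obj(*args)


via_fast_path = call(Sub(), 1)
via_dunder_call = Sub()(1)
```

Here the two routes disagree for `Sub`: the fast path still runs the base
behaviour while `__call__` runs the override. Forbidding subclassing is
one way to rule that out.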
> [Jeroen's comparison]: