[Python-Dev] PEP 580 and PEP 590 comparison.

Mark Shannon mark at hotpy.org
Sun Apr 14 07:34:17 EDT 2019

Hi Petr,

Thanks for spending time on this.

I think the comparison of the two PEPs falls into two broad categories: 
performance and capability.

I'll address capability first.

Let's try a thought experiment.
Consider PEP 580. It uses the old `tp_print` slot as an offset to mark 
the location of the CCall structure within the callable. Now suppose 
instead that it uses a `tp_flag` to mark the presence of an offset field 
and that the offset field is moved to the end of the TypeObject. This 
would not impact the capabilities of PEP 580.
Now add a single line which would make PyCCall_FastCall compatible with 
the PEP 590 vectorcall protocol.
Now rebase the PEP 580 reference code on top of the PEP 590 minimal 
implementation and make the vectorcall field of CFunction point to 
that function.
The resulting hybrid is both a PEP 590 conformant implementation and 
at least as capable as the reference PEP 580 implementation.

Therefore, PEP 590 must be at least as capable as PEP 580.
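
To make the thought experiment concrete, here is a minimal, self-contained 
sketch of the layout it describes: a type flag announces that an offset 
field is valid, the offset field sits at the end of the type structure, 
and the offset locates a call entry point inside each instance. Every name 
in it (ToyType, ToyAdder, toy_call, TOY_HAVE_CALL_OFFSET and so on) is 
hypothetical and is not taken from CPython or from either reference 
implementation. The masking of a flag bit out of nargs mirrors the way 
PEP 590 lets callers set PY_VECTORCALL_ARGUMENTS_OFFSET in nargs; it is 
only meant to show the kind of adjustment a callee needs, not the exact 
change discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real CPython definitions. */
#define TOY_HAVE_CALL_OFFSET (1u << 0)  /* type flag: call_offset is valid */
#define TOY_ARGUMENTS_OFFSET ((size_t)1 << (8 * sizeof(size_t) - 1))

typedef struct ToyObject ToyObject;
typedef long (*toy_vectorcall)(ToyObject *callable,
                               const long *args, size_t nargs);

typedef struct {
    unsigned flags;      /* analogous to a flags word on the type */
    size_t call_offset;  /* offset field placed at the end of the type */
} ToyType;

struct ToyObject {
    const ToyType *type;
};

/* A callable instance that stores its entry point inside itself. */
typedef struct {
    ToyObject base;
    long bias;
    toy_vectorcall entry;
} ToyAdder;

/* Callee: strip the flag bit from nargs before counting arguments. */
static long toy_adder_call(ToyObject *callable,
                           const long *args, size_t nargs)
{
    long total = ((ToyAdder *)callable)->bias;
    nargs &= ~TOY_ARGUMENTS_OFFSET;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i];
    }
    return total;
}

/* Caller: check the type flag, then fetch the entry point via the offset.
 * Returns 0 on success and -1 if the object does not support the protocol. */
static int toy_call(ToyObject *callable, const long *args,
                    size_t nargs, long *result)
{
    assert(callable != NULL && result != NULL);
    const ToyType *tp = callable->type;
    if (!(tp->flags & TOY_HAVE_CALL_OFFSET)) {
        return -1;
    }
    toy_vectorcall *slot =
        (toy_vectorcall *)((char *)callable + tp->call_offset);
    if (*slot == NULL) {
        return -1;
    }
    *result = (*slot)(callable, args, nargs);
    return 0;
}

/* Usage: bias 10 plus the arguments 1, 2 and 3, with the flag bit set
 * in nargs by the caller. */
long toy_demo(void)
{
    static const ToyType adder_type = {
        TOY_HAVE_CALL_OFFSET, offsetof(ToyAdder, entry)
    };
    ToyAdder adder = { { &adder_type }, 10, toy_adder_call };
    const long args[] = { 1, 2, 3 };
    long result = 0;
    if (toy_call(&adder.base, args,
                 (size_t)3 | TOY_ARGUMENTS_OFFSET, &result) != 0) {
        return -1;
    }
    return result;
}
```

In this model the caller never needs to know what kind of callable it has; 
it only needs the flag and the offset, which is why the location of the 
offset field within the type does not change what the scheme can express.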

Now performance.

Currently the PEP 590 implementation is intentionally minimal. It does 
nothing for performance. The benchmark Jeroen provides is a 
micro-benchmark that calls the same functions repeatedly. This is 
trivial and unrealistic. So, there is no real evidence either way. I 
will try to provide some.

The point of PEP 590 is that it allows performance improvements by 
allowing callables more freedom of implementation. To repeat an example 
from an earlier email, which may have been overlooked, this code reduces 
the time to create ranges and small lists by about 30%


Speeding up calls to builtin functions by a measurable amount will need 
some work on Argument Clinic. I plan to have that done before PyCon in May.

