[Python-Dev] Benchmarks why we need PEP 576/579/580

Mon Jul 23 05:51:41 EDT 2018

I ran exactly the same benchmark again with Python 3.7 and the results
are similar. I'm copying and editing the original post for completeness:

I finally managed to get some real-life benchmarks for why we need a
faster C calling protocol (see PEPs 576, 579, 580).

I focused on the Cython compilation of SageMath. By default, a function
in Cython is an instance of builtin_function_or_method (analogously,
method_descriptor for a method), which has special optimizations in the
CPython interpreter. But the option "binding=True" changes those to a
custom class which is NOT optimized.
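
For readers who have not met these two classes: they are the ordinary
types of C-implemented callables in CPython, and they can be inspected
from plain Python. The snippet below is only an illustration using
stdlib objects (len and str.upper are my choice of examples, not
anything from SageMath); a Cython function compiled with binding=True
would be an instance of neither class.

```python
import types

# A function implemented in C is an instance of builtin_function_or_method.
builtin_type = type(len)

# A method of a C-implemented type is an instance of method_descriptor.
descriptor_type = type(str.upper)

# A function written in Python has yet another type, shown for contrast.
def py_func():
    pass

python_type = type(py_func)

print(builtin_type.__name__)     # builtin_function_or_method
print(descriptor_type.__name__)  # method_descriptor
print(python_type.__name__)      # function

# The same classes are exposed under friendlier names in the types module
# (types.MethodDescriptorType requires Python 3.7 or later).
assert builtin_type is types.BuiltinFunctionType
assert descriptor_type is types.MethodDescriptorType
```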

I ran the full SageMath testsuite several times on Python 2.7 with and
without binding=True to look for significant differences. I then
checked whether those differences could be reproduced on Python 3.7
(SageMath has not been fully ported to Python 3 yet). The most dramatic
difference is in multiplication of generic matrices. More precisely,
with the following command:

python3 -m timeit -s "from sage.all import MatrixSpace, GF; M = MatrixSpace(GF(9), 200).random_element()" "M * M"

With binding=False, I got
1 loop, best of 5: 1.19 sec per loop

With binding=True, I got
1 loop, best of 5: 1.83 sec per loop

This is a big regression, which PEP 580 should eliminate completely.
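
To put a number on it, here is the arithmetic on the two timings
quoted above (the variable names are just for illustration):

```python
# Best-of-5 timings from the two runs above, in seconds per loop.
t_binding_false = 1.19
t_binding_true = 1.83

# Ratio of the two timings and the extra time relative to binding=False.
slowdown = t_binding_true / t_binding_false
extra_percent = (slowdown - 1) * 100

print(f"{slowdown:.2f}x, {extra_percent:.0f}% slower")  # 1.54x, 54% slower
```

So with binding=True this particular multiplication takes roughly one
and a half times as long.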

I used Python 3.7, SageMath 8.3.rc1 (plus a few patches to make it work 
with binding=True and with Python 3.7) and Cython 0.28.4.

I hope that this finally shows that the problems mentioned in PEP 579
are real.

Jeroen.