[Python-Dev] Micro-benchmarks for PEP 580

INADA Naoki songofacandy at gmail.com
Thu Jul 12 00:59:38 EDT 2018


On Thu, Jul 12, 2018 at 4:54 AM Jeroen Demeyer <J.Demeyer at ugent.be> wrote:
>
> On 2018-07-11 10:50, Victor Stinner wrote:
> > As you wrote, the
> > cost of function costs is unlikely the bottleneck of application.
>
> With that idea, METH_FASTCALL is not needed either. I still find it very
> strange that nobody seems to question all the crazy existing
> optimizations for function calls in CPython, yet claiming at the same
> time that those are just stupid micro-optimizations which are surely not
> important for real applications.

METH_FASTCALL for pyfunction and builtin cfunction made applications
significantly faster.  That is proven by application benchmarks, not only
by micro-benchmarks.

On the other hand, calls into 3rd party extensions are much less frequent.
Our benchmark suite contains some extension calls, but they are not
frequent, and I couldn't find any significant boost from using
METH_FASTCALL for them.  That's why the benefit of METH_FASTCALL is proven
for builtins, but not for 3rd parties.

If you want to prove it, you can add a benchmark that makes heavy use of
extension calls.  With it, we can measure the impact of using
METH_FASTCALL in 3rd party extensions.
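
A minimal sketch of what such a call-heavy micro-benchmark could look like.
It assumes `operator.index` (implemented in C in CPython) as a stand-in for
a 3rd party extension function; a real benchmark would call an actual
extension module built with and without METH_FASTCALL.  The helper name
`per_call_ns` and the pure-Python baseline `py_index` are hypothetical.

```python
import operator
import timeit


def py_index(x):
    # Pure-Python baseline: returns its argument unchanged
    # (same result as operator.index for ints).
    return x


def per_call_ns(func, arg, number=200_000, repeat=5):
    """Return the best observed cost of one func(arg) call, in nanoseconds."""
    # The statement "f(x)" is compiled by timeit, so the measured loop
    # contains only the call itself plus two global lookups.
    timer = timeit.Timer("f(x)", globals={"f": func, "x": arg})
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


if __name__ == "__main__":
    # operator.index is only an assumed stand-in for an extension call.
    for name, func in [("C function", operator.index),
                       ("Python function", py_index)]:
        print(f"{name}: {per_call_ns(func, 1):.1f} ns per call")
```

The numbers only show per-call overhead; they say nothing about whether an
application spends a meaningful share of its time in such calls.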

---

But for now, I'm +1 to enabling FASTCALL in custom types in 3.8.
A Cython author confirmed that they really want to use a custom method
type, and that the lack of FASTCALL support will block them.

So my current question is: should we go with PEP 576, PEP 580, or
something in between?


>
> Anyway, I'm thinking about real-life benchmarks but that's quite hard.
> One issue is that PEP 580 by itself does not make existing faster, but
> allows faster code to be written in the future.

At this time, there is no need to show a performance difference.
What I need are some application benchmarks that are our targets for
optimization.

With such target application benchmarks, we can understand concretely how
PEP 580 (and possible future optimizations based on PEP 580) can speed up
some types of applications.
We can also use them to estimate the performance impact, and to write a PoC
and measure it.

But for now, we don't have any target application in our benchmark suite.
I think that is a showstopper for us.

> A second issue is that
> Cython (my main application) already contains optimizations for
> Cython-to-Cython calls. So, to see the actual impact of PEP 580, I
> should disable those.
>

Could you be more concrete?
Which optimization are you referring to?  Direct calls of cdef/cpdef
functions?  METH_FASTCALL + LOAD_METHOD?
Or some further optimization that PEP 580 enables?

Isn't switching between "binding=True" and "binding=False" enough for that?

I expect we can have several Cython modules that call each other.
I feel that's the right way to simulate a "Cython (not Python) as a
glue language" workload.

Anyway, I'm not asking you to show the "performance impact".
For now, I'm only asking for the "target application we want to optimize
with PEP 580 and future optimizations based on PEP 580".

Regards,

-- 
INADA Naoki  <songofacandy at gmail.com>

