[Python-Dev] Possible performance regression

Neil Schemenauer nas-python at python.ca
Tue Feb 26 17:28:14 EST 2019

On 2019-02-26, Raymond Hettinger wrote:
> That said, I'm only observing the effect when building with the
> Mac default Clang (Apple LLVM version 10.0.0, clang-1000.11.45.5).
> When building with GCC 8.3.0, there is no change in performance.

My guess is that the code in _PyEval_EvalFrameDefault() changed
enough that Clang started emitting slightly different machine code.
If the conditional jumps come out differently, I understand that
alone can have a significant effect on performance.

Are you compiling with --enable-optimizations (i.e. PGO)?  In my
experience, that is needed to get meaningful results.  Victor also
mentions it on his "how-to-get-stable-benchmarks" page.  Building
with PGO is really (really) slow, so I suspect you are not doing it
when bisecting.  You can speed it up greatly by using a simpler
command for PROFILE_TASK in Makefile.pre.in.  E.g.
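A sketch of what that might look like: the default PROFILE_TASK runs a
large chunk of the test suite during the profiling step, so overriding
it with a few quick tests cuts the PGO build time dramatically.  The
test names below are illustrative; any fast subset is fine when all
you need is a rough-but-comparable build for bisecting.

```shell
# Override PROFILE_TASK on the make command line instead of editing
# Makefile.pre.in (the test selection here is illustrative, not a
# recommendation from the thread):
./configure --enable-optimizations
make PROFILE_TASK='-m test.regrtest test_builtin test_dict test_long'
```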


Now that you have narrowed it down to a single commit, it would be
worth doing the comparison with PGO builds (assuming Clang supports
PGO).
> That said, it seems to be compiler specific and only affects the
> Mac builds, so maybe we can decide that we don't care.

I think the key question is whether the ceval loop got a bit slower
due to logic changes or whether Clang just happened to generate
slightly worse code due to source-code details.  A PGO build could
help answer that.  I suppose comparing the machine code directly
would produce too large a diff.
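One way to keep that diff tractable (a sketch, assuming you have the
"before" and "after" build trees side by side) is to disassemble only
the one function that changed rather than the whole object file:

```shell
# Hypothetical paths; adjust to wherever your two builds live.
# Extract just _PyEval_EvalFrameDefault's disassembly from each
# build's ceval.o, then diff those.
objdump -d before/Python/ceval.o > before.asm
objdump -d after/Python/ceval.o  > after.asm
diff before.asm after.asm | wc -l   # rough size of the code change
```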

Could you try hoisting the eval_breaker expression, as was suggested
earlier in the thread?


If the slowdown affects most opcodes, the DISPATCH change looks like
the only plausible cause.  Maybe I missed something, though.

Also, maybe there would be some value in marking key branches as
likely/unlikely if it helps Clang generate better machine code.
Then, even if you compile without PGO (as many people do), you still
get the better machine code.
