[New-bugs-announce] [issue45116] Performance regression 3.10b1 and later on Windows
neonene
report at bugs.python.org
Mon Sep 6 11:27:18 EDT 2021
New submission from neonene <nicesalmon at gmail.com>:
pyperformance on Windows shows a slowdown between 3.10a7 and 3.10b1.
The tables below give the ratios relative to the 3.10a7 PGO build (higher is slower).
-------------------------------------------------
Windows x64     | PGO   release  official-binary
----------------+--------------------------------
20210405        |
3.10a7          | 1.00  1.24     1.00 (PGO?)
20210408-07:58  |
b98eba5         | 0.98
20210408-10:22  |
* PR25244       | 1.04
20210503        |
3.10b1          | 1.07  1.21     1.07
-------------------------------------------------
Windows x86     | PGO   release  official-binary
----------------+--------------------------------
20210405        |
3.10a7          | 1.00  1.25     1.27 (release?)
20210408-07:58  |
b98eba5bc       | 1.00
20210408-10:22  |
* PR25244       | 1.11
20210503        |
3.10b1          | 1.14  1.28     1.29
-------------------------------------------------
Since PR25244 (28d28e053db6b69d91c2dfd579207cd8ccbc39e7),
_PyEval_EvalFrameDefault() in ceval.c no longer seems to be optimized by PGO (MSVC 14.29.16.10).
At least the following functions are no longer inlined there at all:
(1) _Py_DECREF() (from Py_DECREF, Py_CLEAR, Py_SETREF)
(2) _Py_XDECREF() (from Py_XDECREF, SETLOCAL)
(3) _Py_IS_TYPE() (from PyXXX_CheckExact)
(4) _Py_atomic_load_32bit_impl() (from CHECK_EVAL_BREAKER)
I tried other linker options such as thread-safe-profiling, aggressive-code-generation, and /OPT:NOREF, without success.
3.10a7 inlines these functions in the eval loop even when only test_array.py is profiled.
I measured the overhead of (1) to (4) against my own patched build, whose eval loop uses macros instead of these functions.
-----------------------------------------------------------------
Windows x64     | PGO   patched  overhead in eval-loop
----------------+------------------------------------------------
3.10a7          | 1.00
20210802        |
3.10rc1         | 1.09  1.05     4% (slow 43, fast 5, same 10)
20210831-20:42  |
863154c         | 0.95  0.90     5% (slow 48, fast 3, same 7)
(3.11a0+)       |
-----------------------------------------------------------------
Windows x86     | PGO   patched  overhead in eval-loop
----------------+------------------------------------------------
3.10a7          | 1.00
20210802        |
3.10rc1         | 1.15  1.13     2% (slow 29, fast 14, same 15)
20210831-20:42  |
863154c         | 1.05  1.02     3% (slow 44, fast 7, same 7)
(3.11a0+)       |
-----------------------------------------------------------------
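
The report does not define the overhead column. One reading that fits all four rows is that it is the PGO ratio minus the patched ratio, in percentage points of the 3.10a7 baseline. The sketch below checks that assumption against the numbers in the tables. The names ROWS, overhead_points and check_rows are illustrative and are not part of pyperformance or of the attached patch.

```python
# Rows copied from the two "overhead in eval-loop" tables above.
# Each entry: (platform, build, PGO ratio, patched ratio,
#              reported overhead in %, (slow, fast, same) benchmark counts)
ROWS = [
    ("x64", "3.10rc1", 1.09, 1.05, 4, (43, 5, 10)),
    ("x64", "863154c", 0.95, 0.90, 5, (48, 3, 7)),
    ("x86", "3.10rc1", 1.15, 1.13, 2, (29, 14, 15)),
    ("x86", "863154c", 1.05, 1.02, 3, (44, 7, 7)),
]


def overhead_points(pgo, patched):
    """Assumed definition: difference of the two ratios, as a whole
    percentage of the 3.10a7 baseline. This is an inference from the
    table, not something the report states."""
    return round((pgo - patched) * 100)


def check_rows(rows=ROWS):
    """Return (platform, build, computed %, reported %, benchmark total)."""
    results = []
    for platform, build, pgo, patched, reported, counts in rows:
        computed = overhead_points(pgo, patched)
        results.append((platform, build, computed, reported, sum(counts)))
    return results


if __name__ == "__main__":
    for platform, build, computed, reported, total in check_rows():
        print(f"{platform} {build}: computed {computed}%, "
              f"reported {reported}%, {total} benchmarks")
```

Under this reading the computed values agree with the reported ones in every row, and the slow/fast/same counts add up to the same number of benchmarks each time. Dividing the PGO ratio by the patched ratio would also be a plausible definition; it gives similar figures but does not round to the reported value for the x64 863154c row.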
----------
components: C API, Interpreter Core, Windows
files: 310rc1_confirm_overhead.patch
keywords: patch
messages: 401143
nosy: Mark.Shannon, neonene, pablogsal, paul.moore, steve.dower, tim.golden, vstinner, zach.ware
priority: normal
severity: normal
status: open
title: Performance regression 3.10b1 and later on Windows
type: performance
versions: Python 3.10, Python 3.11
Added file: https://bugs.python.org/file50263/310rc1_confirm_overhead.patch
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue45116>
_______________________________________