On Wed, Jul 1, 2020 at 03:53, Inada Naoki firstname.lastname@example.org wrote:
I confirmed the performance regression, although the difference is 12%. And I found the commit that caused the regression.
The regression is not caused by a "static inline" function not being inlined by the compiler. The commit changed PyType_HasFeature to always call the regular function PyType_GetFlags.
On Fedora 32 with GCC 10.1.1, even if PyType_GetFlags() is a function, the function call is inlined. This is thanks to LTO (and -fno-semantic-interposition, since Fedora builds Python with --enable-shared, which is not the case for the macOS installer).
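To make the inlining point concrete, here is a minimal sketch in C. It is not the CPython source: `sketch_type`, `sketch_get_flags()`, the two `sketch_has_feature_*()` helpers and `SKETCH_FLAG_EXAMPLE` are hypothetical stand-ins for PyTypeObject, PyType_GetFlags() and PyType_HasFeature(). It assumes the cheaper variant reads the flags field directly, which is what makes it trivially inlinable.

```c
#include <assert.h>

/* Hypothetical flag bit, chosen only for illustration. */
#define SKETCH_FLAG_EXAMPLE (1UL << 5)

/* Hypothetical stand-in for PyTypeObject: only the flags field matters. */
typedef struct {
    unsigned long tp_flags;
} sketch_type;

/* Regular exported function, playing the role of PyType_GetFlags().
 * A caller in another translation unit needs LTO to inline it, and in a
 * shared library built with -fPIC, GCC may also need
 * -fno-semantic-interposition before it inlines an exported symbol. */
unsigned long sketch_get_flags(const sketch_type *type)
{
    return type->tp_flags;
}

/* Variant 1: reads the field directly. The body is visible at every
 * call site, so the compiler can reduce it to a load and a bit test. */
static inline int sketch_has_feature_direct(const sketch_type *type,
                                            unsigned long feature)
{
    return (type->tp_flags & feature) != 0;
}

/* Variant 2: always goes through the regular function. The wrapper is
 * still inlined, but the inner call stays a real call unless the
 * optimizer can see through it. */
static inline int sketch_has_feature_call(const sketch_type *type,
                                          unsigned long feature)
{
    return (sketch_get_flags(type) & feature) != 0;
}

/* Both variants return the same answer; only the generated code differs. */
void sketch_check(const sketch_type *type, unsigned long feature)
{
    assert(sketch_has_feature_direct(type, feature)
           == sketch_has_feature_call(type, feature));
}
```

Both helpers are `static inline`, so the wrapper itself is not the problem. The cost comes from the extra call inside variant 2 when the build cannot inline it.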
The python.org macOS installers of Python 3.8.3 and Python 3.9.0b3 are *not* built with LTO or PGO: see Mac/BuildScript/build-installer.py. LTO and PGO can make Python between 10 and 30% faster, which is very significant.
I created https://bugs.python.org/issue41181 with a PR to enable LTO and PGO in the script building the macOS installer.
I confirm that using LTO+PGO, clang also inlines the PyType_GetFlags() function call in tuplegetter_descr_get(): https://bugs.python.org/issue41181#msg372744
Victor
--
Night gathers, and now my watch begins. It shall not end until my death.