PEP 670: Convert macros to functions in the Python C API
Hi,

Erlend and I wrote a PEP to move away from macros in the Python C API. We are now waiting for feedback :-) Read the PEP online: https://www.python.org/dev/peps/pep-0670/

There is a copy of the PEP below for inline replies.

Victor

---

PEP: 670
Title: Convert macros to functions in the Python C API
Author: Erlend Egeberg Aasland <erlend.aasland@protonmail.com>,
        Victor Stinner <vstinner@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Oct-2021
Python-Version: 3.11

Abstract
========

Convert macros to static inline functions or regular functions.

Remove the return value of macros that have one but should not, to help detect bugs in C extensions when the C API is misused.

Some function arguments are still cast to ``PyObject*`` to prevent emitting new compiler warnings.

Rationale
=========

The use of macros may have unintended adverse effects that are hard to avoid, even for experienced C developers. Some issues have been known for years, while others have been discovered recently in Python. Working around macro pitfalls makes the macro code harder to read and to maintain.

Converting macros to functions has multiple advantages:

* By design, functions don't have macro pitfalls.
* Argument types and return types are well defined.
* Debuggers and profilers can retrieve the name of inlined functions.
* Debuggers can put breakpoints on inlined functions.
* Variables have a well defined scope.
* Code is usually easier to read and to maintain than similar macro code.

Functions don't need the following workarounds for macro pitfalls:

* Add parentheses around arguments.
* Use line continuation characters if the macro body spans multiple lines.
* Add commas to execute multiple expressions.
* Use ``do { ... } while (0)`` to write multiple statements.
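To make the "duplication of side effects" pitfall concrete, here is a minimal sketch. All names in it (``SQUARE_MACRO``, ``square_func``, ``next_value``) are hypothetical and are not part of the Python C API; the point is only that a macro pastes its argument expression wherever the parameter appears, while a function evaluates it once.

```c
/* Hypothetical names, for illustration only; not part of the Python C API. */

/* Macro version: the argument expression is pasted twice into the body. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Function version: the argument is evaluated exactly once. */
static inline int square_func(int x)
{
    return x * x;
}

/* Helper with a side effect: counts how many times it is called. */
static int call_count = 0;

static int next_value(void)
{
    call_count++;
    return 3;
}

/* Number of next_value() calls triggered through the macro. */
static int calls_with_macro(void)
{
    call_count = 0;
    (void)SQUARE_MACRO(next_value());   /* expands to two calls */
    return call_count;
}

/* Number of next_value() calls triggered through the function. */
static int calls_with_function(void)
{
    call_count = 0;
    (void)square_func(next_value());    /* a single call */
    return call_count;
}
```

With the macro the helper runs twice, with the function it runs once. Parenthesizing the macro arguments does not help here; only evaluating the argument a single time does.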
Converting macros and static inline functions to regular functions makes these regular functions accessible to projects which use Python but cannot use macros and static inline functions.

Macro Pitfalls
==============

The `GCC documentation <https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html>`_ lists several common macro pitfalls:

- Misnesting
- Operator precedence problems
- Swallowing the semicolon
- Duplication of side effects
- Self-referential macros
- Argument prescan
- Newlines in arguments

Performance and inlining
========================

Static inline functions are a feature added in the C99 standard. Modern C compilers have efficient heuristics to decide if a function should be inlined or not.

When a C compiler decides not to inline, there is likely a good reason. For example, inlining would reuse a register, which requires saving and restoring the register value on the stack, and so increases the stack memory usage or is less efficient.

Debug build
-----------

When Python is built in debug mode, most compiler optimizations are disabled. For example, Visual Studio disables inlining. Benchmarks must not be run on a Python debug build, only on a release build: using LTO and PGO is recommended for reliable benchmarks. PGO helps the compiler decide whether a function should be inlined or not.

Force inlining
--------------

The ``Py_ALWAYS_INLINE`` macro can be used to force inlining. This macro uses ``__attribute__((always_inline))`` with GCC and Clang, and ``__forceinline`` with MSC.

So far, previous attempts to use ``Py_ALWAYS_INLINE`` didn't show any benefit and were abandoned. See for example: `bpo-45094 <https://bugs.python.org/issue45094>`_: "Consider using ``__forceinline`` and ``__attribute__((always_inline))`` on static inline functions (``Py_INCREF``, ``Py_TYPE``) for debug build".
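As a rough sketch of what such a compatibility macro can look like, the snippet below selects the compiler-specific spelling named in the text above. ``MY_ALWAYS_INLINE`` and ``add_one`` are hypothetical names chosen for this example; the empty fallback for other compilers is also an assumption of the sketch, not something the PEP specifies.

```c
/* Hypothetical compatibility macro; sketch only, not the CPython definition. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_ALWAYS_INLINE __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define MY_ALWAYS_INLINE __forceinline
#else
   /* Assumption: on unknown compilers, fall back to the normal heuristics. */
#  define MY_ALWAYS_INLINE
#endif

/* A static inline function that asks the compiler to always inline it. */
static inline MY_ALWAYS_INLINE int add_one(int x)
{
    return x + 1;
}
```

The function behaves the same with or without the attribute; only the generated machine code may differ, which is why the effect has to be checked with benchmarks or by reading the disassembly.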
When the ``Py_INCREF()`` macro was converted to a static inline function in 2018 (`commit <https://github.com/python/cpython/commit/2aaf0c12041bcaadd7f2cc5a54450eefd7a6ff12>`__), it was decided not to force inlining. The machine code was analyzed with multiple C compilers and compiler options: ``Py_INCREF()`` was always inlined without having to force inlining. The only case where it was not inlined was the debug build. See the discussion in `bpo-35059 <https://bugs.python.org/issue35059>`_: "Convert ``Py_INCREF()`` and ``PyObject_INIT()`` to inlined functions".

Disable inlining
----------------

On the other hand, the ``Py_NO_INLINE`` macro can be used to disable inlining. It is useful to reduce the stack memory usage. It is especially useful on an LTO+PGO build, which inlines code more aggressively: see `bpo-33720 <https://bugs.python.org/issue33720>`_. The ``Py_NO_INLINE`` macro uses ``__attribute__ ((noinline))`` with GCC and Clang, and ``__declspec(noinline)`` with MSC.

Specification
=============

Convert macros to static inline functions
-----------------------------------------

Most macros should be converted to static inline functions to prevent `macro pitfalls`_.

The following macros should not be converted:

* Empty macros. Example: ``#define Py_HAVE_CONDVAR``.
* Macros only defining a number, even if a constant with a well defined type would be better. Example: ``#define METH_VARARGS 0x0001``.
* Compatibility layer for different C compilers, C language extensions, or recent C features. Example: ``#define Py_ALWAYS_INLINE __attribute__((always_inline))``.

Convert static inline functions to regular functions
----------------------------------------------------

The performance impact of converting static inline functions to regular functions should be measured with benchmarks. If there is a significant slowdown, there should be a good reason to do the conversion. One reason can be hiding implementation details.
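The following sketch illustrates what "hiding implementation details" can mean in practice. Everything in it is hypothetical (``my_object``, ``my_object_refcnt``, ``my_object_refcnt_fast``): callers of the public regular function only need the declaration of the structure, while a static inline variant that reads the member directly stays on the implementation side.

```c
/* --- What a hypothetical public header would expose --- */
typedef struct my_object my_object;          /* opaque: no members visible */
int my_object_refcnt(const my_object *op);   /* regular function */

/* --- What would stay in the implementation (or internal API) --- */
struct my_object {
    int refcnt;
};

/* Fast variant: accesses the member directly, so it depends on the layout. */
static inline int my_object_refcnt_fast(const my_object *op)
{
    return op->refcnt;
}

/* Public entry point: callers never touch the structure layout. */
int my_object_refcnt(const my_object *op)
{
    return my_object_refcnt_fast(op);
}

/* Small demo living on the implementation side, where the layout is known. */
static int demo_refcnt(void)
{
    my_object obj = { 3 };
    return my_object_refcnt(&obj);
}
```

The trade-off is the one described above: the regular function costs a real call unless the build can inline across translation units, so the conversion should be backed by measurements.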
Using static inline functions in the internal C API is fine: the internal C API exposes implementation details by design and should not be used outside Python.

Cast to PyObject*
-----------------

When a macro is converted to a function and the macro casts its arguments to ``PyObject*``, the new function comes with a new macro which casts its arguments to ``PyObject*`` to prevent emitting new compiler warnings. So the converted functions still accept pointers to structures inheriting from ``PyObject`` (ex: ``PyTupleObject``).

For example, the ``Py_TYPE(obj)`` macro casts its ``obj`` argument to ``PyObject*``::

    #define _PyObject_CAST_CONST(op) ((const PyObject*)(op))

    static inline PyTypeObject* _Py_TYPE(const PyObject *ob) {
        return ob->ob_type;
    }
    #define Py_TYPE(ob) _Py_TYPE(_PyObject_CAST_CONST(ob))

The undocumented private ``_Py_TYPE()`` function must not be called directly. Only the documented public ``Py_TYPE()`` macro must be used.

Later, the cast can be removed on a case by case basis, but that is out of scope for this PEP.

Remove the return value
-----------------------

When a macro is implemented as an expression, it has an implicit return value. In some cases, the macro must not have a return value and can be misused in third party C extensions. See `bpo-30459 <https://bugs.python.org/issue30459>`_ for the example of the ``PyList_SET_ITEM()`` and ``PyCell_SET()`` macros. It is not easy to notice this issue while reviewing macro code.

These macros are converted to functions using the ``void`` return type to remove their return value. Removing the return value helps detect bugs in C extensions when the C API is misused.

Backwards Compatibility
=======================

Removing the return value of macros is an incompatible API change made on purpose: see the `Remove the return value`_ section.
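A small sketch of the implicit return value problem, using a hypothetical ``int_list`` container rather than the real ``PyList_SET_ITEM()``: an expression macro silently yields the assigned value, while a ``void`` cast or a ``void`` function makes the same misuse a compile-time error.

```c
/* Hypothetical container and names, for illustration only. */
typedef struct {
    int items[4];
} int_list;

/* Expression macro: the value of the assignment leaks out as a result. */
#define LIST_SET_ITEM_EXPR(list, i, v) ((list)->items[(i)] = (v))

/* Macro-only fix: casting to void removes the usable result. */
#define LIST_SET_ITEM_VOID(list, i, v) ((void)((list)->items[(i)] = (v)))

/* Function fix: the void return type documents and enforces the intent. */
static inline void list_set_item(int_list *list, int i, int v)
{
    list->items[i] = v;
}

static int demo_set_items(void)
{
    int_list list = { {0, 0, 0, 0} };

    /* Compiles: the caller can accidentally rely on the "return value". */
    int leaked = LIST_SET_ITEM_EXPR(&list, 0, 7);

    LIST_SET_ITEM_VOID(&list, 1, 8);
    list_set_item(&list, 2, 9);

    /* Both of the following lines would be rejected by the compiler:
     *   int a = LIST_SET_ITEM_VOID(&list, 3, 1);
     *   int b = list_set_item(&list, 3, 1);
     */
    return leaked + list.items[1] + list.items[2];
}
```

Code that compiled against the expression macro and used its result stops compiling after the change, which is the intentional incompatibility mentioned above.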
Rejected Ideas
==============

Keep macros, but fix some macro issues
--------------------------------------

Converting macros to functions is not needed to `remove the return value`_: casting a macro return value to ``void`` also fixes the issue. For example, the ``PyList_SET_ITEM()`` macro was already fixed like that.

Macros are always "inlined" with any C compiler.

The duplication of side effects can be worked around in the caller of the macro.

People using macros should be considered "consenting adults". People who feel unsafe with macros should simply not use them.

Examples of hard to read macros
===============================

_Py_NewReference()
------------------

Example showing the usage of an ``#ifdef`` inside a macro.

Python 3.7 macro (simplified code)::

    #ifdef COUNT_ALLOCS
    #  define _Py_INC_TPALLOCS(OP) inc_count(Py_TYPE(OP))
    #  define _Py_COUNT_ALLOCS_COMMA  ,
    #else
    #  define _Py_INC_TPALLOCS(OP)
    #  define _Py_COUNT_ALLOCS_COMMA
    #endif /* COUNT_ALLOCS */

    #define _Py_NewReference(op) (                   \
        _Py_INC_TPALLOCS(op) _Py_COUNT_ALLOCS_COMMA  \
        Py_REFCNT(op) = 1)

Python 3.8 function (simplified code)::

    static inline void _Py_NewReference(PyObject *op)
    {
        _Py_INC_TPALLOCS(op);
        Py_REFCNT(op) = 1;
    }

PyObject_INIT()
---------------

Example showing the usage of commas in a macro.

Python 3.7 macro::

    #define PyObject_INIT(op, typeobj) \
        ( Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), (op) )

Python 3.8 function (simplified code)::

    static inline PyObject*
    _PyObject_INIT(PyObject *op, PyTypeObject *typeobj)
    {
        Py_TYPE(op) = typeobj;
        _Py_NewReference(op);
        return op;
    }

    #define PyObject_INIT(op, typeobj) \
        _PyObject_INIT(_PyObject_CAST(op), (typeobj))

The function doesn't need the line continuation character. It has an explicit ``"return op;"`` rather than a surprising ``", (op)"`` at the end of the macro. It uses one short statement per line, rather than a single long line. Inside the function, the *op* argument has a well defined type: ``PyObject*``.
References
==========

* `bpo-45490 <https://bugs.python.org/issue45490>`_: [meta][C API] Avoid C macro pitfalls and usage of static inline functions (October 2021).
* `What to do with unsafe macros <https://discuss.python.org/t/what-to-do-with-unsafe-macros/7771>`_ (March 2021).
* `bpo-43502 <https://bugs.python.org/issue43502>`_: [C-API] Convert obvious unsafe macros to static inline functions (March 2021).

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.

--
Night gathers, and now my watch begins. It shall not end until my death.
Extra info that I didn't put in the PEP to keep the PEP short.

Since Python 3.8, multiple macros have already been converted, including Py_INCREF() and Py_TYPE(), which are very commonly used and so matter for Python performance.

Macros converted to static inline functions:

* Py_INCREF(), Py_DECREF(), Py_XINCREF(), Py_XDECREF(): Python 3.8
* PyObject_INIT(), PyObject_INIT_VAR(): Python 3.8
* Private functions: _PyObject_GC_TRACK(), _PyObject_GC_UNTRACK(), _Py_Dealloc(): Python 3.8
* Py_REFCNT(): Python 3.10
* Py_TYPE(), Py_SIZE(): Python 3.11

Macros converted to regular functions in Python 3.9:

* PyIndex_Check()
* PyObject_CheckBuffer()
* PyObject_GET_WEAKREFS_LISTPTR()
* PyObject_IS_GC()
* PyObject_NEW(): alias to PyObject_New()
* PyObject_NEW_VAR(): alias to PyObject_NewVar()

To keep the best performance on Python built without LTO, fast private variants were added as static inline functions to the internal C API:

* _PyIndex_Check()
* _PyObject_IS_GC()
* _PyType_HasFeature()
* _PyType_IS_GC()

--

Many of these changes have been made to prepare the C API for making these structures opaque:

* PyObject: https://bugs.python.org/issue39573
* PyTypeObject: https://bugs.python.org/issue40170

The idea is to not access structure members at the ABI level, but to abstract them behind a function call.

Some functions are still static inline functions (and so still access structure members at the ABI level), since the performance impact of converting them to regular functions has not been measured yet.

Victor
I think this info should be in the PEP. If the PEP is rejected, would all these previous changes need to be reverted? Or just the ones done in 3.11?
On Wed, Oct 20, 2021 at 10:58 AM Petr Viktorin <encukou@gmail.com> wrote:
I think this info should be in the PEP.
Ok, we added (and completed) the list to the PEP: https://www.python.org/dev/peps/pep-0670/#macros-converted-to-functions-sinc...
If the PEP is rejected, would all these previous changes need to be reverted? Or just the ones done in 3.11?
I don't know. I guess that can be decided if the PEP gets rejected :-)

Victor
One of my motivations for writing this PEP was to decide how to solve the issue "[C API] Disallow using PyFloat_AS_DOUBLE() as l-value": https://bugs.python.org/issue45476

I proposed two fixes:

* Convert macros to static inline functions: https://github.com/python/cpython/pull/28961
* Fix the macro, add _Py_RVALUE(): https://github.com/python/cpython/pull/28976

I would prefer static inline functions ;-)

Victor
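To show the difference between the two approaches, here is a minimal sketch with hypothetical names (``my_float``, ``MY_RVALUE``, ``my_float_as_double``). It is not the code from either pull request. It relies on the fact that, in C, the result of the comma operator is not an l-value, which is one way a macro can be turned into an r-value.

```c
/* Hypothetical type and names; sketch only, not the code from the PRs. */
typedef struct {
    double value;
} my_float;

/* Original style: expands to a structure member, so it is an l-value. */
#define MY_FLOAT_AS_DOUBLE_LVALUE(op) ((op)->value)

/* Macro fix: in C, the result of the comma operator is not an l-value. */
#define MY_RVALUE(expr) ((void)0, (expr))
#define MY_FLOAT_AS_DOUBLE_RVALUE(op) MY_RVALUE((op)->value)

/* Function fix: a function call result is never an l-value. */
static inline double my_float_as_double(const my_float *op)
{
    return op->value;
}

static double demo_lvalue(void)
{
    my_float f = { 1.5 };

    /* Accepted by the compiler: the macro can be assigned to. */
    MY_FLOAT_AS_DOUBLE_LVALUE(&f) = 2.5;

    /* Both of the following lines would be rejected by a C compiler:
     *   MY_FLOAT_AS_DOUBLE_RVALUE(&f) = 3.5;
     *   my_float_as_double(&f) = 3.5;
     */
    return MY_FLOAT_AS_DOUBLE_RVALUE(&f) + my_float_as_double(&f);
}
```

Both fixes reject the assignment. The function version additionally gives the argument a well defined type, while the macro version keeps accepting anything that has a ``value`` member.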
Well, I have discussed this issue hundreds of times with Victor Stinner. I believe this is the way we have to go: even if there is a very minor performance cost, it will not be a big hurdle. We can see from the benchmarks at https://speed.python.org/ that CPython keeps getting faster. Converting macros to functions will address the issue known as the implementation detail leak.

Regards,
Dong-hee

On Wed, Oct 20, 2021 at 9:59 AM, Victor Stinner <vstinner@python.org> wrote:
Hi,
Erlend and me wrote a PEP to move away from macros in the Python C API. We are now waiting for feedback :-) Read the PEP online: https://www.python.org/dev/peps/pep-0670/
On Wed, 20 Oct 2021 02:55:52 +0200 Victor Stinner <vstinner@python.org> wrote:
Debug build -----------
When Python is built in debug mode, most compiler optimizations are disabled. For example, Visual Studio disables inlining. Benchmarks must not be run on a Python debug build, only on release build: using LTO and PGO is recommended for reliable benchmarks. PGO helps the compiler to decide if function should be inlined or not.
So what is the performance impact on debug builds? The numbers should be given in the PEP.
Rejected Ideas ==============
Keep macros, but fix some macro issues --------------------------------------
Converting macros to functions is not needed to `remove the return value`_: casting a macro return value to ``void`` also fix the issue. For example, the ``PyList_SET_ITEM()`` macro was already fixed like that.
Macros are always "inlined" with any C compiler.
The duplication of side effects can be worked around in the caller of the macro.
People using macros should be considered "consenting adults". People who feel unsafe with macros should simply not use them.
This says that the idea is rejected, but it does not say *why* it was rejected. Can you add that? Regards Antoine.
Hi Antoine,

I completed the PEP: https://python.github.io/peps/pep-0670/

* Add benchmarks on a Python debug build: (1) macros vs static inline functions and (2) gcc -O0 vs gcc -Og
* Elaborate the Debug Build section
* Explain why the "keep macros" idea was rejected

Diff: https://github.com/python/peps/commit/570cea56c2fdb9f9b5873a0a83462816e641c5...

https://www.python.org/dev/peps/pep-0670/ will be updated soon.

From what I understood, debug builds are mostly used by Python core developers to develop Python and so an important use case for performance is running the Python test suite.

(1) Replacing macros with static inline functions makes Python 1.04x slower when the compiler **does not** inline static inline functions: gcc -O0. But developers using GCC and LLVM clang should get -Og when using "./configure --with-pydebug".

(2) Python built with "gcc -O0" is 1.6x slower than Python built with "gcc -Og". Well, don't use gcc -O0 if you care about performance :-)

I didn't run benchmarks on Python built in release mode, since gcc -O3 with LTO and PGO should inline all static inline functions and I don't expect any difference between macros and static inline functions.

You can use my PR https://github.com/python/cpython/pull/29728 to run your own benchmarks.

Victor
On Tue, 23 Nov 2021 18:00:28 +0100 Victor Stinner <vstinner@python.org> wrote:
From what I understood, debug builds are mostly used by Python core developers to develop Python and so an important use case for performance is running the Python test suite.
(1) Replacing macros with static inline functions makes Python 1.04x slower when the compiler **does not** inline static inline functions: gcc -O0.
That is fine with me :-)
I didn't run benchmarks on Python built in release mode, since gcc -O3 with LTO and PGO should inline all static inline functions and I don't expect any difference between macros and static inline functions.
That would actually be interesting, since there can be surprises sometimes with compilers... (function inlining depends on heuristics, for example, and there may be positive or negative interactions with other optimizations) Regards Antoine.
On Tue, Nov 23, 2021 at 3:15 PM Antoine Pitrou <antoine@python.org> wrote:
On Tue, 23 Nov 2021 18:00:28 +0100 Victor Stinner <vstinner@python.org> wrote:
I didn't run benchmarks on Python built in release mode, since gcc -O3 with LTO and PGO should inline all static inline functions and I don't expect any difference between macros and static inline functions.
That would actually be interesting, since there can be surprises sometimes with compilers... (function inlining depends on heuristics, for example, and there may be positive or negative interactions with other optimizations)
Thanks Antoine. We definitely need to push back on such "expectations" and turn them into facts by performing careful measurements. Surprises lurk everywhere. See e.g. https://github.com/faster-cpython/ideas/issues/109#issuecomment-975619113 (and watch the Emery Berger video linked there if you haven't already).

--
--Guido van Rossum (python.org/~guido)
I ran the Python test suite to compare macros versus static inline functions (using PR 29728). I built Python with gcc -O3, LTO and PGO optimizations.

=> There is *no* significant performance difference. I understand that static inline functions are inlined by the C compiler (GCC) as expected.

* Macros: 361 sec +- 1 sec
* Static inline functions: 361 sec +- 1 sec

$ python3 -m pyperf compare_to pgo-lto_test_suite_macros.json pgo-lto_test_suite_static_inline.json
Benchmark hidden because not significant (1): command

I built Python with:

$ ./configure --with-lto --enable-optimizations --prefix $PWD/install
$ taskset --cpu-list 2,3,6,7 make
$ make install

I then ran the following benchmark, which runs the test suite 5 times using pyperf, which pins the process to isolated CPUs:

$ python3 -m pyperf command -p1 --warmups=0 --loops=1 --values=5 -v -o ../pgo-lto_test_suite_macros.json -- ./bin/python3.11 -m test -j5

I isolated 4 logical CPUs (2, 3, 6 and 7) out of 8: physical CPU cores 2 and 3 (cores 0 and 1 are not isolated).

--

Right now, I cannot use pyperformance: it fails to create a virtual environment because greenlet fails to build with Python 3.11. On speed.python.org, the benchmarks are still running only because... pyperformance uses a cached binary wheel of greenlet. It looks dangerous to use a cached wheel, since the Python ABI (PyThreadState) changed!

Help is welcome to repair pyperformance: https://github.com/python/pyperformance/issues/113

Victor

On Wed, Nov 24, 2021 at 12:27 AM Guido van Rossum <guido@python.org> wrote:
On Tue, Nov 23, 2021 at 3:15 PM Antoine Pitrou <antoine@python.org> wrote:
On Tue, 23 Nov 2021 18:00:28 +0100 Victor Stinner <vstinner@python.org> wrote:
I didn't run benchmarks on Python built in release mode, since gcc -O3 with LTO and PGO should inline all static inline functions and I don't expect any difference between macros and static inline functions.
That would actually be interesting, since there can be surprises sometimes with compilers... (function inlining depends on heuristics, for example, and there may be positive or negative interactions with other optimizations)
Thanks Antoine. We definitely need to push back on such "expectations" and turn them into facts by performing careful measurements. Surprises lurk everywhere. See e.g. https://github.com/faster-cpython/ideas/issues/109#issuecomment-975619113 (and watch the Emery Berger video linked there if you haven't already).
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?) _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TPUARSPZ... Code of Conduct: http://python.org/psf/codeofconduct/
-- Night gathers, and now my watch begins. It shall not end until my death.
On 11/23/2021 6:21 PM, Guido van Rossum wrote:
Thanks Antoine. We definitely need to push back on such "expectations" and turn them into facts by performing careful measurements. Surprises lurk everywhere. See e.g. https://github.com/faster-cpython/ideas/issues/109#issuecomment-975619113 (and watch the Emery Berger video linked there if you haven't already).
Surprises indeed. When I discussed this with my daughter after watching it, she told me that in the Sims 2, with multiple mods and multiple player characters, the game loaded very slowly, taking several minutes on machines of the time. Some players discovered that it loaded minutes faster if player character names and corresponding filenames were limited to a subset of printable ASCII chars (no space, %, and some others). Someone even wrote a renamer program.

Has Python on Linux been run with the coz program yet?

-- Terry Jan Reedy
Brandt looked at coz for Python but it didn't seem to find anything useful -- it singled out random lines in the code. :-(
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
On 23. 11. 21 18:00, Victor Stinner wrote:
I completed the PEP: https://python.github.io/peps/pep-0670/
What I don't like about this PEP is that it documents changes that were already pushed, not planned ones. But, what's done is done...

Are there more macros that are yet to be converted to macros, other than the ones in GH-29728? If so, can you give a list?

Since this is about converting existing macros (and not writing new ones), can you talk about which of the "macro pitfalls" apply to the macros in CPython that were/will be changed?

The "Backwards Compatibility" section is very small. Can you give a list of macros which lost/will lose "return values"? Can you add the fact that some macros now can't be used as l-values? (and list which ones?) This change is also breaking existing code. Are there any other issues that break existing code? (Even code that, for example, shouldn't work according to Python documentation, but still works fine in practice.)

The "Cast to PyObject*" section talks about adding new private functions like _Py_TYPE, which are type-safe, but keeping old names (like Py_TYPE) as macros that do a cast. Could the newly added functions be made public from the start? (They could use names like Py_Type.) This would allow languages that don't have macros to use them directly, and if the non-typesafe macros are ever discouraged/deprecated/removed, this would allow writing compatible code now.
On Wed, Nov 24, 2021 at 10:59 AM Petr Viktorin <encukou@gmail.com> wrote:
Are there more macros that are yet to be converted to macros,
I suppose that you mean "to be converted to functions". Yes, there are many, it's the purpose of the PEP. I didn't provide a list. I would prefer to do it on a case by case basis, as I did previously. To answer your question: it's basically all macros, especially the ones defined by the public C API, except the ones excluded by the PEP: https://www.python.org/dev/peps/pep-0670/#convert-macros-to-static-inline-fu...
other than the ones in GH-29728?
The purpose of this PR is only to run benchmarks to compare the performance of macros versus static inline functions. The PR title is "Convert static inline to macros": it converts existing Python 3.11 static inline functions back to Python 3.6/3.7 macros. It's basically the opposite of the PEP ;-)
The "Backwards Compatibility" section is very small. Can you give a list of macros which lost/will lose "return values"?
https://bugs.python.org/issue45476 lists many of them. See also: https://github.com/python/cpython/pull/28976
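The "return value" mentioned in the question above can be illustrated with a minimal sketch. It assumes toy names (TOY_SET_MACRO, TOY_SET_VOID, toy_set) that are hypothetical and are not CPython code; it only shows how a setter macro written as an assignment expression accidentally yields a value, and how a void cast or a void function removes it.

```c
#include <assert.h>

/* Hypothetical setter macro (not CPython code): the expansion is an
   assignment expression, so it accidentally yields a value. */
#define TOY_SET_MACRO(slot, v) ((slot) = (v))

/* Same macro with the value discarded: usable only as a statement. */
#define TOY_SET_VOID(slot, v) ((void)((slot) = (v)))

/* Function form: the void return type makes the intent explicit. */
static inline void toy_set(int *slot, int v)
{
    *slot = v;
}

/* Misuse that compiles with the first macro: the "result" is just v. */
int misuse_macro_result(void)
{
    int slot = 5;
    int result = TOY_SET_MACRO(slot, 0);
    /* int r2 = TOY_SET_VOID(slot, 0);  would not compile */
    /* int r3 = toy_set(&slot, 0);      would not compile */
    return result;
}

/* Correct use: both forms are plain statements. */
int set_with_function(void)
{
    int slot = 5;
    toy_set(&slot, 7);
    TOY_SET_VOID(slot, slot + 1);
    return slot;
}
```

With the void forms, code that tests or stores the "result" of a setter stops compiling, which is how such misuse gets detected.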
Can you add the fact that some macros now can't be used as l-values?
If you are talking about my merged change preventing using Py_TYPE() as an l-value, this is out of the scope of the PEP on purpose.

Py_TYPE(), Py_REFCNT() and Py_SIZE() could be used as l-values in Python 3.9, but that is no longer the case in Python 3.11. Apart from that, I'm not aware of other macros which could be "abused" as l-values.

There are macros which can be "abused" ("used") to access structure members and object internals. For example, &PyTuple_GET_ITEM(tuple, 0) and &PyList_GET_ITEM(list, 0) can be "abused" to directly access an array of PyObject* (PyObject** type) and so directly modify a tuple/list. I would like to change that (disallow it), but it's out of the scope of the PEP. See https://bugs.python.org/issue41078 for my previous failed attempt (it broke too many things). But this is more in the scope of PEP 620, which is a different PEP.
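As a rough illustration of the &PyTuple_GET_ITEM(tuple, 0) pattern described above, here is a minimal sketch using a toy struct. ToyTuple, TOY_GET_ITEM_MACRO and toy_get_item are hypothetical names, not the real CPython definitions; the sketch assumes only that the macro expands to a direct member access.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a tuple-like object; NOT the real PyTupleObject. */
typedef struct {
    size_t size;
    int items[4];
} ToyTuple;

/* Macro version: expands to an l-value, so callers can take its address. */
#define TOY_GET_ITEM_MACRO(t, i) ((t)->items[(i)])

/* Function version: returns a copy of the item (not an l-value). */
static inline int toy_get_item(const ToyTuple *t, size_t i)
{
    return t->items[i];
}

/* Uses the macro to obtain a raw pointer into the object and modify it. */
int modify_through_macro(void)
{
    ToyTuple t = {4, {10, 20, 30, 40}};
    int *raw = &TOY_GET_ITEM_MACRO(&t, 0);  /* pointer into the object */
    raw[1] = 99;                            /* silently rewrites item 1 */
    /* &toy_get_item(&t, 0) would be a compile error: not an l-value. */
    return toy_get_item(&t, 1);
}
```

Once the accessor is a function, taking its address-of result no longer compiles, which is why such a conversion can break code relying on the raw array.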
Are there any other issues that break existing code?
I listed all known backward incompatible changes in the Backward Compatibility section. I'm not aware of other backward incompatible changes caused by the PEP.

Converting macros to static inline functions or regular functions didn't change the API for the macros already converted, the ones listed in the PEP.
The "Cast to PyObject*" section talks about adding new private functions like _Py_TYPE, which are type-safe, but keeping old names (like Py_TYPE) as macros that do a cast. Could the newly added functions be made public from the start? (They could use names like Py_Type.) This would allow languages that don't have macros to use them directly, and if the non-typesafe macros are ever discouraged/deprecated/removed, this would allow writing compatible code now.
I don't want to increase the size of the C API and so I chose to make the inner function accepting PyObject* private.

I see the addition of a hypothetical Py_Type() function as an increase of the maintenance burden: we would have to maintain it, document it, maybe add it to the limited C API / stable ABI, write tests, etc.

I prefer to restrict the scope of the PEP. If you want to add variants only accepting PyObject*, that's fine, but I suggest opening a separate issue / PEP. Also, it can be discussed on a case by case basis (function per function).

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
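The "private inner function plus casting macro" arrangement discussed above can be sketched as follows. This is a toy under stated assumptions: ToyObject, ToyFloat, toy_type_impl and TOY_TYPE are hypothetical stand-ins for PyObject, a concrete object type, _Py_TYPE and Py_TYPE, and the real CPython definitions may differ.

```c
#include <assert.h>

/* Toy stand-in for PyObject. */
typedef struct {
    int type_id;
} ToyObject;

/* Toy "subclass": the base object is the first member. */
typedef struct {
    ToyObject base;
    double value;
} ToyFloat;

/* Inner, type-safe function: accepts only ToyObject pointers. */
static inline int toy_type_impl(const ToyObject *ob)
{
    return ob->type_id;
}

/* Public name kept as a macro that casts its argument, so existing
   callers passing a ToyFloat* compile without new warnings. */
#define TOY_TYPE(ob) toy_type_impl((const ToyObject *)(ob))

/* Calls the public macro with a "subclass" pointer. */
int type_of_float(void)
{
    ToyFloat f = {{7}, 1.5};
    return TOY_TYPE(&f);
}
```

Calling toy_type_impl(&f) directly would produce an incompatible-pointer diagnostic, which is the type safety the inner function provides and the cast deliberately gives up.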
On 24. 11. 21 13:20, Victor Stinner wrote:
On Wed, Nov 24, 2021 at 10:59 AM Petr Viktorin <encukou@gmail.com> wrote:
Are there more macros that are yet to be converted to macros,
I suppose that you mean "to be converted to functions". Yes, there are many, it's the purpose of the PEP.
I didn't provide a list. I would prefer to do it on a case by case basis, as I did previously.
To answer your question: it's basically all macros, especially the ones defined by the public C API, except the ones excluded by the PEP: https://www.python.org/dev/peps/pep-0670/#convert-macros-to-static-inline-fu...
other than the ones in GH-29728?
The purpose of this PR is only to run benchmarks to compare the performance of macros versus static inline functions. The PR title is "Convert static inline to macros": it converts existing Python 3.11 static inline functions back to Python 3.6/3.7 macros. It's basically the opposite of the PEP ;-)
The "Backwards Compatibility" section is very small. Can you give a list of macros which lost/will lose "return values"?
https://bugs.python.org/issue45476 lists many of them. See also: https://github.com/python/cpython/pull/28976
Can you put this in the PEP? If things should be evaluated on a case-by-case basis, we should know about the cases. Also, this PR is about preventing the use of some macros as l-values, which you say is out of scope for the PEP. I'm confused.
Can you add the fact that some macros now can't be used as l-values?
If you are talking about my merged change preventing using Py_TYPE() as an l-value, this is out of the scope of the PEP on purpose.
Py_TYPE(), Py_REFCNT() and Py_SIZE() could be used as l-values in Python 3.9, but that is no longer the case in Python 3.11. Apart from that, I'm not aware of other macros which could be "abused" as l-values.
Wait, so this PEP is about converting macros to functions, but not about converting Py_SIZE to a function? I'm confused. Why is Py_SIZE listed in the PEP?
There are macros which can be "abused" ("used") to access structure members and object internals. For example, &PyTuple_GET_ITEM(tuple, 0) and &PyList_GET_ITEM(list, 0) can be "abused" to directly access an array of PyObject* (PyObject** type) and so directly modify a tuple/list. I would like to change that (disallow it), but it's out of the scope of the PEP. See https://bugs.python.org/issue41078 for my previous failed attempt (it broke too many things). But this is more in the scope of PEP 620, which is a different PEP.
Are there any other issues that break existing code?
I listed all known backward incompatible changes in the Backward Compatibility section. I'm not aware of other backward incompatible changes caused by the PEP.
Converting macros to static inline functions or regular functions didn't change the API for the macros already converted, the ones listed in the PEP.
It did for e.g. Py_SIZE, which no longer behaves like in 3.9, nor as it was documented in 3.8: https://docs.python.org/3.8/c-api/structures.html#c.Py_SIZE Yet Py_SIZE is listed in the PEP as "Macros converted to static inline functions", so clearly it is in scope. Same for Py_TYPE. Are there others?
The "Cast to PyObject*" section talks about adding new private functions like _Py_TYPE, which are type-safe, but keeping old names (like Py_TYPE) as macros that do a cast. Could the newly added functions be made public from the start? (They could use names like Py_Type.) This would allow languages that don't have macros to use them directly, and if the non-typesafe macros are ever discouraged/deprecated/removed, this would allow writing compatible code now.
I don't want to increase the size of the C API and so I chose to make the inner function accepting PyObject* private.
I see the addition of a hypothetical Py_Type() function as an increase of the maintenance burden: we would have to maintain it, document it, maybe add it to the limited C API / stable ABI, write tests, etc.
I prefer to restrict the scope of the PEP. If you want to add variants only accepting PyObject*, that's fine, but I suggest opening a separate issue / PEP. Also, it can be discussed on a case by case basis (function per function).
Since functions like _Py_TYPE will need to be maintained as part of the stable ABI, I'd like to do this right from the start. If you don't, can you add this to Rejected ideas? I'm still interested in:
Since this is about converting existing macros (and not writing new ones), can you talk about which of the "macro pitfalls" apply to the macros in CPython that were/will be changed? Is that just a theoretical issue?
On Wed, Nov 24, 2021 at 2:18 PM Petr Viktorin <encukou@gmail.com> wrote:
The "Backwards Compatibility" section is very small. Can you give a list of macros which lost/will lose "return values"?
https://bugs.python.org/issue45476 lists many of them. See also: https://github.com/python/cpython/pull/28976
Also, this PR is about preventing the use of some macros as l-values, which you say is out of scope for the PEP. I'm confused.
Oh right, now I'm also confused :-) I forgot about the details.

"Py_TYPE(obj) = new_type;" was used in 3rd party C extensions when defining static types to work around linker issues on Windows. Changing Py_TYPE() to disallow using it as an l-value is an incompatible change.

From what I saw in bpo-45476, the functions that I propose to change are not used as l-values. Technically, it's an incompatible change. In practice, it should not impact any 3rd party project.

For example, PyFloat_AS_DOUBLE() is used to read a float value (ex: "double x = PyFloat_AS_DOUBLE(obj);"), but not to set a float value (ex: "PyFloat_AS_DOUBLE(obj) = 1.0;").

Ok, I should clarify that in the PEP.
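To make the l-value point above concrete, here is a minimal sketch with a toy object. TOY_TYPE_MACRO, toy_type and toy_set_type are hypothetical names; they only mimic the shape of a macro getter, a function getter and a dedicated setter (CPython 3.9 added Py_SET_TYPE(), Py_SET_REFCNT() and Py_SET_SIZE() for the same purpose).

```c
#include <assert.h>

/* Toy stand-in for PyObject; type_id plays the role of ob_type. */
typedef struct {
    int type_id;
} ToyObject;

/* Macro getter: expands to a member access, so it is also an l-value. */
#define TOY_TYPE_MACRO(ob) ((ob)->type_id)

/* Function getter: the call expression is not an l-value. */
static inline int toy_type(const ToyObject *ob)
{
    return ob->type_id;
}

/* Dedicated setter that replaces "TOY_TYPE_MACRO(ob) = ...". */
static inline void toy_set_type(ToyObject *ob, int type_id)
{
    ob->type_id = type_id;
}

/* Old style: assignment through the macro compiles. */
int retype_with_macro(void)
{
    ToyObject ob = {1};
    TOY_TYPE_MACRO(&ob) = 2;
    return toy_type(&ob);
}

/* New style: "toy_type(&ob) = 2;" would not compile, use the setter. */
int retype_with_setter(void)
{
    ToyObject ob = {1};
    toy_set_type(&ob, 2);
    return toy_type(&ob);
}
```

Read-only uses such as "double x = PyFloat_AS_DOUBLE(obj);" correspond to toy_type() here and are unaffected by the conversion.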
Wait, so this PEP is about converting macros to functions, but not about converting Py_SIZE to a function? I'm confused. Why is Py_SIZE listed in the PEP?
Py_SIZE() is already converted to a static inline function. Later, it can be converted to a regular function if it makes sense.

It's listed in the PEP to show macros which are already converted, to help to estimate how many 3rd party applications would be affected by the PEP.

Py_REFCNT(), Py_TYPE() and Py_SIZE() are special because they were used as l-values on purpose. As far as I know, they were the only 3 macros used as l-values, no?

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
On 24. 11. 21 15:32, Victor Stinner wrote:
On Wed, Nov 24, 2021 at 2:18 PM Petr Viktorin <encukou@gmail.com> wrote:
The "Backwards Compatibility" section is very small. Can you give a list of macros which lost/will lose "return values"?
https://bugs.python.org/issue45476 lists many of them. See also: https://github.com/python/cpython/pull/28976
Also, this PR is about preventing the use of some macros as l-values, which you say is out of scope for the PEP. I'm confused.
Oh right, now I'm also confused :-) I forgot about the details.
"Py_TYPE(obj) = new_type;" was used in 3rd party C extensions when defining static types to work around linker issues on Windows. Changing Py_TYPE() to disallow using it as an l-value is an incompatible change.
From what I saw in bpo-45476, the functions that I propose to change are not used as l-value. Technically, it's an incompatible change. In practice, it should not impact any 3rd party project.
For example, PyFloat_AS_DOUBLE() is used to read a float value (ex: "double x = PyFloat_AS_DOUBLE(obj);"), but not to set a float value (ex: "PyFloat_AS_DOUBLE(obj) = 1.0;").
Ok, I should clarify that in the PEP.
Yes. *Each* incompatible change should be listed, even if you believe it won't affect any project. The PEP reader should be allowed to judge if your assumptions are correct. e.g. I've seen projects actually use "Py_TYPE(obj) = new_type;" to change an object's type after it was given to Python code. It would be great to document why that's wrong *and* what to do instead, both in the PEP that introduced the change and in the "What's New" entry.
Wait, so this PEP is about converting macros to functions, but not about converting Py_SIZE to a function? I'm confused. Why is Py_SIZE listed in the PEP?
Py_SIZE() is already converted to a static inline function. Later, it can be converted to a regular function if it makes sense.
It's listed in the PEP to show macros which are already converted, to help to estimate how many 3rd party applications would be affected by the PEP.
Is such an estimate available?
Py_REFCNT(), Py_TYPE() and Py_SIZE() are special because they were used as l-value on purpose. As far as I know, they were the only 3 macros used as l-value, no?
Who knows? If there's a list of what to change, someone can go through it and answer this for each macro.
On Wed, Nov 24, 2021 at 10:59 AM Petr Viktorin <encukou@gmail.com> wrote:
Since this is about converting existing macros (and not writing new ones), can you talk about which of the "macro pitfalls" apply to the macros in CPython that were/will be changed?
PEP 670 lists many pitfalls affecting existing macros. Some pitfalls are already worked around in the current implementations, but the point is that it's easy to miss pitfalls when reviewing code adding new macros or modifying macros.

Erlend did an analysis in: https://bugs.python.org/issue43502

For macros reusing arguments (known as "Duplication of side effects" in GCC Macro Pitfalls), see his list: https://bugs.python.org/file49877/macros-that-reuse-args.txt

Victor
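The "Duplication of side effects" pitfall mentioned above can be shown with a minimal sketch. SQUARE_MACRO, square_func and next_value are hypothetical names chosen for illustration; they are not taken from Erlend's list or from CPython.

```c
#include <assert.h>

/* Hypothetical macro: the argument appears twice in the expansion,
   so an argument with a side effect is evaluated twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Same computation as a static inline function: the argument is
   evaluated exactly once, before the call. */
static inline int square_func(int x)
{
    return x * x;
}

/* Side effect to observe: counts how many times it was called. */
static int calls = 0;

static int next_value(void)
{
    calls++;
    return 3;
}

/* Returns how many times next_value() ran when passed to the macro. */
int calls_with_macro(void)
{
    calls = 0;
    int result = SQUARE_MACRO(next_value());
    assert(result == 9);
    return calls;
}

/* Returns how many times next_value() ran when passed to the function. */
int calls_with_function(void)
{
    calls = 0;
    int result = square_func(next_value());
    assert(result == 9);
    return calls;
}
```

The result is the same in both cases here, which is what makes the pitfall easy to miss in review: only the number of evaluations differs (two for the macro, one for the function).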
On 24. 11. 21 15:22, Victor Stinner wrote:
On Wed, Nov 24, 2021 at 10:59 AM Petr Viktorin <encukou@gmail.com> wrote:
Since this is about converting existing macros (and not writing new ones), can you talk about which of the "macro pitfalls" apply to the macros in CPython that were/will be changed?
PEP 670 lists many pitfalls affecting existing macros. Some pitfalls are already worked around in the current implementations, but the point is that it's easy to miss pitfalls when reviewing code adding new macros or modifying macros.
Erlend did an analysis in: https://bugs.python.org/issue43502
For macros reusing arguments (known as "Duplication of side effects" in GCC Macro Pitfalls), see his list: https://bugs.python.org/file49877/macros-that-reuse-args.txt
That's a nice list. Could you link to it in the PEP, so the next person won't have to ask?

Meanwhile, I think I found a major source of my confusion with the PEP: I'm not clear on what it actually proposes. Is it justification for changes that were already done, or a plan for more changes, or a policy change ("don't write a public macro if it can be a function"), or all of those?
participants (6): Antoine Pitrou, Dong-hee Na, Guido van Rossum, Petr Viktorin, Terry Reedy, Victor Stinner