Steering Council reply to PEP 670 -- Convert macros to functions in the Python C API
Thank you for submitting PEP 670 (Convert macros to functions in the Python C API)! The steering council is still discussing it. We're not ready to accept the PEP as a whole, but there are parts on which we agree unanimously. To unblock the work, we're making the following pronouncements:

*All other things being equal, static inline functions are better than macros.*

Specifically, static inline functions should be preferred over function-like macros in new code. Please add a note about this to PEP 7.

*The performance impact of changing macros to static inline functions is negligible.*

The performance changes are indistinguishable from noise caused by other factors, such as layout changes and specific compiler optimizations. It's possible that this isn't true in all cases. If any are found, they should be kept as macros and added to the benchmark suite to avoid regressions. More generally, if there is a non-obvious reason for a conscious design decision, it should be explained (e.g. in a comment) to help future developers.

*When there is no backwards compatibility issue, macros can be changed to static inline functions.*

In effect, the SC accepts the first part of the PEP, except cases where:
- a macro is converted to a macro and a function ("Cast to PyObject*"), and
- the return value is removed.

Please do the changes in several smaller pull requests, don't skip reviews, and be prepared to revert if any issues are found. The straightforward macros should be converted first. When they are, we hope that it will become easier to evaluate the remaining ones.

- Petr, on behalf of the Steering Council
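To make the first pronouncement concrete, here is a minimal sketch (hypothetical names, not actual C API code):

    /* Function-like macro: no argument type, purely textual substitution. */
    #define PY_IS_POSITIVE(x) ((x) > 0)

    /* Static inline function, preferred for new code: the parameter has a
       real type the compiler can check, the argument is evaluated exactly
       once, and debuggers/profilers see a named function. */
    static inline int
    py_is_positive(long x)
    {
        return x > 0;
    }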
On 08. 02. 22 12:07, Erlend Aasland wrote:
Petr, SC, thanks for your time and response! Just a quick comment:
On 8 Feb 2022, at 11:32, Petr Viktorin <pviktori@redhat.com> wrote:
In effect, the SC accepts the first part of the PEP, except cases where:
What status does that put the PEP in?
Still draft. It's still on the SC agenda.
Hi Petr,

Thanks for the SC review, it's very helpful! I know that it's a big PEP :-)

On Tue, Feb 8, 2022 at 11:33 AM Petr Viktorin <pviktori@redhat.com> wrote:
*All other things being equal, static inline functions are better than macros.* Specifically, static inline functions should be preferred over function-like macros in new code. Please add a note about this to PEP 7.
Ok, I created: https://github.com/python/peps/pull/2315
*When there is no backwards compatibility issue, macros can be changed to static inline functions.* In effect, the SC accepts the first part of the PEP, except cases where: a macro is converted to a macro and a function ("Cast to PyObject*"),
Right now, a large number of macros cast their argument to a type. A few examples:

    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)
    #define PyTuple_GET_ITEM(op, i) (_PyTuple_CAST(op)->ob_item[i])
    #define PyDict_GET_SIZE(mp) (assert(PyDict_Check(mp)),((PyDictObject *)mp)->ma_used)
    #define PyWeakref_GET_OBJECT(ref) (... ((PyWeakReference *)(ref))->wr_object ...)

Does it mean that these macros must remain macros?

PEP 670 proposes adding a macro to still cast their arguments to the requested type, so the PEP doesn't add any new compiler warnings. The PEP proposes to consider removing these implicit casts later (to reduce the scope and limit the annoyance caused by the PEP).

If the macro is converted to a static inline function without such a convenience macro, C extensions which don't pass the expected types will get spammed by warnings. Well, it's easy to fix, and once it is fixed, the code remains compatible with old Python versions.

I'm not sure that I understood the SC statement here: does it mean that these specific macros (which cast their arguments) must remain macros, or that it's ok to convert them to static inline functions (even if it can emit new warnings)?

If the SC is ok with adding new compiler warnings, I'm fine with it. Most functions are documented with an exact type; they are not documented to implicitly cast their arguments to the expected types. For example, Py_INCREF() expects a PyObject* type in the documentation: https://docs.python.org/dev/c-api/refcounting.html#c.Py_INCREF

--

Or is the problem that having a macro wrap a static inline function requires changing the function name? Well, technically, it's not needed... I wrote a PR so the macro and the function have the same name: https://github.com/python/cpython/pull/31217

For example, the Py_INCREF() macro now calls Py_INCREF(), instead of calling _Py_INCREF(). Static inline functions are not part of the ABI since they are inlined, but it's maybe better to avoid the "_" prefix to avoid confusion in debuggers and profilers (especially when comparing old and new Python versions).
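As a sketch of how the same-name trick works (Py_Thing_GetSize is a made-up name; _PyObject_CAST is the existing helper that casts to PyObject*; assumes <Python.h> is included): the C preprocessor never re-expands a function-like macro's name inside its own replacement list, so the inner name below resolves to the function, not back to the macro:

    static inline Py_ssize_t
    Py_Thing_GetSize(PyObject *op)
    {
        return ((PyVarObject *)op)->ob_size;
    }

    /* Callers can pass any pointer type without a new warning; the inner
       Py_Thing_GetSize is not expanded again, so it is the function call. */
    #define Py_Thing_GetSize(op) Py_Thing_GetSize(_PyObject_CAST(op))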
and where the return value is removed.
In this case, I propose to leave these macros unchanged in PEP 670, as it already excludes macros which can be used as l-values (macros changed by PEP 674).

I would prefer to have the whole PEP 670 either accepted or rejected. I'm not comfortable with a half-accepted status; it's misleading for readers.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
On Tue, Feb 8, 2022 at 8:30 AM Victor Stinner <vstinner@python.org> wrote:
Hi Petr,
Thanks for the SC review, it's very helpful! I know that it's a big PEP :-)
On Tue, Feb 8, 2022 at 11:33 AM Petr Viktorin <pviktori@redhat.com> wrote:
*All other things being equal, static inline functions are better than macros.* Specifically, static inline functions should be preferred over function-like macros in new code. Please add a note about this to PEP 7.
Ok, I created: https://github.com/python/peps/pull/2315
*When there is no backwards compatibility issue, macros can be changed to static inline functions.* In effect, the SC accepts the first part of the PEP, except cases where: a macro is converted to a macro and a function ("Cast to PyObject*"),
Right now, a large number of macros cast their argument to a type. A few examples:
    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)
    #define PyTuple_GET_ITEM(op, i) (_PyTuple_CAST(op)->ob_item[i])
    #define PyDict_GET_SIZE(mp) (assert(PyDict_Check(mp)),((PyDictObject *)mp)->ma_used)
    #define PyWeakref_GET_OBJECT(ref) (... ((PyWeakReference *)(ref))->wr_object ...)
Does it mean that these macros must remain macros?
PEP 670 proposes adding a macro to still cast their arguments to the requested type, so the PEP doesn't add any new compiler warnings. The PEP proposes to consider removing these implicit casts later (to reduce the scope and limit the annoyance caused by the PEP).
If the macro is converted to a static inline function without such a convenience macro, C extensions which don't pass the expected types will get spammed by warnings. Well, it's easy to fix, and once it is fixed, the code remains compatible with old Python versions.
I'm not sure that I understood the SC statement here: does it mean that these specific macros (which cast their arguments) must remain macros, or that it's ok to convert them to static inline functions (even if it can emit new warnings)?
If the SC is ok with adding new compiler warnings, I'm fine with it. Most functions are documented with an exact type; they are not documented to implicitly cast their arguments to the expected types.
For example, Py_INCREF() expects a PyObject* type in the documentation: https://docs.python.org/dev/c-api/refcounting.html#c.Py_INCREF
--
Or is the problem that having a macro wrap a static inline function requires changing the function name? Well, technically, it's not needed... I wrote a PR so the macro and the function have the same name: https://github.com/python/cpython/pull/31217
For example, the Py_INCREF() macro now calls Py_INCREF(), instead of calling _Py_INCREF().
Static inline functions are not part of the ABI since they are inlined, but it's maybe better to avoid the "_" prefix to avoid confusion in debuggers and profilers (especially when comparing old and new Python versions).
and where the return value is removed.
In this case, I propose to leave these macros unchanged in PEP 670, as it already excludes macros which can be used as l-values (macros changed by PEP 674).
I would prefer to have the whole PEP 670 either accepted or rejected. I'm not comfortable with a half-accepted status; it's misleading for readers.
That's why we pointed out what is not controversial and can be done without a PEP. That lets you update the PEP for everything remaining, so we can focus on what really needs discussion instead of on things that are an obvious "yes" to us.
On 08. 02. 22 17:27, Victor Stinner wrote:
Hi Petr,
Thanks for the SC review, it's very helpful! I know that it's a big PEP :-)
On Tue, Feb 8, 2022 at 11:33 AM Petr Viktorin <pviktori@redhat.com> wrote:
*All other things being equal, static inline functions are better than macros.* Specifically, static inline functions should be preferred over function-like macros in new code. Please add a note about this to PEP 7.
Ok, I created: https://github.com/python/peps/pull/2315
*When there is no backwards compatibility issue, macros can be changed to static inline functions.* In effect, the SC accepts the first part of the PEP, except cases where: a macro is converted to a macro and a function ("Cast to PyObject*"),
Right now, a large number of macros cast their argument to a type. A few examples:
    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)
    #define PyTuple_GET_ITEM(op, i) (_PyTuple_CAST(op)->ob_item[i])
    #define PyDict_GET_SIZE(mp) (assert(PyDict_Check(mp)),((PyDictObject *)mp)->ma_used)
When I look at the Rationale points, for the first three of these macros I didn't find any that sound very convincing:

- Functions don't have macro pitfalls, but these simple macros don't fall into the pits.
- Fully defining the argument types means getting rid of the cast, breaking some code that uses the macro.
- Debugger support really isn't that useful for these simple macros.
- There are no new variables.
- Macro tricks (parentheses and comma-separated expressions) are needed, but they're present and tested.

Sure, if these were written now, they should be static inline functions. But I don't think it's worth a massive change to working code.

The "massive change to working code" part is important. Such efforts tend to have unexpected issues, which have an unfortunate tendency to surface before release and contribute to release manager burnout.
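For contrast, a made-up sketch of the classic pit these simple accessor macros avoid: their expansions use each argument exactly once, so there is no double evaluation.

    /* Each argument is expanded textually wherever it appears... */
    #define MY_MAX(a, b) ((a) > (b) ? (a) : (b))

    /* ...so MY_MAX(x++, y) increments x twice when x > y. Simple accessor
       macros like the ones quoted above expand each argument once and
       dodge this pitfall entirely. */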
#define PyWeakref_GET_OBJECT(ref) (... ((PyWeakReference *)(ref))->wr_object ...)
That one looks more reasonable. How common is it? How should we identify which macros are dangerous enough to need changing now?
Does it mean that these macros must remain macros?
For now, yes. I hope that changing the macros that are OK will make it easier to review the remaining ones, and it might also help future versions of the PEP to focus on the problematic cases.

Maybe it might also make sense to go from the other side: are there any macros that are clearly dangerous and should be changed ASAP? (I hope not -- those are hopefully handled one-by-one in individual issues.)
PEP 670 proposes adding a macro to still cast their arguments to the requested type, so the PEP doesn't add any new compiler warnings. The PEP proposes to consider removing these implicit casts later (to reduce the scope and limit the annoyance caused by the PEP).
If the macro is converted to a static inline function without such a convenience macro, C extensions which don't pass the expected types will get spammed by warnings. Well, it's easy to fix, and once it is fixed, the code remains compatible with old Python versions.
I'm not sure that I understood the SC statement here: does it mean that these specific macros (which cast their arguments) must remain macros, or that it's ok to convert them to static inline functions (even if it can emit new warnings)?
You understood correctly. It's not clear that there is enough reason to mass-change working code.
If the SC is ok with adding new compiler warnings, I'm fine with it. Most functions are documented with an exact type; they are not documented to implicitly cast their arguments to the expected types.
For example, Py_INCREF() expects a PyObject* type in the documentation: https://docs.python.org/dev/c-api/refcounting.html#c.Py_INCREF
--
Or is the problem that having a macro wrap a static inline function requires changing the function name? Well, technically, it's not needed... I wrote a PR so the macro and the function have the same name: https://github.com/python/cpython/pull/31217
For example, the Py_INCREF() macro now calls Py_INCREF(), instead of calling _Py_INCREF().
Static inline functions are not part of the ABI since they are inlined, but it's maybe better to avoid the "_" prefix to avoid confusion in debuggers and profilers (especially when comparing old and new Python versions).
and where the return value is removed.
In this case, I propose to leave these macros unchanged in PEP 670, as it already excludes macros which can be used as l-values (macros changed by PEP 674).
I would prefer to have the whole PEP 670 either accepted or rejected. I'm not comfortable with a half-accepted status; it's misleading for readers.
The SC is *not sure yet* if it can be accepted as-is. It continues to be like any other draft PEP: a proposal, which can still be changed.

You now can:
- wait for the SC to discuss the rest of the proposal (which will take time, and will probably involve more public discussion, like this).
- withdraw the PEP and write a new one.
- rewrite the PEP to focus on the remaining issues.

Petr (*not* writing on behalf of the entire SC this time)
On Wed, Feb 9, 2022 at 1:04 PM Petr Viktorin <encukou@gmail.com> wrote:
Right now, a large number of macros cast their argument to a type. A few examples:
    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)
    #define PyTuple_GET_ITEM(op, i) (_PyTuple_CAST(op)->ob_item[i])
    #define PyDict_GET_SIZE(mp) (assert(PyDict_Check(mp)),((PyDictObject *)mp)->ma_used)
When I look at the Rationale points, for the first three of these macros I didn't find any that sound very convincing:
- Functions don't have macro pitfalls, but these simple macros don't fall into the pits.
- Fully defining the argument types means getting rid of the cast, breaking some code that uses the macro.
- Debugger support really isn't that useful for these simple macros.
- There are no new variables.
Using a static inline function, profilers like Linux perf can count the CPU time spent in static inline functions (on each CPU instruction when using annotated assembly code). For example, you can see how much time (accumulated time) is spent in Py_INCREF(), to have an idea of the cost of reference counting in Python. It's not possible when using macros.

For debuggers, you're right that the Py_INCREF() and PyTuple_GET_ITEM() macros are very simple, and it's not so hard to guess that the debugger is executing their code in the C code or the assembly code. But the purpose of PEP 670 is to convert way more complex macros. I wrote a PR to convert the unicodeobject.h macros; IMO they are among the worst macros in the Python C API: https://github.com/python/cpython/pull/31221

I always wanted to convert them, but some core devs were afraid of performance regressions. So I wrote a PEP to prove that there is no impact on performance.

IMO the new unicodeobject.h code is way more readable. I added two "kind" variables which have a well-defined scope. In the macros, no such variable can be used, to avoid a macro pitfall: a name conflict if there is already a "kind" variable where the macro is used.

The conversion to static inline functions also detected a bug with "const PyObject*": using PyUnicode_READY() on a const Unicode string is wrong, since this function modifies the object if it's not ready (WCHAR kind). It also detected bugs on "const void *data" used to *write* into string characters (PyUnicode_WRITE).
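To sketch the "kind" name-conflict pitfall with made-up code (not the real unicodeobject.h; the ({ ... }) statement-expression form is a GNU extension):

    #include <stddef.h>

    typedef struct { int kind; const unsigned char *data; } my_str;  /* made-up type */

    /* A macro that introduces its own local can clash with or shadow a
       caller's variable of the same name: */
    #define READ_CHAR(s, i) \
        ({ int kind = (s)->kind; (s)->data[(i) * (size_t)kind]; })

    /* In a static inline function, "kind" has a well-defined scope and
       cannot conflict with the caller's names: */
    static inline unsigned char
    read_char(const my_str *s, size_t i)
    {
        int kind = s->kind;
        return s->data[i * (size_t)kind];
    }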
- Macro tricks (parentheses and comma-separated expressions) are needed, but they're present and tested.
The PEP rationale starts with: "The use of macros may have unintended adverse effects that are hard to avoid, even for experienced C developers. Some issues have been known for years, while others have been discovered recently in Python. Working around macro pitfalls makes the macro code harder to read and to maintain."

Are you saying that all core devs are well aware of all macro pitfalls and always avoid them? I'm well aware of these pitfalls, and I have fallen into their traps multiple times.

The bpo-30459 issue about PyList_SET_ITEM() is a concrete example of old bugs that nobody noticed before.
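For context, the pattern behind that kind of bug looks roughly like this (hypothetical names, not the exact CPython lines):

    typedef struct { int *items; } MyListObject;  /* made-up stand-in */

    /* The macro expands to an assignment expression, so broken code such as
       `if (MY_SET_ITEM(list, i, v) < 0) goto error;` compiles silently,
       with the `< 0` test applied to the stored value: */
    #define MY_SET_ITEM(op, i, v) (((MyListObject *)(op))->items[i] = (v))

    /* The fix for such macros casts the result to void, turning misuse of
       the "return value" into a compile-time error: */
    #define MY_SET_ITEM_FIXED(op, i, v) ((void)(((MyListObject *)(op))->items[i] = (v)))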
The "massive change to working code" part is important. Such efforts tend to have unexpected issues, which have an unfortunate tendency to surface before release and contribute to release manager burnout.
Aren't you exaggerating a bit? Would you mind elaborating? Do you have examples of issues caused by converting macros to static inline functions? I'm not talking about incompatible C API changes made on purpose, but about PEP 670 changes.

The PEP lists many macros converted to static inline functions and static inline functions converted to regular functions: https://www.python.org/dev/peps/pep-0670/#macros-converted-to-functions-sinc...

Are you aware of release manager burnout caused by these changes?

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
On Wed, Feb 9, 2022 at 8:54 AM Victor Stinner <vstinner@python.org> wrote:
On Wed, Feb 9, 2022 at 1:04 PM Petr Viktorin <encukou@gmail.com> wrote:
Right now, a large number of macros cast their argument to a type. A few examples:
    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)
    #define PyTuple_GET_ITEM(op, i) (_PyTuple_CAST(op)->ob_item[i])
    #define PyDict_GET_SIZE(mp) (assert(PyDict_Check(mp)),((PyDictObject *)mp)->ma_used)
When I look at the Rationale points, for the first three of these macros I didn't find any that sound very convincing:
- Functions don't have macro pitfalls, but these simple macros don't fall into the pits.
- Fully defining the argument types means getting rid of the cast, breaking some code that uses the macro.
- Debugger support really isn't that useful for these simple macros.
- There are no new variables.
Using a static inline function, profilers like Linux perf can count the CPU time spent in static inline functions (on each CPU instruction when using annotated assembly code). For example, you can see how much time (accumulated time) is spent in Py_INCREF(), to have an idea of the cost of reference counting in Python. It's not possible when using macros.

For debuggers, you're right that the Py_INCREF() and PyTuple_GET_ITEM() macros are very simple, and it's not so hard to guess that the debugger is executing their code in the C code or the assembly code. But the purpose of PEP 670 is to convert way more complex macros. I wrote a PR to convert the unicodeobject.h macros; IMO they are among the worst macros in the Python C API: https://github.com/python/cpython/pull/31221

I always wanted to convert them, but some core devs were afraid of performance regressions. So I wrote a PEP to prove that there is no impact on performance.

IMO the new unicodeobject.h code is way more readable. I added two "kind" variables which have a well-defined scope. In the macros, no such variable can be used, to avoid a macro pitfall: a name conflict if there is already a "kind" variable where the macro is used.

The conversion to static inline functions also detected a bug with "const PyObject*": using PyUnicode_READY() on a const Unicode string is wrong, since this function modifies the object if it's not ready (WCHAR kind). It also detected bugs on "const void *data" used to *write* into string characters (PyUnicode_WRITE).
Nice example PR. For now what we're suggesting is to proceed with changes where it cannot lead to compilation warnings (failures in the widely used -Werror mode) being introduced. Yes, that may still leave a lot of desirable cleanup to be done. But we don't want to block everything while discussing the rest. Incremental improvement is good even if not everything is being done yet.

Two differing examples from that PR 31221:

Hold off as unsafe for now: Macros that do things like (PyWhateverObject*)(op), such as PyUnicode_CHECK_INTERNED(op), should not have the casting part of the macro replaced yet. Doing so could result in a warning or failure at -Werror compile time if someone was not using a PyObject*. Code is *supposed* to, but the cast means anyone could've used PyUnicodeObject or whatnot itself. Perhaps use a hybrid approach when feasible, similar to:

    #define PyUnicode_CHECK_INTERNED(op) _PyUnicode_CHECK_INTERNED((PyASCIIObject *)(op))

That should make it safe. And I believe you do mention this technique as something to be used in the PEP.

Safe: PyUnicode_WRITE() in that PR. At first glance it is full of casts on its data and value parameters, so it raises suspicion. But further analysis shows that data becomes a void*, so there is no casting issue there unless someone were passing a non-pointer in, which isn't rightfully something code should *ever* have done. And value would only be an issue if someone were passing a non-integer type in; that also seems exceedingly unlikely, as there'd never be a point in writing code like that. So that kind of change is fine to proceed with.
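A reduced sketch of the -Werror hazard described above (FooObject and the function names are made up):

    typedef struct { long ob_refcnt; } MyBaseObject;
    typedef struct { MyBaseObject base; int interned; } FooObject;

    /* Old macro: the cast silently accepts any pointer type. */
    #define FOO_CHECK_INTERNED(op) (((FooObject *)(op))->interned)

    /* Converted without a casting macro in front: a caller holding a
       FooObject* (rather than MyBaseObject*) now triggers
       -Wincompatible-pointer-types, which fails a -Werror build. */
    static inline int
    foo_check_interned(MyBaseObject *op)
    {
        return ((FooObject *)op)->interned;
    }

    /* The hybrid approach keeps old callers compiling warning-free:
       #define foo_check_interned(op) foo_check_interned((MyBaseObject *)(op)) */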
The "massive change to working code" part is important. Such efforts
tend to have unexpected issues, which have an unfortunate tendency to surface before release and contribute to release manager burnout.
Aren't you exaggerating a bit? Would you mind elaborating? Do you have examples of issues caused by converting macros to static inline functions?
Not quite the same, but I've got a related example similar to what a macro casting its pointer argument becoming a function accepting PyObject* without a cast *could* do: Between 3.6 and 3.7 we added const to a number of our public Python C APIs. Migrating to 3.7 required updating all sorts of C and C++ extension module code to be pedantically correct, in some cases up through its call chain with matching const declarations on types (including conditional compilation based on the Python version to support compilation under both during the transition - which is more than a pointer cast change would normally entail). It was an unintended consequence of a seemingly minor API annotation for pedantic correctness.

Cast changes should be easier to fix. It's more the act of needing to fix things when not important that we want to avoid.
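A reduced illustration of that const migration (hypothetical names, not the actual API that changed):

    typedef struct { char *name; } MyObject;  /* made-up type */

    /* Before: char *get_name(MyObject *o);
       After adding const for pedantic correctness: */
    const char *get_name(const MyObject *o) { return o->name; }

    void use_name(MyObject *o)
    {
        /* A pre-existing `char *name = get_name(o);` now warns; the const
           has to propagate through every variable and function up the
           call chain. */
        const char *name = get_name(o);
        (void)name;
    }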
The PEP lists many macros converted to static inline functions and static inline functions converted to regular functions:
https://www.python.org/dev/peps/pep-0670/#macros-converted-to-functions-sinc...
Yep! Those are great success data points to have. Note that Py_INCREF remains a macro so that the cast stays in place:

    #define Py_INCREF(op) _Py_INCREF(_PyObject_CAST(op))

-gps
On 09. 02. 22 21:41, Gregory P. Smith wrote:
On Wed, Feb 9, 2022 at 8:54 AM Victor Stinner <vstinner@python.org> wrote:
[...]
Two differing examples from that PR 31221:
Hold off as unsafe for now: Macros that do things like (PyWhateverObject*)(op), such as PyUnicode_CHECK_INTERNED(op), should not have the casting part of the macro replaced yet. Doing so could result in a warning or failure at -Werror compile time if someone was not using a PyObject*. Code is *supposed* to, but the cast means anyone could've used PyUnicodeObject or whatnot itself. Perhaps use a hybrid approach when feasible, similar to:

    #define PyUnicode_CHECK_INTERNED(op) _PyUnicode_CHECK_INTERNED((PyASCIIObject *)(op))
That should make it safe. And I believe you do mention this technique as something to be used in the PEP.
That technique is mentioned in the PEP, but it looks like we just found a better way to do it than what the PEP suggests: the macro and the function can have the same name.

Are there any downsides to that? We don't know of any so far. Is it a widely used trick in some other codebases? Should we try to find people who use it regularly and ask what to do and what to watch out for?

If we just approved the PEP, a lot of the macros would probably get converted to what the PEP suggests. With a bit more time and discussion, we can arrive at a better solution.

This is a large change across the whole codebase. It won't be easy to fix it up later. It won't be easy to revert it if things go wrong. IMO this is a case where "never is better than *right* now". But it doesn't need to be "never".
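One property of the trick worth noting (standard C preprocessor behavior, sketched with made-up names): a function-like macro only expands where its name is directly followed by a parenthesis, so the function itself stays reachable:

    static inline int py_flag(int x) { return x != 0; }
    #define py_flag(x) py_flag((int)(x))

    int demo(long v)
    {
        int a = py_flag(v);    /* macro expands, adds the cast, calls the function */
        int b = (py_flag)(1);  /* parenthesized name suppresses the macro,
                                  calling the function directly */
        return a + b;
    }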
On Thu, Feb 10, 2022 at 10:58 AM Petr Viktorin <encukou@gmail.com> wrote:
On 09. 02. 22 21:41, Gregory P. Smith wrote:
On Wed, Feb 9, 2022 at 8:54 AM Victor Stinner <vstinner@python.org> wrote:

Perhaps use a hybrid approach when feasible, similar to:

    #define PyUnicode_CHECK_INTERNED(op) _PyUnicode_CHECK_INTERNED((PyASCIIObject *)(op))
That should make it safe. And I believe you do mention this technique as something to be used in the PEP.
It is exactly what the PEP proposes in the Specification section to avoid adding new compiler warnings: https://www.python.org/dev/peps/pep-0670/#cast-to-pyobject

Macro which is only there to add the cast, to avoid adding a compiler warning:

    #define Py_TYPE(ob) _Py_TYPE(_PyObject_CAST_CONST(ob))

Static inline function:

    static inline PyTypeObject* _Py_TYPE(const PyObject *ob) { ... }

Use the existing macro to do the cast:

    #define _PyObject_CAST_CONST(op) ((const PyObject*)(op))
That technique is mentioned in the PEP, but it looks like we just found a better way to do it than what the PEP suggests: the macro and the function can have the same name
Well, when I started to convert macros to static inline functions, I chose to use a different name to help *me* identify what is what: "Pyxxx" is the macro and "_Pyxxx" is the static inline function. It looked convenient at the beginning. It looked like a good idea...

Then I started to see "_Pyxxx" functions in debuggers and profilers. It's not convenient, since C extensions use the public "Pyxxx" (macro) name, and old Python versions do the same. Having a different name can cause practical issues: Lib/test/test_gdb.py and Tools/gdb/libpython.py have to search for multiple names for the same function in this case (to support old and new Python versions). The new private/secret "_Pyxxx" name is unexpected, so I proposed to abandon this bad habit and use the same name for the macro and for the static inline function.
Are there any downsides to that?
There is no downside. Macros are only for the C preprocessor; they are gone at the ABI level. The macro is just there to add a cast, to avoid adding new compiler warnings (since the old macro already did that).

Moreover, PEP 670 proposes to replace some static inline functions with regular functions to make them available to extension modules which cannot use static inline functions, like the vim text editor (written in C) which only loads symbols from libpython. Having a consistent name for macros, static inline functions and regular functions should help with that.

--

There is one exception: when a function is exposed as a static inline function *and* as a regular function. _Py_Dealloc() and Py_NewRef() for example.

Py_NewRef() is defined as a macro which calls the _Py_NewRef() static inline function (for best performance), and there is a regular Py_NewRef() function (exposed in the ABI). The regular function and the static inline function cannot have the same name:

    PyAPI_FUNC(PyObject*) Py_NewRef(PyObject *obj);

    static inline PyObject* _Py_NewRef(PyObject *obj)
    {
        Py_INCREF(obj);
        return obj;
    }

    #define Py_NewRef(obj) _Py_NewRef(_PyObject_CAST(obj))

Names:

* Macro: Py_NewRef
* Regular function: Py_NewRef
* Static inline function: _Py_NewRef (different name)

If _Py_NewRef() is renamed to Py_NewRef(), the compiler fails with:

    ./Include/object.h:597:25: error: static declaration of 'Py_NewRef' follows non-static declaration

Another example is _Py_Dealloc(), declared as a macro + static inline function (best performance) *and* a regular function (to expose it in the ABI): the static inline function has a different name.

IMO the most important name is the regular function name, since it's the one which lands in the libpython ABI. Static inline functions are *not* part of the libpython ABI; they are either inlined or copied as regular functions (depending on what the compiler prefers) in each extension module using them.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
On Wed, 9 Feb 2022 17:49:19 +0100 Victor Stinner <vstinner@python.org> wrote:
On Wed, Feb 9, 2022 at 1:04 PM Petr Viktorin <encukou@gmail.com> wrote:
Right now, a large number of macros cast their argument to a type. A few examples:
    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)
    #define PyTuple_GET_ITEM(op, i) (_PyTuple_CAST(op)->ob_item[i])
    #define PyDict_GET_SIZE(mp) (assert(PyDict_Check(mp)),((PyDictObject *)mp)->ma_used)
When I look at the Rationale points, for the first three of these macros I didn't find any that sound very convincing:
- Functions don't have macro pitfalls, but these simple macros don't fall into the pits.
- Fully defining the argument types means getting rid of the cast, breaking some code that uses the macro.
- Debugger support really isn't that useful for these simple macros.
- There are no new variables.
Using a static inline function, profilers like Linux perf can count the CPU time spent in static inline functions (on each CPU instruction when using annotated assembly code). For example, you can see how much time (accumulated time) is spent in Py_INCREF(), to have an idea of the cost of reference counting in Python.
The "time spent in Py_INCREF" doesn't tell you the cost of reference counting. Modern CPUs execute code out-of-order and rely on many internal structures (such as branch predictors, reorder buffers...). The *visible* time elapsed between the instruction pointer entering and leaving a function doesn't tell you whether Py_INCREF had adverse effects on the utilization of such internal structures (making reference counting more costly than it appears to be), or on the contrary whether the instructions in Py_INCREF were successfully overlapped with other computations (making reference counting practically free). The only reliable way to evaluate the cost of reference counting is to compare it against alternatives in realistic scenarios. Regards Antoine.
It's not possible when using macros.
For debuggers, you're right that the Py_INCREF() and PyTuple_GET_ITEM() macros are very simple, and it's not so hard to guess that the debugger is executing their code in the C code or the assembly code. But the purpose of PEP 670 is to convert way more complex macros. I wrote a PR to convert the unicodeobject.h macros; IMO they are among the worst macros in the Python C API: https://github.com/python/cpython/pull/31221

I always wanted to convert them, but some core devs were afraid of performance regressions. So I wrote a PEP to prove that there is no impact on performance.

IMO the new unicodeobject.h code is way more readable. I added two "kind" variables which have a well-defined scope. In the macros, no such variable can be used, to avoid a macro pitfall: a name conflict if there is already a "kind" variable where the macro is used.

The conversion to static inline functions also detected a bug with "const PyObject*": using PyUnicode_READY() on a const Unicode string is wrong, since this function modifies the object if it's not ready (WCHAR kind). It also detected bugs on "const void *data" used to *write* into string characters (PyUnicode_WRITE).
- Macro tricks (parentheses and comma-separated expressions) are needed, but they're present and tested.
The PEP rationale starts with: "The use of macros may have unintended adverse effects that are hard to avoid, even for experienced C developers. Some issues have been known for years, while others have been discovered recently in Python. Working around macro pitfalls makes the macro code harder to read and to maintain."

Are you saying that all core devs are well aware of all macro pitfalls and always avoid them? I'm well aware of these pitfalls, and I have fallen into their traps multiple times.
The bpo-30459 issue about PyList_SET_ITEM() is a concrete example of old bugs that nobody noticed before.
The "massive change to working code" part is important. Such efforts tend to have unexpected issues, which have an unfortunate tendency to surface before release and contribute to release manager burnout.
Aren't you exaggerating a bit? Would you mind elaborating? Do you have examples of issues caused by converting macros to static inline functions?
I'm not talking about incompatible C API changes made on purpose, but about PEP 670 changes.
The PEP lists many macros converted to static inline functions and static inline functions converted to regular functions: https://www.python.org/dev/peps/pep-0670/#macros-converted-to-functions-sinc...
Are you aware of release manager burnout caused by these changes?
Victor
Well, maybe it's a bad example. I just wanted to say that converting macros to static inline functions provides more accurate profiler data, and debuggers can more easily show static inline function names when they are inlined and put breakpoints on them. But you're right, it's not a silver bullet ;-)

Victor

On Mon, Feb 14, 2022 at 11:29 AM Antoine Pitrou <antoine@python.org> wrote:
On Wed, 9 Feb 2022 17:49:19 +0100 Victor Stinner <vstinner@python.org> wrote:
On Wed, Feb 9, 2022 at 1:04 PM Petr Viktorin <encukou@gmail.com> wrote:
Right now, a large number of macros cast their argument to a type. A few examples:
    #define PyObject_TypeCheck(ob, type) PyObject_TypeCheck(_PyObject_CAST(ob), type)
    #define PyTuple_GET_ITEM(op, i) (_PyTuple_CAST(op)->ob_item[i])
    #define PyDict_GET_SIZE(mp) (assert(PyDict_Check(mp)),((PyDictObject *)mp)->ma_used)
When I look at the Rationale points, for the first three of these macros I didn't find any that sound very convincing:
- Functions don't have macro pitfalls, but these simple macros don't fall into the pits.
- Fully defining the argument types means getting rid of the cast, breaking some code that uses the macro.
- Debugger support really isn't that useful for these simple macros.
- There are no new variables.
Using a static inline function, profilers like Linux perf can count the CPU time spent in static inline functions (on each CPU instruction when using annotated assembly code). For example, you can see how much time (accumulated time) is spent in Py_INCREF(), to have an idea of the cost of reference counting in Python.
The "time spent in Py_INCREF" doesn't tell you the cost of reference counting. Modern CPUs execute code out-of-order and rely on many internal structures (such as branch predictors, reorder buffers...). The *visible* time elapsed between the instruction pointer entering and leaving a function doesn't tell you whether Py_INCREF had adverse effects on the utilization of such internal structures (making reference counting more costly than it appears to be), or on the contrary whether the instructions in Py_INCREF were successfully overlapped with other computations (making reference counting practically free).
The only reliable way to evaluate the cost of reference counting is to compare it against alternatives in realistic scenarios.
Regards
Antoine.
It's not possible when using macros.
For debuggers, you're right that the Py_INCREF() and PyTuple_GET_ITEM() macros are very simple, and it's not so hard to guess that the debugger is executing their code in the C code or the assembly code. But the purpose of PEP 670 is to convert way more complex macros. I wrote a PR to convert the unicodeobject.h macros; IMO they are among the worst macros in the Python C API: https://github.com/python/cpython/pull/31221

I always wanted to convert them, but some core devs were afraid of performance regressions. So I wrote a PEP to prove that there is no impact on performance.

IMO the new unicodeobject.h code is way more readable. I added two "kind" variables which have a well-defined scope. In the macros, no such variable can be used, to avoid a macro pitfall: a name conflict if there is already a "kind" variable where the macro is used.

The conversion to static inline functions also detected a bug with "const PyObject*": using PyUnicode_READY() on a const Unicode string is wrong, since this function modifies the object if it's not ready (WCHAR kind). It also detected bugs on "const void *data" used to *write* into string characters (PyUnicode_WRITE).
- Macro tricks (parentheses and comma-separated expressions) are needed, but they're present and tested.
The PEP rationale starts with: "The use of macros may have unintended adverse effects that are hard to avoid, even for experienced C developers. Some issues have been known for years, while others have been discovered recently in Python. Working around macro pitfalls makes the macro code harder to read and to maintain."

Are you saying that all core devs are well aware of all macro pitfalls and always avoid them? I'm well aware of these pitfalls, and I have fallen into their traps multiple times.
The bpo-30459 issue about PyList_SET_ITEM() is a concrete example of old bugs that nobody noticed before.
The "massive change to working code" part is important. Such efforts tend to have unexpected issues, which have an unfortunate tendency to surface before release and contribute to release manager burnout.
Aren't you exaggerating a bit? Would you mind elaborating? Do you have examples of issues caused by converting macros to static inline functions?
I'm not talking about incompatible C API changes made on purpose, but about PEP 670 changes.
The PEP lists many macros converted to static inline functions and static inline functions converted to regular functions: https://www.python.org/dev/peps/pep-0670/#macros-converted-to-functions-sinc...
Are you aware of release manager burnout caused by these changes?
Victor
-- Night gathers, and now my watch begins. It shall not end until my death.
participants (7): Antoine Pitrou, Brett Cannon, Erlend Aasland, Gregory P. Smith, Petr Viktorin, Petr Viktorin, Victor Stinner