PEP 674 – Disallow using macros as l-value (version 2)
Hi,

I made the following changes to the PEP since version 1 (November 30):

* Elaborate the HPy section and add a new "Relationship with the HPy project" section
* Elaborate which projects are impacted and which changes are needed
* Remove the PyPy section

The PEP can be read online: https://python.github.io/peps/pep-0674/

And you can find the plain text below.

Victor

PEP: 674
Title: Disallow using macros as l-value
Author: Victor Stinner <vstinner@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Nov-2021
Python-Version: 3.11

Abstract
========

Incompatible C API change disallowing using macros as l-values to:

* Allow evolving CPython internals;
* Ease the C API implementation on other Python implementations;
* Help migrating existing C extensions to the HPy API.

On the PyPI top 5000 projects, only 14 projects (0.3%) are affected by 4 macro changes. Moreover, 24 projects just have to regenerate their Cython code to use ``Py_SET_TYPE()``.

In practice, the majority of affected projects only have to make two changes:

* Replace ``Py_TYPE(obj) = new_type;`` with ``Py_SET_TYPE(obj, new_type);``.
* Replace ``Py_SIZE(obj) = new_size;`` with ``Py_SET_SIZE(obj, new_size);``.

Rationale
=========

Using a macro as an l-value
---------------------------

In the Python C API, some functions are implemented as macros because writing a macro is simpler than writing a regular function. If a macro directly exposes a structure member, it is technically possible to use the macro not only to get the structure member but also to set it.

Example with the Python 3.10 ``Py_TYPE()`` macro::

    #define Py_TYPE(ob) (((PyObject *)(ob))->ob_type)

This macro can be used as an **r-value** to **get** an object's type::

    type = Py_TYPE(object);

It can also be used as an **l-value** to **set** an object's type::

    Py_TYPE(object) = new_type;

It is also possible to set an object's reference count and an object's size using the ``Py_REFCNT()`` and ``Py_SIZE()`` macros.
Directly setting an object attribute relies on the current exact CPython implementation. Implementing this feature in other Python implementations can make their C API implementation less efficient.

CPython nogil fork
------------------

Sam Gross forked Python 3.9 to remove the GIL: the `nogil branch <https://github.com/colesbury/nogil/>`_. This fork has no ``PyObject.ob_refcnt`` member, but a more elaborate implementation of reference counting, and so the ``Py_REFCNT(obj) = new_refcnt;`` code fails with a compiler error. Merging the nogil fork into the upstream CPython main branch first requires fixing this C API compatibility issue. It is a concrete example of a Python optimization blocked indirectly by the C API.

This issue was already fixed in Python 3.10: the ``Py_REFCNT()`` macro has already been modified to disallow using it as an l-value.

These statements are endorsed by Sam Gross (nogil developer).

HPy project
-----------

The `HPy project <https://hpyproject.org/>`_ is a brand new C API for Python using only handles and function calls: handles are opaque, structure members cannot be accessed directly, and pointers cannot be dereferenced.

Searching and replacing ``Py_SET_SIZE()`` is easier and safer than searching and replacing some strange macro uses of ``Py_SIZE()``. ``Py_SIZE()`` can be semi-mechanically replaced by ``HPy_Length()``, whereas seeing ``Py_SET_SIZE()`` would immediately make clear that the code needs bigger changes in order to be ported to HPy (for example by using ``HPyTupleBuilder`` or ``HPyListBuilder``).

The fewer internal details exposed via macros, the easier it will be for HPy to provide direct equivalents. Any macro that references "non-public" interfaces effectively exposes those interfaces publicly.

These statements are endorsed by Antonio Cuni (HPy developer).
GraalVM Python
--------------

In GraalVM, when a Python object is accessed by the Python C API, the C API emulation layer has to wrap the GraalVM objects into wrappers that expose the internal structure of the CPython structures (PyObject, PyLongObject, PyTypeObject, etc). This is because when the C code accesses it directly or via macros, all GraalVM can intercept is a read at the struct offset, which has to be mapped back to the representation in GraalVM. The smaller the "effective" number of exposed struct members (by replacing macros with functions), the simpler the GraalVM wrappers can be.

This PEP alone is not enough to get rid of the wrappers in GraalVM, but it is a step towards this long term goal. GraalVM already supports HPy which is a better solution in the long term.

These statements are endorsed by Tim Felgentreff (GraalVM Python developer).

Specification
=============

Disallow using macros as l-value
--------------------------------

PyObject and PyVarObject macros
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* ``Py_TYPE()``: ``Py_SET_TYPE()`` must be used instead
* ``Py_SIZE()``: ``Py_SET_SIZE()`` must be used instead

"GET" macros
^^^^^^^^^^^^

* ``PyByteArray_GET_SIZE()``
* ``PyBytes_GET_SIZE()``
* ``PyCFunction_GET_CLASS()``
* ``PyCFunction_GET_FLAGS()``
* ``PyCFunction_GET_FUNCTION()``
* ``PyCFunction_GET_SELF()``
* ``PyCell_GET()``
* ``PyCode_GetNumFree()``
* ``PyDict_GET_SIZE()``
* ``PyFunction_GET_ANNOTATIONS()``
* ``PyFunction_GET_CLOSURE()``
* ``PyFunction_GET_CODE()``
* ``PyFunction_GET_DEFAULTS()``
* ``PyFunction_GET_GLOBALS()``
* ``PyFunction_GET_KW_DEFAULTS()``
* ``PyFunction_GET_MODULE()``
* ``PyHeapType_GET_MEMBERS()``
* ``PyInstanceMethod_GET_FUNCTION()``
* ``PyList_GET_SIZE()``
* ``PyMemoryView_GET_BASE()``
* ``PyMemoryView_GET_BUFFER()``
* ``PyMethod_GET_FUNCTION()``
* ``PyMethod_GET_SELF()``
* ``PySet_GET_SIZE()``
* ``PyTuple_GET_SIZE()``
* ``PyUnicode_GET_DATA_SIZE()``
* ``PyUnicode_GET_LENGTH()``
* ``PyUnicode_GET_SIZE()``
* ``PyWeakref_GET_OBJECT()``

"AS" macros
^^^^^^^^^^^

* ``PyByteArray_AS_STRING()``
* ``PyBytes_AS_STRING()``
* ``PyFloat_AS_DOUBLE()``
* ``PyUnicode_AS_DATA()``
* ``PyUnicode_AS_UNICODE()``

PyUnicode macros
^^^^^^^^^^^^^^^^

* ``PyUnicode_1BYTE_DATA()``
* ``PyUnicode_2BYTE_DATA()``
* ``PyUnicode_4BYTE_DATA()``
* ``PyUnicode_DATA()``
* ``PyUnicode_IS_ASCII()``
* ``PyUnicode_IS_COMPACT()``
* ``PyUnicode_IS_READY()``
* ``PyUnicode_KIND()``
* ``PyUnicode_READ()``
* ``PyUnicode_READ_CHAR()``

PyDateTime "GET" macros
^^^^^^^^^^^^^^^^^^^^^^^

* ``PyDateTime_DATE_GET_FOLD()``
* ``PyDateTime_DATE_GET_HOUR()``
* ``PyDateTime_DATE_GET_MICROSECOND()``
* ``PyDateTime_DATE_GET_MINUTE()``
* ``PyDateTime_DATE_GET_SECOND()``
* ``PyDateTime_DATE_GET_TZINFO()``
* ``PyDateTime_DELTA_GET_DAYS()``
* ``PyDateTime_DELTA_GET_MICROSECONDS()``
* ``PyDateTime_DELTA_GET_SECONDS()``
* ``PyDateTime_GET_DAY()``
* ``PyDateTime_GET_MONTH()``
* ``PyDateTime_GET_YEAR()``
* ``PyDateTime_TIME_GET_FOLD()``
* ``PyDateTime_TIME_GET_HOUR()``
* ``PyDateTime_TIME_GET_MICROSECOND()``
* ``PyDateTime_TIME_GET_MINUTE()``
* ``PyDateTime_TIME_GET_SECOND()``
* ``PyDateTime_TIME_GET_TZINFO()``

PyDescr macros
^^^^^^^^^^^^^^

* ``PyDescr_NAME()``
* ``PyDescr_TYPE()``

Port C extensions to Python 3.11
--------------------------------

In practice, the majority of projects affected by this PEP only have to make two changes:

* Replace ``Py_TYPE(obj) = new_type;`` with ``Py_SET_TYPE(obj, new_type);``.
* Replace ``Py_SIZE(obj) = new_size;`` with ``Py_SET_SIZE(obj, new_size);``.

The `pythoncapi_compat project <https://github.com/pythoncapi/pythoncapi_compat>`_ can be used to automatically update C extensions: add Python 3.11 support without losing support for older Python versions. The project provides a header file which provides the ``Py_SET_REFCNT()``, ``Py_SET_TYPE()`` and ``Py_SET_SIZE()`` functions to Python 3.8 and older.
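For projects that prefer to vendor the fallback themselves rather than depend on pythoncapi_compat, the header's approach can be approximated in a few lines. This is a sketch, not the actual pythoncapi_compat header: it assumes it is included after ``<Python.h>``, and the version check (``Py_SET_TYPE()`` and ``Py_SET_SIZE()`` appeared in Python 3.9.0a4) mirrors the one used by that project:

```c
/* Include after <Python.h>. On Python 3.9 and newer the functions
   already exist, so the fallbacks only apply to older versions. */
#if PY_VERSION_HEX < 0x030900A4 && !defined(Py_SET_TYPE)
static inline void _Py_SET_TYPE(PyObject *ob, PyTypeObject *type)
{
    ob->ob_type = type;
}
#define Py_SET_TYPE(ob, type) _Py_SET_TYPE((PyObject*)(ob), type)
#endif

#if PY_VERSION_HEX < 0x030900A4 && !defined(Py_SET_SIZE)
static inline void _Py_SET_SIZE(PyVarObject *ob, Py_ssize_t size)
{
    ob->ob_size = size;
}
#define Py_SET_SIZE(ob, size) _Py_SET_SIZE((PyVarObject*)(ob), size)
#endif
```

With such a shim in place, an extension can use ``Py_SET_TYPE()`` and ``Py_SET_SIZE()`` unconditionally and still build on Python 3.8 and older.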
PyTuple_GET_ITEM() and PyList_GET_ITEM()
----------------------------------------

The ``PyTuple_GET_ITEM()`` and ``PyList_GET_ITEM()`` macros are left unchanged. The code patterns ``&PyTuple_GET_ITEM(tuple, 0)`` and ``&PyList_GET_ITEM(list, 0)`` are still commonly used to get access to the inner ``PyObject**`` array. Changing these macros is out of the scope of this PEP.

Backwards Compatibility
=======================

The proposed C API changes are backward incompatible on purpose.

As of December 1, 2021, a code search on the PyPI top 5000 projects (4760 projects in practice; the others don't have a source archive) found that `only 14 projects are affected <https://bugs.python.org/issue45476#msg407456>`_ (0.3%):

* datatable (1.0.0)
* frozendict (2.1.1)
* guppy3 (3.1.2)
* M2Crypto (0.38.0)
* mecab-python3 (1.0.4)
* mypy (0.910)
* Naked (0.1.31)
* pickle5 (0.0.12)
* pysha3 (1.0.2)
* python-snappy (0.6.0)
* recordclass (0.16.3)
* scipy (1.7.3)
* zodbpickle (2.2.0)
* zstd (1.5.0.2)

These 14 projects only use 4 macros as l-values:

* ``PyDescr_NAME()`` and ``PyDescr_TYPE()`` (2 projects)
* ``Py_SIZE()`` (8 projects)
* ``Py_TYPE()`` (4 projects)

Moreover, `24 projects just have to regenerate their Cython code <https://bugs.python.org/issue45476#msg407416>`_ to use ``Py_SET_TYPE()``.

This change does not follow the PEP 387 deprecation process. There is no known way to emit a deprecation warning only when a macro is used as an l-value, but not when it's used differently (e.g. as an r-value).

Relationship with the HPy project
=================================

The HPy project
---------------

The hope with the HPy project is indeed to provide a C API that is close to the original API to make porting easy and have it perform as close to the existing API as possible. At the same time, HPy is sufficiently removed to be a good "C extension API" (as opposed to a stable subset of the CPython implementation API) that does not leak implementation details.
Ensuring this latter property is why the HPy project tries to develop everything in parallel for CPython, PyPy, and GraalVM Python.

HPy is still evolving very fast. Issues are still being solved while migrating NumPy, and work has begun on adding support for HPy to Cython. Work on pybind11 is starting soon. Tim Felgentreff believes that by the time HPy has these users of the existing C API working, HPy should be in a state where it is generally useful and can be deemed stable enough that further development can follow a more stable process.

In the long run the HPy project would like to become a promoted API to write Python C extensions.

The HPy project is a good solution for the long term. It has the advantage of being developed outside Python and it doesn't require any C API change.

The C API is here to stay for a few more years
----------------------------------------------

The first concern about HPy is that right now, HPy is neither mature nor widely used, and CPython still has to continue supporting a large number of C extensions which are not likely to be ported to HPy soon.

The second concern is the inability to evolve CPython internals to implement new optimizations, and the inefficient implementation of the current C API in PyPy, GraalPython, etc. Sadly, HPy will only solve these problems once most C extensions have been fully ported to HPy: when it becomes reasonable to consider dropping the "legacy" Python C API.

While porting a C extension to HPy can be done incrementally on CPython, it requires modifying a lot of code and takes time. Porting most C extensions to HPy is expected to take a few years.

This PEP proposes to make the C API "less bad" by fixing one problem which is clearly identified as causing practical issues: macros used as l-values. This PEP only requires updating a minority of C extensions, and usually only a few lines need to be changed in impacted extensions.
For example, numpy 1.22 is made of 307,300 lines of C code, and adapting numpy to this PEP only required modifying 11 lines (using Py_SET_TYPE and Py_SET_SIZE) and adding 4 lines (defining Py_SET_TYPE and Py_SET_SIZE for Python 3.8 and older). Porting numpy to HPy is expected to require modifying more lines than that.

Right now, it's hard to bet which approach is the best: fixing the current C API, or focusing on HPy. It would be risky to only focus on HPy.

Rejected Idea: Leave the macros as they are
===========================================

The documentation of each function can discourage developers from using macros to modify Python objects. If there is a need to make an assignment, a setter function can be added and the macro documentation can require using the setter function. For example, a ``Py_SET_TYPE()`` function was added to Python 3.9 and the ``Py_TYPE()`` documentation now requires using the ``Py_SET_TYPE()`` function to set an object type.

If developers use macros as l-values, it's their responsibility when their code breaks, not Python's responsibility. We are operating under the consenting adults principle: we expect users of the Python C API to use it as documented and expect them to take care of the fallout, if things break when they don't.

This idea was rejected because only a few developers read the documentation, and only a minority tracks changes to the Python C API documentation. The majority of developers are only using CPython and so are not aware of compatibility issues with other Python implementations.

Moreover, continuing to allow using macros as l-values does not help the HPy project and leaves the burden of emulating them on GraalVM's Python implementation.
Macros already modified
=======================

The following C API macros have already been modified to disallow using them as l-values:

* ``PyCell_SET()``
* ``PyList_SET_ITEM()``
* ``PyTuple_SET_ITEM()``
* ``Py_REFCNT()`` (Python 3.10): ``Py_SET_REFCNT()`` must be used
* ``_PyGCHead_SET_FINALIZED()``
* ``_PyGCHead_SET_NEXT()``
* ``asdl_seq_GET()``
* ``asdl_seq_GET_UNTYPED()``
* ``asdl_seq_LEN()``
* ``asdl_seq_SET()``
* ``asdl_seq_SET_UNTYPED()``

For example, ``PyList_SET_ITEM(list, 0, item) < 0`` now fails with a compiler error as expected.

Discussion
==========

* `PEP 674: Disallow using macros as l-value <https://mail.python.org/archives/list/python-dev@python.org/thread/KPIJPPJ6XVNOLGZQD2PFGMT7LBJMTTCO/>`_ (Nov 30, 2021)

References
==========

* `Python C API: Add functions to access PyObject <https://vstinner.github.io/c-api-abstract-pyobject.html>`_ (October 2021) article by Victor Stinner
* `[C API] Disallow using PyFloat_AS_DOUBLE() as l-value <https://bugs.python.org/issue45476>`_ (October 2021)
* `[capi-sig] Py_TYPE() and Py_SIZE() become static inline functions <https://mail.python.org/archives/list/capi-sig@python.org/thread/WGRLTHTHC32DQTACPPX36TPR2GLJAFRB/>`_ (September 2021)
* `[C API] Avoid accessing PyObject and PyVarObject members directly: add Py_SET_TYPE() and Py_IS_TYPE(), disallow Py_TYPE(obj)=type <https://bugs.python.org/issue39573>`__ (February 2020)
* `bpo-30459: PyList_SET_ITEM could be safer <https://bugs.python.org/issue30459>`_ (May 2017)

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.
Hi,

My PEP 674 proposed to change the PyDescr_TYPE() and PyDescr_NAME() macros. This change breaks the M2Crypto and mecab-python3 projects in code generated by SWIG. I tried two solutions to prevent SWIG from accessing PyDescrObject members directly: https://bugs.python.org/issue46538

In the end, IMO it's too much work, whereas there is no need in the short term to modify the PyDescrObject structure, or structures inheriting from PyDescrObject. So I excluded the PyDescr_TYPE() and PyDescr_NAME() macros from PEP 674 to leave them unchanged: https://python.github.io/peps/pep-0674/#pydescr-name-and-pydescr-type-are-le...

Today, PEP 674 only affects 9 projects in the top 5000 PyPI projects (+ 22 projects which have to re-run Cython).

Victor
On 1/26/22 3:02 PM, Victor Stinner wrote:
Hi,
My PEP 674 proposed to change PyDescr_TYPE() and PyDescr_NAME() macros. This change breaks M2Crypto and mecab-python3 projects in code generated by SWIG. I tried two solutions to prevent SWIG accessing PyDescrObject members directly: https://bugs.python.org/issue46538
At the end, IMO it's too much work, whereas there is no need in the short term to modify the PyDescrObject structure, or structure inheriting from PyDescrObject.
So I excluded PyDescr_TYPE() and PyDescr_NAME() macros from PEP 674 to leave them unchanged: https://python.github.io/peps/pep-0674/#pydescr-name-and-pydescr-type-are-le...
Just so I understand: is this effectively permanent? My first thought was, we get SWIG to change its code generation, and in a couple of years we can follow up and make this change. But perhaps there's so much SWIG code out there already, and the change would propagate so slowly, that it's not even worth considering. Is it something like that? Or is it just that it's such a minor change that it's not worth fighting about?

//arry/

p.s. thanks for doing this!
On Thu, Jan 27, 2022 at 1:14 AM Larry Hastings <larry@hastings.org> wrote:
Just so I understand: is this effectively permanent? My first thought was, we get SWIG to change its code generation, and in a couple of years we can follow up and make this change. But perhaps there's so much SWIG code out there already, and the change would propagate so slowly, that it's not even worth considering. Is it something like that? Or is it just that it's such a minor change that it's not worth fighting about?
I'm interested in being able to change Python core structures like PyObject, PyTypeObject or PyListObject. It's just that PyDescrObject is out of my scope for now. As you explained, changing SWIG is not free, so it's better to have a good reason to do it :-) If someone wants to do it, I already did the work; I attached patches to: https://bugs.python.org/issue46538

--

About generated code, there is a similar concern with Cython. Fortunately, it seems like maintainers do regenerate the C code of their project using a recent Cython when they release a new version of their project (ex: numpy). So for Cython, fixing Cython is enough; then it's just a matter of time. In the case of PEP 674, I already updated Cython in May 2020, and Cython has already been released with my changes ;-)

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
On 18. 01. 22 9:30, Victor Stinner wrote:
Hi,
I made the following changes to the PEP since the version 1 (November 30):
* Elaborate the HPy section and add a new "Relationship with the HPy project" section * Elaborate which projects are impacted and which changes are needed * Remove the PyPy section
The PEP can be read online: https://python.github.io/peps/ep-0674/
And you can find the plain text below.
Victor
PEP: 674 Title: Disallow using macros as l-value
The current PEP is named "Disallow using Py_TYPE() and Py_SIZE() macros as l-values", which is misleading: it proposes to change 65 macros.
Author: Victor Stinner <vstinner@python.org> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 30-Nov-2021 Python-Version: 3.11
Abstract ========
Incompatible C API change disallowing using macros as l-value to:
* Allow evolving CPython internals;
In the current PEP, this is now "Allow evolving CPython internals (change the PyObject structure);" But that's not really true: the PyObject won't become changeable if the PEP is accepted, since `ob_type` is part of the stable ABI, and direct access to this field is compiled into existing extensions that use the stable ABI. This PEP alone won't really help nogil.
* Ease the C API implementation on other Python implementation; * Help migrating existing C extensions to the HPy API.
On the PyPI top 5000 projects, only 14 projects (0.3%) are affected by 4 macro changes. Moreover, 24 projects just have to regenerate their Cython code to use ``Py_SET_TYPE()``.
This data was gathered in 2022. Since this change was made in 2020 and then reverted because it broke too many projects (which were presumably notified of the breakage and had 2 years to update), the 0.3% is misleading. It is a measure of the current porting effort, not an estimate of the chance of a given project being affected.
In practice, the majority of affected projects only have to make two changes:
* Replace ``Py_TYPE(obj) = new_type;`` with ``Py_SET_TYPE(obj, new_type);``. * Replace ``Py_SIZE(obj) = new_size;`` with ``Py_SET_SIZE(obj, new_size);``.
I'll post my counterproposal, please tell me what's wrong with it (and ideally add it to Rejected ideas):

- document that these macros should not be used as l-values
- and that's it for CPython.

When this is done, interested people can try compiling projects with HPy, and send fixes to various projects. The above documentation should be a strong reason to accept the PRs. And finding the issue is relatively easy -- compile with modified CPython headers.

Meanwhile, projects that don't care about HPy (yet), or projects that want to keep working with minimum maintenance, can do the change on their own timeline.

If and when the time comes and the proposed change is actually beneficial to all/most users in the short term (for example it is needed for some major performance improvement or nogil), or we can make it opt-in, or it becomes one of a few things that prevent CPython from switching to HPy, then we can do the change. Within one release, even.

The PEP doesn't propose any deprecation period because it's impractical to produce compiler warnings; IMO we should handle this by having a long "docs-only deprecation" period, rather than not having a deprecation period at all.
On Mon, Jan 31, 2022 at 2:31 PM Petr Viktorin <encukou@gmail.com> wrote:
PEP: 674 Title: Disallow using macros as l-value
The current PEP is named "Disallow using Py_TYPE() and Py_SIZE() macros as l-values", which is misleading: it proposes to change 65 macros.
Right, I made changes since I posted the "version 2" to python-dev 13 days ago. Since nobody replied to my thread (before you), I directly submitted the latest (updated) PEP to the Steering Council: https://github.com/python/steering-council/issues/100

I decided to put "Py_TYPE() and Py_SIZE()" in the title, since in practice, only these two macros are causing troubles. The PEP changes 65 macros in total. In the latest PEP version, exactly 2 macros are breaking projects: Py_TYPE and Py_SIZE.
* Allow evolving CPython internals;
In the current PEP, this is now "Allow evolving CPython internals (change the PyObject structure);"
But that's not really true: the PyObject won't become changeable if the PEP is accepted, since `ob_type` is part of the stable ABI, and direct access to this field is compiled into existing extensions that use the stable ABI.
When I created https://bugs.python.org/issue39573, I chose the title "Make PyObject an opaque structure in the limited C API". Two years later, I changed the title to "[C API] Avoid accessing PyObject and PyVarObject members directly: add Py_SET_TYPE() and Py_IS_TYPE(), disallow Py_TYPE(obj)=type" since I understood that this project requires multiple incremental steps (and maybe changes should be done in multiple Python versions).

You're right that PEP 674 alone is not enough to fully make the structure opaque. The next interesting problem is that the PyType_FromSpec() and PyType_Spec API (which are part of the stable ABI!) indirectly use sizeof(PyObject) in the PyType_Spec.basicsize member.

One option to make the PyObject structure opaque would be to modify the PyObject structure to make it empty, and move its members into a new private _PyObject structure. This _PyObject structure would be allocated before the PyObject* pointer, the same idea as the current PyGC_Head header which is also allocated before the PyObject* pointer.

The main blocker issue is that it breaks the stable ABI.

How should the PEP abstract be rephrased to be more accurate?
On the PyPI top 5000 projects, only 14 projects (0.3%) are affected by 4 macro changes. Moreover, 24 projects just have to regenerate their Cython code to use ``Py_SET_TYPE()``.
This data was gathered in 2022. Since this change was made in 2020 and then reverted because it broke too many projects (which were presumably notified of the breakage and had 2 years to update), the 0.3% is misleading. It is a measure of the current porting effort, not an estimate of the chance of a given project being affected.
I tried to provide all the data that I gathered. The following section lists 14 projects that had to be fixed: https://www.python.org/dev/peps/pep-0674/#projects-released-with-a-fix

I didn't check if they are part of the current top 5000 PyPI projects or not.

How should the abstract be rephrased to mention these projects? Does someone know how to check if these projects are part of the top 5000 PyPI projects?

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
On 31. 01. 22 15:30, Victor Stinner wrote:
On Mon, Jan 31, 2022 at 2:31 PM Petr Viktorin <encukou@gmail.com> wrote:
PEP: 674 Title: Disallow using macros as l-value
The current PEP is named "Disallow using Py_TYPE() and Py_SIZE() macros as l-values", which is misleading: it proposes to change 65 macros.
Right, I made changes since I posted the "version 2" to python-dev 13 days ago. Since nobody replied to my thread (before you), I directly submitted the latest (updated) PEP to the Steering Council: https://github.com/python/steering-council/issues/100
I decided to put "Py_TYPE() and Py_SIZE()" in the title, since in practice, only these two macros are causing troubles. The PEP changes 65 macros in total. In the latest PEP version, exactly 2 macros are breaking projects: Py_TYPE and Py_SIZE.
This is a reasonable assumption if you think that backwards-incompatible changes are OK to make if they don't affect "projects". (Presumably, projects that you know about.) If I don't agree with that, I can't rely on the title/abstract to decide how much time I'll need to spend on the PEP :(
* Allow evolving CPython internals;
In the current PEP, this is now "Allow evolving CPython internals (change the PyObject structure);"
But that's not really true: the PyObject won't become changeable if the PEP is accepted, since `ob_type` is part of the stable ABI, and direct access to this field is compiled into existing extensions that use the stable ABI.
When I created https://bugs.python.org/issue39573, I chose the title "Make PyObject an opaque structure in the limited C API". Two years later, I changed the title to "[C API] Avoid accessing PyObject and PyVarObject members directly: add Py_SET_TYPE() and Py_IS_TYPE(), disallow Py_TYPE(obj)=type" since I understood that this project requires multiple incremental steps (and maybe changes should be done in multiple Python versions).
You're right that the PEP 674 alone is not enough to fully make the structure opaque. The next interesting problem is that PyType_FromSpec() and PyType_Spec API (which are part of the stable ABI!) use indirectly sizeof(PyObject) in the PyType_Spec.basicsize member.
One option to make the PyObject structure opaque would be to modify the PyObject structure to make it empty, and move its members into a new private _PyObject structure. This _PyObject structure would be allocated before the PyObject* pointer, same idea as the current PyGC_Head header which is also allocated before the PyObject* pointer.
The main blocker issue is that it breaks the stable ABI.
How should the PEP abstract be rephrased to be more accurate?
Remove the point from the Abstract?
On the PyPI top 5000 projects, only 14 projects (0.3%) are affected by 4 macro changes. Moreover, 24 projects just have to regenerate their Cython code to use ``Py_SET_TYPE()``.
This data was gathered in 2022. Since this change was made in 2020 and then reverted because it broke too many projects (which were presumably notified of the breakage and had 2 years to update), the 0.3% is misleading. It is a measure of the current porting effort, not an estimate of the chance of a given project being affected.
I tried to provide all data that I gathered. The following section list 14 projects that had to be fixed: https://www.python.org/dev/peps/pep-0674/#projects-released-with-a-fix
I didn't check if they are part of the current top 5000 PyPI project or not.
How should the abstract be rephrased to mention these projects? Does someone know how to check if these projects are part of the top 5000 PyPI projects?
I think you should make it clear that the percentage was taken after fixing many of the projects. That change will probably make the numbers look rather meaningless. So either explain why they're meaningful, or remove the paragraph. (Somewhat unrelated: you could also put the motivation in a Motivation section, rather than the Abstract. If that leaves the abstract with just one sentence, it's good -- provided the sentence is still a good summary of the proposed changes.)
On Mon, Jan 31, 2022 at 3:57 PM Petr Viktorin <encukou@gmail.com> wrote:
Remove the point from the Abstract? (...) I think you should make it clear that the percentage was taken after fixing many of the projects.
Thanks for your review. I pushed a change to take your remarks into account: https://github.com/python/peps/commit/e9456773643c7dcca584ae0100db42639b93f0...

I simplified the Abstract and added a new "Backward compatibility > Statistics" section to show all the numbers in one place: https://python.github.io/peps/pep-0674/#statistics

I also moved projects already fixed into the "Top 5000 PyPI" category or into a new "Other affected projects" category, depending on whether they are in the top 5000 PyPI projects or not. It should now be easier to understand the statistics.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
participants (3): Larry Hastings, Petr Viktorin, Victor Stinner