Can we stop adding to the C API, please?
Hi,

The size of the C API, as measured by `git grep PyAPI_FUNC | wc -l`, has been steadily increasing over the last few releases.

3.5  1237
3.6  1304
3.7  1408
3.8  1478
3.9  1518

For reference, the 2.7 branch has "only" 973 functions.

I've heard many criticisms of Python 2 over the years, but that it needed a bigger C API wasn't one of them ;)

Why are these functions being added? Wasn't 1000 C functions enough?

Every one of these functions represents a maintenance burden. Removing them is painful and takes a lot of effort, but adding them is done casually, without a PEP or, in many cases, even a review.

We need to address what to do about the C API in the long term, but for now can we just stop making it larger? Please.

Also, can we remove all the new API functions added in 3.9 before the release and it is too late?

Cheers,
Mark.
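The per-release growth implied by these counts can be worked out directly. The following is a minimal sketch in Python that uses only the numbers quoted in the message above; the variable names are illustrative and not part of any CPython tooling.

```python
# Counts of PyAPI_FUNC matches per branch, as quoted in the message above.
counts = {"3.5": 1237, "3.6": 1304, "3.7": 1408, "3.8": 1478, "3.9": 1518}

versions = list(counts)  # dicts preserve insertion order

# Net number of PyAPI_FUNC matches gained in each release over the previous one.
growth = {
    new: counts[new] - counts[old]
    for old, new in zip(versions, versions[1:])
}

for version, added in growth.items():
    print(f"{version}: +{added}")

# Net change across the whole range.
total = counts["3.9"] - counts["3.5"]
print(f"3.5 -> 3.9: +{total}")
```

These are net differences in grep matches, so a release that both added and removed functions shows only the balance.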
Hi,

In Python 3.9, I *removed* dozens of functions from the *public* C API, or moved them to the "internal" C API:
https://docs.python.org/dev/whatsnew/3.9.html#id3

For a few internal C API functions, I replaced PyAPI_FUNC() with extern to ensure that they cannot be used outside the CPython code base: Python 3.9 is now built with -fvisibility=hidden on compilers supporting it (like GCC and clang).

I also *added* a bunch of *new* "getter" or "setter" functions to the public C API for my project of hiding implementation details, like making structures opaque:
https://docs.python.org/dev/whatsnew/3.9.html#id1

For example, I added PyThreadState_GetInterpreter() which replaces "tstate->interp", to prepare C extensions for an opaque PyThreadState structure.

The other 4 new Python 3.9 functions:

* PyObject_CallNoArgs(): "most efficient way to call a callable Python object without any argument"
* PyModule_AddType(): "adding a type to a module". I hate the PyModule_AddObject() function which steals a reference on success.
* PyObject_GC_IsTracked() and PyObject_GC_IsFinalized(): "query if Python objects are being currently tracked or have been already finalized by the garbage collector respectively": functions requested in bpo-40241.

Would you mind elaborating on why you consider that these functions must not be added to Python 3.9?
Every one of these functions represents a maintenance burden. Removing them is painful and takes a lot of effort, but adding them is done casually, without a PEP or, in many cases, even a review.
For the new functions related to hiding implementation details, I have a draft PEP: https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst But it seems like this PEP is trying to solve too many problems in a single document, and that I have to split it into multiple PEPs.
Why are these functions being added? Wasn't 1000 C functions enough?
My PEP lists flaws of the existing C API functions. Sadly, fixing flaws requires adding new functions and deprecating old ones in a slow migration.

I'm open to ideas on how to fix these flaws differently (without having new functions?).

As written in my PEP, another approach is to design a new C API on top of the existing one. That's exactly what the HPy project does. But my PEP also explains why I consider that it only fixes a subset of the issues that I listed. ;-)
https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst#hp...

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Hi Victor,

On 03/06/2020 2:42 pm, Victor Stinner wrote:
Hi,
In Python 3.9, I *removed* dozens of functions from the *public* C API, or moved them to the "internal" C API: https://docs.python.org/dev/whatsnew/3.9.html#id3
For a few internal C API, I replaced PyAPI_FUNC() with extern to ensure that they cannot be used outside CPython code base: Python 3.9 is now built with -fvisibility=hidden on compilers supporting it (like GCC and clang).
I also *added* a bunch of *new* "getter" or "setter" functions to the public C API for my project of hiding implementation details, like making structures opaque: https://docs.python.org/dev/whatsnew/3.9.html#id1
Adding "setters" is generally a bad idea. "getters" can be computed if the underlying field disappears, but the same may not be true for setters if the relation is not one-to-one. I don't think there are any new setters in 3.9, so it's not an immediate problem.
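The asymmetry between getters and setters can be sketched outside C. The example below is a hypothetical Python illustration, not CPython code: the `Span` class and its `get_size()` and `set_size()` methods are invented for this sketch. It assumes an object that once stored a `size` field directly and was later changed to store `start` and `end` instead.

```python
class Span:
    """Hypothetical object whose old layout had a `size` field.

    The new layout stores `start` and `end`; `size` no longer exists as a field.
    """

    def __init__(self, start, end):
        self._start = start
        self._end = end

    def get_size(self):
        # The getter survives the layout change: size is derived unambiguously.
        return self._end - self._start

    def set_size(self, size):
        # The setter does not map one-to-one onto the new fields: should the
        # start move, the end move, or both? Keeping `start` fixed is an
        # arbitrary choice made for this sketch.
        self._end = self._start + size


span = Span(2, 10)
print(span.get_size())
span.set_size(3)
print(span.get_size())
```

After the layout change `get_size()` has exactly one sensible answer, while `set_size()` has to pick which endpoint moves, and callers of the old setter may have relied on either behaviour.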
For example, I added PyThreadState_GetInterpreter() which replaces "tstate->interp", to prepare C extensions for an opaque PyThreadState structure.
`PyThreadState_GetInterpreter()` can't replace `tstate->interp` for two reasons.

1. There is no way to stop third party C code accessing the internals of data structures. We can warn them not to, but that's all.
2. The internal layout of C structures has never been part of the API, with arguably two exceptions: the PyTypeObject struct and the `ob_refcnt` field of PyObject.
The other 4 new Python 3.9 functions:
* PyObject_CallNoArgs(): "most efficient way to call a callable Python object without any argument"
* PyModule_AddType(): "adding a type to a module". I hate the PyModule_AddObject() function which steals a reference on success.
* PyObject_GC_IsTracked() and PyObject_GC_IsFinalized(): "query if Python objects are being currently tracked or have been already finalized by the garbage collector respectively": functions requested in bpo-40241.
Would you mind to elaborate why you consider that these functions must not be added to Python 3.9?
I'm not saying that no C functions should be added to the API. I am saying that none should be added without a PEP or proper review.

Addressing the four functions you list.

PyObject_CallNoArgs() seems harmless. Rationalizing the call API has merit, but PyObject_CallNoArgs() leads to PyObject_CallOneArg(), PyObject_CallTwoArgs(), etc. and an even larger API.

PyModule_AddType(). This seems perfectly reasonable, although if it is a straight replacement for another function, that other function should be deprecated.

PyObject_GC_IsTracked(). I don't like this. Shouldn't GC track *all* objects? Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing internal implementation details for no good reason. A cycle GC that doesn't "track" individual objects, but treats all objects the same, could be more efficient. In which case, what would this mean?

What is the purpose of PyObject_GC_IsFinalized()? Third party objects can easily tell if they have been finalized. Why they would ever need this information is a mystery to me.
Every one of these functions represents a maintenance burden. Removing them is painful and takes a lot of effort, but adding them is done casually, without a PEP or, in many cases, even a review.
For the new functions related to hiding implementation details, I have a draft PEP: https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst
But it seems like this PEP is trying to solve too many problems in a single document, and that I have to split it into multiple PEPs.
It does need splitting up, I agree.
Why are these functions being added? Wasn't 1000 C functions enough?
My PEP lists flaws of the existing C API functions. Sadly, fixing flaws requires adding new functions and deprecating old ones in a slow migration.
IMO, at least one function should be deprecated for each new function added. That way the API won't get any bigger.

Cheers,
Mark.
Just some comments on the GC stuff as I added them myself.
Shouldn't GC track *all* objects?

No, extension types need to opt in to the garbage collector and, if they do, implement the interface.
Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing internal implementation details for no good reason.
In Python, gc.is_tracked() has existed since Python 3.1, and the gc module has exposed a lot of GC functionality for many versions. This just allows you to make from the C API the same calls that you can already make in Python.
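For readers unfamiliar with the Python-level call mentioned here, this is a short sketch of `gc.is_tracked()`. It assumes CPython; which objects are tracked is an implementation detail of its cycle collector rather than a language guarantee, and the variable names are illustrative.

```python
import gc

# Atomic objects cannot be part of a reference cycle, so CPython's cycle
# collector does not track them.
int_tracked = gc.is_tracked(0)
str_tracked = gc.is_tracked("a")

# A list can hold references to other objects, so it is tracked.
list_tracked = gc.is_tracked([])

print(int_tracked, str_tracked, list_tracked)
```

Types implemented in C behave the same way: only those that opt in to the cycle collector can ever report as tracked.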
What is the purpose of PyObject_GC_IsFinalized()?
Because some objects may have been resurrected and this allows you to know if a given object has already been finalized. This can help to gather advanced GC stats, to control some tricky situations with finalizers and the gc in C extensions, or just to know all objects that are being resurrected. Note that an equivalent gc.is_finalized() was added in 3.9 as well to query this information from Python in the GC module, and this call just allows you to do the same from the C-API.

Cheers,
Pablo
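The resurrection case described here can be sketched from Python. This is a minimal illustration, assuming a CPython recent enough to provide `gc.is_finalized()`; the `Lazarus` class is hypothetical and exists only for this example.

```python
import gc


class Lazarus:
    # Slot used to keep a resurrected instance alive.
    resurrected = None

    def __del__(self):
        # Storing a new reference to the dying object brings it back to life.
        type(self).resurrected = self


obj = Lazarus()
before = gc.is_finalized(obj)  # the finalizer has not run yet

del obj  # last reference dropped: CPython runs __del__, which resurrects it
after = gc.is_finalized(Lazarus.resurrected)

print(before, after)
```

The object is still alive after `del`, yet its finalizer has already run and will not run a second time, which is the state the function is meant to report.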
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5WLOHBCS... Code of Conduct: http://python.org/psf/codeofconduct/
On 03/06/2020 8:49 pm, Pablo Galindo Salgado wrote:
Just some comments on the GC stuff as I added them myself.
Shouldn't GC track *all* objects? No, extension types need to opt-in to the garbage collector and if so, implement the interface.
When you say "GC" I think you mean the backup cycle breaker. But that's not what it means to me, or in general. GC includes reference counting and applies to all objects. Naming is hard, which is why proper review is important.
Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing internal implementation details for no good reason.
In python, there is gc.is_tracked() in Python 3.1 and the GC module already exposes a lot of GC functionality since many versions ago. This just allows the same calls that you can do in Python using the C-API.
Just because something is exposed in Python doesn't mean an extra function has to be added to the C API. More general functions like `PyObject_Call()` exist already to call any Python object.
On Wed, 3 Jun 2020 at 19:17, Mark Shannon <mark@hotpy.org> wrote:
I also *added* a bunch of *new* "getter" or "setter" functions to the public C API for my project of hiding implementation details, like making structures opaque: https://docs.python.org/dev/whatsnew/3.9.html#id1
Adding "setters" is generally a bad idea. "getters" can be computed if the underlying field disappears, but the same may not be true for setters if the relation is not one-to-one. I don't think there are any new setters in 3.9, so it's not an immediate problem.
You're making the assumption that the member can be set directly. But my plan is to make the structure opaque. In that case, you need getters and setters for all fields you would like to access. No member would be accessible directly anymore.
`PyThreadState_GetInterpreter()` can't replace `tstate->interp` for two reasons. 1. There is no way to stop third party C code accessing the internals of data structures. We can warn them not to, but that's all. 2. The internal layout of C structures has never been part of the API, with arguably two exceptions; the PyTypeObject struct and the `ob_refcnt` field of PyObject.
My long term plan is to make all structures opaque :-) So far, the PyInterpreterState structure was made opaque in Python 3.8. It helped the development of Python 3.8 and 3.9 *a lot*, especially for subinterpreters. And I made PyGC_Head opaque in Python 3.9.

Examples of issues to make structures opaque:

PyGC_Head: https://bugs.python.org/issue40241 (done in Python 3.9)
PyObject: https://bugs.python.org/issue39573
PyTypeObject: https://bugs.python.org/issue40170
PyThreadState: https://bugs.python.org/issue39947
PyInterpreterState: https://bugs.python.org/issue35886 (done in Python 3.8)

For the short term, my plan is to make structures opaque in the limited C API, before breaking more stuff in the public C API :-)
PyObject_CallNoArgs() seems harmless. Rationalizing the call API has merit, but PyObject_CallNoArgs() leads to PyObject_CallOneArg(), PyObject_CallTwoArgs(), etc. and an even larger API.
PyObject_CallOneArg() also exists:
https://docs.python.org/dev/c-api/call.html#c.PyObject_CallOneArg

It was added as a private function (https://bugs.python.org/issue37483) and made public in commit 3f563cea567fbfed9db539ecbbacfee2d86f7735 "bpo-39245: Make Vectorcall C API public (GH-17893)".

But it's missing from What's New in Python 3.9.

There is no plan for two or more arguments.
PyObject_GC_IsTracked(). I don't like this. Shouldn't GC track *all* objects? Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing internal implementation details for no good reason. A cycle GC that doesn't "track" individual objects, but treats all objects the same could be more efficient. In which case, what would this mean?
What is the purpose of PyObject_GC_IsFinalized()? Third party objects can easily tell if they have been finalized. Why they would ever need this information is a mystery to me.
Did you read the issues which added these functions to see the rationale?
https://bugs.python.org/issue40241

I like the "(Contributed by xxx in bpo-xxx.)" in What's New in Python 3.9: it became trivial to find such rationale.

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
On Wed, Jun 3, 2020 at 2:10 PM Victor Stinner <vstinner@python.org> wrote:
For the short term, my plan is to make structure opaque in the limited C API, before breaking more stuff in the public C API :-)
But you're also breaking the public C API:

https://github.com/MagicStack/immutables/issues/46
https://github.com/pycurl/pycurl/pull/636

I'm not saying you're wrong to do so, I'm just confused about whether your plan is to break stuff or not and on which timescale.

-n

--
Nathaniel J. Smith -- https://vorpus.org
On Thu, 4 Jun 2020 at 00:14, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Jun 3, 2020 at 2:10 PM Victor Stinner <vstinner@python.org> wrote:
For the short term, my plan is to make structure opaque in the limited C API, before breaking more stuff in the public C API :-)
But you're also breaking the public C API: https://github.com/MagicStack/immutables/issues/46 https://github.com/pycurl/pycurl/pull/636
I'm not saying you're wrong to do so, I'm just confused about whether your plan is to break stuff or not and on which timescale.
Yes, my plan includes backward incompatible changes on purpose:
https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst#ap...

The practical issue is to estimate how many C extension modules are broken by a specific C API change. I plan to help port C extensions to the updated C API. If the number of broken extensions is acceptable and updating them is easy/short: fine! If the number is too high, we have until "3.10.0 final: Monday, 2021-10-04" to revert the incompatible changes which caused the most trouble.

I adopted a similar approach for incompatible changes to Python in 3.9 and IMO it was successful. I made most incompatible changes at the *beginning* of the 3.9 dev cycle. Then my team at Red Hat rebuilt the Fedora operating system with Python 3.9. We got tons of package build failures. We grouped the failures and looked for the incompatible changes causing the most issues. Then Miro and I proposed to revert a few specific changes:

* https://lwn.net/ml/python-dev/CABqyc3y4SBCqt5knPjgO3dVsde2nhYrOpYedyCqG+Y+0f...
* https://lwn.net/Articles/811369/

In the end, we reverted exactly two changes: (1) aliases to ABC in the collections module and (2) the "U" mode for open(). Python 3.9 still has an impressive list of removed features (incompatible changes!):
https://docs.python.org/dev/whatsnew/3.9.html#removed

So far, I'm aware that the last changes of https://bugs.python.org/issue39573 "[C API] Make PyObject an opaque structure in the limited C API" broke 4 projects (cython, numpy, immutables, pycurl). I'm surprised by how short the fixes are. For the giant numpy project, only 5 lines were modified:
https://github.com/numpy/numpy/commit/a96b18e3d4d11be31a321999cda4b795ea9ecc...

FYI I introduced the Py_SET_SIZE(), Py_SET_REFCNT() and Py_SET_TYPE() functions in 3.9 and waited until Python 3.10 to introduce incompatible changes. So Python 3.9 is the transition version.

Sadly, I failed to find a way to emit a deprecation warning when Py_TYPE(), Py_SIZE() or Py_REFCNT() is used as an l-value (but not emit a warning when it's used as an r-value).

...

This discussion shifted from the initial issue raised by Mark. Please join the capi-sig mailing list to discuss incompatible changes, I started 2 threads there last month ;-)

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
On 4 Jun 2020, at 16:34, Victor Stinner <vstinner@python.org> wrote:
On Thu, 4 Jun 2020 at 00:14, Nathaniel Smith <njs@pobox.com> wrote:
On Wed, Jun 3, 2020 at 2:10 PM Victor Stinner <vstinner@python.org> wrote:
For the short term, my plan is to make structure opaque in the limited C API, before breaking more stuff in the public C API :-)
But you're also breaking the public C API: https://github.com/MagicStack/immutables/issues/46 https://github.com/pycurl/pycurl/pull/636
I'm not saying you're wrong to do so, I'm just confused about whether your plan is to break stuff or not and on which timescale.
Yes, my plan includes backward incompatible changes on purpose: https://github.com/vstinner/misc/blob/master/cpython/pep-opaque-c-api.rst#ap...
The practical issue is to estimate how many C extension modules are broken by a specific C API change. I plan to help out to port C extensions to the updated C API. If the number of broken extensions is fine and updating them is easy/short: fine! If the number is too high, we have until "3.10.0 final: Monday, 2021-10-04" to revert incompatible changes which caused most troubles.
[snip]

My experience with keeping PyCXX up to date with these changes is that it's not hard to be compatible. I support Python 3.4 to 3.9 for the limited and unlimited API. (I also support Python 2.7 with the same unlimited API.)

Barry
On Wed, Jun 3, 2020 at 2:13 PM Victor Stinner <vstinner@python.org> wrote:
On Wed, 3 Jun 2020 at 19:17, Mark Shannon <mark@hotpy.org> wrote:
I also *added* a bunch of *new* "getter" or "setter" functions to the public C API for my project of hiding implementation details, like making structures opaque: https://docs.python.org/dev/whatsnew/3.9.html#id1
Adding "setters" is generally a bad idea. "getters" can be computed if the underlying field disappears, but the same may not be true for setters if the relation is not one-to-one. I don't think there are any new setters in 3.9, so it's not an immediate problem.
You're making the assumption that the member can be set directly. But my plan is to make the structure opaque. In that case, you need getters and setters for all fields you would like to access. No member would be accessible directly anymore.
`PyThreadState_GetInterpreter()` can't replace `tstate->interp` for two reasons. 1. There is no way to stop third party C code accessing the internals of data structures. We can warn them not to, but that's all. 2. The internal layout of C structures has never been part of the API, with arguably two exceptions; the PyTypeObject struct and the `ob_refcnt` field of PyObject.
My long term plan is to make all structures opaque :-) So far, PyInterpreterState structure was made opaque in Python 3.8. It helped *a lot* the development of Python 3.8 and 3.9, especially for subinterpreters. And I made PyGC_Head opaque in Python 3.9.
Examples of issues to make structures opaque:
PyGC_Head: https://bugs.python.org/issue40241 (done in Python 3.9) PyObject: https://bugs.python.org/issue39573 PyTypeObject: https://bugs.python.org/issue40170 PyThreadState: https://bugs.python.org/issue39947 PyInterpreterState: https://bugs.python.org/issue35886 (done in Python 3.8)
For the short term, my plan is to make structure opaque in the limited C API, before breaking more stuff in the public C API :-)
Indeed, your plan and the work you've been doing and discussing with other core devs about this (including at multiple sprints and summits) over the past 4+ years is the right one. Our reliance on structs and related cpp macros, unfortunately exposed as public, is a burden that freezes reasonable CPython VM implementation evolution options. This work moves us away from that into a better place one step at a time without mass disruption.

More prior references related to this work are critical reading and should not be overlooked:

[2017 "Keeping Python Competitive"] https://lwn.net/Articles/723949/
[2018 "Lets change the C API" thread] https://mail.python.org/archives/list/python-dev@python.org/thread/B67MYCAO4...
[2019 "The C API"] https://pyfound.blogspot.com/2019/06/python-language-summit-lightning-talks-...
[2020-04 "PEP: Modify the C API to hide implementation details" thread - with a lot of links to much earlier 2017 and such references] https://mail.python.org/archives/list/python-dev@python.org/thread/HKM774XKU...

and Victor's overall https://pythoncapi.readthedocs.io/roadmap.html as referenced in a few places in those.

It is also worth paying attention to the https://mail.python.org/archives/list/capi-sig@python.org/latest mailing list for anyone with a CPython C API interest.

-gps
PyObject_CallNoArgs() seems harmless. Rationalizing the call API has merit, but PyObject_CallNoArgs() leads to PyObject_CallOneArg(), PyObject_CallTwoArgs(), etc. and an even larger API.
PyObject_CallOneArg() also exists: https://docs.python.org/dev/c-api/call.html#c.PyObject_CallOneArg
It was added as a private function in https://bugs.python.org/issue37483 and made public in commit 3f563cea567fbfed9db539ecbbacfee2d86f7735 "bpo-39245: Make Vectorcall C API public (GH-17893)".
But it's missing in What's New in Python 3.9.
There is no plan for two or more arguments.
PyObject_GC_IsTracked(). I don't like this. Shouldn't GC track *all* objects? Even if it were named PyObject_Cycle_GC_IsTracked() it would be exposing internal implementation details for no good reason. A cycle GC that doesn't "track" individual objects, but treats all objects the same could be more efficient. In which case, what would this mean?
What is the purpose of PyObject_GC_IsFinalized()? Third party objects can easily tell if they have been finalized. Why they would ever need this information is a mystery to me.
Did you read the issues which added these functions to see the rationale? https://bugs.python.org/issue40241
I like the "(Contributed by xxx in bpo-xxx.)" in What's New in Python 3.9: it became trivial to find such rationale.
Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
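For readers who want to see what "tracked" and "finalized" mean in practice: the CPython documentation describes PyObject_GC_IsTracked() and PyObject_GC_IsFinalized() as analogous to the Python-level gc.is_tracked() and gc.is_finalized(). The sketch below uses only those two gc functions. It assumes CPython 3.9 or later (gc.is_finalized() is new in 3.9); the helper names tracking_demo, finalized_demo and Lazarus are made up for illustration, and the results are CPython implementation details rather than language guarantees.

```python
import gc


def tracking_demo():
    """Report whether the cycle collector tracks a few sample objects."""
    return {
        "int": gc.is_tracked(42),      # atomic object, cannot form a cycle
        "str": gc.is_tracked("text"),  # atomic object, cannot form a cycle
        "list": gc.is_tracked([]),     # container, may take part in a cycle
    }


def finalized_demo():
    """Return gc.is_finalized() before and after an object's __del__ runs."""
    graveyard = []

    class Lazarus:  # hypothetical class that resurrects itself
        def __del__(self):
            graveyard.append(self)

    obj = Lazarus()
    before = gc.is_finalized(obj)
    del obj  # last reference dropped: __del__ runs and stores the object
    after = gc.is_finalized(graveyard[0])
    return before, after


print(tracking_demo())
print(finalized_demo())
```

This only illustrates the behaviour being debated. It does not settle whether the C functions belong in the public API.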
Maybe we can have a two-for-one special? You can add a new function to the API if you deprecate two.
On Wed, Jun 3, 2020 at 6:09 AM Mark Shannon <mark@hotpy.org> wrote:
Also, can we remove all the new API functions added in 3.9 before the release and it is too late?
I think it would be helpful to open an issue that lists the 40 new functions, so people could more easily review them before 3.9 is released. Only a few were discussed in this thread. Also, if the new function was "private" ("_" prefix), is there still a concern? --Chris
On Wed, Jun 3, 2020 at 7:12 AM Mark Shannon <mark@hotpy.org> wrote:
The size of the C API, as measured by `git grep PyAPI_FUNC | wc -l` has been steadily increasing over the last few releases.
3.5 1237 3.6 1304 3.7 1408 3.8 1478 3.9 1518
For reference the 2.7 branch has "only" 973 functions
It isn't as bad as that. Here I'm only looking at PyAPI_FUNC under Include/. From 3.5 to master the *public* C-API has increased by 71 functions (and the "private"/internal C-API by 189). "Private" is functions starting with "_".

VER   TOT    PUB + "_"
2.7    932   (752 + 178)
3.5   1181   (846 + 320)
3.6   1247   (851 + 380)
3.7   1350   (875 + 460 + 13 internal)
3.8   1424   (908 + 422 + 79 internal)
3.9   1447   (917 + 403 + 110 internal)
m     1443   (917 + 401 + 108 internal)

(This does not count changes in the number of macros, which may have gone down...or not.)

FWIW, relative to the "cpython" API split that happened in 3.8 (and "internal" in 3.7):

VER   total   Include/*.h          Include/cpython/*.h   Include/internal/*.h
2.7    932     932 (752 + 178)     -                     -
3.5   1181    1181 (846 + 320)     -                     -
3.6   1247    1247 (851 + 380)     -                     -
3.7   1350    1350 (875 + 460)     -                      13 (0 + 13)
3.8   1424    1050 (800 + 249)     295 (108 + 173)        79 (0 + 79)
3.9   1447     944 (789 + 153)     393 (128 + 250)       110 (105 + 5)
m     1443     941 (789 + 150)     394 (128 + 251)       108 (103 + 5)

Here's the "command" I ran:

    for pat in 'Include/' 'Include/*.h' 'Include/cpython/*.h' 'Include/internal/*.h'; do
        echo " -- $pat --"
        echo $(git grep 'PyAPI_FUNC(' -- $pat | wc -l) '('$(git grep 'PyAPI_FUNC(.*) [^_]' -- $pat | wc -l) '+' $(git grep 'PyAPI_FUNC(.*) [_]' -- $pat | wc -l)')'
    done
Every one of these functions represents a maintenance burden. Removing them is painful and takes a lot of effort, but adding them is done casually, without a PEP or, in many cases, even a review.
I agree with regards to the public C-API, particularly the stable API.
We need to address what to do about the C API in the long term, but for now can we just stop making it larger? Please.
Also, can we remove all the new API functions added in 3.9 before the release and it is too late?
In 3.9 we have added 9 functions to the public C-API and removed 19 from the "private" C-API. The "internal" C-API grew by 31, but I don't see the point in changing any of those. -eric
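The public/private split counted above with git grep can be approximated in a few lines of Python. This is only a sketch of the same idea, not the command that produced the numbers in this thread: it classifies each PyAPI_FUNC declaration by whether the function name starts with "_". The names DECL, count_api and SAMPLE are made up for illustration, and the sample text is illustrative rather than copied from the CPython headers (`_PyExample_Private` in particular is hypothetical).

```python
import re

# Roughly mirrors the grep patterns 'PyAPI_FUNC(.*) [^_]' and
# 'PyAPI_FUNC(.*) [_]': capture the identifier that follows the macro.
DECL = re.compile(r"PyAPI_FUNC\(.*?\) (\w+)")


def count_api(header_text):
    """Return (public, private) counts of PyAPI_FUNC declarations."""
    public = private = 0
    for line in header_text.splitlines():
        match = DECL.search(line)
        if match is None:
            continue  # not an exported function declaration
        if match.group(1).startswith("_"):
            private += 1
        else:
            public += 1
    return public, private


# Illustrative input only; _PyExample_Private is a hypothetical name.
SAMPLE = """\
PyAPI_FUNC(PyObject *) PyObject_CallNoArgs(PyObject *func);
PyAPI_FUNC(int) PyObject_GC_IsTracked(PyObject *);
PyAPI_FUNC(void) _PyExample_Private(PyObject *);
static inline int not_exported(void);
"""

print(count_api(SAMPLE))
```

Like the grep version, this works line by line, so a declaration whose name sits on a different line from the PyAPI_FUNC macro is not counted.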
participants (9): Barry Scott, Chris Jerdonek, Eric Snow, Gregory P. Smith, Mark Shannon, Nathaniel Smith, Pablo Galindo Salgado, Simon Cross, Victor Stinner