
Maybe we could start by having the tool regenerate the file and verifying that it produces the same results? Then in the future we keep the file in the repo so changes to it can be tracked separately, but we run the tool as part of CI to make sure that its output still matches. This is what we do for other generated files like opcode.h, parser.c and so on.

On Thu, Dec 9, 2021 at 10:31 AM Petr Viktorin <encukou@gmail.com> wrote:
I'll not get back to CPython until Tuesday, but I'll add a quick note for now. It's a bit blunt for lack of time; please don't be offended.
If the code is the authoritative source of truth, we need a proper parser to extract the information. But we can't really use an existing parser (e.g. we need to navigate various #ifdef combinations), and writing a correct (=tested) custom C parser is pretty expensive. C declarations being "deterministically discoverable by tools" is a myth. I know you wrote a parser (kudos!), but unfortunately I don't trust it enough to let it define the API. Bugs in the parser could result in the API definition silently changing.
That's why the info is in a separate version-controlled file, which must be explicitly modified. That file is the source of truth (or at least intent). There are also checks to ensure the code matches the manifest, so if you break things the CI should let you know. See the rationale in PEP 652: https://www.python.org/dev/peps/pep-0652/#rationale
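To illustrate the kind of manifest-versus-code check described here, the following is a minimal sketch, not the actual logic of Tools/scripts/stable_abi.py. It assumes a simplified manifest format (one symbol name per line, with `#` comments), which is a hypothetical stand-in for the real syntax of Misc/stable_abi.txt. The function names and the sample symbols are likewise made up for illustration.

```python
def parse_manifest(text):
    """Return the set of symbol names listed in a simplified manifest.

    Assumed format (hypothetical, not the real Misc/stable_abi.txt syntax):
    one symbol name per line; blank lines and '#' comments are ignored.
    """
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.add(line)
    return names


def check_manifest(manifest_names, exported_names):
    """Compare the manifest against the symbols actually exported.

    Returns (missing, unexpected):
      missing    - names the manifest promises but the build does not export
      unexpected - names the build exports that were never added to the manifest
    """
    missing = sorted(manifest_names - exported_names)
    unexpected = sorted(exported_names - manifest_names)
    return missing, unexpected


# Illustrative data only; a real check would obtain the exported names
# from the built library or the headers.
manifest = parse_manifest("""
    # entries are explicit and version-controlled
    PyDict_New
    PyList_Append
""")
exported = {"PyDict_New", "_PyExample_NewThing"}  # hypothetical build output

missing, unexpected = check_manifest(manifest, exported)
for name in missing:
    print(f"in manifest but not exported: {name}")
for name in unexpected:
    print(f"exported but not in manifest: {name}")
```

The point of this shape is that a mismatch in either direction is reported rather than silently accepted, so the manifest only changes when someone edits it on purpose.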
As for the types you mentioned:
* PyAPI_ABI_INDIRECT, PyAPI_ABI_ONLY - these should get a comment. I don't think adding machine-readable metadata (and tooling for it) would be worth it, but I won't block it.
* PyAPI_ABI_ACCIDENTAL - could be deprecated in the Limited API, and later removed from it, becoming "PyAPI_ABI_ONLY".
On Thu, Dec 9, 2021 at 6:41 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
(replying to https://mail.python.org/archives/list/python-dev@python.org/message/OJ65FPCJ... )

On Wed, Dec 8, 2021 at 10:06 AM Eric Snow <ericsnowcurrently@gmail.com> wrote:
What about the various symbols listed in Misc/stable_abi.txt that were accidentally added to the limited API? Can we move toward dropping them from the stable ABI?
tl;dr We should consider making classifications related to the stable ABI harder to miss.
<context>
Knowing what is in the limited API is fairly straightforward. [1] However, it's clear that identifying what is part of the stable ABI, and why, is not so easy. Currently, we must rely on Misc/stable_abi.txt [2] (and the associated Tools/scripts/stable_abi.py). Documentation (C-API docs, PEPs, devguide) helps too.
Yet, there's a concrete disconnect here: the header files are by definition the authoritative single-source-of-truth for the C-API and it's too easy to forget about supplemental info in another file or document. This out-of-sight-out-of-mind situation is part of how we accidentally added things to the limited API for a while. [3]
The stable ABI isn't the only area where we must identify different subsets of the C-API. However, in those other cases we use different structural/naming conventions to explicitly group things. Most importantly, each of those conventions makes the grouping unavoidable when reading the code. [4] For example:
* closely related declarations go in the same header file (and are then also exposed via Include/Python.h)
* prefixes (e.g. Py_, PyDict_) provide similar grouping
* an additional underscore prefix identifies "private" C-API
* symbols are explicitly identified as part of the C-API via macros (PyAPI_FUNC, PyAPI_DATA) [5]
* relatively recently, different directories correspond to different API layers (Include, Include/cpython, Include/internal) [3]
</context>
Could we take a similar explicit, coupled-to-the-code approach to identify when the different stable ABI situations apply? Here's the specific approach I had in mind, with macros similar to PyAPI_FUNC:
* PyAPI_ABI_FUNC - in stable ABI when it wouldn't be normally (e.g. underscore prefix, in Include/internal)
* PyAPI_ABI_INDIRECT - exposed in stable ABI due to a macro
* PyAPI_ABI_ONLY - it only exists for ABI compatibility and isn't actually used any more
* PyAPI_ABI_ACCIDENTAL - unintentionally added to limited API, probably not used there
(...or perhaps use a PyABI_ prefix, though that's a bit easy to miss when reading.)
As a reader I would find markers like this helpful in recognizing those special situations, as well as the constraints those situations impose on modification. At the least such macros would indicate something different is going on, and the macro name would be something I could look up if I needed more info. I expect others reading the code would get comparable value. I also expect tools like Tools/scripts/stable_abi.py would benefit.
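As a rough illustration of how a tool might use such markers, here is a minimal sketch that groups function declarations by marker. The four macro names come from the proposal and do not exist in CPython. The sketch also assumes they would wrap the return type the way PyAPI_FUNC does, which the proposal does not spell out. The helper name and the sample declarations are hypothetical.

```python
import re
from collections import defaultdict

# Marker macros from the proposal; hypothetical, none exist in CPython today.
MARKERS = (
    "PyAPI_ABI_FUNC",
    "PyAPI_ABI_INDIRECT",
    "PyAPI_ABI_ONLY",
    "PyAPI_ABI_ACCIDENTAL",
)

# Assumption: markers are used like PyAPI_FUNC, i.e. MARKER(rtype) name(args).
_DECL = re.compile(
    r"\b(" + "|".join(MARKERS) + r")\s*\(([^()]*)\)\s*(\w+)\s*\("
)


def find_marked_functions(header_text):
    """Group function names by the marker macro that introduces them.

    This is plain text matching: it ignores #ifdef blocks and comments,
    and it only recognizes function declarations, not data.
    """
    groups = defaultdict(list)
    for marker, _rtype, name in _DECL.findall(header_text):
        groups[marker].append(name)
    return dict(groups)


# Made-up declarations for illustration only.
SAMPLE = """
PyAPI_FUNC(PyObject *) PyDict_New(void);
PyAPI_ABI_FUNC(int) _PyExample_Helper(PyObject *);
PyAPI_ABI_ONLY(void) PyExample_Legacy(void);
"""

marked = find_marked_functions(SAMPLE)
for marker, names in marked.items():
    print(marker, names)
```

Because this is text matching rather than parsing, it cannot tell which declarations are active under a given combination of preprocessor conditionals, so its output would be at best a cross-check rather than a definition of the ABI.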
-eric
[1] in Include/*.h and not #ifndef Py_LIMITED_API (sadly also making it easy to accidentally add things to the limited API, see [3])
[2] Before that you had to rely on comments or external documents or, in the worst case, work it out through careful study of the code, commit history, and mailing list archives.
[3] The addition of Include/cpython and Include/internal helped us stop accidentally adding to the limited API.
[4] It also makes the groupings deterministically discoverable by tools.
[5] explicit use of "extern" indicates a different intent
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CANB7JOA...
Code of Conduct: http://python.org/psf/codeofconduct/
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>