Making code object APIs unstable
In 3.11 we're changing a lot of details about code objects. Part of this is the "Faster CPython" work, part of it is other things (e.g. PEP 657 -- Fine Grained Error Locations in Tracebacks).
As a result, the set of fields of the code object is changing. This is fine, the structure is part of the internal API anyway.
But there's a problem with two public API functions, PyCode_New() and PyCode_NewWithPosArgs(). As we have them in the main (3.11) branch, their signatures are incompatible with previous versions, and they have to be since the set of values needed to create a code object is different. (The types.CodeType constructor signature is also changed, and so is its replace() method, but these aren't part of any stable API).
Unfortunately, PyCode_New() and PyCode_NewWithPosArgs() are part of the PEP 387 stable ABI. What should we do?
A. We could deprecate them, keep (restore) their old signatures, and create crippled code objects (no exception table, no endline/column tables, qualname defaults to name).
B. We could deprecate them, restore the old signatures, and always raise an error when they are called.
C. We could just delete them.
D. We could keep them, with modified signatures, and to heck with ABI compatibility for these two.
E. We could get rid of PyCode_NewWithPosArgs(), update PyCode_New() to add the posonlyargcount (which is the only difference between the two), and d*mn the torpedoes.
F. Like (E), but keep PyCode_NewWithPosArgs() as an alias for PyCode_New() (and deprecate it).
If these weren't part of the stable ABI, I'd choose (E). But because they are, I think only (A) or (B) are our options. The problem with (C) is that if there's code that links to them but doesn't call them (except in some corner case that the user can avoid), the code won't link even though it would work fine. The problem with (D) is that if it *is* called by code expecting the old signature it will segfault.
I'm not keen on (A) since it can cause broken code objects when used to copy a code object with some modified metadata (e.g. a different filename), since there's no way to pass the exception table (and several other fields, but the exception table is an integral part of the code now).
Code wanting to make slight modifications to code objects, such as changing co_name or co_filename, should switch to the .replace() API, which is much better at this (calling it from C is a pain, but it's possible). Code wanting to synthesize code should be updated for each release; we should probably require such code to be built with the internal API and use _PyCode_New(), which takes a struct argument containing all the necessary fields.
Thoughts? I'm especially interested in Petr's opinion given that this is a case where we'd like to deprecate something in the stable ABI.
See also discussion in https://bugs.python.org/issue40222 (esp. near the end).
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
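For readers unfamiliar with the .replace() API mentioned above, here is a minimal Python-level sketch of changing co_filename and co_name on an existing code object. It assumes Python 3.8 or later, where types.CodeType.replace() is available; the function `greet` and the replacement values are made up for illustration.

```python
import types


def greet(name):
    return f"hello, {name}"


original = greet.__code__

# replace() copies every field of the code object and overrides only the
# ones named here, so fields the caller does not know about are preserved.
patched = original.replace(co_filename="<relocated>", co_name="greet_renamed")

# Wrap the patched code object in a new function to show it still runs.
clone = types.FunctionType(patched, greet.__globals__, patched.co_name)
```

Because the caller never enumerates the full field list, this style of copy keeps working when new fields are added to code objects, which is the property the positional constructors lack.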
On 8/13/2021 1:24 PM, Guido van Rossum wrote:
In 3.11 we're changing a lot of details about code objects. Part of this is the "Faster CPython" work, part of it is other things (e.g. PEP 657 -- Fine Grained Error Locations in Tracebacks).
As a result, the set of fields of the code object is changing. This is fine, the structure is part of the internal API anyway.
But there's a problem with two public API functions, PyCode_New() and PyCode_NewWithPosArgs(). As we have them in the main (3.11) branch, their signatures are incompatible with previous versions, and they have to be since the set of values needed to create a code object is different. (The types.CodeType constructor signature is also changed, and so is its replace() method, but these aren't part of any stable API).
Unfortunately, PyCode_New() and PyCode_NewWithPosArgs() are part of the PEP 387 stable ABI. What should we do?
PEP 387 is Backwards Compatibility Policy https://www.python.org/dev/peps/pep-0387/
Did you mean PEP 384 -- Defining a Stable ABI https://www.python.org/dev/peps/pep-0384/
How are PyCode_xxx included? 384 defines code objects as 'internal objects': "In some cases, the incompatibilities only affect internal objects of the interpreter, such as frame or code objects." And Firefox does not find 'pycode'.
A. We could deprecate them, keep (restore) their old signatures, and create crippled code objects (no exception table, no endline/column tables, qualname defaults to name).
B. We could deprecate them, restore the old signatures, and always raise an error when they are called.
These seem pretty useless.
C. We could just delete them.
D. We could keep them, with modified signatures, and to heck with ABI compatibility for these two.
E. We could get rid of PyCode_NewWithPosArgs(), update PyCode_New() to add the posonlyargcount (which is the only difference between the two), and d*mn the torpedoes.
F. Like (E), but keep PyCode_NewWithPosArgs() as an alias for PyCode_New() (and deprecate it).
-- Terry Jan Reedy
On Fri, Aug 13, 2021 at 11:17 AM Terry Reedy <tjreedy@udel.edu> wrote:
On 8/13/2021 1:24 PM, Guido van Rossum wrote: [...]
Unfortunately, PyCode_New() and PyCode_NewWithPosArgs() are part of the PEP 387 stable ABI. What should we do?
PEP 387 is Backwards Compatibility Policy https://www.python.org/dev/peps/pep-0387/ Did you mean PEP 384 -- Defining a Stable ABI https://www.python.org/dev/peps/pep-0384/
Aargh! You're right! I misremembered the PEP number (and didn't double check) and when Pablo said we were "constrained by PEP 387" I assumed he meant the stable ABI. Sorry for the goof-up.
Now, backwards compatibility is still nothing to sneeze at, but at least we don't have to hem and haw about ABI compatibility.
The question then is, can we break (source) backwards compatibility for these two functions, because in practice code objects have been unstable? Or do we need to go the deprecation route here, using (A) or (B) for two releases? Presumably we could combine these two so that it reports a warning when called, and if the warning has been configured to be replaced with an error it will raise that.
Thanks also to Eric for pointing out the same thing. (I should probably have waited to discuss this with him before posting. Sorry again.)
On Fri, Aug 13, 2021 at 3:39 PM Jim J. Jewett <jimjjewett@gmail.com> wrote:
How badly would the code objects be crippled?
(no exception table, no endline/column tables, qualname defaults to name)
That sounds like it would be a pain for debugging, but might still work for source/code that hadn't actually changed and was just being re-compiled and run (possibly with the caveat that the data needs to be clean enough to avoid exceptions).
Yeah, I would be okay with faking it for endline/column tables and qualname, but I balk at the exception table -- this is used for the new 3.11 concept "zero cost exception handling" where a try block doesn't require any setup (the opcode for this has gone away). Too much valid code catches exceptions (e.g. KeyError) to trust that this will mostly do the right thing.
Since that is a common use case, and one where there is a good reason not to make any source-level changes, it would be good to keep compatibility for that minimal level.
I'm actually not sure what use case you're talking about.
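The exception table discussed in this reply can be inspected from Python. The sketch below assumes that 3.11 and later expose it as the `co_exceptiontable` attribute of a code object; `getattr` with a default keeps the snippet runnable on older versions, where handlers are set up by opcodes instead. The function `safe_div` is made up for illustration.

```python
import sys


def safe_div(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return None


code = safe_div.__code__

# On 3.11+ the handler ranges live in a separate bytes object; on older
# versions the attribute does not exist, so fall back to None.
table = getattr(code, "co_exceptiontable", None)

if table is None:
    summary = "no separate exception table on this version"
else:
    summary = f"exception table: {len(table)} bytes"

print(sys.version_info[:2], summary)
```

A constructor that cannot receive this table has no way to rebuild a working handler for the `except` clause, which is why the exception table is treated differently from the optional location tables.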
On 8/13/2021 8:45 PM, Guido van Rossum wrote:
Now, backwards compatibility is still nothing to sneeze at, but at least we don't have to hem and haw about ABI compatibility.
If back compatibility were our sacred prime directive, then we would not (or should not) have changed code objects in a way that threatens it.
The question then is, can we break (source) backwards compatibility for these two functions, because in practice code objects have been unstable?
I presume that the existence of those functions is based on an expectation of stability that may have been true for years, but not recently. Or an expectation that any change would be part of a one-time overhaul where compatibility was suspended as much as for 3.0. I think that whatever we do now should allow for continued version-by-version changes in the future. -- Terry Jan Reedy
Guido van Rossum wrote:
On Fri, Aug 13, 2021 at 11:17 AM Terry Reedy tjreedy@udel.edu wrote:
On 8/13/2021 1:24 PM, Guido van Rossum wrote:
I'm actually not sure what use case you're talking about.
"source/code that hadn't actually changed and was just being re-compiled and run". I've done it as the first stage of porting or reviving long-dead code. I've seen it done when deployment is handed off to a different group than development, and also when using a library component that had not been recently maintained.
I'm a major consumer of these APIs as part of some commercial projects (unfortunately I can't discuss too much further) but we find it worth the effort of keeping up with CPython internal changes to continue using them.
Option D seems like the best option from my point of view; any user would need to be able to keep up with bytecode changes already so I think backwards compatibility isn't really a concern. Unless you feel it's extremely important to follow PEP 387 (slippery slope?), I'd just say "to hell with it".
Also, at least if it segfaults, people who haven't updated their code will know very quickly, whereas some other options might cause subtler bugs.
On Fri, Aug 13, 2021 at 11:29 AM Guido van Rossum <guido@python.org> wrote:
If these weren't part of the stable ABI, I'd choose (E).
They aren't in the stable ABI (or limited API). Instead, they are part of the broader public API (declared in Include/cpython/code.h, along with "struct PyCodeObject" and others).
FWIW, there is actually very little API related to PyCodeObject that is in the limited API:
* Include/code.h:typedef struct PyCodeObject PyCodeObject;
* Include/genobject.h: PyCodeObject *prefix##_code;
* Include/pyframe.h:PyAPI_FUNC(PyCodeObject *) PyFrame_GetCode(PyFrameObject *frame);
All that said, the issue of compatibility remains. I mostly agree with Guido's analysis and his choice of (E), as long as it's appropriately documented as unstable. However, I'd probably pick (C) with a caveat.
We already have a classification for this sort of unstable API: "internal". Given how code objects are so coupled to the CPython internals, I suggest that most API related to PyCodeObject belongs in the internal API (in Include/internal/pycore_code.h) and thus moved out of the public API. Folks that are creating code objects manually via the C-API are probably already doing low-level stuff that requires other "internal" API (via Py_BUILD_CORE, etc.). Otherwise they should use types.CodeType instead.
Making that change would naturally include dropping PyCode_New() and PyCode_NewWithPosArgs(), as described in (C). However, we already have _PyCode_New() in the internal API. (It is slightly different but effectively equivalent.) We could either drop the underscore on _PyCode_New() or move the existing PyCode_NewWithPosArgs() (renamed to PyCode_New) to live beside it.
-eric
How badly would the code objects be crippled?
(no exception table, no endline/column tables, qualname defaults to name)
That sounds like it would be a pain for debugging, but might still work for source/code that hadn't actually changed and was just being re-compiled and run (possibly with the caveat that the data needs to be clean enough to avoid exceptions). Since that is a common use case, and one where there is a good reason not to make any source-level changes, it would be good to keep compatibility for that minimal level. -jJ
13.08.21 20:24, Guido van Rossum wrote:
If these weren't part of the stable ABI, I'd choose (E). But because they are, I think only (A) or (B) are our options. The problem with (C) is that if there's code that links to them but doesn't call them (except in some corner case that the user can avoid), the code won't link even though it would work fine. The problem with (D) is that if it *is* called by code expecting the old signature it will segfault. I'm not keen on (A) since it can cause broken code objects when used to copy a code object with some modified metadata (e.g. a different filename), since there's no way to pass the exception table (and several other fields, but the exception table is an integral part of the code now).
I agree that (A) and (B) are the only options if we preserve binary compatibility. For practical reasons I prefer (B). We could make (A) work if we added the exception table to the end of the bytecode array and the endline/column tables to the end of the lineno table. That would allow reconstructing the code object with some simple changes (like a different filename or some replaced constants). Creating a code object from scratch is version-specific in any case, because the bytecode changes in every version, and the semantics of some fields can change too (e.g. support of negative offsets in the lineno table). But it would complicate the code object structure, and the code that works with it, in the long term.
On Sat, Aug 14, 2021 at 4:56 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
13.08.21 20:24, Guido van Rossum wrote:
If these weren't part of the stable ABI, I'd choose (E). But because they are, I think only (A) or (B) are our options. The problem with (C) is that if there's code that links to them but doesn't call them (except in some corner case that the user can avoid), the code won't link even though it would work fine. The problem with (D) is that if it *is* called by code expecting the old signature it will segfault. I'm not keen on (A) since it can cause broken code objects when used to copy a code object with some modified metadata (e.g. a different filename), since there's no way to pass the exception table (and several other fields, but the exception table is an integral part of the code now).
I agree that (A) and (B) are the only options if we preserve binary compatibility. For practical reasons I prefer (B).
We could make (A) work if we added the exception table to the end of the bytecode array and the endline/column tables to the end of the lineno table. That would allow reconstructing the code object with some simple changes (like a different filename or some replaced constants). Creating a code object from scratch is version-specific in any case, because the bytecode changes in every version, and the semantics of some fields can change too (e.g. support of negative offsets in the lineno table). But it would complicate the code object structure, and the code that works with it, in the long term.
That sounds like a perversion of backward compatibility. The endline and column tables are already optional (there's a flag to suppress them and then the fields are set to None) and we can just not support code that catches exceptions. If you take e.g.
    def f(x):
        try:
            1/0
        except:
            print("NO")
and you remove the exception table you get code that is equivalent to this:
    def f(x):
        1/0
plus some unreachable bytecode.
My current proposal is to issue a DeprecationWarning in PyCode_New() and PyCode_NewWithPosArgs(), which can be turned into an error using a command-line flag. If it's made an error, we effectively have (B); by default, we have (A).
Then in 3.13 we can drop them completely.
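The "warning by default, error on request" behaviour proposed here can be sketched with the standard warnings module. In the snippet below, `legacy_code_new` is a hypothetical stand-in for a deprecated constructor, not a real CPython API, and the dictionary it returns is only a placeholder for a reduced code object. The filter applied inside `call_with_filter` mimics what a command-line option such as `-W error::DeprecationWarning` does process-wide.

```python
import warnings


def legacy_code_new(name):
    """Hypothetical deprecated constructor, for illustration only."""
    warnings.warn("legacy_code_new() is deprecated",
                  DeprecationWarning, stacklevel=2)
    # Option (A): still return a reduced result after warning.
    return {"name": name, "qualname": name, "exceptiontable": b""}


def call_with_filter(action):
    # Scope the filter to this call so the example does not leak state.
    with warnings.catch_warnings():
        warnings.simplefilter(action, DeprecationWarning)
        try:
            return legacy_code_new("f")
        except DeprecationWarning as exc:
            # Option (B): the warning was promoted to an exception.
            return f"error: {exc}"
```

With the "ignore" or "default" action the call returns the reduced object; with "error" the same call fails, so one implementation covers both (A) and (B).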
On 8/16/2021 12:47 AM, Guido van Rossum wrote:
My current proposal is to issue a DeprecationWarning in PyCode_New() and PyCode_NewWithPosArgs(), which can be turned into an error using a command-line flag. If it's made an error, we effectively have (B); by default, we have (A).
Then in 3.13 we can drop them completely.
We definitely had legitimate use cases come up when adding positional arguments (hence the new API, rather than breaking the existing one, which was the first attempt at adding the feature). I don't recall exactly what they are (perhaps Pablo does, or they may be in email/issue archives), but since they exist, presumably they are useful and viable _despite_ the bytecode varying between releases. This suggests there's probably a better API we should add at the same time - possibly some kind of unmarshalling or cloning-with-updates function? Cheers, Steve
On Mon, Aug 16, 2021 at 9:30 AM Steve Dower <steve.dower@python.org> wrote:
On 8/16/2021 12:47 AM, Guido van Rossum wrote:
My current proposal is to issue a DeprecationWarning in PyCode_New() and PyCode_NewWithPosArgs(), which can be turned into an error using a command-line flag. If it's made an error, we effectively have (B); by default, we have (A).
Then in 3.13 we can drop them completely.
We definitely had legitimate use cases come up when adding positional arguments (hence the new API, rather than breaking the existing one, which was the first attempt at adding the feature).
I don't recall exactly what they are (perhaps Pablo does, or they may be in email/issue archives), but since they exist, presumably they are useful and viable _despite_ the bytecode varying between releases. This suggests there's probably a better API we should add at the same time - possibly some kind of unmarshalling or cloning-with-updates function?
I presume the use cases are essentially some variant of the .replace() API that exists at the Python level. At the C level you would get all fields from an existing code object and pass them to PyCode_New[WithPosArgs] except for e.g. the co_filename field. Unfortunately those use cases will still break if there are any try blocks in the code or if the endline/column info is needed. Also you can't access any of the code object's fields without the internal API (this wasn't always so). So I think it's different now.
On Tue, 17 Aug 2021, 4:30 am Guido van Rossum, <guido@python.org> wrote:
On Mon, Aug 16, 2021 at 9:30 AM Steve Dower <steve.dower@python.org> wrote:
On 8/16/2021 12:47 AM, Guido van Rossum wrote:
My current proposal is to issue a DeprecationWarning in PyCode_New() and PyCode_NewWithPosArgs(), which can be turned into an error using a command-line flag. If it's made an error, we effectively have (B); by default, we have (A).
Then in 3.13 we can drop them completely.
We definitely had legitimate use cases come up when adding positional arguments (hence the new API, rather than breaking the existing one, which was the first attempt at adding the feature).
I don't recall exactly what they are (perhaps Pablo does, or they may be in email/issue archives), but since they exist, presumably they are useful and viable _despite_ the bytecode varying between releases. This suggests there's probably a better API we should add at the same time - possibly some kind of unmarshalling or cloning-with-updates function?
I presume the use cases are essentially some variant of the .replace() API that exists at the Python level. At the C level you would get all fields from an existing code object and pass them to PyCode_New[WithPosArgs] except for e.g. the co_filename field. Unfortunately those use cases will still break if there are any try blocks in the code or if the endline/column info is needed. Also you can't access any of the code object's fields without the internal API (this wasn't always so). So I think it's different now.
A cloning-with-replacement API that accepted the base code object and the "safe to modify" fields could be a good complement to the API deprecation proposal. Moving actual "from scratch" code object creation behind the Py_BUILD_CORE guard with an underscore prefix on the name would also make sense, since it defines a key piece of the compiler/interpreter boundary. Cheers, Nick. P.S. Noting an idea that won't work, in case anyone else reading the thread was thinking the same thing: a "PyType_FromSpec" style API won't help here, as the issue is that the compiler is now doing more work up front and recording that extra info in the code object for the interpreter to use. There is no way to synthesise that info if it isn't passed to the constructor, as it isn't intrinsically recorded in the opcode sequence.
On Mon, Aug 16, 2021 at 4:44 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
[...] A cloning-with-replacement API that accepted the base code object and the "safe to modify" fields could be a good complement to the API deprecation proposal.
Yes (I forgot to mention that).
Moving actual "from scratch" code object creation behind the Py_BUILD_CORE guard with an underscore prefix on the name would also make sense, since it defines a key piece of the compiler/interpreter boundary.
Yeah, we have _PyCode_New() for that.
Cheers, Nick.
P.S. Noting an idea that won't work, in case anyone else reading the thread was thinking the same thing: a "PyType_FromSpec" style API won't help here, as the issue is that the compiler is now doing more work up front and recording that extra info in the code object for the interpreter to use. There is no way to synthesise that info if it isn't passed to the constructor, as it isn't intrinsically recorded in the opcode sequence.
That's the API style that _PyCode_New() uses (thanks to Eric who IIRC pushed for this and implemented it). You gave me an idea now: the C equivalent to .replace() could use the same input structure; one can leave fields NULL that should be copied from the original unmodified.
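The "leave a field NULL to copy it from the original" idea can be illustrated at the Python level, with None playing the role of NULL. This is only a sketch of the shape of such an API: `CodeOverrides` and `clone_code` are hypothetical names, the set of overridable fields is an arbitrary choice, and the real work is delegated to the existing CodeType.replace() (Python 3.8+).

```python
import types
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CodeOverrides:
    """Hypothetical spec: None means 'copy this field from the original'."""
    co_filename: Optional[str] = None
    co_name: Optional[str] = None
    co_firstlineno: Optional[int] = None
    co_consts: Optional[tuple] = None


def clone_code(base, spec):
    # Keep only the fields the caller actually set.
    changes = {f.name: getattr(spec, f.name)
               for f in fields(spec)
               if getattr(spec, f.name) is not None}
    # Bytecode, exception table and location tables are carried over as-is.
    return base.replace(**changes)


def answer():
    return 42


cloned = clone_code(answer.__code__, CodeOverrides(co_filename="<generated>"))
rebuilt = types.FunctionType(cloned, globals())
```

Adding a field to the spec later does not break existing callers, since they simply leave the new field unset.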
Doing a search of a huge codebase (work), the predominant user of PyCode_New* APIs appears to be checked-in Cython-generated code (in all sorts of third_party OSS projects). It's in the boilerplate that Cython extensions make use of via its __Pyx_PyCode_New macro. https://github.com/cython/cython/blob/master/Cython/Utility/ModuleSetupCode....
I saw very few non-Cython uses. There are some, but at a very quick first glance they appear simple - easy enough to reach out to the projects with a PR to update their code.
The Cython use will require people to upgrade Cython and regenerate their code before they can use the Python version that changes these. That is not an uncommon thing for Cython. It's unfortunate that many projects only ship generated sources rather than use Cython at build time, but that isn't _our_ problem to solve. The more often we change internal APIs that things depend on, the more people will move their projects towards doing the right thing: either not using said APIs, or rerunning an up-to-date code generator as part of their build instead of checking in generated sources that use unstable APIs.
-gps
I'm late to the thread, and as I read it I see everything I wanted to say was covered already :) So just a few clarifications.
The stable ABI is not defined by PEP 384 or PEP 652 or by the header something is defined in, but by the docs:
- https://docs.python.org/dev/c-api/stable.html
and changes to it are covered in the devguide:
- https://devguide.python.org/c-api/
We have 3 API layers:
1. Internal API, guarded by Py_BUILD_CORE, can break *even in point releases*. (Py_BUILD_CORE means just that: things like `_PyCode_New` can only be used safely if you build/embed CPython yourself.)
2. Regular C-API, covered by PEP 387 (breaking changes need deprecation for 2 releases, or an exception from the SC); `PyCode_New*` is here now
3. Stable ABI, which is hard to change, and thankfully isn't involved here.
I can see that having a `.replace()` equivalent in the C API would be "worth the effort of [its users having to] keeping up with CPython internal changes" (to quote Patrick).
Looks like we could use something between layers 1 and 2 above for "high-maintenance" users (like Cython): API that will work for all of 3.11.x, but can freely break for 3.12. I don't think this needs an explicit API layer, though: just a note in the docs that a new `PyCode_NewWithAllTheWithBellsAndWhistles` is expected to change between feature releases.
But...
Guido:
[struct rather than N arguments] is the API style that _PyCode_New() uses (thanks to Eric who IIRC pushed for this and implemented it). You gave me an idea now: the C equivalent to .replace() could use the same input structure; one can leave fields NULL that should be copied from the original unmodified.
From a usability point of view, that's a much better idea than a function that's expected to change. It would probably also be easier to implement than an entirely separate public API.
Nick:
P.S. Noting an idea that won't work, in case anyone else reading the thread was thinking the same thing: a "PyType_FromSpec" style API won't help here, as the issue is that the compiler is now doing more work up front and recording that extra info in the code object for the interpreter to use. There is no way to synthesise that info if it isn't passed to the constructor, as it isn't intrinsically recorded in the opcode sequence.
I guess it might be possible to add a flag that says a piece of bytecode has exception handling and so needs an exception table, and have the old API raise when the flag is on. It's probably not worth the effort, though.
Since Cython is a common consumer of this C API, can somone please dig into Cython to see exactly what it needs in terms of API? How does Cython create all arguments of the __Pyx_PyCode_New() macro? Does it copy an existing function to only override some fields, something like CodeType.replace(field=new_value)? If possible, I would prefer that Cython only uses the *public* C API. Otherwise, it will be very likely that Cython will break at every single Python release. Cython has a small team to maintain the code base, whereas CPython evolves much faster with a larger team. Victor On Tue, Aug 17, 2021 at 8:51 AM Gregory P. Smith <greg@krypto.org> wrote:
Doing a search of a huge codebase (work), the predominant user of PyCode_New* APIs appears to be checked-in Cython-generated code (in all sorts of third_party OSS projects). It's in the boilerplate that Cython extensions make use of via its __Pyx_PyCode_New macro. https://github.com/cython/cython/blob/master/Cython/Utility/ModuleSetupCode....
I saw very few non-Cython uses. There are some, but at a very quick first glance they appear simple - easy enough to reach out to the projects with a PR to update their code.
The Cython use will require people to upgrade Cython and regenerate their code before they can use the Python version that changes these. That is not an uncommon thing for Cython. It's unfortunate that many projects only ship generated sources rather than use Cython at build time, but that isn't _our_ problem to solve. The more often we change internal APIs that things depend on, the more people will move their projects towards doing the right thing: either not using said APIs, or rerunning an up-to-date code generator as part of their build instead of checking in generated sources that use unstable APIs.
-gps
On Mon, Aug 16, 2021 at 8:04 PM Guido van Rossum <guido@python.org> wrote:
On Mon, Aug 16, 2021 at 4:44 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
[...] A cloning-with-replacement API that accepted the base code object and the "safe to modify" fields could be a good complement to the API deprecation proposal.
Yes (I forgot to mention that).
Moving actual "from scratch" code object creation behind the Py_BUILD_CORE guard with an underscore prefix on the name would also make sense, since it defines a key piece of the compiler/interpreter boundary.
Yeah, we have _PyCode_New() for that.
Cheers, Nick.
P.S. Noting an idea that won't work, in case anyone else reading the thread was thinking the same thing: a "PyType_FromSpec" style API won't help here, as the issue is that the compiler is now doing more work up front and recording that extra info in the code object for the interpreter to use. There is no way to synthesise that info if it isn't passed to the constructor, as it isn't intrinsically recorded in the opcode sequence.
That's the API style that _PyCode_New() uses (thanks to Eric who IIRC pushed for this and implemented it). You gave me an idea now: the C equivalent to .replace() could use the same input structure; one can leave fields NULL that should be copied from the original unmodified.
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?) _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NWYMCDAM... Code of Conduct: http://python.org/psf/codeofconduct/
-- Night gathers, and now my watch begins. It shall not end until my death.
Victor Stinner wrote:
Since Cython is a common consumer of this C API, can someone please dig into Cython to see exactly what it needs in terms of API? How does Cython create all arguments of the __Pyx_PyCode_New() macro? Does it copy an existing function to only override some fields, something like CodeType.replace(field=new_value)?

If possible, I would prefer that Cython only uses the *public* C API. Otherwise, it will be very likely that Cython will break at every single Python release. Cython has a small team to maintain the code base, whereas CPython evolves much faster with a larger team.

Victor
Cython only uses it here, to construct a fake code object with mostly blank values for each of the Cython function objects: https://github.com/cython/cython/blob/8af0271186cc642436306274564986888d5e64...

It's not actually intending to execute the code object at any point. It's created so that fake tracebacks get a filename/line number, so that the function objects have the argument count values for inspect to read, and so that Cython can emulate profiler/tracer events.

So for Cython's case in particular it could use a more stable API function, though said hypothetical API would have had to be updated to handle pos-only args anyway.
+cc: cython-devel

background reading for those new to the thread: https://mail.python.org/archives/list/python-dev@python.org/thread/ZWTBR5ESY...

On Tue, Aug 17, 2021 at 9:47 AM Victor Stinner <vstinner@python.org> wrote:
Since Cython is a common consumer of this C API, can someone please dig into Cython to see exactly what it needs in terms of API? How does Cython create all arguments of the __Pyx_PyCode_New() macro? Does it copy an existing function to only override some fields, something like CodeType.replace(field=new_value)?
If possible, I would prefer that Cython only uses the *public* C API. Otherwise, it will be very likely that Cython will break at every single Python release. Cython has a small team to maintain the code base, whereas CPython evolves much faster with a larger team.
Victor
I don't claim knowledge of Cython internals, but the two places it appears to call its __Pyx_PyCode_New macro are:

https://github.com/cython/cython/blob/master/Cython/Utility/Exceptions.c#L76... in __Pyx_CreateCodeObjectForTraceback() - this one already has a `#if CYTHON_COMPILING_IN_LIMITED_API` code path option in it.

and

https://github.com/cython/cython/blob/master/Cython/Compiler/ExprNodes.py#L9... in CodeObjectNode.generate_result_code() that creates PyCodeObjects for CyFunction instances per its comment. Slightly described in this comment http://google3/third_party/py/cython/files/Cython/Compiler/ExprNodes.py?l=39.... I don't see anything obvious mentioning the limited API in that code generator.

It'd be best to loop in Cython maintainers for more of an idea of Cython's intents and needs with PyCode_New APIs. I've cc'd cython-devel@python.org.

-Greg
Both locations Greg found seem to be creating a completely *empty* code object except for the name, filename and lineno. So that would actually work fine with my proposal.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Guido van Rossum schrieb am 13.08.21 um 19:24:
In 3.11 we're changing a lot of details about code objects. Part of this is the "Faster CPython" work, part of it is other things (e.g. PEP 657 -- Fine Grained Error Locations in Tracebacks).
As a result, the set of fields of the code object is changing. This is fine, the structure is part of the internal API anyway.
But there's a problem with two public API functions, PyCode_New() and PyCode_NewWithPosArgs(). As we have them in the main (3.11) branch, their signatures are incompatible with previous versions, and they have to be since the set of values needed to create a code object is different. (The types.CodeType constructor signature is also changed, and so is its replace() method, but these aren't part of any stable API).
Unfortunately, PyCode_New() and PyCode_NewWithPosArgs() are part of the PEP 387 stable ABI. What should we do?
A. We could deprecate them, keep (restore) their old signatures, and create crippled code objects (no exception table, no endline/column tables, qualname defaults to name).
B. We could deprecate them, restore the old signatures, and always raise an error when they are called.
C. We could just delete them.
D. We could keep them, with modified signatures, and to heck with ABI compatibility for these two.
E. We could get rid of PyCode_NewWithPosArgs(), update PyCode() to add the posonlyargcount (which is the only difference between the two), and d*mn the torpedoes.
F. Like (E), but keep PyCode_NewWithPosArgs() as an alias for PyCode_New() (and deprecate it).
If these weren't part of the stable ABI, I'd choose (E). [...]
I also vote for (E). The creation of a code object is tied to interpreter internals and thus shouldn't be (or have been) declared stable.

I think the only problem with that argument is that code objects are required for frames. You could argue the same way about frames, but then it becomes really tricky to, you know, create frames for non-Python code.

Since we're discussing this in the context of PEP 657, I wonder if there's a better way to create tracebacks from C code, other than creating fake frames with fake code objects.

Cython uses code objects and frames for the following use cases:

- tracing generated C code at the Python syntax level
- profiling C-implemented functions
- tracebacks for C code

Having a way to do these three efficiently (i.e. with close to zero runtime overhead) without having to reach into internals of the interpreter state, code objects and frames, would be nice.

Failing that, I'm ok with declaring the relevant structs and C-API functions non-stable and letting Cython use them as such, as we always did.

Stefan
If creating a fake frame is a common use case, we can maybe write a public C API for that. For example, I saw parser injecting frames to show the file name and line number of the parsed file in the traceback.

Victor
On Wed, 1 Sept 2021 at 17:48, Victor Stinner <vstinner@python.org> wrote:
If creating a fake frame is a common use case, we can maybe write a public C API for that.

I don't think we should think on those terms. We certainly don't want to be on a case where yet again we cannot change the internals because we have an official C-API exposed.

Victor Stinner wrote:
For example, I saw parser injecting frames to show the file name and line number of the parsed file in the traceback.

You saw "parser" doing that? What is "parser"? Certainly is not the CPython parser because I don't recall that we do any of that.
I saw Python projects injecting fake frames for XML and JSON parsers, maybe also configuration file (.ini?) parsers. So text files which have line numbers ;-) On Wed, Sep 1, 2021 at 7:33 PM Pablo Galindo Salgado <pablogsal@gmail.com> wrote:
I don't think we should think on those terms. We certainly don't want to be on a case where yet again we cannot change the internals because we have an official C-API exposed.
PyCode_New() is annoying since it requires providing *all* arguments. I'm thinking of a Frame API which only allows setting 2 values: filename and line number. Nothing else.

Something like: frame = PyFrame_New(filename, lineno).

Victor
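For context, the pure-Python trick that such projects commonly use to get a "fake" frame into a traceback can be sketched like this. It is an illustration of the use case only, not the proposed C API; `raise_at`, the `__exc` variable and the `settings.ini` file name are invented for the example, and `lineno` is assumed to be 1 or greater.

```python
import traceback


def raise_at(exc: BaseException, filename: str, lineno: int) -> None:
    """Raise `exc` from a synthetic frame reported as filename:lineno."""
    # Pad with blank lines so the `raise` statement sits on the wanted line.
    source = "\n" * (lineno - 1) + "raise __exc"
    # compile() records `filename` in the resulting code object; the file
    # does not need to exist on disk.
    code = compile(source, filename, "exec")
    exec(code, {"__exc": exc})


try:
    raise_at(ValueError("bad value"), "settings.ini", 12)
except ValueError as err:
    frames = traceback.extract_tb(err.__traceback__)
    last = frames[-1]
    print(last.filename, last.lineno)
```

Only the file name and line number are chosen by the caller here, which matches the two values the suggested API would accept.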
On Thu, 2 Sep 2021, 7:08 am Victor Stinner, <vstinner@python.org> wrote:
Something like: frame = PyFrame_New(filename, lineno).
Perhaps "PyCode_FromLocation(filename, lineno)" for the new "good enough to satisfy trace hooks and exception tracebacks" API? These are needed when creating function-like (et al) objects from extension module code, not just for runtime frames.

And then explicitly define which code object fields are expected to always be populated and which are expected to be None on emulated code objects that don't contain Python byte code? (e.g. some fields should be zero or the empty tuple, rather than being set to None)

Cheers,
Nick.
On 2/09/21 4:46 am, Victor Stinner wrote:
If creating a fake frame is a common use case, we can maybe write a public C API for that. For example, I saw parser injecting frames to show the file name and line number of the parsed file in the traceback.
The way I would like to see this addressed is to make it possible to attach a filename and line number directly to a traceback object, without needing a frame or code object at all. Creating a fake frame and code object just to do this is IMO an ugly hack that should not be necessary. -- Greg
On Thu, 2 Sep 2021 13:31:32 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 2/09/21 4:46 am, Victor Stinner wrote:
If creating a fake frame is a common use case, we can maybe write a public C API for that. For example, I saw parser injecting frames to show the file name and line number of the parsed file in the traceback.
The way I would like to see this addressed is to make it possible to attach a filename and line number directly to a traceback object, without needing a frame or code object at all.
Tracebacks are linked in a single direction, to go the other direction you need to walk the frames attached to the traceback. If there is no frame on the traceback, you cannot go the other direction. So a (fake or not) frame object is still desirable, IMHO. Regards Antoine.
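The two directions can be seen from Python with a small sketch (the function names `inner` and `outer` are made up for the example): `tb_next` walks from the place the exception was caught towards the place it was raised, while going back towards the callers means following `f_back` on the frame attached to a traceback entry.

```python
import sys


def inner():
    raise RuntimeError("boom")


def outer():
    inner()


try:
    outer()
except RuntimeError:
    tb = sys.exc_info()[2]

# Forward direction: traceback entries are chained through tb_next only.
forward = []
node = tb
while node is not None:
    forward.append(node.tb_frame.f_code.co_name)
    node = node.tb_next

# Find the innermost traceback entry (where the exception was raised).
last = tb
while last.tb_next is not None:
    last = last.tb_next

# Backward direction: there is no "tb_prev", so walk the frames instead.
backward = []
frame = last.tb_frame
while frame is not None:
    backward.append(frame.f_code.co_name)
    frame = frame.f_back

print(forward[-2:])
print(backward[:2])
```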
On 2/09/21 7:46 pm, Antoine Pitrou wrote:
Tracebacks are linked in a single direction, to go the other direction you need to walk the frames attached to the traceback.
So a (fake or not) frame object is still desirable, IMHO.
Could we at least remove the necessity for a fake code object? -- Greg
FWIW I've applied for an exception from the two-release deprecation policy from the SC: https://github.com/python/steering-council/issues/75
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Thu, Sep 2, 2021 at 11:15 PM Guido van Rossum <guido@python.org> wrote:
FWIW I've applied for an exception from the two-release deprecation policy from the SC: https://github.com/python/steering-council/issues/75
On the PyPI top 5000 packages, 136 contain "PyCode" in the source. I didn't check how many are using Cython.
yarl-1.6.3.tar.gz
wsaccel-0.6.3.tar.gz
wordcloud-1.8.1.tar.gz
uvloop-0.16.0.tar.gz
tslearn-0.5.2.tar.gz
uamqp-1.4.1.tar.gz
tweedledum-1.1.0.tar.gz
tinycss-0.4.tar.gz
thriftpy-0.3.9.tar.gz
thriftpy2-0.4.14.tar.gz
Theano-PyMC-1.1.2.tar.gz
Theano-1.0.5.tar.gz
TA-Lib-0.4.21.tar.gz
tables-3.6.1.tar.gz
ssh2-python-0.26.0.tar.gz
srsly-2.4.1.tar.gz
ssh-python-0.9.0.tar.gz
statsmodels-0.12.2.tar.gz
sphinx-gallery-0.9.0.tar.gz
sktime-0.7.0.tar.gz
Shapely-1.7.1.tar.gz
wxPython-4.1.1.tar.gz
scikit-image-0.18.2.tar.gz
sasl-0.3.1.tar.gz
scikit-surprise-1.1.1.tar.gz
s2sphere-0.2.5.tar.gz
ruptures-1.1.4.tar.gz
runstats-2.0.0.tar.gz
ruamel.yaml.clib-0.2.6.tar.gz
reedsolo-1.5.4.tar.gz
recordclass-0.15.1.tar.gz
reportlab-3.6.1.tar.gz
rasterio-1.2.6.tar.gz
rapidfuzz-1.4.1.tar.gz
qiskit-terra-0.18.1.tar.gz
pyzmq-22.2.1.tar.gz
pyxDamerauLevenshtein-1.7.0.tar.gz
PyWavelets-1.1.1.tar.gz
python-crfsuite-0.9.7.tar.gz
py_spy-0.3.8.tar.gz
pysimdjson-4.0.2.tar.gz
pysam-0.16.0.1.tar.gz
pypcap-1.2.3.tar.gz
pyngrok-5.0.6.tar.gz
PyLBFGS-0.2.0.13.tar.gz
pyjq-2.5.2.tar.gz
pyhacrf-datamade-0.2.5.tar.gz
pyflux-0.4.15.tar.gz
pygame-2.0.1.tar.gz
pyemd-0.5.1.tar.gz
pydevd-2.4.1.tar.gz
pydevd-pycharm-212.5080.18.tar.gz
plyvel-1.3.0.tar.gz
pmdarima-1.8.2.tar.gz
peewee-3.14.4.tar.gz
paramiko-2.7.2.tar.gz
osmium-3.2.0.tar.gz
orderedset-2.0.3.tar.gz
numpydoc-1.1.0.tar.gz
numdifftools-0.9.40.tar.gz
numba-0.53.1.tar.gz
numcodecs-0.8.1.tar.gz
NetfilterQueue-0.8.1.tar.gz
neobolt-1.7.17.tar.gz
Naked-0.1.31.tar.gz
mypy-0.910.tar.gz
msgpack-python-0.5.6.tar.gz
msgpack-1.0.2.tar.gz
mojimoji-0.0.11.tar.gz
mpi4py-3.1.1.tar.gz
matrixprofile-1.1.10.tar.gz
marisa-trie-0.7.7.tar.gz
lupa-1.9.tar.gz
lxml-4.6.3.tar.gz
lsm-db-0.6.4.tar.gz
linearmodels-4.24.tar.gz
lightfm-1.16.tar.gz
Levenshtein-0.13.0.tar.gz
leven-1.0.4.tar.gz
lda-2.0.0.tar.gz
jsonobject-0.9.10.tar.gz
jq-1.2.1.tar.gz
JPype1-1.3.0.tar.gz
jenkspy-0.2.0.tar.gz
implicit-0.4.4.tar.gz
imgui-1.3.0.tar.gz
imbalanced-learn-0.8.0.tar.gz
imagecodecs-2021.7.30.tar.gz
httptools-0.3.0.tar.gz
httpretty-1.1.4.tar.gz
hmmlearn-0.2.6.tar.gz
hdbscan-0.8.27.tar.gz
gssapi-1.6.14.tar.gz
grpcio-tools-1.39.0.tar.gz
grpcio-1.39.0.tar.gz
graphene-federation-0.1.0.tar.gz
GPy-1.10.0.tar.gz
gluonnlp-0.10.0.tar.gz
gevent-21.8.0.tar.gz
gensim-4.0.1.tar.gz
fuzzyset-0.0.19.tar.gz
fuzzysearch-0.7.3.tar.gz
frozendict-2.0.6.tar.gz
flower-1.0.0.tar.gz
Fiona-1.8.20.tar.gz
fastrlock-0.6.tar.gz
fastparquet-0.7.1.tar.gz
fastdtw-0.3.4.tar.gz
fastavro-1.4.4.tar.gz
edlib-1.3.8.post2.tar.gz
editdistance-0.5.3.tar.gz
econml-0.12.0.tar.gz
dtaidistance-2.3.2.tar.gz
DoubleMetaphone-0.1.tar.gz
django-localflavor-3.1.tar.gz
dependency-injector-4.35.2.tar.gz
dedupe-hcluster-0.3.8.tar.gz
dedupe-2.0.8.tar.gz
ddtrace-0.51.2.tar.gz
cytoolz-0.11.0.tar.gz
Cython-0.29.24.tar.gz
correctionlib-2.0.0.tar.gz
clickhouse-driver-0.2.1.tar.gz
cityhash-0.2.3.post9.tar.gz
cchardet-2.1.7.tar.gz
causalml-0.11.1.tar.gz
Cartopy-0.19.0.post1.tar.gz
av-8.0.3.tar.gz
asyncpg-0.24.0.tar.gz
astral-2.2.tar.gz
arch-5.0.1.tar.gz
arcgis-1.9.0.tar.gz
altgraph-0.17.tar.gz
aiokafka-0.7.1.tar.gz
aiohttp-3.7.4.post0.tar.gz
affinegap-1.11.tar.gz
Victor
On Fri, Sep 3, 2021 at 4:12 PM Victor Stinner <vstinner@python.org> wrote:
On Thu, Sep 2, 2021 at 11:15 PM Guido van Rossum <guido@python.org> wrote:
FWIW I've applied for an exception from the two-release deprecation policy from the SC: https://github.com/python/steering-council/issues/75
On the PyPI top 5000 packages, 136 contain "PyCode" in the source. I didn't check how many are using Cython.
Most of them. :-)
I wrote a script to do a similar search on the 4000 most popular packages, disregarding Cython-generated files (these have "/* Generated by Cython <version> */" in their first line). Now the list collapsed to this:
Cython-3.0a7.tar.gz: 11 hits in 3 files
frozendict-2.0.6.tar.gz: 14 hits in 8 files
gevent-21.8.0.tar.gz: 1 hits in 1 files
JPype1-1.3.0.tar.gz: 1 hits in 1 files
mypy-0.910.tar.gz: 2 hits in 1 files
reportlab-3.6.1.tar.gz: 1 hits in 1 files
setuptools-9.1.tar.gz: 1 hits in 1 files
Of these:
Cython: obviously :-)
frozendict: calls PyCode_NewEmpty; seems to include modified CPython headers
gevent: Uses Cython's __Pyx_PyCode_New in a generated .h file
JPype: calls PyCode_NewEmpty
mypy: PyCode_NewEmpty mentioned in a comment
reportlab: calls PyCode_NewEmpty
setuptools: in a file generated by Pyrex (Cython's predecessor)
There wasn't a single call to PyCode_NewWithPosOnlyArgs in any of these apart from Cython.
In addition, I just heard from the SC that they've approved the exception. So we will remove these two APIs from 3.11 without deprecation. I've filed https://bugs.python.org/issue45122 to get this done (looking for volunteers).
--
--Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
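[Editor's sketch] The script described in the message above was not posted to the list. The following is only a rough guess at how such a search could look, assuming each package is a local .tar.gz sdist and that Cython output can be recognised by the "Generated by Cython" banner in its first line, as the message states. The names scan_sdist, is_cython_generated and format_summary are made up for illustration.

```python
import os
import tarfile

# Banner that, per the message above, appears in the first line of Cython output.
CYTHON_MARKER = b"/* Generated by Cython"


def is_cython_generated(data: bytes) -> bool:
    """Return True if the file's first line contains the Cython banner."""
    first_line = data.split(b"\n", 1)[0]
    return CYTHON_MARKER in first_line


def scan_sdist(path, needle=b"PyCode"):
    """Map member name -> count of `needle`, skipping Cython-generated files."""
    hits = {}
    with tarfile.open(path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            fileobj = tar.extractfile(member)
            if fileobj is None:
                continue
            data = fileobj.read()
            if is_cython_generated(data):
                continue
            count = data.count(needle)
            if count:
                hits[member.name] = count
    return hits


def format_summary(path, hits):
    """Render one line in the 'N hits in M files' style quoted in the message."""
    total = sum(hits.values())
    return f"{os.path.basename(path)}: {total} hits in {len(hits)} files"
```

Running scan_sdist over each downloaded archive and printing format_summary for the non-empty results would produce a listing of the same shape as the one above. A plain substring count also matches mentions in comments, which is consistent with the mypy entry in the list.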
Oh, I didn't know this *existing* C API function:
PyCode_NewEmpty(const char *filename, const char *funcname, int firstlineno)
So Cython could be modified to use it, no?
Victor
-- Night gathers, and now my watch begins. It shall not end until my death.
Guido van Rossum wrote on 07.09.21 at 00:44:
In addition, I just heard from the SC that they've approved the exception. So we will remove these two APIs from 3.11 without deprecation.
Erm, hang on – when I wrote that I'm fine with *changing* them, I wasn't thinking of actually *removing* them. At least not both.
PyCode_NewEmpty() isn't a good replacement since it takes low level arguments … char* instead of Python strings. It's good for the simple use case that it was written for (and Cython already uses it for that), but not so great for anything beyond that.
What I could try is to create only a single dummy code object and then always call .replace() on it to create new ones. But that seems hackish and requires managing yet another bit of global state across static and generated code parts.
I could also switch to _PyCode_New(), though it's not exactly what I would call an attractive option, both for usability reasons and its future API stability. (Cython also still generates C89 code, i.e. no partial struct initialisations.)
Any suggestions?
Stefan
On Tue, Sep 7, 2021 at 10:00 AM Stefan Behnel <stefan_ml@behnel.de> wrote:
Guido van Rossum wrote on 07.09.21 at 00:44:
In addition, I just heard from the SC that they've approved the exception. So we will remove these two APIs from 3.11 without deprecation.
Erm, hang on – when I wrote that I'm fine with *changing* them, I wasn't thinking of actually *removing* them. At least not both. PyCode_NewEmpty() isn't a good replacement since it takes low level arguments … char* instead of Python strings. It's good for the simple use case that it was written for (and Cython already uses it for that), but not so great for anything beyond that.
Is the issue that you want to specify a few additional simple arguments, or that you already have unicode objects and you don't want to have them converted back to char * (which will then be converted to new unicode objects by PyCode_NewEmpty)?
What I could try is to create only a single dummy code object and then always call .replace() on it to create new ones. But that seems hackish and requires managing yet another bit of global state across static and generated code parts.
I can't argue with the need for an extra bit of global state, but to me, calling .replace() is not hackish, it's the best way to create new code objects -- it works for older Python versions and will keep working in the future.
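[Editor's sketch] To make the .replace() route concrete, here is a minimal Python-level illustration of the pattern being discussed: keep one template code object and derive new ones from it with different metadata. It assumes Python 3.8 or later, where code objects have a replace() method. The helper name clone_code and the sample values are invented for this example, and this is not how Cython does it; from C the same method would have to be called through the generic object-call API.

```python
import types


def template():
    """Stand-in for the single dummy code object mentioned above."""
    return "hello from template"


def clone_code(code, name, filename, firstlineno):
    """Derive a new code object that differs only in its metadata."""
    return code.replace(
        co_name=name,
        co_filename=filename,
        co_firstlineno=firstlineno,
    )


# Hypothetical metadata for a generated function.
new_code = clone_code(template.__code__, "generated_func", "generated.pyx", 42)

# The derived code object is still executable; wrap it in a function to call it.
generated = types.FunctionType(new_code, globals(), new_code.co_name)
```

Because replace() takes keyword arguments, fields added to code objects in later versions are simply carried over from the template, which is the property that makes this approach survive changes to the constructor signature.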
I could also switch to _PyCode_New(), though it's not exactly what I would call an attractive option, both for usability reasons and its future API stability. (Cython also still generates C89 code, i.e. no partial struct initialisations.)
That's too bad. With the new struct initializations it's actually more usable and more stable going forward.
Any suggestions?
Stefan
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
(context)
Guido van Rossum wrote on 13.08.21 at 19:24:
In 3.11 we're changing a lot of details about code objects. Part of this is the "Faster CPython" work, part of it is other things (e.g. PEP 657 -- Fine Grained Error Locations in Tracebacks).
As a result, the set of fields of the code object is changing. This is fine, the structure is part of the internal API anyway.
But there's a problem with two public API functions, PyCode_New() and PyCode_NewWithPosArgs(). As we have them in the main (3.11) branch, their signatures are incompatible with previous versions, and they have to be since the set of values needed to create a code object is different. (The types.CodeType constructor signature is also changed, and so is its replace() method, but these aren't part of any stable API).
Unfortunately, PyCode_New() and PyCode_NewWithPosArgs() are part of the PEP 387 stable ABI. What should we do?
A. We could deprecate them, keep (restore) their old signatures, and create crippled code objects (no exception table, no endline/column tables, qualname defaults to name).
B. We could deprecate them, restore the old signatures, and always raise an error when they are called.
C. We could just delete them.
D. We could keep them, with modified signatures, and to heck with ABI compatibility for these two.
E. We could get rid of PyCode_NewWithPosArgs(), update PyCode_New() to add the posonlyargcount (which is the only difference between the two), and d*mn the torpedoes.
F. Like (E), but keep PyCode_NewWithPosArgs() as an alias for PyCode_New() (and deprecate it).
If these weren't part of the stable ABI, I'd choose (E). [...]
On Tue, Aug 31, 2021 at 7:07 PM Stefan Behnel <stefan_ml@behnel.de> wrote:
I also vote for (E). The creation of a code object is tied to interpreter internals and thus shouldn't be (or have been) declared stable.
I think you're one of the few people who call those functions, and if even you think it's okay to break backward compatibility here, I think we should just talk to the SC to be absolved of having these two in the stable ABI. (Petr, do you agree? Without your backing I don't feel comfortable even asking for this.)
I think the only problem with that argument is that code objects are required for frames. You could argue the same way about frames, but then it becomes really tricky to, you know, create frames for non-Python code.
Note there's nothing in the stable ABI to create frames. There are only functions to *get* an existing frame, to inspect a frame, and to eval it. In any case even if there was a stable ABI function to create a frame from a code object, one could argue that it's sufficient to be able to get an existing code object from e.g. a function object.
Since we're discussing this in the context of PEP 657, I wonder if there's a better way to create tracebacks from C code, other than creating fake frames with fake code objects.
Cython uses code objects and frames for the following use cases:
- tracing generated C code at the Python syntax level
- profiling C-implemented functions
- tracebacks for C code
Having a way to do these three efficiently (i.e. with close to zero runtime overhead) without having to reach into internals of the interpreter state, code objects and frames, would be nice.
Failing that, I'm ok with declaring the relevant structs and C-API functions non-stable and letting Cython use them as such, as we always did.
I think others have answered this already -- in any case it's not the immediate subject of this thread, and I don't have a strong opinion on it.
--
--Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
I apologize, I keep making the same mistake.
The PyCode_New[WithPosArgs] functions are *not* in the stable ABI or in the limited API, so there's no need to petition the SC, nor do I need Petr's approval.
We may be bound by backwards compatibility for the *cpython* API, but I think that if Cython is okay if we just break this we should be fine. Users of the CPython API are expected to recompile for each new version, and if someone were to be using these functions with the old set of parameters the compiler would give them an error.
So let's just choose (E) and d*mn backwards compatibility for these two functions.
That means:
- Get rid of PyCode_NewWithPosArgs altogether
- PyCode_New becomes unstable (and gets a new posonlyargcount argument)
--
--Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 01. 09. 21 22:28, Guido van Rossum wrote:
I apologize, I keep making the same mistake.
The PyCode_New[WithPosArgs] functions are *not* in the stable ABI or in the limited API, so there's no need to petition the SC, nor do I need Petr's approval.
We may be bound by backwards compatibility for the *cpython* API, but I think that if Cython is okay if we just break this we should be fine. Users of the CPython API are expected to recompile for each new version, and if someone were to be using these functions with the old set of parameters the compiler would give them an error.
The cpython API is still covered by the backwards compatibility policy (PEP 387). You do need to ask the SC to skip the two-year deprecation period. I don't see an issue with the exception being granted, but I do think it should be rubber-stamped as a project-wide decision.
So let's just choose (E) and d*mn backwards compatibility for these two functions.
That means:
- Get rid of PyCode_NewWithPosArgs altogether
- PyCode_New becomes unstable (and gets a new posonlyargcount argument)
... but still remains available and documented, just with a note that it may change in minor versions. Right?
participants (16)
- Antoine Pitrou
- Eric Snow
- Greg Ewing
- Gregory P. Smith
- Guido van Rossum
- Jim J. Jewett
- Nick Coghlan
- Pablo Galindo Salgado
- Patrick Reader
- Petr Viktorin
- Serhiy Storchaka
- Spencer Brown
- Stefan Behnel
- Steve Dower
- Terry Reedy
- Victor Stinner