Comments on PEP 558 (Defined semantics for locals() )
Hi, First of all I want to say that I'm very much in favour of the general idea behind PEP 558. Defined semantics is always better than undefined semantics :) However, I think there are a few changes needed: 1. Don't add anything to the C API, please. Frame attributes can be accessed via `PyObject_GetAttr[String]`. 2. Don't make the behaviour dependent on whether "tracing" is turned on. Doing so forces debuggers to use sys.settrace which is horribly slow. It also makes the implementation more complex, and has no benefit AFAICT. 3. Don't store write-through proxies in the frame, but make proxies retain a reference to the frame. This would reduce the size and complexity of code for handling frames. Clean up of the frame would occur naturally via reference count when all proxies have been reclaimed. The proposed implementation is hard to reason about and I am not confident that it will not introduce some new subtle bugs to replace the ones it seeks to remove. Any implementation that has functions with "Borrow" and "BreakCycles" in their names makes me nervous. A simpler conceptual model, which I believe could be made reliable, would be: No change for non-function frames (as PEP 558 currently proposes). Each access to `frame.f_locals` (for function frames) should return a new proxy object. This proxy object would have dict-like, write-through semantics for variables in the frame. For keys that do not match variable names, an exception would be raised. This means that all proxies for a single frame will have value equivalence; object equivalence is not needed. I.e. for a frame `f`, `f.f_locals == f.f_locals` would be True, even though `f.f_locals is f.f_locals` would be False. Cheers, Mark.
Unfortunately, the simplifications you propose would be backwards incompatible - it's existing behaviour that there's a real shared dict (even on optimised frames) where arbitrary extra attributes can be stored (even though they don't become accessible as Python variables). I don't want to make frame objects any bigger than they already are, so the natural solution is to store the mapping proxy as `f_locals`, and then bypass the proxy in order to make `PyEval_GetLocals` still "work" (at least as well as it ever did). PyObject_GetAttr(string) also doesn't do that same thing as the proposed C functions, since it invokes the Python descriptor machinery. (Note that the discussion at https://discuss.python.org/t/pep-558-defined-semantics-for-locals/2936/ is more up to date than the PEP text where the C API is concerned) The reference to tracing mode dependent semantics puzzles me, as that was removed in December: https://github.com/python/peps/commit/54888058ce8ad5257114652d9b41e8d1237b8e... Cheers, Nick. On Wed, 22 Jan 2020 at 21:59, Mark Shannon <mark@hotpy.org> wrote:
Hi,
First of all I want to say that I'm very much in favour of the general idea behind PEP 558. Defined semantics is always better than undefined semantics :)
However, I think there are a few changes needed:
1. Don't add anything to the C API, please. Frame attributes can be accessed via `PyObject_GetAttr[String]`.
2. Don't make the behaviour dependent on whether "tracing" is turned on. Doing so forces debuggers to use sys.settrace which is horribly slow. It also makes the implementation more complex, and has no benefit AFAICT.
3. Don't store write-through proxies in the frame, but make proxies retain a reference to the frame. This would reduce the size and complexity of code for handling frames. Clean up of the frame would occur naturally via reference count when all proxies have been reclaimed.
The proposed implementation is hard to reason about and I am not confident that it will not introduce some new subtle bugs to replace the ones it seeks to remove. Any implementation that has functions with "Borrow" and "BreakCycles" in their names makes me nervous.
A simpler conceptual model, which I believe could be made reliable, would be:
No change for non-function frames (as PEP 558 currently proposes).
Each access to `frame.f_locals` (for function frames) should return a new proxy object.
This proxy object would have dict-like, write-through semantics for variables in the frame. For keys that do not match variable names, an exception would be raised. This means that all proxies for a single frame will have value equivalence; object equivalence is not needed. I.e. for a frame `f`, `f.f_locals == f.f_locals` would be True, even though `f.f_locals is f.f_locals` would be False.
Cheers, Mark. _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TSHCB4ZH... Code of Conduct: http://python.org/psf/codeofconduct/
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 08/02/2020 11:49 am, Nick Coghlan wrote:
Unfortunately, the simplifications you propose would be backwards incompatible - it's existing behaviour that there's a real shared dict (even on optimised frames) where arbitrary extra attributes can be stored (even though they don't become accessible as Python variables). I don't want to make frame objects any bigger than they already are, so the natural solution is to store the mapping proxy as `f_locals`, and then bypass the proxy in order to make `PyEval_GetLocals` still "work" (at least as well as it ever did).
The proposed changes in PEP 558 are also backwards incompatible. I thought that was the point. The current implementation is broken in weird ways and we want to fix that. Since we need to break backward compatibility anyway, why not do it in a way the makes the behaviour as well defined and maintainable as possible. I think that PEP 558, as it stands, is still a bit fragile because of the handling of cycles between the locals proxy and the frame.
PyObject_GetAttr(string) also doesn't do that same thing as the proposed C functions, since it invokes the Python descriptor machinery. (Note that the discussion at https://discuss.python.org/t/pep-558-defined-semantics-for-locals/2936/ is more up to date than the PEP text where the C API is concerned)
`PyObject_GetAttr("attr")` has the same semantics as the Python operator `x.attr` which is under the control of `type(x)`, in this case the frame object class. The descriptor protocol is irrelevant.
The reference to tracing mode dependent semantics puzzles me, as that was removed in December: https://github.com/python/peps/commit/54888058ce8ad5257114652d9b41e8d1237b8e...
That was my misreading. The behaviour of `f_locals` in the PEP is not very clear, as it is buried in the discussion of CPython changes. Could you add it to the proposal section? Cheers, Mark.
Cheers, Nick.
On Wed, 22 Jan 2020 at 21:59, Mark Shannon <mark@hotpy.org> wrote:
Hi,
First of all I want to say that I'm very much in favour of the general idea behind PEP 558. Defined semantics is always better than undefined semantics :)
However, I think there are a few changes needed:
1. Don't add anything to the C API, please. Frame attributes can be accessed via `PyObject_GetAttr[String]`.
2. Don't make the behaviour dependent on whether "tracing" is turned on. Doing so forces debuggers to use sys.settrace which is horribly slow. It also makes the implementation more complex, and has no benefit AFAICT.
3. Don't store write-through proxies in the frame, but make proxies retain a reference to the frame. This would reduce the size and complexity of code for handling frames. Clean up of the frame would occur naturally via reference count when all proxies have been reclaimed.
The proposed implementation is hard to reason about and I am not confident that it will not introduce some new subtle bugs to replace the ones it seeks to remove. Any implementation that has functions with "Borrow" and "BreakCycles" in their names makes me nervous.
A simpler conceptual model, which I believe could be made reliable, would be:
No change for non-function frames (as PEP 558 currently proposes).
Each access to `frame.f_locals` (for function frames) should return a new proxy object.
This proxy object would have dict-like, write-through semantics for variables in the frame. For keys that do not match variable names, an exception would be raised. This means that all proxies for a single frame will have value equivalence; object equivalence is not needed. I.e. for a frame `f`, `f.f_locals == f.f_locals` would be True, even though `f.f_locals is f.f_locals` would be False.
Cheers, Mark. _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TSHCB4ZH... Code of Conduct: http://python.org/psf/codeofconduct/
On Mon., 10 Feb. 2020, 8:31 pm Mark Shannon, <mark@hotpy.org> wrote:
On 08/02/2020 11:49 am, Nick Coghlan wrote:
Unfortunately, the simplifications you propose would be backwards incompatible - it's existing behaviour that there's a real shared dict (even on optimised frames) where arbitrary extra attributes can be stored (even though they don't become accessible as Python variables). I don't want to make frame objects any bigger than they already are, so the natural solution is to store the mapping proxy as `f_locals`, and then bypass the proxy in order to make `PyEval_GetLocals` still "work" (at least as well as it ever did).
The proposed changes in PEP 558 are also backwards incompatible. I thought that was the point. The current implementation is broken in weird ways and we want to fix that. Since we need to break backward compatibility anyway, why not do it in a way the makes the behaviour as well defined and maintainable as possible.
The changes at the Python level are *technically* incompatible, but Nathaniel's review made a compelling case that the real world compatibility problems were likely to be minimal, and in some cases would actually be fixing latent defects in existing code. I think that PEP 558, as it stands, is still a bit fragile because of
the handling of cycles between the locals proxy and the frame.
Unfortunately, I'm not entirely sure there's any way to get rid of that without getting rid of PyEval_GetLocals() completely, and that *wouldn't* be a subtle break in the slightest (it's even part of the stable ABI). Since that API returns a borrowed reference, the real reference has to live somewhere, and the most natural place is the frame f_locals attribute (as that's where it lives today). And even if we *did* manage to resolve that dilemna, we then run into the problem that we also need the frame object to hold the proxy because the C level equivalent of accessing the attribute is just "frame->f_locals": it's not an opaque struct, so that pointer is part of the public API. I agree I should explain this aspect clearly in the PEP though (and likely in some block comments in the implementation), as you're quite right that the associated reference borrowing and cycle breaking code is thoroughly nasty now that the namespace object isn't going to be a simple dictionary. (Thinking out loud, though: something that might work is for each locals proxy to use a common snapshot namespace, and store *that* on the frame, exactly as we do today. That would replace the cycle in the current implementation with multiple references to the common snapshot)
PyObject_GetAttr(string) also doesn't do that same thing as the proposed C functions, since it invokes the Python descriptor machinery. (Note that the discussion at https://discuss.python.org/t/pep-558-defined-semantics-for-locals/2936/ is more up to date than the PEP text where the C API is concerned)
`PyObject_GetAttr("attr")` has the same semantics as the Python operator `x.attr` which is under the control of `type(x)`, in this case the frame object class. The descriptor protocol is irrelevant.
The now proposed C APIs don't include one to get access to the write-through proxy, so you do indeed have to use PyObject_GetAttr for that. The proposed APIs instead aim to make it possible to access the Python locals without needing to acquire a frame object reference at all.
The reference to tracing mode dependent semantics puzzles me, as that was removed in December:
https://github.com/python/peps/commit/54888058ce8ad5257114652d9b41e8d1237b8e...
That was my misreading. The behaviour of `f_locals` in the PEP is not very clear, as it is buried in the discussion of CPython changes. Could you add it to the proposal section?
Aye, I'll do that when clarifying the complications arising from wanting to keep PyEval_GetLocals() and direct "frame->f_locals" access working pretty closely to the way they have in the past, while still resolving the other issues. (And writing that sentence is what made me realise that you could be right and it may be possible to make this work without needing to give the frame object a reference to any of the write-through proxy objects) Cheers, Nick.
On 16/02/2020 1:35 pm, Nick Coghlan wrote:
On Mon., 10 Feb. 2020, 8:31 pm Mark Shannon, <mark@hotpy.org <mailto:mark@hotpy.org>> wrote:
On 08/02/2020 11:49 am, Nick Coghlan wrote: > Unfortunately, the simplifications you propose would be backwards > incompatible - it's existing behaviour that there's a real shared dict > (even on optimised frames) where arbitrary extra attributes can be > stored (even though they don't become accessible as Python variables). > I don't want to make frame objects any bigger than they already are, > so the natural solution is to store the mapping proxy as `f_locals`, > and then bypass the proxy in order to make `PyEval_GetLocals` still > "work" (at least as well as it ever did).
The proposed changes in PEP 558 are also backwards incompatible. I thought that was the point. The current implementation is broken in weird ways and we want to fix that. Since we need to break backward compatibility anyway, why not do it in a way the makes the behaviour as well defined and maintainable as possible.
The changes at the Python level are *technically* incompatible, but Nathaniel's review made a compelling case that the real world compatibility problems were likely to be minimal, and in some cases would actually be fixing latent defects in existing code.
I think that PEP 558, as it stands, is still a bit fragile because of the handling of cycles between the locals proxy and the frame.
Unfortunately, I'm not entirely sure there's any way to get rid of that without getting rid of PyEval_GetLocals() completely, and that *wouldn't* be a subtle break in the slightest (it's even part of the stable ABI).
Since that API returns a borrowed reference, the real reference has to live somewhere, and the most natural place is the frame f_locals attribute (as that's where it lives today).
Any function returning a borrowed reference is already broken, IMO. Sadly there are quite a few of them :( Might as well fix `PyEval_GetLocals()` as we are changing its semantics anyway. A quick search of GitHub shows that most uses erroneously fail to increment the reference. So changing `PyEval_GetLocals()` to return a real reference would convert a possible crash into a memory leak. Maybe that's an improvement, maybe not, but it is a reasonable change IMO.
And even if we *did* manage to resolve that dilemna, we then run into the problem that we also need the frame object to hold the proxy because the C level equivalent of accessing the attribute is just "frame->f_locals": it's not an opaque struct, so that pointer is part of the public API.
I don't think the internal structure of Python objects is part of the API. Otherwise we couldn't change anything, ever.
I agree I should explain this aspect clearly in the PEP though (and likely in some block comments in the implementation), as you're quite right that the associated reference borrowing and cycle breaking code is thoroughly nasty now that the namespace object isn't going to be a simple dictionary.
(Thinking out loud, though: something that might work is for each locals proxy to use a common snapshot namespace, and store *that* on the frame, exactly as we do today. That would replace the cycle in the current implementation with multiple references to the common snapshot)
> > PyObject_GetAttr(string) also doesn't do that same thing as the > proposed C functions, since it invokes the Python descriptor > machinery. (Note that the discussion at > https://discuss.python.org/t/pep-558-defined-semantics-for-locals/2936/ > is more up to date than the PEP text where the C API is concerned)
`PyObject_GetAttr("attr")` has the same semantics as the Python operator `x.attr` which is under the control of `type(x)`, in this case the frame object class. The descriptor protocol is irrelevant.
The now proposed C APIs don't include one to get access to the write-through proxy, so you do indeed have to use PyObject_GetAttr for that.
The proposed APIs instead aim to make it possible to access the Python locals without needing to acquire a frame object reference at all.
Why would you need to do that? An addition to the C API needs a much stronger justification than "it might be handy for someone".
> > The reference to tracing mode dependent semantics puzzles me, as that > was removed in December: > https://github.com/python/peps/commit/54888058ce8ad5257114652d9b41e8d1237b8e... >
That was my misreading. The behaviour of `f_locals` in the PEP is not very clear, as it is buried in the discussion of CPython changes. Could you add it to the proposal section?
Aye, I'll do that when clarifying the complications arising from wanting to keep PyEval_GetLocals() and direct "frame->f_locals" access working pretty closely to the way they have in the past, while still resolving the other issues.
(And writing that sentence is what made me realise that you could be right and it may be possible to make this work without needing to give the frame object a reference to any of the write-through proxy objects)
Cheers, Nick.
On Tue., 18 Feb. 2020, 12:14 am Mark Shannon, <mark@hotpy.org> wrote:
Any function returning a borrowed reference is already broken, IMO. Sadly there are quite a few of them :( Might as well fix `PyEval_GetLocals()` as we are changing its semantics anyway.
A quick search of GitHub shows that most uses erroneously fail to increment the reference. So changing `PyEval_GetLocals()` to return a real reference would convert a possible crash into a memory leak. Maybe that's an improvement, maybe not, but it is a reasonable change IMO.
If the idea I had yesterday works out, then `PyEval_GetLocals()` will end up having exactly the same behavior as it does in 3.8, but we'll have a potential path towards some day deprecating and removing it (since there will be better behaved APIs offering equivalent functionality).
The proposed APIs instead aim to make it possible to access the Python locals without needing to acquire a frame object reference at all.
Why would you need to do that? An addition to the C API needs a much stronger justification than "it might be handy for someone".
The latest iteration of the PEP goes into more detail, but the key point is that the Python semantics are difficult to handle from the point of view of extension module code, because you can't tell just from looking at the code whether the locals() equivalent is going to return a copy or not. So the proposed C API instead lets the consuming code choose between: * get a read-only view * get a shallow copy * use the Python level semantics (with a query function to find out what those semantics are in the running frame) That way, it will be easier to write extension module code that behaves the same way in all scopes, or at least fails loudly if it doesn't work properly in the running frame. Cheers, Nick.
On Sun, 16 Feb 2020 at 23:35, Nick Coghlan <ncoghlan@gmail.com> wrote:
On Mon., 10 Feb. 2020, 8:31 pm Mark Shannon, <mark@hotpy.org> wrote:
The proposed changes in PEP 558 are also backwards incompatible. I thought that was the point. The current implementation is broken in weird ways and we want to fix that. Since we need to break backward compatibility anyway, why not do it in a way the makes the behaviour as well defined and maintainable as possible.
The changes at the Python level are *technically* incompatible, but Nathaniel's review made a compelling case that the real world compatibility problems were likely to be minimal, and in some cases would actually be fixing latent defects in existing code.
I think that PEP 558, as it stands, is still a bit fragile because of the handling of cycles between the locals proxy and the frame.
Unfortunately, I'm not entirely sure there's any way to get rid of that without getting rid of PyEval_GetLocals() completely, and that *wouldn't* be a subtle break in the slightest (it's even part of the stable ABI).
Since that API returns a borrowed reference, the real reference has to live somewhere, and the most natural place is the frame f_locals attribute (as that's where it lives today).
And even if we *did* manage to resolve that dilemna, we then run into the problem that we also need the frame object to hold the proxy because the C level equivalent of accessing the attribute is just "frame->f_locals": it's not an opaque struct, so that pointer is part of the public API.
I agree I should explain this aspect clearly in the PEP though (and likely in some block comments in the implementation), as you're quite right that the associated reference borrowing and cycle breaking code is thoroughly nasty now that the namespace object isn't going to be a simple dictionary.
(Thinking out loud, though: something that might work is for each locals proxy to use a common snapshot namespace, and store *that* on the frame, exactly as we do today. That would replace the cycle in the current implementation with multiple references to the common snapshot)
It works! https://github.com/python/cpython/pull/3640/commits/68f10ced11a24166269cef84... switches the reference implementation over to working that way, allowing the reference cycle between frames and their write-through proxies to be eliminated. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (2)
-
Mark Shannon
-
Nick Coghlan