[Python-Dev] [PEP 558] thinking through locals() semantics

Mon May 27 15:30:21 EDT 2019

OK, I apologize for not catching on to the changed semantics of f_locals
(which in the proposal is always a proxy for the fast locals and the
cells). I don't know whether I just skimmed that part of the PEP or whether
it needs calling out more.

I'm assuming that there are backwards compatibility concerns, and the PEP
is worried that more code will break if one of the simpler options is
chosen. In particular I guess that the choice of returning the same object
is meant to make locals() in a function frame more similar to locals() in a
module frame. And the choice of returning a plain dict rather than a proxy
is meant to make life easier for code that assumes it's getting a plain
dict.

Other than that I agree that returning a proxy (i.e. just a reference to
f_locals) seems to be the most attractive option...
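
As a rough illustration of the options discussed in the quoted thread below,
here is a toy model (not CPython's implementation) of a frame whose f_locals
is a proxy over a "fast" array, with three candidate locals() behaviours on
top of it. ToyFrame, FastLocalsProxy and the locals_* helpers are hypothetical
names invented for this sketch; cells are left out, and the handling of names
that are not real locals is a simplifying assumption.

```python
from collections.abc import MutableMapping


class ToyFrame:
    """Hypothetical stand-in for a function frame (no cells modelled)."""

    _UNSET = object()  # marks an unbound fast-local slot

    def __init__(self, varnames):
        self.varnames = list(varnames)
        self.fast = [ToyFrame._UNSET] * len(self.varnames)
        self.extra = {}        # names stored via the proxy that are not real locals
        self.locals_dict = {}  # the one dict reused by the [PEP]-style locals()

    @property
    def f_locals(self):
        # Assumption of this sketch: f_locals is always a proxy, never a dict.
        return FastLocalsProxy(self)


class FastLocalsProxy(MutableMapping):
    """Reads and writes go straight to the frame's fast array."""

    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, name):
        f = self._frame
        if name in f.varnames:
            value = f.fast[f.varnames.index(name)]
            if value is ToyFrame._UNSET:
                raise KeyError(name)
            return value
        return f.extra[name]

    def __setitem__(self, name, value):
        f = self._frame
        if name in f.varnames:
            f.fast[f.varnames.index(name)] = value
        else:
            f.extra[name] = value

    def __delitem__(self, name):
        f = self._frame
        if name in f.varnames:
            i = f.varnames.index(name)
            if f.fast[i] is ToyFrame._UNSET:
                raise KeyError(name)
            f.fast[i] = ToyFrame._UNSET
        else:
            del f.extra[name]

    def __iter__(self):
        f = self._frame
        for name, value in zip(f.varnames, f.fast):
            if value is not ToyFrame._UNSET:
                yield name
        yield from f.extra

    def __len__(self):
        return sum(1 for _ in self)


def locals_pep(frame):
    # [PEP]-like: the same plain dict on every call, refreshed from the frame.
    frame.locals_dict.update(frame.f_locals)
    return frame.locals_dict


def locals_snapshot(frame):
    # [snapshot]: a fresh, independent dict on every call.
    return dict(frame.f_locals)


def locals_proxy(frame):
    # [proxy]: hand back the live proxy itself.
    return frame.f_locals
```

In this model only the [proxy] result sees later assignments and can change
them; the [snapshot] result never changes after it is taken, and the
[PEP]-like dict changes only when locals_pep() is called again.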

On Mon, May 27, 2019 at 9:41 AM Nathaniel Smith <njs at pobox.com> wrote:

> On Mon, May 27, 2019 at 9:16 AM Guido van Rossum <guido at python.org> wrote:
> >
> > I re-ran your examples and found that some of them fail.
> >
> > On Mon, May 27, 2019 at 8:17 AM Nathaniel Smith <njs at pobox.com> wrote:
> [...]
> >> The interaction between f_locals and locals() is also subtle:
> >>
> >>   import sys
> >>
> >>   def f():
> >>       a = 1
> >>       loc = locals()
> >>       assert "loc" not in loc
> >>       # Regular variable updates don't affect 'loc'
> >>       a = 2
> >>       assert loc["a"] == 1
> >>       # But debugging updates do:
> >>       sys._getframe().f_locals["a"] = 3
> >>       assert a == 3
> >
> >
> > That assert fails; `a` is still 2 here for me.
>
> I think you're running on current Python, and I'm talking about the
> semantics in the current PEP 558 draft, which redefines f_locals so
> that the assert passes. Nick has a branch here if you want to try it:
> https://github.com/python/cpython/pull/3640
>
> (Though I admit I was lazy, and haven't tried running my examples at
> all -- they're just based on the text.)
>
> >>
> >>       assert loc["a"] == 3
> >>       # But it's not a full writeback
> >>       assert "loc" not in loc
> >>       # Mutating 'loc' doesn't affect f_locals:
> >>       loc["a"] = 1
> >>       assert sys._getframe().f_locals["a"] == 1
> >>       # Except when it does:
> >>       loc["b"] = 3
> >>       assert sys._getframe().f_locals["b"] == 3
> >
> >
> > All of this can be explained by realizing
> > `loc is sys._getframe().f_locals`. IOW locals() always returns the dict
> > in f_locals.
>
> That's not true in the PEP version of things. locals() and
> frame.f_locals become radically different. locals() is still a dict
> stored in the frame object, but f_locals is a magic proxy object that
> reads/writes to the fast locals array directly.
>
> >>
> >> Again, the results here are totally different if a Python-level
> >> tracing/profiling function is installed.
> >>
> >> And you can also hit these subtleties via 'exec' and 'eval':
> >>
> >>   import sys
> >>
> >>   def f():
> >>       a = 1
> >>       loc = locals()
> >>       assert "loc" not in loc
> >>       # exec() triggers writeback, and then mutates the locals dict
> >>       exec("a = 2; b = 3")
> >>       # So now the current environment has been reflected into 'loc'
> >>       assert "loc" in loc
> >>       # Also loc["a"] has been changed to reflect the exec'ed
> >>       # assignments
> >>       assert loc["a"] == 2
> >>       # But if we look at the actual environment, directly or via
> >>       # f_locals, we can see that 'a' has not changed:
> >>       assert a == 1
> >>       assert sys._getframe().f_locals["a"] == 1
> >>       # loc["b"] changed as well:
> >>       assert loc["b"] == 3
> >>       # And this *does* show up in f_locals:
> >>       assert sys._getframe().f_locals["b"] == 3
> >
> >
> > This works indeed. My understanding is that the bytecode interpreter,
> > when accessing the value of a local variable, ignores f_locals and always
> > uses the "fast" array. But exec() and eval() don't use fast locals; their
> > code is always compiled as if it appears in a module-level scope.
> >
> > While the interpreter is running and no debugger is active, in a
> > function scope f_locals is not used at all; the interpreter only
> > interacts with the fast array and the cells. f_locals is initialized by
> > the first locals() call for a function scope, and locals() copies the
> > fast array and the cells into it. Subsequent calls in the same function
> > scope keep the same value for f_locals and re-copy fast and cells into
> > it. This also clears out deleted local variables and emptied cells, but
> > leaves "strange" keys (like "b" in the examples) unchanged.
> >
> > The truly weird case happens when Python-level tracers are present:
> > then the contents of f_locals are written back to the fast array and
> > cells at certain points. This is intended for use by pdb (am I the only
> > user of pdb left in the world?), so one can step through a function and
> > mutate local variables. I find this essential in some cases.
>
> Right, the original goal for the PEP was to remove the "truly weird
> case" but keep pdb working.
>
> >>
> >> Of course, many of these edge cases are pretty obscure, so it's not
> >> clear how much they matter. But I think we can at least agree that
> >> this isn't the one obvious way to do it :-).
> >>
> >>
> >> ##### What's the landscape of possible semantics?
> >>
> >> I did some brainstorming, and came up with 4 sets of semantics that
> >> seem plausible enough to at least consider:
> >>
> >> - [PEP]: the semantics in the current PEP draft.
> >
> >
> > To be absolutely clear this copies the fast array and cells to f_locals
> > when locals() is called, but never copies back, except when Python-level
> > tracing/profiling is on.
>
> In the PEP draft, it never copies back at all, under any circumstance.
>
> >>
> >> - [PEP-minus-tracing]: same as [PEP], except dropping the writeback on
> >> Python-level trace/profile events.
> >
> >
> > But this still copies the fast array and cells to f_locals when a Python
> > trace function is called, right? It just doesn't write back.
>
> No, when I say "writeback" in this email I always mean
> PyFrame_FastToLocals. The PEP removes PyFrame_LocalsToFast entirely.
>
> >> - [snapshot]: in function scope, each call to locals() returns a new,
> >> *static* snapshot of the local environment, removing all this
> >> writeback stuff. Something like:
> >>
> >>   def locals():
> >>       frame = get_caller_frame()
> >>       if is_function_scope(frame):
> >>           # make a point-in-time copy of the "live" proxy object
> >>           return dict(frame.f_locals)
> >>       else:
> >>           # in module/class scope, return the actual local environment
> >>           return frame.f_locals
> >
> >
> > This is the most extreme variant, and in this case there is no point in
> > having f_locals at all for a function scope (since nothing uses it). I'm
> > not 100% sure that you understand this.
>
> Yes, this does suggest an optimization: you should be able to skip
> allocating a dict for every frame in most cases. I'm not sure how much
> of a difference it makes. In principle we could implement that
> optimization right now by delaying the dict allocation until the first
> time f_locals or locals() is used, but we currently don't bother. And
> even if we adopt this, we'll still need to keep a slot in the frame
> struct to allocate the dict if we need to, because people can still be
> obnoxious and do frame.f_locals["unique never before seen name"] =
> blah and expect to be able to read it back later, which means we need
> somewhere to store that. (In fact Trio does do this right now, as part
> of its control-C handling stuff, because there's literally no other
> place where you can store information that a signal handler can see
> when it's walking the stack.) We could deprecate writing new names to
> f_locals like this, but that's a longer-term thing.
>
> >>
> >> - [proxy]: Simply return the .f_locals object, so in all contexts
> >> locals() returns a live mutable view of the actual environment:
> >>
> >>   def locals():
> >>       return get_caller_frame().f_locals
> >
> >
> > So this is PEP without any writeback. But there is still copying from
> > the fast array and cells to f_locals. Does that only happen when locals()
> > is called? Or also when a Python-level trace/profiling function is called?
> >
> > My problem with all variants except what's in the PEP is that it would
> > leave pdb *no* way (short of calling into the C API using ctypes) of
> > writing back local variables.
>
> No, this option is called [proxy] because this is the version where
> locals() and f_locals *both* give you magic proxy objects where
> __getitem__ and __setitem__ access the fast locals array directly, as
> compared to the PEP where only f_locals gives you that magic object.
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>