[Python-Dev] PEP 572: Assignment Expressions

Wed Apr 18 12:18:22 EDT 2018

On Wed, Apr 18, 2018 at 7:35 AM, Chris Angelico <rosuav at gmail.com> wrote:

> On Wed, Apr 18, 2018 at 11:58 PM, Guido van Rossum <guido at python.org>
> wrote:
> > I can't tell from this what the PEP actually says should happen in that
> > example. When I first saw it I thought "Gaah! What a horrible piece of
> > code." But it works today, and people's code *will* break if we change its
> > meaning.
> >
> > However we won't have to break that. Suppose the code is (perversely)
> >
> > t = range(3)
> > a = [t for t in t if t]
> >
> > If we translate this to
> >
> > t = range(3)
> > def listcomp(t=t):
> >     a = []
> >     for t in t:
> >         if t:
> >             a.append(t)
> >     return a
> > a = listcomp()
> >
> > Then it will still work. The trick will be to recognize "imported" names
> > that are also assigned and capture those (as well as other captures as
> > already described in the PEP).
>
> That can be done. However, this form of importing will have one of two
> consequences:
>
> 1) Referencing an unbound name will scan to outer scopes at run time,
> changing the semantics of Python name lookups
>

I'm not even sure what this would do.

> 2) Genexps will eagerly evaluate a lookup if it happens to be the same
> name as an internal iteration variable.
>

I think we would have to specify this more precisely.

Let's say by "eagerly evaluate a lookup" you mean "include it in the
function parameters with a default value being the lookup (i.e. starting in
the outer scope)", IOW "t=t" as I showed above. The question is *when* we
would do this. IIUC the PEP already does this if the "outer scope" is a
class scope for any names that a simple static analysis shows are
references to variables in the class scope. (I don't know exactly what this
static analysis should do but it could be as simple as gathering all names
that are assigned to in the class, or alternatively all names assigned to
before the point where the comprehension occurs. We shouldn't be distracted
by dynamic definitions like `exec()` although we should perhaps be aware of
`del`.)
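
For reference, the class-scope behavior under discussion can be observed in
current Python. This is only an illustrative sketch; the class name `Demo` and
its attributes are made up for the example and do not come from the PEP.

```python
class Demo:
    xs = [1, 2, 3]

    # The outermost iterable is evaluated in the class scope, so this works.
    ok = [x for x in xs]

    # Any other reference to a class-level name from inside the comprehension
    # is not resolved through the class scope and raises NameError.
    try:
        bad = [x for _ in range(1) for x in xs]
    except NameError:
        inner_lookup_failed = True
    else:
        inner_lookup_failed = False
```

Capturing class-level names as parameters, as described above, is what would
let the second comprehension see `xs`.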

My proposal is to extend this static analysis for certain loop control
variables (any simple name assigned to in a for-clause in the
comprehension), regardless of what kind of scope the outer scope is. If the
outer scope is a function we already know how to do this. If it's a class
we use the analysis referred to above. If the outer scope is the global
scope we have to do something new. I propose to use the same simple static
analysis we use for class scopes.

Furthermore I propose to *only* do this for the loop control variable(s) of
the outermost for-clause, since that's the only place where without all
this rigmarole we would have a clear difference in behavior with Python 3.7
in cases like [t for t in t]. Oh, and probably we only need to do this if
that loop control variable is also used as an expression in the iterable
(so we don't waste time doing any of this for e.g. [t for t in q]).
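
A minimal sketch of the case this targets, written so it runs on today's
Python. The function `listcomp` mirrors the hand translation quoted earlier in
this thread; the variable name `result` is an assumption added for clarity.

```python
t = range(3)

# Works today: the outermost iterable `t` is evaluated in the enclosing scope
# before the loop variable `t` is bound inside the comprehension.
a = [t for t in t if t]

# Hand-written equivalent of the proposed translation: the outer `t` is
# captured as a default argument when the function is defined.
def listcomp(t=t):
    result = []
    for t in t:  # iter(t) is taken once, then `t` is rebound for each item
        if t:
            result.append(t)
    return result

b = listcomp()
```

Both `a` and `b` end up as `[1, 2]`, and the outer `t` is still `range(3)`
afterwards.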

(But what about [t for _ in t for t in t]? That's currently an
UnboundLocalError and we shouldn't try to "fix" that case.)
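
To make that failing case concrete, here is a small sketch that captures the
exception instead of letting it propagate. The names `result` and `error` are
only there for the illustration.

```python
t = range(3)

try:
    # The outermost iterable `t` is the outer range(3), but the second `t`
    # is the comprehension's own loop variable, which is not yet bound.
    result = [t for _ in t for t in t]
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    error = exc
else:
    error = None
```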

Since we now have once again introduced an exception for the outermost loop
control variable and the outermost iterable, we can consider doing this
only as a temporary measure. We could have a goal to eventually make [t for
t in t] fail, and in the meantime we would deprecate it -- e.g. in 3.8 a
silent deprecation, in 3.9 a noisy one, in 3.10 break it. Yes, that's a lot
of new static analysis for deprecating an edge case, but it seems
reasonable to want to preserve backward compatibility when breaking this
edge case since it's likely not all that uncommon. Even if most occurrences
are bad style written by lazy programmers, we should not break working
code, if it is reasonable to expect that it's relied upon in real code.
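
As a rough idea of how small the static check for such a deprecation could be,
here is a sketch built on the standard `ast` module. It is not how the compiler
would do it, and the helper names (`_names`, `find_reused_loop_names`) are
hypothetical. It flags a comprehension only when a loop control variable of the
outermost for-clause also appears in the outermost iterable. It is an
approximation: it does not try to exclude names that occur inside nested
lambdas or nested comprehensions within that iterable.

```python
import ast

COMPREHENSIONS = (ast.ListComp, ast.SetComp, ast.DictComp, ast.GeneratorExp)


def _names(node):
    """Return every simple name that appears anywhere under `node`."""
    return {n.id for n in ast.walk(node) if isinstance(n, ast.Name)}


def find_reused_loop_names(source):
    """Return (line, names) pairs for comprehensions like [t for t in t]."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, COMPREHENSIONS):
            outermost = node.generators[0]
            reused = _names(outermost.target) & _names(outermost.iter)
            if reused:
                hits.append((node.lineno, sorted(reused)))
    return hits


sample = "a = [t for t in t if t]\nb = [t for t in q]\n"
hits = find_reused_loop_names(sample)
```

With this sample, only the first line is reported; `[t for t in q]` is left
alone, as is `[t for _ in t for t in t]`, whose outermost loop variable is `_`.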

> Of the two, #2 is definitely my preference, but it does mean more
> eager binding. While this won't make a difference in the outermost
> iterable (since that's *already* eagerly bound), it might make a
> difference with others:
>
> t = range(3)
> gen = (t for _ in range(1) for t in t if t)
> t = [4, 5, 6]
> print(next(gen))
> print(next(gen))
>

I don't like this particular example, because it uses an obscure bit of
semantics of generator expressions. It's fine to demonstrate the finer
details of how those work, but it's unlikely to see real code relying on
this. (As I argued before, generator expressions are typically either fed
into other code that eagerly evaluates them before reaching the next line,
or returned from a function, and in the latter case intentional
modification of some variable in that function's scope to affect the
meaning of the generator expression would seem a remote possibility at
best, and an accident waiting to happen at worst.)
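
For readers unfamiliar with the semantics in question, the following sketch
shows how generator expressions behave today: the outermost iterable is
evaluated when the generator expression is created, while every other name is
looked up only when iteration actually runs. The variable names are made up
for the illustration.

```python
t = range(3)
gen = (x for x in t)  # the outermost iterable is evaluated right here
t = [4, 5, 6]         # rebinding `t` afterwards has no effect on `gen`
first = list(gen)

u = range(3)
gen2 = (x for _ in range(1) for x in u)  # `u` is looked up during iteration
u = [4, 5, 6]         # so this rebinding *is* seen by `gen2`
second = list(gen2)
```

Here `first` is `[0, 1, 2]` and `second` is `[4, 5, 6]`.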

> Current semantics: UnboundLocalError on first next() call.
>
> PEP 572 semantics: Either UnboundLocalError (with current reference
> implementation) or it yields 1 and 2 (with eager lookups).
>
> So either we change things for the outermost iterable, or we change
> things for everything BUT the outermost iterable. Either way, I'm
> happy to eliminate the special-casing of the outermost iterable. Yes,
> it's a change in semantics, but a change that removes special cases is
> generally better than one that creates them.
>

Hopefully my proposal above satisfies you.

-- 
--Guido van Rossum (python.org/~guido)