[Python-ideas] Assignments in list/generator expressions

Sun Apr 10 17:18:25 CEST 2011

On Sun, Apr 10, 2011 at 11:31 PM, Eugene Toder <eltoder at gmail.com> wrote:
>> It isn't strangely missing at all, it's just hard:
>> http://www.python.org/dev/peps/pep-3150/
>
> Local definition in list comprehension is significantly simpler than
> 'given'. Semantically, 'with name = expr' can be treated as a more
> readable form of 'for name in [expr]'. Implementation can use a simple
> assignment.

Part of the point of PEP 3150 is that such requests don't *stay*
simple. If one subexpression, why not two? If in a list comprehension,
why not in a conditional expression? The "you can't do arbitrary
embedded assignments in expressions" rule is at least a simple one,
even if some folks don't like it (most likely due to a
mathematics/functional programming background, where the idea of
persistent state isn't a basic assumption the way it is in normal
imperative programming).

While naming the iteration variables is a fundamental part of writing
comprehensions, naming other subexpressions is not, so it doesn't make
sense to provide comprehension specific syntax for it. If you want to
name subexpressions, the "one obvious way" is to use an explicit loop,
typically as part of a generator:

def transform(xs):
  """Meaningful description of whatever the transform does"""
  for x in xs:
    y = f(x)
    if y:
      yield y

ys = transform(xs)
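
As a rough, runnable sketch of the above (the f and the sample data here
are invented purely for illustration), next to the 'for name in [expr]'
spelling mentioned in the quoted message:

```python
def f(x):
    # Hypothetical stand-in: halve even numbers, reject odd ones
    return x // 2 if x % 2 == 0 else None

def transform(xs):
    """Yield each truthy f(x) for x in xs"""
    for x in xs:
        y = f(x)      # the subexpression gets a name via a plain statement
        if y:
            yield y

xs = [1, 2, 3, 4, 0]
ys = list(transform(xs))

# Same result using the single-element-list trick inside a comprehension
ys_alt = [y for x in xs for y in [f(x)] if y]
```

Both forms call f once per element; the generator just makes the
intermediate name explicit.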

The problem of "I want to use this value as part of a conditional
expression and in its own right" actually arises in more locations
than just filtering comprehensions - it also comes up for if
statements, while loops and conditional expressions. In all cases, you
run up against the limits of what expressions allow and have to devote
a bunch of vertical whitespace to switch to using statements instead.
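
To make that concrete, here is a small sketch of the statement-based
spellings for the if and while cases. The input data and the regular
expression are made up for the example; only the shape matters:

```python
import io
import re

# if statement: the match object is needed both for the test and afterwards
m = re.match(r"(\w+)=(\w+)", "key=value")
if m:
    pair = (m.group(1), m.group(2))

# while loop: the value being tested must be bound on its own line first
stream = io.StringIO("first\nsecond\n")
lines = []
while True:
    line = stream.readline()
    if not line:
        break
    lines.append(line.rstrip("\n"))
```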

The issue you're really up against is the fact that Guido made a
choice 20 years ago to enforce a statement/expression dichotomy and to
keep name binding entirely the purview of statements. Comprehension
iteration variables are an exception that was added later on, but
they're used in a very constrained way that matches the embedded
assignment of a specific statement type (i.e. the header line on for
loops).

Embedded assignment proposals for conditions suffer badly from the
fact that there is no statement level equivalent for the kind of
assignment they propose. The obvious solution (allowing the
conditional to be named) is too limiting, but solutions that are
sufficiently flexible to be worthwhile end up allowing naming of
arbitrary subexpressions, effectively throwing away a fundamental
design principle of the language. And so the discussion dies, until
the next time someone decides that their personal use case for
embedded assignment is worth altering the language to allow.

Mathias's proposal in this case is a slight improvement over past
suggestions, since it allows the conditional expression to differ from
the value used in the rest of the expression. However, I'm not sure it
would be feasible to implement it in a way that constrained it to
comprehensions only. Even if that were possible, fixing only 1 out of
the 4 parts of the language where the problem arises isn't much of a
solution from a broader point of view.

And it still only works in simple cases, being utterly unhelpful for cases like:

  {f(k):g(v) for k, v in d.items() if f(k) and g(v)}
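
Without any new syntax, avoiding the duplicate calls there means falling
back to an explicit loop. A minimal sketch, where transform_items is a
hypothetical helper name and f and g are arbitrary stand-ins:

```python
def transform_items(d, f, g):
    """Build {f(k): g(v)} keeping only entries where both results are truthy"""
    result = {}
    for k, v in d.items():
        k2 = f(k)
        if not k2:
            continue      # like 'and', skip g(v) when f(k) is falsy
        v2 = g(v)
        if v2:
            result[k2] = v2
    return result

# Illustrative stand-ins only
d = {"a": 1, "": 2, "b": 0}
f = str.upper
g = lambda v: v * 10

result = transform_items(d, f, g)
```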

As past discussions have shown, the only thing really powerful enough
to cover an acceptably large number of use cases is full-blown
embedded assignment for arbitrary expressions:

ys = [y for x in xs if not (f(x) as y)]
{k2:v2 for k, v in d.items() if (f(k) as k2) and (g(v) as v2)}

That's an awfully big sledgehammer for a rather small nail :P

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia