[Python-ideas] Inline assignments using "given" clauses

Tim Peters tim.peters at gmail.com
Sun May 6 01:51:48 EDT 2018

> There were a couple key reasons I left the "for x in y" case out of the
> initial proposal:
> 1. The "for x in y" header is already quite busy, especially when tuple
> unpacking is used in the assignment target
> 2. Putting the "given" clause at the end would make it ambiguous as to
> whether it's executed once when setting up the iterator, or on every
> iteration
> 3. You can stick in an explicit "if True" if you don't need the given
> variable in the filter condition
>     [(fx**2, fx**3) for x in xs if True given fx = f(x)]
> And then once you've had an entire release where the filter condition was
> mandatory for the comprehension form, allowing the "if True" in "[(fx**2,
> fx**3) for x in xs given fx = f(x)]" to be implicit would be less ambiguous.

And some people claim ":=" would make Python harder to teach ;-)
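
For comparison, here is a sketch of how the comprehension quoted above can be spelled in Python 3.8 and later, where binding expressions were eventually accepted, next to the older one-element-list idiom. `f` and `xs` are stand-ins invented for illustration, not anything from the thread.

```python
def f(x):
    # Hypothetical stand-in for an expensive function.
    return x + 1

xs = [1, 2, 3]

# Binding expression: tuple elements are evaluated left to right, so fx
# is bound by the first element before the second element reads it.
with_walrus = [((fx := f(x))**2, fx**3) for x in xs]

# Older idiom: loop over a one-element list purely to bind fx.
with_nested_for = [(fx**2, fx**3) for x in xs for fx in [f(x)]]
```

Neither spelling needs a dummy "if True" filter, which is the point being poked at.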

>> ...
>> It's certainly sanest as
>>     if x**2 + y**2 > 9 given x, y = func_returning_twople():
>> "given" really shines there!

> Yep, that's why I don't have the same immediate reaction of "It would need
> to be limited to simple names as targets" as I do for assignment
> expressions. It might still be a good restriction to start out with, though

I contrived that specific "use case", of course - I actually didn't
stumble into any real code where multiple targets would benefit to my
eyes.  Perhaps because, as you noted above of `"for x in y" headers`,
multiple-target assignment statements are often quite busy already too
(I have no interest in cramming as much logic as possible into each
line - but "sparse is better than dense" doesn't also mean "almost
empty is better than sparse" ;-) ).

> (especially if we wanted to allow multiple name bindings in a single given
> clause).

>> ...
>> The one-letter variable name obscures that it doesn't
>> actually reduce _redundancy_, though.  That is, in the current
>>     match = pattern.search(data)
>>     if match:
>> it's obviously less redundant typing as:
>>     if match := pattern.search(data):
>> In
>>     if match given match = pattern.search(data):
>> the annoying visual redundancy (& typing) persists.

> Right, but that's specific to the case where the desired condition really is
> just "bool(target)".

Not only.  If the result _needs_ to be used N times in total in the
test, binding expressions allow for that, but `given` requires N+1
instances of the name (the "extra one" to establish the name to begin
with).
For example, where `probable_prime()` returns `True` or `False`, and
`bool(candidate)` is irrelevant:

    # highbit is a power of 2 >= 2; create a random prime
    # whose highest bit is highbit

    while (not probable_prime(candidate) given candidate =
                            highbit | randrange(1, highbit, 2)):
        pass

vs

    while not probable_prime(candidate :=
                             highbit | randrange(1, highbit, 2)):
        pass

There I picked a "long" name to make the redundancy visually annoying ;-)
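
A runnable sketch of the binding-expression spelling, for Python 3.8 and later. The thread never shows `probable_prime()`, so the trial-division version here is only a hypothetical stand-in, and `random_prime()` is a wrapper name made up for the example.

```python
from random import randrange

def probable_prime(n):
    # Hypothetical stand-in: plain trial division, fine for small n.
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def random_prime(highbit):
    # highbit is a power of 2 >= 2; the result's highest bit is highbit.
    while not probable_prime(candidate :=
                             highbit | randrange(1, highbit, 2)):
        pass
    # The name appears once in the test and stays bound after the loop.
    return candidate
```

The loop body is empty on purpose: all the work happens in the test, and `candidate` is still available once the loop exits.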

> That's certainly likely to be a *common* use case,

In all the code I looked at where I believe a gimmick like this would
actually help, it was indeed by far _most_ common that the result only
needed to be used once in the test.  In all such cases, the binding
expression spelling of the test requires one instance of the name, and
the `given` spelling two.

> but if we decide that it's *that* particular flavour of redundancy that really
> bothers us, then there's always the "if expr as name:" spelling (similar to
> the way that Python had "a and b" and "a or b" logical control flow
> operators long before it got "a if c else b").

Reducing each redundancy is a small win to me, but reaches
"importance" because it's so frequent.  Binding expressions have more
uses than _just_ that, though.

But I'm sure I don't know what they all are.  When a _general_ feature
is added, people find surprising uses for it.

For example, at times I'd love to write code like this, but can't:

    while any(n % p == 0 for p in small_primes):
        # divide p out - but what is p?

Generator expressions prevent me from seeing which value of `p`
succeeded.  While that's often "a feature", sometimes it's a PITA.  I
don't know whether this binding-expression stab would work instead
(I'm not sure the PEP realized there's "an issue" here, about the
intended scope for `thisp`):

    while any(n % (thisp := p) == 0 for p in small_primes):
        n //= thisp

If that is made to work, I think that counts as "a surprising use"
(capturing a witness for `any(genexp)` and a counterexample for
`all(genexp)`, both of which are wanted at times, but neither of which
`any()`/`all()` will ever support on their own).
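
As it turned out, Python 3.8 and later settled the scope question by binding the target of `:=` inside a comprehension or generator expression in the containing scope, so the witness idea can be sketched as below. The function names and the default `small_primes` tuple are assumptions for illustration only, and `n` is assumed to be a positive integer.

```python
def strip_small_primes(n, small_primes=(2, 3, 5, 7)):
    # any() stops at the first true test, so thisp is left holding the
    # prime that divided n: the "witness".
    while any(n % (thisp := p) == 0 for p in small_primes):
        n //= thisp
    return n

def first_nonpositive(values):
    # all() stops at the first false test, so bad is left holding the
    # counterexample.  If all() succeeds, bad is never read.
    if all((bad := v) > 0 for v in values):
        return None
    return bad
```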

I suppose I could do it with `given` like so:

    while p is not None given p = next(
            (p for p in small_primes if n % p == 0),
            None):
        n //= p

but at that point I'd pay to go back to the original loop-and-a-half ;-)
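
For reference, a sketch of two spellings that run today: the `next()`-with-default approach written with a binding expression (Python 3.8+), and one guess at what a loop-and-a-half version might look like. Function names and the default `small_primes` are hypothetical, and `n` is assumed to be a positive integer.

```python
def strip_with_next(n, small_primes=(2, 3, 5, 7)):
    # next() with a default of None yields the first divisor, if any.
    while (p := next((q for q in small_primes if n % q == 0),
                     None)) is not None:
        n //= p
    return n

def strip_loop_and_a_half(n, small_primes=(2, 3, 5, 7)):
    # The exit test sits in the middle of the loop body.
    while True:
        for p in small_primes:
            if n % p == 0:
                break
        else:
            return n  # no small prime divides n
        n //= p
```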

>> One more, a lovely (to my eyes) binding expression simplification
>> requiring two bindings in an `if` test, taken from real-life code I
>> happened to write during the PEP discussion:
>>     diff = x - x_base
>>     if diff:
>>         g = gcd(diff, n)
>>         if g > 1:
>>             return g
>> collapsed to the crisp & clear:
>>     if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
>>         return g
>> If only one trailing "given" clause can be given per `if` test
>> expression, presumably I couldn't do that without trickery.
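
A self-contained sketch of that collapse, runnable on Python 3.8 and later. The wrapper functions and their parameters are invented here so the two forms can be compared side by side; only the bodies come from the quoted code.

```python
from math import gcd

def nested(x, x_base, n):
    diff = x - x_base
    if diff:
        g = gcd(diff, n)
        if g > 1:
            return g
    return None

def collapsed(x, x_base, n):
    # `and` short-circuits, so gcd() runs only when diff is nonzero,
    # and g is read only after it has been bound.
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None
```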

> I was actually thinking that if we did want to allow multiple assignments,
> and we limited targets to single names, we could just use a comma as a
> separator:
>     if diff and g > 1 given diff = x - x_base, g = gcd(diff, n):
>         return g

I expect that's bound to be confusing, because the assignment _statement_

    diff = x - x_base, g = gcd(diff, n)

groups very differently than intended:

    diff = (x - x_base, g) = gcd(diff, n)

And that's a syntax error.  With enclosing parens, expectations
change, and then people would expect it to work like specifying
keyword arguments instead:

> Similar to import statements, optional parentheses could be included in the
> grammar, allowing the name bindings to be split across multiple lines:
>     if diff and g > 1 given (
>         diff = x - x_base,
>         g = gcd(diff, n),
>     ):
>         return g

Keyword arguments work as a syntactic model (group as intended), but
not semantically:  if they really were keyword arguments, `x - x_base`
and `gcd(diff, n)` would both be evaluated _before_ any bindings occurred.
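
A small sketch of that semantic mismatch. With real keyword arguments every value is computed before any name is bound, so `gcd(diff, n)` cannot see `diff`; left-to-right binding (spelled here with Python 3.8+ binding expressions) can. The function names are hypothetical, and the first function assumes no global named `diff` exists.

```python
from math import gcd

def keyword_style(x, x_base, n):
    # All argument values are evaluated first; `diff` is not a name in
    # this scope, so evaluating gcd(diff, n) raises NameError.
    try:
        return dict(diff=x - x_base, g=gcd(diff, n))
    except NameError as exc:
        return exc

def sequential_style(x, x_base, n):
    # Each binding is visible to the expressions that follow it.
    return ((diff := x - x_base), (g := gcd(diff, n)))
```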

So it's more quirky `given`-specific rules no matter how you cut it.
The closest bit of Python syntax that captures the grouping (but only
partially), and the "left-to-right, with each binding in turn visible
to later expressions" semantics, is the semicolon.  Which would create
even weirder expectations :-(

> (Other potential separators would be ";", but that reads weirdly to me since
> my brain expects the semi-colon to end the entire statement, and "and", but
> that feels overly verbose, while also being overly different from its
> regular meaning)


>>   If it's more general,
>>     if (diff given diff = x - x_base) and g > 1 given g = gcd(diff, n):
>> reads worse to my eyes (perhaps because of the "visual redundancy"
>> thing again), while
>>     if diff and g > 1 given diff = x - x_base given g = gcd(diff, n):
>> has my eyes darting all over the place, and wondering which of the
>> trailing `given` clauses executes first.

> I find that last effect is lessened when using the comma as a separator
> within the given clause rather than repeating the keyword itself.

Definitely.  I tend to believe Python has "slightly more than enough"
meanings for commas already, though.  But using commas and _requiring_
parens for more than one `given` binding seems least surprising to me.

Then again, everyone already knows what ":=" means.  They just dislike
it because so many major languages already have it ;-)
