[Python-ideas] Inline assignments using "given" clauses

Nick Malaguti python at fwdaddr.fastmail.fm
Sun May 13 16:29:06 EDT 2018

I think Peter tried to outline this earlier, but what he was laying out wasn't clear to me at first.

There seem to be 4 variations when it comes to assignment expressions. I'm going to try to ignore exact keywords here since we can sort those out once we have settled on which variation we prefer.

1. infix: TARGET := EXPR
2. infix: EXPR as TARGET
3. prefix: let TARGET = EXPR in ANOTHER_EXPR
4. postfix: ANOTHER_EXPR given TARGET = EXPR

Both 1 and 2 may appear in the context of a larger expression where TARGET may or may not be used:

1. 99 + (TARGET := EXPR) ** 2 + TARGET
2. 99 + (EXPR as TARGET) ** 2 + TARGET

3 and 4 require that TARGET appear in ANOTHER_EXPR, even if TARGET is the only thing contained in that expression, whereas with 1 and 2, TARGET need not be used again.

Example I:

1. x := 10
2. 10 as x
3. let x = 10 in x
4. x given x = 10

In the simple case where the goal of the assignment expression is to bind the EXPR to the TARGET so that TARGET can be used in a future statement, 1 and 2 are clearly the most straightforward because they do not require ANOTHER_EXPR.

# Please ignore that m.group(2) doesn't do anything useful here

Example II:

1. if m := re.match(...): m.group(2)
2. if re.match(...) as m: m.group(2)
3. if let m = re.match(...) in m: m.group(2)
4. if m given m = re.match(...): m.group(2)
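
For reference, variation 1 is the spelling that Python 3.8 later adopted (PEP 572), so Example II can be tried as written. A minimal sketch, assuming a made-up pattern and input string (neither is from the original example):

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
line = "key=value"

# Variation 1: bind the match object and test it in the same expression.
if m := re.match(r"(\w+)=(\w+)", line):
    res = m.group(2)
else:
    res = None
```

Variations 2, 3 and 4 are not valid Python, so they have no runnable counterpart here.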

I also think expressions that use "or" or "and" to make a compound expression benefit from the infix style, mostly because each sub-expression stands on its own and is only made longer with the repetition of TARGET:

Example III:

1. if (diff := x - x_base) and (g := gcd(diff, n)) > 1: ...
2. if (x - x_base as diff) and (gcd(diff, n) as g) > 1: ...
3. if (let diff = x - x_base in diff) and (let g = gcd(diff, n) in g > 1): ...
4. if (diff given diff = x - x_base) and (g > 1 given g = gcd(diff, n)): ...
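
As a rough illustration of the infix form in Example III, the sketch below runs under Python 3.8 or later. The values of x, x_base and n are invented for the demonstration; only the shape of the condition comes from the example.

```python
from math import gcd

# Hypothetical values; only the shape of the condition matters.
x, x_base, n = 21, 6, 35

# Each sub-expression binds its own name and stands on its own.
# Because "and" short-circuits, g is bound only if diff is non-zero.
if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
    factor = g
else:
    factor = None
```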

In the more complex case where TARGET is reused in the expression, I find 3 and 4 to benefit as there is a separation of the binding from its usage. I can consider each expression separately and I don't have to deal with the assignment side effects at the same time. I believe this is what Neil is mostly arguing for.

# Borrowing from Andre, please forgive any mathematical problems like division by 0

Example IV:

1. [(-b/(2*a) + (D := sqrt( (b/(2*a))**2 - c/a)), -b/(2*a) - D)
       for a in range(10) 
       for b in range(10)
       for c in range(10)
       if D >= 0]
2. [(-b/(2*a) + (sqrt( (b/(2*a))**2 - c/a) as D), -b/(2*a) - D)
        for a in range(10) 
        for b in range(10)
        for c in range(10)
        if D >= 0]
3. [let D = sqrt( (b/(2*a))**2 - c/a) in 
       (-b/(2*a) + D, -b/(2*a) - D) 
       for a in range(10) 
       for b in range(10)
       for c in range(10)
       if D >= 0]
4. [(-b/(2*a) + D, -b/(2*a) - D) 
       for a in range(10) 
       for b in range(10)
       for c in range(10)
       if D >= 0
       given D = sqrt( (b/(2*a))**2 - c/a)]
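
One possible runnable spelling of Example IV with the infix form, assuming Python 3.8 or later. It departs from the listing above in a few labelled ways: the ranges are shortened and a starts at 1 to avoid division by zero, the name disc is introduced for the value under the square root so that math.sqrt never sees a negative number, and the first binding sits in the "if" clause because a comprehension evaluates its condition before its element expression.

```python
from math import sqrt

roots = [
    # D is bound while building the first tuple element and reused in the second.
    (-b / (2 * a) + (D := sqrt(disc)), -b / (2 * a) - D)
    for a in range(1, 4)   # start at 1: a == 0 would divide by zero
    for b in range(4)
    for c in range(4)
    # The condition runs first, so the discriminant has to be bound here.
    if (disc := (b / (2 * a)) ** 2 - c / a) >= 0
]
```

With prefix or postfix syntax the binding would be written once, apart from the tuple that uses it; here it is split across the condition and the element expression.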

Also in the case with multiple bindings I find that 3 and 4 benefit over 1 and 2:

Example V:

1. [(x := f(y := (z := f(i) ** 2) + 1)) for i in range(10)]
2. [(f((f(i) ** 2 as z) + 1 as y) as x) for i in range(10)]
3. [let x = f(y), y = z + 1, z = f(i) ** 2 in x for i in range(10)] # maybe the order of the let expressions should be reversed?
4. [x given x = f(y) given y = z + 1 given z = f(i) ** 2 for i in range(10)]
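
To make the comparison concrete, here is a small sketch (Python 3.8 or later) that runs the infix form of Example V next to the same bindings split into separate statements. The function f is a hypothetical stand-in, since the example does not define it.

```python
def f(n):
    # Hypothetical stand-in for the f in Example V.
    return n + 1

# Variation 1: three nested bindings inside a single expression.
nested = [(x := f(y := (z := f(i) ** 2) + 1)) for i in range(10)]

# The same bindings as separate statements, in evaluation order.
split = []
for i in range(10):
    z = f(i) ** 2
    y = z + 1
    x = f(y)
    split.append(x)
```

Both lists hold the same values; the difference is only in how far the reader has to unpick the nesting to see the order in which z, y and x are bound.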

No matter which variation we prefer, there are plenty of arguments to be made that multiple assignment expressions in a single expression, or usage of the TARGET later in the expression, are harder to work with in most cases. And since 1 and 2 (at least to me) are more difficult to parse in those situations, I'm more likely to push back on whoever writes that code to do it another way or split it into multiple statements.

I feel that Steven prefers 1, mostly for the reason that it makes Examples I, II, and III easier to write and easier to read. Neil prefers 4 because Examples I, II, and III still aren't that bad with 4, and Examples IV and V are easier to work with.

If you feel that Examples IV and V should be written differently in the first place, you probably prefer infix (1 or 2).

If you feel that Examples IV and V are going to be written anyway and you want them to be as readable as possible, you probably prefer prefix (3) or postfix (4).

If you want to know what all the TARGETs are assigned to up front, you probably prefer 1 or 3 (for reading from left to right).

If you want to see how the TARGET is used in the larger expression up front and are willing to read to the end to find out if or where the TARGET has been defined, you probably prefer 4.

In my mind, all 4 variations have merit. I think I prefer prefix or postfix (postfix feels very natural to me) because I believe more complex expressions should be separable (Neil argues better than I can for this).

But Steven has gone a long way to convince me that the sky won't fall if we choose an infix variation because in practice our better angels will push us away from using expressions that are too complex.

Prefix vs postfix is a discussion worth having if we decide that infix isn't the right choice.

I would love to see us reach consensus (too optimistic?) or at least an acknowledgment of the explicit tradeoffs for whichever variation we ultimately choose.


----- Original message -----
From: Matt Arcidy <marcidy at gmail.com>
To: Brendan Barnwell <brenbarn at brenbarn.net>
Cc: "python-ideas" <python-ideas at python.org>
Subject: Re: [Python-ideas] Inline assignments using "given" clauses
Date: Sun, 13 May 2018 11:53:20 -0700

On Sun, May 13, 2018, 11:28 Brendan Barnwell <brenbarn at brenbarn.net> wrote:
> On 2018-05-13 04:23, Steven D'Aprano wrote:
>  > In my experience mathematicians put the given *before* the statement:
>  >
>  >     Given a, b, c three sides of a triangle, then
>  >
>  >         Area = sqrt(s*(s-a)*(s-b)*(s-c))
>  >
>  >     where s = (a + b + c)/2 is the semi-perimeter of the triangle.
>  >
>  > For the record, that is almost exactly what I wrote for a student
>  > earlier today, and it's not just me, it is very similar to the wording
>  > used on both Wolfram Mathworld and Wikipedia's pages on Heron's Formula.
>  >
>  > http://mathworld.wolfram.com/HeronsFormula.html
>  >
>  > https://en.wikipedia.org/wiki/Heron%27s_formula
>  >
>  >
>  > Putting "given" after the expression is backwards.
>          Yes, but that's because we're ruling out the use of "where".  At this 
>  point I would be fine with "snicklefritz" as the keyword.  The point is 
>  that I want to put SOMETHING after the expression, and this is not at 
>  all unusual.  See for instance Wikipedia pages on the Riemann zeta 
>  function 
>  (https://en.wikipedia.org/wiki/Riemann_zeta_function#Definition), 
>  gravitation equation 
>  (https://en.wikipedia.org/wiki/Gravity#Newton%27s_theory_of_gravitation), and 
>  compound interest 
>  (https://en.wikipedia.org/wiki/Compound_interest#Mathematics_of_interest_rate_on_loans). 
>    If we have to use the word "given" even though the word mathematicians 
>  would use in that position is "where", that's not such a big deal.

It is a big deal.  Postfix requires more cognitive load: we will have no idea up front what's going on except for trivial examples.  More givens, more cognitive load.

If you think spending that is fine for you, I can't argue, but to say it doesn't matter isn't correct.

Two examples, which get far worse for complex cases.  The part to the left of the "for" can be as complex as you wish.
[ x + y for t in range(10)  ... ]

x = 10
y = 20
[ x + y for t in range(10) ...]

Up until you read the "..." you have no idea there will even be a substitution.  The lower one is even worse: you think you know, but then have to redo the whole problem with new information.

Also:
mathematicians don't just put the _word_ "given", they put givens, things that are known or assumed to be true.  Axioms and definitions, where definitions assign names to values.  This is for formal arguments.  Reassigning values is handled in postfix occasionally, once it is clear what x and y are.  But that's not what we are talking about if the name doesn't exist already.

Again, you want to use given, that's fine, but the math argument is wrong, as is the "it doesn't matter" argument, assuming the current neurological model for working memory continues to hold.

Maybe the difference is small, especially after familiarity sets in, but that doesn't mean the difference in load isn't there.  It will only increase for more complex statements with more givens.
> --
>  Brendan Barnwell
>  "Do not follow where the path may lead.  Go, instead, where there is no 
>  path, and leave a trail."
>      --author unknown
>  _______________________________________________
>  Python-ideas mailing list
>  Python-ideas at python.org
>  https://mail.python.org/mailman/listinfo/python-ideas
>  Code of Conduct: http://python.org/psf/codeofconduct/