[Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

Sun Jul 1 00:32:03 EDT 2018

[Nick Coghlan]

>>> "NAME := EXPR" exists on a different level of complexity, since it
>>> adds name binding in arbitrary expressions for the sake of minor
>>> performance improvement in code written by developers that are
>>> exceptionally averse to the use of vertical screen real estate,

>>> ...

[Tim]
>> Note that PEP 572 doesn't contain a single word about "performance" (neither
>> that specific word nor any synonym), and I gave only one thought to it when
>> writing Appendix A:  "is this going to slow anything down significantly?".
>> The answer was never "yes", which I thought was self-evident, so I never
>> mentioned it.  Neither did Chris or Guido.
>>
>> Best I can recall, nobody has argued for it on the grounds of "performance",
>> except in the indirect sense that sometimes it allows a more compact way of
>> reusing an expensive subexpression by giving it a name.   Which they already
>> do by giving it a name in a separate statement, so the possible improvement
>> would be in brevity rather than performance.

[Nick]

> The PEP specifically cites this example as motivation:

The PEP gives many examples.  Your original was a strawman
mischaracterization of the PEP's _motivations_ (note the plural:  you only
mentioned "minor performance improvement", and snipped my listing of the
major motivations).

>
>   group = re.match(data).group(1) if re.match(data) else None
>
> That code's already perfectly straightforward to read and write as a
> single line,

I disagree.  In any case of textual repetition, it's a visual
pattern-matching puzzle to identify the common substrings (I have to
visually scan that line about 3 times to be sure), and then a potentially
difficult conceptual puzzle to figure out whether side effects may result
in textually identical substrings evaluating to different objects.  That's
why "referential transparency" is so highly valued in functional
languages ("if subexpressions are spelled the same, they evaluate to the
same result, period").  That isn't generally true in Python:  to get that
guarantee, which is enormously helpful to reasoning, you have to ensure
the subexpression is evaluated exactly once.
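To make the side-effect concern concrete, here is a minimal sketch.  The
`next_id()` function is hypothetical, invented only for illustration:  it
is spelled the same at every call site but returns a different value each
time.

```python
import itertools

_counter = itertools.count()


def next_id():
    """Hypothetical function with a side effect: every call returns a new value."""
    return next(_counter)


# Two textually identical subexpressions, evaluated twice: the results differ.
repeated = (next_id(), next_id())

# Evaluate once, bind a name, reuse the name: the same object by construction.
value = next_id()
named = (value, value)
```

Nothing in the spelling of `next_id()` reveals which kind of function it
is; only binding the result to a name settles the question for the reader.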

And as you of all people should be complaining about, textual repetition is
also prone to "oops - forgot one!" and "oops! made a typo when changing the
second one!" when code is later modified.

> so the only reason to quibble about it

I gave you three better reasons to quibble about it just above ;-)

> is because it's slower than the arguably less clear two-line alternative:
>
>   _m = re.match(data)
>   group = _m.group(1) if _m else None
>
I find that much clearer than the one-liner above:  the visual pattern
matching is easier because the repeated substring is shorter and of much
simpler syntactic structure; it guarantees _by construction_ that the two
instances of `_m` evaluate to the same object, so there's no possible
concern about that (it doesn't even matter if you bound `re` to some
"non-standard" object that has nothing to do with Python's `re` module);
and any later changes to the single instance of `re.match(data)` don't have
to be repeated verbatim elsewhere.  It's possible that it runs twice as
fast too, but that's the least of my concerns.

All of those advantages are retained in the one-liner too if an assignment
expression can be used in it.
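A sketch of the two spellings side by side, assuming Python 3.8 or later
(where assignment expressions were eventually released).  The pattern and
the input string are hypothetical, chosen only so the snippet runs.  Note
that the released grammar requires parentheses around the assignment
expression when it appears as the condition of a conditional expression.

```python
import re

# Hypothetical pattern and input, purely for illustration.
pattern = r"(\w+)@example\.com"
data = "tim@example.com"

# Two-statement form: the match is computed once and bound to a name.
_m = re.match(pattern, data)
group_two_lines = _m.group(1) if _m else None

# One-liner with an assignment expression: still computed exactly once.
# The condition is evaluated first, so `m` is bound before `m.group(1)` runs.
group_one_line = m.group(1) if (m := re.match(pattern, data)) else None

# When nothing matches, both forms fall through to None.
no_match = m2.group(1) if (m2 := re.match(pattern, "no address here")) else None
```

Either way `re.match` is called once per result, and the repeated
substring the reader has to match up is a short name rather than a call.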

> Thus the PEP's argument is that it wants to allow the faster version
> to remain a one-liner that preserves the overall structure of the
> version that repeats the subexpression:
>
>   group = _m.group(1) if _m := re.match(data) else None
>
> That's a performance argument, not a readability one (as if you don't
> care about performance, you can just repeat the subexpression).
>
How does that differ from the part of what I said that you did retain above?

>> sometimes it allows a more compact way of reusing an expensive
>> subexpression by giving it a name.   Which they already do by giving
>> it a name in a separate statement, so the possible improvement would
>> be in brevity rather than performance.

You already realized the performance gain could be achieved by using two
statements.  The _additional_ performance gain by using assignment
expressions is at best trivial (it may save a LOAD_FAST opcode to fetch the
object bound to `_m` for the `if` test).
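For anyone who wants to look at the difference on their own interpreter,
here is a rough sketch using the standard `dis` module.  It assumes Python
3.8 or later; `PATTERN` and both function names are hypothetical.  Opcode
names and counts vary across CPython versions, so no particular output is
claimed here.

```python
import dis
import re

PATTERN = re.compile(r"(\w+)@")  # hypothetical pattern, for illustration only


def two_statements(data):
    # Bind the match in its own statement, then test and use the name.
    _m = PATTERN.match(data)
    return _m.group(1) if _m else None


def one_liner(data):
    # Bind the match inside the expression that tests it.
    return _m.group(1) if (_m := PATTERN.match(data)) else None


def opnames(func):
    """Return the opcode names the compiler emitted for func."""
    return [ins.opname for ins in dis.get_instructions(func)]


two_ops = opnames(two_statements)
one_ops = opnames(one_liner)
```

Printing `two_ops` and `one_ops` next to each other shows how little the
compiled code differs; both functions call `PATTERN.match` exactly once.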

So, no, gaining performance is _not_ the motivation here.  You already had
a way to make it "run fast".  The motivation is the _brevity_ assignment
expressions allow while _retaining_ all of the two-statement form's
advantages in easier readability, easier reasoning, reduced redundancy, and
performance.

As Guido said, in the PEP, of the example you gave here:

    Guido found several examples where a programmer repeated
    a subexpression, slowing down the program, in order to save
    one line of code

It couldn't possibly be clearer that Guido thought the programmer's
motivation was brevity ("in order to save one line of code").  Guido only
happened to mention that they were willing to slow down the code to get
that brevity, but, as above, they were also willing to make the code harder
to read, reason about, and maintain.  With the assignment expression, they
don't have to give up any of the latter to get the brevity they mistakenly
_think_ ;-) they care most about - and, indeed, they can make it even
briefer.

I sure don't count it against the PEP that it may trick people overly
concerned with brevity into writing code that's clearer and faster too, but
that's a tiny indirect part of the PEP's motivation_s_ (note the plural
again).