[Python-Dev] PEP 572: Write vs Read, Understand and Control Flow

Tue Apr 24 05:21:34 EDT 2018

Hi,

I have been asked to express myself on the PEP 572. I'm not sure that
it's useful, but here is my personal opinion on the proposed
"assignment expressions".

PEP 572 -- Assignment Expressions:
https://www.python.org/dev/peps/pep-0572/

First of all, I concur with others: Chris Angelico did a great job
designing a good and complete PEP, along with a working implementation
which is useful to play with!

WARNING! I was (strongly) opposed to PEP 448 Unpacking Generalizations
(ex: [1, 2, *list]) and PEP 498 f-strings (f"Hello {name}"), whereas I
am now a happy user of these new syntaxes. So I'm not sure that I have
good taste :-)

Tim Peters gave the following example. "LONG" version:

diff = x - x_base
if diff:
    g = gcd(diff, n)
    if g > 1:
        return g

versus the "SHORT" version:

if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
    return g
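
A minimal sketch to check that the two versions behave the same,
assuming an interpreter that supports ":=" (Python 3.8 or later). The
function names and the sample values are made up for illustration; only
the bodies come from the example above.

```python
from math import gcd

def factor_long(x, x_base, n):
    # LONG version: one step per line.
    diff = x - x_base
    if diff:
        g = gcd(diff, n)
        if g > 1:
            return g
    return None

def factor_short(x, x_base, n):
    # SHORT version: "and" short-circuits, so gcd() is only called
    # when diff is non-zero, exactly as in the LONG version.
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None

# Hypothetical sample inputs: (x, x_base, n)
samples = [(10, 4, 9), (4, 4, 9), (5, 4, 9)]
results_long = [factor_long(*s) for s in samples]
results_short = [factor_short(*s) for s in samples]
print(results_long == results_short)
```

Both versions return the same values for these inputs; the difference
discussed below is only about how easy each one is to read and debug.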

== Write ==

If your job is to write code: the SHORT version can be preferred since
it's closer to what you have in mind and the code is shorter. When you
read your own code, it seems straightforward and you like to see
everything on the same line.

The LONG version looks like your expressiveness is limited by the
computer. It's like having to use simple words when you talk to a
child, because a child is unable to understand more subtle and
advanced sentences. You want to write beautiful code for adults,
right?

== Read and Understand ==

In my professional experience, I have spent most of my time reading
code rather than writing it. By reading, I mean: trying to understand
why this specific bug that cannot occur... is always reproduced by the
customer, whereas we fail to reproduce it in our test lab :-) This bug
is impossible, you know it, right?

So let's say that you have never read the example before, and it has a bug.

By "reading the code", I really mean understanding here. In your
opinion, which version is easier to *understand*, without actually
running the code?

IMHO the LONG version is simpler to understand because the code is
straightforward: it's easy to "guess" the *control flow* (guess in
which order instructions will be executed).

Print the code on paper and try to draw lines to follow the control
flow. That may make it easier to see why SHORT is harder to understand
than LONG.

== Debug ==

Now let's imagine that you can run the code (someone managed to
reproduce the bug in the test lab!). Since it has a bug, you now
likely want to understand why the bug occurs using a debugger.

Sadly, most debuggers are designed as if a single line of code can only
execute a single instruction. I tried pdb: you cannot run only (diff
:= x - x_base) and then get the value of "diff" before running the
second assignment; you can only execute the *full line* at once.

I would say that the LONG version is easier to debug, at least using pdb.

I regularly use gdb, which implements the "step" command as I
expect (don't execute the full line, execute sub-expressions one by
one), but it's still harder to follow the control flow when a single
line contains multiple instructions than when each line contains a
single instruction.

You can see it as a limitation of pdb, but many tools only have the
granularity of a whole line. Think about tracebacks. If you get an
exception at "line 1" in the SHORT example (the long "if" expression),
what can you deduce from the line number? What happened?

If you get an exception in the LONG example, the line number gives you
a little bit more information... maybe just enough to understand the
bug?
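
A small sketch of that difference, assuming Python 3.8 or later. The
helper failing_line(), the function names, broken_gcd() and the failing
inputs are all made up for illustration: each input makes a different
step fail, and the helper reports the line (relative to the "def" line)
that the traceback points to.

```python
import math

def long_version(x, x_base, n, gcd):
    diff = x - x_base
    if diff:
        g = gcd(diff, n)
        if g > 1:
            return g
    return None

def short_version(x, x_base, n, gcd):
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None

def failing_line(func, *args):
    # Line offset (relative to the "def" line) recorded in the
    # traceback for func's own frame, or None if nothing was raised.
    try:
        func(*args)
    except Exception as exc:
        tb = exc.__traceback__
        while tb is not None:
            if tb.tb_frame.f_code is func.__code__:
                return tb.tb_lineno - func.__code__.co_firstlineno
            tb = tb.tb_next
    return None

def broken_gcd(a, b):
    # Hypothetical buggy helper: returning None makes "g > 1" fail.
    return None

failures = [
    ("a", 1, 10, math.gcd),   # the subtraction raises TypeError
    (2.5, 1, 10, math.gcd),   # gcd() raises TypeError (float argument)
    (5, 1, 10, broken_gcd),   # "g > 1" raises TypeError (None > 1)
]
long_lines = [failing_line(long_version, *args) for args in failures]
short_lines = [failing_line(short_version, *args) for args in failures]
print(long_lines)
print(short_lines)
```

With LONG, the three failures are reported on three different lines.
With SHORT, all three are reported on the same line, so the line number
alone cannot tell you which step went wrong.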

Example showing the pdb limitation:

>>> def f():
...  breakpoint()
...  if (x:=1) and (y:=2): pass
...
>>> f()
> <stdin>(3)f()

(Pdb) p x
*** NameError: name 'x' is not defined
(Pdb) p y
*** NameError: name 'y' is not defined

(Pdb) step
--Return--
> <stdin>(3)f()->None

(Pdb) p x
1
(Pdb) p y
2

... oh, pdb went too far. I expected a break after "x := 1" and before
"y := 2" :-(
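
A rough sketch of where this granularity comes from, assuming that
pdb's stepping is driven by the per-line events delivered through
sys.settrace() (pdb is built on bdb, which uses that hook). The helper
and function names below are made up for illustration; the code only
records which lines produce a "line" event.

```python
import sys

def short_version():
    if (x := 1) and (y := 2):
        return x + y

def long_version():
    x = 1
    if x:
        y = 2
        if y:
            return x + y

def line_events(func):
    # Collect the line offsets (relative to the "def" line) for which
    # the interpreter emits a "line" trace event while running func.
    offsets = []
    code = func.__code__

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other frames
        if event == "line":
            offsets.append(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func()
    finally:
        sys.settrace(previous)
    return offsets

short_events = line_events(short_version)
long_events = line_events(long_version)
print(short_events)
print(long_events)
```

The "if" line of the SHORT version yields a single event even though it
contains two assignments, whereas each assignment of the LONG version
gets its own event, and so its own place for a line-based debugger to
stop.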

== Write code for babies! ==

Please don't write code for yourself, but write code for babies! :-)
These babies are going to maintain your code for the next 5 years,
after you have moved to a different team or project. Be kind to your
coworkers and juniors!

I try to write a single instruction per line whenever possible,
even if the language I'm using allows much more complex expressions.
Even though the C language allows assignments in if, I avoid them,
because I regularly have to debug my own code in gdb ;-)

Now the question is which Python features are allowed for babies. I
recall that a colleague was surprised and confused by context
managers. Does it mean that try/finally should be preferred? What
about f'Hello {name.title()}', which calls a method inside a "string"
(formatting)? Or metaclasses? I guess that the limit should depend on
your team, and may be explained in the coding style designed by your
whole team?

Victor