Re: [Python-Dev] PEP 572: Write vs Read, Understand and Control Flow
[Victor Stinner] ...
Tim Peters gave the following example. "LONG" version:
    diff = x - x_base
    if diff:
        g = gcd(diff, n)
        if g > 1:
            return g
versus the "SHORT" version:
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
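For readers who want to run the two spellings side by side, here is a minimal sketch. The wrapper names `find_factor_long` and `find_factor_short` are hypothetical, `gcd` is assumed to be `math.gcd`, and the SHORT form needs Python 3.8 or later, the first release with assignment expressions.

```python
from math import gcd


def find_factor_long(x, x_base, n):
    """LONG spelling: one binding per statement."""
    diff = x - x_base
    if diff:
        g = gcd(diff, n)
        if g > 1:
            return g
    return None  # no nontrivial common factor found


def find_factor_short(x, x_base, n):
    """SHORT spelling: both bindings happen inside the `if` test."""
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None


# The two spellings agree on every input in a small grid.
for x in range(-5, 6):
    for x_base in range(-5, 6):
        assert find_factor_long(x, x_base, 12) == find_factor_short(x, x_base, 12)
```

The explicit `return None` lines are only there to make the fall-through visible; the original fragments sit inside a larger function that is not shown in the thread.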
== Write ==
If your job is to write code: the SHORT version can be preferred since it's closer to what you have in mind and the code is shorter. When you read your own code, it seems straightforward and you like to see everything on the same line.
All so, but a bit more: in context, this is just one block in a complex algorithm. The amount of _vertical_ screen space it consumes directly affects how much of what comes before and after it can be seen without scrolling. Understanding this one block in isolation is approximately useless unless you can also see how it fits into the whole. Saving 3 lines of 5 is substantial, but it's more often saving 1 of 5 or 6. Regardless, they add up.
The LONG version looks like your expressiveness is limited by the computer. It's like having to use simple words when you talk to a child, because a child is unable to understand more subtle and advanced sentences. You want to write beautiful code for adults, right?
I want _the whole_ to be as transparent as possible. That's a complicated balancing act in practice.
== Read and Understand ==
In my professional experience, I have spent most of my time reading code rather than writing it. By reading, I mean: trying to understand why this specific bug that cannot occur... is always reproduced by the customer, whereas we fail to reproduce it in our test lab :-) This bug is impossible, you know it, right?
So let's say that you never read the example before, and it has a bug.
Then you're screwed - pay me to fix it ;-) Seriously, as above, this block on its own is senseless without understanding both the mathematics behind what it's doing, and on how all the code before it picked `x` and `x_base` to begin with.
By "reading the code", I really mean understanding here. In your opinion, which version is easier to *understand*, without actually running the code?
Honestly, I find the shorter version a bit easier to understand: fewer indentation levels, and less semantically empty repetition of names.
IMHO the LONG version is simpler to understand, since the code is straightforward, it's easy to "guess" the *control flow* (guess in which order instructions will be executed).
You're saying you don't know that in "x and y" Python evaluates x first, and only evaluates y if x "is truthy"? Sorry, but this seems trivial to me in either spelling.
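The evaluation order being referred to can be checked directly. This is a small sketch, not part of the original exchange; `traced` is a hypothetical helper that records which operands were actually evaluated.

```python
calls = []


def traced(label, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(label)
    return value


# Left operand is falsy: the right operand is never evaluated.
result_falsy = traced("x", 0) and traced("y", 5)
calls_falsy = list(calls)

calls.clear()

# Left operand is truthy: the right operand is evaluated and becomes the result.
result_truthy = traced("x", 3) and traced("y", 5)
calls_truthy = list(calls)
```

In the SHORT example this is why `gcd(diff, n)` is only called when `diff` is nonzero, exactly as in the LONG version.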
Print the code on paper and try to draw lines to follow the control flow. It may be easier to understand how SHORT is more complex to understand than LONG.
Since they're semantically identical, there's _something_ suspect about a conclusion that one is _necessarily_ harder to understand than the other ;-) I don't have a problem with you finding the longer version easier to understand, but I do have a problem if you have a problem with me finding the shorter easier.
== Debug ==
Now let's imagine that you can run the code (someone succeeded in reproducing the bug in the test lab!). Since it has a bug, you now likely want to try to understand why the bug occurs using a debugger.
Sadly, most debuggers are designed as if a single line of code can only execute a single instruction. I tried pdb: you cannot run only (diff := x - x_base) and then get the value of "diff" before running the second assignment; you can only execute the *full line* at once.
I would say that the LONG version is easier to debug, at least using pdb.
That might be a good reason to avoid, say, list comprehensions (highly complex expressions of just about any kind), but I think this overlooks the primary _point_ of "binding expressions": to give names to intermediate results. I couldn't care less if pdb executes the whole "if" statement in one gulp, because I get exactly the same info either way: the names `diff` and `g` bound to the results of the expressions they named. What actual difference does it make whether pdb binds the names one at a time, or both, before it returns to the prompt? Binding expressions are debugger-friendly in that they _don't_ just vanish without a trace. It's their purpose to _capture_ the values of the expressions they name. Indeed, you may want to add them all over the place inside expressions, never intending to use the names, just so that you can see otherwise-ephemeral intra-expression results in your debugger ;-)
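A minimal sketch of that claim, under stated assumptions: `names_after_test` is a hypothetical helper, and the `locals()` snapshot stands in for what a debugger would display once the whole `if` line has run.

```python
from math import gcd


def names_after_test(x, x_base, n):
    """Run the SHORT test, then report which of its names are bound."""
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        pass
    # Snapshot the locals, roughly what a debugger shows after the line ran.
    snapshot = dict(locals())
    return {name: snapshot[name] for name in ("diff", "g") if name in snapshot}


both_bound = names_after_test(10, 4, 12)   # diff is 6, gcd(6, 12) is 6
only_diff = names_after_test(4, 4, 12)     # diff is 0, so gcd is never called
```

One detail worth noticing: when `diff` is zero the `and` short-circuits, so `g` is never bound. The LONG version behaves the same way, since its `g = gcd(diff, n)` line is skipped in that case too.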
... Think about tracebacks. If you get an exception at "line 1" in the SHORT example (the long "if" expression), what can you deduce from the line number? What happened?
If you get an exception in the LONG example, the line number gives you a little bit more information... maybe just enough to understand the bug?
This one I wholly agree with, in general. In the specific example at hand, it's weak, because there's so little that _could_ raise an exception. For example, if the variables weren't bound to integers, in context the code would have blown up long before reaching this block. Python ints are unbounded, so overflow in "-" or "gcd" isn't possible either. MemoryError is theoretically possible, and in that case it would be good to know whether it happened during "-" or during "gcd()". Good to know, but not really helpful, because either way you ran out of memory :-(
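The traceback argument can be illustrated with a small sketch. It deliberately feeds non-integer inputs, which the reply above says would not reach this block in the real program, purely to force an exception at each step. `failing_line` is a hypothetical helper that reports the line offset, relative to the `def` line, at which the exception surfaced.

```python
from math import gcd


def long_version(x, x_base, n):
    diff = x - x_base           # offset 1
    if diff:
        g = gcd(diff, n)        # offset 3
        if g > 1:
            return g


def short_version(x, x_base, n):
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:   # offset 1
        return g


def failing_line(func, *args):
    """Return the line offset inside func where an exception was raised."""
    try:
        func(*args)
    except Exception as exc:
        tb = exc.__traceback__
        # Walk the traceback until we reach the frame that belongs to func.
        while tb is not None:
            if tb.tb_frame.f_code is func.__code__:
                return tb.tb_lineno - func.__code__.co_firstlineno
            tb = tb.tb_next
    return None


# A str minus an int fails in the subtraction; a float fails inside gcd().
long_sub = failing_line(long_version, "a", 1, 5)
long_gcd = failing_line(long_version, 2.5, 1, 5)
short_sub = failing_line(short_version, "a", 1, 5)
short_gcd = failing_line(short_version, 2.5, 1, 5)
```

The LONG version reports two different offsets for the two failures, while the SHORT version reports the same offset for both, which is the extra information the line number carries.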
== Write code for babies! ==
Please don't write code for yourself, but write code for babies! :-)
These babies are going to maintain your code for the next 5 years, after you have moved to a different team or project. Be kind to your coworkers and juniors!
I'm trying to write a single instruction per line whenever possible, even if the language I'm using allows much more complex expressions. Even if the C language allows assignments in if, I avoid them, because I regularly have to debug my own code in gdb ;-)
Now the question is which Python features are allowed for babies. I recall that a colleague was surprised and confused by context managers. Does it mean that try/finally should be preferred? What about f'Hello {name.title()}' which calls a method inside a "string" (formatting)? Or metaclasses? I guess that the limit should depend on your team, and may be explained in the coding style designed by your whole team?
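For concreteness, here are the two pairs of spellings that question compares, as a rough sketch. `Resource` is a hypothetical class invented for illustration; it only records whether it was released.

```python
class Resource:
    """Hypothetical resource that records whether it was released."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


name = "ada lovelace"

# Spelling 1: explicit try/finally and str.format().
r1 = Resource()
try:
    greeting1 = "Hello {}".format(name.title())
finally:
    r1.close()

# Spelling 2: context manager and f-string.
with Resource() as r2:
    greeting2 = f"Hello {name.title()}"
```

Both spellings release the resource and build the same string; which one a team treats as "baby-friendly" is the style question being raised.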
It's the kind of thing I prefer to leave to team style guides, because consensus will never be reached. In a different recent thread, someone complained about using functions at all, because their names are never wholly accurate, and in any case they hide what's "really" going on. To my eyes, that was an unreasonably extreme "write code for babies" position. If a style guide banned using "and" or "or" in Python "if" or "while" tests, I'd find that less extreme, but also unreasonable. But if a style guide banned functions with more than 50 formal arguments, I'd find that unreasonably tolerant. Luckily, I only have to write code for me now, so am free to pick the perfect compromise in every case ;-)
On Tue, Apr 24, 2018 at 08:10:49PM -0500, Tim Peters wrote:
Binding expressions are debugger-friendly in that they _don't_ just vanish without a trace. It's their purpose to _capture_ the values of the expressions they name. Indeed, you may want to add them all over the place inside expressions, never intending to use the names, just so that you can see otherwise-ephemeral intra-expression results in your debugger ;-)
That's a fantastic point and I'm surprised nobody has thought of it until now (that I've seen). Chris, if you're still reading this and aren't yet heartedly sick and tired of the PEP *wink* this ought to go in as another motivating point. -- Steve
[Tim]
Binding expressions are debugger-friendly in that they _don't_ just vanish without a trace. It's their purpose to _capture_ the values of the expressions they name. Indeed, you may want to add them all over the place inside expressions, never intending to use the names, just so that you can see otherwise-ephemeral intra-expression results in your debugger ;-)
[Steven D'Aprano <steve@pearwood.info>] wrote:
That's a fantastic point and I'm surprised nobody has thought of it until now (that I've seen).
Chris, if you're still reading this and aren't yet heartedly sick and tired of the PEP *wink* this ought to go in as another motivating point.
You know, I thought I was joking when I wrote that - but after I sent it I realized I wasn't ;-) It would actually be quite convenient, and far less error-prone, to add a binding construct inside a complicated expression for purposes of running under a debugger. The alternative is typing the sub-expression(s) of interest by hand at the debugger prompt, or adding print()s, both of which are prone to introducing typos, or changing results radically due to triggering side effects in the code invoked by the duplicated sub-expression(s). Adding a binding construct wouldn't change anything about how the code worked (apart from possibly clobbering a local name).
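A small sketch of the hazard described here, assuming a sub-expression with a side effect (consuming an iterator). All three function names are hypothetical.

```python
def total_original(values):
    """The expression under investigation: first item plus the rest."""
    it = iter(values)
    return next(it) + sum(it)


def total_with_duplicate_peek(values):
    """Debugging by re-typing the sub-expression: next() now runs twice."""
    it = iter(values)
    peeked = next(it)             # duplicated sub-expression consumes an item
    result = next(it) + sum(it)   # the original expression sees shifted data
    return peeked, result


def total_with_binding(values):
    """Debugging with an inline binding: next() still runs exactly once."""
    it = iter(values)
    result = (first := next(it)) + sum(it)
    return first, result


data = [10, 1, 2, 3]
original = total_original(data)
peeked, shifted = total_with_duplicate_peek(data)
first, unchanged = total_with_binding(data)
```

The duplicated `next(it)` changes the answer, while the `first :=` version exposes the same intermediate value and leaves the result alone, apart from adding the local name `first`.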
On Tue, Apr 24, 2018 at 8:56 PM, Tim Peters <tim.peters@gmail.com> wrote:
It would actually be quite convenient, and far less error-prone, to add a binding construct inside a complicated expression for purposes of running under a debugger. The alternative is typing the sub-expression(s) of interest by hand at the debugger prompt, or adding print()s, both of which are prone to introducing typos, or changing results radically due to triggering side effects in the code invoked by the duplicated sub-expression(s). Adding a binding construct wouldn't change anything about how the code worked (apart from possibly clobbering a local name).
I thought this was what q was for :-) https://pypi.org/project/q/ -n -- Nathaniel J. Smith -- https://vorpus.org
On 2018-04-24 21:05, Nathaniel Smith wrote:
On Tue, Apr 24, 2018 at 8:56 PM, Tim Peters <tim.peters@gmail.com> wrote:
It would actually be quite convenient, and far less error-prone, to add a binding construct inside a complicated expression for purposes of running under a debugger. The alternative is typing the
A bit like breaking complicated expressions into several lines, with or without assignments for readability. ;-)
I thought this was what q was for :-)
Where have you been all my life, q? -Mike
On 4/24/2018 8:56 PM, Tim Peters wrote:
The alternative is typing the sub-expression(s) of interest by hand at the debugger prompt, or adding print()s, both of which are prone to introducing typos, or changing results radically due to triggering side effects in the code invoked by the duplicated sub-expression(s). Adding a binding construct wouldn't change anything about how the code worked (apart from possibly clobbering a local name).
I've done both subexpression mangling (attempts at duplication) and added print statements, and experienced these negative side effects, taking twice as long (or more) as intended to get the debugging information needed.
I'm no core developer. I would have a mild appreciation of avoiding those while True: loops, so I was generally in favor of this PEP, but not enough to be inspired to speak up about it. I would, however, frequently benefit from this capability: adding extra binding names, and printing _them_ instead of the duplicated subexpressions. +1 Glenn
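The "while True:" loops mentioned here are presumably the loop-and-a-half pattern. A minimal sketch of both spellings follows; the function names are hypothetical and `io.BytesIO` is used only as a stand-in stream.

```python
import io


def read_chunks_while_true(stream, size):
    """Loop-and-a-half: bind, test, and break inside `while True:`."""
    chunks = []
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        chunks.append(chunk)
    return chunks


def read_chunks_binding(stream, size):
    """Same loop with the binding moved into the `while` test."""
    chunks = []
    while chunk := stream.read(size):
        chunks.append(chunk)
    return chunks


old_style = read_chunks_while_true(io.BytesIO(b"abcdefg"), 3)
new_style = read_chunks_binding(io.BytesIO(b"abcdefg"), 3)
```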
On Wed, Apr 25, 2018 at 4:56 AM, Tim Peters <tim.peters@gmail.com> wrote:
[Tim]
Binding expressions are debugger-friendly in that they _don't_ just vanish without a trace. It's their purpose to _capture_ the values of the expressions they name. Indeed, you may want to add them all over the place inside expressions, never intending to use the names, just so that you can see otherwise-ephemeral intra-expression results in your debugger ;-)
[Steven D'Aprano <steve@pearwood.info>] wrote:
That's a fantastic point and I'm surprised nobody has thought of it until now (that I've seen).
Chris, if you're still reading this and aren't yet heartedly sick and tired of the PEP *wink* this ought to go in as another motivating point.
You know, I thought I was joking when I wrote that - but after I sent it I realized I wasn't ;-)
You just don't realise how perspicacious you truly are, Tim!
It would actually be quite convenient, and far less error-prone, to add a binding construct inside a complicated expression for purposes of running under a debugger. The alternative is typing the sub-expression(s) of interest by hand at the debugger prompt, or adding print()s, both of which are prone to introducing typos, or changing results radically due to triggering side effects in the code invoked by the duplicated sub-expression(s). Adding a binding construct wouldn't change anything about how the code worked (apart from possibly clobbering a local name).
Indeed, in the cases where I currently find myself unwrapping expressions to capture their values in local variables for debugging purposes it would usually be far less intrusive to bind a name to the expression inline, then use the debugger to inspect the value.
I agree that this is a definite plus feature. Being able to tag subexpressions would make visual debuggers that show all local variables as one steps (like IDLE's) even more useful relative to print statements. -- Terry Jan Reedy
On Wed, Apr 25, 2018 at 2:17 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 4/25/2018 6:10 AM, Steve Holden wrote:
Indeed, in the cases where I currently find myself unwrapping expressions to capture their values in local variables for debugging purposes it would usually be far less intrusive to bind a name to the expression inline, then use the debugger to inspect the value.
I agree that this is a definite plus feature. Being able to tag subexpressions would make visual debuggers that show all local variables as one steps (like IDLE's) even more useful relative to print statements.
Some other programming languages (thinking of Racket) solve this by having the debugger let you step through expression evaluation, without editing the code. e.g. in the line x = 1 + 2 * 3, we might step through and first evaluate 2*3 (-> 6), and then 1 + <result> (-> 7). Similar to how Python already lets you step into and see the result of function calls. This is especially powerful in visual debuggers, where the stepping and output can be displayed very intuitively. -- Devin
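To give a feel for expression-level stepping, here is a toy sketch built on the standard `ast` module. It is not how Racket's stepper or pdb works; `step_eval` is a hypothetical function that handles only numeric constants and four binary operators, and it assumes Python 3.9 or later for `ast.unparse`.

```python
import ast
import operator

# Only these binary operators are supported in this toy stepper.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def step_eval(source):
    """Evaluate an arithmetic expression, recording each sub-result in order."""
    steps = []

    def visit(node):
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = visit(node.left)
            right = visit(node.right)
            result = _OPS[type(node.op)](left, right)
            steps.append((ast.unparse(node), result))
            return result
        raise ValueError("unsupported expression: " + ast.dump(node))

    value = visit(ast.parse(source, mode="eval").body)
    return value, steps


value, steps = step_eval("1 + 2 * 3")
for text, result in steps:
    print(text, "->", result)
```

For `1 + 2 * 3` the recorded steps are `2 * 3` first and then the whole expression, matching the order described in the message.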
Devin Jeanpierre writes:
Some other programming languages (thinking of Racket) solve this by having the debugger let you step through expression evaluation, without editing the code.
Good tools are a wonderful thing, and I think pdb should be enhanced that way (by somebody who has the time and interest, not me and not necessarily you ;-). Nevertheless, "printf debugging" continues to be very popular, and good support for printf debugging is indeed the killer app for binding expressions as far as I'm concerned. Tim's humorous insight took me from -0.8 all the way to +1. Nice job, Chris! Good luck with the pronouncement! Steve
On Wed, Apr 25, 2018 at 1:43 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Apr 24, 2018 at 08:10:49PM -0500, Tim Peters wrote:
Binding expressions are debugger-friendly in that they _don't_ just vanish without a trace. It's their purpose to _capture_ the values of the expressions they name. Indeed, you may want to add them all over the place inside expressions, never intending to use the names, just so that you can see otherwise-ephemeral intra-expression results in your debugger ;-)
That's a fantastic point and I'm surprised nobody has thought of it until now (that I've seen).
Chris, if you're still reading this and aren't yet heartedly sick and tired of the PEP *wink* this ought to go in as another motivating point.
Yes, I'm still reading... but I use pdb approximately zero percent of the time, so I actually have no idea what's useful for single-stepping through Python code. So I'm going to write 'blind' here for a bit; do you reckon this sounds reasonably accurate?

-- existing rationale --

Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. Currently, this feature is available only in statement form, making it unavailable in list comprehensions and other expression contexts. Merely introducing a way to assign as an expression would create bizarre edge cases around comprehensions, though, and to avoid the worst of the confusions, we change the definition of comprehensions, causing some edge cases to be interpreted differently, but maintaining the existing behaviour in the majority of situations.

-- end existing rationale, begin new text --

Additionally, naming sub-parts of a large expression can assist an interactive debugger, providing useful display hooks and partial results. Without a way to capture sub-expressions inline, this would require refactoring of the original code; with assignment expressions, this merely requires the insertion of a few ``name :=`` markers. Removing the need to refactor reduces the likelihood that the code be inadvertently changed as part of debugging (a common cause of Heisenbugs), and is easier to dictate to a student or junior programmer.

-- end --

ChrisA
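As an illustration of the "unavailable in list comprehensions" part of the rationale, here is a minimal sketch. `parse_int` and the sample data are hypothetical, chosen only to show a result being named once and reused inside a comprehension.

```python
def parse_int(text):
    """Hypothetical helper: return int(text), or None if text is not an integer."""
    try:
        return int(text)
    except ValueError:
        return None


raw = ["3", "x", "10", "", "-4"]

# Without a binding, parse_int has to be called twice for every kept item.
called_twice = [parse_int(s) for s in raw if parse_int(s) is not None]

# With a binding, the result is named once and reused.
called_once = [n for s in raw if (n := parse_int(s)) is not None]
```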
Chris Angelico writes:
Additionally, naming sub-parts of a large expression can assist an interactive debugger, providing useful display hooks and partial results. Without a way to capture sub-expressions inline, this would require refactoring of the original code; with assignment expressions, this merely requires the insertion of a few ``name :=`` markers. Removing the need to refactor reduces the likelihood that the code be inadvertently changed as part of debugging (a common cause of Heisenbugs),
Period here preferred.
and is easier to dictate to a student or junior programmer.
True but gratuitous. It's also true that it's easier to dictate to Guido or Tim, though you might be happier if you let them refactor!
On Thu, Apr 26, 2018 at 3:54 PM, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Chris Angelico writes:
Additionally, naming sub-parts of a large expression can assist an interactive debugger, providing useful display hooks and partial results. Without a way to capture sub-expressions inline, this would require refactoring of the original code; with assignment expressions, this merely requires the insertion of a few ``name :=`` markers. Removing the need to refactor reduces the likelihood that the code be inadvertently changed as part of debugging (a common cause of Heisenbugs),
Period here preferred.
and is easier to dictate to a student or junior programmer.
True but gratuitous. It's also true that it's easier to dictate to Guido or Tim, though you might be happier if you let them refactor!
Well, true. The point isn't WHO you're dictating to, but that you can dictate it at all. "Hmm, let's see. Toss a 'foo colon-equals' in front of X, then print out what foo is." My day job involves a lot of helping students learn how to debug, so I say this kind of thing a lot (even if it's obvious to me what the problem is, because the student needs to learn debugging, not just be told what to fix). Refactoring just for the sake of a print call is overkill and potentially risky (if the student edits the wrong thing). Feel free to suggest an alternate wording. ChrisA
On Wed, Apr 25, 2018 at 11:53 PM, Chris Angelico <rosuav@gmail.com> wrote:
Well, true. The point isn't WHO you're dictating to, but that you can dictate it at all. "Hmm, let's see. Toss a 'foo colon-equals' in front of X, then print out what foo is." My day job involves a lot of helping students learn how to debug, so I say this kind of thing a lot (even if it's obvious to me what the problem is, because the student needs to learn debugging, not just be told what to fix). Refactoring just for the sake of a print call is overkill and potentially risky (if the student edits the wrong thing).
This is overstating things slightly... the best alternative to 'foo colon-equals' isn't risky refactoring so you can call print, it's a helper like:

    def p(obj):
        print(obj)
        return obj

that you can sprinkle inside existing expressions. Expecting new users to realize that this is possible, and a good idea, and to implement it, and get it right, while they're in the middle of being confused about basic Python things, is not terribly reasonable, so it's probably underused. But there are ways we could address that. -n -- Nathaniel J. Smith -- https://vorpus.org
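A short usage sketch for the `p` helper, wrapping one sub-expression from the thread's running example. `check` is a hypothetical function, and the `redirect_stdout` capture is only there so the printed text can be inspected.

```python
import contextlib
import io
from math import gcd


def p(obj):
    """Print obj and hand it back unchanged, so it can wrap any sub-expression."""
    print(obj)
    return obj


def check(x, x_base, n):
    # p(...) wraps the sub-expression of interest without changing the result.
    return gcd(p(x - x_base), n) > 1


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    answer = check(10, 4, 12)
printed = buffer.getvalue()
```

Unlike an inline binding, the helper shows the value immediately but leaves no name behind to inspect later in a debugger.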
Chris Angelico writes:
Well, true. The point isn't WHO you're dictating to,
By "period here preferred," I meant I think it's mostly a waste of space to mention dictation at all in that document. But it's not a big deal to me, so how about changing "a student or junior programmer" to "another programmer"?
On Thu, Apr 26, 2018 at 9:23 PM, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Chris Angelico writes:
Well, true. The point isn't WHO you're dictating to,
By "period here preferred," I meant I think it's mostly a waste of space to mention dictation at all in that document. But it's not a big deal to me, so how about changing "a student or junior programmer" to "another programmer"?
Sure, that works. ChrisA
On 25.04.18 05:43, Steven D'Aprano wrote:
On Tue, Apr 24, 2018 at 08:10:49PM -0500, Tim Peters wrote:
Binding expressions are debugger-friendly in that they _don't_ just vanish without a trace. It's their purpose to _capture_ the values of the expressions they name. Indeed, you may want to add them all over the place inside expressions, never intending to use the names, just so that you can see otherwise-ephemeral intra-expression results in your debugger ;-)
That's a fantastic point and I'm surprised nobody has thought of it until now (that I've seen).
Chris, if you're still reading this and aren't yet heartedly sick and tired of the PEP *wink* this ought to go in as another motivating point.
Yay, that's like a dream, really fantastic. So sorry that I was way too deep in development in spring and did not read earlier about that PEP. I was actually a bit reluctant about "yet another way to prove Python no longer simple" and now even that Pascal-ish look! :-) But this argument has completely sold me. Marvellous!
--
Christian Tismer-Sperling :^) tismer@stackless.com
Software Consulting : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : http://pyside.org
14482 Potsdam : GPG key -> 0xE7301150FB7BEE0E
phone +49 173 24 18 776 fax +49 (30) 700143-0023
participants (11)
- Chris Angelico
- Christian Tismer
- Devin Jeanpierre
- Glenn Linderman
- Mike Miller
- Nathaniel Smith
- Stephen J. Turnbull
- Steve Holden
- Steven D'Aprano
- Terry Reedy
- Tim Peters