for -- else: what was the motivation?

avi.e.gross at gmail.com avi.e.gross at gmail.com
Sun Oct 9 23:19:00 EDT 2022


I won't reply to everything Dave says and especially not the parts I fully agree with.

I think in many situations in life there is no ONE way to do things so most advice is heuristic at best and many exceptions may exist depending on your chosen approach. As such, I do not really think in PYTHON when writing code but an amalgam of many languages with all kinds of ways of doing things and then zoom in on how to do it in the current language be it Python or R or JavaScript and so on. Yes, I am in some sense focused but also open, just as in Human languages I may mostly be thinking in English but also sometimes see words and phrases pop into my mind from other languages that mean about the same thing and then filter it out to focus on whichever language I am supposed to be using at the time.

So, back to the topic we have wandered into: what kinds of advice apply to complex nested IF-THEN-ELSE structures, CASE statements, comprehensions and so on, as to which approaches produce better results much of the time and when another method is better.

Someone suggested a method they use that others wondered about. But they have a point. If you make code with large blocks that do not fit on your screen, and indentation is your guide, you can easily lose sight of things. 

But my advice given glibly is also not serious. Sometimes you may think you are being efficient and as DN pointed out, you are not. The compiler or other software may internally rearrange and optimize your code and make your method not necessary or even prevent the optimization. Your tradeoff may make a program run faster but use more memory or other resources or make it hard for anyone else (or yourself next week) to understand your code or be able to modify it. 

So, yes, sometimes it is more natural to write code like if score greater than 90, give an A else if greater than 80 give a B, ... else if less than 65, give an F. Perhaps in Harvard your choice of grades is limited to mostly an A and a few B, no C/D as anything lower is an F. Some other school may have few A and mainly C.  It may be optimal to start with if between 70 and 80, give a C, then deal with lesser cases. Sometimes the data drives what is more common and hence more efficient.
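
As a rough sketch of the two orderings just described (the function names and the exact cut-offs of 90/80/70/65 are assumptions for illustration, not anyone's real grading scheme):

```python
def letter_grade(score):
    """Natural ordering: test the highest band first and work downward."""
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    elif score >= 65:
        return "D"
    else:
        return "F"


def letter_grade_c_first(score):
    """Same mapping, but the band assumed to be most common (C) is tested first."""
    if 70 <= score < 80:
        return "C"
    elif score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 65:
        return "D"
    else:
        return "F"
```

Both functions return the same letters; only the order of the tests differs, and whether the second is measurably faster depends on the data and would need to be timed.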

But a big exception is cases where the program blows up if you are not careful. You cannot do certain things if a value is empty or NULL, if an index is beyond the length of something indexable, if you try to write to something read-only, and many more.

So much code may look like, in pseudocode:

Case
  is.na(var)                    do this
  var > length                  do that
  value[var] of type integer    do whatever
  ...

It may take several steps to make sure your data won't cause an exception if queried without checking whether the query makes sense. Only once that is out of the way might you be able to try for the most common cases for valid data first.
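
A loose Python rendering of that pseudocode might look like the following. The function name `describe` and the returned strings are made up for the example; the point is only that each guard runs before the expression it protects.

```python
def describe(values, index):
    """Check the dangerous cases first, then handle valid data."""
    if index is None:                      # like is.na(var)
        return "missing index"
    if not isinstance(index, int):
        return "index is not an integer"
    if not 0 <= index < len(values):       # like var > length
        return "index out of range"
    item = values[index]                   # safe only after the guards above
    if isinstance(item, int):              # like "value[var] of type integer"
        return f"integer {item}"
    return f"other value {item!r}"
```

Only after the guards have passed does the order of the remaining tests become a question of which valid case is most common.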

Again, as DN points out, in Python some may use exceptions that would jump in for the hopefully rare cases the above type of code tries to protect against. That can be a good approach if these cases are rare and identifiable as unique exceptions so your code focuses on optimizing the good cases you normally expect and shows them up-front with exception-handling code dangling beneath. 

I suspect some of us prefer some methods over others but can also be ambidextrous as needed. Older languages rapidly killed any program that tried to divide by zero, so every single division needed to be protected if a zero might be involved. Mind you, newer languages can face serious bugs with things not easily trapped, as values like Inf or NaN or missing values of various kinds can propagate and ruin lots of data. What is the mean of a group of numbers that includes an infinite one? What about some form of NA? Languages like R offer lots of idioms, such as having many functions accept a codicil like na.rm=TRUE that strips them before they infect your data, but that is not always appropriate.
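
To make the propagation concrete, here is a small Python sketch. The helper names `mean` and `mean_skipping_nan` are invented for the example, and the second is only a rough analogue of R's na.rm=TRUE, not a replacement for it.

```python
import math


def mean(values):
    """Plain arithmetic mean; a single NaN or Inf poisons the result."""
    return sum(values) / len(values)


def mean_skipping_nan(values):
    """Drop NaN entries before averaging (raises ZeroDivisionError if none remain)."""
    kept = [v for v in values if not math.isnan(v)]
    return sum(kept) / len(kept)


data = [1.0, 2.0, float("nan"), 3.0]
print(mean(data))                # nan: the missing value has spread to the result
print(mean_skipping_nan(data))   # 2.0
print(mean([1.0, math.inf]))     # inf
```

Whether silently dropping the NaN is the right thing depends on what the missing value meant, which is the "not always appropriate" caveat above.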

I do not see most programming as a one-size-fits-all approach. Most advice is best ignored. Anyone who suggests that all functions be say no more than 5 lines and that you should waste lots of time and energy making ever more small functions to decompose any larger one till it is under 5 lines but now calls lots of meaningless short functions sometimes 10 levels deep, is not helping you. Goals are fine when they make sense but often the opposite is true.

Consider some kind of case statement that has dozens of cases like asking what to do when one of many keys is pressed on a keyboard. The function can be dozens or hundreds of lines long. I could create a function that tests for 'A' and non-A and in the latter case calls a second function that tests for B and non-B and so on. If the user types Z, it is handled in the 26th function! But worse, you may need to pass all kinds of other variables down this chain so whatever key is pressed can do things with local variables. This does not sound like an improvement over one longer function.

And yet, sometimes not. You might decide a good way to handle this is to use a dictionary containing all possible keys as keys and various small functions as values. This could make your main function fairly small, once the dictionary had been created using many long lines.
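
A minimal sketch of that dictionary idea, assuming a shared `state` dict and three invented handlers (none of these names come from a real program):

```python
def move_left(state):
    state["x"] -= 1


def move_right(state):
    state["x"] += 1


def quit_program(state):
    state["running"] = False


# One entry per key; a real table could have dozens of lines like this.
KEY_HANDLERS = {
    "a": move_left,
    "d": move_right,
    "q": quit_program,
}


def handle_key(key, state):
    """Look up the handler for a key; return False if the key is unknown."""
    handler = KEY_HANDLERS.get(key)
    if handler is None:
        return False
    handler(state)
    return True
```

Passing one `state` object to every handler is one way around the problem of threading many local variables down a chain of functions.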

So you might say that when evaluating multiple possible solutions, one of several guiding mechanisms is to use fairly short functions. But adding many guiding principles opens the door to many conflicts, so add one saying that if your manager or reviewers insist you do it a specific way, forget the rest unless you do not value your job!

Anyone who has read this far, you have my sympathies. LOL! Like me, I suggest you (and I) get a life! In life, there often are no unique right answers albeit an infinite number of wrong answers.




-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On Behalf Of dn
Sent: Sunday, October 9, 2022 7:41 PM
To: python-list at python.org
Subject: Re: for -- else: what was the motivation?

On Sun, 9 Oct 2022 at 15:39, Axy via Python-list <python-list at python.org> wrote:

> "shortest block first"

Have never heard this advice before. Kind-of rankled with me, as it did for others.

Enquiring minds want to know... Played Duck, duck, go on this: zero hits amongst a pile of similar phrases - turns-out there's an algorithm with a similar name, but not related, and an electronics approach (way too 'low' a level for translation to us though).

Tried prefixing with "program" but no such advice to programmers or program[me] designers.

Tried prefixing with "python", but equal lack of joy.

Would OP please quote source?


On 10/10/2022 05.56, Peter J. Holzer wrote:
> On 2022-10-09 12:18:09 -0400, Avi Gross wrote:
>> Smallest code blocks first may be a more modern invention.

None of the recent-grads or new-hires I've asked this morning (it's already Monday over-here!) have used or heard the term.


>> Some would argue for a rule related to efficiency of execution. When you
>> have multiple blocks as in an if-else or case statement with multiple
>> choices, that you order the most common cases first. Those shorten
>> execution more often than the rarer cases especially the ones that should
>> never happen.
> 
> Those of us who started programming on 8 bit homecomputers of course
> have efficiency always at the back of their heads, but I find this

... for mainframes just as much as micro-computers!

Regarding execution-efficiencies, I'm sure @Avi knows better than I: It 
seems likely that Python, as an interpreted language, will create 
'blocks' of its virtual-machine code in the same order as they appear in 
the Python-source. However, aren't there optimising compilers which do 
something intelligent with the equivalent clauses/suites in other languages?

Regardless, is a Jump-instruction which transfers else-control to a 
block five machine-instructions 'down', any less efficient than a jump 
which spans 50-instructions?


>> So not a rule but realistically not always a bad idea to write code in a
>> way that draws the attention of readers along the main path of execution
>> and perhaps not showing all the checking for odd cases first.
> 
> much more important. Putting the main path first makes it easier to
> understand what the code is supposed to do normally. All those pesky
> exceptions are in the "small print" below.

Absolutely! Has the term "readability" been used 'here'?

Human nature (or is it that of computer programmers in-particular) is to 
be optimistic: it will work [this time*]. Accordingly, a colleague talks 
of always coding 'the happy line' first (meaning line of logic, cf 
source-code).

Contrarily, for while-True (infinite) loops, and particularly recursive 
algorithms, the [wise] counsel is to code the end-condition first. 
(always know your closest exit! "The nearest exit may be behind you"...)
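
Two toy illustrations of "end-condition first" (both functions are invented for the purpose, and assume a list of strings as input in the second case):

```python
def count_down(n):
    """Recursive: the end-condition is the first thing tested."""
    if n <= 0:
        return []
    return [n] + count_down(n - 1)


def read_until_blank(lines):
    """while-True loop: the exit test sits at the top of the body."""
    collected = []
    source = iter(lines)
    while True:
        line = next(source, "")   # "" also signals that the input ran out
        if line == "":
            break
        collected.append(line)
    return collected
```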


Indeed, dare I say, this optimistic-approach is pythonic. Taking an 
over-simple, two-value division example, the approach is:

try:
     a = b / c
except ZeroDivisionError:
     ... clean-up the mess ...

which contrasts the EAFP philosophy of Python versus the LBYL 
expectation of (many) other languages:

assert c != 0
a = b / c
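
For comparison, here are the two styles as small runnable functions. The names and the choice to return None on failure are assumptions for the sketch. The LBYL version uses a plain if-test rather than assert, because assert statements are removed when Python runs with the -O option.

```python
def divide_eafp(b, c):
    """EAFP: attempt the division and deal with the failure afterwards."""
    try:
        return b / c
    except ZeroDivisionError:
        return None


def divide_lbyl(b, c):
    """LBYL: test the divisor before attempting the division."""
    if c == 0:
        return None
    return b / c
```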

That said, as "Data Science" use of Python expands, it is bringing more 
and more needs for an LBYL attitude, eg "data cleaning".

(EAFP, LBYL? https://docs.python.org/3.9/glossary.html)


> There is of course the opposite view that you should just get all of the
> confounding factors out of the way first, so that the default is also
> the common case. I also do that sometimes, but then I don't hide this
> in an else: clause but do something like this:
> 
> for item in whatever:
>      if not_this_one(item):
>          continue
>      if neither_this_one(item):
>          continue
>      if cant_continue(item):
>          break
>      if oopsie():
>          raise SomeError()
> 
>      do_something_with(item)
>      and_some_more(item)
>      we_are_done(item)
> 
> which shows visually what the main purpose of the loop (or function or
> other block) is.

Nicely stated!

NB I've seen sufficient of @Peter's posts to know that this was never 
(even implied to be) intended as a snippet for all occasions!


It also illustrates why such is less readable: because we have to scan 
four if-statements before we can 'see' the purpose of the loop. My 
'itch' would be to extract this code 'out' to a function - that way the 
name will convey the somewhat-obscured purpose of the loop.


Alternately, reduce the 'distractions':-

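try:
     for item in whatever:
         try:
             inspect_the_data( item )
         except CustomContinueException:	# hypothetical name
             continue			# skip to the next item
         do_something_with(item)
         and_some_more(item)
         we_are_done(item)
except SomeError:
     ...
except CustomBreakException:
     ... ?pass?				# same effect as break

by 'hiding' in:

def inspect_the_data( item ):
     # NB continue/break are only legal inside a loop body, hence exceptions
     if not_this_one(item):
         raise CustomContinueException	# was continue
     if neither_this_one(item):
         raise CustomContinueException	# was continue
     if cant_continue(item):
         raise CustomBreakException	# was break
     if oopsie():
         raise SomeError()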

It is now easier to understand 'the happy line', ie the thinking of the 
original-coder, and the 'small print' has been relegated to such and can 
be cheerfully disregarded.

Whereas, if 'exceptional circumstances' is the reason one is inspecting 
the code in the first-place, then it also helps to have separated-out 
the ifs-buts-and-maybes, and into a structure which can be as closely 
(and exhaustively) tested, as may be required.


In some ways, (IMHO) there are reasons to feel disquiet over this style 
of coding. Learning "Modular Programming", and slightly-later 
"Structured Programming", when they were still new (?fresh, ?exciting), 
we were inculcated with the 'one way in, one way out' 
philosophy-of-correctness. This applied to "blocks" of code (per 
"module"), as well as formal units, eg functions.

Accordingly, am slightly unnerved by seeing Exceptions being used to 
'jump out' of interior/nested blocks, rather than using the 
return-mechanism (taking their turn like all the good little boys and 
girls). That said, it makes for tidier code - so I'll stop muttering 
into my (grey) beard ...


The alternative, assuming the 'errors and omissions' function is a 
possible tactic(!), would be to return a boolean, eg

def is_clean_data( item )->bool:
     is_verified = False
     if ...
     ...
     return is_verified

- thus the do-stuff calls will become a 'successful' if-then 'suite'.
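
Filled in as a runnable sketch, with made-up validity rules (a non-negative number) and a made-up `process` loop standing in for the do-stuff calls:

```python
def is_clean_data(item) -> bool:
    """Single exit point: the flag is only set once every check has passed."""
    is_verified = False
    if isinstance(item, (int, float)) and not isinstance(item, bool):
        if item >= 0:
            is_verified = True
    return is_verified


def process(items):
    """The 'happy line' becomes the body of a successful if-suite."""
    results = []
    for item in items:
        if is_clean_data(item):
            results.append(item * 2)   # stand-in for the real work
    return results
```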


There is more code to write/read - and the toy-example lends itself to 
such a tactic. In other situations, perhaps some refactoring or 
pre-processing, even a decorator, might remove (or reduce) the need for 
so much checking within the loop/block.


When to use one or the other approach?

We could hide behind some 'mystery', and say "I just know from 
experience", but that smacks of a secret coin-toss (human 'gut feelings' 
not being particularly indicative of success). So, here's a stab at it 
which harks back to the learn/use 'COBOL or FORTRAN' [argument] days:

If the purpose/considerations of the for-loop (block), are:-
- data-related: check-first
	(and thus consider extract and/or invert to promote readability)
- logic/algorithmic implementation, take 'the happy line' first
	(and deal with the exceptions ("small print") later)


Worthy of consideration, is that Python is (still) fast-developing. The 
match-case construct (structural pattern matching) of v3.10, and 
protocols and beefed-up descriptors (?interfaces?) could have quite an 
impact on such thinking, and in the relatively near-future...


* back in the ?bad old days when testing was something that was (only) 
done AFTER coding was complete, ie ego-driven development. The 
traditional response to the question: "are you coming to 
lunch/dinner/supper/break/the party/bed/...?" was "in a moment - 
[surely?] there's only one more bug"!


I've been 'dipping into' Martin Fowler's "Refactoring", 2e, Pearson, 
2019; but don't have it with me to point to useful references. What I do 
have to-hand, because it has just arrived, is Mariano Anaya's "Clean 
Code in Python", (also 2e), Packt, 2020* - although I didn't see its 
previous edition, and have read nothing beyond the Contents(!) to-date; 
it talks of "Design by Contract", "Defensive Programming", "Separation 
of Concerns" indicating it may have thinking to offer.

* the reviewer was Tarek Ziadé, author of "Expert Python", which is 
worth reading - as are his other works (in English and in French)
-- 
Regards,
=dn
-- 
https://mail.python.org/mailman/listinfo/python-list


