[Python-ideas] Summary of for...else threads

Steven D'Aprano steve at pearwood.info
Sun Oct 11 13:46:24 CEST 2009


Folks,

I have written a summary of the multiple issues raised by the two 
threads on for...else (attached). Unfortunately it's rather long, but 
there was a lot of ground to cover.

I have tried to be as fair to all positions as I am able. If I have 
missed anything major, please feel free to comment, but please don't 
start debating the issues in this thread as well. If anyone feels I 
have misrepresented their position, please let me know.

Unless there are serious objections, I intend to post this summary to 
python-dev in a couple of days and ask for a ruling on the various 
suggestions (e.g. Yes, No, Write A PEP And It Will Be Considered).

For the record, is there anyone here willing to provide patches if one 
or more of the proposals are accepted on python-dev?

Thank you to all who have contributed.


-- 
Steven D'Aprano
-------------- next part --------------
Summary of the python-ideas threads:

    for/except/else syntax
    http://mail.python.org/pipermail/python-ideas/2009-October/005924.html

    SyntaxWarning for for/while/else without break or return?
    http://mail.python.org/pipermail/python-ideas/2009-October/006066.html


Executive summary
=================

The for...else and while...else syntax is one of the most under-used and
misunderstood syntactic features in Python. We discuss potential changes to
the behaviour and/or spelling of this feature:

    - remove the syntax
    - generate a warning if used without break
    - change the spelling
    - add additional functionality


Discussion
==========

Both `for` and `while` loops take an optional `else` suite, which executes if
the loop completes normally, including the case of the loop not executing at
all (e.g. if the `for` sequence is empty). If the loop is exited with a
`break` statement, the `else` suite is skipped. (Naturally if the loop is
exited with a return or by raising an exception, the `else` suite never gets a
chance to execute.)
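
The three cases above can be checked directly. The following is a minimal sketch in Python 3 syntax; the helper name `else_ran` and its `stop_at` parameter are invented for illustration and do not come from either thread.

```python
def else_ran(sequence, stop_at=None):
    """Return True if the loop's else suite executed."""
    ran = False
    for item in sequence:
        if item == stop_at:
            break          # skips the else suite
    else:
        ran = True         # reached only when the loop was not exited by break
    return ran


print(else_ran([1, 2, 3]))             # loop completes normally: True
print(else_ran([]))                    # loop body never runs: True
print(else_ran([1, 2, 3], stop_at=2))  # loop exited with break: False
```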

In the following document, for brevity I will generally say "for...else", but
everything said applies equally to while...else.
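
As an illustration of the `while` form, here is a small sketch (Python 3 syntax) that uses while...else for trial division. The function `smallest_factor` is a made-up example, and it assumes its argument is an integer of at least 2.

```python
def smallest_factor(n):
    """Return the smallest factor of n greater than 1 (n itself if n is prime).

    Assumes n is an integer >= 2.
    """
    d = 2
    while d * d <= n:
        if n % d == 0:
            break        # found a divisor: skip the else suite
        d += 1
    else:
        d = n            # condition became false without a break: n is prime
    return d


print(smallest_factor(15))   # 3
print(smallest_factor(13))   # 13
```

Note that for n = 2 the loop body never runs, and the `else` suite still executes, matching the "loop not executing at all" case described above.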

The primary use-case of for...else is to implement searching:

    for item in sequence:
        if item == target:
            print "found"
            break
    else:
        print "not found"

But even for this use-case, it is notable that the bisect module does not use
this idiom in its implementation of binary search.

According to some straw polls, even otherwise experienced and competent Python
programmers are unaware of the existence, or confused about the functionality,
of the `else` clause. On being presented with an example of the syntax, such
programmers typically have one of three erroneous reactions:

    (a) The `else` clause has been outdented by mistake, or there is a missing
        `if` statement.

    (b) The `else` clause is executed if the `for` loop is not executed at all.

    (c) The `else` clause is executed if there is a `break`, or if there
        isn't, but I don't remember which.

Predicting the correct behaviour seems to be extremely rare among programmers
who haven't learned it. (b) and especially (c) seem to be fairly common even
among programmers who say they understand the `for...else` construct.

The `else` clause has another perceived problem: if there is no `break` in the
loop, the `else` clause is functionally redundant. That is, the following two
code snippets behave identically, although they are not semantically
equivalent (see (3c) below):

    for item in sequence:
        process(item)
    else:
        other()

and:

    for item in sequence:
        process(item)
    other()
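
The claim can be checked with a small sketch (Python 3 syntax). Here `process` and `other` are replaced by appends to a log so that the two forms can be compared; the function names `with_else` and `without_else` are invented for this example.

```python
def with_else(sequence):
    """Loop with an else suite and no break; returns a log of what ran."""
    log = []
    for item in sequence:
        log.append(("process", item))
    else:
        log.append("other")
    return log


def without_else(sequence):
    """The same loop with the else suite flattened into following code."""
    log = []
    for item in sequence:
        log.append(("process", item))
    log.append("other")
    return log


for seq in ([], [1], [1, 2, 3]):
    print(with_else(seq) == without_else(seq))   # True each time
```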


Proposals have been made that:

    (1) The `else` clause should be left exactly as it is.

    (2) The `else` keyword is misleading or confusing or both, and should be
        replaced.

    (3) The `else` clause itself is useful, but `else` without a preceding
        `break` is not, and so should generate an error or a warning.

    (4) The `else` clause should be changed to execute if the for-loop is not.

    (5) The `else` clause is of limited use, and can be removed altogether.

    (6) There should be an additional clause which is executed when the
        for-loop does not.


Taking these proposals individually:


(1) The `else` clause should be left exactly as it is.

This is, naturally, the default position. While Python tries to be easy to
comprehend, the language is under no obligation to be "intuitive" to every
programmer. The Zen even says "Although that way may not be obvious at first
unless you're Dutch." While improved documentation is always valued, there is
no consensus that `for/while...else` needs changing: even if it is a "gotcha",
there's not necessarily any reason to change it.


(2) The `else` keyword is misleading or confusing or both, and should be
    replaced.

The problem many people seem to have is that the semantic connection between
the keyword `else` and the behaviour of the `else` clause is tenuous at best.
Perhaps the original concept was that the loop construct should be read as:

    execute the for-loop (or while-loop)
    if you reach a `break`, jump to the end of the `for...else` block
    else execute the `else` suite

but this seems to be unintuitive to many people.

One suggestion is to deprecate `else` and replace it with a key-phrase (rather
than a single word) which explicitly states what is happening:

    for item in sequence:
        process(item)
    else no break:
        suite

or similar variants.

The advantage of this is that it is explicit about when the suite is executed.
But there are disadvantages:

* it requires three words rather than one;

* it isn't easy to remember: it could be "else no break", "else not break",
  "if no break", "elif no breaks" or any of a number of other variants;

* it may mislead users into imagining that `break` is a global variable, and
  that they can test it (or set it) outside of the loop construct;

* and despite appearances, it does not describe what the Python VM actually
  does: there is no test. The entire for...else block is a single chunk of
  byte-code, and `break` jumps to the end of it:

    >>> import dis
    >>> code = compile("""for x in seq:
    ...     break
    ... else:
    ...     print "no break"
    ... """, '', 'exec')
    >>> dis.dis(code)
    1           0 SETUP_LOOP              20 (to 23)
                3 LOAD_NAME                0 (seq)
                6 GET_ITER
          >>    7 FOR_ITER                 7 (to 17)
               10 STORE_NAME               1 (x)

    2          13 BREAK_LOOP
               14 JUMP_ABSOLUTE            7
          >>   17 POP_BLOCK

    4          18 LOAD_CONST               0 ('no break')
               21 PRINT_ITEM
               22 PRINT_NEWLINE
          >>   23 LOAD_CONST               1 (None)
               26 RETURN_VALUE

Note the SETUP_LOOP op-code, which records 23 (beyond the `else` clause) as
the end of the block: that is where execution resumes when the BREAK_LOOP
op-code is reached.
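
Readers who want to repeat the experiment on a current interpreter can use something like the following sketch. It assumes Python 3.4 or later (for `dis.get_instructions`) and uses the `print()` function form. Op-code names and offsets vary between CPython versions, so the output should not be expected to match the listing above.

```python
import dis

# The same loop as above, in Python 3 syntax. `seq` is never evaluated,
# because the code object is only compiled, not executed.
SOURCE = """for x in seq:
    break
else:
    print("no break")
"""

code = compile(SOURCE, "<example>", "exec")

# Collect the op-code names in order; the exact list depends on the
# CPython version in use.
opnames = [instr.opname for instr in dis.get_instructions(code)]
print(opnames)
```

Calling `dis.dis(code)` instead prints the full listing with offsets and jump targets.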

For completeness, I should mention that with a slight change in syntax,
programmers who want this syntax can have it right now:

    for item in sequence:
        process(item)
    else:  # no break
        suite


Another suggestion is to deprecate `else` and replace it with `then`:

    for item in sequence:
        process(item)
    then:
        suite

This matches the way we might describe the flow control:

    execute the for-loop (or while-loop)
    then execute the `else` suite (unless you reach a break)

with the understanding that a `break` jumps to the end of the entire
`for...else` (for...then) block.

This has two advantages: it is only one word, and it accurately describes the
flow control. But the obvious disadvantage is that it requires a new keyword.
Fortunately, "then" is not a verb or a noun, and consequently is unlikely to
be used by much real-world code: I was not able to find it being used as an
identifier in either my own code or the standard library.

A third suggestion was to use the keyword "finally", but this will likely lead
people to expect it to behave like try...finally, that is, to be executed even
if the loop is exited by an exception or a return.


(3) The `else` clause itself is useful, but `else` without a preceding `break`
    is not, and so should generate an error or a warning.

This proposal arguably generated the most heat on the python-ideas list, with
people vehemently disagreeing on whether any such warnings should be provided
by the Python compiler, or left to third-party tools like PyChecker or PyLint.

We can distinguish at least four positions:

    (3a) for...else without a break is simply wrong and should raise a
         SyntaxError

         PROS:
         - This would allow the Python compiler to enforce a convention which
           many people think is sensible.

         CONS:
         - The Python compiler enforces very few conventions.

         - This is a very drastic step that risks breaking existing code, and
           so would need to go through a deprecation period first.

         - It's hard (impossible?) to justify calling something an error when
           it in fact compiles correctly and executes as designed.

         - It is possible for the optimiser to remove the break in the loop
           (e.g. given `if __debug__: break`). This means that while the parser
           enforces the presence of a break, such a break isn't actually
           needed (and may not exist) in successfully compiled and working
           byte-code.

         - It may lead to programmers inserting meaningless breaks that can
           never be taken, simply to satisfy the compiler.

         - There's little or no evidence that for...else with no break is a
           significant source of bugs in real-world code.

    (3b) for...else without a break is pointless and programmers who use it
         are probably confused and the compiler should raise a warning.

         PROS:
         - The `else` clause is error-prone (e.g. accidentally dedenting the
           `else` from an `if...else` inside the loop). Such errors are
           arguably more common than deliberate use of `else` without `break`.

         - Some programmers are confused about when the `else` block is
           executed, and a warning may help them learn the feature.

         - There is much precedent from other languages' compilers, some of
           which give many warnings for things.

         - There is a little precedent from Python, e.g. warning when you
           assign to or reference a name before declaring it global or
           nonlocal.

         CONS:
         - As a rule, Python's compiler is very lightweight, and it rarely
           gives warnings, particularly not warnings about coding standards.
           E.g. it does not warn if you declare a global at the module level,
           or if you write to locals(), even though the first is pointless and
           the second doesn't do as expected.

         - Such a warning punishes those who do understand the for...else
           construct and choose to use it without a `break` (see below for
           two possible justifications for doing so).

         - The compiler shouldn't try to guess what the programmer may have
           wanted, but should just do as instructed.

         - Such warnings increase the complexity of the compiler. For rare
           features like for...else and while...else, it isn't obvious that
           the benefit is worth the cost.

         - Excessive warnings lull naive users into a false sense of security,
           and annoy non-naive users for no good reason.

    (3c) for...else without a `break` is unusual but neither wrong nor even
         suspect, it's a reasonable style.

         PROS:
         - A loop may go through multiple iterations of edit-compile-run. The
           loop may gain an `else` clause in an earlier iteration than it
           gains a `break`.

         - A for...else block is a single conceptual unit in a way that code
           following a loop is not. It represents a single semantic unit
           smaller than a function but larger than a line, and makes it
           obvious that the `else` suite belongs with, and follows, the `for`
           suite. That is, given the following code:

           for item in sequence:
               block
           suite

           without reading the code in detail it isn't obvious whether suite
           belongs with the loop or merely happens to follow it. But with the
           equivalent for...else block:

           for item in sequence:
               block
           else:
               suite

           there can be no doubt that they belong together as a unit.

         - Even if for...else without a break is silly, we're all adults here
           and we should allow programmers to use whatever style they see fit
           without penalty.

    (3d) for...else without a `break` is unusual enough to include in
         PyChecker or PyLint, but it is inappropriate for the Python
         compiler to warn about it.

         PROS:
         - It is a reasonable compromise position. Those who want to check
           for it can use a tool to do so, those who don't don't have to.

         - Misuse of `else` rarely produces subtle errors. It is unlikely to
           require heroic debugging effort to solve such mistakes.

         CONS:
         - PyChecker and PyLint are third-party software. Neither is available
           in the standard library, and both have a non-trivial learning curve.
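
To make concrete the kind of check that position (3d) leaves to external tools, here is a minimal sketch using the standard `ast` module (Python 3 syntax). It is not how PyChecker or PyLint work; the names `else_without_break` and `_has_own_break` are invented for this example. The sample input is the dedented-`else` mistake mentioned under (3b).

```python
import ast

LOOPS = (ast.For, ast.AsyncFor, ast.While)
SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef, ast.Lambda)


def _has_own_break(nodes):
    """Return True if any node contains a break that exits the enclosing loop."""
    for node in nodes:
        if isinstance(node, ast.Break):
            return True
        if isinstance(node, SCOPES):
            continue  # a break inside a nested scope cannot target our loop
        if isinstance(node, LOOPS):
            # A break in a nested loop's body targets that loop, but a break
            # in its else suite targets ours.
            if _has_own_break(node.orelse):
                return True
            continue
        if _has_own_break(ast.iter_child_nodes(node)):
            return True
    return False


def else_without_break(source):
    """Return sorted line numbers of loops with an else suite but no break."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, LOOPS) and node.orelse
        and not _has_own_break(node.body)
    )


# The else was probably meant to pair with the if, but has been dedented.
SAMPLE = '''
for item in sequence:
    if item == target:
        found = True
else:
    found = False
'''

print(else_without_break(SAMPLE))   # [2]
```

The source is only parsed, never executed, so the undefined names in the sample do not matter.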


(4) The `else` clause should be changed to execute if the for-loop is not.

When programmers unfamiliar with the `else` clause are shown the syntax and
asked to predict what it will do, some expect that it will execute if the loop
does not. Changing `else` to match their intuition may make the feature
easier to learn.

However, such a change is a major change in behaviour, and will break existing
code. Consequently, even if desired by the community, it is unlikely to be
implemented until Python 4000, and would need to go through a long period of
deprecating the existing behaviour first.


(5) The `else` clause is of limited use, and can be removed altogether.

The `else` clause does not have many use-cases, the primary one being
searching. Removing it would simplify the syntax of loops.

This too is unlikely to be implemented before Python 4000 even if desired.


(6) There should be an additional clause which is executed when the for-loop
    does not.

Such behaviour is easy to get today. For lists and other sequences, it needs
only one additional test:

    if sequence:
        for item in sequence:
            process(item)
    else:
        print "sequence is empty"

The for loop can also be moved outside of the test, if preferred. However, an
iterator is normally true even when it has nothing left to yield, so for
iterators the test needs a little more work:

    sentinel = object()
    item = sentinel
    for item in sequence:
        process(item)
    if item is sentinel:
        print "sequence is empty"
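
The difference between the two cases can be seen in a short sketch (Python 3 syntax). The helper `process_all` is invented for illustration; it wraps the sentinel idiom above in a function and reports whether the loop body ran at all.

```python
def process_all(iterable, process):
    """Apply process to each item; return True if there was at least one."""
    sentinel = object()
    item = sentinel
    for item in iterable:
        process(item)
    # item is still the sentinel only if the loop body never ran
    return item is not sentinel


empty_list = []
empty_iter = iter([])

print(bool(empty_list))   # False: an empty list can be tested directly
print(bool(empty_iter))   # True: an exhausted iterator still tests as true

seen = []
print(process_all(empty_iter, seen.append))        # False
print(process_all(iter([1, 2]), seen.append))      # True
```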

For this reason, some people would like to see syntactic support for a clause
that executes whenever the for-loop is not. This might be written as follows:

    for item in sequence:
        process(item)
    otherwise:
        print "sequence is empty"


Although not strictly necessary, this might be seen as a "nice to have"
syntactic sugar. The disadvantages are the introduction of a new keyword, and
the increased complexity of the compiler. There is also the issue of deciding
how this proposed `otherwise` block would interact with `else`:

    * Can a loop have both an `else` and an `otherwise` suite?
    * If so, do they both execute if the loop does not?
    * And if so, can the programmer choose in which order they execute?



Conclusion
==========

The default, and simplest, position to take is that nothing should change,
except possibly documentation (e.g. an entry in the FAQ explaining the
behaviour).

Otherwise, some of the proposals could be implemented together: e.g. we could
deprecate the keyword `else` in favour of an alternate spelling as well as
raise a warning if there is no `break` in the loop. In any case, any change
will require consensus and possibly a PEP.


-- 
[This document is placed in the public domain.]

