[Python-ideas] for/except/else

Sat Mar 4 22:17:31 EST 2017

On 3 March 2017 at 18:47, Wolfgang Maier <
wolfgang.maier at biologie.uni-freiburg.de> wrote:

> On 03/03/2017 04:36 AM, Nick Coghlan wrote:
>
>> On 2 March 2017 at 21:06, Wolfgang Maier
>> <wolfgang.maier at biologie.uni-freiburg.de> wrote:
>>
>>     - overall I looked at 114 code blocks that contain one or more breaks
>>>
>>
>>
>> Thanks for doing that research :)
>>
>>
>>     Of the remaining 19 non-trivial cases
>>>
>>>     - 9 are variations of your classical search idiom above, i.e.,
>>>     there's an else clause there and nothing more is needed
>>>
>>>     - 6 are variations of your "nested side-effects" form presented
>>>     above with debatable (see above) benefit from except break
>>>
>>>     - 2 do not use an else clause currently, but have multiple breaks
>>>     that do partly redundant things that could be combined in a single
>>>     except break clause
>>>
>>
>>
>> Those 8 cases could also be reviewed to see whether a flag variable
>> might be clearer than relying on nested side effects or code repetition.
>>
>>
> [...]
>
>
>> This is a case where a flag variable may be easier to read than loop
>> state manipulations:
>>
>>     may_have_common_prefix = True
>>     while may_have_common_prefix:
>>         prefix = None
>>         for item in items:
>>             if not item:
>>                 may_have_common_prefix = False
>>                 break
>>             if prefix is None:
>>                 prefix = item[0]
>>             elif item[0] != prefix:
>>                 may_have_common_prefix = False
>>                 break
>>         else:
>>             # all subitems start with a common "prefix".
>>             # move it out of the branch
>>             for item in items:
>>                 del item[0]
>>             subpatternappend(prefix)
>>
>> Although the whole thing could likely be cleaned up even more via
>> itertools.zip_longest:
>>
>>     for first_uncommon_idx, aligned_entries in enumerate(itertools.zip_longest(*items)):
>>         if not all_true_and_same(aligned_entries):
>>             break
>>     else:
>>         # Everything was common, so clear all entries
>>         first_uncommon_idx = None
>>     for item in items:
>>         del item[:first_uncommon_idx]
>>
>> (Batching the deletes like that may even be slightly faster than
>> deleting common entries one at a time)
>>
>> Given the following helper function:
>>
>>     def all_true_and_same(entries):
>>         itr = iter(entries)
>>         try:
>>             first_entry = next(itr)
>>         except StopIteration:
>>             return False
>>         if not first_entry:
>>             return False
>>         for entry in itr:
>>             if not entry or entry != first_entry:
>>                 return False
>>         return True
>>
>>     - finally, 1 is a complicated break dance to achieve sth that
>>>     clearly would have been easier with except break; from typing.py:
>>>
>>
>>
> [...]
>
>
>> I think this is another case that is asking for the inner loop to be factored
>> out to a named function, not for reasons of re-use, but for reasons of
>> making the code more readable and self-documenting :)
>>
>>
> It's true that using a flag or factoring out redundant code is always a
> possibility. Having the except clause would clearly not let people do
> anything they couldn't have done before.
> On the other hand, the same is true for the else clause - its only
> advantage here is that it already exists

I forget where it came up, but I seem to recall Guido saying that if he
were designing Python today, he wouldn't include the "else:" clause on
loops, since it inevitably confuses folks the first time they see it.
(Hence articles like mine that attempt to link it with try/except/else
rather than if/else).

> - because a single flag could always distinguish between a break having
> occurred or not:
>
> brk = False
> for item in iterable:
>     if some_condition:
>         brk = True
>         break
> if brk:
>     do_stuff_upon_breaking_out()
> else:
>     do_alternative_stuff()
>
> is a general pattern that would always work without except *and* else.
>
> However, the fact that else exists generates a regrettable asymmetry in
> that there is direct language support for detecting one outcome, but not
> the other.
>

It's worth noting that this asymmetry doesn't necessarily exist in the
corresponding C idiom that I assume was the inspiration for the Python
equivalent:

    int data_array_len = sizeof(data_array) / sizeof(data_array[0]);
    int idx = 0;
    for (idx = 0; idx < data_array_len; idx++) {
        if (condition(data_array[idx])) {
            break;
        }
    }
    if (idx < data_array_len) {
        // We found a relevant entry
    } else {
        // We didn't find anything
    }

In Python prior to 2.1 (when PEP 234 added the iterator protocol), a
similar approach could be used for Python's for loops:

    num_items = len(container)
    for idx in range(num_items):
        if condition(container[idx]):
            break
    if num_items and idx < num_items:
        # We found a relevant entry
    else:
        # We didn't find anything
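
One caveat if you try this out: unlike the C loop counter, a name bound by
"for idx in range(num_items):" is left at num_items - 1 when the loop runs
to completion, so comparing it against num_items afterwards can't tell the
two outcomes apart. A minimal sketch that keeps the index-based check uses
an explicit while loop instead (find_index and its arguments are
hypothetical names chosen for illustration):

```python
def find_index(container, condition):
    """Return the index of the first matching entry, or None if there is none."""
    num_items = len(container)
    idx = 0
    while idx < num_items:
        if condition(container[idx]):
            break
        idx += 1
    # As in the C version, idx == num_items means the loop ran off the end
    if idx < num_items:
        return idx   # We found a relevant entry
    return None      # We didn't find anything
```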

However, while my own experience with Python is mainly with 2.2+ (and hence
largely after the era where "for i in range(len(container)):" was still
common), I've spent a lot of time working with C and the corresponding
iterator protocols in C++, and there it is pretty common to move the "entry
found" code before the break and then invert the conditional check that
appears after the loop:

    int data_array_len = sizeof(data_array) / sizeof(data_array[0]);
    int idx = 0;
    for (idx = 0; idx < data_array_len; idx++) {
        if (condition(data_array[idx])) {
            // We found a relevant entry
            break;
        }
    }
    if (idx >= data_array_len) {
        // We didn't find anything
    }

And it's *this* version of the C/C++ idiom that Python's "else:" clause
replicates.
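
In Python, that shape might look like the following sketch
(report_first_match, the log list and its messages are made up purely for
illustration):

```python
def report_first_match(items, condition):
    """Record which branch ran, mirroring the C idiom above."""
    log = []
    for item in items:
        if condition(item):
            # We found a relevant entry: handle it before breaking
            log.append(f"found {item!r}")
            break
    else:
        # Only reached when the loop finished without hitting break
        log.append("nothing found")
    return log
```

The "entry found" handling sits right next to the break, and the else:
clause plays the role of the inverted "idx >= data_array_len" check.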

One key aspect of this particular idiomatic structure is that it retains
the same overall shape regardless of whether the inner structure is:

    if condition(item):
        # Condition is true, so process the item
        process(item)
        break

or:

    if maybe_process_item(item):
        # Item was processed, so we're done here
        break

Whereas the "post-processing" model can't handle pre-composed helper
functions that implement both the conditional check and the item
processing, and then report back which branch they took.
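
As a rough sketch of the second form: a pre-composed helper does both the
check and the processing, reports back with a boolean, and the loop around
it keeps the same for/else shape. All names here (maybe_process_item,
process_first, the processed list) and the "even integer" rule are
hypothetical, chosen only to make the example runnable:

```python
def maybe_process_item(item, processed):
    """Process item if it qualifies; return True when processing happened."""
    if isinstance(item, int) and item % 2 == 0:
        processed.append(item * 10)
        return True
    return False


def process_first(items):
    """Process the first qualifying item, or record None if there is none."""
    processed = []
    for item in items:
        if maybe_process_item(item, processed):
            # Item was processed, so we're done here
            break
    else:
        # No item qualified, so record a placeholder instead
        processed.append(None)
    return processed
```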

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia