[Python-ideas] for/except/else
Nick Coghlan
ncoghlan at gmail.com
Thu Mar 2 22:36:29 EST 2017
On 2 March 2017 at 21:06, Wolfgang Maier <
wolfgang.maier at biologie.uni-freiburg.de> wrote:
> On 02.03.2017 06:46, Nick Coghlan wrote:
>
>> The proposal in this thread then has the significant downside of only
>> covering the "nested side effect" case:
>>
>>     for item in iterable:
>>         if condition(item):
>>             break
>>     except break:
>>         operation(item)
>>     else:
>>         condition_was_never_true(iterable)
>>
>> While being even *less* amenable to being pushed down into a helper
>> function (since converting the "break" to a "return" would bypass the
>> "except break" clause).
>>
>
> I'm actually not quite buying this last argument. If you wanted to
> refactor this to "return" instead of "break", you could simply put the
> return into the except break block. In many real-world situations with
> multiple breaks from a loop this could actually make things easier instead
> of worse.
>
Fair point - so that would be even with the "single nested side effect"
case, but simpler when you had multiple break conditions (and weren't
already combining them with "and").

> Personally, the "nested side effect" form makes me uncomfortable every
> time I use it because the side effects on breaking or not breaking the loop
> don't end up at the same indentation level and not necessarily together.
> However, I'm gathering from the discussion so far that not too many people
> are thinking like me about this point, so maybe I should simply adjust my
> mind-set.
>
This is why I consider the "search only" form of the loop, where the else
clause either sets a default value, or else prevents execution of the code
after the loop body (via raise, return, or continue), to be the preferred
form: there aren't any meaningful side effects hidden away next to the
break statement. If I can't do that, I'm more likely to switch to a classic
flag variable that gets checked post-loop execution than I am to push the
side effect inside the loop body:

    search_result = _not_found = object()
    for item in iterable:
        if condition(item):
            search_result = item
            break
    if search_result is _not_found:
        # Handle the "not found" case
        ...
    else:
        # Handle the "found" case
        ...

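To make the two forms concrete, here is a minimal runnable sketch of both. The function names (first_even, describe_first_even) and the "is even" condition are made up purely for illustration; they don't come from the stdlib or from the proposal.

```python
def first_even(numbers, default=None):
    """Search-only form: the else clause just supplies a default value."""
    for number in numbers:
        if number % 2 == 0:
            break
    else:
        # No break happened (including the empty-iterable case)
        number = default
    return number


def describe_first_even(numbers):
    """Flag-variable form: the outcome is checked after the loop."""
    search_result = _not_found = object()
    for number in numbers:
        if number % 2 == 0:
            search_result = number
            break
    if search_result is _not_found:
        return "no even number"
    return f"first even number is {search_result}"


print(first_even([1, 3, 4, 5]))        # 4
print(first_even([1, 3]))              # None
print(describe_first_even([1, 3, 4]))  # first even number is 4
print(describe_first_even([]))         # no even number
```

In both versions the only thing next to the break is (at most) a plain assignment, so everything that reacts to the search result sits at the same indentation level after the loop.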
> All that said, this is a very nice abstract view on things! I really
> learned quite a bit from this, thank you :)
>
> As always though, reality can be expected to be quite a bit more
> complicated than theory so I decided to check the stdlib for real uses of
> break. This is quite a tedious task since break is used in many different
> ways and I couldn't come up with a good automated way of classifying them.
> So what I did is just go through stdlib code (in reverse alphabetical
> order) containing the break keyword and put it into categories manually. I
> only got up to socket.py before losing my enthusiasm, but here's what I
> found:
>
> - overall I looked at 114 code blocks that contain one or more breaks
>
Thanks for doing that research :)
> Of the remaining 19 non-trivial cases
>
> - 9 are variations of your classical search idiom above, i.e., there's an
> else clause there and nothing more is needed
>
> - 6 are variations of your "nested side-effects" form presented above with
> debatable (see above) benefit from except break
>
> - 2 do not use an else clause currently, but have multiple breaks that do
> partly redundant things that could be combined in a single except break
> clause
>
Those 8 cases could also be reviewed to see whether a flag variable might
be clearer than relying on nested side effects or code repetition.
> - 1 is an example of breaking out of two loops; from sre_parse._parse_sub:
>
> [...]
>     # check if all items share a common prefix
>     while True:
>         prefix = None
>         for item in items:
>             if not item:
>                 break
>             if prefix is None:
>                 prefix = item[0]
>             elif item[0] != prefix:
>                 break
>         else:
>             # all subitems start with a common "prefix".
>             # move it out of the branch
>             for item in items:
>                 del item[0]
>             subpatternappend(prefix)
>             continue # check next one
>         break
> [...]
>
This is a case where a flag variable may be easier to read than loop state
manipulations:

    may_have_common_prefix = True
    while may_have_common_prefix:
        prefix = None
        for item in items:
            if not item:
                may_have_common_prefix = False
                break
            if prefix is None:
                prefix = item[0]
            elif item[0] != prefix:
                may_have_common_prefix = False
                break
        else:
            # all subitems start with a common "prefix".
            # move it out of the branch
            for item in items:
                del item[0]
            subpatternappend(prefix)

Although the whole thing could likely be cleaned up even more via
itertools.zip_longest:

    for first_uncommon_idx, aligned_entries in enumerate(
            itertools.zip_longest(*items)):
        if not all_true_and_same(aligned_entries):
            break
    else:
        # Everything was common, so clear all entries
        first_uncommon_idx = None
    for item in items:
        del item[:first_uncommon_idx]

(Batching the deletes like that may even be slightly faster than deleting
common entries one at a time)
Given the following helper function:

    def all_true_and_same(entries):
        itr = iter(entries)
        try:
            first_entry = next(itr)
        except StopIteration:
            return False
        if not first_entry:
            return False
        for entry in itr:
            if not entry or entry != first_entry:
                return False
        return True

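Putting the loop and the helper together gives something that can actually be run. This is only a sketch under a few assumptions: strip_common_prefix is a hypothetical wrapper name (it is not in sre_parse), returning the removed prefix is an added convenience standing in for the subpatternappend calls, and the sample data uses plain strings rather than real parser items.

```python
import itertools


def all_true_and_same(entries):
    """True if entries is non-empty and every entry is truthy and equal."""
    itr = iter(entries)
    try:
        first_entry = next(itr)
    except StopIteration:
        return False
    if not first_entry:
        return False
    return all(entry and entry == first_entry for entry in itr)


def strip_common_prefix(items):
    """Delete the shared leading entries from each list in place.

    Hypothetical wrapper; returns the prefix that was removed.
    """
    for first_uncommon_idx, aligned_entries in enumerate(
            itertools.zip_longest(*items)):
        # zip_longest pads shorter lists with None, which fails the check
        if not all_true_and_same(aligned_entries):
            break
    else:
        # Everything was common, so clear all entries
        first_uncommon_idx = None
    prefix = list(items[0][:first_uncommon_idx]) if items else []
    for item in items:
        del item[:first_uncommon_idx]
    return prefix


items = [["a", "b", "c"], ["a", "b", "d"], ["a", "b"]]
print(strip_common_prefix(items))  # ['a', 'b']
print(items)                       # [['c'], ['d'], []]
```

Note that the truthiness test means falsy entries (0, "", None) are never treated as part of a common prefix, which matches the helper above but may not be what you want for arbitrary data.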
>
> - finally, 1 is a complicated break dance to achieve something that clearly
> would have been easier with except break; from typing.py:
>
> [...]
>     def __subclasscheck__(self, cls):
>         if cls is Any:
>             return True
>         if isinstance(cls, GenericMeta):
>             # For a class C(Generic[T]) where T is co-variant,
>             # C[X] is a subclass of C[Y] iff X is a subclass of Y.
>             origin = self.__origin__
>             if origin is not None and origin is cls.__origin__:
>                 assert len(self.__args__) == len(origin.__parameters__)
>                 assert len(cls.__args__) == len(origin.__parameters__)
>                 for p_self, p_cls, p_origin in zip(self.__args__,
>                                                    cls.__args__,
>                                                    origin.__parameters__):
>                     if isinstance(p_origin, TypeVar):
>                         if p_origin.__covariant__:
>                             # Covariant -- p_cls must be a subclass of p_self.
>                             if not issubclass(p_cls, p_self):
>                                 break
>                         elif p_origin.__contravariant__:
>                             # Contravariant. I think it's the opposite. :-)
>                             if not issubclass(p_self, p_cls):
>                                 break
>                         else:
>                             # Invariant -- p_cls and p_self must equal.
>                             if p_self != p_cls:
>                                 break
>                     else:
>                         # If the origin's parameter is not a typevar,
>                         # insist on invariance.
>                         if p_self != p_cls:
>                             break
>                 else:
>                     return True
>                 # If we break out of the loop, the superclass gets a chance.
>         if super().__subclasscheck__(cls):
>             return True
>         if self.__extra__ is None or isinstance(cls, GenericMeta):
>             return False
>         return issubclass(cls, self.__extra__)
> [...]
>
I think this is another case that is asking for the inner loop to be
factored out to a named function, not for reasons of re-use, but for
reasons of making the code more readable and self-documenting :)
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia