[Python-ideas] Control Flow - Never Executed Loop Body

Wed Mar 23 18:11:21 EDT 2016

On Mar 23, 2016, at 13:17, Sven R. Kunze <srkunze at mail.de> wrote:
> 
>> On 23.03.2016 00:56, Andrew Barnert via Python-ideas wrote:
> 
>> (Especially since 90% of the time, when you need to do something special on empty, you explicitly have a sequence, not any iterable, so it's just "if not seq:".)
> 
> Repeating that you think it will be a sequence will make it one.

I've said that I'd be happy to see any counterexamples, where you really do need this with iterators. So far, nobody has provided one. Of course absence of proof isn't proof of absence, but "you can't prove that it's impossible anyone will ever need this" is not a good rationale for a language change.
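To make the distinction concrete, here is a minimal sketch of how both cases can be written today without new syntax. The function names and the `_MISSING` sentinel are made up for illustration; nothing here comes from an actual proposal.

```python
def describe_seq(seq):
    # With a real sequence, emptiness is just a truth test.
    if not seq:
        return "empty"
    return ", ".join(str(x) for x in seq)


_MISSING = object()  # illustrative sentinel, not a standard name


def describe_iter(iterable):
    # With an arbitrary iterable (e.g. a generator), there is no len()
    # or truth test to rely on, so pull the first item by hand.
    it = iter(iterable)
    first = next(it, _MISSING)
    if first is _MISSING:
        return "empty"
    parts = [str(first)]
    for x in it:
        parts.append(str(x))
    return ", ".join(parts)
```

The second form is the one the proposal would shorten; the first is the common case that already needs nothing more than "if not seq:".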

>> So really, this proposal is really just asking for syntactic sugar that complicates the language in exchange for making some already-understandable code a little more concise, which doesn't seem worth it.
> 
> Did you even think what you just said? Almost everything in Python is "just syntactic sugar" compared to most other Turing-complete languages.
> 
> To put it differently: every construct that abstracts away "goto" is "just syntactic sugar".

There's a big difference between syntactic sugar that makes hard-to-follow code more readable, and syntactic sugar that only makes some already-understandable code a little more concise. The former may belong in the language, the latter very rarely does.

You seem to think that there's no inherent cost to adding new features to a language. For example, you later say:

> The only real reason against it so far: "it makes the language more complicated because I don't need it". Not entirely compelling but understandable.

The more complicated the language is, the harder it is to keep it all in your head, to spot the control flow while skimming, to trace out the details when necessary, etc. Also, the more things you add, the more places there are for inconsistencies to creep in. (Of course it also makes the language harder to implement and maintain, but those are less important.)

Only accepting changes that are actually worth the cost in increased complexity is a big part of what makes Python more readable than "dumping-ground" languages that have C-for, Python-for, while, do-while, until, repeat-until, and loop, some in both statement and expression variants, and some also writable in postfix form.

>> The for loop bytecode would have to change to stash an "any values seen" flag somewhere such that if it sees StopIteration and hasn't seen any values, it converts that to an EmptyCollection. Or any of the other equivalents (e.g., the compiler could unroll the first PyIter_Next of the loop from the rest of them to handle it specially).
> 
> Something like that.
> 
>> But this seems like it would add a lot of overhead and complexity to every loop whether desired or not.
> 
> If the "for" does not have any empty-equivalent clauses, there is no need to introduce that overhead in the first place.

Not true. The first implementation I suggested, putting EmptyCollection into every iterable, requires the overhead in every case. The second one, changing the way the existing bytecodes work, means making frame objects (or _something_) more complicated to enable stashing the flags, which affects every for loop. The third, unrolling the first PyIter_Next, has to be done if there's any code that can inspect the current exception state, which can't be statically determined, so it has to be always done.
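For readers who want to see the shape of the second and third options, here is a rough Python-level emulation. It is only a sketch: the real change would live in the compiler and bytecode, and the helper names (`for_ifempty_flag`, `for_ifempty_unrolled`) are hypothetical. Neither helper models break or an else clause.

```python
def for_ifempty_flag(iterable, body, if_empty):
    # Option 2, emulated: keep an "any values seen" flag next to the loop.
    seen_any = False
    for item in iterable:
        seen_any = True
        body(item)
    if not seen_any:
        if_empty()


def for_ifempty_unrolled(iterable, body, if_empty):
    # Option 3, emulated: peel the first next() call off the loop.
    it = iter(iterable)
    try:
        first = next(it)
    except StopIteration:
        if_empty()
        return
    body(first)
    for item in it:
        body(item)
```

In both versions the extra work (a flag assignment per iteration, or a separate first step) sits in the loop machinery itself, which is where the cost concern comes from.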

If you have a _different_ implementation, I'm happy to hear it. I supplied all the versions I could think of because another critic (Stephen? I forget...) implied that what you wanted was impossible or incoherent, and it clearly isn't. But it may not be a good idea to let a critic of your idea come up with the implementation. :)

> So we can conclude:
> 
> 1) none overhead for regular "for"s
> 2) less overhead for "for-ifempty" because it would be done in C and not in Python

For which of the three implementations? I'm pretty sure all of them would have significant overhead.

>>> If not, then how will this work? Is this a special kind of
>>> exception-like process that *only* operates inside for loops?
>>> 
>>> What will an explicit "raise NeverExecuted" do?
>> Presumably that's the same question as what an explicit raise StopIteration does. Just as there's nothing stopping you from writing a __next__ method that raises StopIteration but then yields more values if called again, there's nothing stopping you from raising NeverExecuted pathologically, but you shouldn't do so.
>> 
>>>> So, independent of the initial "never executed loop body" use-case, one
>>>> could also emulate the "else" clause by:
>>>> 
>>>> for item in collection:
>>>>    # do for item
>>>> except StopIteration:
>>>>    # do after the loop
>>> That doesn't work, for two reasons:
>>> 
>>> (1) Not all for-loops use iterators. The venerable old "sequence
>>> protocol" is still supported for sequences that don't support __iter__.
>>> So there may not be any StopIteration raised at all.
>> I think there always is.
>> 
>> IIRC, PyObject_Iter (the C API function used by iter and by for loops) actually constructs a sequence iterator object if the object doesn't have tp_iter (__iter__ for Python types) but does have tp_sequence (__getitem__ for Python types, but, e.g., dict has __getitem__ without having tp_sequence). And the for loop doesn't treat that sequence iterator any different from "real" iterators returned by __iter__; it just calls tp_next (__next__) until StopIteration. (And the "other half" of the old-style sequence protocol, that lets old-style sequences be reversed if they have a length, is similarly implemented by the C API underneath the reversed function.)
>> 
>> I'm on my phone right now, so I can't double-check any of the details, but I'm 80% sure they're all at least pretty close...
> 
> I think I would have to deal with the old protocol given the proposal is accepted.

No, because, as I just explained, the old protocol is taken care of by wrapping old-style sequences in iterator objects, so as far as the for-loop code is concerned, they look identical to "new-style" iterables.
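This is easy to check from Python. The sketch below uses a made-up class, `OldStyle`, that defines only __getitem__; calling iter() on it still returns an iterator that ends with StopIteration, which is all a for loop ever sees.

```python
class OldStyle:
    """Illustrative old-style sequence: __getitem__ only, no __iter__."""

    def __init__(self, *items):
        self._items = items

    def __getitem__(self, index):
        # Raises IndexError past the end, as the old protocol expects.
        return self._items[index]


it = iter(OldStyle("a", "b"))   # iter() builds a sequence iterator
collected = [next(it), next(it)]

try:
    next(it)
except StopIteration:
    # The wrapper turned the IndexError into StopIteration.
    ended_with_stop_iteration = True
else:
    ended_with_stop_iteration = False
```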

>>> (2) Even if StopIteration is raised, the for-loop catches it (in a
>>> manner of speaking) and consumes it.
>>> 
>>> So to have this work, we would need to have the for-loop re-raise
>>> StopIteration... but what happens if you don't include an except
>>> StopIteration clause? Does every bare for-loop with no "except" now
>>> print a traceback and halt processing? If not, why not?
>> I think this could be made to work: a for loop without an except clause handles StopIteration the same as today (by jumping to the else clause), but one that does have one or more except clauses just treats it like a normal exception.
>> 
>> Of course this would mean for/except/else is now legal but useless, which could be confusing ("why does my else clause no longer run when I add an 'except ValueError' clause?").
> 
> One could disallow "else" in case any "except" is defined.

But that's just an extra rule to implement (and remember) for no real benefit. Why not just document that for/except/else is generally useless and shouldn't be written, and let linters flag it?

>> More generally, I think the fact that for/except StopIteration is almost but not quite identical to plain for would be confusing more often than helpful.
> 
> You bet how people think about "else". "So, 'else' is always executed after the for?" "Yes, but only when there is no 'break' executed in the 'for'" "... *thinking* ... okay ..."

I don't understand your point here. Because there's already something in the language that you find confusing, that gives us free rein to add anything else to the language that people will find confusing?
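For anyone following along, the existing for/else rule being referred to is that the else clause runs whenever the loop finishes without hitting break, and that includes the case where the body never ran at all. A small sketch (the function name is illustrative):

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break          # skips the else clause
    else:
        # Runs when the loop was exhausted without break,
        # including when numbers was empty.
        result = None
    return result
```

Because else also fires on an empty iterable, it cannot by itself distinguish "never executed" from "ran to completion".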