On 06/28/2013 10:16 PM, Andrew Barnert wrote:
On Jun 28, 2013, at 18:50, Shane Green <shane@umbrellacode.com> wrote:
Yes, but it only works for generator expressions and not comprehensions.
This is the point of options #1 and #2: make StopIteration work in comps either (1) by redefining comprehensions in terms of genexps or (2) by fiat.
After some research, it turns out that these are equivalent. Replacing any [comprehension] with list(comprehension) is guaranteed by the language (and the CPython implementation) to give you exactly the same value unless (a) something in the comp raises StopIteration, or (b) something in the comp relies on reflective properties (e.g., sys._getframe().f_code.co_flags) that aren't guaranteed anyway.
So, other than being 4 characters more verbose and 40% slower, there's already an answer for comprehensions.
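A small sketch of the equivalence described above, and of the one case where the two spellings diverge. The helper names stop() and outcome() are only illustrative. One caveat: on Python 3.7 and later (PEP 479), a StopIteration raised inside a generator expression is converted to RuntimeError rather than silently ending the generator, so the sketch accepts either result for the genexp form.

```python
def stop():
    """Illustrative helper: end iteration by raising StopIteration."""
    raise StopIteration


# Ordinary case: the comprehension and list(genexp) give the same value.
as_comp = [i * i for i in range(5)]
as_list = list(i * i for i in range(5))


def outcome(build):
    """Return what build() produces, or the name of the exception it raised."""
    try:
        return build()
    except (StopIteration, RuntimeError) as exc:
        return type(exc).__name__


# Case (a): something in the comp raises StopIteration.
# In the comprehension, the exception escapes to the caller.
comp_outcome = outcome(lambda: [i for i in range(10) if i < 3 or stop()])

# In the genexp, older interpreters truncate to [0, 1, 2];
# with PEP 479 in effect the result is a RuntimeError instead.
genexp_outcome = outcome(lambda: list(i for i in range(10) if i < 3 or stop()))

print(as_comp == as_list, comp_outcome, genexp_outcome)
```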
Right. Any solution also must not slow down the existing simpler cases.

For those who haven't looked at the C code yet, there is this comment there.

/* List and set comprehensions and generator expressions work by creating a
   nested function to perform the actual iteration. This means that the
   iteration variables don't leak into the current scope.
   The defined function is called immediately following its definition, with
   the result of that call being the result of the expression.
   The LC/SC version returns the populated container, while the GE version is
   flagged in symtable.c as a generator, so it returns the generator object
   when the function is called.
   This code *knows* that the loop cannot contain break, continue, or return,
   so it cheats and skips the SETUP_LOOP/POP_BLOCK steps used in normal loops.

   Possible cleanups:
    - iterate over the generator sequence instead of using recursion
*/

I don't know how much the SETUP_LOOP/POP_BLOCK costs in time. It probably only makes a big difference in nested cases.
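The two behaviours that comment describes can be observed from Python without reading the C code. This is only a sketch of the visible effects (the variable names are illustrative), not of how the compiler achieves them, and the implementation details differ between CPython versions.

```python
import types

x = "outer"

# The iteration variable x inside each expression is local to that
# expression, so the outer x is untouched afterwards.
squares = [x * x for x in range(4)]        # LC: returns the populated list
lazy_squares = (x * x for x in range(4))   # GE: returns a generator object

leaked = x != "outer"
is_generator = isinstance(lazy_squares, types.GeneratorType)

# Nothing in the genexp has run until it is consumed.
materialized = list(lazy_squares)

print(leaked, is_generator, squares == materialized)
```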
And if either of those problems is unacceptable, a patch for #1 or #2 is actually pretty easy.
I've got two different proofs of concept: one actually implements the comp as passing the genexp to list, the other just wraps everything after the BUILD_LIST and before the RETURN_VALUE in the equivalent of try: ... except StopIteration: pass. I need to add some error handling to the C code, and for #2 write sufficient tests that verify that it really does work exactly like #1, but I should have working patches to play with in a couple days.
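At the Python level, the second proof of concept would behave roughly like the function below. This is a hand-written approximation, not the patch itself: comp_option_2() and stop() are hypothetical names, and the comments only mark where the bytecodes mentioned above would sit.

```python
def stop():
    """Illustrative helper: end iteration by raising StopIteration."""
    raise StopIteration


def comp_option_2(seq, limit):
    """Approximate [i for i in seq if i < limit or stop()] under option #2."""
    result = []                      # BUILD_LIST
    try:
        for i in seq:
            if i < limit or stop():
                result.append(i)     # LIST_APPEND in the real bytecode
    except StopIteration:
        pass                         # a StopIteration from the body ends the comp
    return result                    # RETURN_VALUE


print(comp_option_2(range(100), 5))
```

The try/except sits outside the loop, so it is set up once per comprehension rather than once per item.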
My opinion of that workaround is that it’s also a step backward in terms of readability. I suspect
if i < 50 else stop() would probably also work, since it throws an exception. That’s better, IMHO.
Once a function is added that is called on every iteration, a regular for loop with a break (without the function call) will run quicker.

I think what matters is that it's fast and is easy to explain.

The first two examples here are the existing variations. The third case would be the added break case. (The exact spelling of the expression may be different.)

    # [x for x in seq]
    for x in iter:
        append x            # LIST_APPEND byte code, not a method call

    # [x for x in seq if expr]
    for x in iter:
        if expr:
            append x

    # [x for x in seq if expr break]
    for x in iter:
        if expr:
            break
        append x

The generator comps have YIELD_VALUE in place of LIST_APPEND.

This last case is the simplest variation for an early exit. It only differs from the second case by having a BREAK_LOOP after the POP_JUMP_IF_FALSE instruction, along with SETUP_LOOP and POP_BLOCK before and after the loop.

I am curious about how many places in the library adding break to these would make a difference. If there aren't any, or only a few, then it's probably not needed. But then again, maybe it's worth a good look before dismissing it.

Cheers,
   Ron

(* dis.dis seems to be adding some extra unneeded lines: a second, dead JUMP_ABSOLUTE to the top of the loop for case 2 above, and a "JUMP_FORWARD 0" in the third case. Seems odd, but these don't affect what we are talking about here.)
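For anyone who wants to experiment, the three cases can be written as ordinary functions in today's Python. This is only a sketch of the intended semantics: the function names and the pred argument are illustrative, and result.append() here is a real method call, unlike LIST_APPEND in a comprehension.

```python
from itertools import takewhile


def case_1(seq):
    """[x for x in seq]"""
    result = []
    for x in seq:
        result.append(x)
    return result


def case_2(seq, pred):
    """[x for x in seq if pred(x)]"""
    result = []
    for x in seq:
        if pred(x):
            result.append(x)
    return result


def case_3(seq, pred):
    """Proposed early exit: stop at the first x for which pred(x) is true."""
    result = []
    for x in seq:
        if pred(x):
            break
        result.append(x)
    return result


def is_even(n):
    return n % 2 == 0


data = [1, 3, 5, 8, 9]
print(case_1(data))            # every item
print(case_2(data, is_even))   # only the items that pass the filter
print(case_3(data, is_even))   # items up to the first even number
```

The third case gives the same result as list(takewhile(lambda x: not pred(x), seq)), which is the closest existing spelling, at the cost of a function call per item.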