Re: [Python-Dev] [issue6673] Py3.1 hangs in coroutine and eats up all memory

[moving this from the bug tracker] Alexandre Vassalotti wrote:
Alexandre Vassalotti added the comment:
Not a bug.
The list comprehension in your chunker:
    while True:
        target.send([ (yield) for i in range(chunk_size) ])
is equivalent to the following generator in Python 3:
    while True:
        def g():
            for i in range(chunk_size):
                yield (yield)
        target.send(list(g()))
This clearly is not what you want.
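A minimal sketch of one pass through the loop body as Alexandre describes the translation. The wrapper name `translated_once` is hypothetical and exists only so the snippet terminates. It suggests why the result is not a chunk of received values: `list()` drives `g()` with plain `next()` calls, so nothing sent by a caller ever reaches the inner `(yield)`.

```python
def translated_once(chunk_size):
    """One pass of the loop body, following Alexandre's described translation."""
    def g():
        for i in range(chunk_size):
            yield (yield)
    # list() advances g() with next(), so each inner (yield) evaluates to
    # None, and the enclosing function never yields to its own caller.
    return list(g())


chunk = translated_once(3)
print(chunk)  # [None, None, None, None, None, None]
```

Each loop iteration contributes two `None` entries, one from the inner `(yield)` and one from the outer `yield`.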
Does this do anything meaningful, or would it make sense to output a compiler warning (or better: an error) here? Using yield in a comprehension (as opposed to a generator expression, which I intuitively expected not to work) doesn't look at all dangerous at first glance, so it was quite surprising to see it fail that drastically.

This is also an important issue for other Python implementations. Cython simply transforms comprehensions into the equivalent for-loop, so when we implement PEP 342 in Cython, we will have to find a way to emulate CPython's behaviour here (unless we decide to stick with Py2.x semantics, which would not be my preferred solution).

Stefan
So, just rewrite your code using a for-loop:

    while True:
        result = []
        for i in range(chunk_size):
            result.append((yield))
        target.send(result)
----------
nosy: +alexandre.vassalotti
resolution: -> invalid
status: open -> closed
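For reference, a self-contained sketch of the suggested for-loop rewrite, wired to a sink coroutine. The `collector` coroutine and the `received` list are hypothetical helpers added only to make the example runnable; the `chunker` signature is the one quoted later in the thread.

```python
def collector(received):
    """Hypothetical sink coroutine: stores every chunk it is sent."""
    while True:
        chunk = yield
        received.append(chunk)


def chunker(chunk_size, target):
    """Group incoming values into lists of chunk_size and forward them."""
    while True:
        result = []
        for i in range(chunk_size):
            result.append((yield))
        target.send(result)


received = []
sink = collector(received)
next(sink)              # prime the sink so it is waiting at its yield
chunks = chunker(3, sink)
next(chunks)            # prime the chunker as well
for value in range(7):
    chunks.send(value)

print(received)  # [[0, 1, 2], [3, 4, 5]]
```

The seventh value stays in the partially filled chunk and is not forwarded until two more values arrive.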

Stefan Behnel wrote:
This is also an important issue for other Python implementations. Cython simply transforms comprehensions into the equivalent for-loop, so when we implement PEP 342 in Cython, we will have to find a way to emulate CPython's behaviour here (unless we decide to stick with Py2.x semantics, which would not be my preferred solution).
How do you do that without leaking the iteration variable into the current namespace?

Avoiding that leakage is where the semantic change between 2.x and 3.x came from here: 2.x just creates the for loop inline (thus leaking the iteration variable into the current scope), while 3.x creates an inner function that does the iteration so that the iteration variables exist in their own scope without polluting the namespace of the containing function.

The translation of your example isn't quite as Alexandre describes it - we do at least avoid the overhead of creating a generator function in the list comprehension case. It's more like:

    while True:
        def f():
            result = []
            for i in range(chunk_size):
                result.append((yield))
            return result
        target.send(f())

So what you end up with is a generator that has managed to bypass the syntactic restriction that disallows returning non-None values from generators. In CPython it appears that happens to end up being executed as if the return was just another yield expression (most likely due to a quirk in the implementation of RETURN_VALUE inside generators):

    while True:
        def f():
            result = []
            for i in range(chunk_size):
                result.append((yield))
            yield result
        target.send(f())

It seems to me that CPython should be raising a SyntaxError for yield expressions inside comprehensions (in line with the "no returning values other than None from generator functions" rule), and probably for generator expressions as well.

Cheers,
Nick.

P.S. Experimentation at a 3.x interpreter prompt:
    >>> def f():
    ...     return [(yield) for i in range(10)]
    ...
    >>> x = f()
    >>> next(x)
    >>> for i in range(8):
    ...     x.send(i)
    ...
    >>> x.send(8)
    >>> next(x)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, None]
    >>> x = f()
    >>> next(x)
    >>> for i in range(10): # A statement with a return value!
    ...     x.send(i)
    ...
    [0, 1, 2, 3, 4, 5, 6, 7, 8, None]
    >>> dis(f)
      2           0 LOAD_CONST               1 (<code object <listcomp> at 0xb7c53bf0, file "<stdin>", line 2>)
                  3 MAKE_FUNCTION            0
                  6 LOAD_GLOBAL              0 (range)
                  9 LOAD_CONST               2 (10)
                 12 CALL_FUNCTION            1
                 15 GET_ITER
                 16 CALL_FUNCTION            1
                 19 RETURN_VALUE
    >>> dis(f.__code__.co_consts[1])
      2           0 BUILD_LIST               0
                  3 LOAD_FAST                0 (.0)
            >>    6 FOR_ITER                13 (to 22)
                  9 STORE_FAST               1 (i)
                 12 LOAD_CONST               0 (None)
                 15 YIELD_VALUE
                 16 LIST_APPEND              2
                 19 JUMP_ABSOLUTE            6
            >>   22 RETURN_VALUE
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
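Nick's first translation can be tried on a current interpreter by spelling out the inner function explicitly. This is only a sketch: the wrapper name `outer` is hypothetical, and it assumes Python 3.3 or later, where a `return` with a value inside a generator is legal and the value travels on `StopIteration.value`, rather than being yielded as in the 3.1 session above.

```python
import inspect


def outer(chunk_size):
    # Explicit spelling of the hidden function the comprehension creates.
    def f():
        result = []
        for i in range(chunk_size):
            result.append((yield))
        return result
    return f()


# The yield belongs to f, so outer itself is an ordinary function...
print(inspect.isgeneratorfunction(outer))  # False

# ...and calling it hands back a generator object rather than a list.
gen = outer(3)
print(inspect.isgenerator(gen))  # True

next(gen)        # advance to the first (yield)
gen.send("a")
gen.send("b")
try:
    gen.send("c")
except StopIteration as stop:
    final = stop.value  # the list built inside f

print(final)  # ['a', 'b', 'c']
```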

Nick Coghlan wrote:
Stefan Behnel wrote:
This is also an important issue for other Python implementations. Cython simply transforms comprehensions into the equivalent for-loop, so when we implement PEP 342 in Cython, we will have to find a way to emulate CPython's behaviour here (unless we decide to stick with Py2.x semantics, which would not be my preferred solution).
How do you do that without leaking the iteration variable into the current namespace?
We currently have 2.x semantics for comprehensions anyway, but the (long-standing) idea is to move comprehensions into their own scope (not a function, just a new type of scope), so that all names defined inside the expressions end up inside the inner scope. This is completely orthogonal to the loop transformation itself, though, which would simply happen inside the inner scope. However, having to emulate the other Py3 semantics for comprehensions that this thread is about would pretty much kill such a simple solution.
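The scoping difference under discussion can be observed from Python 3 code. The sketch below uses hypothetical function names and assumes no module-level variable called `loop_var` exists: a plain for-loop leaves its variable bound in the enclosing function, while a list comprehension keeps it in a scope of its own.

```python
def with_for_loop():
    for loop_var in range(3):
        pass
    # The for-loop variable is still bound in the function's scope.
    return loop_var


def with_comprehension():
    squares = [loop_var * loop_var for loop_var in range(3)]
    try:
        # Outside the comprehension the name is not bound in this scope.
        return loop_var
    except NameError:
        return "not leaked"


leaked = with_for_loop()
isolated = with_comprehension()
print(leaked)    # 2
print(isolated)  # not leaked
```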
The translation of your example isn't quite as Alexandre describes it - we do at least avoid the overhead of creating a generator function in the list comprehension case. It's more like:
    while True:
        def f():
            result = []
            for i in range(chunk_size):
                result.append((yield))
            return result
        target.send(f())
So the problem is that f(), i.e. the function-wrapped comprehension itself, swallows the "(yield)" expression (which redundantly makes it a generator). That means that the outer function in my example, which was

    def chunker(chunk_size, target):
        while True:
            target.send([ (yield) for i in range(chunk_size) ])

doesn't become a generator itself, so the above simply ends up as an infinite loop. IMHO, that's pretty far from obvious when you look at the code.

Also, the target receives a "generator object <listcomp>" instead of a list. That sounds weird.
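A bounded sketch of the effect described above, with the comprehension's hidden function written out by hand. `RecordingTarget` and the `rounds` parameter are hypothetical additions: the first stands in for a real coroutine, the second replaces `while True` so that the example returns instead of looping forever.

```python
import inspect


class RecordingTarget:
    """Hypothetical stand-in for a coroutine: records whatever is sent."""

    def __init__(self):
        self.received = []

    def send(self, value):
        self.received.append(value)


def chunker(chunk_size, target, rounds):
    # `rounds` is an assumption for illustration; the original loops forever.
    for _ in range(rounds):
        def listcomp():
            result = []
            for i in range(chunk_size):
                result.append((yield))
            return result
        # The call returns immediately with an unstarted generator object.
        target.send(listcomp())


target = RecordingTarget()
returned = chunker(3, target, rounds=2)

print(inspect.isgeneratorfunction(chunker))  # False
print(returned)                              # None
print(all(inspect.isgenerator(item) for item in target.received))  # True
```

Because `chunker` contains no `yield` of its own, calling it runs the whole loop at once, and every round sends a fresh generator object to the target.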
It seems to me that CPython should be raising a SyntaxError for yield expressions inside comprehensions (in line with the "no returning values other than None from generator functions" rule), and probably for generator expressions as well.
Yes, that's what I was suggesting. Disallowing it in genexps is a more open question, though. I wouldn't mind being able to send() values into a generator expression, or to throw() exceptions during their execution. Anyway, I have no idea about a use case, so it might just as well be disallowed for symmetry reasons. Stefan
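Whether a given interpreter accepts such code can be checked without running it, by handing the source to the built-in `compile()`. This is a sketch only; the helper name `is_rejected` is hypothetical. To my knowledge, CPython 3.8 and later reject `yield` in both comprehensions and generator expressions with a SyntaxError, while earlier 3.x releases compile them.

```python
import sys

LISTCOMP_SOURCE = (
    "def chunker(chunk_size, target):\n"
    "    while True:\n"
    "        target.send([(yield) for i in range(chunk_size)])\n"
)
GENEXP_SOURCE = (
    "def gen(chunk_size):\n"
    "    return ((yield) for i in range(chunk_size))\n"
)


def is_rejected(source):
    """Return True if this interpreter refuses to compile `source`."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return True
    return False


listcomp_rejected = is_rejected(LISTCOMP_SOURCE)
genexp_rejected = is_rejected(GENEXP_SOURCE)
print(sys.version_info[:2], listcomp_rejected, genexp_rejected)
```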

Stefan Behnel <stefan_ml <at> behnel.de> writes:
IMHO, that's pretty far from obvious when you look at the code.
A "yield" wrapped in a list comprehension looks far from obvious IMO anyway, whether in 2.x or 3.x. It's this kind of "smart" writing trick people find that only makes code more difficult to read for others (à la Perl).

Regards

Antoine.

Antoine Pitrou wrote:
Stefan Behnel <stefan_ml <at> behnel.de> writes:
IMHO, that's pretty far from obvious when you look at the code.
A "yield" wrapped in a list comprehension looks far from obvious IMO anyway, whether in 2.x or 3.x. It's this kind of "smart" writing trick people find that only makes code more difficult to read for others (à la Perl).
So, your vote is to make it a compiler error as well? Stefan
participants (3)
- Antoine Pitrou
- Nick Coghlan
- Stefan Behnel