
On Mon, May 03, 2021 at 09:04:51PM +1000, Chris Angelico wrote:
> > My understanding of the situation is that the list comprehension [ x*x for x in range(5) ] is a shorthand for list( x*x for x in range(5) ).
> Sorta-kinda. It's not a shorthand in the sense that you can't simply replace one with the other,
Only because the `list` name could be shadowed or rebound to something else. Syntactically and functionally, aside from the lazy vs eager difference, a comprehension is a comprehension and there is nothing generator comprehensions can do that list comprehensions can't. In Python 2 there were scoping differences between the two, but I believe that in Python 3 those have been eliminated.
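
Something like this quick interactive sketch illustrates the point (the rebinding of `list` is purely for demonstration):

>>> data = range(5)
>>> list = tuple   # shadow the builtin
>>> list(x*x for x in data)
(0, 1, 4, 9, 16)
>>> [x*x for x in data]
[0, 1, 4, 9, 16]
>>> del list   # restore access to the builtin

The list(...) call has to look the name up at runtime, so it follows the rebinding, while the comprehension syntax always builds a list.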
> but they do have very similar behaviour, yes. A genexp is far more flexible than a list comp,
Aside from the lazy nature of generator comprehensions, what else?
> so the compiled bytecode for list(genexp) has to go to a lot of unnecessary work to permit that flexibility, whereas the list comp can simplify things down.
I don't think so. The bytecode in 3.9 is remarkably similar.

>>> dis.dis('list(spam for spam in eggs)')
  1           0 LOAD_NAME                0 (list)
              2 LOAD_CONST               0 (<code object <genexpr> at 0x7fc185ce0870, file "<dis>", line 1>)
              4 LOAD_CONST               1 ('<genexpr>')
              6 MAKE_FUNCTION            0
              8 LOAD_NAME                1 (eggs)
             10 GET_ITER
             12 CALL_FUNCTION            1
             14 CALL_FUNCTION            1
             16 RETURN_VALUE

Disassembly of <code object <genexpr> at 0x7fc185ce0870, file "<dis>", line 1>:
  1           0 LOAD_FAST                0 (.0)
        >>    2 FOR_ITER                10 (to 14)
              4 STORE_FAST               1 (spam)
              6 LOAD_FAST                1 (spam)
              8 YIELD_VALUE
             10 POP_TOP
             12 JUMP_ABSOLUTE            2
        >>   14 LOAD_CONST               0 (None)
             16 RETURN_VALUE

The bytecode for the list comp `[spam for spam in eggs]` is only three bytecodes shorter, so that doesn't support your comment about "a lot of unnecessary work". Comparing with `dis.dis('[spam for spam in eggs]')`, the list comp can:

- skip the name lookup for list (LOAD_NAME);
- and the CALL_FUNCTION that ends up calling it.

The disassemblies of the two code objects, "<genexpr>" and "<listcomp>", have slightly different implementations but only differ by one bytecode overall.

As far as runtime efficiency goes, list comps are a little faster. Iterating over a 1000-item sequence is 33% faster for a list comp, but for a 100000-item sequence that drops to 25% faster. But as soon as you do a significant amount of work inside the comprehension, that work is likely to dominate the other costs.

There's definitely some overhead needed to support starting and stopping a generator, but we can argue that is an implementation detail. A sufficiently clever interpreter could avoid that overhead.
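
For anyone who wants to check the timings on their own machine, a minimal sketch using the timeit module is something like this (the exact percentages will vary with machine, sequence size and Python version):

from timeit import Timer

setup = "data = list(range(1000))"   # also try 100000
listcomp = Timer("[x for x in data]", setup=setup)
genexp = Timer("list(x for x in data)", setup=setup)

# Take the best of several repeats to reduce timing noise.
print("list comp:", min(listcomp.repeat(repeat=5, number=10000)))
print("genexp:   ", min(genexp.repeat(repeat=5, number=10000)))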
> That said, I think the only way you'd actually detect a behavioural difference is if the name "list" has been rebound.
That and timing.

-- 
Steve