[Python-3000] Efficient LC's which don't leak the iteration variables (was Re: Py3k release schedule worries)

Nick Coghlan ncoghlan at gmail.com
Wed Dec 20 13:40:51 CET 2006

Georg Brandl wrote:
> Guido van Rossum schrieb:
>> On 12/19/06, Georg Brandl <g.brandl at gmx.net> wrote:
>>>> - turning list comprehensions into syntactic sugar for generator expressions
>>> I'd like to point out that there is already my patch for that, which implements
>>> set comprehensions and list comprehensions exactly as syntactic sugar for
>>> GEs. This, however, affects performance greatly as LCs are executed in their own
>>> function scope, which isn't necessary. A better implementation would therefore
>>> leave the LC implementation as is, only preventing the name leaking into the
>>> enclosing scope.
>> Do you think you have it in you to tackle this?
> I'll implement it if only someone can point me in the right direction
> how to do it.

One idea I've had on that front is to persuade the compiler to replace the
names used for the iteration variables in the source code with names that
can't clash with anything the programmer wrote. These could either be the
compiler's own hidden variable names (like the ones it already uses for
various internal results it can't leave on the stack), or else names built
with a 'scope prefix': the variables stay in the existing function scope, but
each symbol gets a numeric prefix in front of it.
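
As a rough illustration of the renaming idea, here is a hand-written sketch
in plain Python rather than anything the compiler actually does. The explicit
loops stand in for the code generated for '[i * 2 for i in range(3)]', and
'_lc1_i' is a made-up stand-in for whatever hidden or prefixed name the
compiler would pick:

```python
def leaky():
    # Roughly what a 2.x list comprehension amounts to: the loop variable
    # `i` is an ordinary local, so it overwrites the earlier binding and
    # is still visible after the loop.
    i = "outer"
    result = []
    for i in range(3):
        result.append(i * 2)
    return result, i


def renamed():
    # The same loop after the hypothetical renaming pass. `_lc1_i` is an
    # assumed name; a real implementation would use one that cannot be
    # spelled in source code, so it could never clash with user names.
    i = "outer"
    result = []
    for _lc1_i in range(3):
        result.append(_lc1_i * 2)
    return result, i
```

Both functions build the same list; only the second leaves the caller's own
'i' untouched.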

I haven't actually tried to implement that, though, and I suspect things will 
get a little tricky when it comes to dealing with nested scopes like '[(lambda 
i=i: i) for i in range(10)]'.
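
To make the tricky part concrete, here is a hand-desugared sketch of that
example under the same assumption ('_lc1_i' is a hypothetical renamed
variable, not something the compiler produces today). The renaming pass
would have to treat the two occurrences of 'i' in the lambda differently,
and would also have to follow free variables into nested functions:

```python
def renamed_default_arg():
    i = "outer"
    funcs = []
    for _lc1_i in range(10):
        # The parameter name `i` and the `i` in the lambda body belong to
        # the lambda's own scope, so they must be left alone. The default
        # value is evaluated in the enclosing scope, so that occurrence
        # must be renamed.
        funcs.append(lambda i=_lc1_i: i)
    return funcs, i


def renamed_free_variable():
    i = "outer"
    funcs = []
    for _lc1_i in range(3):
        # Here `i` would be a free variable of the lambda, so the renaming
        # has to reach inside the nested function as well.
        funcs.append(lambda: _lc1_i)
    return funcs, i
```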

Another alternative might be to flag each list comprehension iteration
variable during the symtable pass as being either deleted or restored when the
comprehension is complete. Then, before the LC, emit code to save any
variables that are to be restored into hidden compiler variables, and after
the LC emit code to restore them to their original values. For deleted
variables, emit the relevant deletion opcode after the LC.
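
A minimal sketch of what the emitted code would amount to, again written out
by hand as ordinary Python. '_saved_i' is an assumed name standing in for a
hidden compiler variable:

```python
def restore_after():
    i = "outer"
    _saved_i = i            # emitted before the LC: save the old value
    result = []
    for i in range(3):      # the LC body, still using the real name `i`
        result.append(i * 2)
    i = _saved_i            # emitted after the LC: restore the old value
    return result, i


def delete_after():
    result = []
    for j in range(3):      # `j` was not bound before the LC
        result.append(j * 2)
    del j                   # emitted after the LC: the deletion opcode
    try:
        j
    except NameError:       # reading the deleted local fails, as intended
        leaked = False
    else:
        leaked = True
    return result, leaked
```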

This second approach, however, still has issues dealing with nested scopes (in
particular, closures created inside the LC will refer to the variable at
function scope, and so will see its restored value instead of the final value
of the iteration variable). In addition, it needs to be able to handle
conditionally defined variables: ones which may or may not be defined when the
LC executes, depending on which path was followed through earlier parts of the
function.
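
Both problems can be shown with the same kind of hand-written emulation.
This is only a sketch: '_saved_i' and '_had_i' are assumed stand-ins for
hidden compiler variables, and the try/except is just one way the runtime
check could be expressed:

```python
def closure_sees_restored_value():
    i = "outer"
    _saved_i = i
    funcs = []
    for i in range(3):
        funcs.append(lambda: i)   # closes over the function-level `i`
    i = _saved_i                  # restoring also changes what closures see
    return [f() for f in funcs]


def conditionally_defined(flag):
    if flag:
        i = "outer"
    # Whether `i` is bound here is only known at run time, so the emitted
    # code has to check before it can decide between restore and delete.
    try:
        _saved_i = i
        _had_i = True
    except NameError:
        _had_i = False
    result = []
    for i in range(3):
        result.append(i)
    if _had_i:
        i = _saved_i
    else:
        del i
    try:
        after = i
    except NameError:
        after = "<unbound>"
    return result, after
```

In the first function every closure returns the restored value rather than
the last value taken by the iteration variable.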

So I think the first approach I mentioned (particularly the 'scope prefix' 
concept) is the one most likely to prove workable. But YMMV, since I haven't 
actually *tried* any of this.


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia