[Python-3000] Efficient LC's which don't leak the iteration variables (was Re: Py3k release schedule worries)
ncoghlan at gmail.com
Wed Dec 20 13:40:51 CET 2006
Georg Brandl wrote:
> Guido van Rossum schrieb:
>> On 12/19/06, Georg Brandl <g.brandl at gmx.net> wrote:
>>>> - turning list comprehensions into syntactic sugar for generator expressions
>>> I'd like to point out that there is already my patch for that, which implements
>>> set comprehensions and list comprehensions exactly as syntactic sugar for
>>> GEs. This, however, affects performance greatly as LCs are executed in their own
>>> function scope, which isn't necessary. A better implementation would therefore
>>> leave the LC implementation as is, only preventing the name leaking into the
>>> enclosing scope.
>> Do you think you have it in you to tackle this?
> I'll implement it if only someone can point me in the right direction
> how to do it.
One idea I've had on that front is to persuade the compiler to replace the
names used for the iteration variables in the source code. The replacements
could either be the compiler's own hidden variable names (like the ones it
already uses for various internal results it can't leave on the stack), or
else come from a new 'scope prefix' idea, where the names are still assigned
in the existing function scope but with a numeric prefix in front of each of
the symbols.
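
A rough hand-written sketch of what the 'scope prefix' idea amounts to,
assuming the LC body stays inline in the function. The spelling `_1_i` is a
made-up stand-in for the prefixed name; a real compiler would presumably pick
something that cannot be typed in source code. The loops below are manual
expansions of `[i * i for i in range(5)]`, not compiler output.

```python
def leaky_equivalent():
    i = "outer"
    squares = []
    # The loop reuses the function-level name, so "outer" is overwritten.
    for i in range(5):
        squares.append(i * i)
    return squares, i


def prefixed_equivalent():
    i = "outer"
    squares = []
    # Hypothetical prefixed name: same function scope, different symbol,
    # so the user's own `i` is never touched.
    for _1_i in range(5):
        squares.append(_1_i * _1_i)
    return squares, i
```

The second version needs no extra function call, which is the point of
keeping the LC implementation as is.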
I haven't actually tried to implement that, though, and I suspect things will
get a little tricky when it comes to dealing with nested scopes like '[(lambda
i=i: i) for i in range(10)]'.
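
To make the tricky part concrete, here is a sketch of how that example would
have to be renamed by hand. Again `_1_i` is a hypothetical prefixed name, and
the loops are manual expansions rather than anything the compiler emits.

```python
def default_argument_case():
    # Expansion of: [(lambda i=i: i) for i in range(10)]
    result = []
    for _1_i in range(10):
        # Only the default *value* refers to the iteration variable.
        # The parameter name and the lambda body belong to the lambda's
        # own scope and must be left alone.
        result.append(lambda i=_1_i: i)
    return result


def free_variable_case():
    # Expansion of: [(lambda: i) for i in range(10)]
    result = []
    for _1_i in range(10):
        # Here `i` is a free variable of the lambda, so the rename has to
        # reach inside the nested scope as well.
        result.append(lambda: _1_i)
    return result
```

So the renaming pass cannot be a blind textual substitution: it has to know,
for each nested scope, whether the name is local to it or free in it.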
Another alternative might be to flag each list comprehension iteration
variable during the symtable pass as being either deleted or restored when the
comprehension is complete. Then, before the LC, emit code to save any
variables that need restoring into hidden compiler variables, and after the
LC emit code to restore them to their original values. For variables flagged
for deletion, emit the relevant deletion opcode after the LC.
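
Written out by hand, the two cases would look roughly like this. The name
`_saved_i` is a hypothetical stand-in for the hidden compiler variable, and
the loops are manual expansions of `[i * 2 for i in range(3)]`.

```python
def restore_variant():
    i = "outer"
    _saved_i = i          # emitted before the LC: save the existing value
    result = []
    for i in range(3):
        result.append(i * 2)
    i = _saved_i          # emitted after the LC: restore it
    return result, i


def delete_variant():
    # `i` is not bound before the LC, so it is flagged for deletion.
    result = []
    for i in range(3):
        result.append(i * 2)
    del i                 # emitted after the LC
    return result, "i" in locals()
```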
This second approach, however, still has issues dealing with nested scopes (in
particular, closures will refer to the variable at function scope, and so see
whatever it holds after the LC, instead of the final value of the iteration
variable). In addition, it needs
to be able to handle conditionally defined variables: ones which may or may
not be defined when the LC executes depending on which path was followed
through earlier parts of the function.
So I think the first approach I mentioned (particularly the 'scope prefix'
concept) is the one most likely to prove workable. But YMMV, since I haven't
actually *tried* any of this.
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia