[Python-3000] List & set comprehensions patch
Nick Coghlan
ncoghlan at gmail.com
Tue Mar 6 16:46:28 CET 2007
Georg and I have been working on the implementation of list
comprehensions which don't leak their iteration variables, along with
the implementation of set comprehensions. The latest patch can be found
as SF patch #1660500 [1]. The file new-set-comps.diff is the combined
patch which implements both features, and unifies handling of the
different kinds of comprehension. In an effort to improve readability,
the patch also converts the sets in symtable.c to be actual PySet
objects, rather than PyDict objects with None values, and tries to reduce
the number of different meanings assigned to the term 'scope'.
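As a rough illustration of the intended behaviour (a sketch that assumes an interpreter with the Python 3 semantics the patch is aiming for; the variable names are purely illustrative):

```python
# The iteration variable of a list comprehension stays inside the
# comprehension and does not overwrite a name in the enclosing scope.
i = "outer"
squares = [i * i for i in range(5)]

# Set comprehensions use the same syntax with braces.
letters = {c.lower() for c in "AbBa"}

print(i)        # the outer binding is untouched
print(squares)
print(letters)
```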
One of the comments made on Georg's initial attempt at implementing
these features was that it would be nice to avoid the function call
overhead in the listcomp & setcomp case (as it appears at first glance
that the internal scope can be temporary). I tried to do that and
essentially failed outright - working through symtable.c and compile.c,
I found that dealing with the scoping issues created by the possibility
of nested genexps, lambdas and list or set comprehensions would pretty
much require reimplementing all of the scoping rules that functions
already provide.
Here's an example of the scoping issues from the new test_listcomps.py
that forms part of the patch:
>>> def test_func():
...     items = [(lambda: i) for i in range(5)]
...     i = 20
...     return [x() for x in items]
>>> test_func()
[4, 4, 4, 4, 4]
Without creating an actual function object for the body of the list
comprehension, it becomes rather difficult to get the lambda expression
closure to resolve to the correct value.
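To make that concrete, here is a hand-written approximation of what the comprehension does once it has its own function scope. This is only a sketch: the helper name _listcomp is made up for illustration, and the code the compiler actually generates will differ in detail.

```python
def test_func():
    # Hypothetical stand-in for the implicit function created for the
    # comprehension body; the name _listcomp is illustrative only.
    def _listcomp(iterable):
        result = []
        for i in iterable:
            # Each lambda closes over _listcomp's own 'i', not the
            # 'i' that test_func assigns further down.
            result.append(lambda: i)
        return result

    # The outermost iterable is evaluated in the enclosing scope and
    # passed in as an argument.
    items = _listcomp(iter(range(5)))
    i = 20
    return [x() for x in items]

print(test_func())
```

Because the lambdas refer to the helper's 'i', which finishes the loop at 4, the later assignment of 20 in test_func has no effect on them.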
For list comprehensions at module or class scope, the introduction of
the function object can actually lead to a speed increase as the
iteration variables and accumulation variable become function locals
instead of module globals. Inside a function, however, the additional
function call overhead slows things down.
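The locals-versus-globals point can be seen by compiling an explicit loop both ways and looking at where the names end up. Note this is an illustrative sketch using an ordinary for loop as a stand-in; it is not the bytecode the patch generates, and it says nothing about actual timings.

```python
# The same loop, once as module-level code and once inside a function.
module_src = """
result = []
for i in range(3):
    result.append(i)
"""
func_src = """
def build():
    result = []
    for i in range(3):
        result.append(i)
    return result
"""

module_code = compile(module_src, "<module-level>", "exec")

# Pull the code object for build() out of the compiled module's constants.
func_code = next(
    const
    for const in compile(func_src, "<function-level>", "exec").co_consts
    if hasattr(const, "co_varnames")
)

# Module-level code has no fast locals; 'i' and 'result' are looked up
# by name in the module namespace.
print(module_code.co_varnames, module_code.co_names)

# Inside the function both become fast local variables.
print(func_code.co_varnames, func_code.co_names)
```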
Some specific questions related to the current patch:
In implementing it, I discovered that list comprehensions don't do
SETUP_LOOP/POP_BLOCK around their for loop - I'd like to get
confirmation from someone who knows their way around the ceval loop
better than I do that omitting those is actually legitimate (I *think*
the restriction to a single expression in the body of the comprehension
makes it OK, but I'm not sure).
There are also a couple of tests we had to disable - one in test_dis,
one in test_grammar. Suggestions on how to reinstate those (or agreement
that it is OK to get rid of them) would be appreciated.
The PySet update code in symtable.c currently uses PyNumber_InPlaceOr
with a subsequent call to Py_DECREF to counter the implicit call to
Py_INCREF. Should this be changed to use PyObject_CallMethod to invoke
the Python level update method?
There are also two backwards compatibility problems which came up:
- code which explicitly deleted the listcomp variable started
throwing NameErrors. Several tweaks were needed in the standard library
to fix this.
- only the outermost iterator expression is evaluated in the scope
containing the comprehension (just like generator expressions). This
means that the inner expressions can no longer see class variables and
values in explicit locals() dictionaries provided to exec & friends.
This didn't actually cause any problems in the standard library - I only
note it because my initial implementation mistakenly evaluated the
outermost iterator in the new scope, which *did* cause severe problems
along these lines.
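Both problems can be reproduced with small snippets along the following lines. This is a sketch assuming the new scoping rules are in effect; the class and variable names are invented for the example and do not come from the standard library code that needed fixing.

```python
# Problem 1: the iteration variable is no longer bound after the
# comprehension, so an explicit 'del' of it has nothing to delete.
module_source = "squares = [n * n for n in range(3)]\ndel n\n"
module_namespace = {}
try:
    exec(module_source, module_namespace)
    del_error = None
except NameError as exc:
    del_error = exc
print(repr(del_error))

# Problem 2: only the outermost iterable is evaluated in the class
# scope, so this works...
class Grid:
    width = 3
    columns = [x for x in range(width)]

# ...but the inner iterable is evaluated in the comprehension's own
# scope, which cannot see the class variable 'height'.
try:
    class BrokenGrid:
        width = 3
        height = 2
        cells = [(x, y) for x in range(width) for y in range(height)]
    class_error = None
except NameError as exc:
    class_error = exc
print(repr(class_error))
```

For the first case the fix is simply to drop the del statement; for the second, the value has to be made visible some other way (for example by computing it outside the class body).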
Regards,
Nick
[1] http://www.python.org/sf/1660500
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org