[Python-3000] List & set comprehensions patch

Guido van Rossum guido at python.org
Tue Mar 6 19:18:18 CET 2007


Quick responses from just reading the email (I'll try to review the
code later today, I'm trying to do Py3k work all day):

On 3/6/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> Georg and I have been working on the implementation of list
> comprehensions which don't leak their iteration variables, along with
> the implementation of set comprehensions. The latest patch can be found
> as SF patch #1660500 [1]. The file new-set-comps.diff is the combined
> patch which implements both features, and unifies handling of the
> different kinds of comprehension. In an effort to improve readability,
> the patch also converts the sets in symtable.c to be actual PySet
> objects, rather than PyDict objects with None keys and tries to reduce
> the number of different meanings assigned to the term 'scope'.

Maybe you could separate that out into a separate patch so it won't
hold up review or work on the main patch? Or is there a stronger
connection?

> One of the comments made on Georg's initial attempt at implementing
> these features was that it would be nice to avoid the function call
> overhead in the listcomp & setcomp case (as it appears at first glance
> that the internal scope can be temporary). I tried to do that and
> essentially failed outright - working through symtable.c and compile.c,
> I found that dealing with the scoping issues created by the possibility
> of nested genexps, lambdas and list or set comprehensions would pretty
> much require reimplementing all of the scoping rules that functions
> already provide.

How about determining if it's a *simple* case or not, and doing the
variable renaming in the simple case and Georg's original version in
non-simple cases? You can define "simple" as whatever makes the
determination easy and still treats most common cases as simple. E.g.
a lambda would be a non-simple case, and so would using a nonlocal or
global variable (note though that nonlocal and global should reach
inside the list/set comp!) etc.
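
To illustrate the kind of "simple" test suggested above, here is a minimal
sketch using the modern `ast` module rather than symtable.c. The name
`is_simple_comprehension` and the choice of which node types count as
non-simple are assumptions made for illustration; the sketch does not check
for names declared global or nonlocal in the enclosing function, which would
also have to be treated as non-simple.

```python
import ast

# Hypothetical definition of "non-simple": anything that introduces
# its own nested scope inside the comprehension.
_NON_SIMPLE = (ast.Lambda, ast.GeneratorExp, ast.ListComp,
               ast.SetComp, ast.DictComp)


def is_simple_comprehension(source):
    """Return True if the list/set comprehension in `source` has no nested scopes."""
    comp = ast.parse(source, mode="eval").body
    if not isinstance(comp, (ast.ListComp, ast.SetComp)):
        raise ValueError("expected a list or set comprehension")
    for node in ast.walk(comp):
        # ast.walk yields the comprehension itself too, so skip it.
        if node is not comp and isinstance(node, _NON_SIMPLE):
            return False
    return True
```

With this definition, `[x * 2 for x in range(5)]` is simple, while the
lambda example from test_listcomps.py is not.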

> Here's an example of the scoping issues from the new test_listcomps.py
> that forms part of the patch:
>
>      >>> def test_func():
>      ...     items = [(lambda: i) for i in range(5)]
>      ...     i = 20
>      ...     return [x() for x in items]
>      >>> test_func()
>      [4, 4, 4, 4, 4]
>
> Without creating an actual function object for the body of the list
> comprehension, it becomes rather difficult to get the lambda expression
> closure to resolve to the correct value.
>
> For list comprehensions at module or class scope, the introduction of
> the function object can actually lead to a speed increase as the
> iteration variables and accumulation variable become function locals
> instead of module globals. Inside a function, however, the additional
> function call overhead slows things down.
>
> Some specific questions related to the current patch:
>
> In implementing it, I discovered that list comprehensions don't do
> SETUP_LOOP/POP_BLOCK around their for loop - I'd like to get
> confirmation from someone who knows their way around the ceval loop
> better than I do that omitting those is actually legitimate (I *think*
> the restriction to a single expression in the body of the comprehension
> makes it OK, but I'm not sure).

They exist to handle break/continue. Since those don't apply to
list/set comps, it's safe.
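
A small check of the point above, assuming a current Python 3 interpreter:
break and continue are statements, so they cannot appear inside a
comprehension at all. The helper name `compiles` is hypothetical.

```python
def compiles(source):
    """Return True if `source` compiles as a single expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


ordinary = compiles("[x for x in range(3)]")
with_break = compiles("[break for x in range(3)]")
with_continue = compiles("[continue for x in range(3)]")
```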

> There are also a couple of tests we had to disable - one in test_dis,
> one in test_grammar. Suggestions on how to reinstate those (or agreement
> that it is OK to get rid of them) would be appreciated.

I'll have to look later.

> The PySet update code in symtable.c currently uses PyNumber_InPlaceOr
> with a subsequent call to Py_DECREF to counter the implicit call to
> Py_INCREF. Should this be changed to use PyObject_CallMethod to invoke
> the Python level update method?

What's wrong with the inplace or? I seem to recall that s |= x and
s.update(x) aren't equivalent if x is not a set.
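
The difference can be seen at the Python level. This is a sketch of the
Python-level behaviour only, run on a current Python 3 interpreter; it does
not exercise the C API calls used in symtable.c.

```python
s = {1, 2}

# update() accepts any iterable.
s.update([3, 4])

# The in-place operator requires the right-hand side to be a set.
try:
    s |= [5, 6]
except TypeError as exc:
    error = exc
else:
    error = None
```

When x is known to be a set, as in the symtable.c case, the two spellings
give the same result.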

> There are also two backwards compatibility problems which came up:
>
>    - code which explicitly deleted the listcomp variable started
> throwing NameErrors. Several tweaks were needed in the standard library
> to fix this.

That's fine. I think it's okay to have this kind of problem.
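
A minimal sketch of the breakage described above, assuming a current
Python 3 interpreter where the loop variable no longer leaks. The function
name is hypothetical. Inside a function the error raised is
UnboundLocalError, which is a subclass of NameError.

```python
def delete_loop_variable():
    squares = [n * n for n in range(4)]
    try:
        # Old idiom: tidy up the loop variable that used to leak.
        del n
    except NameError as exc:
        return squares, type(exc)
    return squares, None


squares, error_type = delete_loop_variable()
```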

>    - only the outermost iterator expression is evaluated in the scope
> containing the comprehension (just like generator expressions). This
> means that the inner expressions can no longer see class variables and
> values in explicit locals() dictionaries provided to exec & friends.
> This didn't actually cause any problems in the standard library - I only
> note it because my initial implementation mistakenly evaluated the
> outermost iterator in the new scope, which *did* cause severe problems
> along these lines.

This smells fishy. Do you have an example?
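
One possible illustration of the class-scope case, assuming a current
Python 3 interpreter (this is not an example from Nick's patch; the class
and attribute names are made up). Only the outermost iterable is evaluated
in the class body, so a class variable is visible there but not in the
rest of the comprehension.

```python
class Table:
    width = 3
    # Works: range(width) is the outermost iterable, evaluated in class scope.
    cols = [c for c in range(width)]


class_scope_error = None
try:
    class Broken:
        width = 3
        # Fails: `width` in the element expression is looked up from the
        # comprehension's own scope, which cannot see class variables.
        cells = [width * r for r in range(2)]
except NameError as exc:
    class_scope_error = exc
```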

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

