[Python-3000] List & set comprehensions patch

Guido van Rossum guido at python.org
Wed Mar 7 23:20:46 CET 2007


On 3/7/07, Nick Coghlan <ncoghlan at gmail.com> wrote:
> > How about determining if it's a *simple* case or not, and doing the
> > variable renaming in the simple case and Georg's original version in
> > non-simple cases? You can define "simple" as whatever makes the
> > determination easy and still treats most common cases as simple. E.g.
> >  a lambda would be a non-simple case, and so would using a nonlocal
> > or global variable (note though that nonlocal and global should reach
> >  inside the list/set comp!) etc.
>
> It sounds feasible, but I don't think the lack of it should prevent the
> change from going in.

I was worried that if the change went in, the portions of the old code
that would still be usable for the "simple" cases would be deleted or
at least go stale.

But I think it's more important to make the new syntax and semantics go in.

> The implementation in the patch *is* a regression
> from a speed point of view, but it's also an obvious target for later
> optimisation - if we find a function scope that cannot outlive it's
> parent scope (such as a simple comprehension), then we can inline that
> code in the parent scope by using renamed variables.
>
> However, there's then further weirdness with what effect this has on the
> contents of locals()...

Yet another reason to declare locals() an implementation detail. I
personally never use it.

> It looks like even nailing down what 'simple' means in this situation
> will be somewhat tricky :P

That's why I left the definition of simple to you. :-)

> >> In implementing it, I discovered that list comprehensions don't do
> >> SETUP_LOOP/POP_BLOCK around their for loop - I'd like to get
> >> confirmation from someone who knows their way around the ceval loop
> >>  better than I do that omitting those is actually legitimate (I
> >> *think* the restriction to a single expression in the body of the
> >> comprehension makes it OK, but I'm not sure).
> >
> > They exist to handle break/continue. Since those don't apply to
> > list/set comps, it's safe.
>
> Excellent - that's what I thought they were about. Does yield need them?
> If it doesn't, we can probably also remove them from the bytecode
> emitted for generator expressions.

Yield shouldn't need them; it doesn't involve flow control within the frame.

> >> There are also a couple of tests we had to disable - one in
> >> test_dis, one in test_grammar. Suggestions on how to reinstate
> >> those (or agreement that it is OK to get rid of them) would be
> >> appreciated.
> >
> > I'll have to look later.
>
> According to Georg, it's only the test_dis one that needs a closer look
> - the removed grammar test was declared invalid some time ago.

OK, thanks.

> >> The PySet update code in symtable.c currently uses
> >> PyNumber_InplaceOr with a subsequent call to Py_DECREF to counter
> >> the implicit call to Py_INCREF. Should this be changed to use
> >> PyObject_CallMethod to invoke the Python level update method?
> >
> > What's wrong with the inplace or? I seem to recall that s |= x and
> > s.update(x) aren't equivalent if x is not a set.
>
> A while back when Barry wanted to add PySet_Update, Raymond recommended
> using PyObject_CallMethod instead. In this case, I know I'm dealing with
> real set objects, so the inplace-or works fine (it's annoyingly easy to
> forget the decref and leak a reference though... all the more reason to
> leave that part of the patch out for the moment, I guess)

This is system level code, we only need to get it right once.

> >> - only the outermost iterator expression is evaluated in the scope
> >> containing the comprehension (just like generator expressions).
> >> This means that the inner expressions can no longer see class
> >> variables and values in explicit locals() dictionaries provided to
> >> exec & friends. This didn't actually cause any problems in the
> >> standard library - I only note it because my initial implementation
> >> mistakenly evaluated the outermost iterator in the new scope, which
> >> *did* cause severe problems along these lines.
> >
> > This smells fishy. Do you have an example?
>
> The following example uses a generator expression to show the effect of
> the semantic change:
>
> .>>> class Dummy:
> ...     x = 10
> ...     print [x for i in range(5)]
> ...     print list(x for i in range(5))
> ...
> [10, 10, 10, 10, 10]
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
>    File "<stdin>", line 4, in Dummy
>    File "<stdin>", line 4, in <generator expression>
> NameError: global name 'x' is not defined
>
> With list comprehensions also switching to private iteration variables,
> the first print statement will throw a NameError instead of printing the
> list.
>
> This was brought to my attention by the fact that my original buggy
> implementation blew up on really common things like "[i for i in seq if
> cond(i)]" when the code was executed with execfile() rather than being
> imported.

Ah, I see. I think this is acceptable, since it only happens in cases
where we already have somewhat odd scoping rules.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


More information about the Python-3000 mailing list