[Python-Dev] Generator objects and list comprehensions?

Joe Jevnik jjevnik at quantopian.com
Wed Jan 25 01:28:26 EST 2017


List, set, and dict comprehensions compile like:

# input
result = [expression for lhs in iterator_expression]

# output
def comprehension(iterator):
    out = []
    for lhs in iterator:
        out.append(expression)
    return out

result = comprehension(iter(iterator_expression))

When you make `expression` a `yield` the compiler thinks that
`comprehension` is a generator function instead of a normal function.

We can manually translate the following comprehension:

result = [(yield n) for n in (0, 1)]

def comprehension(iterator):
    out = []
    for n in iterator:
        # (yield n) as an expression is the value sent into this generator
        out.append((yield n))
    return out

result = comprehension(iter((0, 1)))


We can see this in the behavior of `send` on the resulting generator:

In [1]: g = [(yield n) for n in (0, 1)]

In [2]: next(g)
Out[2]: 0

In [3]: g.send('hello')
Out[3]: 1

In [4]: g.send('world')
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-4-82c589e0c27d> in <module>()
----> 1 g.send('world')

StopIteration: ['hello', 'world']

The `return out` gets translated into `raise StopIteration(out)` because
the code is a generator.

The bytecode for this looks like:

In [5]: %%dis
   ...: [(yield n) for n in (0, 1)]
   ...:
<module>
--------
  1           0 LOAD_CONST               0 (<code object <listcomp> at
0x7f4bae68eed0, file "<show>", line 1>)
              3 LOAD_CONST               1 ('<listcomp>')
              6 MAKE_FUNCTION            0
              9 LOAD_CONST               5 ((0, 1))
             12 GET_ITER
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 POP_TOP
             17 LOAD_CONST               4 (None)
             20 RETURN_VALUE

<module>.<listcomp>
-------------------
  1           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                13 (to 22)
              9 STORE_FAST               1 (n)
             12 LOAD_FAST                1 (n)
             15 YIELD_VALUE
             16 LIST_APPEND              2
             19 JUMP_ABSOLUTE            6
        >>   22 RETURN_VALUE


In `<module>` you can see us create the <listcomp> function, call `iter` on
`(0, 1)`, and then call `<listcomp>(iter(0, 1))`. In `<listcomp>` you can
see the loop like we had above, but unlike a normal comprehension we have a
`YIELD_VALUE` (from the `(yield n)`) in the comprehension.

The reason that this is different in Python 2 is that list comprehension,
and only list comprehensions, compile inline. Instead of creating a new
function for the loop, it is done in the scope of the comprehension. For
example:

# input
result = [expression for lhs in iterator_expression]

# output
result = []
for lhs in iterator_expression:
    result.append(lhs)

This is why `lhs` bleeds out of the comprehension. In Python 2, adding
making `expression` a `yield` causes the _calling_ function to become a
generator:

def f():
    [(yield n) for n in range(3)]
    # no return

# py2
>>> type(f())
generator

# py3
>> type(f())
NoneType

In Python 2 the following will even raise a syntax error:

In [5]: def f():
   ...:     return [(yield n) for n in range(3)]
   ...:
   ...:
  File "<ipython-input-5-3602a9999f46>", line 2
    return [(yield n) for n in range(3)]
SyntaxError: 'return' with argument inside generator


Generator expressions are a little different because the compilation
already includes a `yield`.

# input
(expression for lhs in iterator_expression)

# output
def genexpr(iterator):
    for lhs in iterator:
        yield expression

You can actually abuse this to write a cute `flatten` function:

`flatten = lambda seq: (None for sub in seq if (yield from sub) and False)`

because it translates to:

def genexpr(seq):
    for sub in seq:
        # (yield from sub) as an expression returns the last sent in value
        if (yield from sub) and False:
            # we never enter this branch
            yield None


That was a long way to explain what the problem was. I think that that
solution is to stop using `yield` in comprehensions because it is
confusing, or to make `yield` in a comprehension a syntax error.

On Wed, Jan 25, 2017 at 12:38 AM, Craig Rodrigues <rodrigc at freebsd.org>
wrote:

> Hi,
>
> Glyph pointed this out to me here: http://twistedmatrix.
> com/pipermail/twisted-python/2017-January/031106.html
>
> If I do this on Python 3.6:
>
> >>  [(yield 1) for x in range(10)]
> <generator object <listcomp> at 0x10cd210f8>
>
> If I understand this: https://docs.python.org/
> 3/reference/expressions.html#list-displays
> then this is a list display and should give a list, not a generator object.
> Is there a bug in Python, or does the documentation need to be updated?
>
> --
> Craig
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/
> joe%40quantopian.com
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20170125/ab9478f6/attachment-0001.html>


More information about the Python-Dev mailing list