[Tutor] List comprehension question

Steven D'Aprano steve at pearwood.info
Mon Nov 8 12:43:40 CET 2010


Richard D. Moores wrote:

> So this version of my function uses a generator, range(), no?
> 
> def proper_divisors(n):
>     sum_ = 0
>     for x in range(1,n):
>         if n % x == 0:
>             sum_ += x
>     return sum_

I'm going to be pedantic here... but to quote the Middleman, specificity 
is the soul of all good communication.

The term "generator" in Python refers specifically to an object akin to 
ordinary functions. Instead of using return, they use yield:

def g(n):
     n = 2*n
     yield n
     yield n+1

g itself is a "generator function", and the result of calling g is a 
"generator object". In practice, people tend to be sloppy and just refer 
to both the function g and the result of calling it as generators, 
expecting the reader to tell from context which is which.

But pedantically, we can see the difference:

 >>> g  # the function itself
<function g at 0xb7b8f1ec>
 >>> type(g)
<class 'function'>
 >>> g(1)  # the result of calling the function
<generator object g at 0xb7a7be14>
 >>> type(g(1))
<class 'generator'>

Generators are a special type of "iterators" -- iterators are objects 
which can be iterated over, like lists, strings, and many others. In 
this case, the generator is a special type of executable function which 
can be paused mid-process. If you do this:

it = g(3)  # say
next(it)

the built-in next() function starts executing the generator object. The 
code is executed up to the first yield statement:

     n = 2*n
     yield n

and so next(it) returns 2*3 = 6. Then execution pauses, and the 
generator just waits.

(Aside: in Python 2.x, next(it) is written it.next() instead.)

The next time you call next(it), execution continues until the next yield:

     yield n+1

and so next(it) will now return 6+1 = 7.

Finally, if you call next(it) again, execution drops off the end of the 
code, and the generator will raise StopIteration.

Now, this seems a bit complicated, but Python can handle most of that 
for you. Instead of manually calling next() on the generator object, if 
you use it in a for-loop:

for i in g(3):
     print(i)

the for-loop will handle calling next and catching the StopIterator, and 
you simply see i = 6, 7. Or you can pass it to functions such as list:

list(g(3))

and get pretty much the same result.

(For advanced use: in newer versions of Python, generators have become 
even more powerful, but complex, objects called co-routines. Co-routines 
allow you to send data *into* the middle of the running code, as well as 
extract data using yield.)

Python also has "generator expressions" -- this is a short-cut syntax 
for particularly simple generators. Generator expressions look just like 
list expressions, except they use round brackets instead of square:

 >>> (x+1 for x in (1, 2, 4) if x%2 == 0)
<generator object <genexpr> at 0xb7a7be3c>

List comprehensions run all the way to the end, producing a list. 
Generator expressions run lazily, only producing values one at a time as 
needed. So compare:

sum([list comp]) vs sum(gen expr)

The first case needs to find storage for the entire list first, which is 
potentially huge. But the second case just adds the numbers one at a 
time, and so never needs much memory.

So, that's generators... what about range? range is also an iterator, 
that is, something you can iterate over, but it's not a generator:

 >>> type(range(20))
<class 'range'>

It is its own special type of object, like list, str, int or dict.


Coming back to your function:

def proper_divisors(n):
     sum_ = 0
     for x in range(1,n):
         if n % x == 0:
             sum_ += x
     return sum_


we can write that much more simply:

def proper_divisors(n):
     return sum(x for x in range(1, n) if n%x == 0)


And we can speed it up a bit by realising that there's no need to go all 
the way up to n in the range. After all, we know that (say) 100%96 can't 
possibly be zero. The largest factor of n is sqrt(n):


def proper_divisors(n):
     return sum(x for x in range(1, int(math.sqrt(n))) if n%x == 0)


For large n, that will save a lot of time.




-- 
Steven



More information about the Tutor mailing list