[Tutor] List comprehension question
Steven D'Aprano
steve at pearwood.info
Mon Nov 8 12:43:40 CET 2010
Richard D. Moores wrote:
> So this version of my function uses a generator, range(), no?
>
> def proper_divisors(n):
> sum_ = 0
> for x in range(1,n):
> if n % x == 0:
> sum_ += x
> return sum_
I'm going to be pedantic here... but to quote the Middleman, specificity
is the soul of all good communication.
The term "generator" in Python refers specifically to an object akin to
ordinary functions. Instead of using return, they use yield:
def g(n):
n = 2*n
yield n
yield n+1
g itself is a "generator function", and the result of calling g is a
"generator object". In practice, people tend to be sloppy and just refer
to both the function g and the result of calling it as generators,
expecting the reader to tell from context which is which.
But pedantically, we can see the difference:
>>> g # the function itself
<function g at 0xb7b8f1ec>
>>> type(g)
<class 'function'>
>>> g(1) # the result of calling the function
<generator object g at 0xb7a7be14>
>>> type(g(1))
<class 'generator'>
Generators are a special type of "iterators" -- iterators are objects
which can be iterated over, like lists, strings, and many others. In
this case, the generator is a special type of executable function which
can be paused mid-process. If you do this:
it = g(3) # say
next(it)
the built-in next() function starts executing the generator object. The
code is executed up to the first yield statement:
n = 2*n
yield n
and so next(it) returns 2*3 = 6. Then execution pauses, and the
generator just waits.
(Aside: in Python 2.x, next(it) is written it.next() instead.)
The next time you call next(it), execution continues until the next yield:
yield n+1
and so next(it) will now return 6+1 = 7.
Finally, if you call next(it) again, execution drops off the end of the
code, and the generator will raise StopIteration.
Now, this seems a bit complicated, but Python can handle most of that
for you. Instead of manually calling next() on the generator object, if
you use it in a for-loop:
for i in g(3):
print(i)
the for-loop will handle calling next and catching the StopIterator, and
you simply see i = 6, 7. Or you can pass it to functions such as list:
list(g(3))
and get pretty much the same result.
(For advanced use: in newer versions of Python, generators have become
even more powerful, but complex, objects called co-routines. Co-routines
allow you to send data *into* the middle of the running code, as well as
extract data using yield.)
Python also has "generator expressions" -- this is a short-cut syntax
for particularly simple generators. Generator expressions look just like
list expressions, except they use round brackets instead of square:
>>> (x+1 for x in (1, 2, 4) if x%2 == 0)
<generator object <genexpr> at 0xb7a7be3c>
List comprehensions run all the way to the end, producing a list.
Generator expressions run lazily, only producing values one at a time as
needed. So compare:
sum([list comp]) vs sum(gen expr)
The first case needs to find storage for the entire list first, which is
potentially huge. But the second case just adds the numbers one at a
time, and so never needs much memory.
So, that's generators... what about range? range is also an iterator,
that is, something you can iterate over, but it's not a generator:
>>> type(range(20))
<class 'range'>
It is its own special type of object, like list, str, int or dict.
Coming back to your function:
def proper_divisors(n):
sum_ = 0
for x in range(1,n):
if n % x == 0:
sum_ += x
return sum_
we can write that much more simply:
def proper_divisors(n):
return sum(x for x in range(1, n) if n%x == 0)
And we can speed it up a bit by realising that there's no need to go all
the way up to n in the range. After all, we know that (say) 100%96 can't
possibly be zero. The largest factor of n is sqrt(n):
def proper_divisors(n):
return sum(x for x in range(1, int(math.sqrt(n))) if n%x == 0)
For large n, that will save a lot of time.
--
Steven
More information about the Tutor
mailing list