[Python-Dev] genexps slow?

Raymond Hettinger python at rcn.com
Wed Mar 31 00:48:53 EST 2004


> Can anybody explain this?
> 
> [guido at guido linux]$ ./python ../Lib/timeit.py -s 'r=range(10000)'
'sum([x
> for x in r])'
> 100 loops, best of 3: 7.75 msec per loop
> [guido at guido linux]$ ./python ../Lib/timeit.py -s 'r=range(10000)'
'sum(x
> for x in r)'
> 100 loops, best of 3: 8.23 msec per loop

I optimized list comps so that they run much faster than they did back
when Alex first made the comparative timings.  On my machine, they run
twice as fast.

Comparing listcomps to genexps, there are several factors affecting the
relative timings:

* Genexps should come out ahead on cache utilization (the consumer
function 
always has the data element in cache because it was just produced).
This effect increases with number of iterations (large lists cannot keep
all their elements in cache as the same time).

* Genexps incur frame switching time that list comps do not.  This
effect is a constant for each iteration and will make genexps slower
than list comps for shorter lists.

* Genexps do not access malloc as often and do not move all the data as
often.   As lists grow, they periodically have to realloc and move data.
This effect is much more pronounced in real programs where the memory is
more fragmented and the data sizes are larger.

* Genexps take better advantage of re-usable containers (ones with a
free lists).  For instance, if you time 'max((x,x) for x in r)' then
genexps should come out ahead because the same tuple is being reused on
every pass while list comp has to build 10000 tuples of size two.


Raymond






More information about the Python-Dev mailing list