"Collapsing" a list into a list of changes

Jack Diederich jack at performancedrivers.com
Sat Feb 5 11:53:56 EST 2005


On Sat, Feb 05, 2005 at 02:31:08PM +1000, Nick Coghlan wrote:
> Jack Diederich wrote:
> >Since this is 2.4 you could also return a generator expression.
> >
> >
> >>>> def iter_collapse(myList):
> >...   return (x[0] for (x) in it.groupby([0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]))
> >...
> 
> But why write a function that returns a generator expression, when you 
> could just turn the function itself into a generator?
> 
> Py>def iter_collapse(myList):
> ...   for x in it.groupby(myList):
> ...     yield x[0]
> 
Here is where I say, "because it is faster!"
Except it isn't.  Maybe.  The results are unexpected, anyway.

import timeit
import itertools as it

# generator function
def collapse1(myl):
  for x in it.groupby(myl):
    yield x[0]

# plain function that returns a generator expression
def collapse2(myl):
  return (x[0] for x in it.groupby(myl))

list_str = '[0,0,1,1,1,2,2,3,3,3,2,2,2,4,4,4,5]'

# timeit() defaults to 1,000,000 runs; each figure is total seconds.
# First pair: create the iterator only.  Second pair: create and consume it.
print "collapse1", timeit.Timer('collapse1(%s)' % (list_str), 'from __main__ import collapse1').timeit()
print "collapse2", timeit.Timer('collapse2(%s)' % (list_str), 'from __main__ import collapse2').timeit()
print "list(collapse1)", timeit.Timer('list(collapse1(%s))' % (list_str), 'from __main__ import collapse1').timeit()
print "list(collapse2)", timeit.Timer('list(collapse2(%s))' % (list_str), 'from __main__ import collapse2').timeit()

collapse1 1.06855082512
collapse2 3.40627384186
list(collapse1) 8.31489896774
list(collapse2) 9.49903011322
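
For anyone who wants to repeat the measurement on a current interpreter, here is a rough Python 3 port of the script above.  It is a sketch under a few assumptions that are not in the original: the names DATA, NUMBER and bench are made up, the list is built once instead of inside every timed statement, the timed statements are lambdas (which add a small, equal overhead to each case), and NUMBER is lower than timeit's default of 1,000,000.  Absolute figures will not match the ones above.

```python
import itertools as it
import timeit

def collapse1(myl):
    # generator function
    for x in it.groupby(myl):
        yield x[0]

def collapse2(myl):
    # plain function that returns a generator expression
    return (x[0] for x in it.groupby(myl))

DATA = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 4, 4, 4, 5]
NUMBER = 100000  # assumption: fewer runs than timeit's default

def bench(func, consume):
    """Return total seconds for NUMBER calls of func(DATA).

    With consume=True the returned iterator is also exhausted via list().
    """
    if consume:
        stmt = lambda: list(func(DATA))
    else:
        stmt = lambda: func(DATA)
    return timeit.timeit(stmt, number=NUMBER)

timings = {
    "collapse1": bench(collapse1, consume=False),
    "collapse2": bench(collapse2, consume=False),
    "list(collapse1)": bench(collapse1, consume=True),
    "list(collapse2)": bench(collapse2, consume=True),
}

for label, seconds in timings.items():
    print(label, seconds)
```
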

The overhead of creating the generator expression seems much higher than
calling the equivalent generator function (3.41s vs. 1.07s per million calls).
If you subtract out the setup difference, actually running through the whole
iterator is slightly faster for genexps than for functions that yield.  At a
guess, it has something to do with how they handle lexical scope and some
byte code differences.
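
Spelled out with the reported figures (total seconds for timeit's default of 1,000,000 runs), the subtraction looks roughly like this.  The dictionary and function names are made up for illustration, and the sketch uses Python 3 print syntax rather than the 2.4 syntax of the script.

```python
# Timings as reported in this post, in seconds per 1,000,000 runs.
create = {
    "collapse1": 1.06855082512,
    "collapse2": 3.40627384186,
}
create_and_consume = {
    "collapse1": 8.31489896774,
    "collapse2": 9.49903011322,
}

def iteration_only(name):
    """Time left over once the cost of creating the iterator is removed."""
    return create_and_consume[name] - create[name]

for name in ("collapse1", "collapse2"):
    print(name,
          "creation:", round(create[name], 3),
          "iteration only:", round(iteration_only(name), 3))
```

On these numbers the genexp version costs about 2.3s more to create per million calls but about 1.2s less to run through, which is why it still loses overall on a list this short.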

I said guess, right?
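
One way to probe the guess, as a sketch rather than a full explanation: check how much work each version does at call time.  Calling a generator function only creates the generator object, while a generator expression evaluates its outermost iterable as soon as it is built.  The Tracked class below is a hypothetical helper, and the code assumes Python 3 and CPython's itertools.groupby, which asks for an iterator over its argument when it is constructed.

```python
import itertools as it

class Tracked:
    """Hypothetical helper: counts how often an iterator is requested."""
    def __init__(self, items):
        self.items = items
        self.iter_calls = 0

    def __iter__(self):
        self.iter_calls += 1
        return iter(self.items)

def collapse1(myl):
    for x in it.groupby(myl):
        yield x[0]

def collapse2(myl):
    return (x[0] for x in it.groupby(myl))

data = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 2, 4, 4, 4, 5]

t1 = Tracked(data)
g1 = collapse1(t1)
calls_after_create_1 = t1.iter_calls  # body has not started running

t2 = Tracked(data)
g2 = collapse2(t2)
calls_after_create_2 = t2.iter_calls  # groupby(myl) was already built

print("collapse1 iter() calls at creation:", calls_after_create_1)
print("collapse2 iter() calls at creation:", calls_after_create_2)

# Both produce the same collapsed sequence once consumed.
result1 = list(g1)
result2 = list(g2)
print(result1 == result2)
```

If this holds on your interpreter, part of collapse2's creation cost is work that collapse1 postpones until iteration.  That would move time from the "create" figures to the "consume" figures, though it does not by itself say how much of the gap it accounts for.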

-Jack


