[Python-Dev] iterzip()

Jeremy Hylton jeremy@zope.com
Mon, 29 Apr 2002 16:03:48 -0400

>>>>> "TP" == Tim Peters <tim.one@comcast.net> writes:

  TP> Check this out:

  TP> """
  TP> from time import clock as now

  TP> n = 1000000

  TP> def juststore(x):
  TP>     for i in x: x[i] = 0

  TP> def justtups(x):
  TP>     for i in x: 0,

  TP> def storetups(x):
  TP>     for i in x: x[i] = 0,

  TP> def run(func):
  TP>     x = range(n)
  TP>     start = now()
  TP>     func(x)
  TP>     finish = now()
  TP>     print "%10s %6.2f" % (func.__name__, finish - start)

  TP> run(juststore)
  TP> run(justtups)
  TP> run(storetups)
  TP> """

  TP> I get:

  TP>  juststore 0.93
  TP>   justtups 0.58
  TP>  storetups 7.61

  TP> list.append is out of the picture here.  Creating a million
  TP> tuples goes fast so long as they're recycled, and storing a
  TP> million things goes fast, but storing a million distinct tuples
  TP> takes very much longer.  It's a Mystery to me so far.  How does
  TP> it act on Linux?

It acts about the same on Linux.

I don't see why it's a mystery.  justtups() only uses one tuple at a
time; it gets re-used from the free list every time.  storetups() has
to (py)malloc a new tuple every stinkin' time.
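
The recycling can be made visible with a rough sketch like the one
below.  It is an illustration only, not part of Tim's benchmark: the
two helper names are made up, it is written for a current Python 3,
and it relies on CPython behavior (id() is the object's address, and a
freed tuple's memory tends to be handed straight back out).  It builds
(i,) rather than "0," because a modern CPython compiles "0," to a
shared constant and allocates nothing for it.

```python
def distinct_ids_recycled(n=1000):
    """Count distinct addresses seen when each tuple dies immediately."""
    seen = set()
    for i in range(n):
        # The tuple has no other reference, so it is freed as soon as
        # id() returns; the next tuple may land at the same address.
        seen.add(id((i,)))
    return len(seen)


def distinct_ids_stored(n=1000):
    """Count distinct addresses seen when every tuple is kept alive."""
    kept = [(i,) for i in range(n)]
    # All n tuples are alive at once, so every one needs its own memory.
    return len({id(t) for t in kept})


if __name__ == "__main__":
    print("recycled:", distinct_ids_recycled())
    print("stored:  ", distinct_ids_stored())
```

On CPython the first count is typically tiny and the second is always
n; the exact value of the first is an implementation detail.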

Note that there's a net excess of allocations in the storetups() case,
so we invoke the garbage collector about once every 700 trips through
the loop, or roughly n/700 times in all.  I've noted before that it
doesn't make much sense to invoke GC unless there is at least one
deallocation; you can't reclaim anything if there are no DECREFs.
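
One way to check how often the collector actually fires is to count
its runs directly.  The sketch below is only an assumption-laden
illustration for a current Python 3: count_collections is a made-up
helper, gc.callbacks did not exist in 2002, and the loops build (i,)
instead of "0," so that a fresh tuple really is allocated on every
trip.  Thresholds and tuple-tracking details differ between CPython
versions, so the counts are not expected to match n/700 exactly.

```python
import gc


def justtups(x):
    for i in x:
        t = (i,)        # each new tuple replaces, and frees, the last one


def storetups(x):
    for i in x:
        x[i] = (i,)     # every tuple stays alive inside the list


def count_collections(func, n=100_000):
    """Run func on a fresh list of n ints; return how many GC runs started."""
    x = list(range(n))
    started = []

    def on_gc(phase, info):
        # The collector calls this with phase "start" and again with "stop".
        if phase == "start":
            started.append(info["generation"])

    was_enabled = gc.isenabled()
    gc.enable()
    gc.collect()                  # reset the allocation counters first
    gc.callbacks.append(on_gc)
    try:
        func(x)
    finally:
        gc.callbacks.remove(on_gc)
        if not was_enabled:
            gc.disable()
    return len(started)


if __name__ == "__main__":
    for f in (justtups, storetups):
        print("%10s %6d collections" % (f.__name__, count_collections(f)))
```

If the reasoning above holds, justtups() should report essentially no
collections, since its allocations and deallocations balance, while
storetups() reports many.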