Generator function performance

Tim Peters tim.one at home.com
Mon Dec 17 22:30:00 EST 2001


[Oren Tirosh]
> At first, I thought of generator function mostly in terms of convenience:
> the resulting source is shorter and more readable compared to other
> implementations.  Then I found an unexpected benefit: generators are
> also very efficient.

Yes, this was noted (and measured) on c.l.py when generators were still a
novelty <wink>.  For example, recursive generators yielding combinatorial
objects one at a time generally run significantly faster than recursive
functions building up all the results into a list.  Plus the memory savings
can be unbounded <wink>.

> A function call in Python is quite expensive.  Suspending and restoring
> the state of a generator is relatively cheap in comparison.

It may help to explain that a bit:  calls cost for two main reasons.  First
is sorting out the large number of arglist gimmicks at runtime each time:
positional arguments, default arguments, and keyword arguments don't come
for free, and so far we haven't invented specialized opcodes to try to take
advantage of that *most* calls don't use keyword arguments at all.  Second
is initializing a frame object, which spans over 100 lines of dense, dull C
code, like this (just the tail end of it):

      ...
	f->f_locals = locals;
	f->f_trace = NULL;
	f->f_exc_type = f->f_exc_value = f->f_exc_traceback = NULL;
	f->f_tstate = tstate;

	f->f_lasti = 0;
	f->f_lineno = code->co_firstlineno;
	f->f_restricted = (builtins != tstate->interp->builtins);
	f->f_iblock = 0;
	f->f_nlocals = code->co_nlocals;
	f->f_stacksize = code->co_stacksize;
	f->f_ncells = ncells;
	f->f_nfreevars = nfrees;

	while (--extras >= 0)
		f->f_localsplus[extras] = NULL;

	f->f_valuestack = f->f_localsplus + (f->f_nlocals + ncells + nfrees);
	f->f_stacktop = f->f_valuestack;
	_PyObject_GC_TRACK(f);
	return f;

In contrast, the arglist is parsed only once over the lifetime of a
generator, and the frame object is reused almost entirely as-is across
resumptions.  The full code needed to resume a generator is less than the
block above from the tail end of PyFrame_New() (see gen_iternext() in
ceval.c for the non-gory details).

tastes-great-and-less-filling-ly y'rs  - tim





More information about the Python-list mailing list