Iterators & generators (RE: Real Problems with Python)

Tim Peters tim_one at email.msn.com
Mon Feb 14 01:37:57 EST 2000


[Samuel A. Falvo II, wonders about "suspend" semantics after
 Tim's unelaborated example]

[Aahz Maruch, proves that you don't have to be a lambda-head to
 pick up the thrust of it at once -- thanks!]

> suspend *is* a return.  It also -- at the same time -- stores the
> program counter for b.traverse_post(), so that the next call to
> b.traverse_post() starts at the line of execution immediately following
> the suspend.

Right so far as it goes -- also stores away "everything else" needed, like
local vars and their current bindings.  This is cheap since it's just the
state of the frame object, and half the trick of "suspend" lies simply in
not decrementing the frame object's refcount!  In effect, it freezes the
function in mid-stride, returns a result to the caller, and waits to get
thawed again.

> This means that, should someone be foolish enough to call
> b.traverse_post() inside the loop "for leaf...", the loop will
> probably return incorrect results, just like modifying a list
> while you iterate over it.

Actually not a problem.  Under the covers, the implementation has to
distinguish between "the first call" and "resuming calls".  A first call
always gets a fresh frame of its own, just like any other call today.  It's
only resuming that's new behavior on the call side (& resuming is much
cheaper than a first call -- it simply restarts the frame object from where
it suspended).  So you could have dozens of iterators marching over the same
tree, and they wouldn't interfere with each other.  "for" can hide all that
by magic, but it would also be Pythonic to expose the machinery so that you
*could* do fancier things yourself.

> Using suspend means that people no longer have to hack __getitem__()
> to "do the right thing" for a class.

Try even *writing* a binary tree postorder traversal via __getitem__(), then
ponder NeelK's original points again.  You can't do better than faking
recursion via a hand-simulated stack (unnatural, tedious and error-prone),
or building the entire result sequence up front then passing it out a piece
at a time (can be too space-intensive, or waste time if you're simply doing
a search and will likely "get out early").  Note that binary tree traversal
is about as simple as you can get without being trivial:  "resumable
functions" already make a night-and-day difference here, but are that much
more helpful the harder the task.

> Doing this in a threaded environment is dangerous, but no more so
> than dealing with any other class.

Or any other kind of call -- each iterator/generator gets its own locals, so
the only dangers are in mutating while iterating, or communicating info via
unprotected globals.  The same cautions apply to __getitem__ today.

it-tends-to-"just-work"-all-by-itself-ly y'rs  - tim






More information about the Python-list mailing list