[Python-Dev] Shall I start adding iterators to Python 2.2?
Guido van Rossum
guido@digicool.com
Thu, 19 Apr 2001 23:29:45 -0500
I've got a fairly complete implementation of iterators along the lines
of Ping's PEP (slightly updated). This is available for inspection
through CVS: just check out the CVS branch of python/dist/src named
"iter-branch" from SourceForge:
    cvs checkout -r iter-branch -d <directory> python/dist/src
(This branch was forked off the trunk around the time of version
2.1b1, so it's not up to date with Python 2.1, but it's good enough to
show off iterators.)
My question is: should I just merge this code onto the trunk (making
it part of 2.2), or should we review the design more before committing
to this implementation?
Brief overview of what I've got implemented:
- There is a new built-in operation, spelled as iter(obj) in Python
and as PyObject_GetIter(obj) in C; it calls the tp_iter slot in
obj's type. This returns an iterator, which can be any object that
implements the iterator protocol. The iterator protocol defines one
method, next(), which returns the next value or raises the new
StopIteration exception.
- For backwards compatibility, if obj's type does not have a valid
tp_iter slot, iter(obj) and PyObject_GetIter(obj) create a sequence
iterator that iterates over a sequence.
- "for x in S: B" is implemented roughly as
    __iter = iter(S)
    while 1:
        try:
            x = __iter.next()
        except StopIteration:
            break
        B
(except that the semantics of break when there's an else clause are
different from what this Python code would do).
- The test "key in dict" is implemented as "dict.has_key(key)". (This
was done by implementing the sq_contains slot.)
- iter(dict) returns an iterator that iterates over the keys of dict
without creating a list of keys first. This means that "for key in
dict" has the same effect as "for key in dict.keys()" as long as the
loop body doesn't modify the dictionary (assignment to existing keys
is okay).
- There's an operation to create an iterator from a function and a
sentinel value. This is spelled as iter(function, sentinel). For
example,
    for line in iter(sys.stdin.readline, ""):
        ...
is an efficient loop over the lines of stdin.
- But even cooler is this, which is totally equivalent:
    for line in sys.stdin:
        ...
- Not yet implemented, but part of the plan, is to use iterators for
all other implicit loops, like map/reduce/filter, min/max, and the
"in" test for sequences that don't define sq_contains.
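
The points above can be sketched in runnable form. This is only an
illustration, not code from the iter-branch. It assumes a current
Python 3 interpreter, where the iterator method is spelled __next__
(called through the built-in next()) instead of next(), and where
defining __iter__ on a class is what fills the tp_iter slot. The names
Countdown, Squares, expanded_for and read_lines are hypothetical and
exist only for this example.

```python
import io


class Countdown:
    """Hypothetical iterator that yields n, n-1, ..., 1."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # An iterator returns itself from iter().
        return self

    def __next__(self):
        # Assumption: Python 3 spelling of the next() method.
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1


class Squares:
    """Hypothetical old-style sequence: __getitem__ only, no __iter__."""

    def __getitem__(self, i):
        if i >= 4:
            raise IndexError(i)
        return i * i


def expanded_for(iterable):
    """Hand-written version of: for x in iterable: result.append(x)"""
    result = []
    it = iter(iterable)
    while True:
        try:
            x = next(it)
        except StopIteration:
            break
        result.append(x)
    return result


def read_lines(stream):
    """Collect lines with the iter(function, sentinel) form."""
    return list(iter(stream.readline, ""))
```

With these definitions, expanded_for(Countdown(3)) and
list(Countdown(3)) both give [3, 2, 1]. expanded_for(Squares()) gives
[0, 1, 4, 9] through the sequence-iterator fallback, since Squares has
no __iter__ of its own. read_lines(io.StringIO("a\nb\n")) gives the
same lines as iterating over the stream directly.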
--Guido van Rossum (home page: http://www.python.org/~guido/)