Iterators, generators and 2.2 (was RE: do...until wisdom needed...)
[Aahz]
Generators (called "iterators") are going to be in 2.2. They'll be more powerful than Icon's generators; it's not clear whether they'll be a full-fledged substitute for coroutines.
[Neelakantan Krishnaswami]
{\mr_burns Excellent.}
Is this the iterator idea that Ping posted about a couple of months back? What is the PEP number for this? I'm curious how the existing iteration protocol will interact with the new iterators.
This is getting confused. Iterators != generators (sorry, Aahz! it's more involved than that). Aahz gave you the PEP number for iterators, and last night Guido checked an initial implementation into the 2.2 CVS tree.

In Python terms, "for" setup looks for an __iter__ method first, and if it doesn't find it but does find __getitem__, builds a lightweight iterator around the __getitem__ method instead. So the "for" loop works only with iterators now, but there's an adapter providing iterators by magic for old sequence objects that don't know about iterators:

    C:\Code\python\dist\src\PCbuild>python
    Python 2.2a0 (#16, Apr 20 2001, 23:16:12) [MSC 32 bit (Intel)] on win32
    Type "copyright", "credits" or "license" for more information.
    >>> def f(s):
    ...     for i in s:
    ...         print i
    ...
    >>> from dis import dis
    >>> dis(f)
              0 SET_LINENO               1

              3 SET_LINENO               2
              6 SETUP_LOOP              25 (to 34)
              9 LOAD_FAST                0 (s)
             12 GET_ITER
        >>   13 SET_LINENO               2
             16 FOR_ITER                14
             19 STORE_FAST               1 (i)
             22 SET_LINENO               3
             25 LOAD_FAST                1 (i)
             28 PRINT_ITEM
             29 PRINT_NEWLINE
             30 JUMP_ABSOLUTE           13
             33 POP_BLOCK
        >>   34 LOAD_CONST               0 (None)
             37 RETURN_VALUE
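The lookup order just described can be pictured at the Python level roughly as follows. This is only a sketch, written in present-day Python 3 spelling (`__next__` rather than the 2.2-era `next` method); the names `OldStyleSeq`, `GetitemIterator` and `get_iter` are made up for illustration and are not what the interpreter actually calls anything.

```python
class OldStyleSeq:
    """Hypothetical old-style sequence: defines __getitem__ but no __iter__."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        # Raises IndexError once index runs past the end.
        return self._items[index]


class GetitemIterator:
    """Illustrative adapter: walks obj[0], obj[1], ... until IndexError."""

    def __init__(self, obj):
        self._obj = obj
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            value = self._obj[self._index]
        except IndexError:
            # The old protocol's end-of-sequence signal becomes the new one.
            raise StopIteration from None
        self._index += 1
        return value


def get_iter(obj):
    """Assumed Python-level picture of the lookup: __iter__ first, then __getitem__."""
    if hasattr(obj, "__iter__"):
        return obj.__iter__()
    if hasattr(obj, "__getitem__"):
        return GetitemIterator(obj)
    raise TypeError("object is not iterable")


old = OldStyleSeq("abc")
via_adapter = list(get_iter(old))   # goes through the __getitem__ fallback
via_for = [item for item in old]    # the real "for" machinery does the same job
```

Both lists come out the same, which is the point of the adapter: old sequence classes keep working without knowing iterators exist.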
The backward compatibility layer described above is hiding in the new GET_ITER opcode. Of course builtin lists (and so on) define the iterator slot directly now, so GET_ITER simply returns their iterator directly. Loops are less complicated (internally) now, and run significantly faster. User-defined types and classes no longer *need* to (ab)use __getitem__ to implement iteration (which is of particular interest to Greg Wilson right now, who is implementing a Set class and doesn't *want* to define __getitem__ because it's semantically senseless).

None of that should be controversial in the least. More controversial is that iteration over dict keys has been tentatively added (and note that this is another thing made *possible* by breaking the old connection between __getitem__ and iteration):
    >>> dict = {"one": 1, "two": 2}
    >>> for k in dict:
    ...     print k
    ...
    one
    two
This is significantly faster, and unboundedly more memory-efficient, than doing "for k in dict.keys()". The dict.__contains__ slot was also filled in, so that "k in dict" is synonymous with "dict.has_key(k)", but again runs significantly faster:

    >>> "one" in dict
    1
    >>> "three" in dict
    0
File objects have also grown iterators, so that, e.g.,

    for line in sys.stdin:
        print line

now works. Iterators can be explicitly materialized too, via the new iter() builtin function, and invoked apart from the "for" protocol:
    >>> i1 = iter(dict)
    >>> i1
    >>> dir(i1)
    ['next']
    >>> print i1.next.__doc__
    it.next() -- get the next value, or raise StopIteration
    >>> i2 = iter(dict)
    >>> i1.next()
    'one'
    >>> i2.next()
    'one'
    >>> i1.next()
    'two'
    >>> i2.next()
    'two'
    >>> i1.next()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    StopIteration
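To make the explicit protocol concrete, here is a small sketch in present-day Python 3 spelling (the builtin `next(it)` in place of the 2.2-era `it.next()`). `SimpleSet` and `drive_by_hand` are hypothetical names invented for this illustration; the class is only a guess at the kind of container that wants iteration and "in" without defining __getitem__, not Greg Wilson's actual Set class.

```python
class SimpleSet:
    """Hypothetical set-like container: iterable, but with no __getitem__."""

    def __init__(self, items=()):
        # Store members as dict keys; the values are unused.
        self._members = {}
        for item in items:
            self._members[item] = True

    def __iter__(self):
        # Delegate to the dict's own key iterator.
        return iter(self._members)

    def __contains__(self, item):
        return item in self._members


def drive_by_hand(iterable):
    """Spell out what a "for" loop does: iter(), next(), StopIteration."""
    seen = []
    it = iter(iterable)
    while True:
        try:
            item = next(it)      # Python 3 spelling of it.next()
        except StopIteration:
            break                # exhaustion ends the loop quietly
        seen.append(item)
    return seen


s = SimpleSet(["one", "two"])
i1 = iter(s)
i2 = iter(s)                     # a second, independent iterator
first_from_i1 = next(i1)
first_from_i2 = next(i2)         # i2 starts from the beginning too
members = drive_by_hand(s)
```

Each call to iter() hands back a fresh iterator with its own position, which is why `i1` and `i2` above both start from the first member, just as in the session with the dict.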
Note that this allows a simple memory-efficient implementation of parallel sequence iteration too. For example, this program:

    class zipiter:
        def __init__(self, seq1, *moreseqs):
            seqs = [seq1]
            seqs.extend(list(moreseqs))
            self.seqs = seqs

        def __iter__(self):
            self.iters = [iter(seq) for seq in self.seqs]
            return self

        def next(self):
            return [i.next() for i in self.iters]

    for i, j, k in zipiter([1, 2, 3], "abc", (5., 6., 7., 8.)):
        print i, j, k

prints

    1 a 5.0
    2 b 6.0
    3 c 7.0

Now all that is just iteration in a thoroughly conventional sense. There is no support here for generators or coroutines or microthreads, except in the sense that breaking the iteration==__getitem__ connection makes it easier to think about *how* generators may be implemented, and having an explicit iterator object "should" make it possible to go beyond Icon's notion of generators (which can only be driven implicitly by control context). Neil Schemenauer is currently thinking hard about that "in his spare time", but there's no guarantee anything will come of it in 2.2. Iterators are a sure thing, though (not least because they're already implemented!).

not-only-implemented-but-feel-exactly-right-ly y'rs - tim
participants (1)
-
Tim Peters