
Guido, thanks for the quick edits of the first draft.

Here is a link to the second:
http://users.rcn.com/python/download/pep-0289.html

The reST version is attached.

[Guido]
BTW I think the idea of having some iterators support __copy__ as a way to indicate they can be cloned is also PEPpable; we've pretty much reached closure on that one. PEP 1 explains how to get a PEP number.
That one sounds like a job for Alex.

Raymond Hettinger

------------------------------------------------------------------

PEP: 289
Title: Generator Expressions
Version: $Revision: 1.3 $
Last-Modified: $Date: 2003/08/30 23:57:36 $
Author: python@rcn.com (Raymond D. Hettinger)
Status: Active
Type: Standards Track
Content-Type: text/x-rst
Created: 30-Jan-2002
Python-Version: 2.3
Post-History: 22-Oct-2003


Abstract
========

This PEP introduces generator expressions as a high performance,
memory efficient generalization of list comprehensions [1]_ and
generators [2]_.


Rationale
=========

Experience with list comprehensions has shown their widespread
utility throughout Python.  However, many of the use cases do
not need to have a full list created in memory.  Instead, they
only need to iterate over the elements one at a time.

For instance, the following summation code will build a full list of
squares in memory, iterate over those values, and, when the reference
is no longer needed, delete the list::

    sum([x*x for x in range(10)])

Time, clarity, and memory are conserved by using a generator
expression instead::

    sum(x*x for x in range(10))

Similar benefits are conferred on constructors for container objects::

    s = Set(word for line in page for word in line.split())
    d = dict( (k, func(v)) for k in keylist)

Generator expressions are especially useful in functions that reduce
an iterable input to a single value::

    sum(len(line) for line in file if line.strip())

Accordingly, generator expressions are expected to partially eliminate
the need for reduce(), which is notorious for its lack of clarity.
There are also additional speed and clarity benefits from writing
expressions directly instead of using lambda.

List comprehensions greatly reduced the need for filter() and map().
Likewise, generator expressions are expected to minimize the need for
itertools.ifilter() and itertools.imap().
In contrast, the utility of other itertools will be enhanced by
generator expressions::

    dotproduct = sum(x*y for x,y in itertools.izip(x_vector, y_vector))

Having a syntax similar to list comprehensions also makes it easy to
convert existing code into a generator expression when scaling up an
application.


BDFL Pronouncements
===================

The previous version of this PEP was REJECTED.  The bracketed yield
syntax left something to be desired; the performance gains had not been
demonstrated; and the range of use cases had not been shown.  After
much discussion on the python-dev list, the PEP has been resurrected in
its present form.  The impetus for the discussion was an innovative
proposal from Peter Norvig [3]_.


The Gory Details
================

1. The semantics of a generator expression are equivalent to creating
   an anonymous generator function and calling it.  There's still
   discussion about whether that generator function should copy the
   current value of all free variables into default arguments.

2. The syntax requires that a generator expression always be inside a
   set of parentheses and not have a comma on either side.
   Unfortunately, this is different from list comprehensions.  While
   [1, x for x in R] is illegal, [x for x in 1, 2, 3] is legal,
   meaning [x for x in (1,2,3)].  With reference to the file
   Grammar/Grammar in CVS, two rules change:

   a) The rule::

         atom: '(' [testlist] ')'

      changes to::

         atom: '(' [listmaker1] ')'

      where listmaker1 is almost the same as listmaker, but only
      allows a single test after 'for' ... 'in'.

   b) The rule for arglist needs similar changes.

3. The loop variable is not exposed to the surrounding function.  This
   facilitates the implementation and makes typical use cases more
   reliable.  In some future version of Python, list comprehensions
   will also hide the induction variable from the surrounding code
   (and, in Py2.4, warnings will be issued for code accessing the
   induction variable).

4. There is still discussion about whether variables referenced in
   generator expressions will exhibit late binding just like other
   Python code.  In the following example, the iterator runs *after*
   the value of y is set to one::

        def h():
            y = 0
            l = [1,2]
            def gen(S):
                for x in S:
                    yield x+y
            it = gen(l)
            y = 1
            for v in it:
                print v

5. List comprehensions will remain unchanged::

        [x for x in S]    # This is a list comprehension.
        [(x for x in S)]  # This is a list containing one generator expression.


Acknowledgements
================

* Raymond Hettinger first proposed the idea of "generator
  comprehensions" in January 2002.

* Peter Norvig resurrected the discussion in his proposal for
  Accumulation Displays [3]_.

* Alex Martelli provided critical measurements that proved the
  performance benefits of generator expressions.  He also provided
  strong arguments that they were a desirable thing to have.

* Phillip Eby suggested "iterator expressions" as the name.

* Subsequently, Tim Peters suggested the name "generator expressions".

* Samuele Pedroni argued against late binding and provided the example
  shown above.


References
==========

.. [1] PEP 202 List Comprehensions
       http://python.sourceforge.net/peps/pep-0202.html

.. [2] PEP 255 Simple Generators
       http://python.sourceforge.net/peps/pep-0255.html

.. [3] Peter Norvig's Accumulation Display Proposal
       http://www.norvig.com/pyacc.html


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End: