Re: [Python-Dev] PEP 289: Generator Expressions (second draft)
I've checked in an update to Raymond's PEP 289 which (I hope) clarifies a lot of things, and settles the capturing of free variables. Raymond, please take this to c.l.py for feedback! Wear asbestos. :-) I'm sure there will be plenty of misunderstandings in the discussion there. If these are due to lack of detail or clarity in the PEP, feel free to update the PEP. If there are questions that need us to go back to the drawing board or requiring BDFL pronouncement, take it back to python-dev. --Guido van Rossum (home page: http://www.python.org/~guido/)
Raymond, please take this to c.l.py for feedback! Wear asbestos. :-)
One thought: If we eventually adopt the notation that {a, b, c} is a set, there is a potential ambiguity in expressions such as {x**2 for x in range(n)}. Which is it, a set comprehension or a set with one element that is a generator expression? It would have to be the former, of course, by analogy with [x**2 for x in range(n)], which means that if we introduce generator expressions, and we later introduce set literals, we will have to introduce set comprehensions at the same time. Either that or prohibit generator expressions as set-literal elements unless parenthesized -- i.e. {(x**2 for x in range(n))}.
If we eventually adopt the notation that {a, b, c} is a set, there is a potential ambiguity in expressions such as {x**2 for x in range(n)}. Which is it, a set comprehension or a set with one element that is a generator expression?
It would have to be the former, of course, by analogy with [x**2 for x in range(n)], which means that if we introduce generator expressions, and we later introduce set literals, we will have to introduce set comprehensions at the same time. Either that or prohibit generator expressions as set-literal elements unless parenthesized -- i.e. {(x**2 for x in range(n))}.
Don't worry. The current proposal *always* requires parentheses around generator expressions (but it may be the only argument to a function), so your example would be illegal. --Guido van Rossum (home page: http://www.python.org/~guido/)
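A small sketch of the parenthesization rule described above, for readers who want to try it. It is checked against current Python 3 rather than the 2003 draft, and the helper name `is_valid_syntax` is made up for illustration.

```python
def is_valid_syntax(source):
    """Return True if `source` compiles as Python code (hypothetical helper)."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Sole argument to a call: the call's own parentheses are enough.
total = sum(x**2 for x in range(5))

# With a second argument, the generator expression needs its own parentheses.
total_with_start = sum((x**2 for x in range(5)), 10)

# The same two-argument call without the inner parentheses does not compile.
two_args_bare = is_valid_syntax("sum(x**2 for x in range(5), 10)")
two_args_parenthesized = is_valid_syntax("sum((x**2 for x in range(5)), 10)")
```

The compile-based check is used only so that the rejected form can be shown without stopping the script.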
On Thu, 2003-10-23 at 10:18, Andrew Koenig wrote:
Raymond, please take this to c.l.py for feedback! Wear asbestos. :-)
One thought:
If we eventually adopt the notation that {a, b, c} is a set, there is a potential ambiguity in expressions such as {x**2 for x in range(n)}. Which is it, a set comprehension or a set with one element that is a generator expression?
It would have to be the former, of course, by analogy with [x**2 for x in range(n)], which means that if we introduce generator expressions, and we later introduce set literals, we will have to introduce set comprehensions at the same time. Either that or prohibit generator expressions as set-literal elements unless parenthesized -- i.e. {(x**2 for x in range(n))}.
Heh, and then {(x, x**2) for x in range(n)} is a dict comprehension.

okay-/now/-i'll-shut-up-about-them-ly y'rs,
-Barry
Heh, and then {(x, x**2) for x in range(n)} is a dict comprehension.
No, it's a set comprehension where the set elements are pairs. The dict comprehension would be {x: x**2 for x in range(n)} Or would that be a single-element dict whose key is x and value is a generator expression? :-)
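As a hedged illustration of the distinction drawn here: set and dict comprehensions were only hypothetical in this thread, but both spellings exist in current Python 3, so the two forms can be compared directly. The variable names are chosen for this example only.

```python
n = 4

# Braces with a bare expression: a set whose elements are (x, x**2) pairs.
pairs = {(x, x**2) for x in range(n)}

# Braces with "key: value": a dict mapping each x to x**2.
squares = {x: x**2 for x in range(n)}

# The set of pairs can be turned into the same dict explicitly.
squares_from_pairs = dict(pairs)
```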
Hello,

this is my first post to this list, but I have followed it passively for quite some time.

I had a thought about distinguishing a list containing 1 iterator from a list comprehension, one that has not appeared (at least to my eyes) yet.

Why not take the same approach as is already used for tuples? Like

    (5) is just the value 5

and

    (5,) is a 1-tuple containing the value 5

I thought it would be intuitive to have

    [x**2 for x in range(n)]   # be a list comprehension like it currently is
    [x**2 for x in range(n),]  # a list with 1 iterator in it
No, it's a set comprehension where the set elements are pairs. The dict comprehension would be
{x: x**2 for x in range(n)}
Or would that be a single-element dict whose key is x and value is a generator expression? :-)
in this case the same could be applied

    {x: x**2 for x in range(n)}   # dict comprehension
    {x: x**2 for x in range(n),}  # dict with 1 iterator

(but "x" is probably not a valid name, is it?)

best regards

Werner
Andrew Koenig <ark-mlist@att.net>:
The dict comprehension would be
{x: x**2 for x in range(n)}
Or would that be a single-element dict whose key is x and value is a generator expression? :-)
According to the parentheses rule, no, because that would have to be

    {x: (x**2 for x in range(n))}

(Parentheses)-(are)-(so)-(handy)-(ly),

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
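A minimal sketch of what the parenthesized spelling produces in current Python 3: a one-element container whose single element (or value) is a generator object. The key string "key" and the variable names are assumptions made for this example.

```python
import types

n = 3

# A list with exactly one element, and that element is a generator.
one_item_list = [(x**2 for x in range(n))]

# A dict with exactly one entry; the outer x is the key, while the x inside
# the parentheses is the generator expression's own loop variable.
x = "key"
one_item_dict = {x: (x**2 for x in range(n))}

list_element_is_generator = isinstance(one_item_list[0], types.GeneratorType)
dict_value_is_generator = isinstance(one_item_dict["key"], types.GeneratorType)
```

Nothing is computed until the generators are consumed, for example with `list(one_item_list[0])`.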
[Guido]
I've checked in an update to Raymond's PEP 289 which (I hope) clarifies a lot of things, and settles the capturing of free variables.
Nice edits. I'm unclear on the meaning of the last line in detail #3, "(Loop variables may also use constructs like x[i] or x.a; this form may be deprecated.)" Does this mean that "(x.a for x in mylist)" will initially be valid but will someday break? If so, I can't imagine why. Or does it mean that the induction variable can be in that form, "(x for x.a in mylist)". Surely, this would never be allowed.
Raymond, please take this to c.l.py for feedback! Wear asbestos. :-)
Will do. Raymond Hettinger
Raymond Hettinger writes:
Does this mean that "(x.a for x in mylist)" will initially be valid but will someday break? If so, I can't imagine why. Or does it mean that the induction variable can be in that form, "(x for x.a in mylist)". Surely, this would never be allowed.
The latter. There's bound to be some seriously evil stuff out there, just waiting to pop up... ;-)

-Fred

--
Fred L. Drake, Jr. <fdrake at acm.org>
PythonLabs at Zope Corporation
I'm unclear on the meaning of the last line in detail #3, "(Loop variables may also use constructs like x[i] or x.a; this form may be deprecated.)"
Does this mean that "(x.a for x in mylist)" will initially be valid but will someday break?
No, I meant that "for x.a in mylist: ..." is valid but shouldn't be, and consequently (because they all share the same syntax) this is also allowed in list comprehensions and generator expressions. All uses should be disallowed.
If so, I can't imagine why. Or does it mean that the induction variable can be in that form, "(x for x.a in mylist)". Surely, this would never be allowed.
We can prevent it for generator expressions, but it's too late for list comprehensions and regular for loops -- we'll have to go deprecate it there. --Guido van Rossum (home page: http://www.python.org/~guido/)
Guido> No, I meant that "for x.a in mylist: ..." is valid but shouldn't
Guido> be,

Valid? I'll buy that, but it had never occurred to me. Useful? That's not immediately obvious:

    >>> class Foo:
    ...     def __init__(self):
    ...         self.a = 42
    ...
    >>> lst = [Foo() for i in range(4)]
    >>> lst
    [<__main__.Foo instance at 0x752760>, <__main__.Foo instance at 0x7529e0>, <__main__.Foo instance at 0x752df0>, <__main__.Foo instance at 0x752dc8>]
    >>> [x for x.a in lst]
    [Type help() for interactive help, or help(object) for help about object., Type help() for interactive help, or help(object) for help about object., Type help() for interactive help, or help(object) for help about object., Type help() for interactive help, or help(object) for help about object.]

Skip
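For anyone who wants to reproduce the effect in a current Python 3 interpreter, here is a hedged sketch of an attribute used as the loop target. The two function names are invented for this example; the `Foo` class mirrors the one in the session above.

```python
class Foo:
    def __init__(self):
        self.a = 42


def loop_with_attribute_target(holder, items):
    """Each iteration assigns the next item to holder.a."""
    seen = []
    for holder.a in items:
        seen.append(holder.a)
    return seen


def genexp_with_attribute_target(holder, items):
    """The same target form inside a generator expression."""
    return list(holder.a for holder.a in items)


h = Foo()
loop_result = loop_with_attribute_target(h, [1, 2, 3])
after_loop = h.a          # the attribute keeps the last item assigned
genexp_result = genexp_with_attribute_target(h, "xyz")
after_genexp = h.a
```

In both cases the loop never binds a plain name; it only overwrites `holder.a`, which is why the expression part of such a comprehension has to get its values from somewhere else or from the attribute itself.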
[Skip Montanaro]
Valid? I'll buy that, but it had never occurred to me
It had not occurred to me either. A moment's reflection on the implementation reveals that any lvalue will work, even a[:]. Rather than twist ourselves into knots trying to find ways to disallow it, I think it should be left in the realm of things that never occur to anyone and have never been a real problem.

don't-ask-don't-tell-ly yours,

Raymond
FYI, some of the implementations of the backtracking conjoin() operator in test_generators.py make heavy use of

    for values[i] in gs[i]():

style for-loops. That style is often useful when generating vectors representing combinatorial objects. I could live without it, but so far haven't needed to prove that <wink>.
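A simplified sketch of that style, written for current Python 3. This is not the code from test_generators.py: the name `conjoin_sketch` is made up, it uses plain recursion, and it yields a tuple copy of the shared vector at each step, all of which are assumptions of this example.

```python
def conjoin_sketch(gs):
    """Yield every combination of one value from each generator factory.

    `gs` is a list of zero-argument callables, each returning an iterable.
    """
    values = [None] * len(gs)

    def gen(i):
        if i == len(gs):
            # All slots are filled; hand out a snapshot of the vector.
            yield tuple(values)
        else:
            # The loop target is a subscript: each value from generator i
            # is stored directly into slot i of the shared vector.
            for values[i] in gs[i]():
                yield from gen(i + 1)

    return gen(0)


combos = list(conjoin_sketch([lambda: "ab", lambda: range(2)]))
```

Writing straight into `values[i]` avoids a separate assignment statement in the loop body, which is the convenience the subscript target buys.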
[Guido]
No, I meant that "for x.a in mylist: ..." is valid but shouldn't be, and consequently (because they all share the same syntax) this is also allowed in list comprehensions and generator expressions. All uses should be disallowed.
If so, I can't imagine why. Or does it mean that the induction variable can be in that form, "(x for x.a in mylist)". Surely, this would never be allowed.
We can prevent it for generator expressions, but it's too late for list comprehensions and regular for loops -- we'll have to go deprecate it there.
Since the issue is not unique to generator expressions, I recommend leaving it out of the PEP and separately dealing with all for-constructs at one time. It's harder to win support for proposals that use the word "deprecate". Raymond Hettinger
I've checked in an update to Raymond's PEP 289 which (I hope) clarifies a lot of things, and settles the capturing of free variables.
I had another early-morning idea about how to deal with the free variable issue, which could also be used when you have another form of closure (lambda, def) and you want to capture some of its free variables.

Suppose there were a special form of assignment

    new x = expr

If x is not used in any nested scope, this is the same as a regular assignment. But if it is, and consequently x is kept in a cell, instead of replacing the contents of the cell, this creates a *new* cell which replaces the previous one in the current scope. But any previously created closure will still be holding on to the old cell with its old value.

If you do this in a loop, you will end up with a series of incarnations of the variable, each of which lives in its own little scope.

Using this, Tim's pipeline example would become

    pipe = source
    for new p in predicates:
        new pipe = e for e in pipe if p(e)

For generator expressions, Tim's idea of just always capturing the free variables is probably better, since it doesn't require recognising a subtle problem and then applying a furtherly-subtle solution.

But it seemed like a stunningly brilliant idea at 3:27am this morning, so I thought I'd share it with you. :-)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz         +--------------------------------------+
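To make the free-variable problem concrete, here is a hedged sketch of the pipeline example in current Python 3. The `new` form is not part of Python today, so the second function gets a fresh binding per stage from an ordinary helper function instead. The names `naive_pipeline`, `bound_pipeline`, and `stage` are invented for this example, and the behaviour shown is that of current Python 3, where only the outermost iterable of a generator expression is evaluated immediately.

```python
def naive_pipeline(source, predicates):
    pipe = source
    for p in predicates:
        # `pipe` is evaluated right away, but `p` is looked up only when
        # the generator runs, by which time the loop has finished.
        pipe = (e for e in pipe if p(e))
    return pipe


def bound_pipeline(source, predicates):
    def stage(upstream, predicate):
        # Each call creates a new scope, so every stage keeps its own predicate.
        return (e for e in upstream if predicate(e))

    pipe = source
    for p in predicates:
        pipe = stage(pipe, p)
    return pipe


predicates = [lambda e: e % 2 == 0, lambda e: e % 3 == 0]
naive_result = list(naive_pipeline(range(1, 13), predicates))
bound_result = list(bound_pipeline(range(1, 13), predicates))
```

In the naive version every stage ends up filtering with the last predicate, so only the multiples of 3 test is applied; the helper-function version applies both tests.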
[GvR]
Raymond, please take this to c.l.py for feedback! Wear asbestos. :-)
I'm sure there will be plenty of misunderstandings in the discussion there. If these are due to lack of detail or clarity in the PEP, feel free to update the PEP. If there are questions that need us to go back to the drawing board or requiring BDFL pronouncement, take it back to python-dev.
The asbestos wasn't needed :-)

Overall the pep is being well received. The discussion has been uncontentious and light (around 50-55 posts).

Several people initially thought that lambda should be part of the syntax, but other respondents quickly laid that to rest.

Many posters were succinctly positive: "+1" or "great idea".

One skeptical response came from someone who didn't like list comprehensions either. Alex quickly pointed out that they have been "wildly successful" for advanced users and newbies alike.

One poster counter-suggested a weird regex style syntax for embedding Perl expressions. The newsgroup was very kind and no one called him wacko :-)

There was occasional discussion about the parentheses requirement but that was quickly settled also. One idea that had some merit was to not require the outer parentheses for a single expression on the rhs of an assignment:

    g = (x**2 for x in range(10))   # maybe the outer parens are not needed

The discussion is winding down and there are no unresolved questions.

Raymond Hettinger
At 04:13 PM 10/27/03 -0500, Raymond Hettinger wrote:
There was occasional discussion about the parentheses requirement but that was quickly settled also. One idea that had some merit was to not require the outer parentheses for a single expression on the rhs of an assignment:
g = (x**2 for x in range(10)) # maybe the outer parens are not needed
FWIW, I think the parentheses add clarity over e.g.

    g = x**2 for x in range(10)

As this latter formulation looks to me like g will equal 81 after the statement is executed.
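A short sketch of the two readings, checked against current Python 3, where the parentheses ended up being required. The helper name `compiles` is hypothetical.

```python
def compiles(source):
    """Return True if `source` is accepted by the compiler (hypothetical helper)."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# The unparenthesized assignment is rejected outright.
bare_is_valid = compiles("g = x**2 for x in range(10)")

# With parentheses, g is a lazy generator, not a number.
g = (x**2 for x in range(10))
first_value = next(g)

# The "g will equal 81" reading corresponds to an ordinary for loop.
for x in range(10):
    last = x**2
```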
Overall the pep is being well received. The discussion has been uncontentious and light (around 50-55 posts).
Great!
There was occasional discussion about the parentheses requirement but that was quickly settled also. One idea that had some merit was to not require the outer parentheses for a single expression on the rhs of an assignment:
g = (x**2 for x in range(10)) # maybe the outer parens are not needed
I really think they should be required. The 'for' keyword feels like it has a lower "priority" than the assignment operator. --Guido van Rossum (home page: http://www.python.org/~guido/)
On Tuesday 28 October 2003 12:10 am, Guido van Rossum wrote:
Overall the pep is being well received. The discussion has been uncontentious and light (around 50-55 posts).
Great!
There was occasional discussion about the parentheses requirement but that was quickly settled also. One idea that had some merit was to not require the outer parentheses for a single expression on the rhs of an assignment:
g = (x**2 for x in range(10)) # maybe the outer parens are not needed
I really think they should be required. The 'for' keyword feels like it has a lower "priority" than the assignment operator.
I entirely agree with Guido: the assignment looks _much_ better to me WITH the parentheses around the RHS. Alex
On Monday 27 October 2003 10:13 pm, Raymond Hettinger wrote: ...
Several people initially thought that lambda should be part of the
yield was repeatedly mentioned, and I don't recall lambda being, so maybe this is a typo.
syntax, but other respondents quickly laid that to rest.
Yes, consensus clearly converged on the proposed syntax (the mention of "generators" in the construct's name was the part that I think prompted the desire for 'yield' -- had they been called "iterator expressions" I suspect nobody would have missed 'yield' even transiently:-).
One poster counter-suggested a weird regex style syntax for embedding Perl expressions. The newsgroup was very kind and no one called him wacko :-)
...though I did say "if you want Perl, you know where to find it"...:-)
The discussion is winding down and there are no unresolved questions.
Yes, fair summary. The one persistent (but low-as-a-whisper) grumbling is by one A.M., who keeps mumbling "they're _iterator_ expressions, the fact that they use generators is an implementation detail, grmbl grmbl":-). But then, he IS one of those pesky must-always-have-SOME-whine types. Alex
On Tue, Oct 28, 2003 at 12:19:29AM +0100, Alex Martelli wrote:
The one persistent (but low-as-a-whisper) grumbling is by one A.M., who keeps mumbling "they're _iterator_ expressions, the fact that they use generators is an implementation detail, grmbl grmbl":-).
I'm inclined to agree with him. Was there some reason why the term iterator expressions was rejected? Neil
The one persistent (but low-as-a-whisper) grumbling is by one A.M., who keeps mumbling "they're _iterator_ expressions, the fact that they use generators is an implementation detail, grmbl grmbl":-).
I'm inclined to agree with him. Was there some reason why the term iterator expressions was rejected?
After seeing "iterator expressions" I came up with "generator expressions" and decided I liked that better. Around the same time Tim Peters wrote a post where he proposed "generator expressions" independently: http://mail.python.org/pipermail/python-dev/2003-October/039186.html

Trying to rationalize my own gut preference, I think I like "generator expressions" better than "iterator expressions" because there are so many other expressions that yield iterators (e.g. iter(x) comes to mind :-). Just like generator functions are one specific cool way of creating an iterator, generator expressions are another specific cool way, and as a bonus, they're related in terms of implementation (and that certainly reflects on corners of the semantics, so I don't think we should try to hide this as an implementation detail).

--Guido van Rossum (home page: http://www.python.org/~guido/)
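A hedged sketch of the naming argument in current Python 3: a generator function and a generator expression both produce generator objects, while iter() on a list produces a different kind of iterator. The function name `squares_function` is made up for this example.

```python
import types


def squares_function(n):
    """A generator function: one way of creating an iterator."""
    for x in range(n):
        yield x**2


from_function = squares_function(4)
from_expression = (x**2 for x in range(4))
from_iter = iter([0, 1, 4, 9])

# The first two share an implementation type; the third does not.
function_is_generator = isinstance(from_function, types.GeneratorType)
expression_is_generator = isinstance(from_expression, types.GeneratorType)
iter_is_generator = isinstance(from_iter, types.GeneratorType)

# Generator-only methods such as send() exist on the expression's result too.
expression_has_send = hasattr(from_expression, "send")
iter_has_send = hasattr(from_iter, "send")
```

All three objects are iterators and produce the same values; the difference only shows up in the generator-specific parts of the interface.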
After seeing "iterator expressions" I came up with "generator expressions" and decided I liked that better. Around the same time Tim Peters wrote a post where he proposed "generator expressions" independently:
http://mail.python.org/pipermail/python-dev/2003-October/039186.html
Trying to rationalize my own gut preference, I think I like "generator expressions" better than "iterator expressions" because there are so many other expressions that yield iterators (e.g. iter(x) comes to mind :-). Just like generator functions are one specific cool way of creating an iterator, generator expressions are another specific cool way, and as a bonus, they're related in terms of implementation (and that certainly reflects on corners of the semantics, so I don't think we should try to hide this as an implementation detail).
I'm convinced. Raymond
participants (12)
- Alex Martelli
- Andrew Koenig
- Barry Warsaw
- Fred L. Drake, Jr.
- Greg Ewing
- Guido van Rossum
- Neil Schemenauer
- Phillip J. Eby
- Raymond Hettinger
- Skip Montanaro
- Tim Peters
- Werner Schiendl