Add "while" clauses to generator expressions

I've been using Python generators for a while now. e.g.

    a = (i for i in range(10))
    a.next()
    a.next()
    ...etc.

I also find the "if" clause handy:

    a = (i for i in range(10) if i%2==0)

(I know that range(0,10,2) will do the same thing, but it's the idea I like, especially for more complex predicates.)

I would like to know if anyone has thought of adding a "while" clause as well, like this:

    a = (i for i in range(100) while i <=50)

(Again, this could be done with range(0,51) but then the predicate can be more complicated.)

Why would this be helpful? Consider the "in ..." part of the generator. You could be referring to something that is ordered -- sorted names for example. Then you might want to stop your iterator when you reach "Morgan" so you would like to write:

    name = (n for n in names while n <= "Morgan")
    name.next()
    ...etc...

Of course,

    name = (n for n in names if n <= "Morgan")

will work, but it will look at every item in "names." Since "names" is sorted, this is a waste of time. Imagine you want to stop at "Baker". Your "if" clause will look at and discard most names in the list, assuming a normal distribution of English names.

Now, you could do the same thing with a generator function:

    def leMorgan(names):
        i = 0
        while names[i] <= "Morgan":
            yield names[i]
            i+=1

and use it like this:

    name=leMorgan(names)
    name.next()
    ...etc...

but I think that adding a while clause to the generator expression is simpler and clearer (and it keeps it all in one place!)

I know that this is functionally equivalent to the takewhile function in itertools and my motivation is the same. I just think that this could be done nicely within the context of the existing syntax.

This is also convenient when the "in ..." clause refers to another (possibly infinite) generator. For a simple example, suppose I want to run through some natural numbers. I can write an infinite generator function like this:

    def genN(n=0):
        while 1:
            yield n
            n+=1

Then, I might use this in another generator:

    p = (n for n in genN() if prime(n) while n <= 100)

to get the prime numbers under 100 (assuming I have a predicate "prime" that works as one would hope). Of course you could do this with range(101) instead of genN, but this is just an example to demonstrate the idea.

Without the "while" clause, this will not work:

    p = (n for n in genN() if prime(n) if n <= 100)

This will actually NEVER terminate, since EVERY item in genN() (which is an infinite generator) will be tested for <= 100.

So...What do others think? Is this a loony idea? Is there a better way? Also, can anyone think of a similar syntax to replicate the dropwhile function?
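For comparison, here is a minimal sketch of how the two motivating cases look with itertools.takewhile, which the post names as the existing equivalent. It uses Python 3 syntax rather than the Python 2 of the thread. The sample `names` list and the `is_prime` helper are assumptions made up for illustration (the post only assumes some working "prime" predicate), and `itertools.count` stands in for `genN`.

```python
from itertools import count, takewhile


def is_prime(n):
    """Hypothetical stand-in for the post's `prime` predicate (trial division)."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


# Sorted names (made-up sample data): iteration stops at the first name
# past "Morgan" instead of testing every remaining item.
names = ["Adams", "Baker", "Jones", "Morgan", "Smith", "Young"]
up_to_morgan = list(takewhile(lambda n: n <= "Morgan", names))

# Infinite source: takewhile ends the iteration once n exceeds 100,
# whereas a second `if n <= 100` filter would keep testing numbers forever.
primes_to_100 = [n for n in takewhile(lambda n: n <= 100, count()) if is_prime(n)]
```

Here the cut-off is applied before the primality filter; applying it afterwards should give the same primes, since the first prime above 100 ends the iteration either way.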

Yes of course. However I am advocating adding this to the syntax rather than using the takewhile function, thus making it part of the generator expression proper and avoiding the function call and module include. From the little I know about the implementation of the corresponding "if" clause, this should be relatively easy. On 1/10/09, Mathias Panzenböck <grosser.meister.morti@gmx.net> wrote:
-- Sent from my mobile device

On Sat, Jan 10, 2009 at 7:35 AM, Gerald Britton <gerald.britton@gmail.com> wrote:
I think this could end up being confusing. Current generator expressions turn into an equivalent generator function by simply indenting the clauses and adding a yield, for example:

    (i for i in range(100) if i % 2 == 0)

is equivalent to:

    def gen():
        for i in range(100):
            if i % 2 == 0:
                yield i

Now you're proposing syntax that would no longer work like this. Taking your example:

    (i for i in range(100) while i <= 50)

I would expect this to mean:

    def gen():
        for i in range(100):
            while i <= 50:
                yield i

In short, -1. You're proposing to use an existing keyword in a new way that doesn't match how generator expressions are evaluated.

Steve
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity.
    --- Bucky Katt, Get Fuzzy
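The nesting rule described in this message can be checked directly. The sketch below (Python 3; the function and variable names are illustrative, not from the thread) compares an "if" generator expression with its expanded generator function, then shows what the same mechanical expansion would do for a "while" clause: since i never changes inside the inner loop, it appears to yield the first item forever, so only a bounded slice is taken.

```python
from itertools import islice

# A generator expression with an `if` clause ...
evens_expr = (i for i in range(100) if i % 2 == 0)


# ... and the generator function obtained by nesting the clauses
# and adding a `yield`.
def evens_gen():
    for i in range(100):
        if i % 2 == 0:
            yield i


# The same nesting rule applied to the proposed `while` clause.
# The inner loop never modifies i, so it keeps yielding 0.
def literal_while_gen():
    for i in range(100):
        while i <= 50:
            yield i


same = list(evens_expr) == list(evens_gen())
first_five = list(islice(literal_while_gen(), 5))  # bounded: the generator is endless
```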

I guess I don't think it would be confusing. On the contrary, I believe that the expressions would read naturally and be a nice simplification. Of course it won't work just like "if" but that is just the point! I can (and do) accomplish the same thing with "takewhile", but if the same thing can be done with a little addition to the generator expression, why not do it? On 1/10/09, Steven Bethard <steven.bethard@gmail.com> wrote:

[Fixing the top-posting. Note that only Guido is allowed to top-post around here. ;-)] On 1/10/09, Steven Bethard <steven.bethard@gmail.com> wrote:
On Sat, Jan 10, 2009 at 2:34 PM, Gerald Britton <gerald.britton@gmail.com> wrote:
I'm probably just repeating myself here, but the reason not to do it is that the current generator expressions translate almost directly into the corresponding generator statements. Using "while" in the way you've suggested breaks this symmetry, and would make Python harder to learn.

Steve
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity.
    --- Bucky Katt, Get Fuzzy

Hmmm. I can't really see what you are saying. Your example doesn't quite get the intention. The "while" clause in the example would translate like this:

    def gen():
        for i in range(100):
            if i <= 50:
                yield i
            else:
                break

Also, I think that for a new Python user,

    (n for n in range(100) while n*n < 50)

is easier to understand and use than:

    (n for n in takewhile(lambda n:n*n < 50, range(100)))

My proposed version is shorter (less chance for typos), has one fewer set of parentheses (always a good thing) and reads naturally. Also, it is directly analogous to:

    (n for n in range(100) if n*n < 50)

except that the "while" version stops when n reaches 8. The "if" version doesn't stop until n reaches 99.

On Sat, Jan 10, 2009 at 5:45 PM, Steven Bethard <steven.bethard@gmail.com> wrote:
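A small sketch (Python 3) of the for/if/else-break translation given in this message, checked against itertools.takewhile and against the plain "if" filter. The helpers `while_clause`, `counting` and `small_square` are hypothetical names introduced only for this illustration; `counting` records how many items each version pulls from the source, which is the difference the message describes.

```python
from itertools import takewhile


def while_clause(iterable, predicate):
    """Hypothetical helper: the for/if/else-break expansion from the message."""
    for item in iterable:
        if predicate(item):
            yield item
        else:
            break


def counting(iterable, seen):
    """Yield items unchanged, recording each one pulled from the source."""
    for item in iterable:
        seen.append(item)
        yield item


def small_square(n):
    return n * n < 50


seen_while, seen_if = [], []
via_break = list(while_clause(counting(range(100), seen_while), small_square))
via_takewhile = list(takewhile(small_square, range(100)))
via_if = [n for n in counting(range(100), seen_if) if small_square(n)]
```

All three produce the numbers 0 through 7, but the break-based version reads only up to n = 8 from the source, while the "if" version reads all 100 items.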

I've got an idea:

    x = (n for n in xrange(100) if n < 22 else break)

but I don't mean that seriously (giving it -1). :P

-panzi

Gerald Britton wrote:
The other is needlessly complicated, though.

    takewhile(lambda n: n*n < 50, range(100))

is just as fine, and only 3 characters longer than the proposed while-exp.

Georg
--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

Gerald Britton wrote:
as well, like this:
a = (i for i in range(100) while i <=50)
There was a longish thread on the subject back in August '08, where I pointed out
but I was then answered that this "worked by accident" which apparently meant that 3.0 pedagogical doctrine wants [x for x...] to be strictly equivalent to list(x for x...), while this equivalence in fact doesn't extend to similar uses of StopIteration. Cheers, BB

participants (5)
- Boris Borcic
- Georg Brandl
- Gerald Britton
- Mathias Panzenböck
- Steven Bethard