Implementing a 'but' statement in for-iterations
This is more of a syntactic sugar than an actual new feature, but... Exactly, 'but' is the idea: a special keyword to be used in for statements to exclude values from the iterable. E.g., when iterating over a generator:

    for i in range(0, 10) but (2, 8):

would implicitly create a new generator expression, as in:

    for i in (j for j in range(0, 10) if j not in [2, 8]):

The feature might not add enough to justify defining a but_stmt in python.gram, but it's fully in line with Python's philosophy of concise, clear and elegant code.

#road to a programming natural language (jk)
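A minimal sketch of the expansion described above, written as an ordinary helper that works in current Python. The name `excluding` is hypothetical (it is not part of the standard library or of the proposal); it only wraps the generator expression from the post.

```python
from typing import Container, Iterable, Iterator, TypeVar

T = TypeVar("T")


def excluding(iterable: Iterable[T], excluded: Container[T]) -> Iterator[T]:
    """Lazily yield the items of `iterable` that are not in `excluded`."""
    # Same shape as the generator expression in the post:
    # (j for j in range(0, 10) if j not in [2, 8])
    return (item for item in iterable if item not in excluded)


# Roughly what `for i in range(0, 10) but (2, 8):` would iterate over.
kept = list(excluding(range(0, 10), (2, 8)))
print(kept)  # [0, 1, 3, 4, 5, 6, 7, 9]
```

Because the helper is lazy, it also accepts iterables that can only be consumed once, such as real generators.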
On Tue, Jun 29, 2021 at 7:51 AM Max Shouman <shouman.max@gmail.com> wrote:
> [...]
Python currently has 35 keywords. To justify a 36th, there needs to be a *lot* of benefit. Instead, perhaps it'd be worth devising your own range() type which is capable of subtraction?

    _range = range

    class range:
        def __init__(self, *args, exclude=()):
            self.basis = _range(*args)
            self.exclude = tuple(exclude)
        def __sub__(self, other):
            # Rebuild from the underlying range, adding the new exclusions.
            r = self.basis
            return type(self)(r.start, r.stop, r.step,
                              exclude=self.exclude + tuple(other))
        def __iter__(self):
            for val in self.basis:
                if val not in self.exclude:
                    yield val

Then you can write this:

    for i in range(0, 10) - (2, 8):
        ...

Tada! No new keyword needed, and it reads better than "but" does ("except for" would be how English would normally write that).

ChrisA
On 6/28/21 5:40 PM, Max Shouman wrote:
> [...]
Wild idea, but could we avoid a new keyword by reusing one that can't go there, like except?

    for i in range(0,10) except (2, 8):

Don't know if it is actually worth it, but at least it doesn't add a new keyword.

-- 
Richard Damon
Another wild idea: Suppose that after a line that introduces a suite, including the final colon, you could write further lines on the same physical line, and this would be semantically equivalent to having them on separate lines with increasing indents, but a smaller indent than the following lines in the suite body. Example:

    with open('file1') as f: with open('file2') as g:
        <do stuff>

Then you could write:

    for i in range(0, 10): if i not in range(2, 8):
        <do stuff>

This is arguably slightly easier to read than having two separate lines, as it puts both aspects of a single concept ("what values of i do I loop over?") together. It also avoids a physical indentation level.

Rob Cliffe

On 29/06/2021 00:44, Richard Damon wrote:
> [...]
Umm?!

    items = (j for j in range(10) if j not in {2, 8})

We don't need a new keyword. Nor a tortured use of an old one.

On Mon, Jun 28, 2021, 8:46 PM Rob Cliffe via Python-ideas <python-ideas@python.org> wrote:
> [...]
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/RFD3U3...
Code of Conduct: http://python.org/psf/codeofconduct/
What do we gain from this? Three characters. I'm not sure how useful this would be.

On Mon, Jun 28, 2021, 9:29 PM David Mertz <mertz@gnosis.cx> wrote:
> [...]
On Tue, Jun 29, 2021 at 01:37:59AM +0100, Rob Cliffe via Python-ideas wrote:
> for i in range(0, 10): if i not in range(2, 8):
>     <do stuff>
>
> This is arguably slightly easier to read than having two separate lines,

"Arguably" is an understatement.

> as it puts both aspects of a single concept ("what values of i do I loop over?") together.

I would consider it two concepts: (1) you're looping over the values range(0, 10) and (2) skipping over values in range(2, 8).

Until we reach the bottom of the block, we can't be sure that the condition isn't

    if i not in range(2, 8):
        ...
    else:
        ...

which by the way suggests that this suggested syntax would be ambiguous:

    for i in seq: if condition:
        block
    else:
        block

Does the `else` partner the `for` or the `if`?

-- 
Steve
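To make the ambiguity concrete, here is a small sketch in current Python that spells out both readings of the one-line form with today's syntax. The function name `run` and the event labels are made up for illustration; the only point is that the two readings behave differently.

```python
def run(seq, condition):
    """Record which branches execute under each reading of the one-line form."""
    events = []

    # Reading 1: `else` pairs with the `if`, so it runs once per rejected item.
    for i in seq:
        if condition(i):
            events.append(("if-body", i))
        else:
            events.append(("if-else", i))

    # Reading 2: `else` pairs with the `for`, so it runs once, after the loop
    # finishes without hitting `break`.
    for i in seq:
        if condition(i):
            events.append(("for-body", i))
    else:
        events.append(("for-else", None))

    return events


events = run(range(3), lambda i: i != 1)
for event in events:
    print(event)
```

Under the first reading the `else` block fires for the skipped value 1; under the second it fires once at the end, regardless of how many values were skipped.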
On Tue, Jun 29, 2021 at 5:04 PM Steven D'Aprano <steve@pearwood.info> wrote:
> I would consider it two concepts: (1) you're looping over the values range(0, 10) and (2) skipping over values in range(2, 8).

What if you loop over the values in the range(0, 10) but skip all of the odd numbers? Is that two concepts too? Because we have an easy way to spell that: range(0, 10, 2). Why should a disjointed range be fundamentally different? It's one concept: looping over all the values from 0 to 10 that aren't between 2 and 8.

> Until we reach the bottom of the block, we can't be sure that the condition isn't
>
>     if i not in range(2, 8):
>         ...
>     else:
>         ...
>
> which by the way suggests that this suggested syntax would be ambiguous:

Do you have this problem with comprehensions? No. Why not? Because you can't put an "else" when you're using "if" to filter a condition. Seems pretty straightforward to me.

I do think that filtering would be a useful feature to have, but for ranges, the best way IMO is to change the thing you're iterating over. It's not that hard to devise a range-like object that supports subtraction; I whipped up a quick demo above. You could make it possible to subtract one range from another, or whatever suits your purpose, and ultimately, you'd still end up with the basic "for value in iterable:" syntax, just with a more detailed iterable.

ChrisA
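One way the "subtract one range from another" idea might look, as a sketch only. `SubtractableRange` is a hypothetical name chosen so that the builtin `range` is not shadowed, and storing each subtracted operand as a container (rather than a flat tuple of values) is an assumption made here so that whole ranges can be subtracted cheaply.

```python
class SubtractableRange:
    """Hypothetical range-like object that supports `-` (illustration only)."""

    def __init__(self, *args, exclude=()):
        self.basis = range(*args)
        # Each entry is a container of values to skip: a range, tuple, set, ...
        self.exclude = tuple(exclude)

    def __sub__(self, other):
        r = self.basis
        # Keep `other` whole so that membership tests on a range stay cheap.
        return type(self)(r.start, r.stop, r.step,
                          exclude=self.exclude + (other,))

    def __contains__(self, val):
        return val in self.basis and not any(val in s for s in self.exclude)

    def __iter__(self):
        for val in self.basis:
            if not any(val in skipped for skipped in self.exclude):
                yield val


without_middle = list(SubtractableRange(0, 10) - range(2, 8))
without_two_values = list(SubtractableRange(0, 10) - (2, 8))
chained = list(SubtractableRange(0, 10) - range(2, 8) - {9})
print(without_middle)      # [0, 1, 8, 9]
print(without_two_values)  # [0, 1, 3, 4, 5, 6, 7, 9]
print(chained)             # [0, 1, 8]
```

The loop header stays the plain "for value in iterable:" form; only the iterable becomes more detailed.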
On Tue, Jun 29, 2021 at 05:17:36PM +1000, Chris Angelico wrote:
> What if you loop over the values in the range(0, 10) but skip all of the odd numbers? Is that two concepts too?

Of course it is. Think about somebody who knows about looping, and knows what range() does, but has no concept of odd numbers. (Perhaps a very precocious child.) They would easily understand `for i in range(0, 10, 2)` but have no idea how to skip odd numbers.

If that example is too implausible for you, how about skipping the antisigma numbers?

https://oeis.org/A024816

Going back to odds and evens, you even described it as two steps:

1. loop over the values in the range(0, 10)
2. but skip all of the odd numbers.

That's two distinct steps. You are correct that it gives the same practical result as the single step:

1. loop over the values in the range(0, 10, 2)

but that's okay. There are many operations which are functionally equivalent (they give the same results) but are conceptually different:

* the number of days in a week;
* the average of 6 and 8;
* a quarter of 28;
* the fourth prime number;
* the number of Dwarves in the story of Snow White;
* the number of Deadly Sins in Christian theology;
* the number of examples I used to illustrate this.

As a programmer, you of course are perfectly entitled to change your code to a functionally equivalent but conceptually distinct operation, if you care more about the results than how you get the results. (E.g. loop unrolling changes something which is conceptually a loop into something which is conceptually *not* a loop, since they are functionally equivalent.)

> Because we have an easy way to spell that: range(0, 10, 2).

Sure. But why do we care so much about this trivial special case? Remember that the condition can be as general as we like. How about looping over numbers between 0 and a trillion whose bit count (number of 1s in binary) is a perfect number, except for those which are prime?

More practically, if you have some arbitrarily complicated iterable, and you wish to skip some of those items according to some arbitrarily complicated condition known only at runtime:

    for url in filter(
            lambda url: url not in skiplist,
            map(make_url, webspider.follow_all(depth=10000))
            ):
        if (url not in seen
                and is_image(url.filetype)
                # the walrus needs its own parentheses inside an `and` chain
                and (image := Image.read(url))
                and image.detect_faces(**params).match_any(*targets)
                ):
            process(url)

do you still think that's conceptually a single operation?

"Process the URLs I want, duh!"

*wink*

-- 
Steve
On Tue, Jun 29, 2021 at 7:15 PM Steven D'Aprano <steve@pearwood.info> wrote:
> On Tue, Jun 29, 2021 at 05:17:36PM +1000, Chris Angelico wrote:
> > What if you loop over the values in the range(0, 10) but skip all of the odd numbers? Is that two concepts too?
>
> Of course it is. Think about somebody who knows about looping, and knows what range() does, but has no concept of odd numbers. (Perhaps a very precocious child.) They would easily understand `for i in range(0, 10, 2)` but have no idea how to skip odd numbers.
>
> If that example is too implausible for you, how about skipping the antisigma numbers?
>
> Going back to odds and evens, you even described it as two steps:
>
> 1. loop over the values in the range(0, 10)
> 2. but skip all of the odd numbers.
>
> That's two distinct steps.

My point is that the *iteration* isn't any different - the *range* is what's different. A filtered range makes very good sense here. Hence, a disjoint range object (one capable of subtraction) would be an excellent solution to the OP's problem.

So why should it be fundamentally different to apply conditions? Surely that's just another form of subsetting?

> > Because we have an easy way to spell that: range(0, 10, 2).
>
> Sure. But why do we care so much about this trivial special case? Remember that the condition can be as general as we like. How about looping over numbers between 0 and a trillion whose bit count (number of 1s in binary) is a perfect number, except for those which are prime?

Exactly.

> More practically, if you have some arbitrarily complicated iterable, and you wish to skip some of those items according to some arbitrarily complicated condition known only at runtime:
>
>     for url in filter(
>             lambda url: url not in skiplist,
>             map(make_url, webspider.follow_all(depth=10000))
>             ):
>         if (url not in seen
>                 and is_image(url.filetype)
>                 and (image := Image.read(url))
>                 and image.detect_faces(**params).match_any(*targets)
>                 ):
>             process(url)
>
> do you still think that's conceptually a single operation?
>
> "Process the URLs I want, duh!"

Well, that gets down to the question of what counts as a "sentence". You shouldn't cram everything onto a single line of code just because you can, just as you shouldn't write an English sentence ten pages long.

But I don't see a problem with filtered iteration. We currently have a very clunky way of spelling it ("for url in (x for x in iterable if cond):"), and Python is perfectly happy to do this. There are plenty of times when it would make very good sense to consider the condition to be part of the iterable.

ChrisA
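For reference, a short sketch of three ways to spell filtered iteration in current Python, including the "clunky" nested generator expression mentioned above. The function names and the even-number condition are placeholders chosen for illustration; all three visit the same items.

```python
def via_nested_genexp(iterable):
    # The spelling quoted above: an extra generator just to attach the filter.
    out = []
    for x in (y for y in iterable if y % 2 == 0):
        out.append(x)
    return out


def via_filter(iterable):
    # Built-in filter() keeps the condition in the loop header as well.
    out = []
    for x in filter(lambda y: y % 2 == 0, iterable):
        out.append(x)
    return out


def via_continue(iterable):
    # The condition moves into the loop body instead of the header.
    out = []
    for x in iterable:
        if x % 2:
            continue
        out.append(x)
    return out


results = [f(range(10)) for f in (via_nested_genexp, via_filter, via_continue)]
print(results[0])  # [0, 2, 4, 6, 8]
```

The first two keep the condition with the iterable at the cost of an extra generator or callable; the third avoids that layer but places the test inside the body.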
On Tue, Jun 29, 2021 at 07:27:11PM +1000, Chris Angelico wrote:
> My point is that the *iteration* isn't any different - the *range* is what's different. A filtered range makes very good sense here. Hence, a disjoint range object (one capable of subtraction) would be an excellent solution to the OP's problem.

I don't think that the OP's problem is *specifically and only* to iterate over disjoint ranges. I think that was just an illustration of the concept.

I have no objection to having some sort of ordered integer disjoint ranges, but I don't think that will be sufficient for the OP.

[...]

> > "Process the URLs I want, duh!"
>
> Well, that gets down to the question of what counts as a "sentence". You shouldn't cram everything onto a single line of code just because you can, just as you shouldn't write an English sentence ten pages long.

Ah, you've been reading Daniel Defoe too! *wink*

Remember that list comprehensions aren't about saving a line of code. We often split comprehensions over multiple lines to make them more readable. Comprehensions are about making an *expression*.

I see no meaningful benefit to cramming a conditional test into the same line as a loop. Current syntax is fine.

    for item in iterable:
        if condition:
            block

Putting the condition on the same line as the for doesn't make it an expression. In general, saving one line and one indent is too little benefit to make up for the cost of cramming more details into a single line.

> But I don't see a problem with filtered iteration. We currently have a very clunky way of spelling it ("for url in (x for x in iterable if cond):"),

You can write it that way if you want, but why make two loops when you need only one? Isn't that doing more work than necessary?

Using an extra loop inside a generator comprehension just to get an if is, I think, an example of being so sharp you cut yourself.

-- 
Steve
On Tue, Jun 29, 2021 at 8:29 PM Steven D'Aprano <steve@pearwood.info> wrote:
> > Well, that gets down to the question of what counts as a "sentence". You shouldn't cram everything onto a single line of code just because you can, just as you shouldn't write an English sentence ten pages long.
>
> Ah, you've been reading Daniel Defoe too! *wink*

Or certain Reddit posts. Seriously guys, we don't have a shortage of punctuation, it's okay to use the stuff!

> Remember that list comprehensions aren't about saving a line of code. We often split comprehensions over multiple lines to make them more readable. Comprehensions are about making an *expression*.

Yes. And sometimes the best way to write it is one line, other times the best way is two lines. It's a question of whether it's one thought or two.

> I see no meaningful benefit to cramming a conditional test into the same line as a loop. Current syntax is fine.
>
>     for item in iterable:
>         if condition:
>             block
>
> Putting the condition on the same line as the for doesn't make it an expression. In general, saving one line and one indent is too little benefit to make up for the cost of cramming more details into a single line.

Stop thinking in terms of "saving lines", because that's not the point. It's always possible to write things on more lines, more indents, and fewer details per line. In fact, you can use dis.dis() to show you a MUCH longer form of the same code. We don't write assembly code, not because we need to use fewer lines, but because higher level constructs *better express thoughts*.

If you think in terms of "iterate over this, check this condition, do this if it's true", then the correct way to write it is the way you show: a for loop, an if statement, and the block of code.

If you think in terms of "iterate over this subset and do this", then the correct way to write it is as a single iteration with a filter.

Of course, sometimes we don't have the syntax that will let us write things the most elegant way possible. But that's why we're talking on python-ideas - we're discussing ways to make the language better, more expressive, more efficient, etc.

> > But I don't see a problem with filtered iteration. We currently have a very clunky way of spelling it ("for url in (x for x in iterable if cond):"),
>
> You can write it that way if you want, but why make two loops when you need only one? Isn't that doing more work than necessary?
>
> Using an extra loop inside a generator comprehension just to get an if is, I think, an example of being so sharp you cut yourself.

Yes - that's the problem with the current way of spelling it. It's an entire additional loop, a lot of junk in the source code ("x for x" is utterly pointless here), and in current implementations, an entire layer of function call and generator. That's why filtered iteration with its own keyword *would* be of value. If you could simply write:

    for url in iterable if cond:

then (a) it wouldn't have two layered loops, (b) it wouldn't pretend that you can attach an else clause to the if, (c) it clearly expresses the concept of "filtered iteration".

Unfortunately, it also becomes annoyingly ambiguous with "for url in iterable if cond else otheriterable:", which is perfectly legal syntax (although I can't imagine many situations where you'd want that). That was resolved for comprehensions, so maybe that wouldn't even be a problem.

My point is that filtered iteration is basically a change to the thing you're iterating over - it is NOT a change to the body of the loop. It belongs on the loop header. In some cases, like the OP's example, it could be done with an actual iterable; in other cases, less so; but it still belongs with the iterable.

ChrisA
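The ambiguity mentioned above can be checked in current Python. In the sketch below, the names `pick`, `primary` and `fallback` are placeholders: a for statement header accepts a full conditional expression as its iterable, while the same text inside a comprehension is rejected by the parser, which is the sense in which the question was already settled for comprehensions.

```python
def pick(cond):
    primary = [1, 2, 3]
    fallback = ["a", "b"]
    seen = []
    # Legal today; parsed as: for item in (primary if cond else fallback)
    for item in primary if cond else fallback:
        seen.append(item)
    return seen


# In a comprehension the `if` after the iterable starts a filter clause,
# so a following `else` is a syntax error.
try:
    compile("[x for x in a if c else b]", "<demo>", "eval")
except SyntaxError:
    comprehension_rejects_it = True
else:
    comprehension_rejects_it = False

print(pick(True))   # [1, 2, 3]
print(pick(False))  # ['a', 'b']
print(comprehension_rejects_it)  # True
```

A hypothetical "for url in iterable if cond:" statement would therefore need a rule for the `else` case, for example the one comprehensions already use.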
I think the more realistic option is to allow "if" in the for statement. This has been suggested before; you can find it in the archives.

On Mon, 28 Jun 2021, 22:50 Max Shouman, <shouman.max@gmail.com> wrote:
> [...]
participants (8)

- Chris Angelico
- David Mertz
- Henk-Jaap Wagenaar
- Johnathan Irvin
- Max Shouman
- Richard Damon
- Rob Cliffe
- Steven D'Aprano