
Hello,

I recently realized the following, and wonder whether there is a reason for it that I am unable to figure out.

A list comp can both map and filter:

    [x*x for x in numbers if x%2==1]

When it only maps, we can simply get rid of the filter part:

    [x*x for x in numbers]

Meaning we are not forced to write:

    [x*x for x in numbers if True]

But the analog does not apply to filter-only list comps:

    [x in numbers if x%2==1]    # SyntaxError

Why? I find this syntax clear, actually clearer than

    [x for x in numbers if x%2==1]

which confusingly repeats the item. Would there be any parsing issue if we dropped "<expression> for" when it does nothing?

Side note: I find the core part of a comprehension to be "<item> in <collection>":

    [item in collection] would be equivalent to list(collection).
    (item in collection) would just build an iterator on collection.

From this POV, this core can be preceded by a mapping expression and/or followed by a filtering condition.

Denis
--
la vita e estrany
spir.wikidot.com
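The forms discussed in the opening message can be checked directly. This is a minimal sketch, assuming a sample list `numbers` that is not part of the thread; the proposed short form is passed to `compile()` as a string so that the script itself still runs.

```python
numbers = [1, 2, 3, 4, 5]  # sample data, assumed for illustration

# Map and filter together.
odd_squares = [x * x for x in numbers if x % 2 == 1]

# Map only: the filter clause is simply omitted.
squares = [x * x for x in numbers]

# Filter only: the item has to be repeated as the result expression.
odds = [x for x in numbers if x % 2 == 1]

# The proposed short form is rejected when the source is compiled.
try:
    compile("[x in numbers if x % 2 == 1]", "<proposal>", "eval")
except SyntaxError:
    short_form_is_valid = False
else:
    short_form_is_valid = True

print(odd_squares, squares, odds, short_form_is_valid)
```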

spir wrote:
Yes, there's a parsing problem:

"x in numbers" is a containment test.

"x in numbers if cond else whatever" is a containment test combined with a conditional expression.

"x for x in numbers" and "x for x in numbers if cond" are genexps/comprehensions (depending on the kind of brackets you put around them, if any).

"x in numbers if cond" is none of the above, but the parser can't tell it isn't meant to be the second one and hence will choke on the missing "else".

Cheers, Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
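The parse described above can be inspected with the standard `ast` module. This is a rough sketch rather than a statement about parser internals: it only shows which node types come out for each spelling, and the exact wording of the error message varies between Python versions, so it is not shown.

```python
import ast

# A containment test combined with a conditional expression parses fine.
tree = ast.parse("x in numbers if cond else whatever", mode="eval")
top_node = type(tree.body).__name__         # the conditional expression
body_node = type(tree.body.body).__name__   # the containment test inside it

# In square brackets this is a one-element list display, not a comprehension.
listed = ast.parse("[x in numbers if cond else whatever]", mode="eval")
list_node = type(listed.body).__name__

# Dropping the "else" leaves an incomplete conditional expression.
try:
    ast.parse("[x in numbers if cond]", mode="eval")
except SyntaxError:
    missing_else_rejected = True
else:
    missing_else_rejected = False

print(top_node, body_node, list_node, missing_else_rejected)
```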

Nick Coghlan wrote:
Even aside from the interpreter parsing problem, there's actually a more important human parsing problem. Without a filter call to imply iteration, we want to retain the "for" keyword so human readers can easily tell there is a loop involved in the construct.

Cheers, Nick.

On Thu, Feb 25, 2010 at 8:45 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Even aside from the interpreter parsing problem, there's actually a more important human parsing problem.
Another human parsing problem is this:

    x = [(k, v) in somedict.items() if condition(v)]

What should be stored in the result? While we can decide what it should be, it's not blindingly obvious for a reader.

-Fred
--
Fred L. Drake, Jr. <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

spir wrote:
But the analog does not apply to filter-only list comps: [x in numbers if x%2==1] # SyntaxError
If you were allowed to say that, you should also be able to say

    [x in numbers]

but this already has a meaning as an ordinary list constructor. Requiring the word 'for' to appear in an LC ensures that there is never any ambiguity.

However, it might be possible to phrase it another way, e.g.

    [x from numbers if x%2==1]

-- Greg
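The existing meaning of `[x in numbers]` is easy to see in current Python: it builds a one-element list holding the result of a membership test, and `x` must already be bound. A small sketch, with `numbers` and `x` chosen here purely for illustration:

```python
numbers = [1, 2, 3, 4, 5]  # sample data, assumed for illustration
x = 3                      # x has to exist before the expression is evaluated

# A list display containing a single boolean, not a copy of numbers.
membership = [x in numbers]
absent = [42 in numbers]

print(membership, absent)
```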

On Thu, Feb 25, 2010 at 6:40 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
[x from numbers if x%2==1]
I like this, but it has the same readability issue I noted for cases where decomposition is used:

    [(k, v) from somedict.items() if condition(v)]

-Fred

Fred Drake wrote:
I don't think that's too bad if you keep in mind that

    [x from stuff]

is a shorthand for

    [x for x in stuff]

whatever x happens to be. So your example would be equivalent to

    [(k, v) for (k, v) in somedict.items() if condition(v)]

As a bonus, the short form could probably be made more efficient, because the tuples produced by the for-loop could be put straight into the result list, instead of having to re-pack k and v into a new tuple.

-- Greg
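The long form given above runs in current Python. The sketch below uses a hypothetical `somedict` and `condition` (neither is defined in the thread) and compares the unpack-and-repack spelling with one that passes each item through untouched, which is the closest existing equivalent of the proposed `from` form.

```python
# Hypothetical sample data and predicate; neither comes from the thread.
somedict = {"a": 1, "b": 20, "c": 3, "d": 40}

def condition(v):
    return v >= 10

# Long form: unpack each item into k and v, then re-pack them into a new tuple.
repacked = [(k, v) for (k, v) in somedict.items() if condition(v)]

# Pass-through form: keep each item exactly as the iteration produced it.
as_is = [item for item in somedict.items() if condition(item[1])]

print(repacked)
print(repacked == as_is)
```

For dictionary items the two lists compare equal, since `items()` already yields plain tuples.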

Stephen J. Turnbull, 26.02.2010 04:28:
Absolutely, so I would expect this to have zero chance to make it in.
That's a good idea. Note, however, that the item extracted from the iterable is not necessarily a tuple:

    >>> l = [[1,2], [3,4]]
    >>> [ (x,y) for (x,y) in l]
    [(1, 2), (3, 4)]

So the tuple packing may or may not be required, but that can only be decided at runtime.

Stefan
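To make the runtime dependence concrete, the following sketch contrasts re-packing with passing items through unchanged when the iterable holds lists. The variable names are chosen here for illustration only.

```python
pairs = [[1, 2], [3, 4]]  # the items are lists, not tuples

# Unpacking and re-packing turns every item into a new tuple.
repacked = [(x, y) for (x, y) in pairs]

# Passing the items through keeps the original list objects.
passed_through = [item for item in pairs]

repacked_types = {type(item).__name__ for item in repacked}
passed_types = {type(item).__name__ for item in passed_through}
same_objects = all(a is b for a, b in zip(passed_through, pairs))

print(repacked_types, passed_types, same_objects)
```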

Am 26.02.2010 17:09, schrieb Boris Borcic:
Not really -- remember that there is "obvious" in the acronym. We should strive to make such changes that don't violate it. For example, the "with" statement introduced the obvious way to handle resources. The "if-else" expression introduced the obvious way to express conditionals. Etc. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

On Fri, 26 Feb 2010 22:32:02 +0100 Georg Brandl <g.brandl@gmx.net> wrote:
From that point of view, I guess the obvious way to express the absence of a mapping should not be the expression of an identity mapping (in the sense of the math identity function) ;-)

    (for) item in collection
vs
    item for item in collection

In the same way that we don't express the absence of a filter using a True filter:

    expr for item in collection
vs
    expr for item in collection if True

But sure, again, the ambiguity of "in" seems to be an obstacle for python to adopt the obvious way.

Denis

spir wrote:
But sure, again, the ambiguity of "in" seems to be an obstacle for python to adopt the obvious way.
Again, the human parsing problem is actually more important than the computer parser limitations.

The Python for loop conditions developers to think "for x in seq <do something>" (where <do something> is written out as a colon, a newline, some indentation and additional code). The genexp/comprehension syntax brings the "<do something>" to the front as "<produce this value> for x in seq".

Mapping is just the degenerate case where the value provided *is* x, but making that implicit doesn't make *anything* more obvious in the way that omitting the filter clause obviously means "unfiltered". Instead, omitting the clause makes it look like the start of a regular for loop (assuming the for keyword is kept), so a developer is going to go looking for the "do something" part of the loop and won't find it.

You're suggesting a change that doesn't make anything clearer (and, in fact, will almost certainly make reading code harder since developers will have another idiom to learn), provides a second way of writing "x for x", and doesn't really even save the person writing the code any time. What's the concrete benefit that justifies even the time we've spent discussing the matter in this thread, let alone anyone actually taking the time to code the necessary grammar changes?

Cheers, Nick.

P.S. We haven't even mentioned the fact that this runs afoul of the language change moratorium yet :)
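The correspondence between the statement form and the comprehension form can be written out side by side. A small sketch with an assumed sample sequence `seq`; the doubling is an arbitrary stand-in for "<do something>".

```python
seq = [3, 1, 4, 1, 5]  # sample data, assumed for illustration

# Statement form: "for x in seq: <do something>".
doubled_loop = []
for x in seq:
    doubled_loop.append(2 * x)

# Comprehension form: "<produce this value> for x in seq".
doubled_comp = [2 * x for x in seq]

# The case under discussion: the produced value is x itself.
copied = [x for x in seq]

print(doubled_loop == doubled_comp, copied == seq)
```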

On Fri, 26 Feb 2010 13:11:10 +1300 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
(Discussing my own shorthand proposal :-)

Sure, but there remains a lexical issue:

    [x in stuff]
    (x in stuff)

cannot be accepted because of the ambiguity with the membership test in the iterator case.

    [x from stuff]
    (x from stuff)

solves that. But then, for consistency, "from" should be extended to all comprehensions

    [x*x for x from stuff if x%2==1]

and, more importantly, to traversal loops:

    for x from stuff:

because the sense is analogous. "in" would remain only as the membership test operator. This is very improbable ;-)

The issue is the ambiguity of "in".

Denis

On Thu, 25 Feb 2010 18:35:17 -0500 Fred Drake <fdrake@acm.org> wrote:
You're right on readability. On the other hand, unpacking is a common practice in Python. Can this be misinterpreted? (I mean, is it ambiguous?)

Also note that custom iterators can generate "items" of arbitrary complexity. This point cannot be addressed by the syntax for comprehensions.

Denis

On Fri, 26 Feb 2010 12:40:22 +1300 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Yes, that's precisely the semantics it should have (consistent with my view that <item in collection> is the core of a comprehension), since there is neither a mapping nor a filter.
Requiring the word 'for' to appear in an LC ensures that there is never any ambiguity.
Yes, but the issue is rather with (x in numbers) being a test. (--> ambiguity of "in")
Wow, I like that. Better term than "in", in my opinion. (It also solves the ambiguity of "in", but then consistency would require changing "for x in numbers", which has about a -1% chance of happening, I guess ;-)

Denis

Stephen J. Turnbull wrote:
I don't think it's as straightforward as all that. You can't do it at compile time, because even if the target and result expression are textually identical, the repacked item isn't necessarily the same type as the one produced by the iteration. -- Greg
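One way to see the type mismatch is with a tuple subclass. The sketch below uses `collections.namedtuple` as an assumed example (the thread does not name a specific type): the target and the result expression are textually identical, yet the re-packed items are plain tuples.

```python
from collections import namedtuple

# Hypothetical item type, chosen here only to illustrate the point.
Point = namedtuple("Point", ["x", "y"])
points = [Point(1, 2), Point(3, 4)]

# Target "(x, y)" and result "(x, y)" are textually identical...
repacked = [(x, y) for (x, y) in points]

# ...but only the pass-through spelling preserves the item type.
kept = [p for p in points]

repacked_type = type(repacked[0]).__name__
kept_type = type(kept[0]).__name__

print(repacked_type, kept_type, repacked == kept)
```

The two lists still compare equal because a namedtuple compares like an ordinary tuple, so the difference only shows up when the type or the identity of the items matters.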

participants (8)
- Boris Borcic
- Fred Drake
- Georg Brandl
- Greg Ewing
- Nick Coghlan
- spir
- Stefan Behnel
- Stephen J. Turnbull