About list comprehension syntax
Hi

List comprehensions (and generator expressions) come in two 'flavours' at the moment:

(1) [f(x) for x in L], which stands for map(f, L). Let's call this a 'map comprehension'.

(2) [f(x) for x in L if p(x)], which stands for map(f, filter(p, L)). Let's call this a 'map-filter comprehension'.

Now if one wants to write simply filter(p, L) as a list comprehension, one has to write:

(3) [x for x in L if p(x)]. This could be called a 'filter comprehension'.

The 'x for x in L' is not very nice IMHO, but it is often handy to use such expressions over 'filter(...)', e.g. building the sublist of a given list consisting of all the items of a given type could be written as:

    filter(lambda x: isinstance(x, FilteringType), heterogeneous_list)

or:

    [x for x in heterogeneous_list if isinstance(x, FilteringType)]

I still prefer the list comprehension over the lambda/filter combination, but neither feels very satisfying (to me :) (note that one cannot use partial in the filter version).

Why not just drop the 'x for' at the start of a 'filter comprehension' (or generator expression)? Thus (3) could be written more simply as:

(3') [x in L if p(x)]

This is consistent with common mathematical notation:

* { f(x) | x \in L } means the set of all f(x) for x in L
* { f(x) | x \in L, p(x) } means the set of all f(x) for x in L satisfying predicate p.
* { x \in L | p(x) } means the set of all x in L satisfying predicate p.

-- 
Arnaud

"Arnaud Delobelle" <arno@marooned.org.uk> wrote in message news:8C1BDF74-1DAB-4F64-A28E-16788C48AA95@marooned.org.uk...

| Hi
|
| List comprehensions (and generator expressions) come in two
| 'flavours' at the moment:

Actually, you can have 1 to many for clauses and 0 to many if clauses.

| (1) [f(x) for x in L], which stands for map(f, L). Let's call this a
| 'map comprehension'
|
| (2) [f(x) for x in L if p(x)], which stands for map(f, filter(p, L)).
| Let's call this a 'map-filter comprehension'.
|
| Now if one wants to write simply filter(p, L) as a list
| comprehension, one has to write:
|
| (3) [x for x in L if p(x)]. This could be called a 'filter
| comprehension'.
|
| the 'x for x in L' is not very nice IMHO, but it is often handy to
| use such expressions over 'filter(...)', eg building the sublist of a
| given list consisting of all the items of a given type could be
| written as:
|
| filter(lambda x: isinstance(x, FilteringType), heterogeneous_list)
|
| or:
|
| [x for x in heterogeneous_list if isinstance(x, FilteringType)]
|
| I still prefer the list comprehension over the lambda/filter
| combination, but neither feels very satisfying (to me :) (note that
| one cannot use partial in the filter version)
|
| Why not just drop the 'x for' at the start of a 'filter
| comprehension' (or generator expression)?

Because such micro abbreviations are against the spirit of Python, which is designed for readability over writability. Even for a writer, it might take as much time to mentally deal with the exception as to simply type 'for x', which takes all of a second. Also, this breaks the mapping between for/if statements and clauses and makes the code ambiguous for both humans and the parser.

| Thus (3) could be written more simply as:
|
| (3') [x in L if p(x)]

(x in L) is a legal expression already. (x in L) if p(x) looks like the beginning of (x in L) if p(x) else 'blah'. The whole thing looks like a list literal with one incompletely specified element.

| This is consistent with common mathematical notation:

'Common mathematical notation' is not codified and varies from writer to writer and even within the work of one writer. Humans make do and make guesses, but parser programs are less flexible.

| * { f(x) | x \in L } means the set of all f(x) for x in L
| * { f(x) | x \in L, p(x) } means the set of all f(x) for x in L
| satisfying predicate p.
| * { x \in L | p(x) } means the set of all x in L satisfying predicate p.

I personally do not like the inconsistency of the last form, which flips '\in L' over the bar just because f(x) is the identity function. It would be OK though in a situation where that was the only set comprehension being used. But that is not the case with Python.

Terry Jan Reedy
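Two of the points in the reply above can be tried directly: a comprehension may carry several for and if clauses that mirror nested statements in the same order, and `x in L` on its own is already a membership test. The following is only a sketch; the lists `L` and `M` and their values are made up for illustration and do not come from the thread.

```python
# Hypothetical sample data, chosen only for illustration.
L = [1, 2, 3]
M = [10, 20]

# Two for clauses and two if clauses in a single comprehension.
pairs = [(x, y) for x in L if x != 2 for y in M if y > 10]

# The same clauses, in the same order, written as nested statements.
pairs_loop = []
for x in L:
    if x != 2:
        for y in M:
            if y > 10:
                pairs_loop.append((x, y))

# `x in L` is an ordinary membership test, so this list display
# holds exactly one element, a bool.
x = 2
membership = [x in L]

print(pairs)
print(membership)
```

Read left to right, the clauses of the comprehension nest in the same order as the statements, which is the mapping the reply says the proposed form would break.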
On 30 May 2007, at 17:30, Terry Reedy wrote:
"Arnaud Delobelle" <arno@marooned.org.uk> wrote in message news:8C1BDF74-1DAB-4F64-A28E-16788C48AA95@marooned.org.uk...
| Hi
|
| List comprehensions (and generator expressions) come in two
| 'flavours' at the moment:
Actually, you can have 1 to many for clauses and 0 to many if clauses.
That's true. I use that very seldom in fact. [...]
| Now if one wants to write simply filter(p, L) as a list
| comprehension, one has to write:
|
| (3) [x for x in L if p(x)]. This could be called a 'filter
| comprehension'.
[...]
| Why not just drop the 'x for' at the start of a 'filter
| comprehension' (or generator expression)?
Because such micro abbreviations are against the spirit of Python, which is designed for readability over writability. Even for a writer, it might take as much time to mentally deal with the exception as to simply type 'for x', which takes all of a second.
I wasn't suggesting this to save myself from typing 5 characters. You'll find it strange but I actually find [x in L if p(x)] more readable than [x for x in L if p(x)]. To me it says that I'm filtering, not mapping.
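For readers who want to check the three numbered forms under discussion against map and filter, here is a minimal sketch. The names `f`, `p`, `L`, `FilteringType` and `heterogeneous_list` follow the thread, but their definitions and sample values are assumptions made for illustration. In Python 3, map and filter return iterators, so the sketch wraps them in list(); in the Python 2.5 of this thread they returned lists directly.

```python
# Hypothetical sample data and helpers; only the names come from the thread.
L = [1, 2, 3, 4, 5, 6]

def f(x):
    return x * x

def p(x):
    return x % 2 == 0

# (1) 'map comprehension'
map_form = [f(x) for x in L]
# (2) 'map-filter comprehension'
map_filter_form = [f(x) for x in L if p(x)]
# (3) 'filter comprehension'
filter_form = [x for x in L if p(x)]

# The type-filtering example, with str standing in for FilteringType.
FilteringType = str
heterogeneous_list = [1, "a", 2.0, "b", 3]
via_filter = list(filter(lambda x: isinstance(x, FilteringType), heterogeneous_list))
via_comprehension = [x for x in heterogeneous_list if isinstance(x, FilteringType)]

print(map_form, map_filter_form, filter_form)
print(via_filter, via_comprehension)
```

Only form (3) repeats the loop variable as the result expression, which is the 'x for x' that the proposal wants to drop.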
Also, this breaks the mapping between for/if statements and clauses and makes the code ambiguous for both humans and the parser
By ambiguous do you mean 'difficult to parse'? I didn't think it was ambiguous in the technical sense.
| Thus (3) could be written more simply as:
|
| (3') [x in L if p(x)]
(x in L) is a legal expression already. (x in L) if p(x) looks like the beginning of (x in L) if p(x) else 'blah' . The whole thing looks like a list literal with an incompletely specified one element.
I'm not sure I understand. I agree that

    x if (y in L if p(y)) else z

doesn't look great. Neither does

    x if (y for y in L if p(y)) else z

Well, the 'for' in the second one is a bit of a hint, I suppose. I wouldn't write either anyway. Most of the time when I write a list comprehension / generator expression it is to bind it to a name.
| This is consistent with common mathematical notation:
'Common mathematical notation' is not codified and varies from writer to writer and even within the work of one writer. Humans make do and make guesses, but parser programs are less flexible.
Yet all modern mathematicians will understand the three forms without any hesitation and 'making guesses' (consciously at least).
| * { f(x) | x \in L } means the set of all f(x) for x in L
| * { f(x) | x \in L, p(x) } means the set of all f(x) for x in L
| satisfying predicate p.
| * { x \in L | p(x) } means the set of all x in L satisfying predicate p.
I personally do not like the inconsistency of the last form, which flips '\in L' over the bar just because f(x) is the identity function.
In fact the last form is 'the consistent one', as the first two should really be written as:

* { y \in M | \exists x \in L, y=f(x) }
* { y \in M | \exists x \in L, p(x) and y=f(x) }

(M being the codomain of f) ;oP

Anyway, while I still like the idea, you've made me think about it as some sort of 'useless tinkering', which it probably is.

-- 
Arnaud
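The claim that the 'map' forms can be rewritten as filters over the codomain M can be checked on a small finite case. This is a sketch under stated assumptions: the concrete `L`, `M`, `f` and `p` below are invented for illustration, and it uses set comprehension syntax, which was added to Python after this thread.

```python
# Hypothetical finite example; M is chosen large enough to contain every f(x).
L = {1, 2, 3, 4}
M = set(range(20))

def f(x):
    return x * x

def p(x):
    return x % 2 == 0

# { f(x) | x \in L }  versus  { y \in M | \exists x \in L, y = f(x) }
image = {f(x) for x in L}
image_as_filter = {y for y in M if any(y == f(x) for x in L)}

# { f(x) | x \in L, p(x) }  versus
# { y \in M | \exists x \in L, p(x) and y = f(x) }
filtered_image = {f(x) for x in L if p(x)}
filtered_image_as_filter = {y for y in M if any(p(x) and y == f(x) for x in L)}

print(image, filtered_image)
```

The two spellings agree here only because M contains every value of f on L; with a smaller M the filter forms would silently drop elements.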
Terry Reedy wrote:
"Arnaud Delobelle" ...
| Thus (3) could be written more simply as:
|
| (3') [x in L if p(x)]
(x in L) is a legal expression already.
That's the only real issue IMO - and I agree there is no acceptable solution.
(x in L) if p(x) looks like the beginning of (x in L) if p(x) else 'blah' . The whole thing looks like a list literal with an incompletely specified one element.
| This is consistent with common mathematical notation:
'Common mathematical notation' is not codified and varies from writer to writer and even within the work of one writer. Humans make do and make guesses, but parser programs are less flexible.
I guess it also depends on how much math (e.g. theorem proofs) one had to deal with. FWIW, it took me months to adapt to the correct Python listcomp/genexp syntax, after being bitten dozens of times by Python not accepting Arnaud's (3') form above. The latter was *much* more natural to my fingers.

Cheers, BB
Arnaud Delobelle <arno@marooned.org.uk> wrote:
Why not just drop the 'x for' at the start of a 'filter comprehension' (or generator expression)? Thus (3) could be written more simply as:
Explicit is better than implicit.
Special cases aren't special enough to break the rules.
There should be one-- and preferably only one --obvious way to do it.

 - Josiah
Arnaud Delobelle wrote:
Why not just drop the 'x for' at the start of a 'filter comprehension' (or generator expression)? Thus (3) could be written more simply as:
(3') [x in L if p(x)]
It would be very nice, but could be difficult to parse, because there's no clue you're not looking at a normal list constructor until you get to the 'if'.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing@canterbury.ac.nz        +--------------------------------------+
Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Arnaud Delobelle wrote:
Why not just drop the 'x for' at the start of a 'filter comprehension' (or generator expression)? Thus (3) could be written more simply as:
(3') [x in L if p(x)]
It would be very nice, but could be difficult to parse, because there's no clue you're not looking at a normal list constructor until you get to the 'if'.
Even then it's not terribly clear except that there's a missing 'else' clause for conditionals...

    y = [x in L if p(x) else None]

will create a list of a single value in Python 2.5.

 - Josiah
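The one-element behaviour described above is easy to reproduce. In this sketch the values of `L`, `x` and the predicate `p` are assumptions chosen for illustration; the expression itself is the one from the message.

```python
# Hypothetical data and predicate, chosen only for illustration.
L = [1, 2, 3]

def p(x):
    return x > 1

# The conditional expression binds loosest, so the element is
# (x in L) if p(x) else None, and the brackets make a one-element list.
x = 2
y_true_branch = [x in L if p(x) else None]

x = 0
y_else_branch = [x in L if p(x) else None]

print(y_true_branch, y_else_branch)
```

Nothing is filtered in either case: the list always has length one, and `x` must already be bound before the expression is evaluated.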
On Wed, 30 May 2007 12:41:51 +0200, Arnaud Delobelle <arno@marooned.org.uk> wrote:
(3') [x in L if p(x)]
I like the idea as well. It doesn't look ambiguous to me. 'if' can only appear as a statement or in conjunction with an 'else', so this expression can't mean anything else imo. Jan
Jan Kanis schrieb:
On Wed, 30 May 2007 12:41:51 +0200, Arnaud Delobelle <arno@marooned.org.uk> wrote:
(3') [x in L if p(x)]
I like the idea as well. It doesn't look ambiguous to me. 'if' can only appear as a statement or in conjunction with an 'else', so this expression can't mean anything else imo.
Tell that to the LL(1) parser ;)

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
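The parsing concern raised in the last few messages can be illustrated, though not modelled, with the standard `ast` module. The sketch below does not simulate an LL(1) parser; it only reports how a current CPython classifies three spellings that share the opening `[x`. The helper name `classify` is hypothetical.

```python
import ast

def classify(source):
    """Return the AST node type of an expression, or 'SyntaxError'."""
    try:
        node = ast.parse(source, mode="eval").body
    except SyntaxError:
        return "SyntaxError"
    return type(node).__name__

# The proposed form (3'): rejected, because 'if' without 'else'
# is not a complete conditional expression.
proposed = classify("[x in L if p(x)]")

# With an 'else', the same prefix is a list display with one element.
one_element = classify("[x in L if p(x) else None]")

# The existing filter comprehension (3).
comprehension = classify("[x for x in L if p(x)]")

print(proposed, one_element, comprehension)
```

The first two sources are identical up to the closing of `p(x)`, so a parser cannot tell a filter apart from a one-element list until it sees whether an `else` follows.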
participants (7)
- Arnaud Delobelle
- Boris Borcic
- Georg Brandl
- Greg Ewing
- Jan Kanis
- Josiah Carlson
- Terry Reedy