[Python-ideas] About list comprehension syntax

Wed May 30 19:31:55 CEST 2007

On 30 May 2007, at 17:30, Terry Reedy wrote:

>
> "Arnaud Delobelle" <arno at marooned.org.uk> wrote in
> message news:8C1BDF74-1DAB-4F64-A28E-16788C48AA95 at marooned.org.uk...
> | Hi
> |
> | List comprehensions (and generator expressions) come in two
> | 'flavours' at the moment:
>
> Actually, you can have 1 to many for clauses and 0 to many if clauses.

That's true.  I use that very seldom in fact.
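
For reference, a minimal sketch of that general form: two for clauses and two if clauses in one comprehension, next to the equivalent nested statements. The names rows, pairs and pairs_loop are made up for the illustration.

```python
# Hypothetical data, used only for this illustration.
rows = [[1, 2, 3], [], [4, 5, 6]]

# Two 'for' clauses and two 'if' clauses in a single comprehension.
pairs = [(i, x) for i, row in enumerate(rows) if row for x in row if x % 2 == 0]

# The same thing as nested for/if statements, clause by clause, left to right.
pairs_loop = []
for i, row in enumerate(rows):
    if row:
        for x in row:
            if x % 2 == 0:
                pairs_loop.append((i, x))

assert pairs == pairs_loop
```

Each clause in the comprehension maps onto one statement of the nested version, in the same order.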
[...]
> |
> | Now if one wants to write simply filter(p, L) as a list
> | comprehension, one has to write:
> |
> | (3) [x for x in L if p(x)].  This could be called a 'filter
> | comprehension'.
[...]
> | Why not just drop the 'x for' at the start of a 'filter
> | comprehension' (or generator expression)?
>
> Because such micro abbreviations are against the spirit of Python, which is
> designed for readability over writability.  Even for a writer, it might
> take as much time to mentally deal with the exception as to simply type
> 'for x', which takes all of a second.

I wasn't suggesting this to save myself from typing 5 characters.   
You'll find it strange but I actually find [x in L if p(x)] more  
readable than [x for x in L if p(x)].  To me it says that I'm  
filtering, not mapping.
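
To be concrete about what (3) stands for, here is a small sketch comparing filter(p, L) with the 'filter comprehension' as it has to be spelled today. The predicate p and the list L are hypothetical placeholders; the list() call is there because filter() returns an iterator in Python 3.

```python
def p(x):
    # Hypothetical predicate: keep the even numbers.
    return x % 2 == 0

L = [1, 2, 3, 4, 5, 6]

# filter() with a predicate; list() makes the result a list in Python 3 too.
filtered = list(filter(p, L))

# The 'filter comprehension' (3): the leading 'x for' is a pure identity map.
comprehension = [x for x in L if p(x)]

assert filtered == comprehension
```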

> Also, this breaks the mapping
> between for/if statements and clauses and makes the code ambiguous for both
> humans and the parser
>

By ambiguous do you mean 'difficult to parse'?  I didn't think it was  
ambiguous in the technical sense.

> |  Thus (3) could be written more simply as:
> |
> | (3') [x in L if p(x)]
>
> (x in L) is a legal expression already.  (x in L) if p(x) looks like the
> beginning of (x in L) if p(x) else 'blah'.  The whole thing looks like a
> list literal with one incompletely specified element.

I'm not sure I understand. I agree that

	x if (y in L if p(y)) else z

doesn't look great. Neither does

	x if (y for y in L if p(y)) else z

Well, the 'for' in the second one is a bit of a hint, I suppose.  I  
wouldn't write either anyway.  Most of the time when I write a list  
comprehension / generator expression it is to bind it to a name.
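
For the record, a sketch of what the existing grammar already does with these shapes, which is the overlap being pointed out. The helper one_element and the predicate p are hypothetical names for the example; the only claim is about current syntax, where [x in L] is a one-element list and the proposed form without 'else' is rejected.

```python
def p(v):
    # Hypothetical predicate, used only for this illustration.
    return v > 1

L = [1, 2, 3]

def one_element(x):
    # Already legal: a list literal whose single element is a
    # conditional expression wrapped around the test (x in L).
    return [x in L if p(x) else 'blah']

x = 2
membership = [x in L]   # also legal today: a one-element list holding a bool

# The proposed spelling (3') is a syntax error today, because a
# conditional expression needs its 'else' branch.
try:
    compile("[x in L if p(x)]", "<sketch>", "eval")
    proposed_is_valid_today = True
except SyntaxError:
    proposed_is_valid_today = False
```

So the parser would have to read up to the closing bracket before it could tell a filter comprehension from an unfinished conditional expression.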

> | This is consistent with common mathematical notation:
>
> 'Common mathematical notation' is not codified and varies from writer to
> writer and even within the work of one writer.  Humans make do and make
> guesses, but parser programs are less flexible.

Yet all modern mathematicians will understand the three forms without
any hesitation or 'making guesses' (consciously at least).

> | * { f(x) | x \in L } means the set of all f(x) for x in L
> | * { f(x) | x \in L, p(x) } means the set of all f(x) for x in L
> | satisfying predicate p.
> | * { x \in L | p(x) } means the set of all x in L satisfying
> | predicate p.
>
> I personally do not like the inconsistency of the last form, which flips
> '\in L' over the bar just because f(x) is the identity function.

In fact the last form is 'the consistent one', as the first two  
should really be written as:

* { y \in M | \exists x \in L, y=f(x) }
* { y \in M | \exists x \in L, p(x) and y=f(x) }

(M being the codomain of f)

;oP
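
A rough sketch of the five set-builder forms above, written with set comprehensions (which need a Python version that has them, so not the interpreters current at the time of this thread). The definitions of f, p, L and M are invented for the example, and M is only a finite stand-in for the codomain of f.

```python
# Hypothetical f, p, L and M, chosen only to keep the sets small.
def f(x):
    return x * x

def p(x):
    return x % 2 == 1

L = {1, 2, 3, 4}
M = set(range(20))  # finite stand-in for the codomain of f

image = {f(x) for x in L}                   # { f(x) | x \in L }
filtered_image = {f(x) for x in L if p(x)}  # { f(x) | x \in L, p(x) }
subset = {x for x in L if p(x)}             # { x \in L | p(x) }

# The 'consistent' spellings: pick y out of M with an existential condition.
image_exists = {y for y in M if any(y == f(x) for x in L)}
filtered_exists = {y for y in M if any(p(x) and y == f(x) for x in L)}

assert image == image_exists
assert filtered_image == filtered_exists
```

The existential versions iterate over all of M, so they only work here because M is finite and small.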

Anyway, while I still like the idea, you've made me think about it as
some sort of 'useless tinkering', which it probably is.

-- 
Arnaud