PEP 289: Generator Expressions (please comment)

Alex Martelli aleax at aleax.it
Tue Oct 28 08:32:13 EST 2003


Werner Schiendl wrote:

> Alex Martelli wrote:
> 
>> No, but, in all cases where the lack of parentheses would make the
>> syntax unclear, you do need parentheses even today to denote tuples.
>> Cf.:
>>     [1,2,3]
>> versus
>>     [(1,2,3)]
>> a list of three items, versus a list of one tuple.
> 
> ok, but that's the whole point (1,2,3) *is* a tuple.

It's a tuple in parentheses.

The parentheses are only necessary when the syntax would otherwise
be unclear, and THAT is the whole point.

>>> x = 1, 2, 3
>>> type(x)
<type 'tuple'>

See?  "Look ma, no parentheses", because none are needed in the
assignment to x -- the tuple, by itself, is parentheses-less,
though people often (understandably) put redundant parentheses
around it anyway, just in case (so do repr and str when you
apply them to a tuple object, again for clarity).

Normally, tuples are indicated by commas; but when commas could
mean something else (list displays, function calls), then you 
ALSO put parentheses around the tuple and thus clarify things
(you also must use parentheses to indicate an EMPTY tuple,
because there is no comma involved in that case).
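
To make the rule above concrete, here is a minimal sketch, assuming a
modern Python 3 interpreter (the session shown earlier is from a 2.x
one).  The helper count_args is a hypothetical name introduced only
for illustration.

```python
# Commas, not parentheses, are what make a tuple.
a = 1, 2, 3
b = (1, 2, 3)              # redundant parentheses, same value
same = (a == b)

# repr adds parentheses for clarity when showing a tuple.
shown = repr(a)            # '(1, 2, 3)'

# In a list display the commas separate list items, so a tuple
# item needs its own parentheses.
three_items = [1, 2, 3]    # a list of three numbers
one_tuple = [(1, 2, 3)]    # a list of one tuple

# In a function call the commas separate arguments.
def count_args(*args):     # hypothetical helper, illustration only
    return len(args)

n_three = count_args(1, 2, 3)      # three arguments
n_one = count_args((1, 2, 3))      # one argument, a tuple

# The empty tuple has no comma at all, so parentheses are required.
empty = ()
```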


> if you want a 1-tuple in a list, you've to write
> 
>    [(1,)]

Yes, and if you write [1,] WITHOUT the parentheses, that's
a list of one number, NOT a list of one tuple -- exactly my
point.

> With "similar to tuples" I meant the kind of indication that I want
> a list with exactly one item that is a generator.

So you parenthesize the one item to disambiguate, just like you
would parenthesize the one item to disambiguate if the one item
was a tuple such as
    1,
rather than a genexp such as
    x*x for x in foo

I.e., the proposed syntax, requiring parentheses, IS strictly
similar to the identical requirement for parentheses in tuples
(when either a tuple or a genexp happens to be the one item
in a list display).
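
The parallel can be sketched as follows, assuming a Python version
that has generator expressions as PEP 289 proposes them (2.4 or
later; the code below is written for Python 3).  The list foo is
hypothetical sample data.

```python
import types

foo = [1, 2, 3]            # hypothetical sample data

# One-item tuple inside a list display: parentheses disambiguate.
one_number = [1,]          # a list of one number
one_tuple = [(1,)]         # a list of one tuple

# Same idea for a genexp inside a list display.
squares = [x*x for x in foo]         # a list comprehension
one_genexp = [(x*x for x in foo)]    # a list of one generator

is_generator = isinstance(one_genexp[0], types.GeneratorType)
```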

To assign a one-item tuple to x you can write

   x = 23,

the comma is what makes the tuple, and parentheses have nothing 
to do with the case.  You can of course choose to add redundant 
parentheses, if you think they improve readability, just like
you can generally include redundant parentheses in just about
all kinds of expressions for exactly the same reason (e.g. you
fear a reader would not feel secure about operator priorities).

The difference between tuples and the proposed syntax for
genexps is that genexps are going to _require_ parentheses
around them -- unless the parentheses already "happen" to be
there, i.e., in the frequent case in which the genexp is the
only argument to a function or type call.  Basically, that's
because, even though the _compiler_ would have no problems,
Guido judges that a _human reader_ might have trouble parsing

    x = y for y in foo

as meaning

    x = (y for y in foo)

and NOT the incorrect

    (x = y) for y in foo

So the parentheses are mandated to make it clear beyond any
doubt that the former, NOT the latter, parse is meant.
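
A minimal sketch of how this rule behaves in Python as it eventually
shipped, assuming a Python 3 interpreter; foo is hypothetical sample
data, and compile() is used only so that the rejected form can be
shown without breaking the example itself.

```python
foo = [1, 2, 3]            # hypothetical sample data

# Sole argument of a call: the call's own parentheses are enough.
total = sum(x*x for x in foo)

# Assignment: the genexp needs explicit parentheses.
g = (x*x for x in foo)

# The unparenthesized assignment is rejected as a syntax error.
try:
    compile("g = x*x for x in foo", "<sketch>", "exec")
    unparenthesized_ok = True
except SyntaxError:
    unparenthesized_ok = False

# When the genexp is NOT the only argument, it needs its own
# parentheses again.
total_offset = sum((x*x for x in foo), 10)
```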


> The trailing comma rule *is* there in the language already.

A trailing comma is allowed in any "comma-separated list".
A tuple display is a comma-separated list.
Therefore, a tuple display allows a trailing comma.

[A SINGLETON tuple requires a trailing comma because commas
are what DEFINES that a tuple display is occurring -- except
in certain contexts such as list displays and function calls,
where the commas have another meaning, and for an empty
tuple, where commas are -- perhaps arbitrarily -- disallowed]

A genexp is not a comma-separated list, and has nothing to do
with comma-separated lists; therefore it would be entirely
arbitrary and capricious to inject commas in its syntax AT ALL.
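
For reference, a short sketch of where trailing commas are accepted,
assuming a Python 3 interpreter; the function first is a hypothetical
name used only for illustration.

```python
# A trailing comma is accepted in comma-separated lists.
items = [1, 2, 3,]
pair = (1, 2,)
mapping = {"a": 1, "b": 2,}

def first(a, b,):          # hypothetical helper, illustration only
    return a

result = first(1, 2,)

# For a singleton tuple the comma is not optional: it is what
# makes the tuple in the first place.
single = 23,
not_a_tuple = (23)         # just the number 23 in parentheses

# An empty tuple takes no comma at all: "(,)" is a syntax error.
try:
    compile("x = (,)", "<sketch>", "exec")
    lone_comma_ok = True
except SyntaxError:
    lone_comma_ok = False
```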


> Still I do not like the following example from the specification:
> 
> g = (x**2 for x in range(10))
> 
> Whatfor are the parens?

See above: to avoid humans misparsing the unparenthesized form
as the incorrect (g = x**2) for ...


> To me this looks like requiring parens for
> 
> a = b + c

If there was any risk of humans misparsing this as the
incorrect (a = b) + c -- then parentheses around the RHS
might be prudent.  Fortunately, in this case there is no
such risk.  Generator expressions are novel enough that
Guido judges the risk would exist in their case, and most
particularly since assigning a genexp to a variable will be
extremely rare usage (could you point to a few compelling
use cases for it...?) -- therefore it's absolutely NOT a
bad idea to force the author of code for that rare case to
type a couple of extra characters to make his intentions
absolutely crystal-clear to human readers, who can easily
be unfamiliar with that rare case and will thus be helped.

The "a = b + c" case is exactly the opposite: it IS very
common, therefore [1] typical readers don't need help
parsing it because it's familiar to them AND [2] forcing
parentheses in a very common case would be cumbersome
(while it is not to force them in a very rare case).


Alex




