[Python-ideas] Accessing the result of comprehension's expression from the conditional

Sat Jun 20 02:32:51 CEST 2009

On Sat, 20 Jun 2009 08:31:26 am Lie Ryan wrote:

> > If you want expression first, then filter, you can get that already
> > in a one-liner:
> >
> > filter(lambda x: x > 0, [f(x) for x in seq])
> >
> > Don't create new syntax when there are perfectly good functions
> > that do the job already.
>
> That's ugly because of the same reason for using map():
> [y for y in map(lambda x: f(x), seq) if y > 0]

You don't like lambda? Fine, define an external function first. Then you 
can write:

filter(pred, (f(x) for x in seq))

There's no violation of DRY, there's no redundancy, there's no lambda, 
there's no "y" variable needed. What's ugly about it?
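
As a minimal sketch of that form (f, pred and seq below are placeholders 
invented for illustration, not taken from this thread):

```python
def f(x):
    # Hypothetical transformation, chosen only for illustration.
    return x * x - 10

def pred(y):
    # Hypothetical predicate: keep positive results.
    return y > 0

seq = range(-5, 6)  # assumed sample input

# On Python 3, filter() returns a lazy iterator, so list() is used to
# materialise it; on Python 2 it already returns a list.
result = list(filter(pred, (f(x) for x in seq)))
print(result)
```

The generator expression feeds values to filter() one at a time, so no 
intermediate list is built.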

> or nested comprehension:
> [y for y in (f(x) for x in seq) if y > 0]

You seem to be labouring under the misapprehension that anything that 
requires two steps instead of one is "ugly". I find the nested 
comprehension perfectly readable, although for more complicated cases 
I'd split it into two explicit steps. It is (almost) completely 
general, covering filtering on *both* input args and output args:

gen = (3*x**2-5*x+4 for x in seq if x % 3 != 2)
result = [y for y in gen if -3 < y < 3]
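
To make the two steps concrete, here is the same pair of lines run on an 
assumed sample input (the range is an arbitrary choice for illustration), 
next to an explicit loop that should behave the same way:

```python
seq = range(-5, 6)  # assumed sample input

# Step 1: filter on the input, then compute the expression (lazily).
gen = (3*x**2 - 5*x + 4 for x in seq if x % 3 != 2)
# Step 2: filter on the computed output.
result = [y for y in gen if -3 < y < 3]

# Equivalent explicit loop, for comparison.
expected = []
for x in seq:
    if x % 3 != 2:
        y = 3*x**2 - 5*x + 4
        if -3 < y < 3:
            expected.append(y)

print(result)
```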

The only case it doesn't cover is where the second filter depends on the 
value of x, and even that can be covered with a *tiny* bit more work:

gen = ((x, 3*x**2-5*x+4) for x in seq if x % 3 != 2)
result = [y[1] for y in gen if -y[0] < y[1] < y[0]]
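
A hedged variation on the same idea, using tuple unpacking in place of 
the y[0] and y[1] indexing. The function poly and the sample range are 
hypothetical, picked only so that some values survive the second filter:

```python
def poly(x):
    # Hypothetical polynomial, chosen so that some values pass the filter.
    return x**2 - 5*x + 4

seq = range(8)  # assumed sample input

# Step 1: pair each surviving x with its computed value.
pairs = ((x, poly(x)) for x in seq if x % 3 != 2)
# Step 2: filter on both x and y, keeping only y.
result = [y for x, y in pairs if -x < y < x]

# The same logic as a single explicit loop, for comparison.
expected = []
for x in seq:
    if x % 3 != 2:
        y = poly(x)
        if -x < y < x:
            expected.append(y)

print(result)
```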

It requires no new syntax, no changes to the behaviour of list comps, and 
no new meaning for "as". It's understandable and readable.

Compare your suggestion:

[3*x**2-5*x+4 as y for x in seq if (x % 3 != 2) and (-x < y < x)]

Disadvantages:

- It requires new syntax.
- It requires new behaviour for list comps and generator expressions.
- It creates yet another meaning for the keyword "as".

Advantages:

- It requires no intermediate tuple. But since intermediate tuples are 
only required for a tiny proportion of use-cases, this is not much of an 
advantage.
- It loops over the data once rather than twice, but since it does twice 
as much work inside the loop, the only saving is the setup and teardown 
costs of the second loop. Truly a micro-optimization, and premature at 
that.
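
One way to check the "no extra work" claim is to count how often the 
expression is evaluated in the two-step version. This is only a sketch: 
the counter, the wrapper f, the sample range and the y < 50 filter are 
all assumptions made for illustration.

```python
calls = 0

def f(x):
    # Wrapper around the expression that counts its evaluations.
    global calls
    calls += 1
    return 3*x**2 - 5*x + 4

seq = range(10)  # assumed sample input

gen = (f(x) for x in seq if x % 3 != 2)
calls_before = calls  # nothing has run yet: the generator is lazy

result = [y for y in gen if y < 50]
print(calls_before, calls, result)
```

The expression is evaluated once per element that passes the first 
filter, and the values flow through one at a time rather than being 
collected into an intermediate list.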

-- 
Steven D'Aprano