[Python-ideas] Map-then-filter in comprehensions
Sjoerd Job Postmus
sjoerdjob at sjec.nl
Fri Mar 11 08:23:51 EST 2016
On Thu, Mar 10, 2016 at 11:20:21PM +0100, Michał Żukowski wrote:
> Some time ago I was trying to solve the same issue but using a new keyword
> "where", and I thought that new keyword is too much for just list
> comprehension filtering, so I've made it something like assignments in
> expression, e.g.
>
> (x+y)**2 + (x-y)**2 where x=1, y=2
>
> So for list comprehension I can write:
>
> [stripped for line in lines if stripped where stripped=line.strip()]
>
> or:
>
> result = map(f, objs) where f=lambda x: x.return_something()
>
> or:
>
> it = iter(lines)
> while len(line) > 4 where line=next(it, '').strip():
>     print(line)
>
> or:
>
> lambda x, y: (
>     0 if z == 0 else
>     1 if z > 0 else
>     -1) where z = x + y
>
> or even:
>
> lambda something: d where (d, _)=something, d['a']=1
>
> I even implemented it:
> https://github.com/thektulu/cpython/commit/9e669d63d292a639eb6ba2ecea3ed2c0c23f2636
>
> and it works nicely. I was thinking to reuse "with [expr] as [var]" but I
> also don't like idea of context sensitive semantics, and I even thought
> that maybe someone, someday would want to write "content = fp.read() with
> open('foo.txt') as fp"...
>
> The "where" keyword is from guards pattern in Haskell :)
But in Haskell, the `where` keyword also considers scoping. That is,
outside the statement/expression with the `where`, you can't access the
variables introduced by the where.
Even though the `where` looks kind-of-nice, it (at least to me) is also
a bit confusing with respect to evaluation order. Consider

    [ stripped for idx, line in enumerate(lines) if idx >= 5 or stripped where stripped=line.strip() ]

(Intended semantics: give me all lines, stripped, but ignore any lines
among the first 5 that are whitespace-only.) That is:

    retval = []
    for idx, line in enumerate(lines):
        stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)

Now, I'm not very sure, but I expect what actually happens is:

    retval = []
    for idx, line in enumerate(lines):
        if idx < 5:
            stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)

That is, should I read it as

    (if idx >= 5 or stripped) where stripped=line.strip()

or

    if idx >= 5 or (stripped where stripped=line.strip())

?
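
To make the difference concrete, here is a minimal sketch of the two readings as ordinary Python functions. The function names and the sample data are made up for illustration, and the second function is only my guess at what "binding inside the `or`" would amount to, not what the linked patch actually does.

```python
def bind_before_condition(lines):
    # Reading 1: (if idx >= 5 or stripped) where stripped=line.strip()
    # The name is bound on every iteration, before the condition is tested.
    retval = []
    for idx, line in enumerate(lines):
        stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)
    return retval


def bind_inside_or(lines):
    # Reading 2: if idx >= 5 or (stripped where stripped=line.strip())
    # The binding sits on the right of `or`, so it is skipped whenever
    # `idx >= 5` short-circuits; `stripped` then keeps its previous value.
    retval = []
    for idx, line in enumerate(lines):
        if idx < 5:
            stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)
    return retval


sample = [" a ", "  ", "b", "", " c", " d ", "e  "]
print(bind_before_condition(sample))
print(bind_inside_or(sample))
```

With this sample the first function gives ['a', 'b', 'c', 'd', 'e'], while the second gives ['a', 'b', 'c', 'c', 'c']: from index 5 onwards it keeps appending the value left over from index 4.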
For comprehensions, I'd think the 'let' statement might make more sense.
Abusing Haskell's notation:

    [ stripped | (idx, line) <- zip [0..] lines, let stripped = strip line, idx >= 5 || length stripped > 0 ]

Porting this to something Python-ish, it'd be

    [ stripped for idx, line in enumerate(lines) let stripped = line.strip() if idx >= 5 or stripped ]

where `let` is a keyword (possibly only applicable in a comprehension). In
Haskell it's a keyword everywhere, but it has somewhat different
semantics.
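
For comparison, the same left-to-right binding order can already be spelled in today's Python by iterating over a one-element list. This is a sketch of an existing idiom rather than of the proposed `let` syntax, and the sample `lines` are made up:

```python
lines = ["  ", " a ", "", "b", "   ", "", "  c  "]

result = [
    stripped
    for idx, line in enumerate(lines)
    # A one-element list stands in for `let stripped = line.strip()`:
    # the name is bound once per line, before the `if` clause runs.
    for stripped in [line.strip()]
    if idx >= 5 or stripped
]
print(result)
```

This prints ['a', 'b', '', 'c']: the whitespace-only lines at indices 0, 2 and 4 are dropped, and the empty line at index 5 is kept. Because the clauses are evaluated in the order they are written, there is no ambiguity about when `stripped` gets its value.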