map, filter, lambda, list comprehensions (was RE: parameter undefined in procedure)

Evan Simpson evan at 4-am.com
Sun Feb 27 18:15:30 CET 2000


Tim Peters <tim_one at email.msn.com> wrote in message
news:000701bf8103$e87c2bc0$172d153f at tim...
> The biggest attraction for Guido was in allowing to replace map and filter,
> and many uses of lambda, with a much clearer ("more Pythonic") construct.

> But multi-argument uses of "map" require iterating over sequences in
> parallel (lockstep, not nested), and Python's "for" can't express that
> (directly) today.  Since "for" should act much the same whether in a loop or
> a comprehension, it was decided that Python should first grow "parallel
> iteration 'for'" syntax.  That was scheduled for 1.6, but looks like it's
> getting delayed in the crunch to get Unicode support out the door.

Thanks for finding that posting; I actually think I "get" list
comprehensions now.  They describe a list by giving the expression used to
compute an element, generator(s) which produce the arguments to the
expression, and an optional filter expression.  Right?
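
A minimal sketch of that reading, assuming the bracketed
"[expression for name in sequence if condition]" spelling from the proposal
under discussion (it runs as written on any Python that has list
comprehensions).  The names 'numbers', 'even_squares' and 'via_map_filter'
are made up for illustration:

```python
numbers = range(10)

# expression: n * n
# generator:  for n in numbers
# filter:     if n % 2 == 0
even_squares = [n * n for n in numbers if n % 2 == 0]

# The same list spelled with map, filter and lambda, for comparison.
via_map_filter = list(map(lambda n: n * n,
                          filter(lambda n: n % 2 == 0, numbers)))

print(even_squares)
print(via_map_filter)
```

Both spellings build the same list; the comprehension just keeps the
expression, the source sequence and the condition in one place.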

So the difficulty you describe above is the problem of Pythonically
expressing things like 'for x, y in map(None, s1, s2)' versus the cartesian
'for x in s1, for y in s2'.  Despite my utter lack of language
implementation experience, I can never resist proposing a syntax, so here
goes:

We want to be able to combine multiple sequences in any combination of
nested and parallel iteration.  Redefining 'x, y in s1, s2' is right out,
and anything too similar to this would be confusing.  Do we want to avoid
computing the whole result before iteration begins?  I would think so.  We
probably just want to tell the comprehension/loop how to manage its internal
indexes.  If we want to spell 'cartesian product' or 'parallel zip' outside
of a list comprehension or for loop, we just *use* a list comprehesion, yes?
So the syntax can be special to 'for', like the use of 'in' is.

How's this:

[x * y for x, y in {range(3), range(3)}] == [0, 1, 4]
[x * y for x, y in {range(3)}{range(3)}] == [0, 0, 0, 0, 1, 2, 0, 2, 4]

That is, {s1, s2, ...} is parallel iteration over the sequences listed, and
{s1}{s2} (or {s1}*{s2}?) is nested iteration over the sequences from left to
right.  Nested iteration descriptors would flatten, so that {{s1, s2},
{s3}{s4, s5}} would produce five elements per iteration, not two.  'for x in
{s1}' is the same as 'for x in s1'.
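
The intended semantics can be approximated with ordinary functions.  This is
only a sketch, not the proposed syntax: 'parallel' and 'nested' are
hypothetical helper names, built here on zip and itertools.product.  One
assumption to note: zip stops at the shortest sequence, whereas
map(None, s1, s2) pads with None, and the proposal does not say which
behaviour {s1, s2} should have.

```python
from itertools import product


def parallel(*seqs):
    """Stand-in for {s1, s2, ...}: step through all sequences in lockstep."""
    return zip(*seqs)


def nested(*seqs):
    """Stand-in for {s1}{s2}...: nested loops, leftmost sequence outermost."""
    return product(*seqs)


parallel_result = [x * y for x, y in parallel(range(3), range(3))]
nested_result = [x * y for x, y in nested(range(3), range(3))]

print(parallel_result)  # [0, 1, 4]
print(nested_result)    # [0, 0, 0, 0, 1, 2, 0, 2, 4]
```

Both helpers hand back tuples one at a time instead of building the whole
result list first, which matches the wish not to compute everything before
iteration begins.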

Now suppose we abuse the similarity to dict notation, such that [(i, x) for
x in {i: s1}] == [(i, x) for i, x in {range(len(s1)), s1}].  This is just a
way of naming the iteration index(es).  We could then write...

for x in {i: s1}:
  if x is None:
    s1[i] = 0

...and...

result = {}
for x, y in {i: s1}{j: s2}:
  if x * 2 > y + 3:
    result[i, j] = x
  else:
    result[i, j] = y
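
For comparison, the same two loops can be written without any new syntax by
pairing each element with its index.  This sketch assumes the built-in
enumerate (which did not exist when this was posted) as the stand-in for
{i: s1}; the sample values of s1 and s2 are invented for illustration:

```python
s1 = [3, None, 5]
s2 = [1, 10]

# 'for x in {i: s1}' with the index named explicitly.
for i, x in enumerate(s1):
    if x is None:
        s1[i] = 0

# 'for x, y in {i: s1}{j: s2}' as two nested indexed loops.
result = {}
for i, x in enumerate(s1):
    for j, y in enumerate(s2):
        if x * 2 > y + 3:
            result[i, j] = x
        else:
            result[i, j] = y

print(s1)
print(result)
```

Unlike the proposal, nothing here stops the loop body from rebinding i or j;
that would simply be overwritten on the next pass.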

I would expect assignment to an iteration index to be a compile-time error.

Well, *I* like it, anyway :-)

Cheers,

Evan @ 4-am & digicool




