Real-world use cases for map's None fill-in feature?

Mon Jan 23 18:00:02 EST 2006

"Andrae Muys" <andrae.muys at gmail.com> wrote:
> rurpy at yahoo.com wrote:
> > I am still left with a difficult-to-express feeling of
> > dissatisfaction at this process.
> >
> > Please try to see it from the point of view of
> > someone who is not an expert at Python:
> >
> > Here is izip().
> > My conception is it takes two sequence generators
> > and matches up the items from each.  (I am talking
> > overall conceptual models here, not details.)
> > Here is my problem.
> > I have two files that produce lines and I want to
> > compare each line.
> > Seems like a perfect fit.
> >
> > So I read that izip() only goes to the shortest iterable, and
> > I think, "why only the shortest?  why not the longest?
> > what's so special about the shortest?"
> > At this point explanations involving lack of use cases
> > are not very convincing.  I have a use.  All the
> > alternative solutions are more code, less clear, less
> > obvious, less right.  But most importantly, there
> > seems to be a symmetry between the two cases
> > (shortest vs longest) that makes the lack of
> > support for matching-to-longest somehow a
> > defect.
> >
> > Now if there is something fundamental about
> > matching items in parallel lists that makes it a
> > sensible thing to do only for equal lists (or to the
> > shortest list) that's fine.  You seem to imply that's
> > the case by referencing Haskell, ML, etc.  If so,
> > that needs to be pointed out in izip's docs.
> > (Though nothing I have read in this thread has
> > been convincing.)
> >
>
> Because a simple call to chain() is an obvious (it's the very first
> itertool in the docs), efficient, and straightforward solution to the
> problem of padding a shorter iterable.
>
> izip(chain(shorter, pad), longer)

And how do you tell, a priori, which iterable will turn out to
be the shortest?
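
A minimal sketch of that concern, assuming Python 3 (where the built-in
zip() truncates the way izip() does). pad_first is a hypothetical helper
written for this illustration, not something in itertools, and repeat(None)
is an assumed choice for the pad iterable:

```python
from itertools import chain, repeat

def pad_first(shorter, longer, pad=None):
    """Hypothetical helper: pad the first argument forever, then zip.

    The result ends when `longer` is exhausted, so it is only correct
    if the caller already knows which iterable is the shorter one.
    """
    return zip(chain(shorter, repeat(pad)), longer)

# Guessed right: the padded argument really is the shorter one.
right = list(pad_first("ab", "xyz"))
# [('a', 'x'), ('b', 'y'), (None, 'z')]

# Guessed wrong: the padded argument is the longer one, so its tail is lost.
wrong = list(pad_first("xyz", "ab"))
# [('x', 'a'), ('y', 'b')]
```

With two files of unknown length there is no way to pick the argument
order in advance without reading both files first.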

> It is not so straightforward to arrange the truncation of an iterable;
> moreover this is a far more common case, as it is not uncommon to use
> infinite iterators in itertools-based code.

It may be more common (which is arguable), but that
does not mean that "iterate-to-longest" is uncommon,
or is not common enough to be worth bothering about.

> izip(count(), iter(file))
>
> which doesn't terminate without truncation.  That a common use case
> fails to terminate is generally considered 'something fundamental'.

Nobody is suggesting changing the current behavior
of izip() in this case.
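
For reference, a small sketch of the case being quoted, assuming Python 3
(zip() in place of izip()) and an in-memory io.StringIO object standing in
for a real file:

```python
import io
from itertools import count

# Stand-in for an open file; iterating over it yields lines with newlines.
fake_file = io.StringIO("alpha\nbeta\ngamma\n")

# count() is infinite, so this only terminates because zip() stops
# as soon as the file iterator is exhausted.
numbered = list(zip(count(), iter(fake_file)))
# [(0, 'alpha\n'), (1, 'beta\n'), (2, 'gamma\n')]
```

Stop-at-shortest is exactly what is wanted here, and a separate
iterate-to-longest function would leave it untouched.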

> The conversion between them is of course a matter of using takewhile
> and an appropriate fence.

"of course"?

> Truncation (stop at the shortest iterable), built on padded inputs:
>    from itertools import chain, izip, repeat, takewhile
>    def fence(): pass  # unique sentinel object
>    takewhile(lambda x: x[0] != fence and x[1] != fence,
>                  izip(chain(iter1, repeat(fence)),
>                       chain(iter2, repeat(fence))))
>
> Padding (run to the longest iterable), built on truncating izip():
>    def fence(): pass
>    takewhile(lambda x: x[0] != fence or x[1] != fence,
>                  izip(chain(iter1, repeat(fence)),
>                       chain(iter2, repeat(fence))))
>
> Of course you can use any value not in the domain of iter1 or iter2 as
> a fence, but a closure is guaranteed to satisfy that requirement and
> hence keeps the code generic.  In the padding example, if you actually
> care what value is used for pad then you can either replace
> fence, or wrap the result in an imap.
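
To make the two recipes concrete, here is a runnable sketch, assuming
Python 3 (zip() in place of izip(), and a generator expression in place of
imap). The names fenced_pairs, zip_shortest and zip_longest_fenced are
hypothetical, and the sentinel is compared with "is not" rather than "!=",
which is a small substitution of mine. The "and" predicate stops at the
shortest input; the "or" predicate runs to the longest:

```python
from itertools import chain, repeat, takewhile

def fence():
    """Unique sentinel object; it is never called."""

def fenced_pairs(iter1, iter2):
    # Both inputs are extended with an endless run of the sentinel.
    return zip(chain(iter1, repeat(fence)), chain(iter2, repeat(fence)))

def zip_shortest(iter1, iter2):
    # Keep pairs only while neither side has hit the sentinel.
    return takewhile(lambda x: x[0] is not fence and x[1] is not fence,
                     fenced_pairs(iter1, iter2))

def zip_longest_fenced(iter1, iter2, pad=None):
    # Keep pairs while at least one side still has real data,
    # then swap the sentinel for the caller's pad value.
    pairs = takewhile(lambda x: x[0] is not fence or x[1] is not fence,
                      fenced_pairs(iter1, iter2))
    return (tuple(pad if v is fence else v for v in pair) for pair in pairs)

short = list(zip_shortest("ab", "xyz"))
# [('a', 'x'), ('b', 'y')]
longest = list(zip_longest_fenced("ab", "xyz"))
# [('a', 'x'), ('b', 'y'), (None, 'z')]
```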

Thank you for the posting, Andrae; it has increased my
knowledge.
But my original point was there are cases (often involving
file iterators) where the problem's complexity seems to be
on the same order as problems involving iterate-to-shortest
solutions, but, while the latter have simple, one function
call solutions, solutions for the former are far more complex
(as your post illustrates).  This seems at best unbalanced.
When encountered by someone with less than your level of
expertise, it leads to the feeling, "jeez, why does this simple
problem take hours to figure out and a half dozen function
calls?!?"  And please note, I am complaining about a general
problem with Python.  The izip() issue was just (at the time)
the most recent trigger of that reaction.  (Most recent is
<string>.translate() but that is for a new thread.)
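
For comparison, the one-call shape of the file-comparison case described
above can be sketched with itertools.zip_longest, which exists in Python 3
(Python 2.6 and later spell it izip_longest). differing_lines is a
hypothetical helper, and io.StringIO objects stand in for the two files:

```python
import io
from itertools import zip_longest

def differing_lines(file_a, file_b):
    """Yield (line_number, line_a, line_b) wherever the inputs differ.

    A line missing from the shorter input shows up as None.
    """
    pairs = zip_longest(file_a, file_b)  # pads the shorter side with None
    for number, (a, b) in enumerate(pairs, start=1):
        if a != b:
            yield number, a, b

old = io.StringIO("one\ntwo\nthree\n")
new = io.StringIO("one\n2\n")
diffs = list(differing_lines(old, new))
# [(2, 'two\n', '2\n'), (3, 'three\n', None)]
```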