itertools.intersect?

Jack Diederich jackdied at gmail.com
Thu Jun 11 00:37:32 EDT 2009


On Thu, Jun 11, 2009 at 12:03 AM, David M. Wilson<dw at botanicus.net> wrote:
[snip]
> I found my answer: Python 2.6 introduces heapq.merge(), which is
> designed exactly for this.

Thanks, I knew Raymond added something like that but I couldn't find
it in itertools.
That said, it doesn't help.  As an aside, heapq.merge would fit better
in itertools (it uses heaps internally but doesn't require them to be
passed in).  The other function that almost helps is
itertools.groupby(), but it yields (key, group) pairs whose groups are
themselves iterators rather than plain values, so it is an odd fit for
itertools.

More specifically (and less curmudgeonly), heapq.merge doesn't help for
this particular case because you can't tell where the merged values
came from.  You want all the iterators to yield the same thing at once,
but heapq.merge muddles them all together (but in an orderly way!).
Unless I'm reading your tokenizer func wrong, it can yield the same
value many times in a row.  If that happens, you don't know whether four
"The"s came once each from four iterators or four times from one.
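
To make the ambiguity concrete, here is a minimal sketch.  The token
streams are made up for illustration (they are not from your tokenizer);
the only library call is heapq.merge itself.

```python
import heapq

# Hypothetical, already-sorted token streams (made up for illustration).
one_stream = ["The", "The", "The", "The"]      # one iterator, four repeats
four_streams = [["The"], ["The"], ["The"], ["The"]]  # four iterators, one each

merged_one = list(heapq.merge(one_stream))
merged_four = list(heapq.merge(*four_streams))

# Both merges produce the same four values, so the merged output alone
# cannot say whether every input contained "The" or just one of them did.
indistinguishable = merged_one == merged_four
```

Once the values are merged, the information about which input they came
from is gone, and that information is exactly what an intersection needs.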

All that said, your problem is an edge case, so I'm happy to say the
ten-line composite functions that we've been trading can do what you
want, and in clear prose.  The stdlib isn't meant to have a one-liner
for everything.
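
For illustration only, here is one way such a composite function might
look.  This is a sketch, not the code traded earlier in the thread: the
name intersect is hypothetical, and it assumes every input is sorted in
ascending order.  Repeated values are collapsed, so each common value is
yielded once.

```python
def intersect(*iterables):
    """Yield values common to all sorted iterables (hypothetical helper)."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    try:
        # Current front value of each iterator.
        heads = [next(it) for it in iterators]
        while True:
            highest = max(heads)
            if min(heads) == highest:
                # Every iterator is sitting on the same value.
                yield highest
                # Step each iterator past it, skipping repeats.
                for i, it in enumerate(iterators):
                    while heads[i] == highest:
                        heads[i] = next(it)
            else:
                # Catch the laggards up to the highest front value.
                for i, it in enumerate(iterators):
                    while heads[i] < highest:
                        heads[i] = next(it)
    except StopIteration:
        # One input ran dry, so nothing further can be common to all.
        return


common = list(intersect([1, 2, 2, 3, 5], [2, 2, 3, 4, 5], [0, 2, 3, 5, 5]))
words = list(intersect(["The", "The", "cat"], ["The", "dog"]))
```

It stops as soon as any one input is exhausted, which is the property
heapq.merge can't give you.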

-Jack
