[Python-ideas] Possible new itertool: comm()

Antoine Pitrou solipsis at pitrou.net
Tue Jan 6 22:24:48 CET 2015


On Wed, 7 Jan 2015 08:09:59 +1100
Cameron Simpson <cs at zip.com.au> wrote:
> On 06Jan2015 19:36, Antoine Pitrou <solipsis at pitrou.net> wrote:
> >On Tue, 6 Jan 2015 18:22:44 +0000
> >Paul Moore <p.f.moore at gmail.com> wrote:
> >> On 6 January 2015 at 17:14, Raymond Hettinger
> >> <raymond.hettinger at gmail.com> wrote:
> >> >> On Jan 6, 2015, at 8:22 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> >> >>
> >> >> In writing a utility script today, I found myself needing to do
> >> >> something similar to what the Unix "comm" utility does - take two
> >> >> sorted iterators, and partition the values into "only in the first",
> >> >> "only in the second", and "in both" groups.
> >> >
> >> > As far as I can tell, this would be a very rare need.
> >>
> >> It's come up for me a few times, usually when trying to check two
> >> lists of files to see which ones have been missed by a program, and
> >> which ones the program thinks are present but no longer exist.
> >
> >Why don't you use sets for such things? Your iterator is really only
> >useful for huge or unhashable inputs.
> 
> In my use case (an existing tool):
> 
> 1) I'm merging log files of arbitrary size; I am _not_ going to suck them into 
> memory. A comm()-like function has a tiny and fixed memory footprint, versus an 
> unbounded one.

I don't understand what your use case has to do with comm(). If you
just want to merge sorted iterators you don't need all the complication
this function has.
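
To illustrate the distinction (a sketch, not code posted in this thread): a plain merge of sorted inputs is already available as heapq.merge(), which is lazy and holds only one pending item per input. The comm() below is a hypothetical helper written for this example, not an existing itertools function; it also has to tag each value as "only in the first", "only in the second", or "in both", and that tagging is the extra complication. The file names are made up.

```python
import heapq


def comm(a, b):
    """Yield (tag, value) pairs from two sorted iterables.

    tag is 1 (only in a), 2 (only in b), or 3 (in both).
    Hypothetical helper for illustration; not an itertools function.
    """
    it_a, it_b = iter(a), iter(b)
    done = object()  # sentinel marking an exhausted iterator
    x, y = next(it_a, done), next(it_b, done)
    while x is not done and y is not done:
        if x < y:
            yield 1, x
            x = next(it_a, done)
        elif y < x:
            yield 2, y
            y = next(it_b, done)
        else:
            yield 3, x
            x, y = next(it_a, done), next(it_b, done)
    # Drain whichever input still has items.
    while x is not done:
        yield 1, x
        x = next(it_a, done)
    while y is not done:
        yield 2, y
        y = next(it_b, done)


# Made-up sorted inputs standing in for two file listings.
first = ["a.log", "b.log", "d.log"]
second = ["b.log", "c.log", "d.log"]

# Plain merge: every item from both inputs, in sorted order.
merged = list(heapq.merge(first, second))

# comm-style partition: each distinct item tagged by where it appears.
partitioned = list(comm(first, second))
```

Both approaches read their inputs incrementally, so neither needs the full data in memory; the difference is only in what they report.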

Regards

Antoine.



