[Tutor] Iterable Understanding

Wayne Werner waynejwerner at gmail.com
Sat Nov 14 16:19:25 CET 2009


On Sat, Nov 14, 2009 at 8:49 AM, Stephen Nelson-Smith <sanelson at gmail.com> wrote:

> Hi Wayne,
>
> > Just write your own merge:
> > (simplified and probably inefficient and first thing off the top of my
> > head)
> > newlist = []
> > for x, y, z in zip(list1, list2, list3):
>
> I think I need something like izip_longest, don't I, since the lists will
> be of varied length?
>

Yes, probably


>
> Also, where do these lists come from?  They can't go in memory -
> they're much too big.  This is why I felt using some kind of generator
> was the right way - I can produce 3 (or 12) sets of tuples... I just
> need to work out how to merge them.
>

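You can zip generators, too. Note that in Python 2 the built-in zip builds
the whole result list in memory, so for inputs this large itertools.izip is
the lazy equivalent: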
In [10]: for x, y in zip(xrange(10), xrange(5)):
   ....:     print x, y
   ....:
   ....:
0 0
1 1
2 2
3 3
4 4

In [38]: for x,y in itertools.izip_longest(xrange(10), xrange(5)):
   ....:     print x,y
   ....:
   ....:
0 0
1 1
2 2
3 3
4 4
5 None
6 None
7 None
8 None
9 None
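
For reference, a minimal sketch of the same two sessions in Python 3, where
zip is already lazy and izip_longest is spelled zip_longest. The numbers()
generator is a hypothetical stand-in for a large log source and is not part
of the original thread:

```python
from itertools import zip_longest


def numbers(n):
    """Hypothetical stand-in for a large source: yields 0..n-1 lazily."""
    for i in range(n):
        yield i


# zip stops as soon as the shortest input is exhausted
short = list(zip(numbers(10), numbers(5)))

# zip_longest keeps going and pads the shorter input with None
padded = list(zip_longest(numbers(10), numbers(5)))

for x, y in padded:
    print(x, y)
```

Either way the pairing is by position, so this only controls how rows of
unequal length are handled.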

> >     if y > x < z:
> >         newlist.append(x)
> >     elif x > y < z:
> >         newlist.append(y)
> >     elif x > z < y:
> >         newlist.append(z)
> > I'm pretty sure that should work although it's untested.
>
> Well, no it won't work.  The lists are in time order, but they won't
> match up.  One log may have entries at the same numerical position (i.e.
> the 10th log entry) but earlier than the entries on the previous
> lines.  To give a simple example:
>
> List 1        List 2        List 3
> (1, cat)      (2, fish)     (1, cabbage)
> (4, dog)      (5, pig)      (2, ferret)
> (5, phone)    (6, horse)    (3, sausage)
>
> Won't this result in the lowest number *per row* being added to the
> new list?  Or am I misunderstanding how it works?


I forgot to append the other two values from each row. Even so, it appears
that you'll have to come up with some better logic.
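
The sample data from the quoted message shows the problem concretely. This
is a minimal sketch in Python 3; list1, list2 and list3 follow the thread,
while row_minimums is a hypothetical name used only for illustration:

```python
list1 = [(1, "cat"), (4, "dog"), (5, "phone")]
list2 = [(2, "fish"), (5, "pig"), (6, "horse")]
list3 = [(1, "cabbage"), (2, "ferret"), (3, "sausage")]

# Keep only the smallest entry of each row, as the zip-based loop does
row_minimums = [min(row) for row in zip(list1, list2, list3)]

total_entries = len(list1) + len(list2) + len(list3)
print(row_minimums)
print("kept", len(row_minimums), "of", total_entries, "entries")
```

Only one entry per row survives, so six of the nine log entries are lost
instead of being merged.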

I would use some type of priority-queue-style implementation. I don't know
if the heapq module would be sufficient, or if you'd need to roll your
own. The way I would do it is to check the top value of each list, pop off
whichever is smallest, and add that to the new list (which could also be
a generator [there's some other term for it, but I forget ATM] writing out to
a file), then check across the top of all 3 again.
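
The approach described above could be sketched roughly as follows. This is
an untested-in-production illustration in Python 3, not a definitive
implementation; merge_sorted is a hypothetical name, and it assumes every
input is already sorted and that its items can be compared with each other:

```python
import heapq


def merge_sorted(*iterables):
    """Lazily merge already-sorted iterables into one sorted stream."""
    heap = []
    for index, iterable in enumerate(iterables):
        iterator = iter(iterable)
        try:
            first = next(iterator)
        except StopIteration:
            continue  # skip empty inputs
        # index breaks ties, so the iterators themselves are never compared
        heap.append((first, index, iterator))
    heapq.heapify(heap)

    while heap:
        value, index, iterator = heap[0]  # smallest "top" across all inputs
        yield value
        try:
            # replace it with the next item from the same input
            heapq.heapreplace(heap, (next(iterator), index, iterator))
        except StopIteration:
            heapq.heappop(heap)  # that input is exhausted


list1 = [(1, "cat"), (4, "dog"), (5, "phone")]
list2 = [(2, "fish"), (5, "pig"), (6, "horse")]
list3 = [(1, "cabbage"), (2, "ferret"), (3, "sausage")]

merged = list(merge_sorted(list1, list2, list3))
for entry in merged:
    print(entry)
```

Only one pending item per input is held at a time, so the inputs can be
generators reading from files. The standard library's heapq.merge (added in
Python 2.6) does the same job and is probably the first thing to try before
rolling your own.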

Make sense?
HTH,
Wayne