itertools: problem with nested groupby, list()
Nico Schlömer
nico.schloemer at gmail.com
Tue May 4 08:37:17 EDT 2010
> Are you basically after this, then?
>
> for a, a_iter in groupby(my_list, itemgetter('a')):
>     print 'New A', a
>     for b, b_iter in groupby(a_iter, itemgetter('b')):
>         b_list = list(b_iter)
>         for p in ['first', 'second']:
>             for b_data in b_list:
>                 #whatever...
Yes. Moving the 'first', 'second' operation to the innermost loop
works all right, and I guess that's what I'll do.
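
For reference, a minimal sketch of that arrangement, written for Python 3 (so print is a function, unlike the Python 2 snippets in this thread). The sample data and the `visited` list are made up for illustration only.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data in the shape described in this thread.
my_list = [
    {"a": 1, "b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 4},
    {"a": 1, "b": 2, "c": 5},
    {"a": 2, "b": 1, "c": 6},
]
# groupby only groups adjacent items, so sort on the same keys first.
my_list.sort(key=itemgetter("a", "b", "c"))

visited = []  # (a, b, pass name, c) tuples, kept so the traversal can be inspected
for a, a_iter in groupby(my_list, itemgetter("a")):
    print("New A", a)
    for b, b_iter in groupby(a_iter, itemgetter("b")):
        # b_iter is a one-shot iterator, so copy the group before looping twice.
        b_list = list(b_iter)
        for pass_name in ["first", "second"]:
            for b_data in b_list:
                visited.append((a, b, pass_name, b_data["c"]))
```

The only change from a single-pass loop is the `list(b_iter)` copy; both passes then read from the list rather than from the iterator.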
> Cos that looks like it could be simplified to (untested)
> for (a, b), data_iter in groupby(my_list, itemgetter('a','b')):
>     data = list(data_iter) # take copy
>     for pass_ in ['first', 'second']:
>         # do something with data
Potentially yes, but for now I actually need to do something at "print
'New A', a", so I can't just skip this.
Anyway, the above suggestion works well for now. Thanks!
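
As to why the list( groupby(...) ) workaround quoted below did not work: as far as I understand it, the group iterators that groupby hands out share the underlying iterator with the groupby object itself. Calling list() on the groupby object advances it all the way to the end, and by then the earlier groups can no longer be read. Copying each group while it is still current avoids this. A small Python 3 sketch with made-up data (the names `rows`, `broken` and `fixed` are only for illustration):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows, already sorted by 'b'.
rows = [
    {"a": 1, "b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 4},
    {"a": 1, "b": 2, "c": 5},
]

# Problematic: list() runs groupby to the end before any group is read,
# so the stored group iterators have lost (most or all of) their items.
broken = list(groupby(rows, itemgetter("b")))
broken_counts = [len(list(group)) for _, group in broken]

# Working: turn each group into a list while it is still the current group.
fixed = [(key, list(group)) for key, group in groupby(rows, itemgetter("b"))]
fixed_counts = [len(group) for _, group in fixed]

print("items seen via list(groupby(...)):", broken_counts)
print("items seen via per-group copies:  ", fixed_counts)
```

With the per-group copies, `fixed` is a plain list of (key, list) pairs and can be looped over as many times as needed.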
--Nico
On Tue, May 4, 2010 at 1:52 PM, Jon Clements <joncle at googlemail.com> wrote:
> On 4 May, 12:36, Nico Schlömer <nico.schloe... at gmail.com> wrote:
>> > Does this example help at all?
>>
>> Thanks, that clarified things a lot!
>>
>> To make it easier, let's just look at 'a' and 'b':
>>
>> > my_list.sort( key=itemgetter('a','b','c') )
>> > for a, a_iter in groupby(my_list, itemgetter('a')):
>> >     print 'New A', a
>> >     for b, b_iter in groupby(a_iter, itemgetter('b')):
>> >         print '\t', 'New B', b
>> >         for b_data in b_iter:
>> >             print '\t'*3, a, b, b_data
>> >         print '\t', 'End B', b
>> >     print 'End A', a
>>
>> That works well, and I can wrap the outer loop in another loop without
>> problems. What's *not* working, though, is having more than one pass
>> on the inner loop, as in
>>
>> =============================== *snip* ===============================
>> my_list.sort( key=itemgetter('a','b','c') )
>> for a, a_iter in groupby(my_list, itemgetter('a')):
>>     print 'New A', a
>>     for pass_ in ['first pass', 'second pass']:
>>         for b, b_iter in groupby(a_iter, itemgetter('b')):
>>             print '\t', 'New B', b
>>             for b_data in b_iter:
>>                 print '\t'*3, a, b, b_data
>>             print '\t', 'End B', b
>>     print 'End A', a
>> =============================== *snap* ===============================
>>
>> I tried working around this by
>>
>> =============================== *snip* ===============================
>> my_list.sort( key=itemgetter('a','b','c') )
>> for a, a_iter in groupby(my_list, itemgetter('a')):
>>     print 'New A', a
>>     inner_list = list( groupby(a_iter, itemgetter('b')) )
>>     for pass_ in ['first pass', 'second pass']:
>>         for b, b_iter in inner_list:
>>             print '\t', 'New B', b
>>             for b_data in b_iter:
>>                 print '\t'*3, a, b, b_data
>>             print '\t', 'End B', b
>>     print 'End A', a
>> =============================== *snap* ===============================
>>
>> which doesn't work either, and I don't understand why. -- I'll look at
>> Uli's comments.
>>
>> Cheers,
>> Nico
>>
>> On Tue, May 4, 2010 at 1:08 PM, Jon Clements <jon... at googlemail.com> wrote:
>> > On 4 May, 11:10, Nico Schlömer <nico.schloe... at gmail.com> wrote:
>> >> Hi,
>>
>> >> I ran into a bit of an unexpected issue here with itertools, and I
>> >> need to say that I discovered itertools only recently, so maybe my way
>> >> of approaching the problem is "not what I want to do".
>>
>> >> Anyway, the problem is the following:
>> >> I have a list of dictionaries, something like
>>
>> >> [ { "a": 1, "b": 1, "c": 3 },
>> >>   { "a": 1, "b": 1, "c": 4 },
>> >>   ...
>> >> ]
>>
>> >> and I'd like to iterate through all items with, e.g., "a":1. What I do
>> >> is sort and then groupby,
>>
>> >> my_list.sort( key=operator.itemgetter('a') )
>> >> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>>
>> >> and then just very simply iterate over my_list_grouped,
>>
>> >> for my_item in my_list_grouped:
>> >>     # do something with my_item[0], my_item[1]
>>
>> >> Now, inside this loop I'd like to again iterate over all items with
>> >> the same 'b'-value -- no problem, just do the above inside the loop:
>>
>> >> for my_item in my_list_grouped:
>> >>     # group by keyword "b"
>> >>     my_list2 = list( my_item[1] )
>> >>     my_list2.sort( key=operator.itemgetter('b') )
>> >>     my_list_grouped = itertools.groupby( my_list2,
>> >>                                          operator.itemgetter('b') )
>> >>     for e in my_list_grouped:
>> >>         # do something with e[0], e[1]
>>
>> >> That seems to work all right.
>>
>> >> Now, the problem occurs when this all is wrapped into an outer loop, such as
>>
>> >> for k in [ 'first pass', 'second pass' ]:
>> >>     for my_item in my_list_grouped:
>> >>         # bla, the above
>>
>> >> To be able to iterate more than once through my_list_grouped, I have
>> >> to convert it into a list first, so outside all loops, I go like
>>
>> >> my_list.sort( key=operator.itemgetter('a') )
>> >> my_list_grouped = itertools.groupby( my_list, operator.itemgetter('a') )
>> >> my_list_grouped = list( my_list_grouped )
>>
>> >> This, however, makes it impossible to do the inner sort and
>> >> groupby-operation; you just get the very first element, and that's it.
>>
>> >> An example file is attached.
>>
>> >> Hints, anyone?
>>
>> >> Cheers,
>> >> Nico
>>
>> > Does this example help at all?
>>
>> > my_list.sort( key=itemgetter('a','b','c') )
>> > for a, a_iter in groupby(my_list, itemgetter('a')):
>> >     print 'New A', a
>> >     for b, b_iter in groupby(a_iter, itemgetter('b')):
>> >         print '\t', 'New B', b
>> >         for c, c_iter in groupby(b_iter, itemgetter('c')):
>> >             print '\t'*2, 'New C', c
>> >             for c_data in c_iter:
>> >                 print '\t'*3, a, b, c, c_data
>> >             print '\t'*2, 'End C', c
>> >         print '\t', 'End B', b
>> >     print 'End A', a
>>
>> > Jon.
>> > --
>> >http://mail.python.org/mailman/listinfo/python-list
>>
>>
>
> Are you basically after this, then?
>
> for a, a_iter in groupby(my_list, itemgetter('a')):
>     print 'New A', a
>     for b, b_iter in groupby(a_iter, itemgetter('b')):
>         b_list = list(b_iter)
>         for p in ['first', 'second']:
>             for b_data in b_list:
>                 #whatever...
>
> Cos that looks like it could be simplified to (untested)
>
> for (a, b), data_iter in groupby(my_list, itemgetter('a','b')):
>     data = list(data_iter) # take copy
>     for pass_ in ['first', 'second']:
>         # do something with data
>
> But from my POV, it's almost looking like a 2-tuple key in a
> defaultdict jobby.
>
> Jon.
> --
> http://mail.python.org/mailman/listinfo/python-list
>
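
The "2-tuple key in a defaultdict" idea from the quoted message could look roughly like the following Python 3 sketch. The sample data and the names `rows`, `groups` and `seen` are assumptions made for illustration, not something from the thread.

```python
from collections import defaultdict

# Hypothetical rows; unlike with groupby, they do not need to be sorted first.
rows = [
    {"a": 2, "b": 1, "c": 6},
    {"a": 1, "b": 1, "c": 3},
    {"a": 1, "b": 2, "c": 5},
    {"a": 1, "b": 1, "c": 4},
]

# Collect rows under their (a, b) key; each value is an ordinary list.
groups = defaultdict(list)
for row in rows:
    groups[(row["a"], row["b"])].append(row)

seen = []  # (pass name, a, b, number of rows) tuples, kept for inspection
for pass_name in ["first", "second"]:
    # Sorting the items gives a stable (a, b) order on every pass.
    for (a, b), data in sorted(groups.items()):
        seen.append((pass_name, a, b, len(data)))
```

Because the values are lists rather than iterators, any number of passes is possible. The trade-off is that everything is held in memory at once, and a change in 'a' has to be detected by hand if something must happen at that point.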