bug with itertools.groupby?

Rhodri James rhodri at wildebst.demon.co.uk
Tue Oct 6 19:21:55 EDT 2009


On Wed, 07 Oct 2009 00:06:43 +0100, Kitlbast <vlad.shevchenko at gmail.com>  
wrote:

> Hi there,
>
> the code below on Python 2.5.2:
>
> from itertools import groupby
>
> info_list = [
>     {'profile': 'http://somesite.com/profile1', 'account': 61L},
>     {'profile': 'http://somesite.com/profile2', 'account': 64L},
>     {'profile': 'http://somesite.com/profile3', 'account': 61L},
> ]
>
> grouped_by_account = groupby(info_list, lambda x: x['account'])
> for acc, iter_info_items in grouped_by_account:
>     print 'grouped acc: ', acc
>
> gives output:
>
> grouped acc:  61
> grouped acc:  64
> grouped acc:  61
>
> am I doing something wrong?

That depends on whether you expected groupby to sort your datastream
for you.  It doesn't.  It just collects up items in order until your
group key changes and then delivers that batch; it neither knows nor
cares that a previous group key has recurred.
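
To see that behaviour in isolation, here is a minimal sketch.  It
assumes a current Python 3 rather than 2.5.2, and the bare list of
account numbers is a simplified stand-in for your info_list, not
your real data:

```python
from itertools import groupby

# Simplified stand-in for the 'account' values, in their original order.
accounts = [61, 64, 61]

# groupby starts a new group every time the key changes, so the second
# run of 61 is reported as a separate group from the first.
runs = [(key, list(items)) for key, items in groupby(accounts)]
print(runs)  # [(61, [61]), (64, [64]), (61, [61])]
```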

If the output you want is more like:

grouped acc:  61
grouped acc:  64

then you're going to have to sort your info_list first.  That might
not be desirable, depending on just how long it is.  If you tell us
more about your specific use case, we may be able to give you more
specific advice.
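
In the meantime, a rough sketch of the two usual options, assuming
Python 3 (so the 61L literals become plain 61).  The names by_account,
grouped and collected are just illustrative, and the defaultdict
variant is an alternative to groupby that avoids the sort, not
something groupby does for you:

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

info_list = [
    {'profile': 'http://somesite.com/profile1', 'account': 61},
    {'profile': 'http://somesite.com/profile2', 'account': 64},
    {'profile': 'http://somesite.com/profile3', 'account': 61},
]

by_account = itemgetter('account')

# Option 1: sort by the same key groupby will use, so that equal keys
# end up next to each other.  sorted() is stable, so items within one
# account keep their original relative order.
grouped = {}
for acc, items in groupby(sorted(info_list, key=by_account), key=by_account):
    grouped[acc] = [item['profile'] for item in items]

# Option 2: skip the sort and collect items per key in a dictionary.
# This makes a single pass over the data and needs no ordering at all.
collected = defaultdict(list)
for item in info_list:
    collected[item['account']].append(item['profile'])

for acc in sorted(grouped):
    print('grouped acc:', acc, grouped[acc])
```

Both give one entry per account.  Which is preferable depends on
whether you need the groups in key order (sorting gives you that) or
just need everything bucketed (the dictionary is enough).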

-- 
Rhodri James *-* Wildebeest Herder to the Masses


