steve at REMOVE-THIS-cybersource.com.au
Fri Jul 3 11:50:09 CEST 2009
On Fri, 03 Jul 2009 01:39:27 -0700, Paul Rubin wrote:
> Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au> writes:
>> groupby() works on lists.
> >>> a = [1,3,4,6,7]
> >>> from itertools import groupby
> >>> b = groupby(a, lambda x: x%2==1)  # split into even and odd
> >>> c = list(b)
> >>> print len(c)
> 3
> >>> d = list(c[1][1])    # should be [4,6]
> >>> print d              # oops.
> []
I didn't say it worked properly *wink*
Seriously, this behaviour caught me out too. The problem isn't that the
input data is a list; the same problem occurs for arbitrary iterators.
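To illustrate that the list isn't the culprit, here is a minimal sketch of the same session fed from a generator instead. The numbers() helper and the variable names are made up for the example, and it is written so that it should run unchanged on Python 3 as well.

```python
from itertools import groupby

def numbers():
    # Hypothetical source: a plain generator instead of a list.
    for n in [1, 3, 4, 6, 7]:
        yield n

# list() drives groupby() all the way to the end before any
# of the group iterators has been read.
pairs = list(groupby(numbers(), lambda x: x % 2 == 1))

# The keys survive, but the saved group iterator has nothing left.
keys = [key for key, group in pairs]
stale = list(pairs[1][1])   # the group one might expect to hold 4 and 6
```

The keys come through intact; only the contents of each group are lost.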
From the docs:
The operation of groupby() is similar to the uniq filter in Unix. It
generates a break or new group every time the value of the key function
changes (which is why it is usually necessary to have sorted the data
using the same key function). That behavior differs from SQL’s GROUP BY
which aggregates common elements regardless of their input order.
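A small sketch of that difference, using the data from the example quoted earlier. The is_odd() helper and the variable names are only illustrative, not anything from the docs.

```python
from itertools import groupby

def is_odd(x):
    # Hypothetical key function: True for odd numbers.
    return x % 2 == 1

data = [1, 3, 4, 6, 7]

# Unsorted input: a new group starts each time the key changes,
# much like uniq, so the odd numbers end up in two separate groups.
runs = [(key, list(group)) for key, group in groupby(data, is_odd)]

# Input sorted with the same key function: one group per distinct key,
# which is closer to what SQL's GROUP BY would give.
ordered = sorted(data, key=is_odd)
grouped = [(key, list(group)) for key, group in groupby(ordered, is_odd)]
```

Each group is turned into a list straight away inside the comprehension, before groupby() moves on to the next one.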
The returned group is itself an iterator that shares the underlying
iterable with groupby(). Because the source is shared, when the groupby()
object is advanced, the previous group is no longer visible. So, if that
data is needed later, it should be stored as a list.
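In practice that means copying each group while looping, before groupby() is advanced again. A minimal sketch along those lines, where is_odd(), groups and keys are just names chosen for the example:

```python
from itertools import groupby

def is_odd(x):
    # Hypothetical key function: True for odd numbers.
    return x % 2 == 1

data = [1, 3, 4, 6, 7]

groups = []
keys = []
for key, group in groupby(data, is_odd):
    # Copy the group into a list now; once the loop asks groupby()
    # for the next group, this iterator can no longer be read.
    groups.append(list(group))
    keys.append(key)
```

With the groups stored this way, groups[1] holds the [4, 6] that the quoted session was expecting.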