A way to re-organize a list
Alex Martelli
aleax at mac.com
Thu Jul 19 17:22:31 CEST 2007
beginner <zyzhu2000 at gmail.com> wrote:
> Hi Everyone,
>
> I have a simple list reconstruction problem, but I don't really know
> how to do it.
>
> I have a list that looks like this:
>
> l=[ ("A", "a", 1), ("A", "a", 2), ("A", "a", 3), ("A", "b", 1), ("A",
> "b", 2), ("B", "a", 1), ("B", "b", 1)]
>
> What I want to do is to reorganize it in groups, first by the middle
> element of the tuple, and then by the first element. I'd like the
> output to look like this:
>
> out=[
>    [ #group by first element "A"
>       [("A", "a", 1), ("A", "a", 2), ("A", "a", 3)], #group by second element "a"
>       [("A", "b", 1), ("A", "b", 2)], #group by second element "b"
>    ],
>    [ #group by first element "B"
>       [("B", "a", 1)],
>       [("B", "b", 1)]
>    ]
> ]
>
>
> All the solutions I came up with are difficult to read and even harder
> to go back and change, and I just feel are too complicated for such a
> simple problem. I am wondering if anyone here has some insight.
>
> If there is a 'functional' way to do this, it would be even greater.
If it's already sorted as in your example, itertools.groupby may be what
you want (it only groups consecutive items whose keys compare equal, so
the input order matters). To pick a key, you can use
operator.itemgetter, which builds functions picking a specified N-th
item of a sequence.
Consider:
>>> l
[('A', 'a', 1), ('A', 'a', 2), ('A', 'a', 3), ('A', 'b', 1), ('A', 'b',
2), ('B', 'a', 1), ('B', 'b', 1)]
>>> import itertools, operator
>>> itm=operator.itemgetter
>>> gb=itertools.groupby
>>> list(gb(l, itm(0)))
[('A', <itertools._grouper object at 0x4d110>), ('B',
<itertools._grouper object at 0x4d120>)]
The items in the sequence groupby returns are (key, subsequence) pairs;
you don't care about the key, so here's a first step (grouping by first
item only):
>>> [list(ss) for k, ss in gb(l, itm(0))]
[[('A', 'a', 1), ('A', 'a', 2), ('A', 'a', 3), ('A', 'b', 1), ('A', 'b',
2)], [('B', 'a', 1), ('B', 'b', 1)]]
However, you want a second level of grouping -- so, instead of the plain
list(ss), you need to apply this concept again:
>>> [[list(sss) for k,sss in gb(ss,itm(1))] for k, ss in gb(l, itm(0))]
[[[('A', 'a', 1), ('A', 'a', 2), ('A', 'a', 3)], [('A', 'b', 1), ('A',
'b', 2)]], [[('B', 'a', 1)], [('B', 'b', 1)]]]
...and there you are.
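For anyone who wants to run this outside the interactive prompt, the
same two-level grouping could be written as a small script along these
lines. This is only a sketch: the names data and grouped are
illustrative and do not come from the session above.

```python
import itertools
import operator

# Same sample data as in the question, already sorted by the first
# and then the second element of each tuple.
data = [("A", "a", 1), ("A", "a", 2), ("A", "a", 3),
        ("A", "b", 1), ("A", "b", 2),
        ("B", "a", 1), ("B", "b", 1)]

# Outer groupby splits on item 0; each outer group is then split
# again on item 1. The keys themselves are discarded.
grouped = [
    [list(inner)
     for _, inner in itertools.groupby(outer, operator.itemgetter(1))]
    for _, outer in itertools.groupby(data, operator.itemgetter(0))
]

print(grouped)
```

Each outer group is consumed fully by the inner comprehension before
the outer groupby advances, which is what groupby requires.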
It may be more readable with a little auxiliary function to capture what
I just called "this concept" with a readable name:
>>> def group(s, i): return [list(ss) for k, ss in gb(s, itm(i))]
...
>>> [group(ss,1) for ss in group(l,0)]
[[[('A', 'a', 1), ('A', 'a', 2), ('A', 'a', 3)], [('A', 'b', 1), ('A',
'b', 2)]], [[('B', 'a', 1)], [('B', 'b', 1)]]]
This does one more "list(...)" step, but if your lists aren't huge the
readability may be worth the small slow-down.
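If the list is not already sorted, groupby will produce a separate
group for every run of equal keys, so one option is to sort first. A
minimal sketch, assuming a shuffled copy of the sample data (the names
shuffled, fragmented and regrouped are made up for this illustration):

```python
import itertools
import operator

def group(seq, i):
    # Same helper as above, spelled out with full module names.
    return [list(sub)
            for _, sub in itertools.groupby(seq, operator.itemgetter(i))]

# Hypothetical input: the sample data in a scrambled order.
shuffled = [("B", "b", 1), ("A", "a", 1), ("A", "b", 1), ("A", "a", 2),
            ("B", "a", 1), ("A", "b", 2), ("A", "a", 3)]

# Without sorting, equal keys that are not adjacent land in
# different groups.
fragmented = group(shuffled, 0)

# Sorting on (item 0, item 1) puts equal keys next to each other;
# sorted() is stable, so the third items keep their relative order.
ordered = sorted(shuffled, key=operator.itemgetter(0, 1))
regrouped = [group(sub, 1) for sub in group(ordered, 0)]

print(len(fragmented))
print(regrouped)
```

The sort costs O(n log n); when the data cannot be sorted cheaply, a
dict keyed on the grouping items is the usual alternative.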
Alex