Grouping pairs - suggested tools

MRAB python at mrabarnett.plus.com
Wed Sep 22 13:06:17 EDT 2010


On 20/09/2010 22:42, Astley Le Jasper wrote:
> I have a list of tuples that indicate a relationship, ie a is related
> to b, b is related to c etc etc. What I want to do is cluster these
> relationships into groups.  An item will only be associated with a
> single cluster.
>
> Before I started, I wondered if there was any particular tool within
> Python I should be looking at. I don't expect anyone to code this for
> me, just say ... "you need to look at using x". I was going to use
> populate a dictionary and
>
> Sorry for being so vague.
>
[snip]
I thought I'd join in. I've used repr to ensure that I have something
that's hashable:

def show_groups(pairs):
     groups = {}
     items = {}
     for x, y in pairs:
         x_repr, y_repr = repr(x), repr(y)
         items[x_repr] = x
         items[y_repr] = y
         x_group = groups.setdefault(x_repr, set([x_repr]))
         y_group = groups.setdefault(y_repr, set([y_repr]))
         union = set()
         for m in x_group | y_group:
             union |= groups[m]
         for m in union:
             groups[m] = union

     indexes = {}
     for s in groups.values():
         indexes.setdefault(frozenset(s), len(indexes) + 1)

     indexes = sorted(indexes.items(), key=lambda pair: pair[1])

     for s, i in indexes:
         for m in sorted(s):
             print("{0} {1}".format(i, items[m]))

if __name__ == "__main__":
     edges = [
         ('a', 'b'),
         ('a', 'c'),
         ('a', 'd'),
         ('b', 'c'),
         ('b', 'd'),
         ('c', 'd'),
         ('e', 'f'),
         ('e', 'g'),
         ('f', 'g'),
         ('h', 'i'),
     ]

     show_groups(edges)
     print()

     show_groups([('a', 'c'), ('b', 'd'), ('c', 'd')])



More information about the Python-list mailing list