How to populate all possible hierarchical clusterings from a set of elements?
Peter Otten
__peter__ at web.de
Wed Jan 12 15:43:22 EST 2011
justin wrote:
> The title sounds too complex, but my question is actually simple.
>
> Suppose I have [1,2,3,4,5], then there are many ways of making
> clustering.
> Among them, I want to pair up terminals until there is only one left
> at the end.
> For example, ((((1,2),3),4),5), (1,(2,(3,(4,5)))), or (((1,2),(3,4)),
> 5) would be legitimate ones.
>
> How do you think can I, using the modules of Python such as itertools
> as much as possible, make all possible such clusterings?
Here's my first attempt:
def cluster(items):
if len(items) == 2:
yield items
return
for i in range(len(items)-1):
for c in cluster(items[:i] + (items[i:i+2],) + items[i+2:]):
yield c
def unique(items):
seen = set()
for item in items:
if item not in seen:
seen.add(item)
yield item
if __name__ == "__main__":
for item in unique(cluster(tuple("abcd"))):
print item
Unfortunately I get a lot of duplicates :(
You could define a kind of operator precedence using
itertools.combinations(), but I think it suffers from the same problem as
a 3 b 1 c 2 d
and
a 2 b 1 c 3 d
would both result in ((a, b), (c, d)).
More information about the Python-list
mailing list