fastest way to find the intersection of n lists of sets
jstroud at mbi.ucla.edu
Mon Apr 30 00:48:09 CEST 2007
> I have 3 variable length lists of sets. I need to find the common
> elements in each list (across sets) really really quickly.
> Here is some sample code:
> # Doesn't make sense to union the sets - we're going to do
> intersections later anyway
> l1 = reduce(operator.add, list(x) for x in l1)
> l2 = reduce(operator.add, list(x) for x in l2)
> l3 = reduce(operator.add, list(x) for x in l3)
> # Should I do this in two steps? Maybe by intersecting the two
> shortest lists first?
> s = frozenset(l1) & frozenset(l2) & frozenset(l3)
> I'm assuming frozensets are (somehow) quicker than sets because
> they're immutable.
> Any code suggestions? Maybe using something in the new fancy-schmancy
> itertools module?
I don't understand why you cast to list. I would propose:
lists_of_sets = [l1, l2, l3]
reduce(set.intersection, (reduce(set.union, x) for x in lists_of_sets))
Since this is simplest, I'm guessing it should be fastest because I'm
also guessing that set impelmentation is as optimized as list--I think
this would hold true especially for large sets with sparse overlap
between lists. So it might be reasonable consider the content of your
sets and lists and write your code based on the content or on assuming a
More information about the Python-list