[Python-ideas] Extremely weird itertools.permutations
Steven D'Aprano
steve at pearwood.info
Sun Oct 13 03:47:42 CEST 2013
On Sat, Oct 12, 2013 at 05:44:38PM -0700, Raymond Hettinger wrote:
> In general, if someone wants to eliminate duplicates
> from the population, they can do so easily with:
>
> permutations(set(population), n)
In fairness, Raymond, the proposal is not to eliminate duplicates from
the population, but from the permutations themselves. Consider the
example I gave earlier, where you're permuting "RRRBB" two items at a
time. There are 20 permutations in total, but only four distinct ones;
the other sixteen are repeats:
py> list(''.join(t) for t in permutations("RRRBB", 2))
['RR', 'RR', 'RB', 'RB', 'RR', 'RR', 'RB', 'RB', 'RR', 'RR', 'RB', 'RB',
'BR', 'BR', 'BR', 'BB', 'BR', 'BR', 'BR', 'BB']
py> set(''.join(t) for t in permutations("RRRBB", 2))
{'BR', 'RB', 'RR', 'BB'}
But if you eliminate duplicates from the population first, you get
only two permutations:
py> list(''.join(t) for t in permutations(set("RRRBB"), 2))
['BR', 'RB']
If it were just a matter of calling set() on the output of permutations,
that would be trivial enough. But you might care about the order of the
results, or the elements might not be hashable, or you might have a LOT
of permutations to generate before you can start discarding duplicates:
population = "R"*1000 + "B"*500
set(''.join(t) for t in permutations(population, 2))
# builds and hashes 1500*1499 = 2,248,500 strings, for just four
# unique results -- takes a while...
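And if the elements aren't hashable at all, the set() trick fails
outright. Here's a hypothetical example of my own, using lists instead
of the strings above:

py> set(permutations([[1], [1], [2]], 2))
Traceback (most recent call last):
  ...
TypeError: unhashable type: 'list'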
In my opinion, if unique_permutations is no more efficient than calling
set() on the output of permutations, it's not worth adding. But if
somebody can come up with an implementation which is significantly more
efficient, without making unreasonable assumptions about orderability,
hashability, or even comparability, then that might be worthwhile.
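For what it's worth, here is a minimal sketch of the sort of thing I
have in mind: the standard multiset-permutation recursion over
(element, count) pairs. This is my own illustrative code, not a tested
patch; it assumes only that elements support ==, needs no hashing or
ordering, and does work proportional to the number of distinct
permutations rather than the total:

def unique_permutations(iterable, r=None):
    # Sketch only: assumes elements support ==, but not hashing
    # or ordering.  Tally distinct elements with a linear scan.
    counts = []                      # [element, multiplicity] pairs
    for elem in iterable:
        for pair in counts:
            if pair[0] == elem:
                pair[1] += 1
                break
        else:
            counts.append([elem, 1])
    n = sum(c for _, c in counts)
    if r is None:
        r = n
    if r > n:
        return
    def gen(r):
        if r == 0:
            yield ()
            return
        for pair in counts:
            if pair[1]:              # element still available?
                pair[1] -= 1         # take it ...
                for rest in gen(r - 1):
                    yield (pair[0],) + rest
                pair[1] += 1         # ... and put it back
    yield from gen(r)

On the toy example above, it yields each distinct permutation exactly
once, in first-occurrence order:

py> list(''.join(t) for t in unique_permutations("RRRBB", 2))
['RR', 'RB', 'BR', 'BB']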
--
Steven