[Python-ideas] Extremely weird itertools.permutations

Steven D'Aprano steve at pearwood.info
Sun Oct 13 03:47:42 CEST 2013


On Sat, Oct 12, 2013 at 05:44:38PM -0700, Raymond Hettinger wrote:

> In general, if someone wants to eliminate duplicates
> from the population, they can do so easily with:
> 
>    permutations(set(population), n)

In fairness, Raymond, the proposal is not to eliminate duplicates from 
the population, but from the permutations themselves. Consider the 
example I gave earlier, where you're permuting "RRRBB" two items at a 
time. There are 20 permutations in total, but only four of them are 
distinct; the other 16 are duplicates:

py> list(''.join(t) for t in permutations("RRRBB", 2))
['RR', 'RR', 'RB', 'RB', 'RR', 'RR', 'RB', 'RB', 'RR', 'RR', 'RB', 'RB', 
'BR', 'BR', 'BR', 'BB', 'BR', 'BR', 'BR', 'BB']
py> set(''.join(t) for t in permutations("RRRBB", 2))
{'BR', 'RB', 'RR', 'BB'}

But if you eliminate duplicates from the population first, you get 
only two permutations, losing the 'RR' and 'BB' results entirely:

py> list(''.join(t) for t in permutations(set("RRRBB"), 2))
['BR', 'RB']


If it were just a matter of calling set() on the output of permutations, 
that would be trivial enough. But you might care about the order of the 
results, or the elements might not be hashable, or you might have a LOT 
of permutations to generate before discarding the duplicates:

population = "R"*1000 + "B"*500
set(''.join(t) for t in permutations(population, 2))  # takes a while...
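
For that population of 1500 elements, permutations(population, 2) 
yields 1500 * 1499 = 2,248,500 tuples, every one of which has to be 
generated and joined before set() can reduce them to at most four 
distinct strings.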

In my opinion, if unique_permutations is no more efficient than calling 
set on the output of permutations, it's not worth adding. But if 
somebody can come up with an implementation which is significantly more 
efficient, without making unreasonable assumptions about orderability, 
hashability or even comparability, then that might be worthwhile.
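
For what it's worth, here is a minimal sketch of the sort of thing I 
have in mind. The name unique_permutations is hypothetical (this is a 
proof of concept, not a proposed implementation), and it assumes 
nothing beyond equality comparisons between elements: the population 
is grouped into (value, count) pairs with a quadratic scan, and the 
permutations are then built from the counts, so each distinct result 
is generated exactly once without hashing, sorting, or materialising 
the duplicates.

def unique_permutations(population, r=None):
    # Hypothetical helper, not part of itertools: yield each distinct
    # r-permutation of population exactly once.  Only equality between
    # elements is assumed; no hashing or ordering is required.
    pool = list(population)
    if r is None:
        r = len(pool)
    # Group equal elements into [representative, count] pairs using a
    # quadratic equality scan, so unhashable values are fine.
    groups = []
    for item in pool:
        for group in groups:
            if group[0] == item:
                group[1] += 1
                break
        else:
            groups.append([item, 1])
    def emit(prefix, remaining):
        # Recursively extend the prefix, drawing from the remaining
        # counts; each group is tried at most once per position, so no
        # duplicate permutations are ever produced.
        if remaining == 0:
            yield tuple(prefix)
            return
        for group in groups:
            if group[1]:
                group[1] -= 1
                prefix.append(group[0])
                yield from emit(prefix, remaining - 1)
                prefix.pop()
                group[1] += 1
    yield from emit([], r)

py> list(''.join(t) for t in unique_permutations("RRRBB", 2))
['RR', 'RB', 'BR', 'BB']

With this approach the 1000-R/500-B example above produces its four 
results immediately, because the work done is proportional to the 
number of distinct permutations rather than to the total number.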


-- 
Steven

