[Python-ideas] Extremely weird itertools.permutations

Nick Coghlan ncoghlan at gmail.com
Sun Oct 13 11:27:54 CEST 2013


On 13 October 2013 17:38, Neil Girdhar <mistersheik at gmail.com> wrote:
> My intuition is that we want Python to be "complete".  Many other languages
> can find the permutations of a multiset.  Python has a permutations
> function.  Many people on stackoverflow expected that function to be able to
> find those permutations.

Nope, we expressly *don't* want the standard library to be "complete",
because that would mean growing to the size of PyPI (or larger).
There's always going to be scope for applications to adopt new domain
specific dependencies with more in-depth support than that provided by
the standard library.

Many standard library modules are in fact deliberately designed as
"stepping stone" modules that will meet the needs of code which have
an incidental relationship to that task, but will need to be replaced
with something more sophisticated for code directly related to that
domain. Many times, that means they will ignore as irrelevant
distinctions that are critical in certain contexts, simply because
they don't come up all that often outside those specific domains, and
addressing them involves making the core module more complicated to
use for more typical cases.

In this case, the proposed alternate permutations mechanism only makes
a difference when:

1. The data set contains equivalent values
2. Input order is not considered significant, so exchanging equivalent
values should *not* create a new permutation (i.e. multiset
permutations rather than sequence permutations).

If users aren't likely to encounter situations where that makes a
difference, then providing both in the standard library isn't being
helpful, it's being actively user hostile by asking them to make a
decision they're not yet qualified to make for the sake of the few
experts that specifically need . Hence Raymond's request for data
modelling problems outside the "learning or studying combinatorics"
context to make the case for standard library inclusion.

Interestingly, I just found another language which has the equivalent
of the currrent behaviour of itertools.permutations: Haskell has it as
Data.List.permutations. As far as I can tell, Haskell doesn't offer
support for multiset permutations in the core, you need an additional
package like Math.Combinatorics (see:
http://hackage.haskell.org/package/multiset-comb-0.2.3/docs/Math-Combinatorics-Multiset.html#g:4).

Since iterator based programming in Python is heavily inspired by
Haskell, this suggests that the current behaviour of
itertools.permutations is appropriate and that Raymond is right to be
dubious about including multiset permutations support directly in the
standard library.

Those interested in improving the experience of writing combinatorics
code in Python may wish to look into helping out with the
combinatorics package on PyPI:
http://phillipmfeldman.org/Python/for_developers.html (For example,
politely approach Phillip to see if he is interested in hosting it on
GitHub or BitBucket, providing Sphinx docs on ReadTheDocs, improving
the PyPI metadata, etc - note I have no experience with this package,
it's just the first hit for "python combinatorics")

> One suggestion: Why not make it so that itertools.permutations checks if its
> argument is an instance of collections.Mapping?  If it is, we could
> interpret the items as a mapping from elements to positive integers, which
> is a compact representation of a multiset.  Then, it could do the right
> thing for that case.

If you want to go down the path of only caring about hashable values,
you may want to argue for a permutations method on collections.Counter
(it's conceivable that approach has the potential to be even faster
than an approach based on accepting and processing an arbitrary
iterable, since it can avoid generating repeated values in the first
place).

A Counter based multiset permutation algorithm was actually posted to
python-list back in 2009, just after collections.Counter was
introduced: https://mail.python.org/pipermail/python-list/2009-January/521685.html

I just created an updated version of that recipe and posted it as
https://bitbucket.org/ncoghlan/misc/src/default/multiset_permutations.py

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Python-ideas mailing list