Fwd: Extremely weird itertools.permutations

What would you like this hypothetical function to output here:
It's neither QUITE equality nor identity you are looking for, I think, in nonredundant_permutations():
On Fri, Oct 11, 2013 at 11:38 AM, Neil Girdhar <mistersheik@gmail.com>wrote:

Andrew & Neil (or whoever): Is this *really* what you want:
    >>> pprint(list(nonredundant_permutations([F(3,1), D(3.0), 3.0])))
    [(Fraction(3, 1), Decimal('3'), 3.0)]
It seems odd to me to want that. On the other hand, I provide a one-line implementation of the desired behavior if anyone wants it. Moreover, I don't think the runtime behavior of my one-liner is particularly costly... maybe not the best possible, but the best big-O possible. On Fri, Oct 11, 2013 at 1:19 PM, Andrew Barnert <abarnert@yahoo.com> wrote:

Moreover, I don't think the runtime behavior of my one-liner is particularly costly…
It is *extremely* costly. There can be n! permutations, so for even, say, 12 elements, you are looking at many gigabytes of memory needlessly used. One big motivator for itertools is not to have to do this. I'm curious how you would solve this problem: https://www.kattis.com/problems/industrialspy efficiently in Python. I did it by using a unique-ifying generator, but ideally this would not be necessary. Ideally, Python would do exactly what C++ does with next_permutation. Best, Neil On Fri, Oct 11, 2013 at 4:27 PM, David Mertz <mertz@gnosis.cx> wrote:
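[Illustration, not from the original thread - a minimal sketch of the C++-style next_permutation Neil refers to; the function names are made up, and it assumes the elements are orderable. It steps through the distinct full-length permutations in lexicographic order using O(1) extra memory:]

    def next_permutation(seq):
        """Advance list `seq` to its next lexicographic permutation, in place.
        Return False if seq was already the last (non-increasing) arrangement."""
        # Find the rightmost i with seq[i] < seq[i + 1].
        i = len(seq) - 2
        while i >= 0 and seq[i] >= seq[i + 1]:
            i -= 1
        if i < 0:
            return False
        # Find the rightmost j with seq[j] > seq[i] and swap the two.
        j = len(seq) - 1
        while seq[j] <= seq[i]:
            j -= 1
        seq[i], seq[j] = seq[j], seq[i]
        # Reverse the tail so it becomes the smallest arrangement after i.
        seq[i + 1:] = reversed(seq[i + 1:])
        return True

    def distinct_full_permutations(iterable):
        items = sorted(iterable)
        while True:
            yield tuple(items)
            if not next_permutation(items):
                return

Because equal elements compare equal, each distinct arrangement comes out exactly once, e.g. list("".join(p) for p in distinct_full_permutations('aab')) gives ['aab', 'aba', 'baa']. Unlike itertools.permutations, this sketch only handles full-length permutations.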

My code, which was the motivation for this suggestion:

    import itertools as it
    import math

    def is_prime(n):
        for i in range(2, int(math.floor(math.sqrt(n))) + 1):
            if n % i == 0:
                return False
        return n >= 2

    def unique(iterable):  # Should not be necessary in my opinion
        seen = set()
        for x in iterable:
            if x not in seen:
                seen.add(x)
                yield x

    n = int(input())
    for _ in range(n):
        x = input()
        print(sum(is_prime(int("".join(y)))
                  for len_ in range(1, len(x) + 1)
                  for y in unique(it.permutations(x, len_))
                  if y[0] != '0'))

On Fri, Oct 11, 2013 at 5:35 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

On 11 October 2013 22:38, Neil Girdhar <mistersheik@gmail.com> wrote:
I don't really understand what your code is doing but I just wanted to point out that the above will fail for large integers (maybe not relevant in your case):
Even without the OverflowError I suspect that there are primes p > ~1e16 such that is_prime(p**2) would incorrectly return True. This is a consequence of depending on FP arithmetic in what should be an exact computation. The easy fix is to break when i**2 > n, avoiding the tricky sqrt operation. Alternatively you can use an exact integer sqrt function to fix this:

    def sqrt_floor(y):
        try:
            x = int(math.sqrt(y))
        except OverflowError:
            x = y
        while not (x ** 2 <= y < (x + 1) ** 2):
            x = (x + y // x) // 2
        return x

Oscar
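[Illustration, not from the original message - the first suggested fix, trial division that stops once i*i exceeds n so no floating-point sqrt is involved:]

    def is_prime(n):
        # Pure integer arithmetic; assumes n is an int.
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True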

On 13 October 2013 19:29, Neil Girdhar <mistersheik@gmail.com> wrote:
Did you read the problem?
I did but since you showed some code that you said you were working on I thought you'd be interested to know that it could be improved.
Anyway, let's not get off topic (permutations).
Getting back to your proposal, I disagree that permutations should be "fixed". The current behaviour is correct. If I were asked to define a permutation I would give definition #3 from Steven's list: a bijection from a set to itself. Formally, a permutation of a collection of non-unique elements is not defined. There may also be uses for a function like the one you proposed, but I've never needed it (and I have used permutations a few times), and no one in this thread (including you) has given a use-case for it. Oscar

On Sun, Oct 13, 2013 at 3:34 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com>wrote:
The code solves the problem according to its specification :) (The numbers are less than 1e8.)
The problem is a use-case. Did you read it? Did you try solving it?

On Sun, Oct 13, 2013 at 1:04 PM, MRAB <python@mrabarnett.plus.com> wrote:
Here's a use case: anagrams.
For what it's worth, I've written anagram-finding code, and I didn't do it with permutations. The faster approach is to create a dictionary mapping a canonical form of each word to a list of words, e.g., { 'ACT': ['ACT', 'CAT'], 'AET': ['ATE', 'EAT', 'ETA', 'TEA'] }. This requires extra work to build the map, but you do that just once when you read the dictionary, and then every lookup is O(1) not O(len(word)). This canonical-form approach is useful for other word transformations that are used in puzzles, e.g., words that have the same consonants (ignoring vowels). --- Bruce P.S. Yes, I know: if you play Scrabble, TAE is also a word.
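[Illustration, not from the original message - a minimal sketch of the canonical-form dictionary Bruce describes; the word list and function name are made up for the example:]

    from collections import defaultdict

    def build_anagram_map(words):
        # Map each word's sorted letters (its canonical form) to the words
        # sharing that form, so finding all anagrams is one dict lookup.
        anagrams = defaultdict(list)
        for word in words:
            anagrams["".join(sorted(word.upper()))].append(word.upper())
        return anagrams

    anagrams = build_anagram_map(["cat", "act", "tea", "eat", "eta", "ate"])
    print(anagrams["".join(sorted("TAC"))])   # ['CAT', 'ACT']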

Here are a couple people looking for the function that doesn't exist (yet?) http://stackoverflow.com/questions/9660085/python-permutations-with-constrai... http://stackoverflow.com/questions/15592299/generating-unique-permutations-i... On Mon, Oct 14, 2013 at 4:52 PM, Bruce Leban <bruce@leapyear.org> wrote:

OK, you're right. Just using set() has bad worst case memory costs. I was thinking of the case where there actually WERE lots of equalities, and hence the resulting list would be much smaller than N!. But of course that's not general. It takes more than one line, but here's an incremental version:

    def nonredundant_permutations(seq):
        seq = sorted(seq)
        last = None
        for perm in permutations(seq):
            if perm != last:
                yield perm
                last = perm

On Fri, Oct 11, 2013 at 2:35 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

Unfortunately, that doesn't quite work…

    >>> list("".join(x) for x in it.permutations('aaabb', 3))
    ['aaa', 'aab', 'aab', 'aaa', 'aab', 'aab', 'aba', 'aba', 'abb', 'aba', 'aba', 'abb',
     'aaa', 'aab', 'aab', 'aaa', 'aab', 'aab', 'aba', 'aba', 'abb', 'aba', 'aba', 'abb',
     'aaa', 'aab', 'aab', 'aaa', 'aab', 'aab', 'aba', 'aba', 'abb', 'aba', 'aba', 'abb',
     'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'bba', 'bba', 'bba',
     'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'bba', 'bba', 'bba']

On Fri, Oct 11, 2013 at 5:48 PM, David Mertz <mertz@gnosis.cx> wrote:

Bummer. You are right, Neil. I saw MRAB's suggestion about sorting, and falsely thought that would be general; but obviously it's not. So I guess the question is whether there is ANY way to do this without having to accumulate a 'seen' set (which can grow to size N!). The answer isn't jumping out at me, but that doesn't mean there's not a way. I don't want itertools.permutations() to do "equality filtering", but assuming some other function in itertools were to do that, how could it do so algorithmically? Or whatever, same question if it is itertools.permutations(seq, distinct=True) as the API. On Fri, Oct 11, 2013 at 2:51 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

On 11/10/2013 23:03, David Mertz wrote:
Here's an implementation:

    def unique_permutations(iterable, count=None, key=None):
        def perm(items, count):
            if count:
                prev_item = object()
                for i, item in enumerate(items):
                    if item != prev_item:
                        for p in perm(items[ : i] + items[i + 1 : ], count - 1):
                            yield [item] + p
                        prev_item = item
            else:
                yield []

        if key is None:
            key = lambda item: item

        items = sorted(iterable, key=key)

        if count is None:
            count = len(items)

        yield from perm(items, count)

And some results:

    >>> print(list("".join(x) for x in unique_permutations('aaabb', 3)))
    ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba']
    >>> print(list(unique_permutations([0, 'a', 0], key=str)))
    [[0, 0, 'a'], [0, 'a', 0], ['a', 0, 0]]

I realize after reading http://stackoverflow.com/questions/6284396/permutations-with-unique-values that my version was ALMOST right:

    def nonredundant_permutations(seq, r=None):
        last = ()
        for perm in permutations(sorted(seq), r):
            if perm > last:
                yield perm
                last = perm

I can't look only for inequality, but must use the actual comparison.
Of course, this approach DOES rely on the order in which itertools.permutations() returns values. However, it's a bit more compact than MRAB's version. On Fri, Oct 11, 2013 at 3:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

On 12 Oct 2013 08:45, "David Mertz" <mertz@gnosis.cx> wrote:
> I realize after reading http://stackoverflow.com/questions/6284396/permutations-with-unique-values that my version was ALMOST right:
>
> >>> ["".join(x) for x in nonredundant_permutations('aaabb',3)]
> ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba']
> >>> list(nonredundant_permutations([F(3,1), D(3.0), 3.0]))
> [(Fraction(3, 1), Decimal('3'), 3.0)]
>
> Of course, this approach DOES rely on the order in which itertools.permutations() returns values. However, it's a bit more compact than MRAB's version.

As there is no requirement that entries in a sequence handled by itertools.permutations be sortable, the original question of why this isn't done by default has been answered (the general solution risks consuming too much memory, while the memory-efficient solution constrains the domain to only sortable sequences).

Cheers,
Nick.

Yes, that's all true. I want to suggest that the efficient unique-permutations solution is very important to have. Sortable sequences are very common. There are already itertools routines that only work with ==-comparable elements (e.g. groupby), so it's not a stretch to have a permutations variant that is restricted to <-comparable elements.

Best,
Neil

On Fri, Oct 11, 2013 at 7:49 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
> As there is no requirement that entries in a sequence handled by itertools.permutations be sortable, the original question of why this isn't done by default has been answered (the general solution risks consuming too much memory, while the memory-efficient solution constrains the domain to only sortable sequences).

I think this is worth having even for 3.3 and 2.x, so I'd suggest sending a patch to more-itertools (https://github.com/erikrose/more-itertools) as well as here. Sent from a random iPhone On Oct 11, 2013, at 16:53, Neil Girdhar <mistersheik@gmail.com> wrote:

On 12/10/2013 00:49, Nick Coghlan wrote:
OK, here's a new implementation:

    def unique_permutations(iterable, count=None):
        def perm(items, count):
            if count:
                prev_item = object()
                for i, item in enumerate(items):
                    if item != prev_item:
                        for p in perm(items[ : i] + items[i + 1 : ], count - 1):
                            yield [item] + p
                        prev_item = item
            else:
                yield []

        items = list(iterable)

        keys = {}
        for item in items:
            keys.setdefault(item, len(keys))

        items.sort(key=keys.get)

        if count is None:
            count = len(items)

        yield from perm(items, count)
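[Illustration, not from the original thread: the keys dict ranks items by first appearance, and since dict lookup uses ==, items that compare equal share a rank; the sort then groups equal items next to each other without requiring the items themselves to be orderable. For example:]

    >>> list(unique_permutations([2.0, 1, 2]))
    [[2.0, 2, 1], [2.0, 1, 2], [1, 2.0, 2]]

Here 2 and 2.0 compare equal, so only 3!/2! = 3 distinct permutations are produced.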

On 12/10/2013 02:55, MRAB wrote:
[snip] I've just realised that I don't need to sort them at all. Here's a new improved implementation:

    def unique_permutations(iterable, count=None):
        def perm(items, count):
            if count:
                seen = set()
                for i, item in enumerate(items):
                    if item not in seen:
                        for p in perm(items[ : i] + items[i + 1 : ], count - 1):
                            yield [item] + p
                        seen.add(item)
            else:
                yield []

        items = list(iterable)

        if count is None:
            count = len(items)

        yield from perm(items, count)
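[Illustration, not from the original message - a quick session with this version, worked out by hand; it still yields each distinct arrangement exactly once:]

    >>> print(list("".join(x) for x in unique_permutations('aaabb', 3)))
    ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba']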

[MRAB, posts a beautiful solution]

I don't really have a use for this, but it was a lovely programming puzzle, so I'll include an elaborate elaboration of MRAB's algorithm below. And that's the end of my interest in this ;-)

It doesn't require that elements be orderable or even hashable. It does require that they can be compared for equality, but it's pretty clear that if we _do_ include something like this, "equality" has to be pluggable. By default, this uses `operator.__eq__`, but any 2-argument function can be used. E.g., use `operator.is_` to make it believe that only identical objects are equal. Or pass a lambda to distinguish by type too (e.g., if you don't want 3 and 3.0 to be considered equal). Etc.

The code is much lower-level, to make it closer to an efficient C implementation. No dicts, no sets, no slicing or concatenation of lists, etc. It sticks to using little integers (indices) as far as possible, which can be native in C (avoiding mounds of increfs and decrefs).

Also, because "equality" is pluggable, it may be a slow operation. The `equal()` function is only called here during initial setup, to partition the elements into equivalence classes. Where N is len(iterable), at best `equal()` is called N-1 times (if all elements happen to be equal), and at worst N*(N-1)/2 times (if no elements happen to be equal), all independent of `count`. It assumes `equal()` is transitive.

It doesn't always return permutations in the same order as MRAB's function, because - to avoid any searching - it iterates over equivalence classes instead of over the original iterable. This is the simplest differing example I can think of:
    >>> list(unique_permutations("aba", 2))
    [('a', 'b'), ('a', 'a'), ('b', 'a')]
For the first result, MRAB's function first picks the first 'a', then removes it from the iterables and recurses on ("ba", 1). So it finds 'b' next, and yields ('a', 'b') (note: this is the modified unique_permutations() below - MRAB's original actually yielded lists, not tuples). But:
    >>> list(up("aba", 2))
    [('a', 'a'), ('a', 'b'), ('b', 'a')]
Different order! That's because "up" is iterating over (conceptually) [EquivClass(first 'a', second 'a'), EquivClass('b')] It first picks the first `a`, then adjusts list pointers (always a fast, constant-time operation) so that it recurses on [EquivClass(second 'a'), EquivClass('b')] So it next finds the second 'a', and yields (first 'a', second 'a') as its first result. Maybe this will make it clearer:
    >>> list(up(["a1", "b", "a2"], 2, lambda x, y: x[0] == y[0]))
    [('a1', 'a2'), ('a1', 'b'), ('b', 'a1')]
No, I guess that didn't make it clearer - LOL ;-) Do I care? No. Anyway, here's the code. Have fun :-)

    # MRAB's beautiful solution, modified in two ways to be
    # more like itertools.permutations:
    # 1. Yield tuples instead of lists.
    # 2. When count > len(iterable), don't yield anything.
    def unique_permutations(iterable, count=None):
        def perm(items, count):
            if count:
                seen = set()
                for i, item in enumerate(items):
                    if item not in seen:
                        for p in perm(items[:i] + items[i+1:], count - 1):
                            yield [item] + p
                        seen.add(item)
            else:
                yield []

        items = list(iterable)
        if count is None:
            count = len(items)
        if count > len(items):
            return
        for p in perm(items, count):
            yield tuple(p)

    # New code, ending in generator `up()`.

    import operator

    # In C, this would be a struct of native C types,
    # and the brief methods would be coded inline.
    class ENode:
        def __init__(self, initial_index=None):
            self.indices = [initial_index]  # list of equivalent indices
            self.current = 0
            self.prev = self.next = self

        def index(self):
            "Return current index."
            return self.indices[self.current]

        def unlink(self):
            "Remove self from list."
            self.prev.next = self.next
            self.next.prev = self.prev

        def insert_after(self, x):
            "Insert node x after self."
            x.prev = self
            x.next = self.next
            self.next.prev = x
            self.next = x

        def advance(self):
            """Advance the current index.  If we're already at the end,
            remove self from list.  .restore() undoes everything
            .advance() did."""
            assert self.current < len(self.indices)
            self.current += 1
            if self.current == len(self.indices):
                self.unlink()

        def restore(self):
            "Undo what .advance() did."
            assert self.current <= len(self.indices)
            if self.current == len(self.indices):
                self.prev.insert_after(self)
            self.current -= 1

    def build_equivalence_classes(items, equal):
        ehead = ENode()  # headed, doubly-linked circular list of equiv classes
        for i, elt in enumerate(items):
            e = ehead.next
            while e is not ehead:
                if equal(elt, items[e.indices[0]]):
                    # Add (index of) elt to this equivalence class.
                    e.indices.append(i)
                    break
                e = e.next
            else:
                # elt not equal to anything seen so far: append
                # new equivalence class.
                e = ENode(i)
                ehead.prev.insert_after(e)
        return ehead

    def up(iterable, count=None, equal=operator.__eq__):
        def perm(i):
            if i:
                e = ehead.next
                assert e is not ehead
                while e is not ehead:
                    result[count - i] = e.index()
                    e.advance()
                    yield from perm(i - 1)
                    e.restore()
                    e = e.next
            else:
                yield tuple(items[j] for j in result)

        items = tuple(iterable)
        if count is None:
            count = len(items)
        if count > len(items):
            return
        ehead = build_equivalence_classes(items, equal)
        result = [None] * count
        yield from perm(count)
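[Illustration, not part of Tim's message - a quick sanity check one might run with both definitions above in scope; the two generators should agree on the *set* of distinct permutations even when they disagree on order:]

    # Same distinct permutations, possibly in a different order.
    assert set(up("aaabb", 3)) == set(unique_permutations("aaabb", 3))
    # With all-distinct elements the orders happen to coincide as well.
    assert list(up("abc", 2)) == list(unique_permutations("abc", 2))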

[Tim]
[MRAB]
I posted yet another implementation after that one.
I know. I was talking about the beautiful one ;-) The later one could build equivalence classes faster (than mine) in many cases, but I don't care much about the startup costs. I care a lot more about:

1. Avoiding searches in the recursive function; i.e., this:

    for i, item in enumerate(items):
        if item != prev_item:

Making such tests millions (billions ...) of times adds up - and equality testing may not be cheap. The algorithm I posted does no item testing after the setup is done (none in its recursive function).

2. Making "equality" pluggable. Your later algorithm bought "find equivalence classes" speed for hashable elements by using a dict, but a dict's notion of equality can't be changed. So, make equality pluggable, and that startup-time speed advantage vanishes for all but operator.__eq__'s idea of equality.
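[Illustration, not from the original thread, using the up() generator from Tim's earlier message - a type-aware predicate keeps 3 and 3.0 apart, while the default equality collapses them into one equivalence class:]

    >>> list(up([3, 3.0]))    # 3 == 3.0: one equivalence class
    [(3, 3.0)]
    >>> list(up([3, 3.0], equal=lambda a, b: type(a) is type(b) and a == b))
    [(3, 3.0), (3.0, 3)]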

On 13 October 2013 22:22, Tim Peters <tim.peters@gmail.com> wrote:
It sounds like you want Antoine's TransformDict: http://www.python.org/dev/peps/pep-0455/ Oscar

[Tim]
[Oscar Benjamin]
It sounds like you want Antoine's TransformDict: http://www.python.org/dev/peps/pep-0455/
Not really in this case - I want a two-argument function ("are A and B equal?"). Not all plausible cases of that can be mapped to a canonical hashable key. For example, consider permutations of a list of lists, where the user doesn't want int and float elements of the lists to be considered equal when they happen to have the same value. Is that a stretch? Oh ya ;-)

[Oscar Benjamin]
Will this do? d = TransformDict(lambda x: (type(x), x))
No. In the example I gave, *lists* will be passed as x (it was a list of lists: the lists are the elements of the permutations, and they happen to have internal structure of their own). So the `type(x)` there is useless (it will always be the list type), while the lists themselves would still be compared by operator.__eq__. Not to mention that the constructed tuple isn't hashable anyway (x is a list), so can't be used by TransformDict. So that idea doesn't work several times over ;-)

On 14 October 2013 00:44, Tim Peters <tim.peters@gmail.com> wrote:
Damn, you're right. I obviously didn't think that one through hard enough. Okay how about this? d = TransformDict(lambda x: (tuple(map(type, x)), tuple(x))) Oscar

[Oscar Benjamin]
Oscar, please give this up - it's not going to work. `x` can be any object whatsoever, with arbitrarily complex internal structure, and the user can have an arbitrarily convoluted idea of what "equal" means. Did I mention that these lists don't *only* have ints and floats as elements, but also nested sublists? Oh ya - they also want a float and a singleton list containing the same float to be considered equal ;-) Etc. Besides, you're trying to solve a problem I didn't have to begin with ;-) That is, I don't care much about the cost of building equivalence classes - it's a startup cost for the generator, not an "inner loop" cost. Even if you could bash every case into a different convoluted hashable tuple, in general it's going to be - in this specific problem - far easier for the user to define an equal() function they like, working directly on the two objects. That doesn't require an endless sequence of tricks.

Hi MRAB,

I'm confused by your implementation. In particular, what do these lines do?

    # [...]
    items = list(iterable)
    keys = {}
    for item in items:
        keys.setdefault(item, len(keys))
    items.sort(key=keys.get)

I cannot understand how these can possibly have any effect (other than the first line that makes a concrete list out of an iterable). We loop through the list in its natural order. E.g. say the list is '[a, b, c]' (where those names are any types of objects whatsoever). The loop gives us:

    keys == {a: 0, b: 1, c: 2}

When we do a sort on 'key=keys.get', how can that ever possibly change the order of 'items'?

There's also a bit of a flaw in that your implementation blows up if anything yielded by iterable isn't hashable:

    >>> list(unique_permutations([ [1,2],[3,4],[5,6] ]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'

There's no problem doing this with itertools.permutations:

    >>> list(permutations([[1,2],[3,4],[5,6]]))
    [([1, 2], [3, 4], [5, 6]), ([1, 2], [5, 6], [3, 4]), ([3, 4], [1, 2], [5, 6]),
     ([3, 4], [5, 6], [1, 2]), ([5, 6], [1, 2], [3, 4]), ([5, 6], [3, 4], [1, 2])]

This particular one also succeeds with my nonredundant_permutations:

    >>> list(nonredundant_permutations([[1,2],[3,4],[5,6]]))
    [([1, 2], [3, 4], [5, 6]), ([1, 2], [5, 6], [3, 4]), ([3, 4], [1, 2], [5, 6]),
     ([3, 4], [5, 6], [1, 2]), ([5, 6], [1, 2], [3, 4]), ([5, 6], [3, 4], [1, 2])]

However, my version *DOES* fail when things cannot be compared under inequality:

    >>> list(nonredundant_permutations([[1,2],3,4]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in nonredundant_permutations
    TypeError: unorderable types: int() < list()

This also doesn't afflict itertools.permutations.

On 12/10/2013 03:36, David Mertz wrote:
You're assuming that no item is equal to any other. Try this:

    keys = {}
    for item in [1, 2, 2.0]:
        keys.setdefault(item, len(keys))

You'll get:

    keys == {1: 0, 2: 1}

because 2 == 2.0.
That is true, so here is yet another implementation:

    def distinct_permutations(iterable, count=None):
        def perm(items, count):
            if count:
                prev_item = object()
                for i, item in enumerate(items):
                    if item != prev_item:
                        for p in perm(items[ : i] + items[i + 1 : ], count - 1):
                            yield [item] + p
                        prev_item = item
            else:
                yield []

        hashable_items = {}
        unhashable_items = []
        for item in iterable:
            try:
                hashable_items[item].append(item)
            except KeyError:
                hashable_items[item] = [item]
            except TypeError:
                for key, values in unhashable_items:
                    if key == item:
                        values.append(item)
                        break
                else:
                    unhashable_items.append((item, [item]))

        items = []
        for values in hashable_items.values():
            items.extend(values)
        for key, values in unhashable_items:
            items.extend(values)

        if count is None:
            count = len(items)

        yield from perm(items, count)

It uses a dict for speed, with the fallback of a list for unhashable items.
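[Illustration, not from the original post - this version accepts the unhashable-lists example David raised, though it yields lists rather than tuples:]

    >>> list(distinct_permutations([[1, 2], [3, 4], [5, 6]]))
    [[[1, 2], [3, 4], [5, 6]], [[1, 2], [5, 6], [3, 4]], [[3, 4], [1, 2], [5, 6]],
     [[3, 4], [5, 6], [1, 2]], [[5, 6], [1, 2], [3, 4]], [[5, 6], [3, 4], [1, 2]]]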

On 11/10/2013 21:27, David Mertz wrote:
n! gets very big very fast, so that can be a very big set. If you sort the original items first then it's much easier to yield unique permutations without having to remember them. (Each would be greater than the previous one, although you might have to map them to orderable keys if they're not orderable themselves, e.g. a mixture of integers and strings.)

I think it's fair to use {3.0, 3} as precedent. But note that transitivity is not required by the __eq__() method. In cases of intransitive equality (A == B == C but not A == C), I imagine the result should be ill-defined in the same way that sorting is when the key function is inconsistent. Jon On Fri, Oct 11, 2013 at 4:19 PM, Andrew Barnert <abarnert@yahoo.com> wrote:

Andrew & Neil (or whoever): Is this *really* what you want:
pprint(list(nonredundant_permutations([F(3,1), D(3.0), 3.0]))) [(Fraction(3, 1), Decimal('3'), 3.0)]
It seems odd to me to want that. On the other hand, I provide a one-line implementation of the desired behavior if anyone wants it. Moreover, I don't think the runtime behavior of my one-liner is particularly costly... maybe not the best possible, but the best big-O possible. On Fri, Oct 11, 2013 at 1:19 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

Moreover, I don't think the runtime behavior of my one-liner is particularly costly…
It is *extremely* costly. There can be n! permutations, so for even, say, 12 elements, you are looking at many gigabytes of memory needlessly used. One big motivator for itertools is not to have to do this. I'm curious how you would solve this problem: https://www.kattis.com/problems/industrialspy efficiently in Python. I did it by using a unique-ifying generator, but ideally this would not be necessary. Ideally, Python would do exactly what C++ does with next_permutation. Best, Neil On Fri, Oct 11, 2013 at 4:27 PM, David Mertz <mertz@gnosis.cx> wrote:

My code, which was the motivation for this suggestion: import itertools as it import math def is_prime(n): for i in range(2, int(math.floor(math.sqrt(n))) + 1): if n % i == 0: return False return n >= 2 def unique(iterable): # Should not be necessary in my opinion seen = set() for x in iterable: if x not in seen: seen.add(x) yield x n = int(input()) for _ in range(n): x = input() print(sum(is_prime(int("".join(y))) for len_ in range(1, len(x) + 1) for y in unique(it.permutations(x, len_)) if y[0] != '0')) On Fri, Oct 11, 2013 at 5:35 PM, Neil Girdhar <mistersheik@gmail.com> wrote:

On 11 October 2013 22:38, Neil Girdhar <mistersheik@gmail.com> wrote:
I don't really understand what your code is doing but I just wanted to point out that the above will fail for large integers (maybe not relevant in your case):
Even without the OverflowError I suspect that there are primes p > ~1e16 such that is_prime(p**2) would incorrectly return True. This is a consequence of depending on FP arithmetic in what should be exact computation. The easy fix is to break when i**2 > n avoiding the tricky sqrt operation. Alternatively you can use an exact integer sqrt function to fix this: def sqrt_floor(y): try: x = int(math.sqrt(y)) except OverflowError: x = y while not (x ** 2 <= y < (x+1) ** 2): x = (x + y // x) // 2 return x Oscar

On 13 October 2013 19:29, Neil Girdhar <mistersheik@gmail.com> wrote:
Did you read the problem?
I did but since you showed some code that you said you were working on I thought you'd be interested to know that it could be improved.
Anyway, let's not get off topic (permutations).
Getting back to your proposal, I disagree that permutations should be "fixed". The current behaviour is correct. If I was asked to define a permutation I would have given definition #3 from Steven's list: a bijection from a set to itself. Formally a permutation of a collection of non-unique elements is not defined. They may also be uses for a function like the one that you proposed but I've never needed it (and I have used permutations a few times) and no one in this thread (including you) has given a use-case for this. Oscar

On Sun, Oct 13, 2013 at 3:34 PM, Oscar Benjamin <oscar.j.benjamin@gmail.com>wrote:
The code solves the problem according to its specification :) (The numbers are less than 1e8.)
The problem is a use-case. Did you read it? Did you try solving it?

On Sun, Oct 13, 2013 at 1:04 PM, MRAB <python@mrabarnett.plus.com> wrote:
Here's a use case: anagrams.
For what it's worth, I've written anagram-finding code, and I didn't do it with permutations. The faster approach is to create a dictionary mapping a canonical form of each word to a list of words, e.g., { 'ACT': ['ACT', 'CAT'], 'AET': ['ATE', 'EAT', 'ETA', 'TEA'] } This requires extra work to build the map but you do that just once when you read the dictionary and then every lookup is O(1) not O(len(word)). This canonical form approach is useful for other word transformations that are used in puzzles, e.g., words that are have the same consonants (ignoring vowels). --- Bruce I'm hiring: http://www.cadencemd.com/info/jobs Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security P.S. Yes, I know: if you play Scrabble, TAE is also a word.

Here are a couple people looking for the function that doesn't exist (yet?) http://stackoverflow.com/questions/9660085/python-permutations-with-constrai... http://stackoverflow.com/questions/15592299/generating-unique-permutations-i... On Mon, Oct 14, 2013 at 4:52 PM, Bruce Leban <bruce@leapyear.org> wrote:

OK, you're right. Just using set() has bad worst case memory costs. I was thinking of the case where there actually WERE lots of equalities, and hence the resulting list would be much smaller than N!. But of course that's not general. It takes more than one line, but here's an incremental version: def nonredundant_permutations(seq): seq = sorted(seq) last = None for perm in permutations(seq): if perm != last: yield perm last = perm On Fri, Oct 11, 2013 at 2:35 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

Unfortunately, that doesn't quite work… list("".join(x) for x in it.permutations('aaabb', 3)) ['aaa', 'aab', 'aab', 'aaa', 'aab', 'aab', 'aba', 'aba', 'abb', 'aba', 'aba', 'abb', 'aaa', 'aab', 'aab', 'aaa', 'aab', 'aab', 'aba', 'aba', 'abb', 'aba', 'aba', 'abb', 'aaa', 'aab', 'aab', 'aaa', 'aab', 'aab', 'aba', 'aba', 'abb', 'aba', 'aba', 'abb', 'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'bba', 'bba', 'bba', 'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'baa', 'baa', 'bab', 'bba', 'bba', 'bba'] On Fri, Oct 11, 2013 at 5:48 PM, David Mertz <mertz@gnosis.cx> wrote:

Bummer. You are right, Neil. I saw MRAB's suggestion about sorting, and falsely thought that would be general; but obviously it's not. So I guess the question is whether there is ANY way to do this without having to accumulate a 'seen' set (which can grow to size N!). The answer isn't jumping out at me, but that doesn't mean there's not a way. I don't want itertools.permutations() to do "equality filtering", but assuming some other function in itertools were to do that, how could it do so algorithmically? Or whatever, same question if it is itertools.permutations(seq, distinct=True) as the API. On Fri, Oct 11, 2013 at 2:51 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On 11/10/2013 23:03, David Mertz wrote:
Here's an implementation: def unique_permutations(iterable, count=None, key=None): def perm(items, count): if count: prev_item = object() for i, item in enumerate(items): if item != prev_item: for p in perm(items[ : i] + items[i + 1 : ], count - 1): yield [item] + p prev_item = item else: yield [] if key is None: key = lambda item: item items = sorted(iterable, key=key) if count is None: count = len(items) yield from perm(items, count) And some results:

I realize after reading http://stackoverflow.com/questions/6284396/permutations-with-unique-valuesth... my version was ALMOST right: def nonredundant_permutations(seq, r=None): last = () for perm in permutations(sorted(seq), r): if perm > last: yield perm last = perm I can't look only for inequality, but must use the actual comparison.
Of course, this approach DOES rely on the order in which itertools.permutations() returns values. However, it's a bit more compact than MRAB's version. On Fri, Oct 11, 2013 at 3:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On 12 Oct 2013 08:45, "David Mertz" <mertz@gnosis.cx> wrote: > > > I realize after reading http://stackoverflow.com/questions/6284396/permutations-with-unique-valuesthat my version was ALMOST right: > > def nonredundant_permutations(seq, r=None): > last = () > for perm in permutations(sorted(seq), r): > if perm > last: > yield perm > last = perm > > I can't look only for inequality, but must use the actual comparison. > > >>> ["".join(x) for x in nonredundant_permutations('aaabb',3)] > ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba'] > >>> list(nonredundant_permutations([F(3,1), D(3.0), 3.0])) > [(Fraction(3, 1), Decimal('3'), 3.0)] > > Of course, this approach DOES rely on the order in which itertools.permutations() returns values. However, it's a bit more compact than MRAB's version. As there is no requirement that entries in a sequence handled by itertools.permutations be sortable, so the original question of why this isn't done by default has been answered (the general solution risks consuming too much memory, while the memory efficient solution constrains the domain to only sortable sequences). Cheers, Nick. > > > > > On Fri, Oct 11, 2013 at 3:23 PM, Neil Girdhar <mistersheik@gmail.com> wrote: >> >> Beautiful!! >> >> >> On Fri, Oct 11, 2013 at 6:19 PM, MRAB <python@mrabarnett.plus.com> wrote: >>> >>> On 11/10/2013 23:03, David Mertz wrote: >>>> >>>> Bummer. You are right, Neil. I saw MRAB's suggestion about sorting, >>>> and falsely thought that would be general; but obviously it's not. >>>> >>>> So I guess the question is whether there is ANY way to do this without >>>> having to accumulate a 'seen' set (which can grow to size N!). The >>>> answer isn't jumping out at me, but that doesn't mean there's not a way. >>>> >>>> I don't want itertools.permutations() to do "equality filtering", but >>>> assuming some other function in itertools were to do that, how could it >>>> do so algorithmically? Or whatever, same question if it is >>>> itertools.permutations(seq, distinct=True) as the API. >>>> >>> Here's an implementation: >>> >>> def unique_permutations(iterable, count=None, key=None): >>> def perm(items, count): >>> if count: >>> prev_item = object() >>> >>> for i, item in enumerate(items): >>> if item != prev_item: >>> for p in perm(items[ : i] + items[i + 1 : ], count - 1): >>> yield [item] + p >>> >>> prev_item = item >>> >>> else: >>> yield [] >>> >>> if key is None: >>> key = lambda item: item >>> >>> items = sorted(iterable, key=key) >>> >>> if count is None: >>> count = len(items) >>> >>> yield from perm(items, count) >>> >>> >>> And some results: >>> >>> >>> print(list("".join(x) for x in unique_permutations('aaabb', 3))) >>> ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba'] >>> >>> print(list(unique_permutations([0, 'a', 0], key=str))) >>> [[0, 0, 'a'], [0, 'a', 0], ['a', 0, 0]] >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas@python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >>> >>> -- >>> >>> --- You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group. >>> To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/dDttJfkyu2k/unsubscribe. >>> To unsubscribe from this group and all its topics, send an email to python-ideas+unsubscribe@googlegroups.com. >>> For more options, visit https://groups.google.com/groups/opt_out. 
>> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas@python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> > > > > -- > Keeping medicines from the bloodstreams of the sick; food > from the bellies of the hungry; books from the hands of the > uneducated; technology from the underdeveloped; and putting > advocates of freedom in prisons. Intellectual property is > to the 21st century what the slave trade was to the 16th. > > _______________________________________________ > Python-ideas mailing list > Python-ideas@python.org > https://mail.python.org/mailman/listinfo/python-ideas >

Yes, that's all true. I want to suggest that the efficient unique permutations solution is very important to have. Sortable sequences are very common. There are itertools routines that only work with =-comparable elements (e.g. groupby), so it's not a stretch to have a permutations that is restricted to <-comparable elements. Best, Neil On Fri, Oct 11, 2013 at 7:49 PM, Nick Coghlan <ncoghlan@gmail.com> wrote: > > On 12 Oct 2013 08:45, "David Mertz" <mertz@gnosis.cx> wrote: > > > > > > I realize after reading > http://stackoverflow.com/questions/6284396/permutations-with-unique-valuesthat my version was ALMOST right: > > > > def nonredundant_permutations(seq, r=None): > > last = () > > for perm in permutations(sorted(seq), r): > > if perm > last: > > yield perm > > last = perm > > > > I can't look only for inequality, but must use the actual comparison. > > > > >>> ["".join(x) for x in nonredundant_permutations('aaabb',3)] > > ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba'] > > >>> list(nonredundant_permutations([F(3,1), D(3.0), 3.0])) > > [(Fraction(3, 1), Decimal('3'), 3.0)] > > > > Of course, this approach DOES rely on the order in which > itertools.permutations() returns values. However, it's a bit more compact > than MRAB's version. > > As there is no requirement that entries in a sequence handled by > itertools.permutations be sortable, so the original question of why this > isn't done by default has been answered (the general solution risks > consuming too much memory, while the memory efficient solution constrains > the domain to only sortable sequences). > > Cheers, > Nick. > > > > > > > > > > > On Fri, Oct 11, 2013 at 3:23 PM, Neil Girdhar <mistersheik@gmail.com> > wrote: > >> > >> Beautiful!! > >> > >> > >> On Fri, Oct 11, 2013 at 6:19 PM, MRAB <python@mrabarnett.plus.com> > wrote: > >>> > >>> On 11/10/2013 23:03, David Mertz wrote: > >>>> > >>>> Bummer. You are right, Neil. I saw MRAB's suggestion about sorting, > >>>> and falsely thought that would be general; but obviously it's not. > >>>> > >>>> So I guess the question is whether there is ANY way to do this without > >>>> having to accumulate a 'seen' set (which can grow to size N!). The > >>>> answer isn't jumping out at me, but that doesn't mean there's not a > way. > >>>> > >>>> I don't want itertools.permutations() to do "equality filtering", but > >>>> assuming some other function in itertools were to do that, how could > it > >>>> do so algorithmically? Or whatever, same question if it is > >>>> itertools.permutations(seq, distinct=True) as the API. 
> >>>> > >>> Here's an implementation: > >>> > >>> def unique_permutations(iterable, count=None, key=None): > >>> def perm(items, count): > >>> if count: > >>> prev_item = object() > >>> > >>> for i, item in enumerate(items): > >>> if item != prev_item: > >>> for p in perm(items[ : i] + items[i + 1 : ], count > - 1): > >>> yield [item] + p > >>> > >>> prev_item = item > >>> > >>> else: > >>> yield [] > >>> > >>> if key is None: > >>> key = lambda item: item > >>> > >>> items = sorted(iterable, key=key) > >>> > >>> if count is None: > >>> count = len(items) > >>> > >>> yield from perm(items, count) > >>> > >>> > >>> And some results: > >>> > >>> >>> print(list("".join(x) for x in unique_permutations('aaabb', 3))) > >>> ['aaa', 'aab', 'aba', 'abb', 'baa', 'bab', 'bba'] > >>> >>> print(list(unique_permutations([0, 'a', 0], key=str))) > >>> [[0, 0, 'a'], [0, 'a', 0], ['a', 0, 0]] > >>> > >>> > >>> _______________________________________________ > >>> Python-ideas mailing list > >>> Python-ideas@python.org > >>> https://mail.python.org/mailman/listinfo/python-ideas > >>> > >>> -- > >>> > >>> --- You received this message because you are subscribed to a topic in > the Google Groups "python-ideas" group. > >>> To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/dDttJfkyu2k/unsubscribe. > >>> To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscribe@googlegroups.com. > >>> For more options, visit https://groups.google.com/groups/opt_out. > >> > >> > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas@python.org > >> https://mail.python.org/mailman/listinfo/python-ideas > >> > > > > > > > > -- > > Keeping medicines from the bloodstreams of the sick; food > > from the bellies of the hungry; books from the hands of the > > uneducated; technology from the underdeveloped; and putting > > advocates of freedom in prisons. Intellectual property is > > to the 21st century what the slave trade was to the 16th. > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas@python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > >

I think this is worth having even for 3.3 and 2.x, so I'd suggest sending a patch to more-itertools (https://github.com/erikrose/more-itertools) as well as here. Sent from a random iPhone On Oct 11, 2013, at 16:53, Neil Girdhar <mistersheik@gmail.com> wrote:

On 12/10/2013 00:49, Nick Coghlan wrote:
OK, here's a new implementation: def unique_permutations(iterable, count=None): def perm(items, count): if count: prev_item = object() for i, item in enumerate(items): if item != prev_item: for p in perm(items[ : i] + items[i + 1 : ], count - 1): yield [item] + p prev_item = item else: yield [] items = list(iterable) keys = {} for item in items: keys.setdefault(item, len(keys)) items.sort(key=keys.get) if count is None: count = len(items) yield from perm(items, count)

On 12/10/2013 02:55, MRAB wrote:
[snip] I've just realised that I don't need to sort them at all. Here's a new improved implementation: def unique_permutations(iterable, count=None): def perm(items, count): if count: seen = set() for i, item in enumerate(items): if item not in seen: for p in perm(items[ : i] + items[i + 1 : ], count - 1): yield [item] + p seen.add(item) else: yield [] items = list(iterable) if count is None: count = len(items) yield from perm(items, count)

[MRAB, posts a beautiful solution] I don't really have a use for this, but it was a lovely programming puzzle, so I'll include an elaborate elaboration of MRAB's algorithm below. And that's the end of my interest in this ;-) It doesn't require that elements be orderable or even hashable. It does require that they can be compared for equality, but it's pretty clear that if we _do_ include something like this, "equality" has to be pluggable. By default, this uses `operator.__eq__`, but any 2-argument function can be used. E.g., use `operator.is_` to make it believe that only identical objects are equal. Or pass a lambda to distinguish by type too (e.g., if you don't want 3 and 3.0 to be considered equal). Etc. The code is much lower-level, to make it closer to an efficient C implementation. No dicts, no sets, no slicing or concatenation of lists, etc. It sticks to using little integers (indices) as far as possible, which can be native in C (avoiding mounds of increfs and decrefs). Also, because "equality" is pluggable, it may be a slow operation. The `equal()` function is only called here during initial setup, to partition the elements into equivalence classes. Where N is len(iterables), at best `equal()` is called N-1 times (if all elements happen to be equal), and at worst N*(N-1)/2 times (if no elements happen to be equal), all independent of `count`. It assumes `equal()` is transitive. It doesn't always return permutations in the same order as MRAB's function, because - to avoid any searching - it iterates over equivalence classes instead of over the original iterables. This is the simplest differing example I can think of:
list(unique_permutations("aba", 2)) [('a', 'b'), ('a', 'a'), ('b', 'a')]
For the first result, MRAB's function first picks the first 'a', then removes it from the iterables and recurses on ("ba", 1). So it finds 'b' next, and yields ('a', 'b') (note: this is the modified unique_permutations() below - MRAB's original actually yielded lists, not tuples). But:
list(up("aba", 2)) [('a', 'a'), ('a', 'b'), ('b', 'a')]
Different order! That's because "up" is iterating over (conceptually) [EquivClass(first 'a', second 'a'), EquivClass('b')] It first picks the first `a`, then adjusts list pointers (always a fast, constant-time operation) so that it recurses on [EquivClass(second 'a'), EquivClass('b')] So it next finds the second 'a', and yields (first 'a', second 'a') as its first result. Maybe this will make it clearer:
list(up(["a1", "b", "a2"], 2, lambda x, y: x[0]==y[0])) [('a1', 'a2'), ('a1', 'b'), ('b', 'a1')]
No, I guess that didn't make it clearer - LOL ;-) Do I care? No. Anyway, here's the code. Have fun :-) # MRAB's beautiful solution, modified in two ways to be # more like itertools.permutations: # 1. Yield tuples instead of lists. # 2. When count > len(iterable), don't yield anything. def unique_permutations(iterable, count=None): def perm(items, count): if count: seen = set() for i, item in enumerate(items): if item not in seen: for p in perm(items[:i] + items[i+1:], count - 1): yield [item] + p seen.add(item) else: yield [] items = list(iterable) if count is None: count = len(items) if count > len(items): return for p in perm(items, count): yield tuple(p) # New code, ending in generator `up()`. import operator # In C, this would be a struct of native C types, # and the brief methods would be coded inline. class ENode: def __init__(self, initial_index=None): self.indices = [initial_index] # list of equivalent indices self.current = 0 self.prev = self.next = self def index(self): "Return current index." return self.indices[self.current] def unlink(self): "Remove self from list." self.prev.next = self.next self.next.prev = self.prev def insert_after(self, x): "Insert node x after self." x.prev = self x.next = self.next self.next.prev = x self.next = x def advance(self): """Advance the current index. If we're already at the end, remove self from list. .restore() undoes everything .advance() did.""" assert self.current < len(self.indices) self.current += 1 if self.current == len(self.indices): self.unlink() def restore(self): "Undo what .advance() did." assert self.current <= len(self.indices) if self.current == len(self.indices): self.prev.insert_after(self) self.current -= 1 def build_equivalence_classes(items, equal): ehead = ENode() # headed, doubly-linked circular list of equiv classes for i, elt in enumerate(items): e = ehead.next while e is not ehead: if equal(elt, items[e.indices[0]]): # Add (index of) elt to this equivalence class. e.indices.append(i) break e = e.next else: # elt not equal to anything seen so far: append # new equivalence class. e = ENode(i) ehead.prev.insert_after(e) return ehead def up(iterable, count=None, equal=operator.__eq__): def perm(i): if i: e = ehead.next assert e is not ehead while e is not ehead: result[count - i] = e.index() e.advance() yield from perm(i-1) e.restore() e = e.next else: yield tuple(items[j] for j in result) items = tuple(iterable) if count is None: count = len(items) if count > len(items): return ehead = build_equivalence_classes(items, equal) result = [None] * count yield from perm(count)

[Tim]
[MRAB]
I posted yet another implementation after that one.
I know. I was talking about the beautiful one ;-) The later one could build equivalence classes faster (than mine) in many cases, but I don't care much about the startup costs. I care a lot more about: 1. Avoiding searches in the recursive function; i.e., this: for i, item in enumerate(items): if item != prev_item: Making such tests millions (billions ...) of times adds up - and equality testing may not be cheap. The algorithm I posted does no item testing after the setup is done (none in its recursive function). 2. Making "equality" pluggable. Your later algorithm bought "find equivalence classes" speed for hashable elements by using a dict, but a dict's notion of equality can't be changed. So, make equality pluggable, and that startup-time speed advantage vanishes for all but operator.__eq__'s idea of equality.

On 13 October 2013 22:22, Tim Peters <tim.peters@gmail.com> wrote:
It sounds like you want Antoine's TransformDict: http://www.python.org/dev/peps/pep-0455/

Oscar

[Tim]
[Oscar Benjamin]
It sounds like you want Antoine's TransformDict: http://www.python.org/dev/peps/pep-0455/
Not really in this case - I want a two-argument function ("are A and B equal?"). Not all plausible cases of that can be mapped to a canonical hashable key. For example, consider permutations of a list of lists, where the user doesn't want int and float elements of the lists to be considered equal when they happen to have the same value. Is that a stretch? Oh ya ;-)

[Oscar Benjamin]
Will this do?

    d = TransformDict(lambda x: (type(x), x))
No. In the example I gave, *lists* will be passed as x (it was a list of lists: the lists are the elements of the permutations, and they happen to have internal structure of their own). So the `type(x)` there is useless (it will always be the list type), while the lists themselves would still be compared by operator.__eq__. Not to mention that the constructed tuple isn't hashable anyway (x is a list), so can't be used by TransformDict. So that idea doesn't work several times over ;-)
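A tiny aside (my illustration, not from the thread) on that last point - a tuple that contains a list can't be used as a dict key at all:

    >>> d = {}
    >>> d[(type([1, 2]), [1, 2])] = "anything"
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'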

On 14 October 2013 00:44, Tim Peters <tim.peters@gmail.com> wrote:
Damn, you're right. I obviously didn't think that one through hard enough. Okay how about this?

    d = TransformDict(lambda x: (tuple(map(type, x)), tuple(x)))

Oscar

[Oscar Benjamin]
Oscar, please give this up - it's not going to work. `x` can be any object whatsoever, with arbitrarily complex internal structure, and the user can have an arbitrarily convoluted idea of what "equal" means. Did I mention that these lists don't *only* have ints and floats as elements, but also nested sublists? Oh ya - they also want a float and a singleton list containing the same float to be considered equal ;-) Etc.

Besides, you're trying to solve a problem I didn't have to begin with ;-) That is, I don't care much about the cost of building equivalence classes - it's a startup cost for the generator, not an "inner loop" cost. Even if you could bash every case into a different convoluted hashable tuple, in general it's going to be - in this specific problem - far easier for the user to define an equal() function they like, working directly on the two objects. That doesn't require an endless sequence of tricks.
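For flavor, here is a minimal sketch (my own, with invented name and rules, following the requirements Tim describes above) of the kind of two-argument equal() a user might write directly - one with no obvious canonical hashable key:

    def fussy_equal(a, b):
        # Hypothetical equal(), not from the thread.
        # A scalar matches a one-element list holding an "equal" scalar.
        if isinstance(a, list) and not isinstance(b, list):
            return len(a) == 1 and fussy_equal(a[0], b)
        if isinstance(b, list) and not isinstance(a, list):
            return len(b) == 1 and fussy_equal(a, b[0])
        # Lists compare element by element, recursively.
        if isinstance(a, list) and isinstance(b, list):
            return (len(a) == len(b)
                    and all(fussy_equal(x, y) for x, y in zip(a, b)))
        # Scalars must have the same type *and* value, so 3 != 3.0 here.
        return type(a) is type(b) and a == b

Something like this could be passed straight to up() as equal=fussy_equal, with no key-construction tricks needed.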

Hi MRAB,

I'm confused by your implementation. In particular, what do these lines do?

    # [...]
    items = list(iterable)
    keys = {}
    for item in items:
        keys.setdefault(item, len(keys))
    items.sort(key=keys.get)

I cannot understand how these can possibly have any effect (other than the first line that makes a concrete list out of an iterable). We loop through the list in its natural order. E.g. say the list is '[a, b, c]' (where those names are any types of objects whatsoever). The loop gives us:

    keys == {a: 0, b: 1, c: 2}

When we do a sort on 'key=keys.get', how can that ever possibly change the order of 'items'?

There's also a bit of a flaw in that your implementation blows up if anything yielded by iterable isn't hashable:

    >>> list(unique_permutations([ [1,2],[3,4],[5,6] ]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'

There's no problem doing this with itertools.permutations:

    >>> list(permutations([[1,2],[3,4],[5,6]]))
    [([1, 2], [3, 4], [5, 6]),
     ([1, 2], [5, 6], [3, 4]),
     ([3, 4], [1, 2], [5, 6]),
     ([3, 4], [5, 6], [1, 2]),
     ([5, 6], [1, 2], [3, 4]),
     ([5, 6], [3, 4], [1, 2])]

This particular one also succeeds with my nonredundant_permutations:

    >>> list(nonredundant_permutations([[1,2],[3,4],[5,6]]))
    [([1, 2], [3, 4], [5, 6]),
     ([1, 2], [5, 6], [3, 4]),
     ([3, 4], [1, 2], [5, 6]),
     ([3, 4], [5, 6], [1, 2]),
     ([5, 6], [1, 2], [3, 4]),
     ([5, 6], [3, 4], [1, 2])]

However, my version *DOES* fail when things cannot be compared under inequality:

    >>> list(nonredundant_permutations([[1,2],3,4]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in nonredundant_permutations
    TypeError: unorderable types: int() < list()

This also doesn't afflict itertools.permutations.

On 12/10/2013 03:36, David Mertz wrote:
You're assuming that no item is equal to any other. Try this:

    keys = {}
    for item in [1, 2, 2.0]:
        keys.setdefault(item, len(keys))

You'll get:

    keys == {1: 0, 2: 1}

because 2 == 2.0.
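To spell out why that matters for the sort (my illustration, not from the thread): equal items end up sharing one dict key, so keys.get hands them the same sort key, and the stable sort pulls them next to each other while otherwise preserving first-seen order - which is exactly what the adjacent-duplicate check in the recursion relies on:

    items = [2, 1, 2.0]
    keys = {}
    for item in items:
        keys.setdefault(item, len(keys))
    print(keys)                         # {2: 0, 1: 1}
    print(sorted(items, key=keys.get))  # [2, 2.0, 1] - the duplicates are now adjacent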
That is true, so here is yet another implementation:

----8<----------------------------------------8<----
def distinct_permutations(iterable, count=None):
    def perm(items, count):
        if count:
            prev_item = object()
            for i, item in enumerate(items):
                if item != prev_item:
                    for p in perm(items[ : i] + items[i + 1 : ], count - 1):
                        yield [item] + p
                    prev_item = item
        else:
            yield []

    hashable_items = {}
    unhashable_items = []

    for item in iterable:
        try:
            hashable_items[item].append(item)
        except KeyError:
            hashable_items[item] = [item]
        except TypeError:
            for key, values in unhashable_items:
                if key == item:
                    values.append(item)
                    break
            else:
                unhashable_items.append((item, [item]))

    items = []

    for values in hashable_items.values():
        items.extend(values)

    for key, values in unhashable_items:
        items.extend(values)

    if count is None:
        count = len(items)

    yield from perm(items, count)
----8<----------------------------------------8<----

It uses a dict for speed, with the fallback of a list for unhashable items.
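If I'm tracing it correctly, a quick check with unhashable elements (my example, not output from the thread) would give something like:

    >>> list(distinct_permutations([[1, 2], [1, 2], 3]))
    [[3, [1, 2], [1, 2]], [[1, 2], 3, [1, 2]], [[1, 2], [1, 2], 3]]

Note that this version yields lists rather than tuples, and hashable items are grouped ahead of unhashable ones, so the result order differs from itertools.permutations.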
My result is:

On 11/10/2013 21:27, David Mertz wrote:
n! gets very big very fast, so that can be a very big set. If you sort the original items first then it's much easier to yield unique permutations without having to remember them. (Each would be greater than the previous one, although you might have to map them to orderable keys if they're not orderable themselves, e.g. a mixture of integers and strings.)
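A minimal sketch of that sort-first idea (mine, not MRAB's or David's actual code), assuming the items are orderable:

    from itertools import permutations

    def sorted_unique_permutations(iterable, r=None):
        # Sort first: the first appearance of each distinct permutation then
        # comes out in increasing lexicographic order, so anything not greater
        # than the last permutation yielded must be a repeat.
        last = None
        for p in permutations(sorted(iterable), r):
            if last is None or p > last:
                yield p
                last = p

This needs no seen-set, but it inherits the restriction MRAB mentions: the items (or their keys) must be orderable, which is exactly where the earlier "unorderable types: int() < list()" traceback comes from.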

I think it's fair to use {3.0, 3} as precedent. But note that transitivity is not required by the __eq__() method. In cases of intransitive equality (A == B == C but not A == C), I imagine the result should be ill-defined in the same way that sorting is when the key function is inconsistent.

Jon

On Fri, Oct 11, 2013 at 4:19 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
participants (9)

- Andrew Barnert
- Bruce Leban
- David Mertz
- Jonathan Brandvein
- MRAB
- Neil Girdhar
- Nick Coghlan
- Oscar Benjamin
- Tim Peters