# Recursive generator for combinations of a multiset?

James hslee911 at yahoo.com
Fri Nov 22 03:14:41 CET 2013

```On Thursday, November 21, 2013 5:01:15 AM UTC-8, John O'Hagan wrote:
> On Thu, 21 Nov 2013 11:42:49 +0000
>
> Oscar Benjamin  wrote:
>
>
>
> > On 21 November 2013 06:46, John O'Hagan
>
> > wrote:
>
> > >
>
> > > I found a verbal description of such an algorithm and came up with
>
> > > this:
>
> > >
>
> > > def multicombs(it, r):
>
> > >     result = it[:r]
>
> > >     yield result
>
> > >     while 1:
>
> > >         for i in range(-1, -r - 1, -1):
>
> > >             rep = result[i]
>
> > >             if rep < it[i]:
>
> > >                 break
>
> > >         else:
>
> > >             break
>
> > >         for j, n in enumerate(it):
>
> > >             if n > rep:
>
> > >                 break
>
> > >         result = result[:i] + it[j:j - i]
>
> > >         yield result
>
> >
>
> > I'm not really sure what it is you're asking for. I thought if I ran
>
> > the code I'd understand but that just confused me more. Is the output
>
> > below correct? If not what should it be?
>
> >
>
>
> > ['']
>
>
> > ['a']
>
>
> > ['ab', 'br', 'ra']
>
>
> > ['abr', 'ara', 'bra']
>
>
> > ['abra']
>
>
> > ['abrac', 'abrbr', 'abrra', 'braca', 'brara', 'brbra', 'racad',
>
> > 'racbr', 'racra']
>
>
>
>
>
> I neglected to mention that multicombs takes a sorted iterable;
>
> it doesn't work right otherwise. I'd forgotten that because my
>
> wordlists are guaranteed sorted by the way they're built. Sorry about
>
> that.
>
>
>
> In my use-case the first argument to multicombs is a tuple of words
>
> which may contain duplicates, and it produces all unique combinations
>
> of a certain length of those words, eg:
>
>
>
> list(multicombs(('cat', 'hat', 'in', 'the', 'the'), 3))
>
>
>
> [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'),
>
> ('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'),
>
> ('in', 'the', 'the')]
>
>
>
> Contrast this with:
>
>
>
> list(itertools.combinations(('cat', 'hat', 'in', 'the', 'the'), 3))
>
>
>
> [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'hat', 'the'),
>
> ('cat', 'in', 'the'), ('cat', 'in', 'the'), ('cat', 'the', 'the'),
>
> ('hat', 'in', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'),
>
> ('in', 'the', 'the')]
>
>
>
> which produces results which are redundant for my purposes.
>
>
>
> What I'm looking for is a recursive algorithm which does what
>
> multicombs does (order unimportant) so that I can apply a pruning
>
> shortcut like the one I used in the recursive cartesian product
>
> algorithm in my original post.
>
>
>
> Multiset combination algorithms seem pretty thin on the ground out
>
> there - as I said, I could only find a description of the procedure
>
> above, no actual code. The ones I did find are non-recursive. I'm
>
> hoping some combinatorics and/or recursion experts can offer advice.
>
>
>
> Regards,
>
>
>
> --
>
>
>
> John

Could convert the following perl script to python?

use Data::Dump qw(dump);
dump combo([@ARGV], 3);

sub combo {
my (\$t, \$k) = @_;
my @T = @\$t;
my @R = ();
my %g = ();
if (\$k == 1) {
for (@T) {
push @R, \$_ unless \$g{\$_}++;
}
} else {
while (my \$x = shift @T) {
\$p = combo([@T], \$k-1);
for (@{\$p}) {
my \$q = \$x.",".\$_;
push @R, \$q unless \$g{\$q}++;
}
}
}
[@R];
}

\$ prog.pl cat hat in the the
[
"cat,hat,in",
"cat,hat,the",
"cat,in,the",
"cat,the,the",
"hat,in,the",
"hat,the,the",
"in,the,the",
]

James

```