[Tutor] recursion and power sets

Tim Peters tim.one at comcast.net
Mon Mar 8 23:34:16 EST 2004


[alice]
> ..
> <image>
> light bulb turning on above alice's head
> </image>
>
> <python>
>  def powset(seq):
>      if seq:
>          head, tail = seq[:1], seq[1:]
>          for smaller in powset(tail):
>              yield smaller
>              yield head + smaller
>      else:
>          yield []
>
>  for s in powset([1, 2, 3]):
>      print s
> </python>
>
> This is one of them iterator-thingy-ma-bobs, isn't it?  hmmmm.....

It's called a generator.  It's always easy to turn a generator into a
function that materializes the entire result list in one gulp; for example,
we could write the function above like so:

def powset2(seq):
    result = []
    if seq:
        head, tail = seq[:1], seq[1:]
        for smaller in powset(tail):
            result.append(smaller)
            result.append(head + smaller)
    else:
        result.append([])
    return result

IOW, everywhere we "yield"ed a result in the generator version, in the
giant-result-list version we just append it to a result list instead.

The joy of a generator is that it produces intermediate results one at a
time.  If, for example, len(seq)==60, there's not enough computer memory on
Earth to hold the entire powerset at once, but the generator version doesn't
need to.  There's a lot more info about generators in PEP 255:

    http://www.python.org/peps/pep-0255.html

Once you're used to them, they're almost always exactly what you want for,
umm, *generating* a sequence of combinatorial objects (powersets,
combinations, permutations, etc).


[Tim]
>> There isn't really a canonical way -- "whatever works" is fine.....

[alice]
> I kind of thought that, with respect to algorithms, "canonical" and
> "most concise" were the same thing? Surely python programmers, of all
> people, are not of the attitude that: "whatever works is fine"!

Happy Python programmers generally prefer clarity to conciseness, and the
two aren't always the same.  If you're generating powersets just for the
intellectual challenge, then the definition of "works" is pretty loose.
But, for example, if you're generating powersets for "a real reason",
perhaps you *need* to generate them in increasing size.  Then the definition
of "works" gets harder to meet.

> In retrospect it does seem like a very natural way of thinking about
> the problem. I wonder why I didn't see it myself?
> Perhaps because I knew that I _could_ find the subsets in order of
> smallest to largest, I felt that somehow I _should_.

The *output* is prettier that way.  You could try to think about that
recursively too:  suppose you had the powerset of S, and wanted the powerset
of S *plus* a new element x, and wanted to generate the elements in
increasing order of size.  Well, we could first sort the powerset of S by
size.  The powerset of T is again the powerset of S, plus the powerset of S
fiddled to add x to each element.  Adding x to an element increases its size
by 1.  So we first want to yield the size-0 elements of the powerset of S,
then the size-1 elements of the powerset of S, then the size-0 elements of
the powerset of S fiddled to add x to each, then the size-2 elements of the
powerset of S, then the size-1 elements of the powerset of S fiddled to add
x to each, and so on.  Here's a program to do that:

def powset_bysize(seq):
    if seq:
        head, tail = seq[:-1], seq[-1:]
        segregated_by_size = [[] for dummy in range(len(seq))]
        for x in powset_bysize(head):
            segregated_by_size[len(x)].append(x)
        last_one = []
        for sized_list in segregated_by_size:
            for x in sized_list:
                yield x
            for x in last_one:
                yield x + tail
            last_one = sized_list
        for x in last_one:
            yield x + tail
    else:
        yield []

Then powset_bysize([1, 2, 3]) generates in this order:

[]
[1]
[2]
[3]
[1, 2]
[1, 3]
[2, 3]
[1, 2, 3]

It's not nearly as *clear* a program, though, IMO.  Maybe you can find a way
to make it prettier.  It also loses much of the memory advantage of
generators, because the topmost invocation ends up materializing the entire
powerset of seq[:-1] (which is half the size of the powerset of seq) at
once.

A more natural approach could follow from the one you started with:  take
"generate all the subsets of size i" as a problem in its own right (that's
called "a combination", btw).  Then, assuming you have a generator

    gen_subsets_of_size(set, size)

the powerset generator is just

    def gen_powset_by_size(set):
        for size in range(len(set) + 1):
            for x in gen_subsets_of_size(set, size):
                yield x

That has a different kind of beauty, because gen_subsets_of_size() is likely
to be a useful generator on its own, and it's always lovely to reuse code!

In the concise department, here's a shorter way to write a giant-result-list
version:

def powset3(seq):
    if seq:
        p = powset3(seq[1:])
        return p + [x + seq[:1] for x in p]
    else:
        return [[]]

If you want to get *really* concise <wink>, here's a tiny non-recursive
generator version, using the bit-fiddling trick you discovered earlier:

def powset4(seq):
    pairs = [(2**i, x) for i, x in enumerate(seq)]
    for i in xrange(2**len(pairs)):
        yield [x for (mask, x) in pairs if i & mask]

Are these getting clearer as they get shorter?  I don't think so.

> hmmmm...
> Anyway, thanks for guiding me from the first apparent solution to the
> beautiful one!

It's just decades of practice <wink>.




More information about the Tutor mailing list