[Python-Dev] Re: PEP 218 (sets); moving set.py to Lib

Guido van Rossum guido@python.org
Tue, 20 Aug 2002 23:47:11 -0400


> Here's the pre-generator version I wrote using lists as the underlying
> representation.  Should be trivially transformable into a generator
> version.  I'd do it myself but I'm heads-down on bogofilter just now
> 
> def powerset(base):
>     "Compute the set of all subsets of a set."
>     powerset = []
>     for n in xrange(2 ** len(base)):
>         subset = []
>         for e in xrange(len(base)):
>              if n & 2 ** e:
>                 subset.append(base[e])
>         powerset.append(subset)
>     return powerset
> 
> Are you slapping your forehead yet? :-)

Yes!  I didn't actually know that algorithm.

Here's the generator version for sets (still requires a real set as
input):

def powerset(base):
    size = len(base)
    for n in xrange(2**size):
        subset = []
        for e, x in enumerate(base):
            if n & 2**e:
                subset.append(x)
        yield Set(subset)

I would like to write n & (1<<e) instead of n & 2**e; but 1<<e drops
bits when e is > 31.  Now, for a set with that many elements, there's
no hope that this will ever complete in finite time, but does that
mean it shouldn't start?  I could write 1L<<e and avoid the issue, but
then I'd be paying for long ops that I'll only ever need in a case
that's only of theoretical importance.

A variation: rather than calling enumerate(base) 2**size times,
concretize in it into a list.  We know it can't be very big or else
the result isn't representable:

def powerset(base):
    size = len(base)
    pairs = list(enumerate(base))
    for n in xrange(2**size):
        subset = []
        for e, x in pairs:
            if n & 2**e:
                subset.append(x)
        yield Set(subset)

Ah, and now it's trivial to write this so that base can be an
arbitrary iterable again, rather than a set (or sequence):

def powerset(base):
    pairs = list(enumerate(base))
    size = len(pairs)
    for n in xrange(2**size):
        subset = []
        for e, x in pairs:
            if n & 2**e:
                subset.append(x)
        yield subset

This is a generator that yields a series of lists whose values are the
items of base.  And again, like cartesian product, it's now more a
generator thing than a set thing.

BTW, the correctness of all my versions trivially derives from the
correctness of your version -- each is a very simple transformation of
the previous one.  My mentor Lambert Meertens calls this process
Algorithmics (and has developed a mathematical notation and theory for
program transformations).

Thanks!

--Guido van Rossum (home page: http://www.python.org/~guido/)