[Python-Dev] Re: PEP 218 (sets); moving set.py to Lib
Guido van Rossum
guido@python.org
Tue, 20 Aug 2002 23:47:11 -0400
> Here's the pre-generator version I wrote using lists as the underlying
> representation. Should be trivially transformable into a generator
> version. I'd do it myself but I'm heads-down on bogofilter just now
>
> def powerset(base):
> "Compute the set of all subsets of a set."
> powerset = []
> for n in xrange(2 ** len(base)):
> subset = []
> for e in xrange(len(base)):
> if n & 2 ** e:
> subset.append(base[e])
> powerset.append(subset)
> return powerset
>
> Are you slapping your forehead yet? :-)
Yes! I didn't actually know that algorithm.
Here's the generator version for sets (still requires a real set as
input):
def powerset(base):
size = len(base)
for n in xrange(2**size):
subset = []
for e, x in enumerate(base):
if n & 2**e:
subset.append(x)
yield Set(subset)
I would like to write n & (1<<e) instead of n & 2**e; but 1<<e drops
bits when e is > 31. Now, for a set with that many elements, there's
no hope that this will ever complete in finite time, but does that
mean it shouldn't start? I could write 1L<<e and avoid the issue, but
then I'd be paying for long ops that I'll only ever need in a case
that's only of theoretical importance.
A variation: rather than calling enumerate(base) 2**size times,
concretize in it into a list. We know it can't be very big or else
the result isn't representable:
def powerset(base):
size = len(base)
pairs = list(enumerate(base))
for n in xrange(2**size):
subset = []
for e, x in pairs:
if n & 2**e:
subset.append(x)
yield Set(subset)
Ah, and now it's trivial to write this so that base can be an
arbitrary iterable again, rather than a set (or sequence):
def powerset(base):
pairs = list(enumerate(base))
size = len(pairs)
for n in xrange(2**size):
subset = []
for e, x in pairs:
if n & 2**e:
subset.append(x)
yield subset
This is a generator that yields a series of lists whose values are the
items of base. And again, like cartesian product, it's now more a
generator thing than a set thing.
BTW, the correctness of all my versions trivially derives from the
correctness of your version -- each is a very simple transformation of
the previous one. My mentor Lambert Meertens calls this process
Algorithmics (and has developed a mathematical notation and theory for
program transformations).
Thanks!
--Guido van Rossum (home page: http://www.python.org/~guido/)