[Python-Dev] Re: A `cogen' module [was: Re: PEP 218 (sets); moving set.py to Lib]
Guido van Rossum
guido@python.org
Wed, 28 Aug 2002 15:41:00 -0400
> Of course, even statisticians cannot afford to enumerate all subsets
> of a 100-element set, nor to scan all permutations of 1000 elements.
> But inputs of 100s or 1000s of elements are well within the bounds of
> cartesian products.
Yes, and there the cost of an extra list() copy is negligible
(allocate and copy 4K bytes).
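For concreteness, here's a minimal sketch of the kind of cartesian-product
generator being discussed (not the actual `cogen' code; the names
`cartesian' and `gen' are made up for this illustration).  Each input is
snapshotted with list() up front, which is exactly the copy whose cost is
negligible at these sizes:

def cartesian(*iterables):
    # Copy each input to a list so it can be scanned repeatedly;
    # for a few hundred or a few thousand elements this copy is cheap.
    pools = [list(it) for it in iterables]
    def gen(i):
        # Recursively yield tuples combining one item from each pool.
        if i == len(pools):
            yield ()
        else:
            for item in pools[i]:
                for rest in gen(i + 1):
                    yield (item,) + rest
    return gen(0)

# e.g. list(cartesian([1, 2], "ab")) == [(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]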
> > Or you might use Oren's favorite rule of thumb and listify it when iter(x)
> > is iter(x) (or iter(x) is x).
>
> I'm a bit annoyed by the idea that `iter(x)' might require some
> computation to produce an iterator, only for us to throw the result
> away immediately.  Granted, `__iter__(self): return self' is efficient
> when an object is an iterator, but nowhere is it said that `__iter__'
> has to be efficient when the object is a container, and it does not
> shock me that some complex containers take time to produce their
> iterator.  I much prefer limiting the use of `__iter__' to when one
> intends to use the iterator...
Yes, that's why I prefer to just make a list() copy.
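As a rough illustration of the two approaches (both helper names are made
up here): Oren's rule of thumb copies only when the object turns out to be
its own iterator, while the simpler alternative is to copy unconditionally.

def listify_if_iterator(x):
    # Oren's rule of thumb, roughly: if iter(x) returns x itself,
    # x is a one-shot iterator that can only be scanned once, so
    # snapshot it; a real container can simply be re-iterated later.
    if iter(x) is x:
        return list(x)
    return x

def listify_always(x):
    # The simpler alternative: always pay for one list() copy.
    return list(x)

The first variant avoids the copy for containers, but it has to call
iter() just to inspect the result, which is the objection quoted above;
the second trades that for one unconditional copy.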
> > def powerset(base):
> >     """Powerset of an iterable, yielding lists."""
> >     pairs = [(2**i, x) for i, x in enumerate(base)]
> >     for n in xrange(2**len(pairs)):
> >         yield [x for m, x in pairs if m&n]
>
> Thanks! Hmph! This does not yield the subsets in "sorted order", like
> the other `cogen' methods do, and I would prefer to keep that promise.
That may be a matter of permuting the bits?
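As a sketch of one plausible reading of that "sorted order" (taken here to
mean lexicographic by element position, which may or may not be what
`cogen' promises; the name `powerset_sorted' is made up), a recursive
formulation yields that order directly.  Note that a fixed permutation of
the bits alone cannot produce it: permuting bits preserves the number of
elements in each subset, yet lexicographic order puts the two-element
subset [base[0], base[1]] right after the one-element subset [base[0]].

def powerset_sorted(base):
    """Powerset yielding lists in lexicographic order of element
    positions: [], [a], [a, b], [a, b, c], [a, c], [b], ... (a sketch
    only -- an assumed reading of the order the cogen methods keep)."""
    base = list(base)
    def gen(start):
        yield []
        for i in range(start, len(base)):
            for rest in gen(i + 1):
                yield [base[i]] + rest
    return gen(0)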
> Hopefully, the optimisation I added this morning will make both algorithms
> more comparable, speed-wise. I should benchmark them to see.
Yes! Nothing beats a benchmark as an eye-opener.
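For what it's worth, even a minimal timing harness along these lines would
do for such a comparison (illustrative only -- `benchmark' and the names in
the final comment are made up, and this is not the benchmark referred to
above):

import time

def benchmark(gen_func, arg, repeat=3):
    # Run the generator to exhaustion a few times and keep the best
    # wall-clock time, which smooths out one-off noise a little.
    best = None
    for _ in range(repeat):
        start = time.time()
        for _item in gen_func(arg):
            pass
        elapsed = time.time() - start
        if best is None or elapsed < best:
            best = elapsed
    return best

# e.g. compare the two powerset variants on a small base:
#   benchmark(powerset, range(16)) vs. benchmark(powerset_sorted, range(16))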
--Guido van Rossum (home page: http://www.python.org/~guido/)