Granted, thanks! I postponed optimisations for the first draft of `cogen', but if it looks acceptable overall, we can try to get some speed of it now.
I'm not saying that it looks good overall -- I'd like to defer to Tim, who has used and written this kind of utities for real, and who probably has a lot of useful feedback. Right now, he's felled by some kind of ilness though.
def cartesian(*sequences): if len(sequences) == 0: yield [] else: head, tail = sequences[:-1], sequences[-1] for x in cartesian(*head): for y in tail: yield x + [y]
I also wonder if perhaps ``tail = list(tail)'' should be inserted just before the for loop, so that the arguments may be iterators as well.
`cogen' does not make any special effort for protecting iterators given as input sequences. `cartesian' is surely not the only place where iterators would create a problem. Solving it as a special case for `cartesian' only is not very nice. Of course, we might transform all sequences into lists everywhere in `cogen', but `list'-ing a list copies it, and I'm not sure this would be such a good idea in the end.
Hm. All but the last will be iterated over many times. In practice the inputs will be relatively small (I can't imagine using this for sequences with 100s of 1000s elements). Or you might sniff the type and avoid the copy if you know it's a list or tuple. Or you might use Oren's favorite rule of thumb and listify it when iter(x) is iter(x) (or iter(x) is x).
Best may be to let the user explicitly transform input iterators into lists by calling `list' explicitly those arguments. That might be an acceptable compromise.
Maybe.
I expect that Eric Raymond's powerset is much faster than your recursive subsets() [...]
Very likely, yes. I started `cogen' with algorithms looking a bit all alike, and did not look at speed. OK, I'll switch. I do not have Eric's algorithm handy, but if I remember well, it merely mapped successive integers to subsets by associating each bit with an element.
def powerset(base): """Powerset of an iterable, yielding lists.""" pairs = [(2**i, x) for i, x in enumerate(base)] for n in xrange(2**len(pairs)): yield [x for m, x in pairs if m&n] --Guido van Rossum (home page: http://www.python.org/~guido/)