[Python-Dev] Re: A `cogen' module [was: Re: PEP 218 (sets); moving set.py to Lib]
François Pinard
pinard@iro.umontreal.ca
28 Aug 2002 12:12:41 -0400
[Guido van Rossum]
> > def cartesian(*sequences): [...]
> It occurred to me that this is rather inefficient because it invokes
> itself recursively many times (once for each element in the first
> sequence). This version is much faster, because iterating over a
> built-in sequence (like a list) is much faster than iterating over a
> generator:
Granted, thanks! I postponed optimisations for the first draft of `cogen',
but if it looks acceptable overall, we can try to get some speed out of it now.
> def cartesian(*sequences):
>     if len(sequences) == 0:
>         yield []
>     else:
>         head, tail = sequences[:-1], sequences[-1]
>         for x in cartesian(*head):
>             for y in tail:
>                 yield x + [y]
> I also wonder if perhaps ``tail = list(tail)'' should be inserted
> just before the for loop, so that the arguments may be iterators as
> well.
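For readers following along, here is the quoted version as a self-contained sketch, with a small usage example (the function body is exactly the one quoted above; the sample call is mine):

```python
def cartesian(*sequences):
    """Generate the cartesian product of the given sequences."""
    if len(sequences) == 0:
        yield []
    else:
        # Recurse on all but the last sequence; iterate the last one
        # directly, which is fast when it is a built-in sequence.
        head, tail = sequences[:-1], sequences[-1]
        for x in cartesian(*head):
            for y in tail:
                yield x + [y]

print(list(cartesian([1, 2], ['a', 'b'])))
# [[1, 'a'], [1, 'b'], [2, 'a'], [2, 'b']]
```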
`cogen' does not make any special effort for protecting iterators given as
input sequences. `cartesian' is surely not the only place where iterators
would create a problem. Solving it as a special case for `cartesian'
only is not very nice. Of course, we might transform all sequences into
lists everywhere in `cogen', but `list'-ing a list copies it, and I'm not
sure this would be such a good idea in the end.
Best may be to let the user explicitly transform input iterators into
lists, by calling `list' on those arguments before passing them in. That
might be an acceptable compromise.
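To illustrate why a bare iterator is a problem, and what the compromise looks like from the caller's side, here is a small sketch (using the quoted `cartesian'; the sample calls are mine):

```python
def cartesian(*sequences):
    if len(sequences) == 0:
        yield []
    else:
        head, tail = sequences[:-1], sequences[-1]
        for x in cartesian(*head):
            for y in tail:
                yield x + [y]

# An iterator argument is exhausted after the first pass over `tail',
# so later head elements pair with nothing:
broken = list(cartesian([1, 2], iter(['a', 'b'])))
# [[1, 'a'], [1, 'b']]  -- the pairs for x == [2] are silently lost

# The compromise: the caller turns iterator arguments into lists first.
fixed = list(cartesian([1, 2], list(iter(['a', 'b']))))
# [[1, 'a'], [1, 'b'], [2, 'a'], [2, 'b']]
```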
> I expect that Eric Raymond's powerset is much faster than your recursive
> subsets() [...]
Very likely, yes. I started `cogen' with algorithms that all look somewhat
alike, and did not look at speed. OK, I'll switch. I do not have Eric's
algorithm handy, but if I remember correctly, it merely mapped successive
integers to subsets by associating each bit with an element.
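A minimal sketch of that bit-mapping idea, reconstructed from the description above (this is my guess at the technique, not Eric's actual code): integer i's set bits select which elements appear in the i-th subset, so counting from 0 to 2**n - 1 enumerates all subsets without recursion.

```python
def powerset(base):
    """Yield every subset of `base', one per integer bit pattern."""
    base = list(base)
    for i in range(2 ** len(base)):
        # Bit `bit' of i decides whether base[bit] is in this subset.
        yield [x for bit, x in enumerate(base) if i >> bit & 1]

print(list(powerset('ab')))
# [[], ['a'], ['b'], ['a', 'b']]
```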
--
François Pinard http://www.iro.umontreal.ca/~pinard