Programming challenge: wildcard exclusion in cartesian products

Dinko Tenev dinko.tenev at gmail.com
Mon Mar 20 05:46:05 EST 2006


wkehowski at cox.net wrote:
> It would seem that your program is just filtering the full cartesian
> product, right? The solution I'm looking for generates the elements
> one-by-one so that it could be used in a loop.

OK, having read some of the comments so far, I have the feeling that I
may be missing the point in more than one way, so let's set this
straight:

If I understand correctly, for an alphabet S, and a subset W of S*
specified by the wildcards, you expect the enumeration of sequences of
length n to run in Theta( n*|S^n - W| ) instead of Theta( n*|S^n| ).

First, this doesn't seem to hold for your Python program.  Try, for
example, S = { a, b, c }, W = { *a*b*, *b*c*, *c*a*, *b*a*, *c*b*,
*a*c* }, with some large values of n.  Theta( n*|S^n - W| ) predicts
that the enumeration time should grow linearly with n, as |S^n - W| =
3, but if you take some measurements, you'd notice that it grows faster
than that.

Second, my current bet is that such an improvement in asymptotic
complexity is not possible, if we consider *both* pre-processing of the
wildcard set and subsequent enumeration.  Speculation: the time for
building-up a smart structure to speed-up enumeration, together with
the time for enumerating the set using that structure, should sum up to
roughly Theta( n*|S^n| ), even with a really smart algorithm.

Even if you're willing to pay up-front for tighter loop execution
later, and you build a suitable structure for this purpose, you would
have to consider the structure's size, so here's another speculation:
such structure would likely take up Theta( |S^n| ) space in memory, in
the worst case.

I would really appreciate it if you could pour some light into what
you're trying to do exactly, and possibly point out anything that I
might have missed so far.

Cheers,

Dinko




More information about the Python-list mailing list