Programming challenge: wildcard exclusion in cartesian products
Dirk Thierbach
dthierbach at usenet.arcornews.de
Wed Mar 22 09:14:00 EST 2006
[Had to drop alt.comp.lang.haskell, otherwise my newsserver doesn't accept it]
Dinko Tenev <dinko.tenev at gmail.com> wrote:
> OK, here's a case that will make your program run in exponential time:
> S = { a, b }, W = { *a*b, *b*a } -- on my machine, it starts getting
> ugly as soon as n is 15 or so. Note that S^n - W = { a^n, b^n }.
> In general, whenever all the patterns in the set match against the last
> position, your current implementation is guaranteed to have to sift
> through all of S^n. I'd say the very idea of checking against a
> blacklist is fundamentally flawed, as far as performance is concerned.
If more time during preprocessing is allowed, another idea is to
treat the wildcard expressions as regular expressions, convert
each into a finite state machine, construct the "intersection" of
all these state machines, minimize it and then swap final and non-final
states. Then you can use the resulting automaton to efficiently
enumerate S^n - W. In the above case, the resulting FSM would have just
three states.
And it doesn't really matter what language you use to implement this
algorithm, it's the idea that counts. Notation aside, all
implementations will be quite similar.
- Dirk
More information about the Python-list
mailing list