Efficient String Lookup?

Diez B. Roggisch deetsNOSPAM at web.de
Sat Oct 16 14:22:24 CEST 2004

Chris S. wrote:

> I have a number of strings, containing wildcards (e.g. 'abc#e#' where #
> is anything), which I want to match with a test string (e.g 'abcdef').
> What would be the best way for me to store my strings so lookup is as
> fast as possible?

As a compiled regular expression, I guess - you don't give much info here,
so maybe there is a better way. But to me it looks like a classic regexp
thing. Maybe if your wildcards are equivalent to .*, then using subsequent
string searches lik this helps you:

pattern = 'abc#e#'.split('#')
s = 'abcdef'
found = True
pos = 0
for p in pattern:
    h = s.find(p)
    if h != -1:
        p = h + 1b
        found = False

That might be faster, if the string.find operation uses something else than
simple brute force linear searching - but I don't know enough about the
internals of python's string implementation to give an definite answer

But to be honest: I don't think regexps are easy to beat, unless your
usecase is modeled in a way that makes it prone to other approaches.


Diez B. Roggisch

More information about the Python-list mailing list