Efficient String Lookup?
chrisks at NOSPAM.udel.edu
Sun Oct 17 07:56:07 CEST 2004
Andrew Dalke wrote:
> One way is with groups. Make each pattern into a regexp
> pattern then concatenate them as
> (pat1)|(pat2)|(pat3)| ... |(patN)
> Do the match and find which group has the non-None value.
> You may need to tack a "$" on the end of string (in which
> case remember to enclose everything in a () so the $ doesn't
> affect only the last pattern).
> One thing to worry about is that you can only have 99 groups
> in a pattern.
> Here's example code.
> import re
>
> config_data = [
>     ("abc#e#", "Reactor meltdown imminent"),
>     ("ab##", "Antimatter containment field breach"),
>     ("b####f", "Coffee too strong"),
> ]
>
> # Turn each "#" wildcard into "." and wrap each pattern in its own group.
> as_regexps = ["(%s)" % pattern.replace("#", ".")
>               for (pattern, text) in config_data]
> full_regexp = "|".join(as_regexps) + "$"
> pat = re.compile(full_regexp)
>
> input_data = ["abadb", "abcdef", "zxc", "abcq", "b1234f"]
>
> for text in input_data:
>     m = pat.match(text)
>     if not m:
>         print("%s? That's okay." % (text,))
>     else:
>         # The index of the non-None group identifies the matching pattern.
>         for i, val in enumerate(m.groups()):
>             if val is not None:
>                 print("%s? We've got a %r warning!" % (text,
>                                                         config_data[i][1]))
> Here's the output I got when I ran it
> abadb? We've got a 'Antimatter containment field breach' warning!
> abcdef? We've got a 'Reactor meltdown imminent' warning!
> zxc? That's okay.
> abcq? We've got a 'Antimatter containment field breach' warning!
> b1234f? We've got a 'Coffee too strong' warning!
Thanks, that's almost exactly what I'm looking for. The only downside I
see is that I still need to add and remove patterns, so continually
recompiling the expression might be expensive.
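
One way to keep that cost down, sketched here under assumptions that are not part of the thread, is to recompile lazily: adding or removing a pattern only marks the combined expression as stale, and it is rebuilt on the next lookup. The names PatternTable, add, remove, lookup and compile_count are hypothetical, chosen for illustration. The sketch also differs from the quoted code in two ways: it uses fullmatch, so every pattern must cover the whole input, and it escapes the literal parts of each pattern so that only "#" acts as a wildcard.

```python
import re


class PatternTable:
    """Hypothetical helper mapping "#"-wildcard patterns to messages."""

    def __init__(self):
        self._messages = {}     # pattern -> message, in insertion order
        self._order = []        # messages in group order of the cached regexp
        self._compiled = None   # cached combined regexp; None means stale
        self.compile_count = 0  # how many times the regexp was rebuilt

    def add(self, pattern, message):
        self._messages[pattern] = message
        self._compiled = None   # invalidate, but do not recompile yet

    def remove(self, pattern):
        del self._messages[pattern]
        self._compiled = None

    def _rebuild(self):
        groups = []
        for pattern in self._messages:
            # Escape the literal pieces; each "#" becomes a "." wildcard.
            body = ".".join(re.escape(part) for part in pattern.split("#"))
            groups.append("(%s)" % body)
        self._order = list(self._messages.values())
        self._compiled = re.compile("|".join(groups))
        self.compile_count += 1

    def lookup(self, text):
        """Return the message of the first matching pattern, or None."""
        if not self._messages:
            return None
        if self._compiled is None:
            self._rebuild()
        m = self._compiled.fullmatch(text)
        if m is None:
            return None
        # Exactly one top-level group takes part in a match, so lastindex
        # is the 1-based position of the pattern that matched.
        return self._order[m.lastindex - 1]


table = PatternTable()
table.add("abc#e#", "Reactor meltdown imminent")
table.add("ab##", "Antimatter containment field breach")
table.add("b####f", "Coffee too strong")
```

With this arrangement a burst of add and remove calls costs one compile at the next lookup rather than one per change. Whether that is enough depends on how often lookups and changes alternate, which the thread does not say.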