regular expression dictionary search
mkPyVS
mikeminer53 at hotmail.com
Mon Aug 20 11:48:21 EDT 2007
On Aug 20, 9:35 am, "Shawn Milochik" <Sh... at Milochik.com> wrote:
> #!/usr/bin/env python
>
> import re
>
> patterns = { 'sho.' : 6, '.ilk' : 8, '.an.' : 78 }
>
> def returnCode(aWord):
>     for k in patterns:
>         p = "^%s$" % k
>         regex = re.compile(p)
>         if re.match(regex, aWord):
>             return patterns[k]
>
> if __name__ == "__main__":
>
>     print "The return for 'fred' : %s" % returnCode('fred')
>     print "The return for 'silk' : %s" % returnCode('silk')
>     print "The return for 'silky' : %s" % returnCode('silky')
>     print "The return for 'hand' : %s" % returnCode('hand')
>     print "The return for 'strand' : %s" % returnCode('strand')
>     print "The return for 'bank' : %s" % returnCode('bank')
>
> Note: If a word matches more than one pattern, only one will be returned.
>
> I'm not sure if I'm doing the patterns thing properly -- if anyone
> could instruct me on whether it would be proper to declare it in the
> function, or use a global declaration, please let me know. However, it
> runs properly as far as I tested it.
>
> Shawn
I think the choice between a global and a local declaration should
depend in part on the scope of your usage. Are you going to reuse the
function over and over again in multiple modules? Does it need to keep
any state, such as collected statistics? If so, I would recommend that
you upgrade your function to a class and define "patterns" as a static
class-level variable. The initialization cost is then paid only when
the class is created, which (most often) happens just once, the first
time.
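
A minimal sketch of that class idea, under some assumptions: the names
PatternCoder, return_code, hits and misses are hypothetical and not from
the original post, and the hit/miss counters are only one guess at what
"collecting statistics" might mean.

```python
import re


class PatternCoder(object):
    """Hypothetical class wrapper around the quoted returnCode function."""

    # Class-level attribute: created once, when the class body runs,
    # and shared by every instance.
    patterns = {'sho.': 6, '.ilk': 8, '.an.': 78}

    def __init__(self):
        # Per-instance state (assumed example of "statistics").
        self.hits = 0
        self.misses = 0

    def return_code(self, word):
        # Same anchored match as the quoted code; returns None on no match.
        for pattern, code in self.patterns.items():
            if re.match("^%s$" % pattern, word):
                self.hits += 1
                return code
        self.misses += 1
        return None
```

Each instance keeps its own counters, while the patterns dictionary
itself is built only once for the class.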
As a side note, unless you are searching large buffers, it is possibly
more costly to compile the pattern into a re object and then match
with it than to let the match function perform the compile itself at
function level. If you use the class option above, I would recommend
storing the re.compiled versions of your patterns in the dictionary
(everything is an object!) rather than the string representations, so
that you do not issue a compile on every call.
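
Roughly what storing the compiled versions could look like, as a sketch
only: the names compiled_patterns and return_code are hypothetical, and
using the compiled objects as dictionary keys is one reading of the
suggestion (it works because compiled pattern objects are hashable).

```python
import re

# Keys are compiled pattern objects instead of strings, so the
# compile step happens once, when this dictionary is built.
compiled_patterns = {
    re.compile(r"^sho.$"): 6,
    re.compile(r"^.ilk$"): 8,
    re.compile(r"^.an.$"): 78,
}


def return_code(word):
    # No re.compile call here; each key already has a match method.
    for regex, code in compiled_patterns.items():
        if regex.match(word):
            return code
    return None
```

The re module also caches recently compiled patterns internally, so
with only a handful of patterns the difference is likely to be small
either way.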
mkPyVS