How to check if any item from a list of strings is in a big string?
John Machin
sjmachin at lexicon.net
Thu Jul 9 23:07:58 EDT 2009
On Jul 10, 12:53 pm, Nobody <nob... at nowhere.com> wrote:
> On Thu, 09 Jul 2009 18:36:05 -0700, inkhorn wrote:
> > For one of my projects, I came across the need to check if one of many
> > items from a list of strings could be found in a long string.
>
> If you need to match many strings or very long strings against the same
> list of items, the following should (theoretically) be optimal:
>
> r = re.compile('|'.join(map(re.escape,list_items)))
> ...
> result = r.search(string)
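"Theoretically optimal" holds only if the search mechanism builds a
DFA or similar out of the list of strings. AFAIK Python's re module
doesn't; it is a backtracking matcher, so at each position in the
string it tries the alternatives one after another.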
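For reference, here is the quoted suggestion as a self-contained
function. It is only a sketch: the name find_any and the guard for an
empty list are my own additions, not part of the original snippet.

```python
import re

def find_any(items, text):
    """Return the first item found in text, or None if none occurs."""
    if not items:
        # An empty alternation would compile to a pattern that matches
        # the empty string everywhere, so treat "no items" as "no match".
        return None
    # re.escape makes each item match literally, even if it contains
    # characters such as "." or "*".
    pattern = re.compile("|".join(map(re.escape, items)))
    match = pattern.search(text)
    return match.group(0) if match else None
```

If one item is a prefix of another, the one listed first in the
alternation is the one reported, which does not matter when all you
want is a yes/no answer.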
Try this:
http://hkn.eecs.berkeley.edu/~dyoo/python/ahocorasick/
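To show what an Aho-Corasick style search does, here is a minimal
pure-Python sketch. It is NOT the API of the module linked above; the
names build_automaton, find_all and contains_any are made up for
illustration, and a C implementation will be much faster.

```python
from collections import deque

def build_automaton(words):
    """Build a trie with failure links (hypothetical helper)."""
    goto = [{}]   # goto[node][char] -> child node
    fail = [0]    # fail[node] -> node for the longest proper suffix
    out = [[]]    # out[node] -> words that end at this node
    for word in words:
        if not word:
            continue  # empty strings would match everywhere; skip them
        node = 0
        for ch in word:
            nxt = goto[node].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[node][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            node = nxt
        out[node].append(word)
    # Breadth-first pass: children of the root keep fail = 0; deeper
    # nodes follow their parent's failure chain.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit words that end at the failure target.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out

def find_all(automaton, text):
    """Return (start_index, word) for every occurrence in text."""
    goto, fail, out = automaton
    node = 0
    hits = []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for word in out[node]:
            hits.append((i - len(word) + 1, word))
    return hits

def contains_any(automaton, text):
    """Return True as soon as any word is found in text."""
    goto, fail, out = automaton
    node = 0
    for ch in text:
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        if out[node]:
            return True
    return False
```

The automaton is built once from the list of items and then each
character of the big string is looked at once, with no restart from
scratch per item. For example,
find_all(build_automaton(["he", "she", "his", "hers"]), "ushers")
reports "she" at index 1 and both "he" and "hers" at index 2.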