[Python-Dev] Re: string find(substring) vs. substring in string

Scott David Daniels Scott.Daniels at Acm.Org
Wed Feb 16 23:00:54 CET 2005


Irmen de Jong wrote:
> There's the Boyer-Moore string search algorithm which is
> allegedly much faster than a simplistic scanning approach,
> and I also found this: http://portal.acm.org/citation.cfm?id=79184
> So perhaps there's room for improvement :)

The problem is setup vs. run.  If the question is 'ab in 'rabcd',
Boyer-Moore and other fancy searches will be swamped with prep time.
In Fred's comparison with re, he does the re.compile(...) outside of
the timing loop.  You need to decide what the common case is.
The longer the thing you are searching in, the more one-time-only
overhead you can afford to reduce the per-search-character cost.

--Scott David Daniels
Scott.Daniels at Acm.Org



More information about the Python-Dev mailing list