[Python-Dev] Re: Changing Python's string search algorithms

17 Oct 2020

      [Tim]
...
...
Alas, the higher preprocessing costs leave the current PR slower in "too
many" cases too, especially when the needle is short and found early
in the haystack. Then any preprocessing cost approaches a pure waste
of time.
But that was this morning.  Since then, Dennis changed the PR to back
off to the current code when the needle is "too small". There are very
few cases we know of now where the PR code is slower at all than the
current code, none where it's dramatically slower, many where it's
significantly faster, and some non-contrived cases where it's
dramatically faster (for example, over a factor of 10 in
stringbench.py's "late match, 100 characters" forward-search tests,
and especially beneficial for Unicode (as opposed to bytes)).  Then
there are the pathological cases like in the original issue report,
where it's multiple orders of magnitude faster (about 3 1/2 hours to
under a tenth of a second in the issue report's case).

Still waiting for someone who thinks string search speed is critical
in their real app to give it a try.  In the absence of that, I endorse
merging this.

[Python-Dev] Re: Changing Python's string search algorithms

Tim Peters