[issue41972] bytes.find consistently hangs in a particular scenario

Tue Oct 13 17:30:39 EDT 2020

Tim Peters <tim at python.org> added the comment:

> For a bunch of cases it's slower, for some others it's faster.

I have scant real idea what you're doing, but in the output you showed 4 output lines are labelled "slower" but 18 are labelled "faster".

What you wrote just above appears to say the reverse (I'd call 18 "a bunch" compared to 4 "some others").

Could please state plainly which of {status quo, PR} is faster on an output line labelled "faster"?

My a priori guess was that the PR had the highest chance of being slower when the needle is short and is found in the haystack early on. Then preprocessing time accounts for a relatively higher percentage of the total time taken, and the PR's preprocessing is more expensive than the status quo's.

The alphabet size here is small (just 26 possible letters, from `ascii_uppercase`), so it's quite likely that a short needle _will_ be found early on in a long haystack.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41972>
_______________________________________