[Python-ideas] Find first tuple argument for str.find and str.index

Terry Jones terry at jon.es
Wed Sep 5 18:37:34 CEST 2007


>>>>> "Terry" == Terry Jones <terry at jon.es> writes:
>>>>> "Guido" == Guido van Rossum <guido at python.org> writes:
Guido> I was surprised to find that startswith and endswith support this,
Guido> but it does make sense. Adding a patch to 2.6 would cause it to be
Guido> merged into 3.0 soon enough.

Guido> On 9/4/07, Ron Adam <rrr at ronadam.com> wrote:
>>> Could we add the ability of str.index and str.find to accept a tuple as the
>>> first argument and return the index of the first item found in it.

I should have added a few more comments.

If you're going to implement the original desired functionality and make it
run quickly, you're probably going to dream up something along the lines of
what Aho & Corasick did so beautifully.

It's tricky to get it right. As you walk the text string, several patterns
may be currently matching. But the next char you consider might cause one
or more of the current matches to fail, or a currently non-matching pattern
to begin to match. The A&C algorithm builds a trie with failure arcs, so
the matching is linear: building the trie and its failure arcs takes time
linear in the total length of the patterns, and walking the trie takes
time linear in the length of the text. It has accepting states, so you
know as soon as something matches, and can quit early.
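
To make the above concrete, here is a minimal Python sketch of a trie with
failure arcs and an early-exit search. It is only an illustration, not a
proposed patch: the names build_automaton and find_first are hypothetical,
empty patterns are rejected for simplicity, and the search stops at the
first accepting state, so it reports the match that *ends* earliest (which
is not always the one that starts earliest).

```python
from collections import deque


def build_automaton(patterns):
    """Build a trie with failure arcs (hypothetical helper, sketch only)."""
    goto = [{}]   # goto[s]: char -> next state; state 0 is the root
    fail = [0]    # fail[s]: state to fall back to on a mismatch
    out = [[]]    # out[s]: patterns that end when state s is reached
    for pat in patterns:
        if not pat:
            raise ValueError("empty patterns are not handled in this sketch")
        state = 0
        for ch in pat:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[state][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            state = nxt
        out[state].append(pat)

    # Breadth-first pass: depth-1 states fail to the root; deeper states
    # follow their parent's failure arc until the character can be consumed.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit patterns that are proper suffixes ending here.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_first(text, patterns):
    """Return (index, pattern) for the earliest-ending match, else (-1, None)."""
    goto, fail, out = build_automaton(patterns)
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state]:
            # Accepting state: quit early. Among the patterns ending at
            # position i, the longest one has the smallest start index.
            pat = max(out[state], key=len)
            return i - len(pat) + 1, pat
    return -1, None


print(find_first("hello world", ("wor", "lo")))
```

Returning the pattern alongside the index is a choice made for this sketch;
a str.find-style method would presumably return only the index.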

If this is going to be implemented, you may as well do it right the first time.

You could also return a dict in which (pattern) keys are absent if they
didn't match at all. Then it would be fast to tell which, if any, patterns
matched - no need to step through all passed patterns, just use
result.keys() to get them.
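
As a rough sketch of what such a result could look like, assuming the
values are the index of each pattern's first occurrence (the paragraph
above does not say what the values would be). The function name
find_all_first is hypothetical, and this version just calls str.find once
per pattern instead of using the A&C automaton, so it shows the shape of
the return value rather than a fast implementation.

```python
def find_all_first(text, patterns):
    """Map each pattern that occurs in text to the index of its first match.

    Patterns that do not occur are simply absent from the result.
    """
    result = {}
    for pat in patterns:
        idx = text.find(pat)  # str.find returns -1 when pat is absent
        if idx != -1:
            result[pat] = idx
    return result


result = find_all_first("hello world", ("lo", "wor", "xyz"))
print(sorted(result.keys()))
```

With this shape, an empty dict means nothing matched, and testing whether a
particular pattern matched is a plain `pat in result` check.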

Terry