[Python-ideas] FInd first tuple argument for str.find and str.index
Terry Jones
terry at jon.es
Wed Sep 5 21:55:44 CEST 2007
Hi Ron
>>>>> "Ron" == Ron Adam <rrr at ronadam.com> writes:
Ron> I was thinking of something a bit more light weight.
Ah, now we got to what you actually want to do :-)
Ron> For more complex stuff I think the 're' module already does pretty
Ron> much what you are describing. It may even already take advantage of
Ron> the algorithms you referred to. If not, that would be an important
Ron> improvement to the re module. :-)
Yes, that would make a good SoC project. But as you say it may already be
done that way.
Ron> The use case I had in mind was to find starting and ending delimiters.
Ron> And to avoid the following type of awkward code. (This would work for
Ron> finding other things as well of course.)
start = 0
while start < len(s):
i1 = s.find('{', start)
if i1 == -1:
i1 = len(s)
i2 = s.find('}', start)
if i2 == -1:
i2 = len(s)
# etc... for as many search terms as you have...
# or use a loop to locate each one.
start = min(i1, i2)
if start == len(s):
break
...
# do something with s[start]
...
Ron> That works but it has to go through the string once for each item.
It's worse than that. _Each time around the loop_ it tests all candidates
against all of the remaining text. Imagine matching L patterns that were
each 'a' * M against a text of 'a' * N. You're going to do (roughly) O(N *
L * M) comparisons, using naive string matching.
You could completely drop patterns that have already returned -1. You can
short-circuit a loop if you ever got a 0 index back. Sorry if this seems
picky - I guess you're just writing quick pseudo-code.
Ron> Of course I would use 're' for anything more complex than a few fixed
Ron> length terms.
Yes. It depends partly on what you want back. You could write a super-fast
iterator based on A&C that told you everything you need to know, with
guaranteed linear worst case behavior, but that seems like overkill here
(and it may be in re, as you say).
Ron> If the function returns something other than a simple index, then I
Ron> think it will need to be a new function or method and not just an
Ron> alteration of str.index and str.find. In that case it may also need a
Ron> PEP.
Be my guest :-)
Regards,
Terry
More information about the Python-ideas
mailing list