String search vs regexp search
Duncan Booth
duncan at NOSPAMrcp.co.uk
Tue Oct 14 04:34:34 EDT 2003
tweedgeezer at hotmail.com (Jeremy Fincher) wrote in
news:698f09f8.0310132019.4bd918b2 at posting.google.com:
> Duncan Booth <duncan at NOSPAMrcp.co.uk> wrote in message
> news:<Xns941360C9B9445duncanrcpcouk at 127.0.0.1>...
>> The regular expression code has a startup penalty since it has to
>> compile the regular expression at least once, however the actual
>> searching may be faster than the naive str.find. If the time spent
>> doing the search is sufficiently long compared with the time doing
>> the compile, the regular expression may win out.
>
> Both regular expression searching and string.find will do searching
> one character at a time; given that, it seems impossible to me that
> the hand-coded-in-C "naive" string.find could be slower than the
> machine-translated-coded-in-Python regular expression search.
> Compilation time only serves to further increase string.find's
> advantage.
>
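For what it's worth, the compile-versus-search trade-off quoted above is easy to measure rather than argue about. The following is only a rough sketch: the haystack, the needle and the helper names (find_offset, regex_offset, regex_offset_with_compile, compare) are made up for illustration, and the numbers will depend on the Python version and machine.

```python
import re
import timeit

# Hypothetical test data: a long run of filler with the needle at the very end.
NEEDLE = "needle"
HAYSTACK = "a" * 200_000 + NEEDLE

# Compiled once up front, so its cost is not counted in regex_offset().
COMPILED = re.compile(re.escape(NEEDLE))


def find_offset():
    """Plain substring search."""
    return HAYSTACK.find(NEEDLE)


def regex_offset():
    """Regex search with the pattern already compiled."""
    return COMPILED.search(HAYSTACK).start()


def regex_offset_with_compile():
    """Regex search that pays the compile cost on every call."""
    re.purge()  # clear the re module's internal pattern cache
    return re.compile(re.escape(NEEDLE)).search(HAYSTACK).start()


def compare(number=20):
    """Return total seconds for `number` calls of each variant."""
    return {
        "str.find": timeit.timeit(find_offset, number=number),
        "regex (precompiled)": timeit.timeit(regex_offset, number=number),
        "regex (with compile)": timeit.timeit(regex_offset_with_compile, number=number),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name:22s} {seconds:.4f}s")
```

All three variants return the same offset; only the timings differ, and I make no claim here about which one wins.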
I may be misremembering, but I thought there was a thread a little while
back which claimed that the regular expression library looks for a constant
string at the start of the regex and, if it finds one, uses Boyer-Moore to
do the search. Boyer-Moore does not examine every character: on a mismatch
it can skip ahead by up to the length of the pattern. If the library does
this, then a regular expression searching for a constant string ought to be
much faster than a plain string.find once the searched string is long enough
for the one-off compile cost to stop mattering.
If it doesn't, then it should.
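
To show what skipping buys you, here is a minimal sketch of the Horspool simplification of Boyer-Moore. It is not the regular expression library's code, and I am not claiming the library works this way; the function name horspool_find is made up for the example.

```python
def horspool_find(text, pattern):
    """Return the lowest index of pattern in text, or -1 (same contract as str.find)."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0

    # For every pattern character except the last, record how far the window
    # may move when that character sits under the window's final position.
    # Later occurrences overwrite earlier ones, giving the smallest safe shift.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}

    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # A character that never occurs in pattern[:-1] allows a jump of the
        # full pattern length, so most of the text is never examined.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

With a six-character pattern and text that mostly lacks the pattern's characters, the window moves six positions per step instead of one, which is where the advantage over a character-at-a-time scan would come from.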
--
Duncan Booth duncan at rcp.co.uk
int month(char *p){return(124864/((p[0]+p[1]-p[2]&0x1f)+1)%12)["\5\x8\3"
"\6\7\xb\1\x9\xa\2\0\4"];} // Who said my code was obscure?