[Python-Dev] string find(substring) vs. substring in string
A.M. Kuchling
amk at amk.ca
Wed Feb 16 21:54:31 CET 2005
On Wed, Feb 16, 2005 at 01:34:16PM -0700, Mike Brown wrote:
> time. But I would also hope that it would be smart enough to know that it
> doesn't need to look past the 2nd character in 'not the xyz' when it is
> searching for 'not there' (due to the lengths of the sequences).
Assuming stringobject.c:string_contains is the right function, the
code looks like this:
size = PyString_GET_SIZE(el);
rhs = PyString_AS_STRING(el);
lhs = PyString_AS_STRING(a);
/* optimize for a single character */
if (size == 1)
return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL;
end = lhs + (PyString_GET_SIZE(a) - size);
while (lhs <= end) {
if (memcmp(lhs++, rhs, size) == 0)
return 1;
}
So it's doing a zillion memcmp()s. I don't think there's a more
efficient way to do this with ANSI C; memmem() is a GNU extension that
searches for blocks of memory. Perhaps saving some memcmps by writing
if ((*lhs == *rhs) && memcmp(lhs++, rhs, size) == 0)
would help.
--amk
More information about the Python-Dev
mailing list