counting number of (overlapping) occurances
Ben Cartwright
bencvt at gmail.com
Fri Mar 10 01:25:06 EST 2006
John wrote:
> This works but is a bit slow, I guess I'll have to live with it.
> Any chance this could be sped up in python?
Sure, to a point. Instead of:
def countoverlap(s1, s2):
return len([1 for i in xrange(len(s1)) if s1[i:].startswith(s2)])
Try this version, which takes smaller slices (resulting in 2x-5x speed
increase when dealing with a large s1 and a small s2):
def countoverlap(s1, s2):
L = len(s2)
return len([1 for i in xrange(len(s1)-L+1) if s1[i:i+L] == s2])
And for a minor extra boost, this version eliminates the list
comprehension:
def countoverlap(s1, s2):
L = len(s2)
cnt = 0
for i in xrange(len(s1)-L+1):
if s1[i:i+L] == s2:
cnt += 1
return cnt
Finally, if the execution speed of this function is vital to your
application, create a C extension. String functions like this one are
generally excellent candidates for extensionizing.
--Ben
More information about the Python-list
mailing list