Is there a command that returns the number of substrings in a string?
Gerard Flanagan
grflanagan at gmail.com
Mon Oct 26 06:46:28 EDT 2009
Peng Yu wrote:
> For example, the long string is 'abcabc' and the given string is
> 'abc', then 'abc' appears 2 times in 'abcabc'. Currently, I am calling
> 'find()' multiple times to figure out how many times a given string
> appears in a long string. I'm wondering if there is a function in
> python which can directly return this information.
re.findall?
>>> patt = re.compile('abc')
>>> len(patt.findall('abcabc'))
2
For groups of non-overlapping substrings, tested only as far as you see:
8<----------------------------------------------------------------------
import re
from collections import defaultdict
def count(text, *args):
"""
>>> ret = count('abcabc', 'abc')
>>> ret['abc']
2
>>> ret = count('xabcxabcx', 'abc', 'x')
>>> ret['abc']
2
>>> ret['x']
3
>>> ret = count('abcabc', 'abc', 'cab')
>>> ret['abc']
2
>>> ret['cab']
0
>>> ret = count('abcabc', 'abc', 'ab')
>>> ret['abc']
2
>>> ret['ab']
0
"""
args = map(re.escape, args)
args.sort()
args.reverse()
pattern = re.compile('|'.join(args))
result = defaultdict(int)
def callback(match):
matched = match.group(0)
result[matched] += 1
return matched
pattern.sub(callback, text)
return result
if __name__ == '__main__':
import doctest
doctest.testmod()
8<----------------------------------------------------------------------
More information about the Python-list
mailing list