[Tutor] Python re without string consumption
Kent Johnson
kent37 at tds.net
Thu Jan 25 12:13:48 CET 2007
Jacob Abraham wrote:
> Hi Danny Yoo,
>
> I would like to thank you for the solution and
> the helper function that I have written is as follows. But I do hope
> that future versions of Python include a regular expression syntax to
> handle such cases simply because this method seems very process and
> memory intensive. I also notice that fall_back_len is a very crude
> solution.
>
> def searchall(expr, text, fall_back_len=0):
>     while True:
>         match = re.search(expr, text)
>         if not match:
>             break
>         yield match
>         end = match.end()
>         text = text[end-fall_back_len:]
>
> for match in searchall("abca", "abcabcabca", 1):
>     print match.group()

The string slicing is not needed. The search() method of a compiled re
has an optional pos parameter that tells it where to start the search. You
can start the next search at the position just after the *start* of a
successful match, so fall_back_len is not needed. How about this:
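
import re

def searchall(expr, text):
    searchRe = re.compile(expr)
    match = searchRe.search(text)
    while match:
        yield match
        # Resume just past the start (not the end) of the last match,
        # so overlapping matches are found too.
        match = searchRe.search(text, match.start() + 1)
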
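To make the difference concrete, here is a small self-contained sketch that runs the pos-based generator on the example string from the original post and compares it with plain re.finditer(), which skips overlapping matches. The last line uses a lookahead pattern, "(?=abca)", which is an alternative not taken from the message above; it matches without consuming any text, so it is included here only as an assumption about what "regular expression syntax to handle such cases" could look like.

```python
import re

def searchall(expr, text):
    """Yield every match of expr in text, including overlapping ones."""
    search_re = re.compile(expr)
    match = search_re.search(text)
    while match:
        yield match
        # Restart one character past the start of the previous match.
        match = search_re.search(text, match.start() + 1)

text = "abcabcabca"

# Overlapping matches found by restarting at match.start() + 1.
overlapping = [m.start() for m in searchall("abca", text)]
print(overlapping)  # [0, 3, 6]

# finditer() resumes at the end of each match, so the middle one is missed.
non_overlapping = [m.start() for m in re.finditer("abca", text)]
print(non_overlapping)  # [0, 6]

# Alternative (not from the original message): a zero-width lookahead
# matches at each position where "abca" starts, consuming nothing.
lookahead = [m.start() for m in re.finditer("(?=abca)", text)]
print(lookahead)  # [0, 3, 6]
```

One thing to keep in mind with pos: it is not quite the same as slicing the string first, because "^" still anchors to the real beginning of the string rather than to the position where the search starts.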
Also, if you are just finding plain text, you don't need to use regular
expressions at all; you can use str.find():

def searchall(expr, text):
    pos = text.find(expr)
    while pos != -1:
        yield pos
        pos = text.find(expr, pos+1)

(inspired by this recipe:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/499314)
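
As a quick check of the str.find() version, here is a minimal sketch run against the same example string. The parameter name sub is my own choice for illustration (the version above calls it expr), and note that this generator yields start positions rather than match objects.

```python
def searchall(sub, text):
    """Yield the start index of every occurrence of sub in text."""
    pos = text.find(sub)
    while pos != -1:
        yield pos
        # Continue one character past the last hit so overlaps are kept.
        pos = text.find(sub, pos + 1)

positions = list(searchall("abca", "abcabcabca"))
print(positions)  # [0, 3, 6]

# A substring that never occurs simply yields nothing.
missing = list(searchall("xyz", "abcabcabca"))
print(missing)  # []
```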
Kent