How to get the "longest possible" match with Python's RE module?
Bryan Olson
fakeaddress at nowhere.org
Tue Sep 12 21:29:02 EDT 2006
Licheng Fang wrote:
> Basically, the problem is this:
>
> >>> p = re.compile("do|dolittle")
> >>> p.match("dolittle").group()
> 'do'
>
> Python's NFA regexp engine tries only the first option, and happily
> rests on that. There's another example:
>
> >>> p = re.compile("one(self)?(selfsufficient)?")
> >>> p.match("oneselfsufficient").group()
> 'oneself'
>
> The Python regular expression engine doesn't exhaust all the
> possibilities, but in my application I hope to get the longest possible
> match, starting from a given point.
>
> Is there a way to do this in Python?
Yes. Here's a way, but it sucks real bad:
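import re

def longest_match(re_string, text):
    # Anchor the pattern at the end so a match must consume all of text.
    regexp = re.compile('(?:' + re_string + ')$')
    # Drop one trailing character at a time until some prefix matches.
    while text:
        m = regexp.match(text)
        if m:
            return m
        text = text[:-1]
    return None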
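
For comparison, here is a rough sketch of the same brute-force idea that avoids slicing the text, assuming Python 3.4 or later, where compiled patterns have a fullmatch() method taking pos and endpos. The name longest_prefix_match is made up for this illustration and is not part of the re module. It returns the matched string rather than a match object, and it also tries the empty prefix, so a pattern that can match nothing yields '' rather than None.

```python
import re

def longest_prefix_match(pattern, text):
    """Return the longest prefix of text that pattern matches in full.

    Hypothetical helper for illustration; returns None if no prefix
    (not even the empty one) matches.
    """
    regexp = re.compile(pattern)
    # Try the longest candidate prefix first, down to the empty string.
    for end in range(len(text), -1, -1):
        # fullmatch with endpos=end requires the match to cover text[0:end],
        # which forces the engine to backtrack through every alternative.
        if regexp.fullmatch(text, 0, end):
            return text[:end]
    return None

print(longest_prefix_match("do|dolittle", "dolittle"))
print(longest_prefix_match("one(self)?(selfsufficient)?", "oneselfsufficient"))
```

Like the version above, this still runs the regexp once per candidate length, so it is no faster in the worst case.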
--
--Bryan