Groups in regular expressions don't repeat as expected
John Nagle
nagle at animats.com
Wed Apr 20 15:20:06 EDT 2011
Here's something that surprised me about Python regular expressions.
>>> import re
>>> krex = re.compile(r"^([a-z])+$")
>>> s = "abcdef"
>>> ms = krex.match(s)
>>> ms.groups()
('f',)
The parentheses indicate a capturing group within the
regular expression, and the "+" indicates that the
group can appear one or more times. The pattern does
match the whole string that way. But instead of returning
a captured substring for each repetition of the group,
groups() returns only the last one.
The documentation in fact says that, at
http://docs.python.org/library/re.html
"If a group is contained in a part of the pattern that matched multiple
times, the last match is returned."
That's kind of lame, though. I'd expect that there would be some way
to retrieve all matches.
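
One possible workaround with the standard re module, sketched here under
the assumption that the repeated group's body can be compiled as a pattern
of its own: validate the whole string with the original expression, then
use findall() with the inner sub-pattern to collect each repetition. The
names "item" and "all_repeats" are made up for this illustration.

```python
import re

# Original pattern: validates the whole string, but its single group
# only remembers the last repetition.
krex = re.compile(r"^([a-z])+$")

# Hypothetical helper pattern: the body of the repeated group by itself.
item = re.compile(r"[a-z]")

def all_repeats(s):
    """Return every substring the repeated group matched, or None."""
    if krex.match(s) is None:
        return None          # string as a whole does not match
    return item.findall(s)   # one entry per repetition

print(all_repeats("abcdef"))  # ['a', 'b', 'c', 'd', 'e', 'f']
print(all_repeats("abc1"))    # None
```

This costs a second pass over the string, and it only works cleanly when
each repetition can be recognized without the surrounding context. I
believe the third-party "regex" module also offers a captures() method on
match objects that returns every repetition directly, but that is outside
the standard library.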
John Nagle