Groups in regular expressions don't repeat as expected

John Nagle nagle at animats.com
Wed Apr 20 21:20:06 CEST 2011


Here's something that surprised me about Python regular expressions.

 >>> import re
 >>> krex = re.compile(r"^([a-z])+$")
 >>> s = "abcdef"
 >>> ms = krex.match(s)
 >>> ms.groups()
 ('f',)

The parentheses mark a capturing group within the
regular expression, and the "+" says that the
group can repeat one or more times.  The regular
expression does match that way: all six characters are
consumed.  But instead of returning a captured value for
each repetition of the group, groups() returns only the
last one.

The documentation does in fact say so, at

http://docs.python.org/library/re.html

"If a group is contained in a part of the pattern that matched multiple
times, the last match is returned."

That's kind of lame, though. I'd expect that there would be some way
to retrieve all matches.
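
For what it's worth, here is a minimal sketch of one workaround that
stays within the standard re module. It assumes you are willing to
capture the whole repeated run as a single group and then re-scan that
run with the inner pattern. The helper name repeated_captures is made up
for illustration and is not part of re.

```python
import re

# Hypothetical helper for illustration; not part of the re module.
def repeated_captures(text):
    """Return each character matched by the repeated [a-z] group, or None."""
    # Capture the whole repeated run once, using a non-capturing group
    # for the repetition so nothing gets overwritten per iteration.
    whole = re.match(r"^((?:[a-z])+)$", text)
    if whole is None:
        return None
    # Re-scan the captured run with the inner pattern to recover
    # one entry per repetition.
    return re.findall(r"[a-z]", whole.group(1))

print(repeated_captures("abcdef"))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

This is two passes rather than one, so it only helps when the inner
pattern can be re-applied to the captured text on its own.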

					John Nagle


