Regular Expression Grouping
Michael J. Fromberger
Michael.J.Fromberger at Clothing.Dartmouth.EDU
Sun Aug 12 13:48:21 EDT 2007
In article <1186939262.144073.182450 at z24g2000prh.googlegroups.com>,
linnewbie at gmail.com wrote:
> Fairly new to this regex thing, so this might be very juvenile but
> important.
>
> I cannot understand why 'c' constitutes a group here without being
> surrounded by "(" ,")" ?
>
> >>> import re
> >>> m = re.match("([abc])+", "abc")
> >>> m.groups()
> ('c',)
>
> Grateful for any clarity.
Hello!
I believe your confusion arises from the placement of the "+" operator
in your expression. You wrote:
'([abc])+'
This means, in plain language, "one or more repetitions of a group, where
each repetition matches exactly one character from the set {a, b, c}."
Contrast this with what you probably intended, to wit:
'([abc]+)'
The latter means, in plain language, "a single group containing a string
of one or more characters from the set {a, b, c}."
In the former case, the "+" quantifier is greedy: it repeats the group as
many times as it can -- thus, you match the group three times, once for
each character of "abc". A group that matches more than once keeps only
its last match, so the result shows you just the final 'c'.
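To see this for yourself, here is a small sketch that inspects the match object from your original pattern. The variable names are only illustrative; the calls are the standard group() and span() methods of a match object.

```python
import re

# Group 1 is captured again on every repetition, so only the last one survives.
repeated = re.match(r"([abc])+", "abc")

whole_match = repeated.group(0)   # text matched by the entire pattern
last_capture = repeated.group(1)  # text from the final repetition of group 1
group_span = repeated.span(1)     # position of that final repetition in the input

print(whole_match, last_capture, group_span)
```

The whole match still covers all of "abc"; it is only group 1 that holds on to the last character it matched.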
Compare this with the following:
] import re
] m = re.match('([abc]+)', 'abc')
] m.groups()
=> ('abc',)
I suspect the latter is what you are after.
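If instead you wanted each character as its own result, one possible approach (an assumption on my part about what you might need, not something your post asked for) is re.findall with the bare character class. A rough sketch, with illustrative variable names:

```python
import re

text = "abc"

# One group around the whole run: a single capture holding every character.
single = re.match(r"([abc]+)", text)
run = single.group(1)

# findall returns every non-overlapping match of the character class.
pieces = re.findall(r"[abc]", text)

print(run, pieces)
```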
Cheers,
-M
--
Michael J. Fromberger | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA