[Tutor] regex newbie question

Michael Janssen mi.janssen at gmail.com
Fri May 9 13:54:52 CEST 2008


On Fri, May 9, 2008 at 3:16 AM, Dick Moores <rdm at rcblue.com> wrote:
>
> At 04:32 PM 5/8/2008, Steve Willoughby wrote:
>>
>> That would be r'^\d\d(\d\d)*$'
>
> I bought RegexBuddy (<http://www.regexbuddy.com/>) today, which is a big
> help. However, it has a comment about your regex

The comment on http://www.rcblue.com/Regex/all_even_number_of_digits.htm
tells us that the group only "captures" the last repetition. That's
fine for the problem given, since we're not interested in whatever the
group captures. The group is used solely to enforce repetitions of
two digits together.
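
To see what "only the last repetition" means in practice, here is a
small sketch with a longer input. The sample string '12345678' is my
own choice for illustration, not something from the thread:

```python
import re

# Steve's pattern: two digits, then any number of further two-digit pairs.
pattern = re.compile(r'^\d\d(\d\d)*$')

mt = pattern.search('12345678')
full_match = mt.group()    # the whole matched string
last_pair = mt.group(1)    # only the final repetition of the group
all_groups = mt.groups()   # a 1-tuple; earlier repetitions are not kept

print(full_match, last_pair, all_groups)
```

The group matched '34', '56' and '78' in turn, but only the last of
these is still available afterwards.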

Since we're not interested in the group's submatch, we can use a
non-capturing group, (?:...), and skip the capture altogether:

>>> import re
>>> mt = re.search(r'^\d\d(\d\d)*$', '1234') # capture (\d\d)
>>> mt.group() # the full match
'1234'
>>> mt.group(1) # here is the captured submatch of group one
'34'
>>> # non-capturing version: enforce pairs of digits, but capture nothing
>>> mt = re.search(r'^\d\d(?:\d\d)*$', '1234')
>>> mt.group()
'1234'
>>> mt.group(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: no such group

The benefit of such a non-capturing group is that it makes clear to the
reader that you don't intend to use the submatch later on, and it keeps
mt.group(*) clean. When I first saw the capturing version, I actually
asked myself: hey, what are you going to do with the group's match...?
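
If the check is needed in more than one place, it could be wrapped up
roughly like this. This is only a sketch: the names EVEN_DIGITS and
has_even_digit_count are made up for the example, and the sample
strings are mine:

```python
import re

# Non-capturing group: enforces pairs of digits without storing a submatch.
EVEN_DIGITS = re.compile(r'^\d\d(?:\d\d)*$')

def has_even_digit_count(text):
    """Return True if text matches the even-number-of-digits pattern."""
    return EVEN_DIGITS.search(text) is not None

# The compiled pattern reports zero capturing groups.
print(EVEN_DIGITS.groups)

for sample in ['12', '1234', '123', '', '12a4']:
    print(repr(sample), has_even_digit_count(sample))
```

Because the pattern has no capturing groups at all, nobody reading the
function has to wonder whether a submatch is used somewhere else.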

regards
Michael

