[Python-ideas] Proposed convenience functions for re module
Jan Kaliszewski
zuo at chopin.edu.pl
Thu Jul 23 00:03:35 CEST 2009
22-07-2009, 02:00 Steven D'Aprano <steve at pearwood.info>:
> Following the thread "Experiment: Adding "re" to string objects.", I
> would like to propose the addition of two convenience functions to the
> re module:
>
>
> def multimatch(s, *patterns):
>     """Do a re.match on s using each pattern in patterns,
>     returning the first one to succeed, or None if they all fail."""
>     for pattern in patterns:
>         m = re.match(pattern, s)
>         if m: return m
>
> def multisearch(s, *patterns):
>     """Do a re.search on s using each pattern in patterns,
>     returning the first one to succeed, or None if they all fail."""
>     for pattern in patterns:
>         m = re.search(pattern, s)
>         if m: return m
>
>
> The rationale is to make the following idiom easier:
>
>
> m = re.match(pattern1, s)
> if not m:
>     m = re.match(pattern2, s)
>     if not m:
>         m = re.match(pattern3, s)
>         if not m:
>             m = re.match(pattern4, s)
> if m:
>     m.group()
>
>
> which will become:
>
> m = re.multimatch(s, pattern1, pattern2, pattern3, pattern4)
> if m:
>     m.group()
>
>
> Is there any support or objections to this proposal? Any comments?
It sounds nice. But why not simply use:
m = re.match('|'.join((pattern1, pattern2, pattern3, pattern4)), s)
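One caveat with the joined form, as a rough sketch: it seems safer to wrap each alternative in a non-capturing group, so that a `|` inside one pattern stays local to it. Also, with re.search the joined pattern returns the leftmost match, which is not necessarily the match of the first pattern listed. The helper name `joined` below is made up for illustration and is not part of the proposal.

```python
import re

def joined(*patterns):
    """Combine patterns into one alternation (hypothetical helper).

    Each pattern is wrapped in (?:...) so that its own '|' stays
    local to it.
    """
    return '|'.join('(?:%s)' % p for p in patterns)

s = 'spam and eggs'

# With match, the alternatives are tried in order at position 0,
# so the first listed pattern that matches there wins.
first = re.match(joined('eggs', 'spam'), s).group()

# With search, the leftmost match wins, whichever pattern produced it.
leftmost = re.search(joined('eggs', 'spam'), s).group()

# Trying the patterns one after another prefers the pattern order instead.
sequential = None
for p in ('eggs', 'spam'):
    sequential = re.search(p, s)
    if sequential:
        break
```

Here `leftmost` is 'spam' while `sequential.group()` is 'eggs', so the joined form and a multisearch-style loop can disagree. Capturing groups inside the individual patterns are also renumbered once the patterns are joined.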
And if we want the feature anyway, I'd prefer MRAB's:
> m = re.match((pattern1, pattern2, pattern3, pattern4), s)
> if m:
>     print m.group()
>
> This format is already used by some string methods, eg str.startswith().
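For illustration, the tuple form could be sketched as a thin wrapper. This assumes the same "first pattern to succeed" semantics as multimatch; the name `match_any` is hypothetical, and re.match itself does not accept a tuple of patterns.

```python
import re

def match_any(patterns, string, flags=0):
    """Like re.match, but `patterns` may be one pattern or a tuple of them.

    Hypothetical wrapper, mirroring how str.startswith() accepts a tuple.
    Returns the first successful match object, or None.
    """
    if not isinstance(patterns, tuple):
        patterns = (patterns,)
    for pattern in patterns:
        m = re.match(pattern, string, flags)
        if m:
            return m
    return None

m = match_any((r'\d+', r'[a-z]+'), 'abc123')
if m:
    word = m.group()
```

A single pattern still works unchanged, so existing call sites would not need to know about the tuple form.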
***
But if we are talking about convenience functions in the re module, IMHO
it would be very nice to have functions like this:
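import collections
import re

def matchgrouping(pattern, string, flags=0, default=None):
    """Do a re.match on string using pattern,
    returning a dict of the groups, which can be
    looked up by index or by name."""
    # Missing keys (including every key when nothing matched) give default.
    groups = collections.defaultdict(lambda: default)
    match = re.match(pattern, string, flags)
    if match:
        # Numbered groups start at 1; 0 is the whole match.
        groups[0] = match.group()
        groups.update(enumerate(match.groups(default), 1))
        groups.update(match.groupdict(default))
    return groups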
(Plus the analogous function for searching.)
(Plus two analogous methods of RegexObject instances.)
* Then e.g. -- instead of:
m = re.match(pattern, s)
if m:
    first_group = m.group(1)
    surname = m.group('surname')
else:
    first_group = None
    surname = None
-- we could write simply:
m = re.matchgrouping(pattern, s)
first_group = m[1]
surname = m['surname']
* And e.g. -- instead of:
withip = log_re.match(logline)
if withip and withip.group('ip_addr'):
    iplog.append(logline)
-- we could write simply:
if log_re.matchgrouping(logline)['ip_addr']:
    iplog.append(logline)
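To make the idea concrete, here is a minimal sketch in which the match and search variants share one helper. The names `grouping` and `searchgrouping` and the sample pattern are only assumptions for illustration.

```python
import collections
import re

def grouping(match, default=None):
    """Turn a match object (or None) into a dict of its groups.

    Hypothetical helper: keys are group numbers (0 is the whole match)
    and group names; any missing key yields `default`.
    """
    groups = collections.defaultdict(lambda: default)
    if match is not None:
        groups[0] = match.group()
        groups.update(enumerate(match.groups(default), 1))
        groups.update(match.groupdict(default))
    return groups

def matchgrouping(pattern, string, flags=0, default=None):
    return grouping(re.match(pattern, string, flags), default)

def searchgrouping(pattern, string, flags=0, default=None):
    return grouping(re.search(pattern, string, flags), default)

name_re = r'(\w+) (?P<surname>\w+)'

hit = matchgrouping(name_re, 'Jan Kaliszewski')
first_group = hit[1]
surname = hit['surname']

# No match: every lookup falls back to the default instead of raising.
miss = matchgrouping(name_re, '---')
no_surname = miss['surname']
```

One design point to be aware of: looking up a missing key in a defaultdict also inserts it, so the returned dict grows as it is queried. A plain dict subclass with `__missing__` would avoid that, at the cost of a few more lines.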
What do you think about it?
*j
--
Jan Kaliszewski (zuo) <zuo at chopin.edu.pl>