[Python-Dev] hierarchical named groups extension to the re library

ottrey at py.redsoft.be ottrey at py.redsoft.be
Sat Apr 2 09:22:41 CEST 2005


I've written an extension to the re library, to provide a more
complete matching of hierarchical named groups in regular expressions.

I've set up a sourceforge project for it:

  http://pyre2.sourceforge.net/

re2 extracts a hierarchy of named group matches from a string,
rather than the flat, incomplete dictionary that the
standard re module returns.

(i.e. the re library returns only the ~last~ match for each named group,
not a list of ~all~ the matches for that group.  The hierarchy of those
named groups is also non-existent in the flat dictionary of matches
that results.)

e.g.

>>> import re
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex=r'^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat1=re.compile(regex)
>>> m=pat1.match(buf)
>>> m.groupdict()
{'verse': '10 lords a-leaping', 'number': '10',
'activity': 'lords a-leaping'}

>>> import re2
>>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
>>> regex=r'^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
>>> pat2=re2.compile(regex)
>>> x=pat2.extract(buf)
>>> x
{'verse': [{'number': '12', 'activity': 'drummers drumming'},
{'number': '11', 'activity': 'pipers piping'},
{'number': '10', 'activity': 'lords a-leaping'}]}



(See http://pyre2.sourceforge.net/ for more details.)
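
For comparison, here is a rough sketch of how a similar nested result
could be approximated for this particular input using only the standard
re module.  It is not how re2 works internally; the helper name
extract_verses and the single-verse pattern are made up for illustration,
and the approach assumes the repeated group can be matched on its own
with finditer().

```python
import re

buf = '12 drummers drumming, 11 pipers piping, 10 lords a-leaping'

# Pattern for a single verse.  It is applied repeatedly with finditer()
# rather than placed under a '*' quantifier, so no match is overwritten.
verse_pat = re.compile(r'(?P<number>\d+) (?P<activity>[^,]+)')

def extract_verses(text):
    """Return {'verse': [...]} with one dict per verse found in text."""
    return {'verse': [m.groupdict() for m in verse_pat.finditer(text)]}

result = extract_verses(buf)
print(result)
```

Unlike the anchored regex above, this sketch does not check that the
whole string is a well-formed list of verses, and it has to be written
by hand for each level of nesting.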


I am wondering what would be the best direction to take this project in.

Firstly, is it (or can it be made) useful enough to be included in the
Python stdlib?  (i.e. should I bother writing a PEP for it?)

And if so, would it be best to merge its functionality in with the re
library, or to leave it as a separate module?

Also, are there any suggestions for, or criticisms of, the library itself?

