[Python-Dev] Re: hierarchical named groups extension to the re library

Sat Apr 2 23:01:40 CEST 2005

ottrey at py.redsoft.be wrote:
> >>> import re2
> >>> buf='12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
> >>> regex='^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'
> >>> pat2=re2.compile(regex)
> >>> x=pat2.extract(buf)
> >>> x
> 
> {'verse': [{'number': '12', 'activity': 'drummers drumming'},
>            {'number': '11', 'activity': 'pipers piping'},
>            {'number': '10', 'activity': 'lords a-leaping'}]}

Is a dictionary the right container, or should another class be used?
In the example, the text matched by the "verse" group itself is lost;
only its sub-groups survive.  Something like a hierarchical MatchObject
could provide access to both pieces of information: the sub-groups and
the group itself.  Also, should it be limited to named groups?
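
To make the idea concrete, here is a rough sketch of what I mean by a
hierarchical container.  It uses only the stdlib re module, not re2, and
the names GroupNode and extract_verses are made up for illustration;
they are not part of any existing library.  Each node keeps the text of
the group itself next to its named sub-groups:

```python
import re
from dataclasses import dataclass, field


@dataclass
class GroupNode:
    """Hypothetical container: a group's own text plus its named sub-groups."""
    text: str
    children: dict = field(default_factory=dict)


# One verse: a number, a space, then everything up to the next comma.
VERSE = re.compile(r'(?P<number>\d+) (?P<activity>[^,]+)')


def extract_verses(buf):
    """Return one GroupNode per verse found in buf (illustrative helper)."""
    return [
        GroupNode(m.group(0),
                  {name: GroupNode(value)
                   for name, value in m.groupdict().items()})
        for m in VERSE.finditer(buf)
    ]


buf = '12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
verses = extract_verses(buf)
```

With this, verses[0].text gives the whole verse and
verses[0].children['number'].text gives its sub-group, so neither is
lost.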

> I am wondering what would be the best direction to take this project in.
> 
> Firstly is it, (or can it be made) useful enough to be included in the
> python stdlib?  (ie. Should I bother writing a PEP for it.)
> 
> And if so, would it be best to merge its functionality in with the re
> library, or to leave it as a separate module?
> 
> And, also are there any suggestions/criticisms on the library itself?

I find the feature very interesting, but being used to living without
it, I have difficulty evaluating its usefulness.  However, it reminds me
how strange I found it at first that only the last match of a repeated
group was kept, so I think, FWIW, that from a purist point of view the
functionality would make sense in the stdlib in one way or another.
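
For reference, this is the behaviour I am talking about, shown with the
plain re module on the same buffer and regex as above (written as a raw
string here).  The variable names are only for this example:

```python
import re

buf = '12 drummers drumming, 11 pipers piping, 10 lords a-leaping'
regex = r'^((?P<verse>(?P<number>\d+) (?P<activity>[^,]+))(, )?)*$'

m = re.match(regex, buf)

# The outer group repeats three times, but each named group only
# remembers what it captured on the final repetition.
last = m.groupdict()
# last == {'verse': '10 lords a-leaping',
#          'number': '10',
#          'activity': 'lords a-leaping'}
```

The first two verses take part in the match but cannot be read back
from the match object afterwards.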

Regards,
Nicolas