[Python-Dev] pre-PEP: Complete, Structured Regular Expression Group Matching

Edward Loper edloper at gradient.cis.upenn.edu
Sat Aug 7 03:01:47 CEST 2004


Michael Hudson wrote:
>> Generally, it strikes me as mildly useful.  An implementation would
>> surely help  :-) 

Mike wrote:
> If yours is the most positive review, there probably won't be one.   :-)

I wouldn't lose heart over the lack of response -- the current flood of 
decorator messages tends to drown out everything else.

I like your idea in the abstract, but I have trouble following many of 
the examples in your pre-PEP (and as a result, I'm not sure that I 
*really* understand what you're proposing).  E.g., you have:

|  >>> m1 = re.match(r'("([A-Z]|[a-z])*"\s*)*', '"Xx" "yy" "ZzZ"')
|  >>> m1.group(2)
|  [['X', 'x'], ['yy'], ['ZzZ']]

But it seems to me that the output should instead be:

|  [['X', 'x'], ['y', 'y'], ['Z', 'z', 'Z']]

In particular, group two is "([A-Z]|[a-z])", which matches a single 
character; so it seems like the value of m1.group(2) should be a tree 
with leaves that are single characters.  (Looking at it from another 
angle, I can't see any difference between "Xx" and "ZzZ" that would 
cause one to be decomposed and the other to be left as a string.)
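
To make that expectation concrete, here is a minimal sketch (not the 
pre-PEP's proposed API) of what the existing re module returns for this 
pattern, plus one way to build the tree I'd expect using only existing 
calls.  The names outer, inner, and tree are purely illustrative, and the 
outer pattern is my own rewrite with a non-capturing group so that it can 
be applied to one quoted string at a time:

```python
import re

text = '"Xx" "yy" "ZzZ"'

# Today's re module keeps only the LAST match of a repeated group.
m1 = re.match(r'("([A-Z]|[a-z])*"\s*)*', text)
last = m1.group(2)          # a single character, not a tree

# Assumed workaround: match each quoted string separately, then split
# its contents with the same single-character alternation as group 2.
outer = re.compile(r'"((?:[A-Z]|[a-z])*)"\s*')
inner = re.compile(r'[A-Z]|[a-z]')

tree = [inner.findall(m.group(1)) for m in outer.finditer(text)]
print(last)
print(tree)
```

Since the inner pattern only ever matches one character, every leaf this 
sketch produces is a single character, which is the shape I'd expect 
m1.group(2) to have under the proposal.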

Similar comments apply to many of your other examples.  So either:
   - Your examples are incorrect, or
   - My understanding of your proposed algorithm is incorrect

If it's the former, then please fix the examples.  If it's the latter, 
then perhaps we can talk about your algorithm some more off-list, so I 
can try to see where my misunderstanding is coming from.

-Edward


