[Python-bugs-list] [ python-Bugs-476912 ] regex annoyance

noreply@sourceforge.net
Wed, 31 Oct 2001 18:07:32 -0800


Bugs item #476912, was opened at 2001-10-31 12:17
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=476912&group_id=5470

>Category: Regular Expressions
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Bill Bumgarner (bbum)
>Assigned to: Fredrik Lundh (effbot)
Summary: regex annoyance

Initial Comment:
(This may be a feature request, but it is annoying 
enough that I filed it as a bug.)

Python's named subexpressions within regular 
expressions are an incredibly valuable feature; 
combined with the ability to automatically collapse 
multiline regexes with comments, they lead to very 
readable regexes.

However, there is an annoyance in named 
subexpressions that has bitten me several times.

Namely, suppose a particular token must be parsed 
out of the input by one of two (or more) alternative 
expressions, and the pattern cannot be written 
without naming the same subexpression in more than 
one alternative. The named subexpression will then 
be non-None only intermittently (depending on the 
order of the alternatives and which one matched).

That is, given:

(?:(?P<Tok1>[a-z]+)\s(?P<Tok2>[a-z]+))|(?:(?P<Tok1>[a-z]+)\t(?P<Tok2>[a-z]+))

In this case, Tok1 and Tok2 will be None if the first 
expression matches... 

(Yes, this is a contrived example that could be 
refactored to not use multiple <Tok1>/<Tok2> 
references-- however, more complex expressions 
do not always enable easy refactoring.)
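
One possible workaround, sketched here under assumptions that are 
not part of the original report: give each alternative its own 
group names and pick whichever one participated in the match. The 
names Tok1a/Tok1b/Tok2a/Tok2b and the helper first_group() are 
hypothetical, and the first alternative uses a literal space 
instead of \s so that the tab-separated branch is actually 
reachable.

```python
import re

# Hypothetical variant of the pattern above: every alternative gets
# its own, distinct group names (Tok1a/Tok1b, Tok2a/Tok2b).
PATTERN = re.compile(
    r"(?:(?P<Tok1a>[a-z]+) (?P<Tok2a>[a-z]+))"
    r"|(?:(?P<Tok1b>[a-z]+)\t(?P<Tok2b>[a-z]+))"
)


def first_group(match, *names):
    """Return the first of the named groups that took part in the match."""
    for name in names:
        value = match.group(name)
        if value is not None:
            return value
    return None


# Illustrative input; the second (tab-separated) alternative matches here.
m = PATTERN.match("foo\tbar")
tok1 = first_group(m, "Tok1a", "Tok1b")
tok2 = first_group(m, "Tok2a", "Tok2b")
```

The caller no longer needs to know which alternative matched, at 
the cost of one extra name per alternative.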

----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2001-10-31 18:07

Message:
Logged In: YES 
user_id=31435

Since symbolic names are names *of* integer group numbers, 
the regexp compiler should really raise an exception when 
seeing a given symbolic name defined more than once in a 
regexp.
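
For illustration only, here is a small sketch of both points: a 
compiled pattern's groupindex maps each symbolic name to an integer 
group number, and current CPython releases, as far as I know, 
reject a pattern that defines the same name twice by raising 
re.error. The helper compiles() and the sample patterns are 
assumptions made for this example; the behaviour of the interpreter 
at the time of this report may have differed.

```python
import re

# Each symbolic name is an alias for one integer group number.
named = re.compile(r"(?P<first>[a-z]+) (?P<second>[a-z]+)")
name_to_number = dict(named.groupindex)

# A pattern that defines Tok1 and Tok2 twice, once per alternative.
DUPLICATE = (
    r"(?:(?P<Tok1>[a-z]+) (?P<Tok2>[a-z]+))"
    r"|(?:(?P<Tok1>[a-z]+)\t(?P<Tok2>[a-z]+))"
)


def compiles(pattern):
    """Return True if the pattern compiles, False if re.error is raised."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True


duplicate_ok = compiles(DUPLICATE)
```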

----------------------------------------------------------------------
