[Python-bugs-list] [ python-Bugs-456612 ] SRE fix 133283 (minimizing repeats) bug

noreply@sourceforge.net noreply@sourceforge.net
Mon, 10 Dec 2001 00:04:42 -0800


Bugs item #456612, was opened at 2001-08-29 11:03
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=456612&group_id=5470

Category: Regular Expressions
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Greg Chapman (glchapman)
Assigned to: Fredrik Lundh (effbot)
Summary: SRE fix 133283 (minimizing repeats) bug

Initial Comment:
The following is from Python 2.2a2:

>>> pat = sre.compile(r'"(?:\")*?"')
>>> m = pat.search(r'"a"')
>>> m.group()
'"a"'

Clearly, that string should not have matched the 
pattern (since the pattern should not allow the 'a').  
I believe the problem is from the change marked as 
#133283 in _sre.c.  The code in the first ("unbounded 
repeat") branch of the if statement searches forward 
through the string until it matches the tail ('"'), 
but then the code neglects to check if the repeated 
pattern actually matches the intervening characters.

FWIW, it seems to me that the behavior for non-
unbounded repeats (which I believe is identical to the 
pcre behavior) makes more sense.  That is, it seems to 
me that a minimizing repeat should minimize the number 
of times the repeated pattern is applied, not the 
number of characters consumed by those applications.  
(However, I see that Perl (v. 5.5) minimizes 
characters consumed, so I suppose that's the reason 
for the change).


----------------------------------------------------------------------

>Comment By: Fredrik Lundh (effbot)
Date: 2001-12-10 00:04

Message:
Logged In: YES 
user_id=38376

fixed in 2.2rc1 (patch #133283 was broken)

----------------------------------------------------------------------

You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=456612&group_id=5470