[Python-bugs-list] [ python-Bugs-456612 ] SRE fix 133283 (minimizing repeats) bug
noreply@sourceforge.net
noreply@sourceforge.net
Mon, 10 Dec 2001 00:04:42 -0800
Bugs item #456612, was opened at 2001-08-29 11:03
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=456612&group_id=5470
Category: Regular Expressions
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Greg Chapman (glchapman)
Assigned to: Fredrik Lundh (effbot)
Summary: SRE fix 133283 (minimizing repeats) bug
Initial Comment:
The following is from Python 2.2a2:
>>> pat = sre.compile(r'"(?:\")*?"')
>>> m = pat.search(r'"a"')
>>> m.group()
'"a"'
Clearly, that string should not have matched the
pattern (since the pattern should not allow the 'a').
I believe the problem is from the change marked as
#133283 in _sre.c. The code in the first ("unbounded
repeat") branch of the if statement searches forward
through the string until it matches the tail ('"'),
but then the code neglects to check if the repeated
pattern actually matches the intervening characters.
FWIW, it seems to me that the behavior for non-
unbounded repeats (which I believe is identical to the
pcre behavior) makes more sense. That is, it seems to
me that a minimizing repeat should minimize the number
of times the repeated pattern is applied, not the
number of characters consumed by those applications.
(However, I see that Perl (v. 5.5) minimizes
characters consumed, so I suppose that's the reason
for the change).
----------------------------------------------------------------------
>Comment By: Fredrik Lundh (effbot)
Date: 2001-12-10 00:04
Message:
Logged In: YES
user_id=38376
fixed in 2.2rc1 (patch #133283 was broken)
----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=456612&group_id=5470