[Python-bugs-list] [ python-Bugs-439997 ] infinite loop in re.match
noreply@sourceforge.net
noreply@sourceforge.net
Tue, 10 Jul 2001 12:58:10 -0700
Bugs item #439997, was opened at 2001-07-10 02:44
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=439997&group_id=5470
Category: Regular Expressions
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Dieter Maurer (dmaurer)
Assigned to: Fredrik Lundh (effbot)
Summary: infinite loop in re.match
Initial Comment:
Python 2.0.1 "re.match" enters infinite loop for
a regular expression that was carefully designed not
to lead to loops (an easy expression).
See attached file.
----------------------------------------------------------------------
>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-07-10 12:58
Message:
Logged In: YES
user_id=6380
A general comment for folks who report problems with regular
expressions: Komodo has a tool that debugs a regular
expression, by stepping through its execution step by step.
Very educational, and would probably help understanding
what's going on in cases like the case here reported.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2001-07-10 10:01
Message:
Logged In: YES
user_id=31435
Assigned to /F, but I don't see a bug here. While your
regexp should be speedy *when* applied to a string that
actually matches (the regexp does not match the long string
in your test case), you create an exponential number of
backtracking possibilities by repeating \s* at both ends of
the first repeated group (in effect, you're specifying an
unbounded number of instances of the highly ambiguous
\s*\s*). So when the regexp doesn't match, it can be
unboundedly slow. This minor rewrite doesn't have that
problem (it simply moves the first \s* "out of the loop"):
r='</[^>]*>\s*(<[?][^>]*>\s*)+<[^/?][^>]*>'
BTW, your test string doesn't match because it ends with
</list>
It *would* match if it ended with
<list>
It's unclear whether you expected it to match.
----------------------------------------------------------------------
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=439997&group_id=5470