[ python-Bugs-866081 ] re infinite loop on valid input

SourceForge.net noreply at sourceforge.net
Fri Jan 2 11:03:16 EST 2004


Bugs item #866081, was opened at 2003-12-26 15:29
Message generated for change (Comment added) made by akuchling
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=866081&group_id=5470

Category: Python Library
>Group: Not a Bug
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: David Jeske (jeske)
>Assigned to: A.M. Kuchling (akuchling)
Summary: re infinite loop on valid input

Initial Comment:
Run the following two lines in an interpreter and watch 
your CPU get eaten in an infinite loop inside the RE 
library:

>>> a_line = " http://groups.yahoo.com/group/tuna/"
>>> re.search("/(([^ ]+)){30,}",a_line)



----------------------------------------------------------------------

>Comment By: A.M. Kuchling (akuchling)
Date: 2004-01-02 11:03

Message:
Logged In: YES 
user_id=11375

It's not actually an infinite loop, just one that takes time
that's an exponential function of the size of the pattern. 
I've attached a test program that demonstrates it; on my
machine it runs in .02 sec for a 6-character string, .04 sec
for a 7-character string, .07 sec for an 8-character string,
and so forth.  18 characters takes 56 seconds, 19 takes 112,
and so forth.  

The problem is nesting repetition operators, applying {30,}
to the +.  The engine tries one possible grouping, e.g. the
[]+ matches 1 character the first time, 1 character the
second, etc.  This fails, so it tries a different grouping
([]+ matches 2 characters the first time, 1 character the
second, etc.) and fails again.
It's going to try all possible combinations, and they're all
going to fail.

Solution: rewrite the regex to avoid nesting repetitions
like this.  In your case, it's not clear what you're trying
to do with this regex; perhaps you can use string.split('/')
to disassemble URL components.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=866081&group_id=5470



More information about the Python-bugs-list mailing list