[Python-bugs-list] [ python-Bugs-620412 ] Max recursion limit with "*?" pattern

SourceForge.net noreply@sourceforge.net
Tue, 27 May 2003 07:45:50 -0700


Bugs item #620412, was opened at 2002-10-08 19:40
Message generated for change (Settings changed) made by thorstein
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=620412&group_id=5470

Category: Regular Expressions
Group: Python 2.2.1
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: Thorstein Thorsteinsson (thorstein)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Max recursion limit with "*?" pattern

Initial Comment:
I ran into the following problem trying to parse an MS Outlook
mailbox. Cut down to its bare essentials:

> cat tst.py
import re

mstr = (11000*' ') + 'A'
pattern = re.compile('.*?A')
pattern.search(mstr)
> python tst.py
Traceback (most recent call last):
  File "tst.py", line 5, in ?
    pattern.search(mstr)
RuntimeError: maximum recursion limit exceeded
> python
Python 2.2.1c1 (#6, Jul 20 2002, 09:40:07)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)]
on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

The combination of a longish string with ".*?" gives the error.
Using ".*" is ok.

Could "non-greedy" matching be implemented non-recursively?
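
For illustration only, here is one way a single-character non-greedy
repeat could be matched with a loop instead of recursion. This is a
sketch in Python, not the code in Modules/_sre.c; the function
lazy_dot_then_literal and its parameters are hypothetical, and it
covers only the pattern shape ".*?" followed by a literal string.

```python
def lazy_dot_then_literal(text, tail, start=0):
    """Match '.*?' followed by the literal `tail`, anchored at `start`.

    Returns the end offset of the match, or None. The loop tries the
    shortest repeat first and extends it by one character per iteration,
    so the call depth stays constant regardless of the subject length.
    """
    pos = start
    while True:
        if text.startswith(tail, pos):
            return pos + len(tail)   # tail matches here: shortest match found
        if pos >= len(text) or text[pos] == '\n':
            return None              # '.' cannot consume end of text or a newline
        pos += 1                     # let '.*?' consume one more character


# Same input as in the report: 11000 spaces followed by 'A'.
mstr = (11000 * ' ') + 'A'
end = lazy_dot_then_literal(mstr, 'A')
print(end)
```

The point of the sketch is only that the shortest-first search for a
single-character repeat needs a position counter, not a stack.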

If I understand correctly, the limit exceeded is USE_RECURSION_LIMIT
in Modules/_sre.c. It is slightly confusing because we also have the
Python recursion limit (my first reaction was to bump it up with
sys.setrecursionlimit(), but that of course didn't help).
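
The separation between the two limits can be sketched as follows. This
assumes a current Python 3 interpreter (not the 2.2.1 used above),
where the limit reported by sys.getrecursionlimit() applies to
Python-level call frames; the helper name depth is made up for the
example.

```python
import re
import sys


def depth(n):
    """Recurse n levels in pure Python; each level is one interpreter frame."""
    return 0 if n == 0 else 1 + depth(n - 1)


limit = sys.getrecursionlimit()

# Python-level recursion past the interpreter limit fails...
try:
    depth(limit + 100)
    python_recursion_failed = False
except RecursionError:  # a subclass of RuntimeError in Python 3
    python_recursion_failed = True

# ...while the regex engine still scans the 11000-character subject,
# because it does not use Python frames for its own bookkeeping.
mstr = (11000 * ' ') + 'A'
regex_matched = re.search('.*?A', mstr) is not None

print(python_recursion_failed, regex_matched)
```

Changing the value with sys.setrecursionlimit() moves the point at
which depth() fails, but it is not an input to the regex engine.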

----------------------------------------------------------------------

>Comment By: Thorstein Thorsteinsson (thorstein)
Date: 2003-05-27 14:45

Message:

I've tried my example with Python 2.3, and the error has disappeared.
So as far as I'm concerned, this bug report can be deleted.

Thanks.
Thorstein

----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2003-05-24 16:49

Message:

As Gary Herron correctly pointed out to me, this was fixed in
2.3 with the introduction of a new opcode to handle single
character non-greedy matching.
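
A quick way to check the behaviour on a given interpreter is to rerun
the submitter's test for both the greedy and the non-greedy form. The
sketch below assumes a current Python 3 interpreter, where both are
expected to succeed; the helper name match_end is hypothetical.

```python
import re

# Same input as in the original report: 11000 spaces followed by 'A'.
mstr = (11000 * ' ') + 'A'


def match_end(pattern_text, text):
    """Return the end offset of the first match, or None if there is none."""
    m = re.compile(pattern_text).search(text)
    return m.end() if m else None


greedy_end = match_end('.*A', mstr)   # greedy form: reported as ok on 2.2.1
lazy_end = match_end('.*?A', mstr)    # non-greedy form: RuntimeError on 2.2.1

print(greedy_end, lazy_end)
```

On an affected 2.2.x build the second call would be the one that
raises; on fixed versions both calls return the same offset.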

This won't be fixed in 2.2.3, but hopefully will be
backported to 2.2.4 together with other regular expression
fixes.


----------------------------------------------------------------------
