regexp: maximum recursion limit exceeded

F. GEIGER fgeiger at
Sun Jan 6 12:18:53 EST 2002

When I applied re.findall() to an HTML file to find form contents, I got
the RuntimeError "maximum recursion limit exceeded". I thought the cause
could be a limit within findall().

So I tried replacing findall() with the following:

def findall(re, string):
   '''Find all /re/ in /string/.

   Idea: Alex Martelli in a response to a Usenet post on 4th of January
   '''
   mos = []
   pos = 0
   while 1:
      # /re/ is a compiled pattern object here, not the re module
      mo = re.search(string, pos)
      if mo is None:
         return [m.group() for m in mos]
      mos.append(mo)
      pos = mo.end()

But the problem remains. Now re.search() reports the same error, which
means that not findall() but some deeper mechanism has problems with the
string being searched.

If you want to reproduce the error, try this (any ill formatting caused by
my newsreader, sorry):

import re

def test():
   stringToBeSearchedIn = ("<form blablabla>%s</form>"
                           % ("<blablabla>blablabla</blablabla> " * 500)) * 100
#   print stringToBeSearchedIn

   pattern = re.compile(r"\<form.*?\/form\>",
                        re.DOTALL | re.MULTILINE | re.IGNORECASE)
   for stringFound in findall(pattern, stringToBeSearchedIn)[:10]:
      print stringFound

A form is very likely to cause this error if the content between the form
tags is large and, more importantly, contains many '<tag></tag>' pairs.

To overcome this, I could

1) use find() to search for '<form' and '/form>',

2) use the SGML parser,

3) use re.search() for the opening tag and kill everything up to it, then re.search() for the closing tag and kill everything after it.
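
Option 1 could look roughly like the sketch below. It is only an illustration
under some assumptions: the name find_forms is made up for this example, the
search is made case-insensitive by lowercasing a copy of the text (which
assumes lowercasing does not change the length of the string, true for ASCII
markup), and an unterminated form simply ends the scan.

```python
def find_forms(text):
    """Return every '<form ...>...</form>' span in text, using str.find() only.

    Hypothetical helper: no regular expression is involved, so no regex
    engine limit can be hit.
    """
    # Lowercased copy for case-insensitive matching; slices are taken from
    # the original text. Assumes lower() keeps the length unchanged.
    lowered = text.lower()
    forms = []
    pos = 0
    while True:
        start = lowered.find("<form", pos)
        if start == -1:
            break
        end = lowered.find("/form>", start)
        if end == -1:
            break  # unterminated form: stop rather than guess
        end += len("/form>")
        forms.append(text[start:end])
        pos = end  # continue after the closing tag
    return forms
```

Like the regular expression in test(), this treats the first '/form>' after
an opening tag as its end, so it does not cope with nested forms.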

BTW, increasing the recursion depth doesn't solve the problem.

What other options do I have? How is this done "Pythonicly"?
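
For option 2, a parser-based approach might look like the sketch below. Note
the substitution: it does not use the SGML parser module mentioned above but
html.parser.HTMLParser from the Python 3 standard library. The class name
FormCollector and the choice to record only form attributes and
input/select/textarea fields are assumptions made for this example.

```python
from html.parser import HTMLParser


class FormCollector(HTMLParser):
    """Collect the attributes and fields of each <form> (hypothetical helper)."""

    FIELD_TAGS = ("input", "select", "textarea")

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per completed form
        self._current = None  # form being parsed, or None outside a form

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "form":
            self._current = {"attrs": dict(attrs), "fields": []}
        elif self._current is not None and tag in self.FIELD_TAGS:
            self._current["fields"].append(dict(attrs))

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


collector = FormCollector()
collector.feed('<FORM action="/go"><input name="q"></FORM>')
collector.close()
print(collector.forms)
```

The parser works through the document tag by tag, so the amount of markup
between the form tags does not have to be matched by one large pattern.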

(Platform: Win2kPro/SP2, ActivePython 2.1).

Many thanks in advance and best regards
