[New-bugs-announce] [issue3128] Regex causes python to hang up? / loop infinite?

André Fritzsche report at bugs.python.org
Tue Jun 17 09:16:26 CEST 2008


New submission from André Fritzsche <computercrustie at users.sourceforge.net>:

After struggling with my code for nearly an hour, I found out that one
of my regular expressions, applied to a particular string, causes Python
to hang. It is not really hanging, because processor usage stays at
nearly 100%, so I think the regex engine is looping infinitely.

Here is the regex-string:

import re

re_exc_line = re.compile(
        # ignore everything before the first match
        r'^.*' +
        # first group (includes second | third)
        r'(?:' +
         # second group "(line) (file)"
         r'(?:' +
          # (text to ignore, line [number])
          r'\([^,]+\s*,\s*line\s+(?P<line1>\d+)\)' +
          # any text ([filename]) any text
          r'.*\((?:(?P<file1>[^)]+))*\).*' +
         # end of second group
         r')' +
        # or
        r'|' +
         # third group "(file) (line)"
         r'(?:' +
          # ([filename])
          r'\((?:(?P<file2>[^)]+))*\)' +
          # any text (text to ignore, line [number]) any text
          r'.*\([^,]+\s*,\s*line\s+(?P<line2>\d+)\).*' +
          # end of third group
         r')' +
        # end of first group
        r')' +
        # any text after it
        r'.*$'
        , re.I
    )

It should match either this construct:

1. """some optional text (text to ignore, line [12]) ([any_filename])
followed by optional text"""

or:

2. """some optional text ([any_filename]) (text to ignore, line [12])
followed by optional text"""

If the first construct matches, the values are put into 'line1' and
'file1' of the groupdict; if the second one matches, they are put into
'line2' and 'file2'.
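As a minimal sketch of the intended behaviour, the snippet below restates the same pattern in compact form and applies it to two short, made-up inputs. The sample strings and the helper `extract` are illustrative assumptions and are not part of the attached file; the inputs are kept deliberately short.

```python
import re

# Compact restatement of the pattern above (same pieces, joined by
# implicit string concatenation instead of '+').
re_exc_line = re.compile(
    r'^.*'
    r'(?:'
    r'(?:'
    r'\([^,]+\s*,\s*line\s+(?P<line1>\d+)\)'
    r'.*\((?:(?P<file1>[^)]+))*\).*'
    r')'
    r'|'
    r'(?:'
    r'\((?:(?P<file2>[^)]+))*\)'
    r'.*\([^,]+\s*,\s*line\s+(?P<line2>\d+)\).*'
    r')'
    r')'
    r'.*$',
    re.I,
)


def extract(text):
    """Hypothetical helper: return (file, line) from whichever pair of groups is set."""
    m = re_exc_line.match(text)
    if m is None:
        return None
    groups = m.groupdict()
    return (groups['file1'] or groups['file2'],
            groups['line1'] or groups['line2'])


# Made-up inputs for the two constructs: "(line) (file)" and "(file) (line)".
line_then_file = "intro (err, line 12) (foo.py) outro"
file_then_line = "intro (foo.py) (err, line 12) outro"

print(extract(line_then_file))
print(extract(file_then_line))
```

Folding the two group pairs into one tuple keeps the calling code independent of which alternative matched.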

Both of the examples above work fine, but the following string (I had
to change some pathnames because they contained customer names):
msg = (
r"Error: Error during parsing: invalid syntax " +
r"(D:\Projects\retest\ver_700\lib\_test\test_sapekl.py, line 14) " +
r"-- Error during parsing: invalid syntax " + 
r"(D:\projects\retest\ver_700\modules\sapekl\__init__.py, line 21) " +
r"-- Attempted relative import in non-package, or beyond toplevel " +
r"package")

matched against the regex above:

re_exc_line.match(msg)

has been running for two hours now (on a 3 GHz machine)!
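A plausible explanation, offered here as an assumption rather than a confirmed diagnosis, is the nested quantifier in the filename fragment: `(?:(?P<file2>[^)]+))*` repeats a group that itself contains `+`, so a failed match can make a backtracking engine try every way of splitting the parenthesised text. The sketch below isolates that fragment and appends a `!` that never matches, to force the failure; the reduced pattern and the function name `time_failed_match` are hypothetical.

```python
import re
import time

# Hypothetical reduced pattern: the filename fragment from the report,
# followed by a '!' that is never present, so every attempt must fail.
nested = re.compile(r'\((?:(?P<file2>[^)]+))*\)!')


def time_failed_match(n):
    """Match against '(' + n letters + ')' and return (result, elapsed seconds)."""
    subject = '(' + 'a' * n + ')'
    start = time.perf_counter()
    result = nested.match(subject)
    return result, time.perf_counter() - start


# Small sizes only: the running time is expected to grow very quickly with n.
for n in range(10, 17, 2):
    result, seconds = time_failed_match(n)
    print(n, result, '%.4fs' % seconds)
```

If the timings grow by a roughly constant factor for each added character, that would be consistent with exponential backtracking rather than a true infinite loop. The second parenthesised text in the message above is about 60 characters long.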

I've attached everything as an example file and hope this helps.
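One possible workaround, sketched here under the assumption that the nested quantifier is what makes the match so slow, is to replace `(?:(?P<fileN>[^)]+))*` with a single quantified group, `(?P<fileN>[^)]*)`. The name `re_exc_line_flat` is hypothetical, and the variant is not fully equivalent: for empty parentheses the file group becomes an empty string instead of None.

```python
import re

# Hypothetical variant of the reported pattern: each filename group uses a
# single '*' quantifier instead of a '+' nested inside a '*'.
re_exc_line_flat = re.compile(
    r'^.*'
    r'(?:'
    r'(?:'
    r'\([^,]+\s*,\s*line\s+(?P<line1>\d+)\)'
    r'.*\((?P<file1>[^)]*)\).*'
    r')'
    r'|'
    r'(?:'
    r'\((?P<file2>[^)]*)\)'
    r'.*\([^,]+\s*,\s*line\s+(?P<line2>\d+)\).*'
    r')'
    r')'
    r'.*$',
    re.I,
)

# The message from the report, unchanged.
msg = (
    r"Error: Error during parsing: invalid syntax "
    r"(D:\Projects\retest\ver_700\lib\_test\test_sapekl.py, line 14) "
    r"-- Error during parsing: invalid syntax "
    r"(D:\projects\retest\ver_700\modules\sapekl\__init__.py, line 21) "
    r"-- Attempted relative import in non-package, or beyond toplevel "
    r"package")

m = re_exc_line_flat.match(msg)
print(m.groupdict())
```

With this message the first alternative matches, so 'file1' holds the whole text of the second pair of parentheses, including its ", line 21" part. Whether that is the desired result depends on what the pattern is meant to extract from messages with two "(file, line)" pairs.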

----------
components: Regular Expressions
files: re_problem.py
messages: 68304
nosy: computercrustie
severity: normal
status: open
title: Regex causes python to hang up? / loop infinite?
type: behavior
versions: Python 2.5
Added file: http://bugs.python.org/file10642/re_problem.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3128>
_______________________________________

