Pathological regular expression

John Machin sjmachin at lexicon.net
Sat Apr 11 21:08:20 EDT 2009


On Apr 12, 10:31 am, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> On Sat, 11 Apr 2009 16:46:20 -0700, John Machin wrote:
> > On Apr 12, 3:40 am, Steven D'Aprano <st... at REMOVE-THIS-
> > cybersource.com.au> wrote:
> >> On Sat, 11 Apr 2009 08:40:03 -0700, John Machin wrote:
> >> >> To my mind, this is a bug in the RE engine. Is there any reason to
> >> >> not treat it as a bug?
>
> >> > IMHO it's not a bug -- s/hang/takes a long time to compute/
>
> >> > Just look at it: 2 + operators and 3 * operators ... It's one of
> >> > those "come back after lunch" REs.
>
> >> Well, it's been running now for about two and a half hours, that's a
> >> rather long lunch. And despite MRAB's assertion, it *cannot* be
> >> interrupted by ctrl-C. That means that to all intents and purposes, the
> >> interpreter has locked up for the duration of the calculation, which
> >> may be days or weeks for all I know.
>
> > If you don't know, experiment!
>
> My original test has been running for close to ten hours now, and
> still can't be interrupted with ctrl-C. However that's in Python 2.5,

What platform are you running on? It works OK for me back to 2.1 on
Windows XP SP3:

Traceback (most recent call last):
  File "weirdre.py", line 34, in ?
    result = _re_comments.sub(r"\1", line)
  File "C:\python21\lib\sre.py", line 164, in _sub
    return _subn(pattern, template, string, count)[0]
  File "C:\python21\lib\sre.py", line 179, in _subn
    m = c.search()
KeyboardInterrupt


> having tried it in Python 2.6.2 they can be interrupted, so I'm satisfied
> that this bug of "regex hangs the interpreter" is not worth reporting, as
> it affects only older versions.
>
> [...]
>
> > 3. Test the RE, get it correct
>
> They're not my regexes. I don't care whether they are correct or not, I'm
> more concerned about them locking up the interpreter.

That's understood -- it was a shotgun reply to the OP and most
responders; each can take their own pellets ;-)
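
On the "experiment" point: a minimal sketch of how one might watch the
blow-up happen without waiting hours. The pattern below is a made-up
stand-in (it is NOT the RE from the original post, which isn't
reproduced in this message), and time_search is just a hypothetical
helper name. It assumes a reasonably recent Python with
time.perf_counter. The idea is that nested quantifiers plus a subject
that almost matches force the engine to try a number of alternatives
that roughly doubles with each extra character.

```python
import re
import time

# Hypothetical pattern for illustration only: a quantified group whose
# body is itself quantified. On a subject that nearly matches, the
# backtracking engine explores an exponential number of ways to split
# the run of 'a' characters before giving up.
PATTERN = re.compile(r"(a+)+$")


def time_search(n):
    """Search 'a' * n + 'b' and return (match_object_or_None, seconds)."""
    subject = "a" * n + "b"   # the trailing 'b' guarantees a failed match
    start = time.perf_counter()
    match = PATTERN.search(subject)
    return match, time.perf_counter() - start


if __name__ == "__main__":
    # Keep n small; each step of 2 costs roughly four times as long.
    for n in range(10, 21, 2):
        match, elapsed = time_search(n)
        print("n=%2d  matched=%s  %.4f s" % (n, match is not None, elapsed))
```

Extrapolating from the timings at small n gives a rough idea of whether
a given RE is a "come back after lunch" job or a "come back next year"
one. A subject consisting only of 'a' characters matches immediately,
so it is the near-miss input that exposes the problem.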


