Pathological regular expression

John Machin sjmachin at lexicon.net
Sat Apr 11 17:40:03 CEST 2009


On Apr 12, 1:07 am, Steven D'Aprano <st... at REMOVE-THIS-cybersource.com.au> wrote:
> On Thu, 09 Apr 2009 02:56:00 -0700, David Liang wrote:
> > Hi all,
> > I'm having a weird problem with a regular expression (tested in 2.6 and
> > 3.0):
>
> > Basically, any of these:
> > _re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$')
> > _re_comments = re.compile(r'^(([^#]+|\\.|"([^"\\]+|\\.)*")*)#.*$')
> > _re_comments = re.compile(r'^(([^"]+|\\.|"([^"\\]+|\\.)*")*)#.*$')
>
> > followed by for example,
> > line = r'~/.[m]ozilla/firefox/*.default/chrome'
> > print(_re_comments.sub(r'\1', line))
>
> > ...hangs the interpreter.
>
> I can confirm the first one hangs the interpreter in Python 2.5 as well.
> I haven't tested the other two.
>
> To my mind, this is a bug in the RE engine. Is there any reason to not
> treat it as a bug?

IMHO it's not a bug -- s/hang/takes a long time to compute/
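
A rough way to check the "long time, not a hang" reading is to time the first pattern from the original post on short prefixes of the same line. This is only a sketch: the helper name timed_sub and the chosen prefix lengths are mine, not from the thread, and the prefixes are kept short so the script finishes quickly. On a backtracking engine such as CPython's re, I would expect the time to grow roughly fourfold for every two extra characters, since a run of n non-backslash characters can be split into pieces in about 2**(n-1) ways.

```python
import re
import time

# First pattern from the original post, unchanged.
_re_comments = re.compile(r'^(([^\\]+|\\.|"([^"\\]+|\\.)*")*)#.*$')

line = r'~/.[m]ozilla/firefox/*.default/chrome'


def timed_sub(text):
    """Return (result, seconds) for one substitution attempt (hypothetical helper)."""
    start = time.perf_counter()
    result = _re_comments.sub(r'\1', text)
    return result, time.perf_counter() - start


# Prefix lengths are deliberately small; the full 37-character line
# is the case reported as hanging.
timings = []
for n in (10, 12, 14, 16, 18):
    result, seconds = timed_sub(line[:n])
    timings.append((n, result, seconds))
    print(n, repr(result), "%.4f s" % seconds)
```

None of the prefixes contains a '#', so every call returns its input unchanged; only the time taken to reach that failure differs.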

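Just look at it: 2 + operators and 3 * operators, with each + sitting
inside a * group, so the engine can split the same run of characters
in a huge number of ways before it gives up. It's one of those
"come back after lunch" REs.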



