[Python-Dev] sysconfig/SRE bug

A.M. Kuchling akuchlin@mems-exchange.org
Sat, 1 Jul 2000 09:22:30 -0400

sysconfig.py contains a pattern that breaks SRE:

>>> import re, pre
>>> pre.compile(r"\${([A-Za-z][A-Za-z0-9_]*)}")
<pre.RegexObject instance at 0x827a394>
>>> re.compile(r"\${([A-Za-z][A-Za-z0-9_]*)}")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.0/sre.py", line 54, in compile
    return _compile(pattern, flags)
  File "/usr/lib/python2.0/sre_parse.py", line 538, in parse
    p = _parse(source, state)
  File "/usr/lib/python2.0/sre_parse.py", line 394, in _parse
    raise error, "bogus range"

Escaping the { and } fixes this.  The problem is with the {xxx,yyy}
notation for ranges; PCRE looks ahead, and treats the { as a literal
unless it's followed by digits of the right form.  From pypcre.c:

/* This function is called when a '{' is encountered in a place where it might
start a quantifier. It looks ahead to see if it really is a quantifier or not.
It is only a quantifier if it is one of the forms {ddd} {ddd,} or {ddd,ddd}
where the ddds are digits.
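
The look-ahead that comment describes can be sketched in Python roughly
as follows.  This is only an illustration of the rule, not the pypcre.c
or SRE code; the helper name is_counted_repeat is made up for the
example, and it accepts just the three forms the comment lists.

```python
import re

# Hypothetical helper (not part of pre, sre, or pypcre.c): accepts only the
# forms named in the pypcre.c comment: {ddd}, {ddd,} and {ddd,ddd}.
_QUANTIFIER = re.compile(r"\{\d+(,\d*)?\}")

def is_counted_repeat(pattern, pos):
    """Return True if pattern[pos:] starts with a well-formed {m,n} repeat."""
    return _QUANTIFIER.match(pattern, pos) is not None

pattern = r"\${([A-Za-z][A-Za-z0-9_]*)}"
brace = pattern.index("{")

# The '{' in the sysconfig pattern is followed by '(', not digits, so a
# parser applying this rule would treat it as a literal character.
print(is_counted_repeat(pattern, brace))    # False
print(is_counted_repeat("a{2,3}", 1))       # True
```

A parser using a check like this would fall back to a literal '{'
whenever the function returns false, instead of raising an error.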

I suppose the goal of Perl compatibility means this magical behaviour
needs to be preserved?
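
For reference, the escaped form mentioned above looks like the sketch
below.  With the braces escaped, the pattern does not depend on the
look-ahead at all.  The name var_ref and the sample Makefile-style line
are assumptions for illustration, not taken from sysconfig.py.

```python
import re

# Braces escaped so they can never be read as the start of a {m,n} repeat.
var_ref = re.compile(r"\$\{([A-Za-z][A-Za-z0-9_]*)\}")

# Hypothetical input line, chosen only to exercise the pattern.
m = var_ref.search("CFLAGS = ${OPT} -I.")
print(m.group(1))    # OPT
```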