[Python-Dev] one last SRE headache

Guido van Rossum guido@beopen.com
Thu, 31 Aug 2000 16:12:29 -0500


> amk wrote:
> > outside a character class it's a character if there are exactly
> > 3 octal digits; otherwise it's a backref.  So \41 is a backref
> > to group 41, but \041 is the literal character ASCII 33.
> 
> so what's the right way to parse this?
> 
> read up to three digits, check if they're a valid octal
> number, and treat them as a decimal group number if
> not?

Suggestion:

If there are fewer than 3 digits, it's a group.

If there are exactly 3 digits and you have 100 or more groups, it's a
group -- too bad, you lose octal escape support.  Use \x instead. :-)

If there are exactly 3 digits and you have at most 99 groups, it's an
octal escape.
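
The three rules above can be sketched as a small classifier.  This is
only an illustration of the suggestion, not SRE's actual parser: the
function name classify_escape and its arguments are hypothetical, and
rejecting a three-digit run containing 8 or 9 is an assumption, since
the rules above don't say what to do with it.

```python
import re


def classify_escape(digits, group_count):
    """Classify the digits following a backslash outside a character class.

    digits      -- the run of one to three decimal digits after the backslash
    group_count -- number of groups in the pattern (hypothetical input)

    Returns ("group", n) for a backreference or ("octal", ch) for a
    literal character.
    """
    if not re.fullmatch(r"[0-9]{1,3}", digits):
        raise ValueError("expected one to three decimal digits")

    # Rule 1: fewer than 3 digits is always a group.
    if len(digits) < 3:
        return ("group", int(digits))

    # Rule 2: exactly 3 digits and 100 or more groups is a group.
    if group_count >= 100:
        return ("group", int(digits))

    # Rule 3: exactly 3 digits and at most 99 groups is an octal escape.
    # Assumption: digits 8 and 9 are not valid octal, so reject them here.
    if not re.fullmatch(r"[0-7]{3}", digits):
        raise ValueError("not a valid octal escape: \\" + digits)
    return ("octal", chr(int(digits, 8)))


print(classify_escape("41", 10))    # two digits: group 41
print(classify_escape("041", 10))   # three digits, few groups: octal escape
print(classify_escape("041", 100))  # three digits, 100 groups: group 41
```

With this sketch, \041 only stops being an octal escape once the
pattern has 100 or more groups, at which point \x is the way to spell
the literal character.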

(Can you even have more than 99 groups in SRE?)

--Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)