[Python-Dev] one last SRE headache

Tim Peters tim_one@email.msn.com
Thu, 31 Aug 2000 17:55:56 -0400


The PRE documentation expresses the true intent:

    \number
    Matches the contents of the group of the same number. Groups
    are numbered starting from 1. For example, (.+) \1 matches 'the the'
    or '55 55', but not 'the end' (note the space after the group). This
    special sequence can only be used to match one of the first 99 groups.
    If the first digit of number is 0, or number is 3 octal digits long,
    it will not be interpreted as a group match, but as the character with
    octal value number. Inside the "[" and "]" of a character class, all
    numeric escapes are treated as characters.
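
As a rough illustration of the rule quoted above, here is a minimal sketch run against the present-day `re` module. It assumes that module still follows the documented PRE behaviour, and says nothing about how SRE behaved at the time of this thread. The helper name `compiles` and the variable names are made up for the example.

```python
import re

def compiles(pattern):
    """Hypothetical helper: True if re.compile accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Plain backreference: \1 must repeat whatever group 1 matched.
doubled = re.compile(r'(.+) \1')
repeat_matches = doubled.fullmatch('the the') is not None
other_matches = doubled.fullmatch('the end') is not None

# The pattern from this thread: ten nested groups, then \41.
ten_groups = '(' * 10 + 'a' + ')' * 10
two_digit_accepted = compiles(ten_groups + r'\41')  # there is no group 41

# A leading 0, or three octal digits, makes the escape a character.
leading_zero = re.fullmatch(ten_groups + r'\041', 'a!') is not None
three_octal = re.fullmatch(r'\141', 'a') is not None  # 0o141 == ord('a')

# Inside a character class, a numeric escape is always a character.
in_class = re.fullmatch(r'[\1]', '\x01') is not None
```

Under that assumption, `\41` after only ten groups is rejected as an invalid group reference rather than being read as "!", while `\041` matches "!".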

This was discussed at length when we decided to go the Perl-compatible
route, and Perl's rules for backreferences were agreed to be just too ugly
to emulate.  The meaning of \oo in Perl depends on how many groups precede
it!  In this case, there are fewer than 41 groups, so Perl says "octal
escape"; but if 41 or more groups had preceded, it would mean
"backreference" instead(!).  Simply unbearably ugly and error-prone.
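
To make the context dependence concrete, here is a small sketch that models only the two-digit case described above. It is not a reimplementation of Perl's parser, and the function name `perl_style_meaning` is hypothetical.

```python
def perl_style_meaning(digits, groups_before):
    """Model the two-digit rule described above (hypothetical helper).

    digits: the escape's digits as written, e.g. "41"
    groups_before: how many groups were opened before the escape
    """
    if groups_before >= int(digits):
        return ("backreference", int(digits))
    # Otherwise the same digits are read as an octal character code.
    return ("octal escape", chr(int(digits, 8)))

few_groups = perl_style_meaning("41", 10)
many_groups = perl_style_meaning("41", 41)
```

The same text `\41` comes out as the character "!" in the first call and as a reference to group 41 in the second, which is exactly the ambiguity the PRE rule avoids.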

> -----Original Message-----
> From: python-dev-admin@python.org [mailto:python-dev-admin@python.org] On
> Behalf Of Fredrik Lundh
> Sent: Thursday, August 31, 2000 3:47 PM
> To: python-dev@python.org
> Subject: [Python-Dev] one last SRE headache
>
>
> can anyone tell me how Perl treats this pattern?
>
>     r'((((((((((a))))))))))\41'
>
> in SRE, this is currently a couple of nested groups, surrounding
> a single literal, followed by a back reference to the fourth group,
> followed by a literal "1" (since there are fewer than 41 groups)
>
> in PRE, it turns out that this is a syntax error; there's no group 41.
>
> however, these tests appear in the test suite under the section "all
> test from perl", but they're commented out:
>
> # Python does not have the same rules for \\41 so this is a syntax error
> #    ('((((((((((a))))))))))\\41', 'aa', FAIL),
> #    ('((((((((((a))))))))))\\41', 'a!', SUCCEED, 'found', 'a!'),
>
> if I understand this correctly, Perl treats this as an *octal* escape
> (chr(041) == "!").
>
> now, should I emulate PRE, Perl, or leave it as it is...
>
> </F>
>
> PS. in case anyone wondered why I haven't seen this before, it's
> because I just discovered that the test suite masks syntax errors
> under some circumstances...