[Python-Dev] should we keep the \xnnnn escape in unicode strings?
Finn Bock
bckfnn@worldonline.dk
Sun, 16 Jul 2000 12:42:01 GMT
[Fredrik Lundh]
>mal wrote:
>
>> > 1. treat \x as a hexadecimal byte, not a hexadecimal
>> > character. or in other words, make sure that
>> >
>> > ord("\xabcd") == ord(u"\xabcd")
>> >
>> > fwiw, this is how it's done in SRE's parser (see the
>> > python-dev archives for more background).
>...
>> > 5. leave it as it is (just fix the comment).
>>
>> I'd suggest 5 -- makes converting 8-bit strings using \x
>> to Unicode a tad easier.
>
>if that's the main argument, you really want alternative 1.
>
>with alternative 5, the contents of the string may change
>if you add a leading "u".
>
>alternative 1 is also the only reasonable way to make ordinary
>strings compatible with SRE (see the earlier discussion for why
>SRE has to be strict on this one...)
>
>so let's change the question into a proposal:
>
> for maximum compatibility with 8-bit strings and SRE,
> let's change "\x" to mean "binary byte" in unicode string
> literals too.
This would potentially break JPython, where \x is already used to
introduce 16-bit chars in ordinary strings. OTOH, the implementation of
\x in JPython is so full of bugs and inconsistencies that I'm +1 on your
proposal.
regards,
finn
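
For readers following the thread, here is a rough sketch of the two readings of \x being compared above. It is not the interpreter's actual parser. The helper names (decode_x_as_byte, decode_x_as_char) are hypothetical, and it assumes that \x greedily consumes all following hex digits and that the "byte" reading keeps only the low 8 bits, which is how the ord("\xabcd") example in the quoted text appears to work.

```python
import re

# Hypothetical helpers for illustration only; they are not part of
# CPython, JPython, or SRE.
# Assumption: \x consumes every hex digit that follows it.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]+)")


def decode_x_as_byte(literal_body):
    """Alternative 1: \\x means one byte (low 8 bits of the value)."""
    return _HEX_ESCAPE.sub(
        lambda m: chr(int(m.group(1), 16) & 0xFF), literal_body
    )


def decode_x_as_char(literal_body):
    """Alternative 5 as read here: \\x means a 16-bit character."""
    return _HEX_ESCAPE.sub(
        lambda m: chr(int(m.group(1), 16) & 0xFFFF), literal_body
    )


body = r"\xabcd"
print(hex(ord(decode_x_as_byte(body))))  # 0xcd
print(hex(ord(decode_x_as_char(body))))  # 0xabcd
```

Under the byte reading the same source text yields the same code point whether or not a leading "u" is added, which seems to be the property the proposal is after. Under the character reading the two literals diverge.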