[I18n-sig] Re: Python and Unicode == Britain and the Euro?

Tim Peters tim.one@home.com
Sun, 11 Feb 2001 16:43:41 -0500


>> The Python Reference Manual says (chapter 2, "Lexical analysis"):
>>
>>     Python uses the 7-bit ASCII character set for program text and
>>     string literals.

[/F]
> ...and then says "8-bit characters may be used in string literals
> and comments but their interpretation is platform dependent".
>
> for a non-ASCII programmer, that pretty much means "no native
> character set".

Absolutely.  That's why the Ref Man also says:

    the proper way to insert 8-bit characters in string literals
    is by using octal or hexadecimal escape sequences
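To make the escape-sequence advice concrete, here is a minimal sketch. It assumes a modern Python 3 interpreter, where bytes literals stand in for the 8-bit strings under discussion; the variable names are illustrative only and come from no Python documentation.

```python
# Assumption: Python 3, using bytes literals as the 8-bit strings in question.
# Both literals are written in pure 7-bit ASCII source text.
hex_form = b"caf\xe9"     # hexadecimal escape for byte value 233
octal_form = b"caf\351"   # octal escape for the same byte value

# The two spellings denote the identical 4-byte string.
same_bytes = hex_form == octal_form

# The source text of the literal itself never leaves 7-bit ASCII:
# a raw bytes literal keeps the backslash, 'x', 'e', '9' as-is.
literal_text = rb"caf\xe9"
source_is_ascii = all(byte < 0x80 for byte in literal_text)

# What the high byte *means* still depends on the character set applied later.
as_latin1 = hex_form.decode("latin-1")
as_cp437 = hex_form.decode("cp437")
interpretation_differs = as_latin1 != as_cp437
```

The escapes pin down the byte value regardless of platform; which character that byte displays as is a separate question that the escape does not answer.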

Note too that Python opens Python source files in C text mode, and C doesn't
guarantee that high-bit characters can be faithfully written to or read back
from text-mode files either.

What's the point?  As I said before, the *intent* was that Python source
code use 7-bit ASCII.  All we're demonstrating here is the various ways in
which the Ref Man is consistent with that intent.  Go beyond that, and if
"it works" you're seeing a platform accident, albeit a reliable accident on
the major Python platforms.
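
For anyone who wants to know whether a source file is leaning on that accident, a rough check is easy to sketch. This assumes Python 3 and that the file contents have already been read as bytes; find_non_ascii is a made-up helper name, not anything Python ships.

```python
def find_non_ascii(source: bytes):
    """Return (line_number, column, byte_value) for each byte above 0x7F.

    Hypothetical helper for illustration; columns are 0-based byte offsets.
    """
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for column, byte in enumerate(line):
            if byte > 0x7F:
                hits.append((line_number, column, byte))
    return hits


# Line 1 embeds a raw high-bit byte; line 2 spells the same byte as an escape,
# so its source text stays within 7-bit ASCII.
sample = b'name = "caf\xe9"\nsafe = "caf\\xe9"\n'
hits = find_non_ascii(sample)
```

An empty result means the text stays inside the 7-bit ASCII the Ref Man describes; any hit marks a spot whose meaning depends on the platform.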