[Python-Dev] Is this safe enough? Re: [Python-checkins] cpython: _Py_Identifier are always ASCII strings
Victor Stinner
victor.stinner at haypocalc.com
Tue Feb 7 09:55:06 CET 2012
2012/2/7 "Martin v. Löwis" <martin at v.loewis.de>:
>> _Py_IDENTIFIER(xxx) defines a variable called PyId_xxx, so xxx can
>> only be ASCII: the C language doesn't accept non-ASCII identifiers.
>
> That's not exactly true. In C89, source code is in the "source character
> set", which is implementation-defined, except that it must contain
> the "basic character set". I believe that it allows for
> implementation-defined characters in identifiers.
Hum, I hope that these C89 compilers use UTF-8.
> In C99, this is extended to include "universal character names"
> (\u escapes). They may appear in identifiers as long as the characters
> named are listed in annex D.59 (which I cannot locate).
Does C99 specify the encoding? Can we expect UTF-8?
Python is supposed to work on many platforms and so has to support a lot
of compilers, not only the ones that accept non-ASCII identifiers.
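
To make the argument concrete, here is a rough sketch of the idea. It is
not the real CPython definition: MY_IDENTIFIER, my_identifier, is_ascii
and check_identifiers are made-up names, and I am assuming the macro
builds the variable name by token pasting and the stored string by
stringification. Under that assumption, the stored string is always the
spelling of a C identifier, so with a compiler that only accepts the
basic character set in identifiers it is pure ASCII.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the private _Py_Identifier struct. */
typedef struct {
    const char *string;   /* spelling of the identifier, produced by #name */
} my_identifier;

/* Hypothetical stand-in for _Py_IDENTIFIER(name): the argument is pasted
   into a variable name (MyId_<name>) and stringified for the stored text,
   so it has to be a valid C identifier. */
#define MY_IDENTIFIER(name) static my_identifier MyId_##name = { #name }

MY_IDENTIFIER(close);
MY_IDENTIFIER(__dict__);

/* Return 1 if every byte of s is in the ASCII range 0..127, else 0. */
static int is_ascii(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s > 127)
            return 0;
    }
    return 1;
}

/* Check that the identifiers defined above carry ASCII-only strings. */
static void check_identifiers(void)
{
    assert(strcmp(MyId_close.string, "close") == 0);
    assert(is_ascii(MyId_close.string));
    assert(is_ascii(MyId___dict__.string));
}
```

Calling check_identifiers() from a test program passes for these two
names. Whether a non-ASCII macro argument compiles at all depends on the
compiler, which is exactly the portability concern.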
Victor