[Python-3000] PEP: Supporting Non-ASCII Identifiers
Jim Jewett
jimjjewett at gmail.com
Mon Jun 4 22:50:09 CEST 2007
On 6/4/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > It seems to me the simplest thing to do is to require that Python
> > source files be normalized. Then the ambiguity just goes away.
> > Everyone knows what form their files should be in, and if you really
> > need to construct a non-normalized string, you can do that explicitly
> > using "\u" notation.
> However, what would that mean wrt. non-Unicode source encodings?
> Say you have Latin-1-encoded source code. Is that in NFC or not?
Doesn't that depend on whether they ever happened to write some of the
precomposed characters (such as ö) using a two-character form like o¨?
FWIW, I would prefer "the parser will normalize" to "the parser will
reject unnormalized", to support even the dumbest of editors.
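
To make the "parser will normalize" option concrete, here is a rough
sketch using the standard unicodedata module. The helper name
normalize_source is made up for illustration, and this is not a claim
about how the actual parser does or would work. It assumes the source
has already been decoded to Unicode text. It also shows that the
Latin-1 diaeresis (U+00A8) is a spacing character rather than a
combining mark, so NFC leaves a Latin-1 "o¨" unchanged.

```python
import unicodedata


def normalize_source(text: str) -> str:
    """Hypothetical helper: return the NFC form of decoded source text."""
    return unicodedata.normalize("NFC", text)


composed = "\u00f6"       # ö as a single precomposed code point
decomposed = "o\u0308"    # o followed by COMBINING DIAERESIS (U+0308)

# The two spellings render alike but compare unequal until normalized.
assert composed != decomposed
assert normalize_source(decomposed) == composed

# Latin-1 bytes decode to code points U+0000..U+00FF only. The spacing
# diaeresis U+00A8 is not a combining mark, so NFC does not fold
# "o" + U+00A8 into ö.
latin1_text = b"o\xa8".decode("latin-1")
assert normalize_source(latin1_text) == latin1_text
```

With a step like this in front of the tokenizer, an editor that saves
the decomposed form would still produce the same identifier as one
that saves the composed form.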
-jJ