[Python-3000] PEP: Supporting Non-ASCII Identifiers

Mon Jun 4 22:50:09 CEST 2007

On 6/4/07, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> > It seems to me the simplest thing to do is to require that Python
> > source files be normalized.  Then the ambiguity just goes away.
> > Everyone knows what form their files should be in, and if you really
> > need to construct a non-normalized string, you can do that explicitly
> > using "\u" notation.

> However, what would that mean with respect to non-Unicode source encodings?

> Say you have a Latin-1-encoded source file. Is that in NFC or not?

Doesn't that depend on whether they ever happened to write some of the
precomposed characters (such as ö) in a two-character form like o¨?
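To make the two spellings concrete, here is a rough sketch using the
standard unicodedata module (the variable names are only for this
example). It assumes that "o¨" means either o followed by the combining
U+0308, which Latin-1 cannot encode, or o followed by the spacing U+00A8,
which is the only diaeresis Latin-1 has and which NFC leaves alone.

```python
import unicodedata

precomposed = "\u00f6"   # ö as a single code point
combining = "o\u0308"    # o + COMBINING DIAERESIS
spacing = "o\u00a8"      # o + DIAERESIS (spacing form, present in Latin-1)

# NFC composes the combining sequence into the precomposed character.
nfc_combining = unicodedata.normalize("NFC", combining)
# NFC does not touch the spacing diaeresis (it only has a compatibility mapping).
nfc_spacing = unicodedata.normalize("NFC", spacing)

print(nfc_combining == precomposed)
print(nfc_spacing == spacing)

# Check whether the combining form could appear in a Latin-1 file at all.
try:
    combining.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)
```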

FWIW, I would prefer "the parser will normalize" to "the parser will
reject unnormalized", to support even the dumbest of editors.
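A minimal sketch of the difference between the two policies, assuming NFC
is the target form. The function names normalize_source and
reject_unnormalized are hypothetical, made up for this message, and are
not anything the PEP or the parser defines.

```python
import unicodedata

def normalize_source(text):
    """'The parser will normalize': quietly convert the text to NFC."""
    return unicodedata.normalize("NFC", text)

def reject_unnormalized(text):
    """'The parser will reject unnormalized': fail unless already NFC."""
    if unicodedata.normalize("NFC", text) != text:
        raise ValueError("source text is not in NFC")
    return text

# An identifier typed as o + combining diaeresis by a naive editor.
source = "scho\u0308n = 1"

normalized = normalize_source(source)   # accepted and rewritten to NFC

try:
    reject_unnormalized(source)
    rejected = False
except ValueError:
    rejected = True                     # same input refused under the strict policy
```

Under the first policy the author of the file never has to know which
form the editor produced; under the second, the file has to be fixed
before it will compile.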

-jJ