[Python-3000] Support for PEP 3131

Stephen J. Turnbull stephen at xemacs.org
Sat May 26 07:05:36 CEST 2007


Thank you for the apology.  I have cooled off, and I hope you won't
hold the "take offense" against me.  I was hurt, for sure, but you're
right, that's a legitimate reading in colloquial English.

Ka-Ping Yee writes:

 > That just means, if we're going to provide this feature, we shouldn't
 > force subtle wrongness upon people by making it the default.

I agree wholeheartedly!  But AFAIK this is the first time you have
explicitly limited yourself in principle to discussion of the default.
Up to now you've opposed the whole idea.

 > The PEP says that Python will *convert* the identifiers into NFC.
 > I'd rather there not be lots of different ways to write the same
 > identifier (TOOWTDI), so this particular recommendation is that
 > identifiers in source code have to already be normalized.

A Unicode-conforming process is not allowed to distinguish between
different (canonically equivalent) representations of a given
character.  I.e., the NFC conversion is an internal optimization.  The
characters are the same.  I think Unicode conformance is close enough
to TOOWTDI, and far more important than the remaining difference.
YMMV.
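
To make the equivalence concrete, here is a minimal sketch using
Python's standard unicodedata module.  The helper name
same_identifier is made up for illustration; it is not how the PEP or
the compiler spells this check, only one way to show that two
spellings collapse to the same NFC form.

```python
import unicodedata

# Two representations of the same abstract character "é".
composed = "\u00e9"        # single code point, LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT

def same_identifier(a, b):
    """Hypothetical helper: compare two names after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The raw code point sequences differ, but the NFC forms do not.
print(composed == decomposed)
print(same_identifier(composed, decomposed))
```

The first comparison looks at code points and the second at
characters; converting on input means only the second is ever
visible to the program.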

Pragmatically, users are unlikely to know how to normalize by hand.  I
do it with an explicit call to an external library provided by Mac OS
X; I don't know how to do it (i.e., what the (de)composition is, and
often even how to input the resulting characters) without access to
the library's canonicalization API.  My input methods do not provide
such a facility.  (And Unicode says that they may refuse to do so.)
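
As a rough sketch of the kind of library help a user would need if
source had to arrive already normalized, the following reports lines
that are not in NFC.  The function name non_nfc_lines and the sample
text are assumptions for illustration; the check is deliberately
crude and looks at whole lines, so it would also flag string
literals and comments, not just identifiers.

```python
import unicodedata

def non_nfc_lines(source_text):
    """Hypothetical checker: list (line number, NFC form) for non-NFC lines."""
    report = []
    for lineno, line in enumerate(source_text.splitlines(), start=1):
        normalized = unicodedata.normalize("NFC", line)
        if normalized != line:
            report.append((lineno, normalized))
    return report

# Line 1 uses the composed form; line 2 uses "e" + combining accent.
sample = "caf\u00e9 = 1\ncafe\u0301 = 2\n"

for lineno, fixed in non_nfc_lines(sample):
    print(lineno, repr(fixed))
```

On screen the two lines of the sample look identical, which is
exactly why finding the offending one without a tool is hard.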

Finally, this would also be inconsistent with the definition of Python
implicit in PEP 263, which clearly envisions a Python program as a
sequence of abstract characters which may have an arbitrary
ASCII-compatible encoding on disk.
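
A small sketch of that PEP 263 view, assuming the standard tokenize
module is used to read the coding declaration (the helper name
decode_source and the sample program are made up for illustration):
the same program text stored in two encodings differs as bytes but
decodes to the same characters.

```python
import io
import tokenize

# One program text, parameterized only by its coding declaration.
template = "# -*- coding: {enc} -*-\nname = 'caf\u00e9'\n"

def decode_source(raw):
    """Hypothetical helper: decode source bytes using the declared encoding."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(raw).readline)
    return raw.decode(encoding)

latin1_bytes = template.format(enc="latin-1").encode("latin-1")
utf8_bytes = template.format(enc="utf-8").encode("utf-8")

# Drop the declaration line, which differs by design, and compare the rest.
body_latin1 = decode_source(latin1_bytes).split("\n", 1)[1]
body_utf8 = decode_source(utf8_bytes).split("\n", 1)[1]

print(latin1_bytes == utf8_bytes)
print(body_latin1 == body_utf8)
```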


