[Python-3000] PEP 3131 roundup
Ka-Ping Yee
python at zesty.ca
Mon Jun 4 13:08:44 CEST 2007
Hi,
Here's a summary of some of the remaining open issues and unaddressed
arguments regarding PEP 3131. These are the ones I'm familiar with,
so I don't claim this to be complete. I hope it helps give some
perspective on this huge thread, though.
A. Should identifiers be allowed to contain any Unicode letter?
Drawbacks of allowing non-ASCII identifiers wholesale:
1. Python will lose the ability to make a reliable round trip to
a human-readable display on screen or on paper.
http://mail.python.org/pipermail/python-3000/2007-May/007855.html
2. Python will become vulnerable to a new class of security exploits;
code and submitted patches will be much harder to inspect.
http://mail.python.org/pipermail/python-3000/2007-May/007855.html
3. Humans will no longer be able to validate Python syntax.
http://mail.python.org/pipermail/python-3000/2007-May/007855.html
4. Unicode is young; its problems are not yet well understood and
solved; tool support is weak.
http://mail.python.org/pipermail/python-3000/2007-May/007855.html
5. Languages with non-ASCII identifiers use different character sets
and normalization schemes; PEP 3131's choices are non-obvious.
http://mail.python.org/pipermail/python-3000/2007-May/007947.html
http://mail.python.org/pipermail/python-3000/2007-May/007725.html
6. The Unicode bidi algorithm yields an extremely confusing display
order for RTL text when digits or operators are nearby.
http://www.w3.org/International/iri-edit/draft-duerst-iri.html#anchor5
http://mail.python.org/pipermail/python-3000/2007-May/007823.html
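To make the inspection concern in points 1-3 concrete, here is a minimal sketch using only the standard unicodedata module. The variable and function names are made up for illustration. It shows two characters that render alike in most fonts but are distinct code points, and that normalization does not merge them.

```python
import unicodedata

latin = "a"          # U+0061
cyrillic = "\u0430"  # U+0430, renders like Latin "a" in most fonts

def describe(s):
    """Return the Unicode name of every character in s."""
    return [unicodedata.name(ch) for ch in s]

# The two strings differ even though they look alike on screen.
looks_same_but_differs = latin != cyrillic
names = describe(latin + cyrillic)

# NFKC normalization leaves the Cyrillic letter unchanged, so
# normalization alone does not address this kind of confusion.
still_differs = unicodedata.normalize("NFKC", cyrillic) != latin
```

A reviewer reading a printout or a patch in a typical font has no visual cue that the two names differ; only a tool that reports code points or character names would show it.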
B. Should the default behaviour accept only ASCII identifiers, or
should it accept identifiers containing non-ASCII characters?
Arguments for ASCII only by default:
1. Accepting non-ASCII identifiers by default makes common practices
and assumptions subtly wrong without anyone noticing; rarely wrong
is worse than obviously wrong.
http://mail.python.org/pipermail/python-3000/2007-May/007992.html
http://mail.python.org/pipermail/python-3000/2007-May/008009.html
http://mail.python.org/pipermail/python-3000/2007-May/007961.html
2. Better to raise a warning than to fail silently when encountering
a probably unexpected situation.
http://mail.python.org/pipermail/python-3000/2007-May/007993.html
http://mail.python.org/pipermail/python-3000/2007-May/007945.html
3. All of current usage is ASCII-only; the vast majority of future
usage will be ASCII-only.
http://mail.python.org/pipermail/python-3000/2007-May/007952.html
http://mail.python.org/pipermail/python-3000/2007-May/007927.html
4. It is the pockets of Unicode adoption that are parochial, not the
ASCII advocates.
http://mail.python.org/pipermail/python-3000/2007-May/008010.html
5. Python should audit for ASCII-only identifiers for the same
reasons that it audits for tab-space consistency.
http://mail.python.org/pipermail/python-3000/2007-May/007942.html
6. Incremental change is safer.
http://mail.python.org/pipermail/python-3000/2007-May/008000.html
7. An ASCII-only default favors open-source development and sharing
of source code.
http://mail.python.org/pipermail/python-3000/2007-May/007988.html
http://mail.python.org/pipermail/python-3000/2007-May/007990.html
8. Existing projects won't have to waste any brainpower worrying
about the implications of Unicode identifiers.
http://mail.python.org/pipermail/python-3000/2007-May/007957.html
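The audit argument above can be illustrated with a rough sketch of an external checker. This is not how the interpreter or PEP 3131 does anything; non_ascii_identifiers is a hypothetical helper built on the standard tokenize module, and it assumes a Python 3 tokenizer that accepts such names.

```python
import io
import tokenize

def non_ascii_identifiers(source):
    """Return (line, column, name) for each name token with a non-ASCII character."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.start[0], tok.start[1], tok.string))
    return found

# Hypothetical two-line module; the second line assigns to a non-ASCII name.
sample = "total = 1\nr\u00e9sum\u00e9 = total + 1\n"
report = non_ascii_identifiers(sample)
```

A project that wants to stay ASCII-only could run a check like this in its test suite or commit hooks, whichever default the language ends up with.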
C. Should non-ASCII identifiers be optional?
Various voices in support of a flag (although there's been debate
over which should be the default, no one seems to be saying that
there shouldn't be an off switch):
http://mail.python.org/pipermail/python-3000/2007-May/007855.html
http://mail.python.org/pipermail/python-3000/2007-May/007916.html
http://mail.python.org/pipermail/python-3000/2007-May/007923.html
http://mail.python.org/pipermail/python-3000/2007-May/007935.html
http://mail.python.org/pipermail/python-3000/2007-May/007948.html
D. Should the identifier character set be configurable?
Various voices proposing and supporting a selectable character set,
so that users can get all the benefits of using their own language
without the drawbacks of confusable/unfamiliar characters:
http://mail.python.org/pipermail/python-3000/2007-May/007890.html
http://mail.python.org/pipermail/python-3000/2007-May/007896.html
http://mail.python.org/pipermail/python-3000/2007-May/007935.html
http://mail.python.org/pipermail/python-3000/2007-May/007950.html
http://mail.python.org/pipermail/python-3000/2007-May/007977.html
http://mail.python.org/pipermail/python-3000/2007-May/007957.html
http://mail.python.org/pipermail/python-3000/2007-May/008038.html
http://mail.python.org/pipermail/python-3000/2007-June/008121.html
E. Which identifier characters should be allowed?
1. What to do about bidi format control characters?
http://mail.python.org/pipermail/python-3000/2007-May/007750.html
http://mail.python.org/pipermail/python-3000/2007-May/007823.html
http://mail.python.org/pipermail/python-3000/2007-May/007826.html
2. What about other ID_Continue characters? What about characters
that look like punctuation? What about other recommendations
in UTS #39? What about mixed-script identifiers?
http://mail.python.org/pipermail/python-3000/2007-May/007836.html
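For question 1, one way to see what is at stake is to look for format control characters (Unicode general category Cf, which includes the bidi marks) inside a candidate identifier. This is only a sketch; format_controls and the sample string are made up for illustration, and it says nothing about what the PEP should decide.

```python
import unicodedata

def format_controls(identifier):
    """Return (index, code point, name) for each category-Cf character."""
    return [
        (i, "U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(identifier)
        if unicodedata.category(ch) == "Cf"
    ]

# Hypothetical identifier with a RIGHT-TO-LEFT MARK embedded in it.
suspect = "ab\u200fcd"
hits = format_controls(suspect)
```

The embedded mark is invisible in most displays, so the string looks like plain "abcd" unless a tool reports it.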
F. Which normalization form should be used, NFC or NFKC?
http://mail.python.org/pipermail/python-3000/2007-May/007995.html
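The practical difference between the two forms can be seen with the standard unicodedata module. The sketch below uses two sample strings chosen for illustration: both forms compose a base letter with its combining accent, but only NFKC also replaces compatibility characters such as ligatures.

```python
import unicodedata

decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
ligature = "\ufb01"     # LATIN SMALL LIGATURE FI, a compatibility character

# Both forms compose the accent into the single code point U+00E9.
nfc_accent = unicodedata.normalize("NFC", decomposed)
nfkc_accent = unicodedata.normalize("NFKC", decomposed)

# Only NFKC rewrites the ligature into the two letters "fi".
nfc_ligature = unicodedata.normalize("NFC", ligature)
nfkc_ligature = unicodedata.normalize("NFKC", ligature)
```

So under NFKC an identifier spelled with the ligature and one spelled with plain "fi" would name the same thing, while under NFC they would stay distinct.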
G. Should source code be required to be in normalized form?
http://mail.python.org/pipermail/python-3000/2007-May/007997.html
http://mail.python.org/pipermail/python-3000/2007-June/008137.html
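If normalized source were required, a check along the following lines could report offending lines. This is a sketch under assumptions: unnormalized_lines is a hypothetical helper, NFC is picked as the default only for the example, and the whole line is tested rather than just the identifiers.

```python
import unicodedata

def unnormalized_lines(source, form="NFC"):
    """Return the 1-based numbers of lines that change under the given form."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if unicodedata.normalize(form, line) != line
    ]

# Line 1 uses the precomposed U+00E9; line 2 uses "e" plus a combining accent.
sample = "caf\u00e9 = 1\ncafe\u0301 = 2\n"
flagged = unnormalized_lines(sample)
```

Both lines look the same on screen, yet only the second would be rejected or silently rewritten, depending on which policy is chosen.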
-- ?!ng