[Python-3000] PEP 3131 roundup

Ka-Ping Yee python at zesty.ca
Mon Jun 4 13:08:44 CEST 2007


Hi,

Here's a summary of some of the remaining open issues and unaddressed
arguments regarding PEP 3131.  These are the ones I'm familiar with,
so I don't claim this to be complete.  I hope it helps give some
perspective on this huge thread, though.


A. Should identifiers be allowed to contain any Unicode letter?

   Drawbacks of allowing non-ASCII identifiers wholesale:

   1. Python will lose the ability to make a reliable round trip to
      a human-readable display on screen or on paper.

      http://mail.python.org/pipermail/python-3000/2007-May/007855.html

   2. Python will become vulnerable to a new class of security exploits;
      code and submitted patches will be much harder to inspect.

      http://mail.python.org/pipermail/python-3000/2007-May/007855.html

   3. Humans will no longer be able to validate Python syntax.

      http://mail.python.org/pipermail/python-3000/2007-May/007855.html

   4. Unicode is young; its problems are not yet well understood and
      solved; tool support is weak.

      http://mail.python.org/pipermail/python-3000/2007-May/007855.html

   5. Languages with non-ASCII identifiers use different character sets
      and normalization schemes; PEP 3131's choices are non-obvious.

      http://mail.python.org/pipermail/python-3000/2007-May/007947.html
      http://mail.python.org/pipermail/python-3000/2007-May/007725.html

   6. The Unicode bidi algorithm yields an extremely confusing display
      order for RTL text when digits or operators are nearby.

      http://www.w3.org/International/iri-edit/draft-duerst-iri.html#anchor5
      http://mail.python.org/pipermail/python-3000/2007-May/007823.html
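
   To make drawbacks 1 and 2 concrete, here is a minimal sketch, assuming
   a present-day Python 3 interpreter that accepts non-ASCII identifiers.
   The names `latin`, `mixed` and `describe` are made up for illustration.
   It shows two identifiers that many fonts render identically but that
   consist of different code points:

```python
import unicodedata

latin = "scope"        # five ASCII letters
mixed = "s\u0441ope"   # second letter is U+0441, not ASCII "c"

def describe(identifier):
    """List (code point, Unicode name) for each character of a name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in identifier]

print(latin == mixed)        # the interpreter sees two distinct names
print(mixed.isidentifier())  # and the look-alike is a legal identifier
for code_point, name in describe(mixed):
    print(code_point, name)
```

   A reviewer reading a patch on screen or on paper has no way to tell
   the two apart without a dump like the one `describe` produces.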


B. Should the default behaviour accept only ASCII identifiers, or
   should it accept identifiers containing non-ASCII characters?

   Arguments for ASCII only by default:

   1. Allowing non-ASCII identifiers by default makes common practices
      and assumptions subtly wrong without anyone noticing; rarely wrong
      is worse than obviously wrong.

      http://mail.python.org/pipermail/python-3000/2007-May/007992.html
      http://mail.python.org/pipermail/python-3000/2007-May/008009.html
      http://mail.python.org/pipermail/python-3000/2007-May/007961.html

   2. Better to raise a warning than to fail silently when encountering
      a probably unexpected situation.

      http://mail.python.org/pipermail/python-3000/2007-May/007993.html
      http://mail.python.org/pipermail/python-3000/2007-May/007945.html

   3. All of current usage is ASCII-only; the vast majority of future
      usage will be ASCII-only.

      http://mail.python.org/pipermail/python-3000/2007-May/007952.html
      http://mail.python.org/pipermail/python-3000/2007-May/007927.html

   4. It is the pockets of Unicode adoption that are parochial, not the
      ASCII advocates.

      http://mail.python.org/pipermail/python-3000/2007-May/008010.html

   5. Python should audit for ASCII-only identifiers for the same
      reasons that it audits for tab-space consistency.

      http://mail.python.org/pipermail/python-3000/2007-May/007942.html

   6. Incremental change is safer.

      http://mail.python.org/pipermail/python-3000/2007-May/008000.html

   7. An ASCII-only default favors open-source development and sharing
      of source code.

      http://mail.python.org/pipermail/python-3000/2007-May/007988.html
      http://mail.python.org/pipermail/python-3000/2007-May/007990.html

   8. Existing projects won't have to waste any brainpower worrying
      about the implications of Unicode identifiers.

      http://mail.python.org/pipermail/python-3000/2007-May/007957.html
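
   The kind of audit argued for above could look roughly like the
   following. This is only a sketch built on the standard `tokenize`
   module of a present-day Python 3; the function name
   `non_ascii_identifiers` is hypothetical and nothing like it is
   specified in PEP 3131 or in the thread.

```python
import io
import tokenize

def non_ascii_identifiers(source):
    """Return (line, column, name) for every NAME token that
    contains a character outside ASCII."""
    found = []
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.start[0], tok.start[1], tok.string))
    return found

sample = "caf\u00e9 = 1\nx = caf\u00e9 + 1\n"
for line, column, name in non_ascii_identifiers(sample):
    print(f"line {line}, column {column}: non-ASCII identifier {name!r}")
```

   Because the check works on tokens, non-ASCII text inside string
   literals and comments is not reported; only identifiers are.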


C. Should non-ASCII identifiers be optional?

   Various voices in support of a flag (although there's been debate
   over which should be the default, no one seems to be saying that
   there shouldn't be an off switch):

   http://mail.python.org/pipermail/python-3000/2007-May/007855.html
   http://mail.python.org/pipermail/python-3000/2007-May/007916.html
   http://mail.python.org/pipermail/python-3000/2007-May/007923.html
   http://mail.python.org/pipermail/python-3000/2007-May/007935.html
   http://mail.python.org/pipermail/python-3000/2007-May/007948.html


D. Should the identifier character set be configurable?

   Various voices proposing and supporting a selectable character set,
   so that users can get all the benefits of using their own language
   without the drawbacks of confusable/unfamiliar characters:

   http://mail.python.org/pipermail/python-3000/2007-May/007890.html
   http://mail.python.org/pipermail/python-3000/2007-May/007896.html
   http://mail.python.org/pipermail/python-3000/2007-May/007935.html
   http://mail.python.org/pipermail/python-3000/2007-May/007950.html
   http://mail.python.org/pipermail/python-3000/2007-May/007977.html
   http://mail.python.org/pipermail/python-3000/2007-May/007957.html
   http://mail.python.org/pipermail/python-3000/2007-May/008038.html
   http://mail.python.org/pipermail/python-3000/2007-June/008121.html
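
   As a rough illustration of what a selectable character set might
   mean in practice, the sketch below accepts ASCII plus a list of extra
   code point ranges. Everything here is an assumption made for the
   example: the setting name `ALLOWED_RANGES`, the function
   `identifier_allowed`, and the choice of the Greek and Coptic block.
   None of the linked proposals is this specific.

```python
# Hypothetical per-project setting: code point ranges allowed in
# identifiers in addition to ASCII. Here: Greek and Coptic, U+0370..U+03FF.
ALLOWED_RANGES = [(0x0370, 0x03FF)]

def identifier_allowed(name, extra_ranges=ALLOWED_RANGES):
    """True if name is a valid identifier whose non-ASCII characters
    all fall inside one of the configured ranges."""
    for ch in name:
        code_point = ord(ch)
        if code_point < 128:
            continue  # ASCII is always permitted
        if not any(low <= code_point <= high for low, high in extra_ranges):
            return False
    return name.isidentifier()

print(identifier_allowed("\u03b1_max"))  # Greek alpha, inside the range
print(identifier_allowed("\u0430_max"))  # Cyrillic a, outside the range
```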


E. Which identifier characters should be allowed?

   1. What to do about bidi format control characters?

      http://mail.python.org/pipermail/python-3000/2007-May/007750.html
      http://mail.python.org/pipermail/python-3000/2007-May/007823.html
      http://mail.python.org/pipermail/python-3000/2007-May/007826.html

   2. What about other ID_Continue characters?  What about characters
      that look like punctuation?  What about other recommendations
      in UTS #39?  What about mixed-script identifiers?

      http://mail.python.org/pipermail/python-3000/2007-May/007836.html
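
   For the mixed-script question, a crude detector can be sketched as
   below. Note the assumption: Python's `unicodedata` module does not
   expose the Unicode Script property, so this takes the first word of
   each character's name ("LATIN", "CYRILLIC", ...) as a stand-in. That
   is an approximation, not what UTS #39 specifies, and `scripts_used`
   is a name invented for this example.

```python
import unicodedata

def scripts_used(identifier):
    """Approximate the set of scripts in an identifier by taking the
    first word of each letter's Unicode name. Digits and underscores
    are ignored because they are shared between scripts."""
    scripts = set()
    for ch in identifier:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

print(sorted(scripts_used("payload_2")))       # one script
print(sorted(scripts_used("p\u0430yload_2")))  # Latin with one Cyrillic letter

# A bidi format control such as U+200F (RIGHT-TO-LEFT MARK) is not an
# identifier character in present-day Python 3:
print("a\u200fb".isidentifier())
```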


F.  Which normalization form should be used, NFC or NFKC?

    http://mail.python.org/pipermail/python-3000/2007-May/007995.html
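
    The practical difference between the two forms can be seen with the
    standard `unicodedata` module. The sample strings are chosen just
    for illustration: NFC only composes canonically equivalent
    sequences, while NFKC also folds compatibility characters such as
    ligatures.

```python
import unicodedata

ligature = "\ufb01le"      # U+FB01 LATIN SMALL LIGATURE FI, then "le"
decomposed = "cafe\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

# NFC leaves the ligature alone; NFKC replaces it with the letters "fi".
print(unicodedata.normalize("NFC", ligature) == ligature)
print(unicodedata.normalize("NFKC", ligature) == "file")

# Both forms compose "e" + combining accent into the single U+00E9.
print(unicodedata.normalize("NFC", decomposed) == "caf\u00e9")
print(unicodedata.normalize("NFKC", decomposed) == "caf\u00e9")
```

    So under NFKC two identifiers that differ only by a ligature name
    the same variable; under NFC they stay distinct.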


G.  Should source code be required to be in normalized form?

    http://mail.python.org/pipermail/python-3000/2007-May/007997.html
    http://mail.python.org/pipermail/python-3000/2007-June/008137.html
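
    If normalized source were required, a checker could be as simple as
    the sketch below. The function name `unnormalized_lines` is
    hypothetical, and the NFC default is only an assumption, since the
    choice of form is itself open (see F).

```python
import unicodedata

def unnormalized_lines(source, form="NFC"):
    """Return the 1-based numbers of lines that change when normalized
    to the given form, i.e. lines not already in that form."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if unicodedata.normalize(form, line) != line
    ]

sample = "caf\u00e9 = 1\ncafe\u0301 = 2\n"  # line 2 uses a combining accent
print(unnormalized_lines(sample))
```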


-- ?!ng

