[Python-3000] PEP 3131 roundup

Tue Jun 5 21:09:03 CEST 2007

> Here's a summary of some of the remaining open issues and unaddressed
> arguments regarding PEP 3131.  These are the ones I'm familiar with,
> so I don't claim this to be complete.  I hope it helps give some
> perspective on this huge thread, though.

Thanks, I added them all to the PEP. Not sure which of these you
would consider "open issues", or "unaddressed arguments"; I'll
indicate below how I see them dealt with by the PEP currently.

> A. Should identifiers be allowed to contain any Unicode letter?

Not an open issue; the PEP has been accepted.

>    1. Python will lose the ability to make a reliable round trip to
>       a human-readable display on screen or on paper.

Correct. That was already the case, though, because of comments and
string literals.

>    2. Python will become vulnerable to a new class of security exploits;
>       code and submitted patches will be much harder to inspect.

The first part is correct; I'd question the second part (in particular
the "much" part of it). It's now addressed in the PEP by being listed
in the discussion section.
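
As a rough illustration of the concern (a sketch, not taken from the
PEP): two identifiers can look the same on screen yet consist of
different code points. The helper names below are hypothetical; only
the standard unicodedata module is assumed.

```python
import unicodedata

# Two spellings that render almost identically in many fonts.
latin_name = "a"            # U+0061 LATIN SMALL LETTER A
cyrillic_name = "\u0430"    # U+0430 CYRILLIC SMALL LETTER A


def describe(identifier):
    """Return the Unicode name of each character in the identifier."""
    return [unicodedata.name(ch) for ch in identifier]


def non_ascii_positions(identifier):
    """Return (index, code point) pairs for characters outside ASCII."""
    return [(i, "U+%04X" % ord(ch))
            for i, ch in enumerate(identifier) if ord(ch) > 127]


print(latin_name == cyrillic_name)
print(describe(latin_name), describe(cyrillic_name))
print(non_ascii_positions("p\u0430ssword"))
```

A reviewer inspecting a patch could run a check of this kind to flag
identifiers that contain non-ASCII characters.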

>    3. Humans will no longer be able to validate Python syntax.

That's not true. Instead, they might not be able to do that for *all*
Python programs - however, that is the case already: if programs
are sufficiently complex, people cannot validate Python syntax today.
Addressed by being listed.

>    4. Unicode is young; its problems are not yet well understood and
>       solved; tool support is weak.

Now listed. I disagree that Unicode is young; it is roughly as old
as Python.

>    5. Languages with non-ASCII identifiers use different character sets
>       and normalization schemes; PEP 3131's choices are non-obvious.

I disagree. PEP 3131 follows UAX#31 literally, and makes that decision
very clear. If people still cannot see that, please provide wording to
make it clearer.
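
For a feel of what the rule accepts, here is a small sketch. It assumes
a Python 3 interpreter, where str.isidentifier() reports whether a
string is a valid identifier; the candidate names are made up for
illustration.

```python
# Candidate names: some start with a letter (allowed), one starts with a
# digit and one contains a hyphen (both rejected).
candidates = ["z\u00e4hler", "\u53d8\u91cf", "x1", "1x", "a-b"]

# Map each candidate to whether the interpreter accepts it as an identifier.
results = {name: name.isidentifier() for name in candidates}

for name, ok in results.items():
    print(repr(name), ok)
```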

>    6. The Unicode bidi algorithm yields an extremely confusing display
>       order for RTL text when digits or operators are nearby.

Now listed.
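
A sketch of why the display order gets confusing, assuming only the
standard unicodedata module: the bidi algorithm treats letters, digits
and operators as different classes, so a right-to-left identifier next
to "+" and a digit mixes three classes in a row.

```python
import unicodedata

# A Hebrew letter, a plus sign and a European digit, in logical order.
text = "\u05d0+1"

# Bidirectional class of each character, as recorded in the Unicode database.
classes = [unicodedata.bidirectional(ch) for ch in text]

# "R" = right-to-left letter, "ES" = European separator, "EN" = European number.
print(classes)
```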

> B. Should the default behaviour accept only ASCII identifiers, or
>    should it accept identifiers containing non-ASCII characters?

Added as an open issue.

> C. Should non-ASCII identifiers be optional?

How is that different from B?

> D. Should the identifier character set be configurable?

Still seems to be the same open issue.

> E. Which identifier characters should be allowed?
> 
>    1. What to do about bidi format control characters?

That was already listed as an open issue.

>    2. What about other ID_Continue characters?  What about characters
>       that look like punctuation?  What about other recommendations
>       in UTS #39?  What about mixed-script identifiers?
> 
>       http://mail.python.org/pipermail/python-3000/2007-May/007836.html

That was also listed as an open issue.

> F.  Which normalization form should be used, NFC or NFKC?

Now listed as an open issue.
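
To make the difference concrete, a minimal sketch using
unicodedata.normalize from the standard library (the sample strings are
chosen for illustration only): NFC only composes canonically equivalent
sequences, while NFKC additionally folds compatibility characters such
as ligatures.

```python
import unicodedata

# U+FB01 LATIN SMALL LIGATURE FI is a compatibility character.
ligature = "\ufb01le"

nfc = unicodedata.normalize("NFC", ligature)
nfkc = unicodedata.normalize("NFKC", ligature)

# NFC leaves the ligature alone; NFKC folds it into plain "fi".
print(nfc == "file")
print(nfkc == "file")

# Both forms compose a base letter with a following combining mark.
decomposed = "e\u0301"  # "e" + COMBINING ACUTE ACCENT
composed_nfc = unicodedata.normalize("NFC", decomposed)
composed_nfkc = unicodedata.normalize("NFKC", decomposed)
print(composed_nfc == composed_nfkc == "\u00e9")
```

Under NFKC, identifiers spelled with the ligature and with plain "fi"
would name the same variable; under NFC they would stay distinct.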

> G.  Should source code be required to be in normalized form?

Should I add a section "Rejected ideas"? This is out of scope for the PEP.

Regards,
Martin