[Python-3000] Conservative Defaults (was: Re: Support for PEP 3131)

Mon Jun 4 05:45:11 CEST 2007

Jim Jewett writes:

 > > You're exaggerating the amount of work caused [by adding to the toolchain]
 > 
 > No, he isn't.

It is an exaggeration.  AFAICS the work of auditing character sets can be
done by the same codec APIs that implement PEP 263.  The only question
is whether the additional work of parsing out the identifiers would
cause noticeable inefficiency in codec operation.  AFAIK, parsing out
the identifiers is cheap (though possibly several times as expensive
as the UTF-8 -> unicode object conversion, if it needs to be done
once in the codec and once in the compiler).
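
To make the cost of "parsing out the identifiers" concrete, here is a
minimal sketch of such an audit.  It is not the codec-level hook
described above: it assumes a present-day Python 3 and uses the
standard tokenize module as a stand-in, and the function name
non_ascii_identifiers is hypothetical.  It flags only identifiers, so
non-ASCII strings and comments pass through untouched.

```python
import io
import tokenize


def non_ascii_identifiers(source_bytes):
    """Return (name, line, column) for each identifier containing non-ASCII.

    Hypothetical helper for illustration; it audits identifiers only and
    ignores strings and comments.
    """
    found = []
    # tokenize.tokenize() reads bytes and honours the PEP 263 coding
    # declaration (defaulting to UTF-8), so no separate decode step is needed.
    readline = io.BytesIO(source_bytes).readline
    for tok in tokenize.tokenize(readline):
        # Keywords and identifiers both arrive as NAME tokens; keywords are
        # pure ASCII, so only non-ASCII identifiers are reported.
        if tok.type == tokenize.NAME and any(ord(ch) > 127 for ch in tok.string):
            found.append((tok.string, tok.start[0], tok.start[1]))
    return found


if __name__ == "__main__":
    sample = "# -*- coding: utf-8 -*-\nnaïve = 1\nplain = 'café'\n".encode("utf-8")
    for name, line, col in non_ascii_identifiers(sample):
        print("line %d, column %d: %s" % (line, col, name))
```

A site that wants ASCII-only identifiers could run a check like this
over incoming files and reject any that produce a non-empty result.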

 > Hopefully, I can set my own python to enforce ASCII IDs (rather than
 > ASCII strings and comments).  But if too many people start to assume
 > that distributed code can freely mix other scripts, I'll start to get
 > random failures.

This is unlikely to be a major problem, IMHO.  It definitely is a
consideration, though, and some people will face more difficulty than
others, perhaps a lot more.

 > Not seeing problems in Lisp would be a valid argument -- except that
 > the internationalized IDs are explicitly marked.  Not just the files;
 > the individual IDs.  You have to write |lowercase| to get an ID made
 > of unexpected characters (including explicitly lower-case letters).

This is not true of Emacs Lisp, which not only accepts non-ASCII
characters, but is also case-sensitive.

 > noticed; python should (and currently does) meet a higher standard for
 > cross-platform interoperability.

As does Emacs.

 > The same one-step-at-a-time reasoning applies to unicode identifiers.
 > Allowing IDs in your native language (or others that you explicitly
 > approve) is probably a good step.  Allowing IDs in *any* language by
 > default is probably going too far.

I don't really see that distinction.  IMO the scenarios where allowing
a native language makes sense are (a) localized (like a programming
class), and you won't run into anything else anyway, and (b)
internationalized, where you'll be sharing with others who have
enabled *their* native languages.

Those with stricter auditing requirements will be vetting production
code with more powerful external tools anyway.