Unicode program representation
Moshe Zadka
moshez at math.huji.ac.il
Mon Apr 3 18:15:53 EDT 2000
On Mon, 3 Apr 2000, Fredrik Lundh wrote:
> hmm. wouldn't that mean that we end up using different encodings
> in different parts of the script? feels a little scary, to say the least...
FWIW, I believe that Python's (official) parser should be strictly UTF-8.
It could be done in stages:
-- stage 1: warn about non-ASCII characters in scripts (warning framework
TBD)
-- stage 2: don't accept non-ASCII characters in scripts at all
-- stage 3: assume all scripts are UTF-8
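A rough sketch of what the three stages might look like as checks on a script's raw bytes. The function names and the `strict` flag are made up for illustration, and the standard `warnings` module stands in for the warning framework that is still TBD; none of this is an actual parser change.

```python
import warnings


def check_ascii_source(source, filename="<script>", strict=False):
    """Flag lines of a script (given as bytes) that contain non-ASCII bytes.

    strict=False models stage 1: emit a warning per offending line.
    strict=True models stage 2: reject the script outright.
    Returns the list of 1-based offending line numbers.
    (Hypothetical helper, not part of Python.)
    """
    flagged = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Iterating over bytes yields integers; anything above 0x7F is non-ASCII.
        if any(byte > 0x7F for byte in line):
            if strict:
                raise SyntaxError(
                    f"{filename}:{lineno}: non-ASCII character in script"
                )
            warnings.warn(f"{filename}:{lineno}: non-ASCII character in script")
            flagged.append(lineno)
    return flagged


def decode_utf8_source(source):
    """Stage 3: treat the script as UTF-8; invalid bytes raise UnicodeDecodeError."""
    return source.decode("utf-8")
```

A pure-ASCII script passes all three stages unchanged, since ASCII is a subset of UTF-8.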
This isn't a "socket.connect"-like "I've been using this feature for
years" issue: it's trivial to write a script to convert between anything
and UTF-8 (in Python, of course <wink>).
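For what it's worth, a minimal sketch of such a conversion, assuming the script's current encoding is known and is one of Python's registered codecs. The name `to_utf8` is illustrative only.

```python
def to_utf8(data, source_encoding):
    """Re-encode raw script bytes from source_encoding to UTF-8.

    Raises UnicodeDecodeError if data is not valid in source_encoding.
    (Hypothetical helper name.)
    """
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Latin-1 byte 0xE9 (e with acute accent) becomes the two UTF-8 bytes 0xC3 0xA9.
    print(to_utf8(b"caf\xe9", "latin-1"))
```

Going the other way is the same two calls with the codec names swapped, though it can fail when the target encoding has no representation for a character.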
However, there are some non-trivial issues here: should Python identifiers
be able to include all characters Unicode defines as letters?
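To make the question concrete, here is a sketch of one possible rule, assuming "letter" means any character whose Unicode general category starts with "L" (Lu, Ll, Lt, Lm, Lo). The function names and the rule itself are hypothetical, not a description of what the parser does.

```python
import unicodedata


def is_unicode_letter(ch):
    """True if Unicode classifies ch as a letter (general category L*)."""
    return unicodedata.category(ch).startswith("L")


def is_letter_identifier(name):
    """Hypothetical identifier rule: a letter or underscore first, then
    letters, decimal digits (category Nd), or underscores."""
    if not name:
        return False
    first, rest = name[0], name[1:]
    if not (first == "_" or is_unicode_letter(first)):
        return False
    return all(
        c == "_" or is_unicode_letter(c) or unicodedata.category(c) == "Nd"
        for c in rest
    )
```

Under this rule a name written entirely in Hebrew letters would be accepted just like `spam`. Note that category Nd also covers non-ASCII decimal digits, which is one of the details any real rule would have to settle.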
--
Moshe Zadka <mzadka at geocities.com>.
http://www.oreilly.com/news/prescod_0300.html
http://www.linux.org.il -- we put the penguin in .com