[Python-Dev] Pre-PEP: Python Character Model

M.-A. Lemburg mal@lemburg.com
Tue, 06 Feb 2001 11:49:00 +0100


[pre-PEP]

You have a lot of good points in there (also some inaccuracies) and
I agree that Python should move to using Unicode for text data
and arrays for binary data.

One thing you may be missing, though, is that Python already
supports a few of the features you mention: codecs.open()
provides more or less what you have in mind with fopen(), and
the compiler can already unify Unicode and string literals via
the -U command line option.
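
As a rough illustration of the codecs.open() point (a minimal sketch,
not taken from the pre-PEP; the roundtrip() helper and the sample
string are made up for the example), a file opened through
codecs.open() accepts and returns Unicode objects, while the bytes on
disk are in the requested encoding:

```python
import codecs
import os
import tempfile


def roundtrip(text, encoding="utf-8"):
    """Write Unicode text via codecs.open() and read it back.

    Hypothetical helper for illustration only. Returns a tuple of
    (decoded text, raw bytes as stored on disk).
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # The codec encodes on write ...
        with codecs.open(path, "w", encoding=encoding) as f:
            f.write(text)
        # ... and decodes on read, so the caller only sees Unicode.
        with codecs.open(path, "r", encoding=encoding) as f:
            decoded = f.read()
        # A plain binary open shows what was actually stored.
        with open(path, "rb") as f:
            raw = f.read()
        return decoded, raw
    finally:
        os.remove(path)


decoded, raw = roundtrip(u"Gr\u00fc\u00dfe")
```

Here decoded compares equal to the original Unicode string, while raw
holds the UTF-8 encoded bytes, which is the text/binary split the
pre-PEP is after.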

What you don't talk about in the PEP is that Python's stdlib isn't
even Unicode aware yet. Whatever unification steps we take, making
the stdlib Unicode aware will have to precede them. The problem with
doing so is deciding which parts of the stdlib deal with text data
and which with binary data -- the code sometimes makes assumptions
about the nature of the data and at other times simply doesn't
care.

In this light I think you ought to target Python 3k with your
PEP. This will also enable better merging techniques, since the
type/class difference is due to be lifted there.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/