[I18n-sig] Re: [Python-Dev] Unicode debate

Just van Rossum just@letterror.com
Fri, 5 May 2000 10:25:37 +0100


At 11:02 PM +0200 04-05-2000, Fredrik Lundh wrote:
>Henry S. Thompson <ht@cogsci.ed.ac.uk> wrote:
>> I think I hear a moderate consensus developing that the 'ASCII
>> proposal' is a reasonable compromise given the time constraints.
>
>agreed.

This makes no sense: implementing the 7-bit proposal takes more or less
the same time as implementing 8-bit downcasting. Or is it just the
bickering that's too time consuming? ;-)
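
To make the difference between the two rules concrete, here is a rough sketch in present-day Python 3 terms, with bytes standing in for the narrow string type and str for the wide one. The function names are made up for illustration, and the mapping of each proposal onto a codec is an assumption about how the rules would behave, not code from either proposal.

```python
def coerce_7bit(narrow: bytes) -> str:
    """7-bit ("ASCII") rule: any byte above 127 is rejected."""
    return narrow.decode("ascii")


def coerce_8bit(narrow: bytes) -> str:
    """8-bit rule: each byte becomes the character with the same ordinal."""
    return narrow.decode("latin-1")


def downcast_8bit(wide: str) -> bytes:
    """8-bit downcasting: succeeds only if every character is below 256."""
    return wide.encode("latin-1")
```

Both coercions are a single codec lookup, which is the sense in which neither is more work to implement than the other; they differ only in what happens to bytes 128..255.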

I worry that if the current implementation goes into 1.6 more or less as it
is now, there's no way we can ever go back (before P3K). Or will Unicode
support be marked "experimental" in 1.6? This is not so much about the
7-bit/8-bit proposal but about the dubious unicode() and unichr() functions
and the u"" notation:

- unicode() only takes strings, so is effectively a method of the string type.
- if narrow and wide strings are meant to be as similar as possible,
chr(256) should just return a wide char
- similarly, why is the u"" notation needed at all?
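
The chr(256) point can be sketched as follows, again as a model only: it is written for Python 3, with bytes standing in for the narrow type and str for the wide type, and any_chr is a hypothetical name rather than an existing or proposed builtin.

```python
def any_chr(code: int):
    """Hypothetical unified chr(): narrow when the ordinal fits in 8 bits, wide otherwise."""
    if not 0 <= code <= 0x10FFFF:
        raise ValueError("ordinal out of range")
    if code < 256:
        return bytes([code])  # stand-in for a narrow (8-bit) one-character string
    return chr(code)          # stand-in for a wide one-character string
```

With a single function like this there would be no need for a separate unichr(); the caller asks for a character by ordinal and gets whichever string type can hold it.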

The current design is more complex than needed, and still offers plenty of
surprises. Making it simpler (without integrating the two string types) is
not a huge effort. Seeing the wide string type as independent of Unicode
takes no physical effort at all, as it's just in our heads.

Fixing str() so it can return wide strings might be harder, and can wait
until later. Would be too bad, though.

Just