[I18n-sig] Japanese commentary on the Pre-PEP (2 of 4)

Guido van Rossum guido@digicool.com
Tue, 20 Feb 2001 17:02:01 -0500

> I think that there is an important issue here. Python is documented as
> having character strings. The minimal unit of a string is supposed to be
> a character. "Literal" strings are documented as being strings of
> characters.

Sorry, you're reading way too much into the words here.  When I wrote
that, in my brain there was absolutely no difference between
characters and bytes, and in C the type name for a byte is 'char', so
I wrote 'character' -- but I was thinking '8-bit quantity'.

[starry-eyed romantic idealism skipped]

> It is certainly too early for Python to abandon the one-byte centric
> view of the world. It is NOT too early to start putting into place a
> transition plan to the future world that we will all be forced to live
> in. Part of that transition is teaching people that literal strings may
> one day allow characters greater than 128 (perhaps directly, perhaps
> through an escape mechanism).

No objection here.

> > ...
> >   Furthermore, Japanese programmers are accustomed to dealing with Japanese
> >   strings as byte sequences.  Japanese users have a real motivation to
> >   manipulate Japanese character strings as sequences of bytes.  Regardless
> >   of whether Unicode is supported or not, the byte sequence data type is
> >   necessary in order to represent Japanese characters.
>
> An explicit part of every proposal has been a continued support for
> rich, expressive byte-sequence manipulation.
>
> >   The present implementation of strings in Python, where a string represents
> >   a sequence of bytes, is one feature that makes Python easy for Japanese
> >   developers to use.
>
> If Japanese programmers understand the difference between a byte and a
> character (which they must!), why would they be opposed to making that
> distinction explicit in code?

Maybe because, like me, they're thinking in historical terms where
'char' is just another word for byte?
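
To make the byte/character distinction concrete, here is a minimal
sketch.  It assumes a modern Python 3 interpreter, where str holds
characters and bytes holds 8-bit quantities; that split is not how the
string type discussed in this thread behaves.  The sample text and the
variable names are chosen purely for illustration.

```python
# Illustration only: assumes Python 3 (str = characters, bytes = 8-bit values).
# The sample text is the three characters of "Nihongo", written as escapes
# so the snippet does not depend on the source file's encoding.
text = "\u65e5\u672c\u8a9e"

# The same text as two different byte sequences.
as_sjis = text.encode("shift_jis")   # a legacy Japanese encoding
as_utf8 = text.encode("utf-8")

n_chars = len(text)           # counts characters
n_sjis_bytes = len(as_sjis)   # counts bytes in the Shift_JIS form
n_utf8_bytes = len(as_utf8)   # counts bytes in the UTF-8 form

# Decoding with the matching codec recovers the original characters.
roundtrip = as_sjis.decode("shift_jis")

print(n_chars, n_sjis_bytes, n_utf8_bytes, roundtrip == text)
```

The character count stays the same whichever encoding is used, while
the byte count depends on it; that is the sense in which 'char' and
byte stop being interchangeable once text goes beyond one byte per
character.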

--Guido van Rossum (home page: http://www.python.org/~guido/)