[I18n-sig] Pre-PEP: Proposed Python Character Model

Toby Dickenson tdickenson@geminidataloggers.com
Wed, 07 Feb 2001 11:03:18 +0000


On Tue, 6 Feb 2001 21:49:42 +0100, "Martin v. Loewis"
<martin@loewis.home.cs.tu-berlin.de> wrote:

>Hi Paul,
>
>Interesting remarks. I comment only on those where I disagree.
>
>>     1. Python should have a single string type.
>
>I disagree. There should be a character string type and a byte string
>type, at least. I would agree that a single character string type is
>desirable.

There is already a large body of code that mixes text and binary data
in the same type. If we have separate text/binary types, then we need
to plan a transition period to allow code to distinguish between the
two uses.

>>     Two of the most common constructs in computer science are strings of
>>     characters and strings of bytes. A string of bytes can be represented
>>     as a string of characters between 0 and 255. Therefore the only
>>     reason to have a distinction between Unicode strings and byte
>>     strings is for implementation simplicity and performance purposes.
>>     This distinction should only be made visible to the average Python
>>     programmer in rare circumstances.

I disagree. Many programmers will be satisfied once they read a byte
string from a text file, print it, and see "Hello World". It is much
better to distinguish the two types, so that a byte string looks like
binary data when printed.
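
As a rough illustration of that distinction, here is a minimal sketch. It
assumes the behaviour of a later Python 3 interpreter, where byte strings
and text are separate types; it is not something proposed in this thread,
and the names raw, text, shown_raw and mixing_failed are made up for the
example.

```python
import io

# Hypothetical stand-in for a text file opened in binary mode.
raw = io.BytesIO(b"Hello World").read()   # a byte string
text = raw.decode("ascii")                # explicit decoding yields text

# The byte string's printed form carries a b'' marker, so it is visibly
# not text; the decoded string prints as plain characters.
shown_raw = repr(raw)
print(shown_raw)
print(text)

# Mixing the two types without converting is an error, not a silent success.
try:
    raw + text
    mixing_failed = False
except TypeError:
    mixing_failed = True
```

Under these assumptions the programmer has to choose an encoding before
the data reads as "Hello World", which is the point of keeping the two
types apart.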



Toby Dickenson
tdickenson@geminidataloggers.com