[I18n-sig] Pre-PEP: Proposed Python Character Model

Toby Dickenson tdickenson@geminidataloggers.com
Thu, 8 Feb 2001 10:26:00 -0000


> > There is already a large body of code that mixes text and binary
> > data in the same type. If we have separate text/binary types, then
> > we need to plan a transition period to allow code to distinguish
> > between the two uses.
> 
> I think the current Unicode implementation has this property: Unicode
> is the type for representing character strings; the string type the
> one for representing byte strings.

The problem isn't so much in the current implementation; it's in the code that
has been written against that implementation. At the moment it is unnatural to
write

print u"hello world"

rather than the easier

print "hello world"

even though the message is clearly text.


I think we agree that, eventually, we would like the simple notation for a
string literal to create a Unicode string. What I'm not sure about is whether
we can make that change soon. How often are string literals used to create
what is logically just binary data?
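
To make the question concrete, here is a rough sketch of why it is hard to
answer mechanically. It assumes a Python version that has b"" literals, and
looks_like_text() is a hypothetical helper invented for this illustration,
not an existing API. It only checks whether every byte is printable ASCII,
which is about the best a tool can do without knowing the author's intent.

```python
def looks_like_text(data):
    """Heuristic only: True if every byte is printable ASCII or
    common whitespace (tab, LF, CR). Says nothing about intent."""
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    return all(byte in allowed for byte in bytearray(data))


# Hypothetical literals of the kind found in existing code.
samples = [
    b"hello world",           # a message: logically text
    b"GIF89a",                # a file signature: logically binary
    b"\x89PNG\r\n\x1a\n",     # a file signature: logically binary
    b"\x00\x01\x02\xff",      # raw bytes: logically binary
]

for sample in samples:
    verdict = "could be text" if looks_like_text(sample) else "binary"
    print("%r -> %s" % (sample, verdict))
```

The "GIF89a" case is the awkward one: it passes the heuristic even though it
is logically binary data, so literals like it would have to be found and
marked by hand during any transition period.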