[Python-Dev] Unicode debate

Paul Prescod paul@prescod.net
Tue, 02 May 2000 11:05:20 -0500


Neil, I sincerely appreciate your informed input. I want to emphasize
one ideological difference though. :)

Neil Hodgson wrote:
> 
> ...
>
>    The two options being that literal is either assumed to be encoded in
> Latin-1 or UTF-8. 

I reject that characterization.

I claim that both strings contain Unicode characters, but one can contain
Unicode characters with higher ordinal values. UTF-8 versus Latin-1 does not
enter into it. Python strings should not be documented in terms of
encodings any more than Python ints are documented in terms of their
two's complement representation. Go down that road and we would have to
describe the default conversion from integers to floats in terms of their
bit representations.
Ugh!

I accept that the effect is similar to calling Latin-1 the "default",
but that's a side effect of the simple logical model that we are
proposing.
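
To make the "side effect" concrete, here is a minimal sketch. It is written
in present-day Python 3 purely for illustration and is not the mechanism
under discussion in this thread; the names raw, widened and decoded are
made up for the example. It treats each 8-bit value as the Unicode
character with the same ordinal and checks that the result happens to
match a Latin-1 decode.

```python
# Illustrative sketch only (modern Python 3, hypothetical names).

# Every possible 8-bit value, standing in for a narrow string.
raw = bytes(range(256))

# Logical model: an 8-bit character simply *is* the Unicode character
# with the same ordinal. No encoding is named anywhere.
widened = "".join(chr(b) for b in raw)

# Encoding view: interpret the same bytes as Latin-1.
decoded = raw.decode("latin-1")

# The two views agree, because Latin-1 maps byte N to code point N.
print(widened == decoded)
```

The agreement falls out of the ordinal-preserving model; nothing in the
first computation mentions Latin-1.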

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
It's difficult to extract sense from strings, but they're the only
communication coin we can count on. 
	- http://www.cs.yale.edu/~perlis-alan/quotes.html