[Python-Dev] Unicode, again

Lars Marius Garshol larsga@garshol.priv.no
04 Mar 2002 15:14:02 +0100


* Fred L. Drake, Jr.
| 
| Pure 7-bit ASCII is good as UTF-8; there may be some quirks with
| control characters, but I'm not aware of anything specific.  

There isn't. The control characters are valid Unicode characters and
have the same semantics in Unicode as in ASCII. UTF-8 does not treat
them specially either: every character below U+0080 is encoded as a
single byte with the same value it has in ASCII.
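
A quick way to check this, as a minimal sketch assuming a Python 3
interpreter (the variable names are only for illustration): encode
every code point below U+0080, control characters included, both as
ASCII and as UTF-8, and compare the bytes.

```python
# Every code point below U+0080, control characters included.
ascii_text = "".join(chr(cp) for cp in range(0x80))

as_ascii = ascii_text.encode("ascii")
as_utf8 = ascii_text.encode("utf-8")

# The two encodings should produce identical bytes.
same_bytes = as_ascii == as_utf8
# One byte per character, and the high bit is never set.
one_byte_each = len(as_utf8) == len(ascii_text)
high_bit_clear = all(b < 0x80 for b in as_utf8)
# Decoding as UTF-8 gives back the original text unchanged.
round_trip = as_utf8.decode("utf-8") == ascii_text

print(same_bytes, one_byte_each, high_bit_clear, round_trip)
```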

| The difference between ASCII and UTF-9 starts at the 8th bit, so
| pure ASCII (not Latin-1) should be quite happy in a UTF-8 world.

UTF-9 is a typo for UTF-8, right?
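
The "(not Latin-1)" caveat in the quoted text can be illustrated the
same way. This is only a sketch, again assuming Python 3, with U+00E9
picked as an arbitrary character above U+007F: its Latin-1 byte has
the 8th bit set and is not a valid UTF-8 sequence on its own.

```python
# U+00E9 (e with acute accent), an arbitrary example above U+007F.
char = "\u00e9"

latin1_bytes = char.encode("latin-1")  # a single byte, 0xE9
utf8_bytes = char.encode("utf-8")      # two bytes, 0xC3 0xA9

# Try to read the Latin-1 byte as if it were UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    # 0xE9 starts a multi-byte sequence that is never completed.
    latin1_is_valid_utf8 = False

print(latin1_bytes, utf8_bytes, latin1_is_valid_utf8)
```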

-- 
Lars Marius Garshol, Ontopian         <URL: http://www.ontopia.net >
ISO SC34/WG3, OASIS GeoLang TC        <URL: http://www.garshol.priv.no >