a simple unicode question

Gabriel Genellina gagsl-py2 at yahoo.com.ar
Wed Oct 28 04:10:51 EDT 2009


On Wed, 28 Oct 2009 02:28:01 -0300, Chris Jones <cjns1989 at gmail.com>
wrote:
> On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote:
>> Chris Jones wrote:
>>> Best part of Unicode is that there are multiple encodings, right? ;-)
>> No, the best part about Unicode is there is no encoding!
>> Unicode does not define any encoding;
>
> RFC 3629:
> "ISO/IEC 10646 and Unicode define several encoding forms of their
> common repertoire: UTF-8, UCS-2, UTF-16, UCS-4 and UTF-32."
>
>> what it defines is code-points for characters which is not related to
>> how characters are encoded in files or network transmission.
>
> In other words, Unicode is "not related to any encoding" .. and yet the
> UTF-8, UTF-16.. "encoding forms" are clearly "related" to Unicode.
>
> How is that possible?

Start reading "The Absolute Minimum Every Software Developer Absolutely,  
Positively Must Know About Unicode and Character Sets (No Excuses!)", by  
Joel Spolsky.
http://www.joelonsoftware.com/articles/Unicode.html
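
In short: Unicode itself assigns each character an abstract number (a code
point); UTF-8, UTF-16 and UTF-32 are separate rules for turning those
numbers into bytes. A minimal sketch of that distinction follows, assuming
Python 3 (where str is a sequence of code points); the variable names are
just for illustration:

```python
# One abstract code point, several concrete byte encodings.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, code point U+00E9

# The code point is just a number; it says nothing about bytes.
code_point = ord(ch)

# Each encoding form maps that same number to a different byte sequence.
utf8 = ch.encode("utf-8")       # 2 bytes
utf16 = ch.encode("utf-16-be")  # 2 bytes, big-endian, no BOM
utf32 = ch.encode("utf-32-be")  # 4 bytes, big-endian, no BOM

print(hex(code_point))   # 0xe9
print(utf8)              # b'\xc3\xa9'
print(utf16)             # b'\x00\xe9'
print(utf32)             # b'\x00\x00\x00\xe9'

# Decoding any of the byte sequences gives back the same character.
decoded = [
    utf8.decode("utf-8"),
    utf16.decode("utf-16-be"),
    utf32.decode("utf-32-be"),
]
print(all(d == ch for d in decoded))  # True
```

So the encodings are "related" to Unicode in that they all serialize its
code points, but the character itself is not tied to any one of them.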

-- 
Gabriel Genellina



