[Python-3000] PEP 3138- String representation in Python 3000

M.-A. Lemburg mal at egenix.com
Thu May 22 14:27:19 CEST 2008


On 2008-05-22 13:58, Antoine Pitrou wrote:
> M.-A. Lemburg <mal <at> egenix.com> writes:
>> It's all a matter of perspective. You can say you're encoding Latin-1
>> to Unicode, or you can say you're encoding Unicode to Latin-1.
> 
> Except that Latin-1 is an encoding while Unicode is not. So I don't see how you
> can encode to Unicode. Of course you can encode to UTF-8, UTF-16, etc. - which
> /are/ encodings (and, in this case, Python returns you a bytes object :-)).

Well, yes and no :-)

Unicode does define a way to describe code points. The assignment
of integers to letters, symbols, etc. (i.e. a "character set")
is itself an encoding, so you can call it an "encoding" as well.
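
A minimal sketch of the two views, assuming Python 3 semantics (str holds
code points, encode() returns bytes). The variable names are only
illustrative:

```python
# Character set view: a character is assigned an integer (its code point).
text = "\u00e9"                    # LATIN SMALL LETTER E WITH ACUTE
code_point = ord(text)             # 233, i.e. U+00E9

# Encoding view: the same code point serialized to bytes in different ways.
latin1_bytes = text.encode("latin-1")   # one byte: 0xE9
utf8_bytes = text.encode("utf-8")       # two bytes: 0xC3 0xA9

# Decoding the Latin-1 bytes gives back the same code point.
round_trip = latin1_bytes.decode("latin-1")
```

In Latin-1 the byte value happens to equal the Unicode code point, which
is one reason the two notions are so easily mixed up.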

OTOH, Unicode is the mother of all character sets so to speak (even
though in this case, many children existed before the mother was
formed ;-), so it has a special status.

In practice the terms "encoding" and "character set" are often
used interchangeably, just as most people talk about "characters"
when referring to "code points" and/or "glyphs", or happily mix
"UTF-8", "UTF-16" and "Unicode".

The Unicode consortium usually uses the terms "UCS-2" and "UCS-4"
when referring to Unicode as a "character set", but even there
you have an ordering, which makes it an encoding.
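
One way to read "ordering" here is byte order. The sketch below rests on
that assumption, and it uses Python's UTF-16 and UTF-32 codecs as stand-ins
because Python ships no codec named UCS-2 or UCS-4 (for characters in the
Basic Multilingual Plane the byte layout is the same):

```python
text = "\u20ac"  # EURO SIGN, code point 0x20AC

# The same 16-bit unit written in two different byte orders.
big_endian_16 = text.encode("utf-16-be")       # bytes 0x20 0xAC
little_endian_16 = text.encode("utf-16-le")    # bytes 0xAC 0x20

# The 32-bit form pads the same value to four bytes.
big_endian_32 = text.encode("utf-32-be")       # bytes 0x00 0x00 0x20 0xAC

# Interpreting the big-endian bytes as an integer recovers the code point.
recovered = int.from_bytes(big_endian_16, "big")
```

As soon as you have to pick one of these byte layouts, you are dealing
with an encoding and no longer just an abstract table of integers.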

See my talk on Unicode for some clarification:

http://www.egenix.com/library/presentations/

-- 
Marc-Andre Lemburg
eGenix.com
