[Python-3000] PEP 3138- String representation in Python 3000

M.-A. Lemburg mal at egenix.com
Thu May 15 12:38:11 CEST 2008


On 2008-05-15 12:06, Paul Moore wrote:
> On 15/05/2008, Guido van Rossum <guido at python.org> wrote:
>> Consider code that gets an encoding passed in as a
>> variable e. It knows it has a bytes instance b. To encode b from bytes
>> to str (unicode), it can use s = b.decode(e).
> 
> To encode, you use .decode? It's nice to know it's not just me who has
> trouble keeping the terminology straight...

It's all a matter of perspective. You can say you're encoding Latin-1
to Unicode, or you can say you're encoding Unicode to Latin-1.

Python's Unicode implementation regards PyUnicode as a "bigger" type
than PyString (*), since it can hold all possible code points. So when
going from the "bigger" type to the smaller one, you *encode*, whereas
when going from the smaller one to the bigger one, you *decode*.
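To make the direction concrete, here is a minimal sketch using the
Python 3 names, on the assumption that str stands in for PyUnicode and
bytes for PyString. The sample strings are made up for illustration.

```python
# Python 3 names: str plays the role of PyUnicode, bytes that of PyString.
text = "caf\u00e9"               # str: a sequence of Unicode code points

# Bigger type -> smaller type: encode.
latin1 = text.encode("latin-1")
utf8 = text.encode("utf-8")

assert latin1 == b"caf\xe9"
assert utf8 == b"caf\xc3\xa9"

# Smaller type -> bigger type: decode.
assert latin1.decode("latin-1") == text
assert utf8.decode("utf-8") == text

# The smaller type cannot represent every code point, so encoding may fail.
try:
    "\u20ac".encode("latin-1")   # EURO SIGN has no Latin-1 representation
    encodable = True
except UnicodeEncodeError:
    encodable = False
```

Decoding Latin-1 bytes never fails, because every byte value maps to a
code point; only the encode direction can run out of room here.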

For codecs in general, you have a source and a destination defining
the codec (= coding / decoding). When going from the source to the
destination you *encode*; going the other way around is *decoding*.
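The general source/destination view can be sketched with codecs whose two
sides are the same Python type, so that "bigger" and "smaller" no longer
apply. This assumes Python 3.4 or later and the standard library codecs
"hex_codec" and "rot_13"; the sample values are made up.

```python
import codecs

# bytes -> bytes codec: source is raw bytes, destination is their hex form.
raw = b"hi"
hexed = codecs.encode(raw, "hex_codec")      # source -> destination: encode
assert hexed == b"6869"
assert codecs.decode(hexed, "hex_codec") == raw   # destination -> source

# str -> str codec: both sides are text, the direction is pure convention.
plain = "Hello"
rotated = codecs.encode(plain, "rot_13")
assert rotated == "Uryyb"
assert codecs.decode(rotated, "rot_13") == plain
```

In both cases the codec's definition alone fixes which direction counts as
encoding.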

(*) This is why coercion in Py2 goes from PyString to PyUnicode and
not the other way around.

-- 
Marc-Andre Lemburg
eGenix.com

