[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Tue Feb 14 00:44:27 CET 2006

On 2/13/06, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> Phillip J. Eby wrote:
> [snip..]
> >
> > In fact, the 'encoding' argument seems useless in the case of str objects,
> > and it seems it should default to latin-1 for unicode objects.  The only
> >
> -1 for having an implicit encode that behaves differently to other
> implicit encodes/decodes that happen in Python. Life is confusing enough
> already.

But adding an encoding doesn't help. The str.encode() method first
decodes the string implicitly, always assuming that the string itself
is ASCII-encoded, and that's not good enough for any byte above 0x7f:

>>> "abc".encode("latin-1")
'abc'
>>> "abc".decode("latin-1")
u'abc'
>>> "abc\xf0".decode("latin-1")
u'abc\xf0'
>>> "abc\xf0".encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 3: ordinal not in range(128)
>>>
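
The failure in the session above comes from a hidden two-step
conversion: the 8-bit string is first decoded with the default codec
(ASCII) and only then encoded with the requested one. A minimal sketch
of that behaviour, written for Python 3 where the 8-bit string is
modelled as a bytes object; py2_str_encode is a hypothetical helper
made up for illustration, not a real API:

```python
# Python 3 sketch of the implicit decode behind Python 2's str.encode().
# py2_str_encode is a hypothetical name used only for this illustration.

def py2_str_encode(raw: bytes, encoding: str) -> bytes:
    # Step 1: implicit decode using the default codec (ASCII).
    text = raw.decode("ascii")
    # Step 2: the encode that the caller actually asked for.
    return text.encode(encoding)


# Pure ASCII data survives the round trip unchanged.
print(py2_str_encode(b"abc", "latin-1"))

# Any byte above 0x7f fails in step 1, before the requested
# encoding is ever consulted.
try:
    py2_str_encode(b"abc\xf0", "latin-1")
except UnicodeDecodeError as exc:
    print("failed in codec:", exc.encoding, "at position", exc.start)
```

The error names the 'ascii' codec even though the caller asked for
latin-1, which is why passing an encoding argument does not help.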

The right way to look at this is, as Phillip says, to consider
conversion between str and bytes as not an encoding but a data type
change *only*.
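
One way to picture a pure data type change is a conversion that keeps
every value in the range 0..255 as it is and never raises. The sketch
below is an assumption-laden illustration in Python 3, not the proposed
bytes API: str_to_bytes and bytes_to_str are hypothetical helpers, and
latin-1 is used only because it maps code points 0..255 one-to-one onto
byte values.

```python
# Hypothetical helpers illustrating a value-preserving type change.
# latin-1 maps each code point 0..255 to the byte with the same value,
# so no byte is ever rejected or altered.

def str_to_bytes(text: str) -> bytes:
    return text.encode("latin-1")


def bytes_to_str(data: bytes) -> str:
    return data.decode("latin-1")


# Every possible byte value round-trips unchanged.
every_byte = bytes(range(256))
assert str_to_bytes(bytes_to_str(every_byte)) == every_byte

# The byte 0xf0 from the session above keeps its numeric value.
assert [ord(ch) for ch in bytes_to_str(b"abc\xf0")] == [97, 98, 99, 240]
```

Nothing here interprets the data as text in any particular character
set; only the container type changes.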

--
--Guido van Rossum (home page: http://www.python.org/~guido/)