[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
Michael Foord
fuzzyman at voidspace.org.uk
Tue Feb 14 00:53:16 CET 2006
Guido van Rossum wrote:
> On 2/13/06, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>
>> Phillip J. Eby wrote:
>> [snip..]
>>
>>> In fact, the 'encoding' argument seems useless in the case of str objects,
>>> and it seems it should default to latin-1 for unicode objects. The only
>>>
>>>
>> -1 for having an implicit encode that behaves differently to other
>> implicit encodes/decodes that happen in Python. Life is confusing enough
>> already.
>>
>
> But adding an encoding doesn't help. The str.encode() method always
> assumes that the string itself is ASCII-encoded, and that's not good
> enough:
>
>
Sorry - I meant for the unicode to bytes case. A default encoding that
behaves differently to the current implicit encodes/decodes would be
confusing IMHO.
I agree that string to bytes shouldn't change the value of the bytes.
The least confusing description of a non-unicode string is 'byte-string'.
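For reference, here is a minimal sketch of what a latin-1 default would mean in the unicode to bytes case. It is written for Python 3 so that it runs on a current interpreter (an assumption on my part; the thread is about 2.x), and the variable names are purely illustrative. latin-1 maps code points 0..255 one-to-one onto byte values, and anything above 255 cannot be encoded.

```python
# Python 3 sketch: latin-1 maps code points 0..255 one-to-one onto byte values.
text = "abc\xf0"                  # four code points, the last one is U+00F0
data = text.encode("latin-1")     # bytes carrying the same numeric values

code_points = [ord(ch) for ch in text]
byte_values = list(data)

# Decoding with latin-1 is the inverse mapping, so the round trip is lossless.
round_trip = data.decode("latin-1")

# Code points above 255 have no latin-1 byte, so this encode fails.
try:
    "\u20ac".encode("latin-1")
    euro_ok = True
except UnicodeEncodeError:
    euro_ok = False
```

So a latin-1 default would succeed silently for some non-ASCII text where the current ASCII-based implicit conversions raise an error, which is the inconsistency I was objecting to.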
Michael Foord
> >>> "abc".encode("latin-1")
> 'abc'
> >>> "abc".decode("latin-1")
> u'abc'
> >>> "abc\xf0".decode("latin-1")
> u'abc\xf0'
> >>> "abc\xf0".encode("latin-1")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 3: ordinal not in range(128)
>
>
> The right way to look at this is, as Phillip says, to consider
> conversion between str and bytes as not an encoding but a data type
> change *only*.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>
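The UnicodeDecodeError in the quoted session comes from an implicit step: in Python 2, calling encode() on a byte string first decodes it with the default codec (ASCII) and only then applies the requested encoding. A rough sketch of that two-step behaviour follows, written for Python 3 so it runs today. The helper name py2_str_encode is hypothetical and only mimics the behaviour; it is not how the interpreter implements it.

```python
def py2_str_encode(raw, encoding):
    """Hypothetical helper mimicking Python 2's str.encode() on a byte string."""
    # Step 1: implicit decode with the default codec (ASCII in Python 2).
    text = raw.decode("ascii")
    # Step 2: the encode that the caller actually asked for.
    return text.encode(encoding)


# Pure ASCII input survives both steps unchanged.
ok = py2_str_encode(b"abc", "latin-1")

# A byte >= 0x80 fails in step 1, before latin-1 is ever consulted.
error_name = error_codec = error_position = None
try:
    py2_str_encode(b"abc\xf0", "latin-1")
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__
    error_codec = exc.encoding
    error_position = exc.start
```

The failure is reported by the 'ascii' codec at position 3, matching the traceback, which is why passing an encoding argument cannot help for byte strings.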