[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Ian Bicking
ianb at colorstudy.com
Sat Feb 18 00:13:51 CET 2006
Martin v. Löwis wrote:
> Users do
>
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
> ordinal not in range(128)
>
> because they want to convert the string "to Unicode", and they have
> found a text telling them that .encode("utf-8") is a reasonable
> method.
>
> What it *should* tell them is
>
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
> File "<stdin>", line 1, in ?
> AttributeError: 'str' object has no attribute 'encode'
I think it would be even better if they got "ValueError: utf8 can only
encode unicode objects". AttributeError is not much clearer than the
UnicodeDecodeError.
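
To make the suggestion concrete, here is a minimal sketch of an encoder that refuses non-text input with that kind of message. It is written against Python 3, where `str` is the unicode type; the helper name `encode_text` and the exact wording of the message are assumptions for illustration, not an existing or proposed API.

```python
def encode_text(value, encoding="utf-8"):
    """Encode text to bytes, rejecting anything that is not text.

    Hypothetical helper: it only illustrates the error message
    suggested above, instead of an implicit ASCII decode.
    """
    if not isinstance(value, str):
        raise ValueError(
            "%s can only encode unicode objects, got %s"
            % (encoding, type(value).__name__)
        )
    return value.encode(encoding)


# Text input is encoded as usual.
encoded = encode_text("Martin v. L\u00f6wis")

# Already-encoded bytes are rejected with an explicit message.
try:
    encode_text(encoded)
except ValueError as exc:
    message = str(exc)
```

The point of the sketch is only that the failure names the real problem (wrong input type for a text encoding) rather than surfacing as a decode error or a missing attribute.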
That str.encode(unicode_encoding) implicitly decodes strings seems like
a flaw in the unicode encodings, quite separate from the existence of
str.encode. I for one really like s.encode('zlib').encode('base64') --
and if the zlib encoding raised an error when it was passed a unicode
object (instead of implicitly encoding the string with the ascii
encoding) that would be fine.
The pipe-like nature of .encode and .decode works very nicely for
certain transformations, applicable to both unicode and byte objects.
Let's not throw the baby out with the bath water.
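
A rough sketch of that pipe-like chaining, written for Python 3 rather than the Python 2 spelling above: there the byte-to-byte transforms are reached through `codecs.encode` and `codecs.decode` under the names `zlib_codec` and `base64_codec`. The helpers `pipe_encode` and `pipe_decode` are hypothetical names introduced only for this example.

```python
import codecs


def pipe_encode(data, *encodings):
    """Apply each named codec in order, feeding the output to the next."""
    for name in encodings:
        data = codecs.encode(data, name)
    return data


def pipe_decode(data, *encodings):
    """Undo a chain of codecs; pass the names in reverse order."""
    for name in encodings:
        data = codecs.decode(data, name)
    return data


payload = b"spam and eggs " * 20

# Compress, then make the result safe for text transports.
packed = pipe_encode(payload, "zlib_codec", "base64_codec")

# Reverse the pipeline to get the original bytes back.
restored = pipe_decode(packed, "base64_codec", "zlib_codec")

# A text string is rejected by the zlib transform instead of being
# implicitly encoded first.
try:
    codecs.encode("Martin v. L\u00f6wis", "zlib_codec")
except TypeError:
    rejected = True
else:
    rejected = False
```

Here the zlib transform raises an error when handed text, which is the behaviour argued for above, while the chaining itself stays available for bytes.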
--
Ian Bicking / ianb at colorstudy.com / http://blog.ianbicking.org