[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]

Aahz aahz at pythoncraft.com
Sat Feb 18 06:13:44 CET 2006


On Fri, Feb 17, 2006, "Martin v. Löwis" wrote:
> Josiah Carlson wrote:
>>
>> How are users confused?
> 
> Users do
> 
> py> "Martin v. Löwis".encode("utf-8")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf6 in position 11:
> ordinal not in range(128)
> 
> because they want to convert the string "to Unicode", and they have
> found a text telling them that .encode("utf-8") is a reasonable
> method.

The problem is that they don't understand that "Martin v. Löwis" is not
Unicode -- once all strings are Unicode, this is guaranteed to work.
While it's not absolutely true, my experience of watching Unicode
confusion is that the simplest approach for newbies is: encode FROM
Unicode, decode TO Unicode.  Most people, when they start playing with
Unicode, think of it as just another text encoding rather than as
something that suddenly replaces "the universe" as the most basic form
of text.
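
The rule above can be sketched as follows. This is only an illustration,
and it assumes Python 3 (where str is always Unicode and bytes is a
separate type) rather than the interpreter in the quoted session; the
variable names are made up for the example.

```python
# Assumption: Python 3, where str is Unicode text and bytes is encoded data.
name = "Martin v. L\u00f6wis"  # Unicode text (str); \u00f6 is o-umlaut

# Encode FROM Unicode: str -> bytes
utf8_bytes = name.encode("utf-8")
latin1_bytes = name.encode("latin-1")

# Decode TO Unicode: bytes -> str
roundtrip_utf8 = utf8_bytes.decode("utf-8")
roundtrip_latin1 = latin1_bytes.decode("latin-1")

# In Latin-1 the o-umlaut is the single byte 0xf6 at index 11,
# which matches the byte and position reported in the quoted traceback.
umlaut_byte = latin1_bytes[11]

# Treating those bytes as ASCII fails, because 0xf6 is outside range(128).
error = None
try:
    latin1_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
```

Under these assumptions the direction of each call is unambiguous: text
has only .encode() and encoded data has only .decode(), so the mistake in
the quoted session cannot be written at all.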
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis

