[docs] [issue20686] Confusing statement about unicode strings in tutorial introduction

Daniel U. Thibault report at bugs.python.org
Thu Mar 20 21:00:38 CET 2014


Daniel U. Thibault added the comment:

>>> mystring="äöü"
>>> myustring=u"äöü"

>>> mystring
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> myustring
u'\xe4\xf6\xfc'

>>> str(mystring)
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> str(myustring)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

>>> f = open('workfile', 'w')
>>> f.write(mystring)
>>> f.close()
>>> f = open('workufile', 'w')
>>> f.write(myustring)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
>>> f.close()

workfile contains the bytes C3 A4 C3 B6 C3 BC (the UTF-8 encoding of äöü).

So the Unicode string (myustring) does indeed get converted to ASCII when it is written to a file, but not when it is just printed.
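
As a rough sketch of what the failing calls seem to be doing, and assuming the implicit conversion really goes through the 'ascii' codec named in the traceback, the same error can be triggered by hand. The names text, ascii_ok, failure and utf8_bytes are only for illustration:

```python
# Same three characters as myustring, written with escapes so the
# source itself stays pure ASCII.
text = u"\xe4\xf6\xfc"

# Assumption: str() and file.write() end up doing the equivalent of this.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    # exc.end is exclusive, so (0, 3) matches "position 0-2" in the traceback.
    failure = (exc.encoding, exc.start, exc.end)

# An explicit UTF-8 encode gives the bytes that ended up in workfile.
utf8_bytes = text.encode("utf-8")
```

If that assumption holds, the Unicode string itself is fine; it is only the choice of codec for the implicit conversion that fails.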

It seems really strange that non-Unicode strings (mystring) should actually be more flexible than Unicode strings...
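
For comparison, here is a minimal sketch of writing the Unicode string with an explicit encoding instead of relying on the implicit one. It uses io.open, which accepts an encoding argument; the temporary directory and the variable names are just assumptions for the example:

```python
import io
import os
import tempfile

# Same characters as myustring, written with escapes to keep the source ASCII.
text = u"\xe4\xf6\xfc"

# Scratch location for the example file (hypothetical, not the original workufile).
path = os.path.join(tempfile.mkdtemp(), "workufile")

# io.open encodes the text to UTF-8 on write, so no ASCII conversion is attempted.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Read the file back as raw bytes to see what was stored.
with io.open(path, "rb") as f:
    raw = f.read()
```

Under these assumptions the file should end up with the same six bytes as workfile, which suggests the difference is only whether the encoding is stated explicitly.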

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20686>
_______________________________________

