[issue5398] strftime("%B") returns a String unusable with unicode

Ezio Melotti report at bugs.python.org
Sun Mar 1 14:00:48 CET 2009


Ezio Melotti <ezio.melotti at gmail.com> added the comment:

I don't have the de_DE locale to reproduce that, but the cause is most
likely this:
1) datetime( 2009, 3, 1 ).strftime("%B") should return märz as a UTF-8
encoded string, i.e. 'm\xc3\xa4rz'
2) when you mix Unicode and encoded strings, the encoded strings are
automagically decoded to Unicode using the default codec, i.e. ASCII (on
Py2)
3) The ASCII codec is not able to decode '\xc3' (its value is 195, and
195 > 127) and a UnicodeDecodeError is raised.
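
The three steps above can be sketched without the de_DE locale by
hard-coding the bytes. This is only an illustration, assuming strftime
really returned UTF-8; the names first_bad and ascii_error are made up
for the example. The explicit .decode('ascii') stands in for the
implicit decoding that Py2 performs when the two string types are mixed.

```python
# UTF-8 bytes for "märz" (assumed strftime output under a UTF-8 locale).
month = b'm\xc3\xa4rz'

# The second byte is 0xc3, i.e. 195, which is outside the 0-127 ASCII range.
first_bad = bytearray(month)[1]

# Decoding with the ASCII codec fails on that byte.
try:
    month.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

print(first_bad)
print(ascii_error)
```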

The solution is to decode the string explicitly, using the encoding it
was produced in (UTF-8 in this case):
>>> month = 'm\xc3\xa4rz'
>>> u'' + month
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1:
ordinal not in range(128)
>>> u'' + month.decode('utf-8')
u'm\xe4rz'
>>>
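
If the locale's encoding is not known in advance, one possible approach
is to ask the locale module for it instead of hard-coding UTF-8. The
helper below is a rough sketch, not something from the report: to_text
is a hypothetical name, and falling back to
locale.getpreferredencoding(False) assumes that strftime encoded its
result with the locale's preferred encoding.

```python
import locale


def to_text(value, encoding=None):
    """Return value as a Unicode string (hypothetical helper).

    Byte strings are decoded explicitly; anything else is returned as is.
    """
    if isinstance(value, bytes):
        if encoding is None:
            # Assumption: the bytes were produced in the locale's encoding.
            encoding = locale.getpreferredencoding(False)
        return value.decode(encoding)
    return value


# Decode the hard-coded UTF-8 bytes from the session above.
month = to_text(b'm\xc3\xa4rz', 'utf-8')
```

With this, u'' + to_text(some_date.strftime("%B")) no longer relies on
the implicit ASCII decoding.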

----------
nosy: +ezio.melotti

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5398>
_______________________________________

