[New-bugs-announce] [issue24025] str(bytes_obj) should raise an error

Marc-Andre Lemburg report at bugs.python.org
Wed Apr 22 15:23:32 CEST 2015


New submission from Marc-Andre Lemburg:

In Python 2, the unicode() constructor does not accept non-ASCII bytes arguments unless an encoding argument is given:

>>> unicode(u'abcäöü'.encode('utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

In Python 3, the str() constructor masks this programming error by returning the repr() of the bytes object:

>>> str('abcäöü'.encode('utf-8'))
"b'abc\\xc3\\xa4\\xc3\\xb6\\xc3\\xbc'"

I think it would be more helpful to raise an error that points the programmer to the encoding argument, which is most likely what is missing.
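
To make the proposal concrete, here is a minimal sketch of a wrapper that behaves the way str() would under this change. The name strict_str and the wording of its error messages are hypothetical and not part of Python; the wrapper only illustrates the idea of rejecting bytes when no encoding is given.

```python
def strict_str(obj, encoding=None, errors="strict"):
    """Hypothetical helper: like str(), but refuse bytes without an encoding."""
    if isinstance(obj, (bytes, bytearray)):
        if encoding is None:
            # The proposed behaviour: fail loudly instead of returning repr().
            raise TypeError(
                "cannot convert %s to str without an encoding argument"
                % type(obj).__name__
            )
        # Decode explicitly, exactly as str(bytes_obj, encoding) does today.
        return str(obj, encoding, errors)
    if encoding is not None:
        # Mirror the built-in restriction: only bytes-like objects can be decoded.
        raise TypeError("decoding a non-bytes object is not supported")
    return str(obj)
```

With this sketch, strict_str('abcäöü'.encode('utf-8')) raises TypeError, while passing 'utf-8' as the second argument returns the decoded text.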

Also note that you get a different output with the encoding argument set:

>>> str('abcäöü'.encode('utf-8'), 'utf-8')
'abcäöü'

I know this is documented, but it is still not very helpful and can easily hide errors.
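
For reference, CPython already offers an opt-in way to surface this mistake: the -b command line option issues a BytesWarning when str() is called on bytes without an encoding, and -bb turns that warning into an error. The following is a rough sketch that starts a child interpreter with -bb; the helper name run_with_bb is made up for illustration, and the behaviour may differ on other implementations or future versions.

```python
import subprocess
import sys


def run_with_bb(snippet):
    """Hypothetical helper: run a snippet in a child interpreter started with -bb."""
    return subprocess.run(
        [sys.executable, "-bb", "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )


# str() on bytes without an encoding should now fail instead of returning repr().
result = run_with_bb("str(b'abc')")
print(result.returncode)
print(result.stderr)
```

Because the option is off by default, it does not help programmers who are unaware of it, which is the point of the report.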

----------
components: Interpreter Core, Unicode
messages: 241800
nosy: ezio.melotti, haypo, lemburg
priority: normal
severity: normal
status: open
title: str(bytes_obj) should raise an error
versions: Python 3.5, Python 3.6

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24025>
_______________________________________

