[issue35318] Check accuracy of str() doc string for its encoding argument

New submission from Raymond Hettinger <raymond.hettinger@gmail.com>: The "encoding" parameter is documented to default to sys.getdefaultencoding(). That may be true but there doesn't seem to be a way to use that default because objects will all have a __str__ or __repr__ that will be run instead. -------------------------------------------
>>> import sys
>>> sys.getdefaultencoding()   # Default encoding is utf-8
'utf-8'
>>> buffer = b'lim x \xe2\x9f\xb6 \xe2\x88\x9e, 1/sin\xc2\xb2(x)'
>>> str(buffer, 'utf-8')       # Explicit argument decodes properly
'lim x ⟶ ∞, 1/sin²(x)'
>>> str(buffer)                # Despite the default, repr is shown
"b'lim x \\xe2\\x9f\\xb6 \\xe2\\x88\\x9e, 1/sin\\xc2\\xb2(x)'"
>>> print(str.__doc__)
str(object='') -> str
str(bytes_or_buffer[, encoding[, errors]]) -> str

Create a new string object from the given object. If encoding or
errors is specified, then the object must expose a data buffer
that will be decoded using the given encoding and error handler.
Otherwise, returns the result of object.__str__() (if defined)
or repr(object).
encoding defaults to sys.getdefaultencoding().
errors defaults to 'strict'.

----------
assignee: docs@python
components: Documentation
messages: 330454
nosy: docs@python, rhettinger
priority: normal
severity: normal
status: open
title: Check accuracy of str() doc string for its encoding argument
versions: Python 3.6, Python 3.7, Python 3.8

_______________________________________
Python tracker <report@bugs.python.org>
<https://bugs.python.org/issue35318>
_______________________________________

Serhiy Storchaka <storchaka+cpython@gmail.com> added the comment:
>>> str(buffer, errors='strict')
'lim x ⟶ ∞, 1/sin²(x)'
----------
nosy: +serhiy.storchaka

Raymond Hettinger <raymond.hettinger@gmail.com> added the comment: We may need to reword this a bit to show that the default system encoding only applies if "errors" is specified; otherwise, the argument pattern is mysterious.
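
The three call patterns discussed so far can be compared side by side. This is only an illustrative sketch that reuses the buffer from the original report; the variable names as_repr, decoded_default and decoded_explicit are made up for this example and do not come from the thread.

```python
import sys

buffer = b'lim x \xe2\x9f\xb6 \xe2\x88\x9e, 1/sin\xc2\xb2(x)'

# Neither encoding nor errors given: str() returns the repr of the bytes object.
as_repr = str(buffer)

# Only errors given: the bytes are decoded, and encoding falls back to
# sys.getdefaultencoding(), which is 'utf-8' on Python 3.
decoded_default = str(buffer, errors='strict')

# Only encoding given: the bytes are decoded with that encoding.
decoded_explicit = str(buffer, 'utf-8')

print(sys.getdefaultencoding())
print(as_repr == repr(buffer))              # the undecoded repr
print(decoded_default == decoded_explicit)  # same decoded text either way
```

Note that running Python with the -b option makes str(buffer) emit a BytesWarning, which is another hint that the one-argument form does not decode.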

Serhiy Storchaka <storchaka+cpython@gmail.com> added the comment: Is it not exactly what the docstring says?
If encoding or errors is specified, then the object must expose a data buffer that will be decoded using the given encoding and error handler. Otherwise, returns the result of object.__str__() (if defined) or repr(object).
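
The dispatch rule quoted from the docstring can be approximated in pure Python. The function str_model below is hypothetical and is not how CPython implements str(); it is a rough sketch of the documented behaviour, assuming that "exposes a data buffer" can be modelled with memoryview().

```python
import sys


def str_model(obj='', encoding=None, errors=None):
    """Hypothetical pure-Python model of the documented str() dispatch."""
    if encoding is None and errors is None:
        # Neither argument given: use __str__, which falls back to __repr__
        # for types that do not define their own __str__.
        return type(obj).__str__(obj)
    # At least one argument given: obj must expose a data buffer.
    # memoryview() raises TypeError for objects that do not.
    data = bytes(memoryview(obj))
    if encoding is None:
        encoding = sys.getdefaultencoding()
    if errors is None:
        errors = 'strict'
    return data.decode(encoding, errors)


buf = b'caf\xc3\xa9'
print(str_model(buf) == str(buf))                                    # repr path
print(str_model(buf, errors='strict') == str(buf, errors='strict'))  # decode path
```

In this model the defaults for encoding and errors are only consulted on the decoding path, which is why passing errors alone is enough to trigger decoding with the default encoding.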

Martin Panter <vadmium+py@gmail.com> added the comment: Closing in favour of Issue 39574, where a new wording is proposed.
----------
nosy: +martin.panter
resolution: -> duplicate
stage: -> resolved
status: open -> closed
superseder: -> str.__doc__ is misleading
participants (3)
- Martin Panter
- Raymond Hettinger
- Serhiy Storchaka