[docs] [issue26220] Unicode HOWTO references a question mark that isn't in snippet

Quentin Pradet report at bugs.python.org
Wed Jan 27 12:15:09 EST 2016

New submission from Quentin Pradet:

From https://docs.python.org/3.6/howto/unicode.html#the-string-type:

> The following examples show the differences::
>     >>> b'\x80abc'.decode("utf-8", "strict")  #doctest: +NORMALIZE_WHITESPACE
>     Traceback (most recent call last):
>         ...
>     UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
>       invalid start byte
>     >>> b'\x80abc'.decode("utf-8", "replace")
>     '\ufffdabc'
>     >>> b'\x80abc'.decode("utf-8", "backslashreplace")
>     '\\x80abc'
>     >>> b'\x80abc'.decode("utf-8", "ignore")
>     'abc'
> (In this code example, the Unicode replacement character has been replaced by
> a question mark because it may not be displayed on some systems.)

I think the whole sentence after the snippet can be removed, because the snippet already shows exactly what Python 3.2+ outputs and contains no question mark. It looks like the commit that added this sentence dates from Python 3.1: https://github.com/python/cpython/commit/34d4c82af56ebc1b65514a118f0ec7feeb8e172f, but another commit around Python 3.3 removed it: https://github.com/python/cpython/commit/63172c46706ae9b2a3bc80d639504a57fff4e716.
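
For anyone who wants to check this locally, here is a minimal sketch that decodes the same bytes with each error handler from the quoted snippet. It assumes Python 3.5 or later (the "backslashreplace" handler only works for decoding from 3.5 on), and the variable names are made up for illustration. ascii() is used so the result does not depend on whether the terminal can display U+FFFD.

```python
# Same malformed input as in the HOWTO snippet: 0x80 is not a valid
# UTF-8 start byte.
data = b'\x80abc'

# "strict" (the default) raises instead of returning a string.
try:
    data.decode("utf-8", "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Illustrative names, not taken from the HOWTO.
replaced = data.decode("utf-8", "replace")           # U+FFFD, not '?'
escaped = data.decode("utf-8", "backslashreplace")   # needs Python 3.5+
ignored = data.decode("utf-8", "ignore")

# ascii() escapes non-ASCII characters, so the output is terminal-independent.
print(strict_failed)     # True
print(ascii(replaced))   # '\ufffdabc'
print(ascii(escaped))    # '\\x80abc'
print(ascii(ignored))    # 'abc'
```

The "replace" result contains U+FFFD and no "?" character, which is consistent with the sentence about a question mark no longer matching the snippet.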

assignee: docs at python
components: Documentation
messages: 259034
nosy: Quentin.Pradet, docs at python
priority: normal
severity: normal
status: open
title: Unicode HOWTO references a question mark that isn't in snippet
versions: Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 3.6

Python tracker <report at bugs.python.org>
