[New-bugs-announce] [issue11186] pydoc: HTMLDoc.index() doesn't support PEP 383
report at bugs.python.org
Fri Feb 11 13:55:09 CET 2011
New submission from STINNER Victor <victor.stinner at haypocalc.com>:
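If you have undecodable filenames on UNIX, Python 3 escapes the undecodable bytes using surrogates (PEP 383). pydoc: HTMLDoc.index() indirectly uses os.listdir(), which performs this escaping, and the filenames are later encoded to UTF-8 (the whole HTML content is encoded to UTF-8). That encoding fails with UnicodeEncodeError, because lone surrogates cannot be encoded to UTF-8 with the default strict error handler.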
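To make the failure concrete, here is a minimal sketch that does not touch pydoc at all. The filename bytes are made up for illustration, and the UTF-8 filesystem encoding is an assumption; it only shows the PEP 383 escaping and the later strict encoding step.

```python
# Hypothetical filename bytes: 0xE9 is not a valid UTF-8 sequence here.
raw = b"caf\xe9.py"

# PEP 383: the undecodable byte 0xE9 is mapped to the lone surrogate U+DCE9.
# This is what os.listdir() does for str paths (UTF-8 locale assumed).
name = raw.decode("utf-8", "surrogateescape")
assert name == "caf\udce9.py"

# Strict UTF-8 encoding, as used for the HTML page, rejects the surrogate.
try:
    name.encode("utf-8")
except UnicodeEncodeError as exc:
    error = exc
else:
    error = None

# Encoding with the same error handler gives the original bytes back.
roundtrip = name.encode("utf-8", "surrogateescape")
```

So the str returned by os.listdir() is only safe to encode again with the surrogateescape handler, not with the strict one.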
In practice, you cannot import such a .py file; you can only run it using "python script.py". So maybe we can just ignore modules with undecodable filenames. For example:
def has_surrogates(filename):  # hypothetical helper name
    return any((0xD800 <= ord(ch) <= 0xDFFF) for ch in filename)
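The check above could be applied to a directory listing roughly like this. This is only a sketch: has_surrogates() and importable_names() are hypothetical names, and the list stands in for whatever os.listdir() returns; it is not how pydoc is actually structured.

```python
def has_surrogates(filename):
    # True if any character is in the UTF-16 surrogate range, which is
    # where PEP 383 puts undecodable bytes (U+DC80..U+DCFF).
    return any(0xD800 <= ord(ch) <= 0xDFFF for ch in filename)


def importable_names(names):
    # Drop entries that cannot be encoded to UTF-8 for the HTML index.
    return [n for n in names if not has_surrogates(n)]


# Stand-in for an os.listdir() result with one undecodable filename.
names = ["os.py", "caf\udce9.py", "pydoc.py"]
kept = importable_names(names)
```

The downside of this approach is that the skipped file silently disappears from the index.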
Or we can escape the surrogate characters, but I don't know how. Writing "\uDC80" in an HTML document is not a good idea, especially in a URL (e.g. Firefox replaces \ by / in URLs).
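One possible way to escape, offered only as a sketch and not as a tested fix for pydoc: percent-encode the original bytes for the URL, and use a replacement character for the visible text. url_for() and label_for() are hypothetical helpers, and the explicit "utf-8" encoding is an assumption (real code would more likely go through os.fsencode()).

```python
import html
from urllib.parse import quote


def url_for(name, encoding="utf-8"):
    # Recover the original filename bytes, then percent-encode them, so
    # no backslash or surrogate ever appears in the URL.
    raw = name.encode(encoding, "surrogateescape")
    return quote(raw)


def label_for(name):
    # For the visible text, replace each surrogate with "?" so the page
    # can still be encoded to UTF-8.
    visible = name.encode("utf-8", "replace").decode("utf-8")
    return html.escape(visible)


link = url_for("caf\udce9.py")
label = label_for("caf\udce9.py")
```

The server side would then have to reverse this, for example with urllib.parse.unquote_to_bytes() followed by decoding with the surrogateescape handler, to get the original filename back.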
assignee: docs at python
components: Documentation, Library (Lib)
nosy: docs at python, haypo
title: pydoc: HTMLDoc.index() doesn't support PEP 383
versions: Python 3.1, Python 3.2, Python 3.3
Python tracker <report at bugs.python.org>