[New-bugs-announce] [issue8611] Python3 doesn't support locale different than utf8 and an non-ASCII path (POSIX)
report at bugs.python.org
Tue May 4 15:30:50 CEST 2010
New submission from STINNER Victor <victor.stinner at haypocalc.com>:
Python3 is unable to start (bootstrap failure) on a POSIX system if the locale encoding is not utf8 and the Python path (the standard library path where the encodings package is stored) contains a non-ASCII character. (Windows and Mac OS X are not affected by this issue because the file system encoding is hardcoded.)
- Py_FileSystemDefaultEncoding == NULL
- calculate_path(): sys.path is filled with directory names decoded with the locale encoding
- find_module() encodes each path using PyUnicode_AsEncodedString(..., Py_FileSystemDefaultEncoding, NULL): this uses the "utf-8" encoding because Py_FileSystemDefaultEncoding is NULL
=> error because the path is not encoded and decoded with the same encoding
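The mismatch can be reproduced at the Python level. This is only a sketch of the effect, not the actual C code path: the directory name and the ISO-8859-1 locale encoding are assumptions chosen for illustration.

```python
# Hypothetical standard library path containing a non-ASCII byte
# (0xE9 is "é" in ISO-8859-1). The path itself is made up.
raw_path = b"/opt/caf\xe9/lib/python3.2"

# What calculate_path() effectively does: decode with the locale encoding
# (assumed here to be ISO-8859-1).
decoded = raw_path.decode("iso-8859-1")

# What find_module() effectively does when Py_FileSystemDefaultEncoding
# is NULL: encode with UTF-8.
reencoded = decoded.encode("utf-8")

# The bytes handed to the OS no longer match the bytes on disk.
mismatch = reencoded != raw_path
print(raw_path)
print(reencoded)
print("mismatch:", mismatch)
```

The re-encoded name contains the two bytes 0xC3 0xA9 where the real directory has the single byte 0xE9, so the lookup fails.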
We cannot encode a path with the locale encoding because we need find_module() to load the encoding codec, and loading the codec needs find_module()... (bootstrap error :-))
We should decode the path using a fixed encoding (e.g. ASCII or utf-8), use the same encoding to encode paths in find_module(), and then re-encode the paths of all objects storing filenames:
- sys.path list items
- sys.modules dict keys
- sys.modules values: each module has __file__ and/or __path__ attributes
- all code objects (co_filename)
- (maybe some others?)
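A rough Python-level sketch of the re-encoding step for the list and module cases follows. It is not the proposed C implementation. The names BOOTSTRAP, REAL, redecode() and fix_paths() are hypothetical, the encodings are assumed values, and the use of the surrogateescape error handler (PEP 383) to keep undecodable bytes round-trippable is an assumption of this sketch. Code objects (co_filename) are left out.

```python
import types

BOOTSTRAP = "utf-8"     # assumed fixed encoding used during early startup
REAL = "iso-8859-1"     # assumed locale encoding, known once codecs can load


def redecode(name):
    """Recover the original bytes, then decode them with the real encoding."""
    raw = name.encode(BOOTSTRAP, "surrogateescape")
    return raw.decode(REAL, "surrogateescape")


def fix_paths(path_list, modules):
    """Re-decode a sys.path-like list and the modules of a sys.modules-like dict."""
    path_list[:] = [redecode(p) for p in path_list]
    for mod in modules.values():
        file_attr = getattr(mod, "__file__", None)
        if isinstance(file_attr, str):
            mod.__file__ = redecode(file_attr)
        pkg_path = getattr(mod, "__path__", None)
        if isinstance(pkg_path, list):
            pkg_path[:] = [redecode(p) for p in pkg_path]


# Stand-ins for sys.path and sys.modules, so the real ones are not touched.
raw = b"/opt/caf\xe9/lib"                       # made-up directory name
early = raw.decode(BOOTSTRAP, "surrogateescape")
fake_path = [early]
fake_modules = {
    "demo": types.SimpleNamespace(__file__=early + "/demo.py", __path__=[early]),
}

fix_paths(fake_path, fake_modules)
print(fake_path)
print(fake_modules["demo"].__file__)
```

Because the bootstrap decoding keeps the original bytes recoverable, the second pass can be done entirely from the already-created objects without rereading anything from disk.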
The error occurs at an early stage of Py_InitializeEx(), so the object list is limited and we control this list (e.g. site is not loaded yet).
- #8610: "Python3/POSIX: errors if file system encoding is None"
- #8242: "Improve support of PEP 383 (surrogates) in Python3: meta-issue"
components: Interpreter Core, Unicode
title: Python3 doesn't support locale different than utf8 and an non-ASCII path (POSIX)
versions: Python 3.2
Python tracker <report at bugs.python.org>