[New-bugs-announce] [issue8611] Python3 doesn't support locale different than utf8 and an non-ASCII path (POSIX)

STINNER Victor report at bugs.python.org
Tue May 4 15:30:50 CEST 2010

New submission from STINNER Victor <victor.stinner at haypocalc.com>:

Python3 is unable to start (bootstrap failure) on a POSIX system if the locale encoding is not utf8 and the Python path (the standard library path where the encodings module is stored) contains a non-ASCII character. (Windows and Mac OS X are not affected by this issue because the file system encoding is hardcoded on those platforms.)

 - Py_FileSystemDefaultEncoding == NULL
 - calculate_path(): sys.path is filled with directory names decoded with the locale encoding
 - find_module() encodes each path using PyUnicode_AsEncodedString(..., Py_FileSystemDefaultEncoding, NULL), which falls back to the "utf-8" encoding because Py_FileSystemDefaultEncoding is NULL

=> error because the path is not encoded and decoded with the same encoding
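
The mismatch can be reproduced in pure Python. This is only a sketch of the two steps above, not the actual C code path: the directory name and the ISO-8859-1 locale are assumptions chosen for illustration.

```python
# Bytes of a directory name as stored on disk under an ISO-8859-1 locale
# (hypothetical path, for illustration only).
raw_path = "/opt/pyth\u00f6n/lib".encode("iso-8859-1")

# Step 1, as in calculate_path(): decode with the locale encoding.
decoded = raw_path.decode("iso-8859-1")

# Step 2, as in find_module() when Py_FileSystemDefaultEncoding is NULL:
# encode with utf-8 instead of the locale encoding.
reencoded = decoded.encode("utf-8")

# The bytes handed to the file system differ from the bytes on disk,
# so the directory cannot be found.
mismatch = raw_path != reencoded
print(raw_path, reencoded, mismatch)
```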

We cannot encode a path with the locale encoding because we need find_module() to load the encoding's codec, and loading the codec needs find_module()... (bootstrap error :-))

We should decode the path using a fixed encoding (e.g. ASCII or utf-8), use the same encoding to encode paths in find_module(), and then, once the locale codec is loaded, re-encode the paths of all objects storing filenames:

 - sys.path list items
 - sys.modules dict keys
 - sys.modules values: each module has __file__ and/or __path__ attributes
 - all code objects (co_filename)
 - (maybe some others?)
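
A rough Python sketch of that re-encoding pass, under several assumptions: the fixed bootstrap encoding is utf-8 with the surrogateescape error handler (PEP 383), the locale encoding is ISO-8859-1, and the helper names redecode() and fix_filenames() are hypothetical. It works on stand-in objects rather than the real sys.path and sys.modules, and it leaves code objects out.

```python
import types

def redecode(name, bootstrap_encoding, locale_encoding):
    """Recover the original bytes, then decode them with the locale encoding."""
    raw = name.encode(bootstrap_encoding, "surrogateescape")
    return raw.decode(locale_encoding, "surrogateescape")

def fix_filenames(path_list, modules, bootstrap_encoding, locale_encoding):
    """Rewrite filenames in a sys.path-like list and a sys.modules-like dict."""
    path_list[:] = [redecode(p, bootstrap_encoding, locale_encoding)
                    for p in path_list]
    for module in modules.values():
        filename = getattr(module, "__file__", None)
        if isinstance(filename, str):
            module.__file__ = redecode(filename, bootstrap_encoding,
                                       locale_encoding)
        package_path = getattr(module, "__path__", None)
        if isinstance(package_path, list):
            package_path[:] = [redecode(p, bootstrap_encoding, locale_encoding)
                               for p in package_path]

# Stand-ins for sys.path and sys.modules (hypothetical values).
raw = b"/opt/pyth\xf6n/lib"
fake_path = [raw.decode("utf-8", "surrogateescape")]
fake_module = types.ModuleType("example")
fake_module.__file__ = (raw + b"/example.py").decode("utf-8", "surrogateescape")
fake_modules = {"example": fake_module}

fix_filenames(fake_path, fake_modules, "utf-8", "iso-8859-1")
print(fake_path, fake_module.__file__)
```

The round trip only works because surrogateescape preserves undecodable bytes; with a strict ASCII or utf-8 decode the bootstrap step would fail before any fix-up could run.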

The error occurs in an early stage of Py_InitializeEx(), so the object list is limited and we control this list (eg. site is not loaded yet).

Related issues:
 - #8610: "Python3/POSIX: errors if file system encoding is None"
 - #8242: "Improve support of PEP 383 (surrogates) in Python3: meta-issue"

components: Interpreter Core, Unicode
messages: 104932
nosy: haypo
priority: normal
severity: normal
status: open
title: Python3 doesn't support locale different than utf8 and an non-ASCII path (POSIX)
versions: Python 3.2

Python tracker <report at bugs.python.org>
