RFC: PEP 587 "Python Initialization Configuration": 2nd version

Hi,
Thanks to Steve Dower's feedback, I enhanced and completed my PEP 587. Main changes:
* It is now possible to read the configuration and then modify the read configuration. For example, new directories can be added to PyConfig.module_search_paths (see the example below and the example in the PEP)
* PyConfig is now "dynamic" by default: strings are duplicated and PyConfig_Clear() must be called to release memory
* PyConfig now only uses wchar_t* for strings (unicode): char* (bytes) is no longer used. I had to hack CPython internals for that :-)
* I added a "_config_version" private field to PyPreConfig and PyConfig to prepare the backward compatibility for future changes.
* I removed the Open Question section: all known issues have been fixed.
During the Language Summit, Brett Cannon said that Steve Dower declined the offer to be the BDFL-delegate for this PEP. Thomas Wouters proposed himself to be the new BDFL-delegate.
Example to read the configuration, append a directory to sys.path (module_search_paths) and then initialize Python with this configuration:
    void init_python(void)
    {
        PyInitError err;
        PyConfig config = PyConfig_INIT;

        err = PyConfig_Read(&config);
        if (Py_INIT_FAILED(err)) {
            goto fail;
        }

        err = PyWideStringList_Append(&config.module_search_paths,
                                      L"/path/to/more/modules");
        if (Py_INIT_FAILED(err)) {
            goto fail;
        }

        err = Py_InitializeFromConfig(&config);
        if (Py_INIT_FAILED(err)) {
            goto fail;
        }

        PyConfig_Clear(&config);
        return;

    fail:
        PyConfig_Clear(&config);
        Py_ExitInitError(err);
    }
The HTML version will be online shortly: https://www.python.org/dev/peps/pep-0587/
Full text below.
Victor
PEP: 587
Title: Python Initialization Configuration
Author: Nick Coghlan ncoghlan@gmail.com, Victor Stinner vstinner@redhat.com
Discussions-To: python-dev@python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Mar-2019
Python-Version: 3.8

Abstract
========
Add a new C API to configure the Python initialization, providing finer control over the whole configuration and better error reporting.

Rationale
=========

Python is highly configurable, but its configuration evolved organically: configuration parameters are scattered all around the code and set in different ways (mostly global configuration variables and functions). A straightforward and reliable way to configure Python is needed. Some configuration parameters are not accessible from the C API, or not easily.
The C API of Python 3.7 Initialization takes ``wchar_t*`` strings as input whereas the Python filesystem encoding is set during the initialization.
This PEP is a partial implementation of PEP 432, which is the overall design. New fields can be added later to the ``PyConfig`` structure to finish the implementation of PEP 432 (adding a new partial initialization which allows configuring Python using Python objects before finishing the full initialization).

Python Initialization C API
===========================
This PEP proposes to add the following new structures, functions and macros.
New structures (4):
* ``PyConfig``
* ``PyInitError``
* ``PyPreConfig``
* ``PyWideStringList``

New functions (16):

* ``Py_PreInitialize(config)``
* ``Py_PreInitializeFromArgs(config, argc, argv)``
* ``Py_PreInitializeFromWideArgs(config, argc, argv)``
* ``PyWideStringList_Append(list, item)``
* ``PyConfig_DecodeLocale(config_str, str)``
* ``PyConfig_SetString(config_str, str)``
* ``PyConfig_Read(config)``
* ``PyConfig_SetArgv(config, argc, argv)``
* ``PyConfig_SetWideArgv(config, argc, argv)``
* ``PyConfig_Clear(config)``
* ``Py_InitializeFromConfig(config)``
* ``Py_InitializeFromArgs(config, argc, argv)``
* ``Py_InitializeFromWideArgs(config, argc, argv)``
* ``Py_UnixMain(argc, argv)``
* ``Py_ExitInitError(err)``
* ``Py_RunMain()``

New macros (9):

* ``PyPreConfig_INIT``
* ``PyConfig_INIT``
* ``Py_INIT_OK()``
* ``Py_INIT_ERR(MSG)``
* ``Py_INIT_NO_MEMORY()``
* ``Py_INIT_EXIT(EXITCODE)``
* ``Py_INIT_IS_ERROR(err)``
* ``Py_INIT_IS_EXIT(err)``
* ``Py_INIT_FAILED(err)``
This PEP also adds ``_PyRuntimeState.preconfig`` (``PyPreConfig`` type) and ``PyInterpreterState.config`` (``PyConfig`` type) fields to these internal structures. ``PyInterpreterState.config`` becomes the new reference configuration, replacing global configuration variables and other private variables.

PyWideStringList
----------------

``PyWideStringList`` is a list of ``wchar_t*`` strings.

Example to initialize a string list from a C static array::

    static wchar_t* argv[2] = {
        L"-c",
        L"pass",
    };
    PyWideStringList config_argv = PyWideStringList_INIT;
    config_argv.length = Py_ARRAY_LENGTH(argv);
    config_argv.items = argv;

``PyWideStringList`` structure fields:

* ``length`` (``Py_ssize_t``)
* ``items`` (``wchar_t**``)
Methods:
* ``PyInitError PyWideStringList_Append(PyWideStringList *list, const wchar_t *item)``: Append *item* to *list*.
If *length* is non-zero, *items* must be non-NULL and all strings must be non-NULL.
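
As a rough illustration of the append semantics (assuming, as the "dynamic" ``PyConfig`` description suggests, that the list stores its own copy of each appended string), here is a minimal sketch in plain C. It does not use the CPython API: ``WideList``, ``widelist_append()`` and ``widelist_clear()`` are hypothetical names, and ``size_t`` stands in for ``Py_ssize_t``.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for PyWideStringList; not the CPython type. */
typedef struct {
    size_t length;
    wchar_t **items;
} WideList;

/* Append a copy of item to list.
   Return 0 on success, -1 on allocation failure (list left unchanged). */
static int widelist_append(WideList *list, const wchar_t *item)
{
    assert(list != NULL && item != NULL);

    /* Duplicate the string so the list owns its memory */
    size_t nchars = wcslen(item) + 1;
    wchar_t *copy = malloc(nchars * sizeof(wchar_t));
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, item, nchars * sizeof(wchar_t));

    /* Grow the array by one slot */
    wchar_t **items = realloc(list->items,
                              (list->length + 1) * sizeof(wchar_t *));
    if (items == NULL) {
        free(copy);
        return -1;
    }
    items[list->length] = copy;
    list->items = items;
    list->length++;
    return 0;
}

/* Release every copied string and the array itself. */
static void widelist_clear(WideList *list)
{
    for (size_t i = 0; i < list->length; i++) {
        free(list->items[i]);
    }
    free(list->items);
    list->items = NULL;
    list->length = 0;
}
```

A list built this way must be cleared by its owner, which is the same reason a dynamic ``PyConfig`` needs ``PyConfig_Clear()``; a list pointing at a static array, as in the example above, must not be cleared this way.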

PyInitError
-----------
``PyInitError`` is a structure to store an error message or an exit code for the Python Initialization. For an error, it stores the C function name which created the error.
Example::
    PyInitError alloc(void **ptr, size_t size)
    {
        *ptr = PyMem_RawMalloc(size);
        if (*ptr == NULL) {
            return Py_INIT_NO_MEMORY();
        }
        return Py_INIT_OK();
    }

    int main(int argc, char **argv)
    {
        void *ptr;
        PyInitError err = alloc(&ptr, 16);
        if (Py_INIT_FAILED(err)) {
            Py_ExitInitError(err);
        }
        /* Memory from PyMem_RawMalloc() is released with PyMem_RawFree() */
        PyMem_RawFree(ptr);
        return 0;
    }
``PyInitError`` fields:
* ``exitcode`` (``int``): argument passed to ``exit()`` on Unix and to ``ExitProcess()`` on Windows. Only set by ``Py_INIT_EXIT()``.
* ``err_msg`` (``const char*``): error message
* private ``_func`` field: used by ``Py_INIT_ERR()`` to store the C function name which created the error.
* private ``_type`` field: for internal usage only.

Macros to create a ``PyInitError``:

* ``Py_INIT_OK()``: success
* ``Py_INIT_ERR(err_msg)``: initialization error with a message
* ``Py_INIT_NO_MEMORY()``: memory allocation failure (out of memory)
* ``Py_INIT_EXIT(exitcode)``: exit Python with the specified exit code

Other macros and functions:

* ``Py_INIT_IS_ERROR(err)``: Is the result an error?
* ``Py_INIT_IS_EXIT(err)``: Is the result an exit?
* ``Py_INIT_FAILED(err)``: Is the result an error or an exit? Similar to ``Py_INIT_IS_ERROR(err) || Py_INIT_IS_EXIT(err)``.
* ``Py_ExitInitError(err)``: Call ``exit(exitcode)`` on Unix or ``ExitProcess(exitcode)`` on Windows if the result is an exit, call ``Py_FatalError(err_msg)`` if the result is an error. Must not be called if the result is a success.
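
The three-state result described above (success, error with a message, or exit with a code) can be pictured with a small tagged structure. The following is only a sketch in plain C under that assumption: ``InitResult``, the ``RESULT_MAKE_*`` macros, the ``result_*`` helpers and ``handle_option()`` are hypothetical names and do not reflect the actual layout of ``PyInitError``.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of the PyInitError idea; names are illustrative. */
typedef enum { RESULT_OK, RESULT_ERROR, RESULT_EXIT } ResultType;

typedef struct {
    ResultType type;
    const char *func;     /* function which created the error, or NULL */
    const char *err_msg;  /* error message, or NULL */
    int exitcode;         /* only meaningful for RESULT_EXIT */
} InitResult;

/* __func__ records the name of the function where the macro is expanded */
#define RESULT_MAKE_OK()       ((InitResult){RESULT_OK, NULL, NULL, 0})
#define RESULT_MAKE_ERR(MSG)   ((InitResult){RESULT_ERROR, __func__, (MSG), 0})
#define RESULT_MAKE_EXIT(CODE) ((InitResult){RESULT_EXIT, NULL, NULL, (CODE)})

static int result_is_error(InitResult r) { return r.type == RESULT_ERROR; }
static int result_is_exit(InitResult r)  { return r.type == RESULT_EXIT; }
/* "Failed" means anything other than success: an error or an exit */
static int result_failed(InitResult r)   { return r.type != RESULT_OK; }

/* Toy option handler: 'h' requests a clean exit, 'v' succeeds,
   anything else is reported as an error. */
static InitResult handle_option(char opt)
{
    switch (opt) {
    case 'h':
        return RESULT_MAKE_EXIT(0);
    case 'v':
        return RESULT_MAKE_OK();
    default:
        return RESULT_MAKE_ERR("unknown option");
    }
}
```

Treating an exit request as a "failure" lets callers use a single check to stop the initialization, while still distinguishing a clean exit from a fatal error afterwards.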

Pre-Initialization with PyPreConfig
-----------------------------------

``PyPreConfig`` structure is used to pre-initialize Python:

* Set the memory allocator
* Configure the LC_CTYPE locale
* Set the UTF-8 mode
Example using the pre-initialization to enable the UTF-8 Mode::
    PyPreConfig preconfig = PyPreConfig_INIT;
    preconfig.utf8_mode = 1;

    PyInitError err = Py_PreInitialize(&preconfig);
    if (Py_INIT_FAILED(err)) {
        Py_ExitInitError(err);
    }

    /* at this point, Python will speak UTF-8 */

    Py_Initialize();
    /* ... use Python API here ... */
    Py_Finalize();
Functions to pre-initialize Python:
* ``PyInitError Py_PreInitialize(const PyPreConfig *config)``
* ``PyInitError Py_PreInitializeFromArgs(const PyPreConfig *config, int argc, char **argv)``
* ``PyInitError Py_PreInitializeFromWideArgs(const PyPreConfig *config, int argc, wchar_t **argv)``

If Python should be pre-initialized explicitly first and then initialized with command line arguments, it is possible to pass these command line arguments to the pre-initialization since they impact the encodings. For example, ``-X utf8`` enables the UTF-8 Mode.

These functions can be called with *config* set to ``NULL``. The caller is responsible for handling errors using ``Py_INIT_FAILED()`` and ``Py_ExitInitError()``.

``PyPreConfig`` fields:

* ``allocator`` (``char*``): name of the memory allocator (ex: ``"malloc"``)
* ``coerce_c_locale_warn`` (``int``): if non-zero, emit a warning if the C locale is coerced.
* ``coerce_c_locale`` (``int``): if equal to 2, coerce the C locale; if equal to 1, read the LC_CTYPE locale to decide whether it should be coerced.
* ``dev_mode`` (``int``): see ``PyConfig.dev_mode``
* ``isolated`` (``int``): see ``PyConfig.isolated``
* ``legacy_windows_fs_encoding`` (``int``, Windows only): if non-zero, set the Python filesystem encoding to ``"mbcs"``.
* ``use_environment`` (``int``): see ``PyConfig.use_environment``
* ``utf8_mode`` (``int``): if non-zero, enable the UTF-8 mode

There is also a private field which is for internal use only:

* ``_config_version`` (``int``): Configuration version, used for ABI compatibility
The C locale coercion (PEP 538) and the UTF-8 Mode (PEP 540) are disabled by default in ``PyPreConfig``. Set ``coerce_c_locale``, ``coerce_c_locale_warn`` and ``utf8_mode`` to ``-1`` to let Python enable them depending on the user configuration.

Initialization with PyConfig
----------------------------
The ``PyConfig`` structure contains all parameters to configure Python.
Example::
    PyInitError err;
    PyConfig config = PyConfig_INIT;

    err = PyConfig_SetString(&config.program_name, L"my_program");
    if (Py_INIT_FAILED(err)) {
        Py_ExitInitError(err);
    }

    err = Py_InitializeFromConfig(&config);
    PyConfig_Clear(&config);

    if (Py_INIT_FAILED(err)) {
        Py_ExitInitError(err);
    }
``PyConfig`` methods:
* ``PyInitError PyConfig_SetString(wchar_t **config_str, const wchar_t *str)``: Set a config wide string field from *str* (copy the string)
* ``PyInitError PyConfig_DecodeLocale(wchar_t **config_str, const char *str)``: Decode *str* using ``Py_DecodeLocale()`` and set the result into ``*config_str``. Pre-initialize Python if needed to ensure that encodings are properly configured.
* ``PyInitError PyConfig_SetArgv(PyConfig *config, int argc, char **argv)``: Set command line arguments (decode bytes). Pre-initialize Python if needed to ensure that encodings are properly configured.
* ``PyInitError PyConfig_SetWideArgv(PyConfig *config, int argc, wchar_t **argv)``: Set command line arguments (wide characters).
* ``PyInitError PyConfig_Read(PyConfig *config)``: Read all Python configuration
* ``void PyConfig_Clear(PyConfig *config)``: Release memory

Functions to initialize Python:

* ``PyInitError Py_InitializeFromConfig(const PyConfig *config)``

This function can be called with *config* set to ``NULL``. The caller is responsible for handling errors using ``Py_INIT_FAILED()`` and ``Py_ExitInitError()``.
PyConfig fields:
* ``argv`` (``PyWideStringList``): ``sys.argv``
* ``base_exec_prefix`` (``wchar_t*``): ``sys.base_exec_prefix``
* ``base_prefix`` (``wchar_t*``): ``sys.base_prefix``
* ``buffered_stdio`` (``int``): if equal to 0, enable unbuffered mode, making the stdout and stderr streams unbuffered.
* ``bytes_warning`` (``int``): if equal to 1, issue a warning when comparing ``bytes`` or ``bytearray`` with ``str``, or comparing ``bytes`` with ``int``. If greater than or equal to 2, raise a ``BytesWarning`` exception.
* ``check_hash_pycs_mode`` (``wchar_t*``): ``--check-hash-based-pycs`` command line option value (see PEP 552)
* ``dev_mode`` (``int``): Development mode
* ``dll_path`` (``wchar_t*``, Windows only): Windows DLL path
* ``dump_refs`` (``int``): if non-zero, display all objects still alive at exit
* ``exec_prefix`` (``wchar_t*``): ``sys.exec_prefix``
* ``executable`` (``wchar_t*``): ``sys.executable``
* ``faulthandler`` (``int``): if non-zero, call ``faulthandler.enable()``
* ``filesystem_encoding`` (``wchar_t*``): Filesystem encoding, ``sys.getfilesystemencoding()``
* ``filesystem_errors`` (``wchar_t*``): Filesystem encoding errors, ``sys.getfilesystemencodeerrors()``
* ``use_hash_seed`` (``int``), ``hash_seed`` (``unsigned long``): randomized hash function seed
* ``home`` (``wchar_t*``): Python home
* ``import_time`` (``int``): if non-zero, profile import time
* ``inspect`` (``int``): enter interactive mode after executing a script or a command
* ``install_signal_handlers`` (``int``): install signal handlers?
* ``interactive`` (``int``): interactive mode
* ``legacy_windows_stdio`` (``int``, Windows only): if non-zero, use ``io.FileIO`` instead of ``WindowsConsoleIO`` for ``sys.stdin``, ``sys.stdout`` and ``sys.stderr``.
* ``malloc_stats`` (``int``): if non-zero, dump memory allocation statistics at exit
* ``module_search_path_env`` (``wchar_t*``): ``PYTHONPATH`` environment variable value
* ``use_module_search_paths`` (``int``), ``module_search_paths`` (``PyWideStringList``): ``sys.path``
* ``optimization_level`` (``int``): compilation optimization level
* ``parser_debug`` (``int``): if non-zero, turn on parser debugging output (for expert only, depending on compilation options).
* ``prefix`` (``wchar_t*``): ``sys.prefix``
* ``program_name`` (``wchar_t*``): Program name
* ``program`` (``wchar_t*``): ``argv[0]`` or an empty string
* ``pycache_prefix`` (``wchar_t*``): ``.pyc`` cache prefix
* ``quiet`` (``int``): quiet mode (ex: don't display the copyright and version messages even in interactive mode)
* ``run_command`` (``wchar_t*``): ``-c COMMAND`` argument
* ``run_filename`` (``wchar_t*``): ``python3 SCRIPT`` argument
* ``run_module`` (``wchar_t*``): ``python3 -m MODULE`` argument
* ``show_alloc_count`` (``int``): show allocation counts at exit?
* ``show_ref_count`` (``int``): show total reference count at exit?
* ``site_import`` (``int``): import the ``site`` module at startup?
* ``skip_source_first_line`` (``int``): skip the first line of the source
* ``stdio_encoding`` (``wchar_t*``), ``stdio_errors`` (``wchar_t*``): encoding and encoding errors of ``sys.stdin``, ``sys.stdout`` and ``sys.stderr``
* ``tracemalloc`` (``int``): if non-zero, call ``tracemalloc.start(value)``
* ``user_site_directory`` (``int``): if non-zero, add user site directory to ``sys.path``
* ``verbose`` (``int``): if non-zero, enable verbose mode
* ``warnoptions`` (``PyWideStringList``): options of the ``warnings`` module to build filters
* ``write_bytecode`` (``int``): if non-zero, write ``.pyc`` files
* ``xoptions`` (``PyWideStringList``): ``sys._xoptions``

There are also private fields which are for internal use only:

* ``_config_version`` (``int``): Configuration version, used for ABI compatibility
* ``_frozen`` (``int``): Emit warning when computing the path configuration?
* ``_install_importlib`` (``int``): Install importlib?

More complete commented example that modifies the configuration before calling ``PyConfig_Read()`` and then modifies the read configuration::

    PyInitError init_python(const char *program_name)
    {
        PyInitError err;
        PyConfig config = PyConfig_INIT;

        /* Set the program name before reading the configuration
           (decode byte string from the locale encoding) */
        err = PyConfig_DecodeLocale(&config.program_name,
                                    program_name);
        if (Py_INIT_FAILED(err)) {
            goto fail;
        }

        /* Read all configuration at once */
        err = PyConfig_Read(&config);
        if (Py_INIT_FAILED(err)) {
            goto fail;
        }

        /* Append our custom search path to sys.path */
        err = PyWideStringList_Append(&config.module_search_paths,
                                      L"/path/to/more/modules");
        if (Py_INIT_FAILED(err)) {
            goto fail;
        }

        /* Override executable computed by PyConfig_Read() */
        err = PyConfig_SetString(&config.executable, L"my_executable");
        if (Py_INIT_FAILED(err)) {
            goto fail;
        }

        err = Py_InitializeFromConfig(&config);

        /* Py_InitializeFromConfig() copied config which must now be
           cleared to release memory */
        PyConfig_Clear(&config);

        return err;

    fail:
        PyConfig_Clear(&config);
        Py_ExitInitError(err);
    }
.. note:: ``PyConfig`` does not have any field for extra inittab functions: ``PyImport_AppendInittab()`` and ``PyImport_ExtendInittab()`` functions are still relevant.

Initialization with static PyConfig
-----------------------------------
When no ``PyConfig`` method is used but only ``Py_InitializeFromConfig()``, the caller is responsible for managing ``PyConfig`` memory which means that static strings and static string lists can be used rather than using dynamically allocated memory. It can be used for most simple configurations.
Example of Python initialization enabling the isolated mode::
    PyConfig config = PyConfig_INIT;
    config.isolated = 1;

    PyInitError err = Py_InitializeFromConfig(&config);
    if (Py_INIT_FAILED(err)) {
        Py_ExitInitError(err);
    }
    /* ... use Python API here ... */
    Py_Finalize();

In this example, ``PyConfig_Clear()`` is not needed since ``config`` does not contain any dynamically allocated string: ``Py_InitializeFromConfig()`` is responsible for filling other fields and managing the memory.
For convenience, two other functions are provided:
* ``PyInitError Py_InitializeFromArgs(const PyConfig *config, int argc, char **argv)``
* ``PyInitError Py_InitializeFromWideArgs(const PyConfig *config, int argc, wchar_t **argv)``
These functions can be used with static ``PyConfig``.
Pseudo-code of ``Py_InitializeFromArgs()``::
    PyInitError init_with_args(const PyConfig *src_config,
                               int argc, char **argv)
    {
        PyInitError err;
        PyConfig config = PyConfig_INIT;

        /* Copy strings and string lists
         * (memory dynamically allocated on the heap) */
        err = _PyConfig_Copy(&config, src_config);
        if (Py_INIT_FAILED(err)) {
            goto exit;
        }

        /* Set config.argv: decode argv bytes. Pre-initialize Python
           if needed to ensure that the encodings are properly
           configured. */
        err = PyConfig_SetArgv(&config, argc, argv);
        if (Py_INIT_FAILED(err)) {
            goto exit;
        }

        err = Py_InitializeFromConfig(&config);

    exit:
        PyConfig_Clear(&config);
        return err;
    }
where ``_PyConfig_Copy()`` is an internal function. The actual implementation of ``Py_InitializeFromArgs()`` is more complex.

Py_UnixMain()
-------------

Python 3.7 provides a high-level ``Py_Main()`` function which requires passing command line arguments as ``wchar_t*`` strings. It is non-trivial to use the correct encoding to decode bytes. Python has its own set of issues with C locale coercion and UTF-8 Mode.

This PEP adds a new ``Py_UnixMain()`` function which takes command line arguments as bytes::

    int Py_UnixMain(int argc, char **argv)

Py_RunMain()
------------
The new ``Py_RunMain()`` function executes the command (``PyConfig.run_command``), the script (``PyConfig.run_filename``) or the module (``PyConfig.run_module``) specified on the command line or in the configuration, and then finalizes Python. It returns an exit status that can be passed to the ``exit()`` function.
Example of custom Python executable always running in isolated mode::
    #include <Python.h>

    int main(int argc, char *argv[])
    {
        PyConfig config = PyConfig_INIT;
        config.isolated = 1;

        PyInitError err = Py_InitializeFromArgs(&config, argc, argv);
        if (Py_INIT_FAILED(err)) {
            Py_ExitInitError(err);
        }

        /* put more configuration code here if needed */

        return Py_RunMain();
    }
The example is a basic implementation of the "System Python Executable" discussed in PEP 432.

Memory allocations and Py_DecodeLocale()
----------------------------------------

Python memory allocation functions like ``PyMem_RawMalloc()`` must not be used before Python pre-initialization. Calling ``malloc()`` and ``free()`` directly is always safe.

For ``PyPreConfig`` and static ``PyConfig``, the caller is responsible for managing dynamically allocated strings, but static strings and static string lists are fine.

Dynamic ``PyConfig`` requires calling ``PyConfig_Clear()`` to release memory.

``Py_DecodeLocale()`` must not be called before the pre-initialization.

When using dynamic configuration, ``PyConfig_DecodeLocale()`` must be used instead of ``Py_DecodeLocale()``.
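
To make the "decode when the field is set" idea concrete, here is a minimal sketch in standard C that decodes a byte string with the current ``LC_CTYPE`` locale into a freshly allocated wide string and reports a decoding error immediately. It is not how ``PyConfig_DecodeLocale()`` is implemented: ``set_wide_from_bytes()`` is a hypothetical helper, it relies on ``mbstowcs()`` rather than ``Py_DecodeLocale()``, and it has no ``surrogateescape`` fallback.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Decode src using the current LC_CTYPE locale and store a newly
   allocated wide string in *dst, releasing the previous value.
   *dst must be NULL or a pointer previously returned by malloc().
   Return 0 on success, -1 on a decoding error, -2 if out of memory. */
static int set_wide_from_bytes(wchar_t **dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    /* A multibyte string never decodes to more wide characters than
       it has bytes, so strlen(src) + 1 slots are always enough. */
    size_t cap = strlen(src) + 1;
    wchar_t *buf = malloc(cap * sizeof(wchar_t));
    if (buf == NULL) {
        return -2;
    }

    size_t n = mbstowcs(buf, src, cap);
    if (n == (size_t)-1) {
        /* Invalid byte sequence for the current locale */
        free(buf);
        return -1;
    }

    free(*dst);
    *dst = buf;
    return 0;
}
```

Because the error is returned by the setter, the caller learns about an undecodable value at the point where it was supplied rather than later during initialization.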

Backwards Compatibility
=======================
This PEP only adds a new API: it leaves the existing API unchanged and has no impact on the backwards compatibility.

Annex: Python Configuration
===========================

Priority and Rules
------------------

Priority of configuration parameters, highest to lowest:

* ``PyConfig``
* ``PyPreConfig``
* Configuration files
* Command line options
* Environment variables
* Global configuration variables
Priority of warning options, highest to lowest:
* ``PyConfig.warnoptions``
* ``PyConfig.dev_mode`` (add ``"default"``)
* ``PYTHONWARNINGS`` environment variable
* ``-W WARNOPTION`` command line argument
* ``PyConfig.bytes_warning`` (add ``"error::BytesWarning"`` if greater than 1, or add ``"default::BytesWarning"``)
Rules on ``PyConfig`` and ``PyPreConfig`` parameters:
* If ``isolated`` is non-zero, ``use_environment`` and ``user_site_directory`` are set to 0
* If ``legacy_windows_fs_encoding`` is non-zero, ``utf8_mode`` is set to 0
* If ``dev_mode`` is non-zero, ``allocator`` is set to ``"debug"``, ``faulthandler`` is set to 1, and ``"default"`` filter is added to ``warnoptions``. But ``PYTHONMALLOC`` has the priority over ``dev_mode`` to set the memory allocator.
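
The rules above can be read as a small normalization pass over the configuration. The sketch below is only an illustration in plain C: ``ConfigSketch`` and ``apply_rules()`` are hypothetical, the structure merges a few ``PyConfig`` and ``PyPreConfig`` fields for brevity, the ``PYTHONMALLOC`` value is assumed to be passed in by the caller (``NULL`` when unset or ignored), and the ``warnoptions`` part of the ``dev_mode`` rule is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of the fields named in the rules above. */
typedef struct {
    int isolated;
    int use_environment;
    int user_site_directory;
    int legacy_windows_fs_encoding;
    int utf8_mode;
    int dev_mode;
    int faulthandler;
    const char *allocator;   /* NULL means "not chosen yet" */
} ConfigSketch;

/* Apply the rules in place. pythonmalloc is the PYTHONMALLOC value,
   or NULL when the variable is unset or must be ignored. */
static void apply_rules(ConfigSketch *c, const char *pythonmalloc)
{
    assert(c != NULL);

    if (c->isolated) {
        c->use_environment = 0;
        c->user_site_directory = 0;
    }
    if (c->legacy_windows_fs_encoding) {
        c->utf8_mode = 0;
    }
    /* PYTHONMALLOC has priority over dev_mode for the allocator */
    if (pythonmalloc != NULL && pythonmalloc[0] != '\0') {
        c->allocator = pythonmalloc;
    }
    if (c->dev_mode) {
        c->faulthandler = 1;
        if (c->allocator == NULL) {
            c->allocator = "debug";
        }
    }
}
```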

Configuration Files
-------------------

Python configuration files:

* ``pyvenv.cfg``
* ``python._pth`` (Windows only)
* ``pybuilddir.txt`` (Unix only)

Global Configuration Variables
------------------------------

Global configuration variables mapped to ``PyPreConfig`` fields:

========================================  ================================
Variable                                  Field
========================================  ================================
``Py_LegacyWindowsFSEncodingFlag``        ``legacy_windows_fs_encoding``
``Py_UTF8Mode``                           ``utf8_mode``
========================================  ================================
Global configuration variables mapped to ``PyConfig`` fields:

========================================  ================================
Variable                                  Field
========================================  ================================
``Py_BytesWarningFlag``                   ``bytes_warning``
``Py_DebugFlag``                          ``parser_debug``
``Py_DontWriteBytecodeFlag``              ``write_bytecode``
``Py_FileSystemDefaultEncodeErrors``      ``filesystem_errors``
``Py_FileSystemDefaultEncoding``          ``filesystem_encoding``
``Py_FrozenFlag``                         ``_frozen``
``Py_HasFileSystemDefaultEncoding``       ``filesystem_encoding``
``Py_HashRandomizationFlag``              ``use_hash_seed``, ``hash_seed``
``Py_IgnoreEnvironmentFlag``              ``use_environment``
``Py_InspectFlag``                        ``inspect``
``Py_InteractiveFlag``                    ``interactive``
``Py_IsolatedFlag``                       ``isolated``
``Py_LegacyWindowsStdioFlag``             ``legacy_windows_stdio``
``Py_NoSiteFlag``                         ``site_import``
``Py_NoUserSiteDirectory``                ``user_site_directory``
``Py_OptimizeFlag``                       ``optimization_level``
``Py_QuietFlag``                          ``quiet``
``Py_UnbufferedStdioFlag``                ``buffered_stdio``
``Py_VerboseFlag``                        ``verbose``
``_Py_HasFileSystemDefaultEncodeErrors``  ``filesystem_errors``
========================================  ================================
``Py_LegacyWindowsFSEncodingFlag`` and ``Py_LegacyWindowsStdioFlag`` are only available on Windows.

Command Line Arguments
----------------------

Usage::

    python3 [options]
    python3 [options] -c COMMAND
    python3 [options] -m MODULE
    python3 [options] SCRIPT

Command line options mapped to pseudo-action on ``PyPreConfig`` fields:

================================  ================================
Option                            ``PyPreConfig`` field
================================  ================================
``-X dev``                        ``dev_mode = 1``
``-X utf8=N``                     ``utf8_mode = N``
================================  ================================
Command line options mapped to pseudo-action on ``PyConfig`` fields:

================================  ================================
Option                            ``PyConfig`` field
================================  ================================
``-b``                            ``bytes_warning++``
``-B``                            ``write_bytecode = 0``
``-c COMMAND``                    ``run_command = COMMAND``
``--check-hash-based-pycs=MODE``  ``check_hash_pycs_mode = MODE``
``-d``                            ``parser_debug++``
``-E``                            ``use_environment = 0``
``-i``                            ``inspect++`` and ``interactive++``
``-I``                            ``isolated = 1``
``-m MODULE``                     ``run_module = MODULE``
``-O``                            ``optimization_level++``
``-q``                            ``quiet++``
``-R``                            ``use_hash_seed = 0``
``-s``                            ``user_site_directory = 0``
``-S``                            ``site_import = 0``
``-t``                            ignored (kept for backwards compatibility)
``-u``                            ``buffered_stdio = 0``
``-v``                            ``verbose++``
``-W WARNING``                    add ``WARNING`` to ``warnoptions``
``-x``                            ``skip_source_first_line = 1``
``-X XOPTION``                    add ``XOPTION`` to ``xoptions``
``-X dev``                        ``dev_mode = 1``
``-X faulthandler``               ``faulthandler = 1``
``-X importtime``                 ``import_time = 1``
``-X pycache_prefix=PREFIX``      ``pycache_prefix = PREFIX``
``-X show_alloc_count``           ``show_alloc_count = 1``
``-X show_ref_count``             ``show_ref_count = 1``
``-X tracemalloc=N``              ``tracemalloc = N``
================================  ================================
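
The "pseudo-action" notation in the table can be read as small updates to integer fields: some options increment a counter, others force a field to 0 or 1. A minimal sketch in plain C for a handful of short options follows. ``FlagSketch``, ``FLAG_SKETCH_INIT``, ``apply_flag()`` and ``apply_flags()`` are hypothetical, the default values are assumptions made for the sketch, and ``-S`` is assumed to clear ``site_import``.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of PyConfig integer fields. */
typedef struct {
    int bytes_warning;
    int write_bytecode;
    int use_environment;
    int isolated;
    int optimization_level;
    int site_import;
    int buffered_stdio;
    int verbose;
} FlagSketch;

/* Defaults assumed for this sketch only. */
static const FlagSketch FLAG_SKETCH_INIT = {
    .bytes_warning = 0,
    .write_bytecode = 1,
    .use_environment = 1,
    .isolated = 0,
    .optimization_level = 0,
    .site_import = 1,
    .buffered_stdio = 1,
    .verbose = 0,
};

/* Apply one short option letter.
   Return 0 if the letter is recognized, -1 otherwise. */
static int apply_flag(FlagSketch *f, char opt)
{
    switch (opt) {
    case 'b': f->bytes_warning++; break;        /* counter */
    case 'B': f->write_bytecode = 0; break;
    case 'E': f->use_environment = 0; break;
    case 'I': f->isolated = 1; break;
    case 'O': f->optimization_level++; break;   /* counter */
    case 'S': f->site_import = 0; break;
    case 'u': f->buffered_stdio = 0; break;
    case 'v': f->verbose++; break;              /* counter */
    default:
        return -1;
    }
    return 0;
}

/* Apply a string of option letters, stopping at the first unknown one. */
static int apply_flags(FlagSketch *f, const char *letters)
{
    assert(f != NULL && letters != NULL);
    for (; *letters != '\0'; letters++) {
        if (apply_flag(f, *letters) < 0) {
            return -1;
        }
    }
    return 0;
}
```

For instance, applying the letters ``bbO`` to the assumed defaults leaves ``bytes_warning`` at 2 and ``optimization_level`` at 1, matching the ``++`` entries in the table.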
``-h``, ``-?`` and ``-V`` options are handled outside ``PyConfig``.

Environment Variables
---------------------

Environment variables mapped to ``PyPreConfig`` fields:

=================================  =============================================
Variable                           ``PyPreConfig`` field
=================================  =============================================
``PYTHONCOERCECLOCALE``            ``coerce_c_locale``, ``coerce_c_locale_warn``
``PYTHONDEVMODE``                  ``dev_mode``
``PYTHONLEGACYWINDOWSFSENCODING``  ``legacy_windows_fs_encoding``
``PYTHONMALLOC``                   ``allocator``
``PYTHONUTF8``                     ``utf8_mode``
=================================  =============================================
Environment variables mapped to ``PyConfig`` fields:

=================================  ====================================
Variable                           ``PyConfig`` field
=================================  ====================================
``PYTHONDEBUG``                    ``parser_debug``
``PYTHONDEVMODE``                  ``dev_mode``
``PYTHONDONTWRITEBYTECODE``        ``write_bytecode``
``PYTHONDUMPREFS``                 ``dump_refs``
``PYTHONEXECUTABLE``               ``program_name``
``PYTHONFAULTHANDLER``             ``faulthandler``
``PYTHONHASHSEED``                 ``use_hash_seed``, ``hash_seed``
``PYTHONHOME``                     ``home``
``PYTHONINSPECT``                  ``inspect``
``PYTHONIOENCODING``               ``stdio_encoding``, ``stdio_errors``
``PYTHONLEGACYWINDOWSSTDIO``       ``legacy_windows_stdio``
``PYTHONMALLOCSTATS``              ``malloc_stats``
``PYTHONNOUSERSITE``               ``user_site_directory``
``PYTHONOPTIMIZE``                 ``optimization_level``
``PYTHONPATH``                     ``module_search_path_env``
``PYTHONPROFILEIMPORTTIME``        ``import_time``
``PYTHONPYCACHEPREFIX``            ``pycache_prefix``
``PYTHONTRACEMALLOC``              ``tracemalloc``
``PYTHONUNBUFFERED``               ``buffered_stdio``
``PYTHONVERBOSE``                  ``verbose``
``PYTHONWARNINGS``                 ``warnoptions``
=================================  ====================================
``PYTHONLEGACYWINDOWSFSENCODING`` and ``PYTHONLEGACYWINDOWSSTDIO`` are specific to Windows.
``PYTHONDEVMODE`` is mapped to ``PyPreConfig.dev_mode`` and ``PyConfig.dev_mode``.

Annex: Python 3.7 API
=====================

Python 3.7 has 4 functions in its C API to initialize and finalize Python:

* ``Py_Initialize()``, ``Py_InitializeEx()``: initialize Python
* ``Py_Finalize()``, ``Py_FinalizeEx()``: finalize Python
Python can be configured using scattered global configuration variables (like ``Py_IgnoreEnvironmentFlag``) and using the following functions:
* ``PyImport_AppendInittab()``
* ``PyImport_ExtendInittab()``
* ``PyMem_SetAllocator()``
* ``PyMem_SetupDebugHooks()``
* ``PyObject_SetArenaAllocator()``
* ``Py_SetPath()``
* ``Py_SetProgramName()``
* ``Py_SetPythonHome()``
* ``Py_SetStandardStreamEncoding()``
* ``PySys_AddWarnOption()``
* ``PySys_AddXOption()``
* ``PySys_ResetWarnOptions()``
There is also a high-level ``Py_Main()`` function.

Copyright
=========
This document has been placed in the public domain.

On Thursday, May 02, 2019 Victor Stinner vstinner@redhat.com wrote:
According to this
- ``run_command`` (``wchar_t*``): ``-c COMMAND`` argument
- ``run_filename`` (``wchar_t*``): ``python3 SCRIPT`` argument
- ``run_module`` (``wchar_t*``): ``python3 -m MODULE`` argument
this
``-c COMMAND`` ``run_module = COMMAND``
should read "run_command = COMMAND". Typo, not?

On Thu, May 2, 2019 at 16:20, Edwin Zimmerman edwin@211mainstreet.net wrote:
``-c COMMAND`` ``run_module = COMMAND``
should read "run_command = COMMAND". Typo, not?
Oops, you're right: it's a typo. Now fixed:
``-c COMMAND`` ``run_command = COMMAND``
Victor

On Fri, May 3, 2019 at 4:59, Victor Stinner vstinner@redhat.com wrote:
- PyConfig now only uses wchar_t* for strings (unicode): char* (bytes)
is no longer used. I had to hack CPython internals for that :-)
I prefer char* to wchar_t* on Unix. Since UTF-8 has dominated the Unix world in recent decades, wchar_t* is less usable on Unix nowadays.
Is it impossible to use just char* on Unix and wchar_t* on Windows?

Hi INADA-san,
This PEP is the result of 2 years of refactoring to *simplify* the *implementation*. I agree that bytes string is the native type on Unix. But. On Windows, Unicode is the native type. On Python 3, Unicode is the native type. One key of the simplified implementation is the unique PyConfig structure. It means that all platforms have to use the same types.
I love the idea of using only wchar_t* for PyConfig because it makes Python initialization more reliable. The question of the encoding used to decode byte strings and any possible decoding error (very unlikely thanks to surrogateescape) is better defined: it occurs when you set the parameter, not "later during init".
The PEP adds Py_UnixMain() for most trivial use cases, and PyConfig_DecodeLocale() and PyConfig_SetArgv() for more advanced cases.
Victor

Hi,
First of all, I just found an old issue that will be solved by my PEP 587 :-)
Add Py_SetFatalErrorAbortFunc: Allow embedding program to handle fatal errors https://bugs.python.org/issue30560
I studied the code of applications embedding Python. Most of them have to decode byte strings to get wchar_t* to set home, argv, program name, etc. I'm not sure that they use the "correct" encoding, especially since Python 3.7 got UTF-8 Mode (PEP 540) and C locale coercion (PEP 538).
I tried to convert the source code of each project into pseudo-code which looks like C code used in CPython.
I removed all error handling code: look at each reference, the original code is usually way more complex.
Some projects have to wrap each function of the Python C API manually, which adds even more boilerplate code.
Some projects set/unset environment variables. Others prefer global configuration variables like Py_NoSiteFlag.
It seems like Py_FrozenFlag is commonly used. Maybe I should make the flag public and try to find it a better name:

    /* If greater than 0, suppress _PyPathConfig_Calculate() warnings.

       If set to -1 (default), inherit Py_FrozenFlag value. */
    int _frozen;

About pyinstaller which changes C standard stream buffering: Py_Initialize() now also does that when buffered_stdio=0. See config_init_stdio() in Python/coreconfig.c. Moreover, this function now *always* sets standard streams to O_BINARY mode on Windows. I'm not sure if it's correct or not.
Blender
-------

Pseudo-code of BPY_python_start::

    BLI_strncpy_wchar_from_utf8(program_path_wchar, BKE_appdir_program_path());
    Py_SetProgramName(program_path_wchar);
    PyImport_ExtendInittab(bpy_internal_modules);
    Py_SetPythonHome(py_path_bundle_wchar);
    Py_SetStandardStreamEncoding("utf-8", "surrogateescape");
    Py_NoSiteFlag = 1;
    Py_FrozenFlag = 1;
    Py_Initialize();

Ref: https://git.blender.org/gitweb/gitweb.cgi/blender.git/blob/HEAD:/source/blen...
fontforge
---------

Pseudo-code of fontforge when Python is used to run a script::

    Py_Initialize()
    for init_file in init_files:
        PyRun_SimpleFileEx(init_file)
    exitcode = Py_Main(arg, argv)
    Py_Finalize()
    exit(exitcode)

Ref: https://bugs.python.org/issue36204#msg337256
py2app
------

Pseudo-code::

    unsetenv("PYTHONOPTIMIZE");
    unsetenv("PYTHONDEBUG");
    unsetenv("PYTHONDONTWRITEBYTECODE");
    unsetenv("PYTHONIOENCODING");
    unsetenv("PYTHONDUMPREFS");
    unsetenv("PYTHONMALLOCSTATS");
    setenv("PYTHONDONTWRITEBYTECODE", "1", 1);
    setenv("PYTHONUNBUFFERED", "1", 1);
    setenv("PYTHONPATH", build_python_path(), 1);

    setlocale(LC_ALL, "en_US.UTF-8");
    mbstowcs(w_program, c_program, PATH_MAX+1);
    Py_SetProgramName(w_program);

    Py_Initialize()

    argv_new[0] = _Py_DecodeUTF8_surrogateescape(script, strlen(script));
    ...
    PySys_SetArgv(argc, argv_new);

    PyRun_SimpleFile(fp, script);
    Py_Finalize();

Ref: https://bitbucket.org/ronaldoussoren/py2app/src/default/py2app/apptemplate/s...
See also: https://bitbucket.org/ronaldoussoren/py2app/src/default/py2app/bundletemplat...
OpenOffice
----------

Pseudo-code of ``PythonInit``::

    mbstowcs(wide, home, PATH_MAX + 1);
    Py_SetPythonHome(wide);
    setenv("PYTHONPATH", getenv("PYTHONPATH") + ":" + path_bootstrap);
    PyImport_AppendInittab("pyuno", PyInit_pyuno);
    Py_DontWriteBytecodeFlag = 1;
    Py_Initialize();

Ref: pyuno/source/loader/pyuno_loader.cxx, see: https://docs.libreoffice.org/pyuno/html/pyuno__loader_8cxx_source.html
vim
---

Pseudo-code::

    mbstowcs(py_home_buf, p_py3home);
    Py_SetPythonHome(py_home_buf);
    PyImport_AppendInittab("vim", Py3Init_vim);
    Py_Initialize();

Ref: https://github.com/vim/vim/blob/master/src/if_python3.c
pyinstaller
-----------

Pseudo-code::

    pyi_locale_char2wchar(progname_w, status->archivename)
    SetProgramName(progname_w);

    pyi_locale_char2wchar(pyhome_w, status->mainpath)
    SetPythonHome(pyhome_w);

    pypath_w = build_path();
    Py_SetPath(pypath_w);

    Py_NoSiteFlag = 1;
    Py_FrozenFlag = 1;
    Py_DontWriteBytecodeFlag = 1;
    Py_NoUserSiteDirectory = 1;
    Py_IgnoreEnvironmentFlag = 1;
    Py_VerboseFlag = 0;
    Py_OptimizeFlag = 1;

    if (unbuffered) {
    #ifdef _WIN32
        _setmode(fileno(stdin), _O_BINARY);
        _setmode(fileno(stdout), _O_BINARY);
    #endif
        setbuf(stdin, (char *)NULL);
        setbuf(stdout, (char *)NULL);
        setbuf(stderr, (char *)NULL);
    }

    Py_Initialize();

    PySys_SetPath(pypath_w);

    PySys_SetArgvEx(argc, wargv, 0);

Ref: https://github.com/pyinstaller/pyinstaller/blob/1844d69f5aa1d64d3feca912ed16...
Victor

On 10May2019 1832, Victor Stinner wrote:
Hi,
First of all, I just found an old issue that will be solved by my PEP 587 :-)
Add Py_SetFatalErrorAbortFunc: Allow embedding program to handle fatal errors https://bugs.python.org/issue30560
Yes, this should be a feature of any redesigned embedding API.
I studied the code of applications embedding Python. Most of them have to decode byte strings to get wchar_t* to set home, argv, program name, etc. I'm not sure that they use the "correct" encoding, especially since Python 3.7 got UTF-8 Mode (PEP 540) and C locale coercion (PEP 538).
Unless you studied Windows-only applications embedding Python, _all_ of them will have had to decode strings into Unicode, since that's what our API expects.
All of the Windows-only applications I know of that embed Python are closed source, and none are owned by Red Hat. I'm going to assume you missed that entire segment of the ecosystem :)
But it also seems like perhaps we just need to expose a single API that does "decode this like CPython would" so that they can call it? We don't need a whole PEP or a widely publicised and discussed redesign of embedding to add this, and since it would solve a very real problem then we should just do it.
I tried to convert the source code of each project into pseudo-code which looks like C code used in CPython.
Thanks, this is helpful!
My take:
* all the examples are trying to be isolated from the system Python install (except Vim?)
* all the examples want to import some of their own modules before running user code
* nobody understands how to configure embedded Python :)
Also from my own work with/on other projects:
* embedders need to integrate native thread management with Python threads
* embedders want to use their own files/libraries
* embedders want to totally override getpath, not augment/configure it
Cheers, Steve

On Mon, 13 May 2019 at 18:28, Steve Dower steve.dower@python.org wrote:
My take:
- all the examples are trying to be isolated from the system Python
install (except Vim?)
"Isolation" means different things:
* ignore configuration files
* ignore environment variables
* custom path configuration (sys.path, sys.executable, etc.)
It seems like the most common need is to have a custom path configuration.
Py_IsolatedFlag isn't used. Only py2app manually ignores a few environment variables.
- all the examples want to import some of their own modules before
running user code
Well, running code between Py_Initialize() and running the final Python code is not new, and my PEP doesn't change anything here: it's still possible, as it was previously. You can use PyRun_SimpleFile() after Py_Initialize() for example.
Maybe I misunderstood your point.
- nobody understands how to configure embedded Python :)
Well, that's the problem I'm trying to solve by designing a homogeneous API, rather than scattered global configuration variables, environment variables, function calls, etc.
Also from my own work with/on other projects:
- embedders need to integrate native thread management with Python threads
Sorry, I don't see the relationship with the initialization.
- embedders want to use their own files/libraries
That's the path configuration, no?
- embedders want to totally override getpath, not augment/configure it
On Python 3.7, Py_SetPath() is the closest thing to a way of overriding the path configuration. But I don't see how to override sys.executable (Py_GetProgramFullPath), sys.prefix, sys.exec_prefix, or the (internal) dll_path.
In the examples that I found, SetProgramName(), SetPythonHome() and Py_SetPath() are called.
My PEP 587 makes it easy to bypass getpath.c/getpathp.c completely by explicitly setting:
* use_module_search_paths, module_search_paths
* executable
* prefix
* exec_prefix
* dll_path (Windows only)
If you set these fields, you fully control where Python looks for modules. Extract of the C code:
    /* Do we need to calculate the path? */
    if (!config->use_module_search_paths
        || (config->executable == NULL)
        || (config->prefix == NULL)
    #ifdef MS_WINDOWS
        || (config->dll_path == NULL)
    #endif
        || (config->exec_prefix == NULL))
    {
        _PyInitError err = _PyCoreConfig_CalculatePathConfig(config);
        if (_Py_INIT_FAILED(err)) {
            return err;
        }
    }
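To make the condition above easier to experiment with, here is a small standalone sketch of the same decision. It is not CPython code: path_fields_sketch and needs_path_calculation() are hypothetical names, the struct only mirrors the fields listed above, and the on_windows parameter stands in for the MS_WINDOWS #ifdef so that both variants can be exercised on any platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the path-related fields of the configuration.
   The real structure has many more fields. */
typedef struct {
    int use_module_search_paths;
    const wchar_t *executable;
    const wchar_t *prefix;
    const wchar_t *exec_prefix;
    const wchar_t *dll_path;    /* only consulted when on_windows is true */
} path_fields_sketch;

/* Return true if at least one field is still unset, which means the
   path configuration would have to be calculated. */
bool needs_path_calculation(const path_fields_sketch *cfg, bool on_windows)
{
    assert(cfg != NULL);

    if (!cfg->use_module_search_paths
        || cfg->executable == NULL
        || cfg->prefix == NULL
        || cfg->exec_prefix == NULL) {
        return true;
    }
    /* dll_path only matters in the Windows variant of the check. */
    return on_windows && cfg->dll_path == NULL;
}
```

An embedder that fills in every field gets false back, which corresponds to the calculation step being skipped entirely.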
OpenOffice doesn't bother with complex code, it just appends a path to PYTHONPATH:
setenv("PYTHONPATH", getenv("PYTHONPATH") + ":" + path_bootstrap);
It can use PyWideStringList_Append(&config.module_search_paths, path_bootstrap), as shown in one example of my PEP.
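For comparison, the PYTHONPATH pseudo-code above is not valid C as written (strings cannot be joined with "+", and getenv() may return NULL). A minimal sketch of what the environment variable approach needs in real C, assuming POSIX setenv() and ":" as the separator; append_to_pythonpath() is a hypothetical helper name, not something taken from OpenOffice or CPython:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append `dir` to the PYTHONPATH environment
   variable. Returns 0 on success, -1 on failure. */
int append_to_pythonpath(const char *dir)
{
    assert(dir != NULL);

    const char *old = getenv("PYTHONPATH");
    if (old == NULL || old[0] == '\0') {
        /* Nothing to preserve: the new directory is the whole value. */
        return setenv("PYTHONPATH", dir, 1);
    }

    /* old + ':' + dir + terminating NUL */
    size_t len = strlen(old) + 1 + strlen(dir) + 1;
    char *value = malloc(len);
    if (value == NULL) {
        return -1;
    }
    snprintf(value, len, "%s:%s", old, dir);

    /* setenv() copies the string, so the buffer can be freed afterwards. */
    int res = setenv("PYTHONPATH", value, 1);
    free(value);
    return res;
}
```

The memory management and the unset-variable case are exactly the kind of boilerplate that appending to module_search_paths avoids.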
Victor -- Night gathers, and now my watch begins. It shall not end until my death.

In response to all of your responses:
No need to take offense, I was merely summarising the research you posted in a way that looks more like scenarios or requirements. It's a typical software engineering task. Being able to collect snippets and let people draw their own conclusions is one thing, but those of us (including yourself) who are actively working in this area are totally allowed to present our analysis as well.
Given the raw material, the summary, and the recommendations, anyone else can do the same analysis and join the discussion, and that's what we're doing. But you can't simply present raw material and assume that people will naturally end up at the same conclusion - that's how you end up with overly simplistic plans where everyone "agrees" because they projected their own opinions into it, then are surprised when it turns out that other people had different opinions.
Cheers, Steve

On 10May2019 1832, Victor Stinner wrote:
I studied the code of applications embedding Python. Most of them have to decode byte strings to get wchar_t* to set home, argv, program name, etc. I'm not sure that they use the "correct" encoding, especially since Python 3.7 got UTF-8 Mode (PEP 540) and C locale coercion (PEP 538).
It looks like Py_DecodeLocale() is available very early on - why wouldn't we recommend using this function? It seems to be nearly a drop-in replacement for mbstowcs in the samples, and if memory allocation is a big deal perhaps we could just add a version that writes to a buffer?
That would provide a supported workaround for the encoding issues and unblock people hitting trouble right now, yes?
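To illustrate what "a version that writes to a buffer" could look like, here is a rough sketch built only on the standard mbstowcs(). It is not a CPython API: decode_to_buffer() is a hypothetical name, it relies on the POSIX behaviour of mbstowcs() with a NULL destination to measure the result, and unlike Py_DecodeLocale() it simply fails on undecodable bytes instead of using surrogateescape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: decode the NUL-terminated multibyte string `src`
   into `dst`, which has room for `dst_len` wide characters including the
   terminator, using the current LC_CTYPE locale.
   Returns the number of wide characters written (terminator excluded),
   or (size_t)-1 if the input is invalid or does not fit. */
size_t decode_to_buffer(const char *src, wchar_t *dst, size_t dst_len)
{
    assert(src != NULL && dst != NULL);

    if (dst_len == 0) {
        return (size_t)-1;
    }

    /* With a NULL destination, mbstowcs() only measures the result
       (POSIX behaviour). */
    size_t needed = mbstowcs(NULL, src, 0);
    if (needed == (size_t)-1) {
        return (size_t)-1;      /* invalid multibyte sequence */
    }
    if (needed >= dst_len) {
        return (size_t)-1;      /* result plus terminator would not fit */
    }

    /* The buffer is large enough, so the terminator is written too. */
    return mbstowcs(dst, src, dst_len);
}
```

A caller would pass a fixed-size array such as wchar_t home[PATH_MAX + 1] and treat (size_t)-1 as an initialization error rather than passing a truncated path on.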
Cheers, Steve

On Tue, May 14, 2019, 19:52 Steve Dower steve.dower@python.org wrote:
It looks like Py_DecodeLocale() is available very early on - why wouldn't we recommend using this function? It seems to be nearly a drop-in replacement for mbstowcs in the samples, and if memory allocation is a big deal perhaps we could just add a version that writes to a buffer?
Actually, it is recommended in the docs https://docs.python.org/3/c-api/init.html#c.Py_SetPythonHome
Sebastian
participants (5)
- Edwin Zimmerman
- Inada Naoki
- Sebastian Koslowski
- Steve Dower
- Victor Stinner