[New-bugs-announce] [issue34707] Python not reentrant

john skaller report at bugs.python.org
Sun Sep 16 22:04:35 EDT 2018


New submission from john skaller <skaller at users.sourceforge.net>:

Executive Summary: Python currently is not properly re-entrant. This comment applies to the C API and particularly to embedding. A fix is not possible in Python 3.x but should be scheduled for Python 4. On Linux all binary plugins are broken as well.

The fault is exhibited by the need to first call Py_Initialize(). This is clearly wrong because there is nowhere to put the initialised data. The correct sequence should be to first create an interpreter handle, and then initialise that. Other API calls exhibit the same fault, for example PyErr_Occurred().
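
To make the proposed sequence concrete, here is a minimal sketch in C of a handle-first API. Every name in it (interp_t, interp_create, interp_init, interp_set_error, interp_err_occurred, interp_destroy) is hypothetical and is NOT part of the Python C API; the only point is that all state, including the error indicator, lives in a handle the caller owns.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical handle: all interpreter state lives here, not in globals. */
typedef struct interp {
    int initialised;   /* set by interp_init() */
    int error_set;     /* per-handle error indicator */
} interp_t;

/* Step 1: create the handle, so there is somewhere to put the state. */
interp_t *interp_create(void) {
    return calloc(1, sizeof(interp_t));
}

/* Step 2: initialise that handle, and only that handle. */
void interp_init(interp_t *it) {
    assert(it != NULL);
    it->initialised = 1;
    it->error_set = 0;
}

/* Record an error on one handle (a stand-in for some failing operation). */
void interp_set_error(interp_t *it) {
    assert(it != NULL && it->initialised);
    it->error_set = 1;
}

/* The error query takes the handle, so it cannot see another caller's error. */
int interp_err_occurred(const interp_t *it) {
    assert(it != NULL);
    return it->error_set;
}

/* Release the handle and everything hanging off it. */
void interp_destroy(interp_t *it) {
    free(it);
}
```

With this shape, two callers that each create and initialise their own handle never observe each other's state, and neither needs to know whether the other exists.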

Use of thread local storage is NOT enough.

A general embedding scenario is this: a thunk program is used to dynamically load a shared library and execute a function in it. That function may load other shared libraries. Note carefully that there is no global data: the libraries are pure code. [This is not an imagined scenario; my whole system works this way.]

The same library may be loaded several times. For example, A can load B and C, and both B and C can load D. Proper visibility control means A cannot see any symbols of D.
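
The loading step in this scenario can be sketched roughly as follows, assuming a POSIX system with dlopen(). The function name run_entry and the convention that the entry point has type int (*)(void) are assumptions made for illustration; they are not taken from my system or from Python.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Load the library at `path` privately and run the function named `entry`.
 * Returns the function's result, or -1 if loading or lookup fails.
 * Everything the loader needs is passed in; it keeps no global state. */
int run_entry(const char *path, const char *entry) {
    assert(path != NULL && entry != NULL);

    /* RTLD_LOCAL keeps the library's symbols out of the global namespace,
     * so objects loaded later cannot resolve references against them. */
    void *lib = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (lib == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    /* POSIX permits converting the void* from dlsym to a function pointer. */
    int (*fn)(void) = (int (*)(void))dlsym(lib, entry);
    if (fn == NULL) {
        const char *msg = dlerror();
        fprintf(stderr, "dlsym: %s\n", msg ? msg : "symbol resolved to NULL");
        dlclose(lib);
        return -1;
    }

    int rc = fn();
    dlclose(lib);
    return rc;
}
```

A library loaded this way can call run_entry() itself to load further libraries, which is how the same library ends up loaded by more than one parent.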

In this scenario if D wishes to run a Python interpreter, it must call Py_Initialize(), and it will be called twice, since D is called twice, once from B, and once from C. Indeed, if the top level spawns multiple threads, it can be called many more times than that.

Remember the libraries are pure code and fully reentrant. There is no way to record if a function has been called already.

In order for Python to be fully re-entrant there is a simple test: if the C code of the Python library contains ANY global variables at all then Python is wrong. Global variables INCLUDE thread local storage. ALL data and ALL functions must hang off a handle so that all functionality and behaviour is fully isolated to each handle.

Exceptions to the rule: poorly designed operating systems such as Unix have some non-reentrant features. The worst of these in Unix is signal handling. It is not possible to handle signals without a global variable to communicate between the signal handler and the application. The right way to do this would have been to use a polling service to detect the signal. In any case systems like Python do have to work with badly designed APIs sometimes, and therefore these special cases do form legitimate exceptions to the requirement that the API be re-entrant. My recommendation is to provide a cheat API which looks re-entrant but actually isn't, because it delegates to a hidden lower level which isn't, of necessity. YMMV: how to handle bad underlying APIs should be open for discussion.

Other consequences: On Linux at least, ALL plugin extensions are built incorrectly. The correct way to build a plugin requires explicitly linking against the Python library, so that symbols in the Python API can be found. These symbols must NOT be looked up in the application, because that is, quite simply, not possible if the application does not include them. In my scenario, the top level application is three lines of C that does nothing other than load a library and run a fixed function in it. And that library has no idea that one of the libraries IT loads may call another library which happens to want to run some Python code. Indeed my system can *generate* Python modules, and compile and link them against the Python library, but it cannot load any existing plugins on Linux, because those plugins were incorrectly built and do not link to the Python library as they should. They expect to find the symbols magically provided in the symbol table, but those symbols are not there.

On OSX, however, it works. That is because on OSX, a --framework is used to contain the Python library and all plugins HAVE to be linked against the framework. I expect the Windows builds to work too, for the same reason (but I'm not sure).

This issue is related to the lack of re-entrancy because the same principle is broken in both cases. If you need a service, you must ask for it, and when you get it, it is exclusively yours.

----------
components: Interpreter Core
messages: 325508
nosy: skaller
priority: normal
severity: normal
status: open
title: Python not reentrant
type: behavior
versions: Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue34707>
_______________________________________

