Allow calling PyMem_Malloc() without the GIL held in Python 3.4

Hi,

I would like to remove the "GIL must be held" restriction from PyMem_Malloc(). In my opinion, the restriction was motivated by a bug in Python, a bug fixed by issue #3329. Let me explain why.

The PyMem_Malloc() function is a thin wrapper around malloc(). It returns NULL if the size is larger than PY_SSIZE_T_MAX and has well-defined behaviour for PyMem_Malloc(0) (it does not return NULL). So it is surprising to read in Include/pymem.h: "The GIL must be held when using these APIs."

The reason is more surprising: in debug mode, PyMem_Malloc() is no longer a thin wrapper around malloc(); it internally calls PyObject_Malloc(), the "Python allocator" (called pymalloc). (Many other checks are done in debug mode, but they are unrelated to my point.) The problem is that PyObject_Malloc() is not thread-safe: the GIL must be held.

Short history:

- fb45791150d1 (Mar 23 2002) "gives Python a debug-mode pymalloc"
- f294fdd18b5b (Mar 28 2002) removes the "check API family"
- e16dbf875303 (Apr 22 2002) redirects indirectly PyMem_Malloc() to PyObject_Malloc() in debug mode
- b6aff7a59803 (Sep 28 2009) reintroduces API checks

So the GIL issue is almost as old as the debug mode for Python memory allocators.

My patch attached to http://bugs.python.org/issue3329 changes the design of the debug memory allocators: they are now wrappers (hooks) around the underlying memory allocator (PyMem: malloc, PyObject: pymalloc), instead of always redirecting to pymalloc (ex: PyObject_Malloc). With my patch, PyMem_Malloc() now always calls malloc(), even in debug mode. Removing the "GIL must be held" restriction is now safe.

Do you agree? Could it cause a backward compatibility issue? PyMem_Malloc() and PyMem_MALLOC() call malloc(), unless the Python source code was manually modified. Does this use case concern many developers?

Removing the GIL restriction would help to replace direct calls to malloc() with PyMem_Malloc(). Using PyMem_SetAllocators(), an application would be able to replace memory allocators, and these allocators would be used "everywhere". => see http://bugs.python.org/issue18203

Victor
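
The "debug allocators as hooks on the underlying allocator" design described in the message above can be illustrated with a small standalone sketch. This is not the code from the patch on issue #3329: every name below (sketch_allocator, sketch_debug_malloc, the guard byte value, the header and tail sizes) is a hypothetical chosen for illustration, and the real debug layer performs more checks. The point it tries to show is only that the debug layer forwards to whatever allocator it wraps instead of always redirecting to pymalloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator table: a context pointer plus function pointers. */
typedef struct {
    void *ctx;
    void *(*alloc)(void *ctx, size_t size);
    void (*release)(void *ctx, void *ptr);
} sketch_allocator;

/* Underlying allocator of this sketch: plain malloc()/free(). */
void *sketch_base_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

void sketch_base_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* Illustrative layout: [size + guard bytes][user data][guard bytes]. */
#define SKETCH_HEADER ((size_t)16)
#define SKETCH_TAIL   ((size_t)8)
#define SKETCH_GUARD  0xFB

/* Debug hook: ctx points at the wrapped allocator, so the same hook can sit
   on top of malloc() or on top of any other allocator. */
void *sketch_debug_malloc(void *ctx, size_t size)
{
    sketch_allocator *inner = (sketch_allocator *)ctx;
    unsigned char *raw;

    if (size > SIZE_MAX - SKETCH_HEADER - SKETCH_TAIL)
        return NULL;
    raw = (unsigned char *)inner->alloc(inner->ctx,
                                        size + SKETCH_HEADER + SKETCH_TAIL);
    if (raw == NULL)
        return NULL;
    memset(raw, SKETCH_GUARD, SKETCH_HEADER);   /* guard before the block */
    memcpy(raw, &size, sizeof(size));           /* remember requested size */
    memset(raw + SKETCH_HEADER + size, SKETCH_GUARD, SKETCH_TAIL);
    return raw + SKETCH_HEADER;
}

/* Returns 1 if the guard bytes around the block are intact, 0 otherwise. */
int sketch_debug_check(const void *ptr)
{
    const unsigned char *user = (const unsigned char *)ptr;
    const unsigned char *raw = user - SKETCH_HEADER;
    size_t size, i;

    memcpy(&size, raw, sizeof(size));
    for (i = sizeof(size); i < SKETCH_HEADER; i++)
        if (raw[i] != SKETCH_GUARD)
            return 0;                           /* buffer underflow */
    for (i = 0; i < SKETCH_TAIL; i++)
        if (user[size + i] != SKETCH_GUARD)
            return 0;                           /* buffer overflow */
    return 1;
}

void sketch_debug_free(void *ctx, void *ptr)
{
    sketch_allocator *inner = (sketch_allocator *)ctx;

    if (ptr == NULL)
        return;
    assert(sketch_debug_check(ptr));            /* detect corruption on free */
    inner->release(inner->ctx, (unsigned char *)ptr - SKETCH_HEADER);
}
```

To use it, build a base table `{ NULL, sketch_base_malloc, sketch_base_free }` and a debug table whose ctx points at the base table. Because the hook only goes through its ctx, wrapping a malloc()-based allocator never touches pymalloc, which is why the thread-safety of the wrapped allocator is preserved in this design.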

-----Original Message-----
I would like to remove the "GIL must be held" restriction from PyMem_Malloc(). In my opinion, the restriction was motivated by a bug in Python, a bug fixed by issue #3329. Let me explain why.
...
Removing the GIL restriction would help to replace direct calls to malloc() with PyMem_Malloc(). Using PyMem_SetAllocators(), an application would be able to replace memory allocators, and these allocators would be used "everywhere". => see http://bugs.python.org/issue18203
To keep this interesting, I have a somewhat different opinion to Victor :) I have put comments in the original defect, but would like to repeat them here.

IMHO, keeping the GIL restriction on PyMem_MALLOC is useful.

1) It allows it to be replaced with PyObject_MALLOC(). Or anything else. In particular, an implementer is free to add memory profiling support and other things without worrying about implementation details. Requiring it to be GIL-free severely limits what it can do. For example, it would be forbidden to delegate to PyObject_MALLOC when debugging.

The approach CCP has taken (we have replaced all raw malloc calls with API calls) is this:

a) Add a "raw" API, PyMem_MALLOC_RAW. This is guaranteed to be thread-safe and to call directly into the external memory API of Python, as set by Py_SetAllocator().
b) Replace calls to malloc() in the source code with PyMem_MALLOC/PyMem_MALLOC_RAW as appropriate (in our case, using an include file with #defines to minimize changes).

There are only two or three places in the source code that require a non-GIL-protected malloc. IMHO, requiring PyMem_MALLOC to be thread-safe just to cater to those three places is overkill, and does more harm than good by limiting our options.

Cheers!
Kristján
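
The two-tier split argued for above (a GIL-protected allocator that is free to do extra work, plus a small "raw" allocator for the few GIL-free call sites) can be sketched in standalone C. This is only an illustration under stated assumptions, not CCP's code: the names sketch_malloc, sketch_malloc_raw and sketch_live_blocks are hypothetical, and the sketch_gil_held flag is a plain stand-in for asking the interpreter whether the calling thread holds the GIL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for "the calling thread holds the GIL". A real interpreter would
   consult its thread state; this flag is purely illustrative. */
int sketch_gil_held = 0;

/* Unsynchronized bookkeeping (think: memory profiling). It is only safe to
   update because the GIL serializes every caller of sketch_malloc(). */
size_t sketch_live_blocks = 0;

/* Raw tier: touches no interpreter state, so it is as thread-safe as the
   C library's malloc() and may be called without the GIL. */
void *sketch_malloc_raw(size_t size)
{
    return malloc(size ? size : 1);
}

void sketch_free_raw(void *ptr)
{
    free(ptr);
}

/* GIL tier: may delegate anywhere and keep statistics without locking. */
void *sketch_malloc(size_t size)
{
    void *ptr;

    assert(sketch_gil_held);        /* the documented restriction */
    ptr = sketch_malloc_raw(size);
    if (ptr != NULL)
        sketch_live_blocks++;
    return ptr;
}

void sketch_free(void *ptr)
{
    assert(sketch_gil_held);
    if (ptr != NULL) {
        sketch_live_blocks--;
        sketch_free_raw(ptr);
    }
}
```

In this sketch the counter has no lock of its own. If sketch_malloc() had to be callable without the GIL, the counter would need atomic updates or a mutex, which is the kind of constraint on implementers that the message above wants to avoid.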

I committed the new API (a little bit different from my last patch on issue #3329):
http://hg.python.org/cpython/rev/6661a8154eb3

The documentation will be available in a few minutes at:
http://docs.python.org/3/c-api/memory.html

2013/6/14 Kristján Valur Jónsson <kristjan@ccpgames.com>:
Removing the GIL restriction would help to replace direct calls to malloc() with PyMem_Malloc(). Using PyMem_SetAllocators(), an application would be able to replace memory allocators, and these allocators would be used "everywhere". => see http://bugs.python.org/issue18203
To keep this interesting, I have a somewhat different opinion to Victor :) I have put comments in the original defect, but would like to repeat them here.

IMHO, keeping the GIL restriction on PyMem_MALLOC is useful.

1) It allows it to be replaced with PyObject_MALLOC(). Or anything else. In particular, an implementer is free to add memory profiling support and other things without worrying about implementation details. Requiring it to be GIL-free severely limits what it can do. For example, it would be forbidden to delegate to PyObject_MALLOC when debugging.
For my own pytracemalloc tool, holding the GIL while calling PyMem_Malloc() is required to be able to retrieve the Python filename and line number of the caller. So you convinced me :-) I am also worried about backward compatibility, even if I expect that only very few developers have replaced the Python memory allocators. A custom memory allocator may not be thread-safe, so the GIL can also be convenient.

I added new functions to the "final" API: PyMem_RawMalloc(), PyMem_RawRealloc(), PyMem_RawFree(). These functions are just wrappers for malloc(), realloc() and free(). The GIL does not need to be held. No processing is done before or after the call at all. For example, the behaviour of PyMem_RawMalloc(0) is platform dependent. The "size > PY_SSIZE_T_MAX" check is not done either, but it may be interesting to add this check for security reasons (it is already in place for PyMem_Malloc and PyObject_Malloc).

Using these new functions instead of malloc/realloc/free is interesting because the internal functions can be replaced with PyMem_SetRawAllocators() and many checks are added in debug mode (ex: check for buffer under- and overflow).

PyObject_Malloc() was not documented, so I did not document PyObject_SetAllocators().

In the final API, I added a new PyMemAllocators structure to simplify the API. I also made _PyObject_SetArenaAllocators() private because I don't like its API (it is not consistent with PyMem_SetAllocators) and it covers fewer use cases. I prefer to wait a little before making this API public.

I didn't use "#ifndef Py_LIMITED_API", so all new functions are part of the stable API. Is that correct?

Victor
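
The two guarantees discussed in this thread for PyMem_Malloc() (NULL for a size above PY_SSIZE_T_MAX, and a non-NULL result for a size of 0) can be sketched as a thin wrapper around malloc(). This is a minimal illustration, not the CPython source: sketch_checked_malloc and SKETCH_SSIZE_T_MAX are hypothetical names, and the limit is written out by hand as the largest value a signed integer of the same width as size_t can hold, which is an assumption about how PY_SSIZE_T_MAX is defined on the target platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PY_SSIZE_T_MAX (assumption: half the size_t range). */
#define SKETCH_SSIZE_T_MAX (((size_t)-1) >> 1)

/* Thin wrapper around malloc() that adds the two checks and nothing else. */
void *sketch_checked_malloc(size_t size)
{
    if (size > SKETCH_SSIZE_T_MAX)
        return NULL;        /* reject sizes that cannot be a signed size */
    if (size == 0)
        size = 1;           /* malloc(0) may return NULL on some platforms */
    return malloc(size);
}

/* Small self-check of the two guarantees. */
void sketch_checked_selftest(void)
{
    void *p = sketch_checked_malloc(0);

    assert(p != NULL);
    free(p);
    assert(sketch_checked_malloc((size_t)-1) == NULL);
}
```

A wrapper without these two lines, as described for PyMem_RawMalloc() in the message above, simply inherits whatever the platform's malloc() does for a size of 0 or for very large sizes.
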
participants (2)
- Kristján Valur Jónsson
- Victor Stinner