2013/6/15 Antoine Pitrou <solipsis@pitrou.net>:
On Sat, 15 Jun 2013 03:54:50 +0200 Victor Stinner <victor.stinner@gmail.com> wrote:
The addition of PyMem_RawMalloc() is motivated by issue #18203 (Replace calls to malloc() with PyMem_Malloc()). The goal is to be able to set up a custom allocator for *all* allocations made by Python, so malloc() should not be called directly. PyMem_RawMalloc() is required in places where the GIL is not held (e.g. in os.getcwd() on Windows).
We already had this discussion on IRC and this argument isn't very convincing to me. If os.getcwd() doesn't hold the GIL while allocating memory, then you should fix it to hold the GIL while allocating memory.
The GIL is released for best performance; holding the GIL would have an impact on performance. PyMem_RawMalloc() is needed when PyMem_Malloc() cannot be used because the GIL was released. For example, for issue #18227 (reuse the custom allocator in external libraries), PyMem_Malloc() is usually not appropriate. PyMem_RawMalloc() should also be used instead of PyMem_Malloc() in the Python startup sequence, because PyMem_Malloc() requires the GIL whereas the GIL does not exist yet. PyMem_RawMalloc() also provides more accurate memory usage figures if it can be replaced or hooked (with PyMem_SetRawAllocators). Issue #18203 explains why I would like to replace direct calls to malloc() with PyMem_Malloc() or PyMem_RawMalloc().
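To make the "replaced or hooked" idea concrete, here is a minimal, self-contained sketch of a hookable raw allocator with a counting hook layered on top. All names in it (raw_allocators, raw_malloc, counting_hook, ...) are hypothetical and chosen for illustration; this is not the CPython API or the code from the patch, only one way such a table of function pointers could look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator table; NOT the actual CPython structure. */
typedef struct {
    void *ctx;                                   /* passed back to each callback */
    void *(*alloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} raw_allocators;

static void *default_alloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);              /* never request 0 bytes */
}

static void default_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

static raw_allocators current = { NULL, default_alloc, default_free };

void raw_get_allocators(raw_allocators *out) { *out = current; }
void raw_set_allocators(const raw_allocators *in) { current = *in; }

/* Entry points that the rest of the program calls instead of malloc()/free(). */
void *raw_malloc(size_t size) { return current.alloc_fn(current.ctx, size); }
void raw_free(void *ptr) { current.free_fn(current.ctx, ptr); }

/* A hook that counts successful allocations, then forwards to the allocator
   that was installed before it. The counters are plain integers, so this
   sketch is not thread-safe; a hook that runs without any lock held would
   need atomics or its own lock. */
typedef struct {
    raw_allocators next;
    size_t calls;
    size_t bytes;
} counting_hook;

static void *counting_alloc(void *ctx, size_t size)
{
    counting_hook *hook = ctx;
    void *ptr = hook->next.alloc_fn(hook->next.ctx, size);
    if (ptr != NULL) {
        hook->calls++;
        hook->bytes += size;
    }
    return ptr;
}

static void counting_free(void *ctx, void *ptr)
{
    counting_hook *hook = ctx;
    hook->next.free_fn(hook->next.ctx, ptr);
}

void counting_hook_install(counting_hook *hook)
{
    assert(hook != NULL);
    raw_allocators wrapper = { hook, counting_alloc, counting_free };
    raw_get_allocators(&hook->next);             /* remember the previous allocator */
    hook->calls = 0;
    hook->bytes = 0;
    raw_set_allocators(&wrapper);
}
```

After counting_hook_install(&hook), every raw_malloc() call is visible in hook.calls and hook.bytes, whereas a direct malloc() call would bypass the hook entirely.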
I don't like the idea of adding a third layer of allocation APIs. The dichotomy between PyObject_Malloc and PyMem_Malloc is already a bit gratuitous (i.e. not motivated by any actual real-world concern, as far as I can tell).
In Python 3.3, PyMem_Malloc() cannot be used instead of malloc() where the GIL is not held. Instead of adding PyMem_RawMalloc(), an alternative is to remove the "the GIL must be held" restriction from PyMem_Malloc() by changing PyMem_Malloc() so that it always calls malloc() (instead of PyObject_Malloc() in debug mode). With such a change, a debug hook cannot rely on the GIL anymore: it cannot inspect Python objects, get a frame or traceback, etc. To still get accurate debug reports, PyMem_Malloc() should be replaced with PyObject_Malloc(). I don't yet understand the effect of such a change on backward compatibility. Could it break applications?
As for the debug functions you added: PyMem_GetRawAllocators(), PyMem_SetRawAllocators(), PyMem_GetAllocators(), PyMem_SetAllocators(), PyMem_SetupDebugHooks(), _PyObject_GetArenaAllocators(), _PyObject_SetArenaAllocators(). Well, do we need all *7* of them? Can't you try to make that 2 or 3?
Get/SetAllocators of PyMem, PyMem_Raw and PyObject can be grouped into 2 functions (get and set) with an argument to select the API. That is what I proposed initially. I changed this when I had to choose a name for the argument ("api", "domain", something else?), because there were only two choices. With 3 families of functions (PyMem, PyMem_Raw and PyObject), it becomes interesting again to have generic functions.

The arena case is different: pymalloc only uses two functions to allocate arenas: void* alloc(size_t) and void release(void*, size_t). The release function has a size argument, which is unusual, but it is required to implement it using munmap(). VirtualFree() on Windows also requires the size.

An application can choose to replace PyObject_Malloc() with its own allocator, but in my experience, this has a significant impact on performance (Python is slower). To benefit from pymalloc with a custom memory allocator, _PyObject_SetArenaAllocators() can be used.

I kept _PyObject_SetArenaAllocators() private because I don't like its API: it is not homogeneous with the other SetAllocators functions. I'm not sure that it would be used, so I prefer to keep it private until it is tested by some projects. "Private" functions can be used by applications, it's just that Python doesn't give any backward compatibility guarantee. Am I right?

Victor
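To illustrate why the arena release function takes a size, here is a small POSIX-only sketch of the two arena callbacks implemented with mmap() and munmap(). The struct, the function names and the ARENA_SIZE value are assumptions made for this example; they mirror the two signatures quoted above, not the actual _PyObject_SetArenaAllocators() interface.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1        /* expose MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical pair of arena callbacks, matching the quoted signatures:
   void* alloc(size_t) and void release(void*, size_t). */
typedef struct {
    void *(*alloc)(size_t size);
    void (*release)(void *ptr, size_t size);
} arena_allocators;

#define ARENA_SIZE (256 * 1024)  /* assumed arena size for this sketch */

static size_t mapped_bytes = 0;  /* bytes currently mapped by this allocator */

void *arena_mmap_alloc(size_t size)
{
    void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED)
        return NULL;
    mapped_bytes += size;
    return ptr;
}

void arena_mmap_release(void *ptr, size_t size)
{
    /* munmap() needs the length of the mapping, which is why the release
       callback receives the size in addition to the pointer. */
    int rc = munmap(ptr, size);
    assert(rc == 0);
    (void)rc;                    /* silence unused warning when NDEBUG is set */
    mapped_bytes -= size;
}

size_t arena_mapped_bytes(void)
{
    return mapped_bytes;
}

const arena_allocators mmap_arena_allocators = {
    arena_mmap_alloc,
    arena_mmap_release,
};
```

With a free(void*)-style signature, the release callback would have to record the size of every arena itself; passing the size lets it stay stateless, apart from the optional mapped_bytes counter used here for bookkeeping.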