[Python-checkins] peps: PEP 445

victor.stinner python-checkins at python.org
Tue Jun 18 14:17:58 CEST 2013


http://hg.python.org/peps/rev/7cc36550c084
changeset:   4941:7cc36550c084
user:        Victor Stinner <victor.stinner at gmail.com>
date:        Tue Jun 18 14:14:17 2013 +0200
summary:
  PEP 445

files:
  pep-0445.txt |  168 +++++++++++++++++++++++++-------------
  1 files changed, 110 insertions(+), 58 deletions(-)


diff --git a/pep-0445.txt b/pep-0445.txt
--- a/pep-0445.txt
+++ b/pep-0445.txt
@@ -34,12 +34,6 @@
     allocator APIs (builtin Python debug hooks)
   - force allocation to fail to test handling of ``MemoryError`` exception
 
-API:
-
-* Setup a custom memory allocator for all memory allocated by Python
-* Hook memory allocator functions to call extra code before and/or after
-  the underlying allocator function
-
 
 Proposal
 ========
@@ -47,15 +41,29 @@
 API changes
 -----------
 
-* Add a new ``PyMemAllocators`` structure
-
 * Add new GIL-free memory allocator functions:
 
   - ``void* PyMem_RawMalloc(size_t size)``
   - ``void* PyMem_RawRealloc(void *ptr, size_t new_size)``
   - ``void PyMem_RawFree(void *ptr)``
 
-* Add new functions to get and set memory allocators:
+* Add a new ``PyMemAllocators`` structure::
+
+    typedef struct {
+        /* user context passed as the first argument to the 3 functions */
+        void *ctx;
+
+        /* allocate memory */
+        void* (*malloc) (void *ctx, size_t size);
+
+        /* allocate memory or resize a memory buffer */
+        void* (*realloc) (void *ctx, void *ptr, size_t new_size);
+
+        /* release memory */
+        void (*free) (void *ctx, void *ptr);
+    } PyMemAllocators;
+
+* Add new functions to get and set memory block allocators:
 
   - ``void PyMem_GetRawAllocators(PyMemAllocators *allocators)``
   - ``void PyMem_SetRawAllocators(PyMemAllocators *allocators)``
@@ -63,17 +71,32 @@
   - ``void PyMem_SetAllocators(PyMemAllocators *allocators)``
   - ``void PyObject_GetAllocators(PyMemAllocators *allocators)``
   - ``void PyObject_SetAllocators(PyMemAllocators *allocators)``
+
+* Add new functions to get and set memory mapping allocators:
+
   - ``void _PyObject_GetArenaAllocators(void **ctx_p, void* (**malloc_p) (void *ctx, size_t size), void (**free_p) (void *ctx, void *ptr, size_t size))``
   - ``void _PyObject_SetArenaAllocators(void *ctx, void* (*malloc) (void *ctx, size_t size), void (*free) (void *ctx, void *ptr, size_t size))``
 
-* Add a new function to setup Python builtin debug hooks when memory
+* Add a new function to set up the builtin Python debug hooks when memory
   allocators are replaced:
 
   - ``void PyMem_SetupDebugHooks(void)``
 
+.. note::
 
-Use these new APIs
-------------------
+   The builtin Python debug hooks were introduced in Python 2.3 and implement the
+   following checks:
+
+   * Newly allocated memory is filled with the byte 0xCB, freed memory is filled
+     with the byte 0xDB.
+   * Detect API violations, e.g. ``PyObject_Free()`` called on a memory block
+     allocated by ``PyMem_Malloc()``
+   * Detect writes before the start of the buffer (buffer underflow)
+   * Detect writes after the end of the buffer (buffer overflow)
+
+
+Make use of these new APIs
+--------------------------
 
 * ``PyMem_Malloc()`` and ``PyMem_Realloc()`` always call ``malloc()`` and
   ``realloc()``, instead of calling ``PyObject_Malloc()`` and
@@ -156,12 +179,12 @@
    are not thread-safe.
 
 
-Use case 2: Replace Memory Allocators, overriding pymalloc
-----------------------------------------------------------
+Use case 2: Replace Memory Allocators, override pymalloc
+--------------------------------------------------------
 
 If your allocator is optimized for allocation of small objects (less than 512
-bytes) with a short liftime, you can replace override pymalloc (replace
-``PyObject_Malloc()``).
+bytes) with a short lifetime, pymalloc can be overridden: replace
+``PyObject_Malloc()``.
 
 Dummy Example wasting 2 bytes per allocation::
 
@@ -210,7 +233,7 @@
 Use case 3: Setup Allocator Hooks
 ---------------------------------
 
-Example to setup hooks on memory allocators::
+Example to set up hooks on all memory allocators::
 
     struct {
         PyMemAllocators pymem;
@@ -249,11 +272,11 @@
     void setup_hooks(void)
     {
         PyMemAllocators alloc;
-        static int registered = 0;
+        static int installed = 0;
 
-        if (registered)
+        if (installed)
             return;
-        registered = 1;
+        installed = 1;
 
         alloc.malloc = hook_malloc;
         alloc.realloc = hook_realloc;
@@ -284,30 +307,27 @@
 Performances
 ============
 
-The `Python benchmarks suite <http://hg.python.org/benchmarks>`_ (-b 2n3): some
-tests are 1.04x faster, some tests are 1.04 slower, significant is between 115
-and -191. I don't understand these output, but I guess that the overhead cannot
-be seen with such test.
+Results of the `Python benchmarks suite <http://hg.python.org/benchmarks>`_ (-b
+2n3): some tests are 1.04x faster, some tests are 1.04x slower, significant is
+between 115 and -191. I don't understand this output, but I guess that the
+overhead cannot be seen with such tests.
 
-pybench: "+0.1%" (diff between -4.9% and +5.6%).
+Results of the pybench benchmark: "+0.1%" slower overall (diff between -4.9% and
++5.6%).
 
-The full output is attached to the issue #3329.
+The full reports are attached to issue #3329.
 
 
 Alternatives
 ============
 
-Only one get and one set function
----------------------------------
+Only have one generic get/set function
+--------------------------------------
 
 Replace the 6 functions:
 
-* ``PyMem_GetRawAllocators()``
-* ``PyMem_GetAllocators()``
-* ``PyObject_GetAllocators()``
-* ``PyMem_SetRawAllocators(allocators)``
-* ``PyMem_SetAllocators(allocators)``
-* ``PyObject_SetAllocators(allocators)``
+* ``PyMem_GetRawAllocators()``, ``PyMem_GetAllocators()``, ``PyObject_GetAllocators()``
+* ``PyMem_SetRawAllocators(allocators)``, ``PyMem_SetAllocators(allocators)``, ``PyObject_SetAllocators(allocators)``
 
 with 2 functions with an additional *domain* argument:
 
@@ -321,16 +341,21 @@
 * ``PYALLOC_PYOBJECT``
 
 
+``_PyObject_GetArenaAllocators()`` and ``_PyObject_SetArenaAllocators()`` are
+not merged and kept private because their prototypes are different and they are
+specific to pymalloc.
+
+
 Add a new PYDEBUGMALLOC environment variable
 --------------------------------------------
 
-To be able to use Python builtin debug hooks even when a custom memory
-allocator is set, an environment variable ``PYDEBUGMALLOC`` can be added to
-setup these debug function hooks, instead of adding the new function
-``PyMem_SetupDebugHooks()``. If the environment variable is present,
-``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()`` and
-``PyObject_SetAllocators()`` will reinstall automatically the hook on top of
-the new allocator.
+To be able to use the Python builtin debug hooks even when a custom memory
+allocator replaces the default Python allocator, an environment variable
+``PYDEBUGMALLOC`` can be added to setup these debug function hooks, instead of
+adding the new function ``PyMem_SetupDebugHooks()``. If the environment
+variable is present, ``PyMem_SetRawAllocators()``, ``PyMem_SetAllocators()``
+and ``PyObject_SetAllocators()`` will automatically reinstall the hooks on top
+of the new allocator.
 
 An new environment variable would make the Python initialization even more
 complex. The `PEP 432 <http://www.python.org/dev/peps/pep-0432/>`_ tries to
@@ -343,8 +368,8 @@
 To have no overhead in the default configuration, customizable allocators would
 be an optional feature enabled by a configuration option or by macros.
 
-Not having to recompile Python makes debug hooks easy to use in practice.
-Extensions modules don't have to be compiled with or without macros.
+Not having to recompile Python makes debug hooks easier to use in practice.
+Extension modules don't have to be recompiled with macros.
 
 
 Pass the C filename and line number
@@ -354,10 +379,11 @@
 and line number of a memory allocation.
 
 Passing a filename and a line number to each allocator makes the API more
-complex: pass 3 new arguments instead of just a context argument, to each
-allocator function. GC allocator functions should also be patched,
-``_PyObject_GC_Malloc()`` is used in many C functions for example. Such changes
-add too much complexity, for a little gain.
+complex: pass 3 new arguments, instead of just a context argument, to each
+allocator function. The GC allocator functions should also be patched.
+For example, ``_PyObject_GC_Malloc()`` is used in many C functions, so
+objects of different types would have the same allocation location. Such
+changes add too much complexity for little gain.
 
 
 No context argument
@@ -369,41 +395,67 @@
 * ``void* realloc(void *ptr, size_t new_size)``
 * ``void free(void *ptr)``
 
-The context is a convenient way to reuse the same allocator for different APIs
-(ex: PyMem and PyObject).
+It is likely for an allocator hook to be reused for ``PyMem_SetAllocators()``
+and ``PyObject_SetAllocators()``, but the hook must call a different function
+depending on the allocator. The context is a convenient way to reuse the same
+allocator or hook for different APIs.
 
 
 PyMem_Malloc() GIL-free
 -----------------------
 
-There is no real reason to require the GIL when calling ``PyMem_Malloc()``.
+``PyMem_Malloc()`` must be called with the GIL held because in debug mode, it
+indirectly calls ``PyObject_Malloc()`` which requires the GIL to be held.  This
+PEP proposes to "fix" ``PyMem_Malloc()`` to make it always call ``malloc()``.
+So the "GIL must be held" restriction may be removed from ``PyMem_Malloc()``.
 
 Allowing to call ``PyMem_Malloc()`` without holding the GIL might break
 applications which setup their own allocator or their allocator hooks. Holding
-the GIL is very convinient to develop a custom allocator or a hook (no need to
-care of other threads, no need to handle mutexes, etc.).
+the GIL is very convenient to develop a custom allocator: no need to care about
+other threads or mutexes. It is also convenient for an allocator hook: Python
+internals can be safely inspected.
+
+Calling ``PyGILState_Ensure()`` in a memory allocator may have unexpected
+behaviour, especially at Python startup and at creation of a new Python thread
+state.
 
 
 Don't add PyMem_RawMalloc()
 ---------------------------
 
-Replace ``malloc()`` with ``PyMem_Malloc()``, but if the GIL is not held: keep
-``malloc()`` unchanged.
+Replace ``malloc()`` with ``PyMem_Malloc()``, but only if the GIL is held.
+Otherwise, keep ``malloc()`` unchanged.
 
 The ``PyMem_Malloc()`` is sometimes already misused. For example, the
 ``main()`` and ``Py_Main()`` functions of Python call ``PyMem_Malloc()``
 whereas the GIL do not exist yet. In this case, ``PyMem_Malloc()`` should
-be replaced with ``malloc()``.
+be replaced with ``malloc()`` (or ``PyMem_RawMalloc()``).
 
 If an hook is used to the track memory usage, the ``malloc()`` memory will not
 be seen. Remaining ``malloc()`` may allocate a lot of memory and so would be
 missed in reports.
 
 
-CCP API
--------
 
-XXX To be done (Kristján Valur Jónsson) XXX
+Use existing debug tools to analyze the memory
+----------------------------------------------
+
+There are many existing debug tools to analyze the memory. Some examples:
+`Valgrind <http://valgrind.org/>`_,
+`Purify <http://ibm.com/software/awdtools/purify/>`_,
+`Clang AddressSanitizer <http://code.google.com/p/address-sanitizer/>`_,
+`failmalloc <http://www.nongnu.org/failmalloc/>`_,
+etc.
+
+The problem is to retrieve the Python object related to a memory pointer to read
+its type and/or content. Another issue is to retrieve the location of the
+memory allocation: the C backtrace is usually useless (same reasoning as for
+macros using ``__FILE__`` and ``__LINE__``), the Python filename and line
+number (or even the Python traceback) are more useful.
+
+Classic tools are unable to introspect the Python internals to collect such
+information. Being able to set up a hook on allocators called with the GIL held
+allows reading a lot of useful data from Python internals.
 
 
 External libraries
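
(End of diff.)

To illustrate the "No context argument" rationale in the patch above, here is
a minimal sketch of one hook reused for two allocator domains through ``ctx``.
It is not part of the changeset. ``PyMemAllocators`` is copied locally so the
code compiles without Python headers; ``hook_state``, ``install_hook()``, the
``sys_*`` functions and ``demo_context_reuse()`` are hypothetical names chosen
for this example, and the real ``PyMem_SetAllocators()`` /
``PyObject_SetAllocators()`` calls are only simulated by local variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Local copy of the structure proposed by the PEP (assumption: same layout). */
typedef struct {
    void *ctx;
    void* (*malloc) (void *ctx, size_t size);
    void* (*realloc) (void *ctx, void *ptr, size_t new_size);
    void (*free) (void *ctx, void *ptr);
} PyMemAllocators;

/* Hypothetical stand-ins for an underlying allocator; they ignore ctx. */
static void *sys_malloc(void *ctx, size_t size)
{
    (void)ctx;
    return malloc(size ? size : 1);
}

static void *sys_realloc(void *ctx, void *ptr, size_t new_size)
{
    (void)ctx;
    return realloc(ptr, new_size ? new_size : 1);
}

static void sys_free(void *ctx, void *ptr)
{
    (void)ctx;
    free(ptr);
}

/* Hypothetical per-domain hook state: the wrapped allocator plus counters. */
typedef struct {
    PyMemAllocators next;
    size_t nmalloc;
    size_t nfree;
} hook_state;

/* The same three hook functions serve every domain: ctx selects the state. */
static void *hook_malloc(void *ctx, size_t size)
{
    hook_state *state = (hook_state *)ctx;
    state->nmalloc++;
    return state->next.malloc(state->next.ctx, size);
}

static void *hook_realloc(void *ctx, void *ptr, size_t new_size)
{
    hook_state *state = (hook_state *)ctx;
    return state->next.realloc(state->next.ctx, ptr, new_size);
}

static void hook_free(void *ctx, void *ptr)
{
    hook_state *state = (hook_state *)ctx;
    if (ptr != NULL)
        state->nfree++;
    state->next.free(state->next.ctx, ptr);
}

/* Wrap `original` with the hook; `state` remembers what to call next. */
static PyMemAllocators install_hook(hook_state *state, PyMemAllocators original)
{
    PyMemAllocators hooked;
    state->next = original;
    state->nmalloc = 0;
    state->nfree = 0;
    hooked.ctx = state;
    hooked.malloc = hook_malloc;
    hooked.realloc = hook_realloc;
    hooked.free = hook_free;
    return hooked;
}

/* Returns 1 if each domain counted only its own allocations, else 0. */
int demo_context_reuse(void)
{
    PyMemAllocators sys = {NULL, sys_malloc, sys_realloc, sys_free};
    hook_state pymem_state, pyobject_state;
    PyMemAllocators pymem = install_hook(&pymem_state, sys);
    PyMemAllocators pyobject = install_hook(&pyobject_state, sys);

    void *a = pymem.malloc(pymem.ctx, 16);
    void *b = pyobject.malloc(pyobject.ctx, 32);
    void *c = pyobject.malloc(pyobject.ctx, 64);
    assert(a != NULL && b != NULL && c != NULL);

    pymem.free(pymem.ctx, a);
    pyobject.free(pyobject.ctx, b);
    pyobject.free(pyobject.ctx, c);

    return pymem_state.nmalloc == 1 && pymem_state.nfree == 1
        && pyobject_state.nmalloc == 2 && pyobject_state.nfree == 2;
}
```

Without the ``ctx`` argument, the hook functions would need one copy per
domain, or a global variable, to know which underlying allocator to call.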

-- 
Repository URL: http://hg.python.org/peps

