[Python-Dev] Re: Activating pymalloc

Vladimir Marangozov vladimir.marangozov@optimay.com
Mon, 18 Mar 2002 12:57:19 +0100


Hi,

[Tim]
>
> [me]
> > 2. I agree that the macro chain (especially on the pymalloc side)
> >    is not so useful at the end, so maybe all PyCore_ macros can be
> >    removed. The function names of the allocator can be cast in
> >    stone and then _THIS_xxx in obmalloc.c replaced with them.
> 
> Have you looked at Neil's patch?

I had a quick look. I'm not happy with it for two reasons:

- it removes all those useful comments in pymem.h and objimpl.h
- the overall picture of the memory APIs is less clear than before
  (and there are no docs trying to clarify the issue)

So let's step back for a while, assume we start from scratch with
the APIs, put forward the main memory concepts again, agree on all
APIs, and then see how the result differs from the current state.

Here is a first shot at it, which succinctly covers the whole issue.
Hopefully you'll be able to work out the details on python-dev.

======================================================================

Prelude
-------

We want to introduce a Python-specific allocator that can operate
on heaps managed by the Python core, and only by the Python core.

We would like to differentiate two types of Python heaps:
raw memory heap and object memory heap (argumentation is left aside).

Accessing these heaps is done through different memory APIs.

Naming conventions
------------------

The proposal is to use the following prefixes for the memory APIs:

- PyMem_xxx     for raw memory
- PyObject_xxx  for object memory

  (PyMemObject_xxx would be another suggestion for object memory
   but PyObject_ is enough)


Raw memory
----------

For raw memory allocation, the proposal is to use the following
two APIs:

- PyMem_{MALLOC, REALLOC, FREE}          - raw malloc
- PyMem_{NEW, DEL}                       - type-oriented raw malloc

and their function counterparts for extension modules.

These are defined in pymem.h.
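
To make the two raw-memory families concrete, here is a minimal sketch.
It assumes the macros expand directly to the libc functions, which is
the default case described under "Switching allocators". The macro
bodies are stand-ins written for this example, not copied from pymem.h,
and make_squares is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in definitions for this sketch only (assumed, not the real pymem.h). */
#define PyMem_MALLOC(n)      malloc(n)
#define PyMem_REALLOC(p, n)  realloc((p), (n))
#define PyMem_FREE(p)        free(p)

/* Type-oriented flavor: allocate room for n items of the given type. */
#define PyMem_NEW(type, n)   ((type *) PyMem_MALLOC((n) * sizeof(type)))
#define PyMem_DEL(p)         PyMem_FREE(p)

/* Hypothetical helper: returns a PyMem_NEW'ed array holding the first n
   squares, or NULL on allocation failure. The caller releases the array
   with PyMem_DEL, i.e. with the same API family that allocated it. */
static int *make_squares(size_t n)
{
    int *v = PyMem_NEW(int, n);
    if (v == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        v[i] = (int)(i * i);
    return v;
}
```

A caller would pair the allocation and the release within one family,
for instance make_squares(4) followed by PyMem_DEL on the result.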


Object memory
-------------

For object memory allocation, the proposal is to use the following
API for allocating storage from the object heap:

- PyObject_{MALLOC, REALLOC, FREE}       - object malloc

and their function counterparts for extension modules.


For creating an arbitrary (typed) Python object, which is not subject
to GC, the proposal is to use the following API:

- PyObject_{New, NewVar, Del}

Note that creating an object is a 2-step process:
 a) storage allocation + b) initialization (Py_NewReference).
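
The two steps can be sketched on a toy object type. Everything below
is hypothetical (the toy_object layout and all toy_ function names);
it only illustrates the split between obtaining storage and
initializing it, and is not the real PyObject machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object header for this sketch; not the real PyObject layout. */
typedef struct {
    long refcnt;
    const char *type_name;
} toy_object;

/* Step a): obtain storage from the object heap (plain malloc here). */
static void *toy_object_malloc(size_t nbytes)
{
    return malloc(nbytes);
}

/* Step b): initialize the freshly allocated storage. This plays the
   role the text above gives to Py_NewReference. */
static void toy_object_init(toy_object *op, const char *type_name)
{
    op->refcnt = 1;
    op->type_name = type_name;
}

/* Both steps combined, in the spirit of PyObject_New. */
static toy_object *toy_object_new(const char *type_name)
{
    toy_object *op = toy_object_malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    toy_object_init(op, type_name);
    return op;
}

/* Counterpart in the spirit of PyObject_Del: give the storage back. */
static void toy_object_del(toy_object *op)
{
    free(op);
}
```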

For creating a Python object subject to GC, the proposal is to use
the following API:

- PyObject_GC_{New, NewVar, Del}

These APIs are defined in objimpl.h.


Switching allocators
--------------------

Without a Python-specific malloc, the libc standard malloc is used.

For the raw allocator, the macros PyMem_{MALLOC, REALLOC, FREE}
expand to {malloc, realloc, free} respectively (in pymem.h).

For the object allocator, the macros PyObject_{MALLOC, REALLOC, FREE}
expand to {malloc, realloc, free} respectively (in objimpl.h).


Assuming we would like to use a specific allocator, we can make the
two sets of macros above expand to this allocator independently.

Changing mallocs automatically should be doable with a configure
option, so that the macro expansion above is automated.
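
A minimal sketch of that independent expansion follows. The switch
SKETCH_USE_CUSTOM_OBJECT_MALLOC and the sketch_custom_ functions are
hypothetical names invented for the example; a real configure option
would define whatever symbol the build system settles on. Here only the
object macros are redirected, while the raw macros stay on libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical switch that a configure option would set to 0 or 1. */
#define SKETCH_USE_CUSTOM_OBJECT_MALLOC 1

/* Hypothetical specialized allocator: counts calls, then defers to libc. */
static size_t sketch_custom_calls = 0;

static void *sketch_custom_malloc(size_t n)
{
    sketch_custom_calls++;
    return malloc(n);
}

static void *sketch_custom_realloc(void *p, size_t n)
{
    sketch_custom_calls++;
    return realloc(p, n);
}

static void sketch_custom_free(void *p)
{
    sketch_custom_calls++;
    free(p);
}

/* Raw memory: left on libc in this configuration. */
#define PyMem_MALLOC(n)         malloc(n)
#define PyMem_REALLOC(p, n)     realloc((p), (n))
#define PyMem_FREE(p)           free(p)

/* Object memory: redirected independently of the raw macros. */
#if SKETCH_USE_CUSTOM_OBJECT_MALLOC
#define PyObject_MALLOC(n)      sketch_custom_malloc(n)
#define PyObject_REALLOC(p, n)  sketch_custom_realloc((p), (n))
#define PyObject_FREE(p)        sketch_custom_free(p)
#else
#define PyObject_MALLOC(n)      malloc(n)
#define PyObject_REALLOC(p, n)  realloc((p), (n))
#define PyObject_FREE(p)        free(p)
#endif
```

Callers never name the underlying allocator, so flipping the switch
changes where object memory comes from without touching any call site.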


Python allocator
----------------

A Python allocator is a specialized allocator that tries to optimize
memory management according to Python's specific memory needs and
allocation patterns.

Such an allocator would be called only by the Python core via the APIs
described above: PyMem_ for raw memory, and PyObject_ for object memory.

Therefore, the proposal for this allocator is to export the following
API:

- PyCore_{Malloc, Realloc, Free}

(typically this means that pymalloc's functions are named this way).
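
For illustration only, here is a toy allocator exposing the same
three-function shape. It is not pymalloc: the names (sketch_core_...),
the 32-byte block size and the 64-block pool are all assumptions made
for the sketch. Small requests are served from a static pool through a
free list; everything else falls back to libc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 32   /* assumed small-block threshold, in bytes */
#define SKETCH_NBLOCKS    64   /* assumed pool capacity, in blocks */

/* One pool block; the free list is threaded through unused blocks. */
typedef union sketch_block {
    union sketch_block *next;
    long double align_ld;                    /* force suitable alignment */
    long long align_ll;
    unsigned char bytes[SKETCH_BLOCK_SIZE];
} sketch_block;

static sketch_block sketch_pool[SKETCH_NBLOCKS];
static sketch_block *sketch_free_list = NULL;
static int sketch_pool_ready = 0;

/* True if p points into the static pool. */
static int sketch_in_pool(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)&sketch_pool[0];
    uintptr_t hi = (uintptr_t)&sketch_pool[SKETCH_NBLOCKS];
    return a >= lo && a < hi;
}

static void sketch_core_free(void *p)
{
    if (p == NULL)
        return;
    if (sketch_in_pool(p)) {                 /* push back on the free list */
        sketch_block *b = p;
        b->next = sketch_free_list;
        sketch_free_list = b;
    } else {
        free(p);
    }
}

static void *sketch_core_malloc(size_t n)
{
    if (!sketch_pool_ready) {                /* build the free list once */
        for (size_t i = 0; i < SKETCH_NBLOCKS; i++)
            sketch_core_free(&sketch_pool[i]);
        sketch_pool_ready = 1;
    }
    if (n <= SKETCH_BLOCK_SIZE && sketch_free_list != NULL) {
        sketch_block *b = sketch_free_list;  /* pop a pool block */
        sketch_free_list = b->next;
        return b;
    }
    return malloc(n);                        /* large request or pool exhausted */
}

static void *sketch_core_realloc(void *p, size_t n)
{
    if (p == NULL)
        return sketch_core_malloc(n);
    if (!sketch_in_pool(p))
        return realloc(p, n);                /* block lives on the libc heap */
    if (n <= SKETCH_BLOCK_SIZE)
        return p;                            /* still fits in its pool block */
    void *q = malloc(n);                     /* outgrew the pool: move out */
    if (q != NULL) {
        memcpy(q, p, SKETCH_BLOCK_SIZE);
        sketch_core_free(p);
    }
    return q;
}
```

The free function has to recognize which heap a pointer came from,
which this sketch does with a simple address-range check on the pool.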


Controversial issues
--------------------

Practically, releasing memory is done the same way with all APIs,
so PyMem_FREE equals PyMem_DEL.  Similarly, there will be
implementation equivalences for the object APIs.

It is believed that these are implementation details for each
memory API. Therefore, the rule of thumb is to always use the
same API for a given memory block. The examples section in the
docs tries to illustrate this rule.
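
Since the equivalences are implementation details, code that releases
a block through a different family than the one that allocated it only
works by accident. One way to catch such mixing in a debug build is to
tag each block with its family. The sketch below does this with
hypothetical names (sketch_family, sketch_tagged_...); it is an
illustration of the rule, not something the proposal prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical API-family tags stored in front of every block. */
enum sketch_family { SKETCH_RAW = 1, SKETCH_OBJECT = 2 };

/* Header sized and aligned so the user pointer stays suitably aligned. */
typedef union {
    enum sketch_family family;
    long double align_ld;
    long long align_ll;
    void *align_p;
} sketch_header;

/* Allocates n user bytes and records which family asked for them. */
static void *sketch_tagged_malloc(enum sketch_family family, size_t n)
{
    sketch_header *h = malloc(sizeof(sketch_header) + n);
    if (h == NULL)
        return NULL;
    h->family = family;
    return h + 1;                 /* user memory starts after the header */
}

/* Returns the family that allocated p. */
static enum sketch_family sketch_family_of(const void *p)
{
    return ((const sketch_header *)p - 1)->family;
}

/* Frees p. Returns 0 on success, or -1 without freeing anything when
   p was allocated by a different family. */
static int sketch_tagged_free(enum sketch_family family, void *p)
{
    if (p == NULL)
        return 0;
    if (sketch_family_of(p) != family)
        return -1;
    free((sketch_header *)p - 1);
    return 0;
}
```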


API Summary
-----------

Raw malloc API:       PyMem_{MALLOC, REALLOC, FREE}
                      PyMem_{NEW, DEL}

                      PyMem_{Malloc, Realloc, Free}
                      PyMem_{New, Del}

Object malloc API:    PyObject_{MALLOC, REALLOC, FREE}
                      PyObject_{Malloc, Realloc, Free}

                      PyObject_{New, NewVar, Del}
                      PyObject_GC_{New, NewVar, Del}

Python's internal
malloc API:           PyCore_{Malloc, Realloc, Free}

======================================================================

All in all, this is pretty close to what we have now. Future
comments & patches should be made against a global picture like the
one above, and should preferably be well documented in the source
in addition to the docs.


About tuning pymalloc: I would suggest reviving the profiler which
sleeps in the patches list. You'll find it invaluable for gathering
the stats directly from Python. After some profiling, you'll have
good reasons to change a parameter or two (like the small block
threshold). Spacing the size classes by 4 bytes is not good (well,
*was* not good at the time). Spacing them by 16 might be good now.
I don't know -- just try it. Changing the pool size might be good
as well. But if the performance deltas you get are relatively small,
this would be a platform-specific effect, so just leave things as
they are.
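
To show what the spacing parameter trades off, here is a small
arithmetic sketch. The 64-byte threshold and the sketch_ names are
assumptions chosen for the example, not values taken from obmalloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed small-block threshold for this sketch, in bytes. */
#define SKETCH_THRESHOLD 64

/* Index of the size class serving a request of nbytes (1..threshold)
   when classes are spaced `spacing` bytes apart. */
static size_t sketch_class_index(size_t nbytes, size_t spacing)
{
    return (nbytes - 1) / spacing;
}

/* Block size actually handed out for that request. */
static size_t sketch_class_size(size_t nbytes, size_t spacing)
{
    return (sketch_class_index(nbytes, spacing) + 1) * spacing;
}

/* Number of size classes needed to cover 1..SKETCH_THRESHOLD bytes. */
static size_t sketch_class_count(size_t spacing)
{
    return sketch_class_index(SKETCH_THRESHOLD, spacing) + 1;
}
```

Under these assumptions, 4-byte spacing needs 16 classes and wastes at
most 3 bytes per block, while 16-byte spacing needs only 4 classes but
can waste up to 15 bytes per block. Which side wins is exactly what
the profiler would have to tell.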


Vladimir