[Python-Dev] "Global freepool"

INADA Naoki songofacandy at gmail.com
Thu Jun 1 04:56:58 EDT 2017


Hi,

As you said, I think PyObject_Malloc() is fast enough.
But PyObject_Free() is somewhat complex.

Actually, there are some freelists (e.g. tuple, dict, frame) and
they improve performance significantly.


My "global unified freelist" idea is unify them.  The merit is:

* Unify _PyXxx_DebugMallocStats().  Some freelists provide
  it but some doesn't.

* Unify PyXxx_ClearFreeList().  Some freelists doesn't provide
  it and it may disturb returning memory to system.

* Potential better CPU cache hit ratio by unifying LRU if some
  freelists has same memory block size.

This idea is partially implemented in https://github.com/methane/cpython/pull/3
But there are no significant difference about speed or memory usage.

Regards,

On Thu, Jun 1, 2017 at 5:40 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Thu, 1 Jun 2017 09:57:04 +0200
> Victor Stinner <victor.stinner at gmail.com> wrote:
>>
>> By the way, Naoki INADA also proposed a different idea:
>>
>> "Global freepool: Many types has it’s own freepool. Sharing freepool
>> can increase memory and cache efficiency. Add PyMem_FastFree(void*
>> ptr, size_t size) to store memory block to freepool, and PyMem_Malloc
>> can check global freepool first."
>
> This is already exactly how PyObject_Malloc() works.  Really, the fast
> path for PyObject_Malloc() is:
>
>         size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
>         pool = usedpools[size + size];
>         if (pool != pool->nextpool) {
>             /*
>              * There is a used pool for this size class.
>              * Pick up the head block of its free list.
>              */
>             ++pool->ref.count;
>             bp = pool->freeblock;
>             assert(bp != NULL);
>             if ((pool->freeblock = *(block **)bp) != NULL) {
>                 UNLOCK();
>                 return (void *)bp;   // <- fast path!
>             }
>
>
> I don't think you can get much faster than that in a generic allocation
> routine (unless you have a compacting GC where allocating memory is
> basically bumping a single global pointer). IMHO the main thing the
> private freelists have is that they're *private* precisely, so they can
> avoid a couple of conditional branches.
>
> Regards
>
> Antoine.
>
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: https://mail.python.org/mailman/options/python-dev/songofacandy%40gmail.com


More information about the Python-Dev mailing list