[Python-Dev] CPython optimization: storing reference counters outside of objects

Sturla Molden sturla at molden.no
Mon May 23 18:39:07 CEST 2011

On 23.05.2011 06:59, "Martin v. Löwis" wrote:
> My expectation is that your approach would likely make the issues
> worse in a multi-CPU setting. If you put multiple reference counters
> into a contiguous block of memory, unrelated reference counters will
> live in the same cache line. Consequentially, changing one reference
> counter on one CPU will invalidate the cached reference counters of
> that cache line on other CPU, making your problem a) actually worse.

In a multi-threaded setting with concurrent threads accessing reference
counts, this would certainly worsen the situation.

In a single-threaded setting, this will likely be an improvement.

CPython, however, has a GIL. Thus only one thread at a time is active
and touching reference counts. On a thread switch in the interpreter,
I think the performance result will depend on the nature of the Python
code: If threads share a lot of objects, packing the counters could
help to reduce the number of dirty cache lines. If threads mainly work
on private objects, it would likely have the effect you predict. Which
will dominate is hard to tell.
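
To make the cache-line argument concrete, here is a small sketch of
which counters in a packed array would land in the same cache line.
The 64-byte line size and the 8-byte counter size are assumptions
(typical for a 64-bit x86 build), and the function names are made up
for illustration; nothing here measures real hardware.

```python
# Assumptions, not measured values: 64-byte cache lines and
# 8-byte reference counters (one Py_ssize_t on a 64-bit build).
CACHE_LINE_BYTES = 64
COUNTER_BYTES = 8

# How many unrelated counters end up packed into one cache line.
COUNTERS_PER_LINE = CACHE_LINE_BYTES // COUNTER_BYTES


def cache_line_of(counter_index, base_address=0):
    """Cache line number holding counter `counter_index` of a packed array."""
    return (base_address + counter_index * COUNTER_BYTES) // CACHE_LINE_BYTES


def share_cache_line(i, j):
    """True if writing counter i would also dirty the line holding counter j."""
    return cache_line_of(i) == cache_line_of(j)
```

Under these assumptions, counters 0 through 7 sit in one line, so a
write to any of them invalidates the cached copies of the other seven
on another CPU, whether or not the objects are related.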

Instead, we could use multiple heaps:

Each Python thread could manage its own heap for malloc and free (cf.
HeapAlloc and HeapFree in Windows). Objects local to one thread reside
only in that thread's locally managed heap.

When an object becomes shared by several Python threads, it is moved
from a local heap to the global heap of the process. Some objects, such
as modules, would be stored directly on the global heap.

This way, objects used by only one thread would never dirty cache
lines used by other threads.

This would also be a way to reduce the CPython dependency on the GIL. 
Only the global heap would need to be protected by the GIL, whereas the 
local heaps would not need any global synchronization.
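
The scheme above can be modelled roughly in Python. This is only a toy
model of the bookkeeping, not a proposal for how CPython would
implement it: the class HeapModel and its methods are hypothetical,
dicts stand in for heaps, and an ordinary lock stands in for the GIL.

```python
import threading


class HeapModel:
    """Toy model: one private heap per thread plus a locked global heap."""

    def __init__(self):
        self._local = threading.local()       # per-thread attributes
        self._global_heap = {}                # handle -> payload
        self._global_lock = threading.Lock()  # stands in for the GIL

    def _heap(self):
        # Lazily create the calling thread's private heap.
        if not hasattr(self._local, "heap"):
            self._local.heap = {}
            self._local.token = object()      # unique per-thread tag
            self._local.next_id = 0
        return self._local.heap

    def alloc(self, payload):
        """Allocate in the caller's local heap; takes no global lock."""
        heap = self._heap()
        handle = (self._local.token, self._local.next_id)
        self._local.next_id += 1
        heap[handle] = payload
        return handle

    def share(self, handle):
        """Move an object from the caller's local heap to the global heap.

        Raises KeyError if the handle is not in the caller's local heap.
        """
        payload = self._heap().pop(handle)
        with self._global_lock:
            self._global_heap[handle] = payload

    def get(self, handle):
        """Look in the local heap first, then in the global heap."""
        heap = self._heap()
        if handle in heap:
            return heap[handle]               # no synchronization needed
        with self._global_lock:
            return self._global_heap[handle]  # KeyError if not shared
```

In this model a second thread cannot see an object until its owner
calls share(), and only the global dict is ever touched under the
lock. How a real interpreter would detect that an object has become
shared is left open here.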

(I am setting follow-up to the Python-Ideas list; this does not belong
on Python-Dev.)

Sturla Molden
