On 23May2019 0542, Inada Naoki wrote:
1. perf shows 95% of CPU time is eaten by _PyObject_Free, not kernel space.
2. This loop is clearly hot: https://github.com/python/cpython/blob/51aa35e9e17eef60d04add9619fe2a7eb9383...

I attached to the process with gdb and confirmed that many arenas have the same nfreepools.

It's relatively easy to test replacing our custom allocators with the system ones, yes? Can we try those to see whether they have the same characteristic?

Given the relative amount of investment over the last 19 years [1], I wouldn't be surprised if most system ones are at least as good for our needs now. Certainly Windows HeapAlloc has had serious improvements in that time to help with fragmentation and small allocations.

Cheers,
Steve

[1]: https://github.com/python/cpython/blob/51aa35e9e17eef60d04add9619fe2a7eb9383...
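
One way to try the system allocator without rebuilding is the PYTHONMALLOC environment variable (CPython 3.6 and later), where the value "malloc" routes all allocation domains to the C library's malloc. The following is only a rough sketch: the WORKLOAD string and the time_with_allocator helper are made up for illustration and are not the reproducer from this thread, and the measured time includes interpreter startup.

```python
import os
import subprocess
import sys
import time

# Hypothetical stand-in workload: create and free many small objects.
WORKLOAD = "data = [[i] for i in range(200_000)]; del data"


def time_with_allocator(allocator, code=WORKLOAD):
    """Run `code` in a fresh interpreter with PYTHONMALLOC set.

    Returns wall-clock seconds, including interpreter startup.
    """
    env = dict(os.environ, PYTHONMALLOC=allocator)
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", code], env=env, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # "pymalloc" is the usual default on release builds; "malloc" uses the
    # system allocator. Some builds may not accept every value.
    for name in ("pymalloc", "malloc"):
        print(f"{name}: {time_with_allocator(name):.3f}s")
```

Swapping the real reproducer in for WORKLOAD and comparing the two timings would show whether the slow free path is specific to pymalloc's arena handling.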