[Inada Naoki]
So I tried to use the LIKELY/UNLIKELY macros to tell the compiler which paths are hot. But I still need to use "static inline" for pymalloc_alloc and pymalloc_free [1].
[Neil Schemenauer]
I think LIKELY/UNLIKELY is not helpful if you compile with LTO/PGO enabled.
I like adding those regardless of whether compilers find them helpful: they help _people_ reading the code focus on what's important to speed. While not generally crucial, speed is important in these very low-level, very heavily used functions.

Speaking of which, another possible teensy win: pymalloc's allocation has always started with:

    if (nbytes == 0) {
        return 0;
    }
    if (nbytes > SMALL_REQUEST_THRESHOLD) {
        return 0;
    }
    size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;

But it could be a bit leaner:

    size_t fatsize = (nbytes - 1) >> ALIGNMENT_SHIFT;
    if (UNLIKELY(fatsize >= NB_SMALL_SIZE_CLASSES)) {
        return 0;
    }
    size = (uint)fatsize;

The `nbytes == 0` case ends up mapping to a very large size class then, although C may not guarantee that. But it doesn't matter: if it maps to "a real" size class, that's fine. We'll return a unique pointer into a pymalloc pool then, and "unique pointer" is all that's required.

An allocation requesting 0 bytes does happen at times, but it's very rare. It just doesn't merit its own dedicated test-&-branch.
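For anyone who wants to convince themselves the single test covers both original cases, here is a minimal standalone sketch, not the actual obmalloc.c code. The constant values (ALIGNMENT_SHIFT of 3, SMALL_REQUEST_THRESHOLD of 512), the function names, the -1 "decline" return value, and the UNLIKELY definition via GCC/Clang's `__builtin_expect` are all assumptions made for illustration. In this sketch `nbytes` is a `size_t`, so `nbytes - 1` for a zero request wraps to SIZE_MAX and lands far above any real size class.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: a branch hint on GCC/Clang, a no-op elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#  define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#  define UNLIKELY(x) (x)
#endif

/* Assumed values for illustration only; real builds may differ. */
#define ALIGNMENT_SHIFT         3
#define SMALL_REQUEST_THRESHOLD 512
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD >> ALIGNMENT_SHIFT)

/* Hypothetical helper: the original two-test form.
 * Returns the size class index, or -1 where pymalloc would decline. */
int size_class_two_tests(size_t nbytes)
{
    if (nbytes == 0) {
        return -1;
    }
    if (nbytes > SMALL_REQUEST_THRESHOLD) {
        return -1;
    }
    return (int)((nbytes - 1) >> ALIGNMENT_SHIFT);
}

/* Hypothetical helper: the leaner single-test form.
 * For nbytes == 0, nbytes - 1 wraps to SIZE_MAX (unsigned arithmetic),
 * so fatsize is huge and the one comparison rejects it. */
int size_class_one_test(size_t nbytes)
{
    size_t fatsize = (nbytes - 1) >> ALIGNMENT_SHIFT;
    if (UNLIKELY(fatsize >= NB_SMALL_SIZE_CLASSES)) {
        return -1;
    }
    return (int)fatsize;
}

/* Checks that both forms agree on every request size from 0 up to
 * twice the threshold, which covers zero, small, and too-large requests. */
void check_equivalence(void)
{
    size_t n;
    for (n = 0; n <= 2 * (size_t)SMALL_REQUEST_THRESHOLD; n++) {
        assert(size_class_two_tests(n) == size_class_one_test(n));
    }
}
```

With these assumed constants the two forms agree everywhere, including on zero, so the sketch doesn't exercise the "zero maps to a real size class" possibility discussed above; it only shows that dropping the dedicated zero test costs nothing in the common configuration.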
Good work looking into this. Should be some relatively easy performance win.
Ditto!