And one more random clue.
The memcrunch.py attached to the earlier-mentioned bug report does benefit a lot from changing to a "use the arena with the smallest address" heuristic, leaving 86.6% of allocated bytes in use by objects at the end (this is with the arena-thrashing fix, and the current 256K/4K arena/pool sizes). Which isn't surprising, else the person who opened that report wouldn't have attached that specific script ;-)
However, when I increase its number of starting objects by a factor of 200, changing the heuristic barely helped at all. It managed to reclaim a few dozen more arenas by the end (out of 13,556 total arenas allocated), but left only 56.1% of arena space in use by objects.
For the later program I posted, which by accident-rather-than-design is even worse for arena cycling, changing the heuristic didn't help at all. Either way, no arenas were released (although one empty arena is retained in the usable_arenas list), and less than 10% of arena space was in use at the end of phase 10.
A few things to note, then:
- For truly effective RAM releasing, we would almost certainly need to make major changes, to release RAM at an OS page level. 256K arenas were already too fat a granularity.
- Whether a program benefits from what we do now appears to rely by far most on its patterns of allocation/deallocation, rather than on the arena-releasing heuristic OR on the arena size.
- While the by-lowest-address heuristic is obviously more effective in one known artificial program so far ;-) , it's more expensive than what we do now. That's a full-blown sorting problem. Ordering by number of free pools is not: that's essentially a bucket-distribution problem, since the number of _possible_ values is small. The data structure changes I already checked in truly take constant time, in all cases, to restore sorted order when an arena's number of free pools changes. When sorting by address, the best that can be done in general is to take O(log(A)) time to insert a new arena (where A is the number of arenas). And a linked list is no good for that either (unless, e.g., it's made fancier, like a skip list).
The quick experiment I tried used one-at-time search to put an arena in the by-address-ordered list (the submitted patch also did), and we already know from experience that can become a timing disaster when hundreds of thousands of arenas are in use. OTOH, that only needs to be done once each time an arena is added to usable_arenas, not every time an arena's number of free pools changes.
- I'm not inclined to pursue changing the arena-freeing heuristic.
- Changes to keep track of OS page-level use would likely require starting over from scratch. But would also make using far larger pools practical (on 64-bit boxes).
- I still haven't found a program where changing from 256K to 1M arenas makes a lick of appreciable difference in arena recycling effectiveness: for a given program chewing on a given problem size, "does pretty well" or "does horribly" has been the outcome either way.