
Armin Rigo wrote:
If you put all objects of the same type into a pool, you really want all objects inside a pool to have the same size. With that assumption, garbage objects can be reallocated without causing fragmentation. If objects in a pool have different sizes, it is not possible to have an efficient reallocation strategy.
"Not easy" would have been more appropriate. It is still basically what malloc() does.
That's why I said "efficient". What malloc basically does is not efficient. It gets worse if, at reallocation time, you are not only bound by size, but also by type. E.g. if you have deallocated a tuple of 10 elements, and then reallocate a tuple of 6, the wasted space can only hold a tuple of 1 element, nothing else.
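The tuple arithmetic can be checked with a small sketch. It assumes a tuple occupies a fixed header of three machine words (reference count, type pointer, length) plus one word per element; that layout, and the names TUPLE_HEADER_WORDS, tuple_words and largest_tuple_in_leftover, are assumptions made for illustration, not something stated in this thread.

```c
#include <stddef.h>

/* Assumption: a tuple header is 3 words (refcount, type pointer, length). */
#define TUPLE_HEADER_WORDS 3

/* Size of a tuple in machine words under the assumed layout. */
static size_t tuple_words(size_t n_items)
{
    return TUPLE_HEADER_WORDS + n_items;
}

/*
 * A tuple of freed_items elements was deallocated and its block is reused
 * for a tuple of new_items elements. Returns the element count of the
 * largest tuple that still fits in the leftover space, or -1 if the
 * leftover cannot hold even an empty tuple (or the new tuple does not fit).
 */
static long largest_tuple_in_leftover(size_t freed_items, size_t new_items)
{
    if (new_items > freed_items)
        return -1;
    size_t leftover = tuple_words(freed_items) - tuple_words(new_items);
    if (leftover < TUPLE_HEADER_WORDS)
        return -1;
    return (long)(leftover - TUPLE_HEADER_WORDS);
}
```

Under these assumptions the freed 10-tuple is 13 words, the new 6-tuple takes 9, and the 4 words left over hold a header plus exactly one element.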
One way would be to use Python's current memory allocator, by adapting it to sort objects into pools not only according to size but also according to type. What seems to me like a good solution would be to use one relatively large "arena" per type and Python's memory allocator to subdivide each arena. If each arena starts at a pointer address which is properly aligned, then *(p&MASK) gives you the type of any object, and possibly even without much cache-miss overhead because there are not so many arenas in total (probably only 1-2 per type in common cases, and arenas can be large).
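A minimal sketch of the *(p&MASK) idea, assuming 64 KiB arenas obtained from C11 aligned_alloc() and a simple bump allocator in place of Python's real allocator. The names ARENA_SIZE, ARENA_MASK, type_info, arena_header, arena_new, arena_alloc and type_of are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: arenas are 64 KiB and aligned to their own size. */
#define ARENA_SIZE ((size_t)1 << 16)
#define ARENA_MASK (~((uintptr_t)ARENA_SIZE - 1))

/* Stand-in for a type object. */
typedef struct {
    const char *name;
} type_info;

/* Stored once at the aligned start of every arena. */
typedef struct {
    const type_info *type;  /* type shared by all objects in this arena */
    size_t used;            /* bump-allocation offset from the arena start */
} arena_header;

/* Allocate one arena dedicated to a single type. */
static arena_header *arena_new(const type_info *type)
{
    arena_header *a = aligned_alloc(ARENA_SIZE, ARENA_SIZE);
    if (a == NULL)
        return NULL;
    a->type = type;
    a->used = (sizeof(arena_header) + 15) & ~(size_t)15;
    return a;
}

/* Carve an object out of the arena; NULL if it does not fit. */
static void *arena_alloc(arena_header *a, size_t size)
{
    size = (size + 15) & ~(size_t)15;   /* keep objects 16-byte aligned */
    if (size > ARENA_SIZE - a->used)
        return NULL;
    void *p = (char *)a + a->used;
    a->used += size;
    return p;
}

/* Recover the type from any object pointer by masking off the low bits. */
static const type_info *type_of(const void *p)
{
    assert(p != NULL);
    const arena_header *a = (const arena_header *)((uintptr_t)p & ARENA_MASK);
    return a->type;
}
```

With this layout the objects themselves carry no type field: type_of() reads the one header shared by the whole arena. The sketch returns NULL for any request that does not fit in a single arena, which is the case the question below is about.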
So where do you put strings with 100,000 elements (characters)? Or any other object that exceeds an arena in size?

Regards,
Martin