[Python-ideas] Improving memory usage in shared-key attribute dicts
Steven D'Aprano
steve at pearwood.info
Thu Oct 30 23:42:21 CET 2014
On Thu, Oct 30, 2014 at 07:02:27PM +0000, Hill, Bruce wrote:
> Thanks to PEP 412: "Key-Sharing Dictionary", CPython attribute
> dictionaries can share keys between multiple instances, so the memory
> cost of new attribute dicts comes primarily from the values array. In
> the current implementation, the keys array and the values array are
> always kept the same size. This is done so that once the key's
> location in the key array has been tracked down, the same array offset
> can be used on the value array to find the value.
>
> Rather than storing values in a sparse array of the same size as the
> keys array, it would make more sense to store values in a compact
> array. When a dict uses key sharing, there is an unused field in the
> PyDictKeyEntry struct ("me_value"), which could be repurposed to hold
> an index into the value array (perhaps by converting "me_value" into a
> payload union in PyDictKeyEntry). Since the sparse arrays in
> dictobject.c never use more than (2n+1)/3 of their entries, this
> change would reduce the memory footprint of each shared-key dict by
> roughly 1/3 (or more) and also improve data locality.
How does this compare to the "Alternate Implementation" described in PEP
412?
http://python.org/dev/peps/pep-0412/#id17
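
For anyone comparing the two layouts, here is a rough sketch of the idea in
the quoted message as I read it. It is a toy model in pure Python, not the
code in dictobject.c: the SharedKeys class, its linear probing, and the
helper names are all made up for illustration, and resizing is left out.
Each key slot stores an index into a compact per-instance values list,
which stands in for the repurposed "me_value" field.

```python
class SharedKeys:
    """Toy shared keys table; a hypothetical stand-in, not CPython's struct."""

    def __init__(self, size=8):
        self.size = size                      # number of slots in the sparse table
        self.usable = (2 * size + 1) // 3     # load limit quoted above: (2n+1)/3
        self.slots = [None] * size            # each slot: None or (key, compact_index)
        self.used = 0

    def _probe(self, key):
        # Linear probing keeps the sketch short; CPython's probe sequence differs.
        i = hash(key) % self.size
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % self.size
        return i

    def insert(self, key):
        i = self._probe(key)
        if self.slots[i] is None:
            if self.used >= self.usable:
                raise RuntimeError("table full; resizing is omitted in this sketch")
            # Store the next free compact index where "me_value" would live.
            self.slots[i] = (key, self.used)
            self.used += 1

    def lookup(self, key):
        i = self._probe(key)
        if self.slots[i] is None:
            raise KeyError(key)
        return i, self.slots[i][1]            # (slot offset, compact index)


keys = SharedKeys(8)
for name in ("x", "y", "z"):
    keys.insert(name)

# Current layout: one value cell per key slot, used or not.
sparse_values = [None] * keys.size
# Proposed layout: one value cell per usable entry only.
compact_values = [None] * keys.usable


def set_sparse(values, name, value):
    slot, _ = keys.lookup(name)
    values[slot] = value                      # same offset as in the keys table


def set_compact(values, name, value):
    _, index = keys.lookup(name)
    values[index] = value                     # one extra indirection via the key entry


def get_sparse(values, name):
    return values[keys.lookup(name)[0]]


def get_compact(values, name):
    return values[keys.lookup(name)[1]]


for n, name in enumerate(("x", "y", "z"), start=1):
    set_sparse(sparse_values, name, n)
    set_compact(compact_values, name, n)

print(len(sparse_values), len(compact_values))   # 8 5
```

In this model each instance carries 5 value cells instead of 8, and the
values sit next to each other in insertion order. The price is one extra
indirection on every lookup.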
--
Steven