[New-bugs-announce] [issue46675] Allow more than 16 items in split-keys dicts and "virtual" object dicts.

Mark Shannon report at bugs.python.org
Mon Feb 7 06:52:04 EST 2022

New submission from Mark Shannon <mark at hotpy.org>:

https://bugs.python.org/issue45340 and https://github.com/python/cpython/pull/28802 allowed "virtual" object dicts (see faster-cpython/ideas#72  for full details).

In order for this to work, we need to keep the insertion order on the values. The initial version (https://github.com/python/cpython/pull/28802) used a 64 bit value as a vector of 16 4-bit values, which allows only 16 items per values array.

Stats gathered from the standard benchmark suite and informal evidence from elsewhere suggests that this causes a significant (5% and upwards) of these dicts to be materialized due to exceeding the 16 item limit.

An alternative design that would allow up to ~254 items in the values array is to make the insertion order vector an array of bytes. The capacity is 254 as we need a byte for size, and another for capacity.
This will increase the size of the values a bit for sizes from 7 to 15, but save a lot of memory for sizes 17+, as keys could still be shared.

    No need to materialize dicts of size 16+, saving ~3/4 of the memory per dict and helping specialization.

    Extra memory write to store a value*
    1 extra word for values of size 7 to 14, 2 extra for size 15.
    Some extra complexity.

*In a hypothetical optimized JIT, the insertion order vector would be stored as a single write for several writes, so this would make no difference.

assignee: Mark.Shannon
messages: 412735
nosy: Mark.Shannon
priority: normal
severity: normal
status: open
title: Allow more than 16 items in split-keys dicts and "virtual" object dicts.
versions: Python 3.11

Python tracker <report at bugs.python.org>

More information about the New-bugs-announce mailing list