[Python-Dev] A new dictionary implementation

Mon Feb 13 13:31:38 CET 2012

Hi All,

My new dictionary implementation has been further updated in light of
various comments and questions.

The code is here: https://bitbucket.org/markshannon/cpython_new_dict

The resizing code has been streamlined. This means that resizing
by doubling does not suffer any real performance penalty versus
quadrupling. Doubling uses less memory for most benchmarks (it never 
uses more memory).

Michael Foord wrote:
>> 2to3, which seems to be the only "realistic" benchmark that runs on Py3,
>> shows no change in speed and uses 10% less memory. 
> In your first version 2to3 used 28% less memory. Do you know why it's 
> worse in this version?
> 

The answer is that the second version used quadrupling rather than
doubling for resizing.

All tests pass.
test_sys has been altered to allow for the different size of the dictionary.
One test in test_pprint has been disabled. This test is broken anyway,
see http://bugs.python.org/issue13907.

In general, for the new dictionary implementation, with doubling at a
resize, speed is unchanged and memory usage is reduced.

On "average": ~1% slow down, ~10% reduction in memory use.

Full set of benchmarks for new dict with doubling and quadrupling attached.
Unfortunately the benchmarking program introduces systematic errors
for timings, but it's the best we have at the moment.
Note that the json benchmark is unstable and should be ignored.
The GC benchmark might be unstable as well, I haven't experimented.
The memory usage numbers seems to be more reliable.

Revised PEP to follow.

Cheers,
Mark.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: new_dict_benchmarks.txt
URL: <http://mail.python.org/pipermail/python-dev/attachments/20120213/a343d71f/attachment.txt>