Experimenting with dict performance, and an immutable dict

Let me first say that the code and discussion, as the title suggests, are also about possible performance improvements to the dict base type.

TL;DR: I implemented a frozendict using CPython 3.9 code. It seems that an immutable dict *could* be faster than dict in some cases. Furthermore, some of the optimizations could be applied to dict too.

Long explanation: since I now have some time, I decided to experiment a little with the creation of an immutable dict in Python. Unluckily, I started this experiment many months ago, so the CPython code I used is old; some or all of my considerations may be outdated.

Initially, I wrote a quick and dirty implementation: https://github.com/Marco-Sulla/cpython/commit/fde4e6d236b19636063f8afedea8c5...

The code was very simple, and the performance was identical to dict. So in theory, adding a frozendict to CPython is not hard. But there's more.

After the first implementation, I started trying to improve the performance of frozendict. The results of the improvements are here: https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.txt

For the benchmarks I simply used timeit, with autorange and repeat, and, as suggested in the module documentation, I took the minimum of the results (a minimal sketch is below). Here is the code: https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.py

I have not tested with an optimized build, since the optimization data is collected using the unit tests, and I didn't write tests for frozendict in the official CPython test framework. The tests and benchmarks were done using CPython 3.9a0. CPU and other machine resources were not restricted using pyperf or similar tools, in order to see the "real" speed. CPython was compiled using gcc and g++.

In the benchmarks, I compared methods and operators on dict and frozendict. The benchmark uses dicts with all-integer keys and dicts with all-string keys. Furthermore, I tested dicts of size 8 (the most optimized size in CPython) and of 1000 elements (maybe too much, but I wanted to see how they perform with a high number of elements).

Every benchmark produces a line in the output. Name describes the benchmarked method, operator or code snippet. Size is the size of the dict, 8 or 1000. Keys is the key type, str or int. Type is the dictionary type, dict or frozendict. In Name, "o" represents the object itself (dict or frozendict); "d", in benchmarks with dict, is "o", while in benchmarks with frozendict it is an equivalent instance of type dict.

Some considerations:

1. frozendict is very fast, as expected, at copying. But it's also faster at creation from a (not frozen) dict, from kwargs or from a sequence; speedups range from 20% to 45%.
2. frozendict is also a bit faster when you iterate over it, especially over values, where it is ~15% faster.
3. hash seems really fast, but this is because it is cached the first time hash() is invoked.
4. frozendict is slower when you unpickle it, and with fromkeys and repr. This is because I wrote a very naive implementation of these methods, without optimizing them.

The other methods have a comparable speed.

Here is the code: https://github.com/Marco-Sulla/cpython
Here is the diff between the CPython code and my commits: https://github.com/python/cpython/compare/master...Marco-Sulla:master

About the code: I wrote the implementation by doing a simple copy/paste of the existing dict functions, modifying their code and renaming them. This way I'm sure dict continues to work as before, and I can compare the speed gain. Some of the optimizations I adopted could be implemented in `dict` too.
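To make the methodology concrete, here is a minimal sketch of the timeit setup described above, in the spirit of bench.py (the helper name `bench` and the sample dict are illustrative, not taken from the actual benchmark script):

    import timeit

    d = {k: k for k in range(8)}

    def bench(stmt, globals_):
        t = timeit.Timer(stmt, globals=globals_)
        loops, _ = t.autorange()                  # find a reasonable number of loops
        raw = t.repeat(repeat=5, number=loops)    # repeat the measurement
        return min(raw) / loops                   # take the minimum, per the timeit docs

    print("dict(d):", bench("dict(d)", {"d": d}))
    # With a build that provides frozendict, the same helper could be reused:
    # print("frozendict(d):", bench("frozendict(d)", {"d": d, "frozendict": frozendict}))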
For example, instead of creating an empty dict and resizing it, I create it with the "maximum" size and fill it. It seems to work, even if I did not explore the possibility that a mutable object could change while a frozendict creation is in progress.

Some problems with optimizing dict and maintaining a frozendict:

1. Duplication of code. To gain a significant boost, I had to copy and paste a lot of code. The functions could be merged again, but the speedup would probably be reduced.
2. Split vs combined dicts. As I wrote, split dicts seem to be faster at reading than combined dicts. For example, iterating over values is faster with a split dict, as expected. But writing was not tested; furthermore, some of the optimizations can be adopted for dict too, so the advantage of a split table may shrink. dict continues to maintain both split and combined tables, so this may not be a problem, but the code could be smaller and faster if only one table layout were supported.
3. The CPython code I used is old, so some of the improvements I adopted may already be implemented.

About frozendict: apart from the considerations in [PEP 416](https://www.python.org/dev/peps/pep-0416/), which was rejected because there was little gain from its implementation, I think frozendict could be useful as a substitute for MappingProxyType, which is really slow. MappingProxyType is not used much, but it's present in critical parts of the CPython code, for example in _sre. We have to see whether a mapping proxy *can* be substituted with an immutable map in some critical part of the CPython code. Furthermore, frozendicts could be used for implementing "immutable" classes and modules, and can be used as a faster dict when their content does not change.
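For a rough sense of the MappingProxyType overhead mentioned above, the following sketch (numbers are machine-dependent, and the sample dict is mine) compares a key lookup on a plain dict with the same lookup through a proxy:

    import timeit
    from types import MappingProxyType

    d = {str(i): i for i in range(8)}
    proxy = MappingProxyType(d)

    # Every proxy read is delegated to the wrapped dict, which adds overhead.
    print("dict lookup :", min(timeit.repeat("d['5']", globals={"d": d}, number=10**6)))
    print("proxy lookup:", min(timeit.repeat("p['5']", globals={"p": proxy}, number=10**6)))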

What, exactly, is frozen? My understanding is that one problem with frozen dicts in the past is deciding exactly what is mutable and what is immutable. Can you change what object a key maps to, so long as the set of keys stays the same? Can you modify the contents of a mutable object that is a value? On Tue, Jul 21, 2020 at 6:30 PM Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:

Yes. Values can be mutable, as in a tuple. A "pure" immutable dict is a frozendict with only immutable values (strings, tuples of numbers, etc.); in that case you also have a hash. frozendict apart, I think some of my "tricks" could be applied to dict, if the implementation is sound and there's a real benefit. I'm not a C expert and my knowledge of the Python C API is very limited, but if you core devs think it's worth it, I could try to submit some patches to the CPython trunk.
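The hash-caching idea is easy to show at the Python level. This toy sketch (mine, not the C implementation; it does not actually block mutation) caches a hash computed from the items on first use, and only works when every value is itself hashable:

    class FrozenDictSketch(dict):
        """Toy dict subclass: only illustrates lazy hash caching, not immutability."""
        __slots__ = ("_hash",)

        def __hash__(self):
            try:
                return self._hash
            except AttributeError:
                # frozenset() raises TypeError if any value is unhashable,
                # i.e. hashing only works for a "pure" immutable dict.
                self._hash = hash(frozenset(self.items()))
                return self._hash

    d = FrozenDictSketch(a=1, b=(2, 3))
    assert hash(d) == hash(d)   # the second call reuses the cached value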
one problem with frozen dicts in the past is deciding exactly what is mutable and what is immutable.
Well, I had the same problem... Indeed, it seems there's no way to use duck typing to understand whether a type is mutable or not. I suppose this can only be done using a convention, or a contract. If we know that an object is immutable, we can adopt some caching and speedup strategies. On Wed, 22 Jul 2020 at 06:53, Guido van Rossum <guido@python.org> wrote:
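The closest thing to a duck-typing test is the hashability convention. A small sketch of that heuristic, with its obvious limit (hashable does not imply immutable, e.g. instances of plain user classes), could look like this:

    def looks_immutable(obj):
        """Heuristic only: hashability is a convention, not a guarantee of immutability."""
        try:
            hash(obj)
        except TypeError:
            return False
        return True

    assert looks_immutable((1, 2)) and looks_immutable("x")
    assert not looks_immutable([1, 2]) and not looks_immutable({})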

On Wed, Jul 22, 2020 at 7:31 AM Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
For the benchmarks I simply used timeit, with autorange and repeat, and, as suggested in the module documentation, I took the minimum of the results. Here is the code:
https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench.py
I strongly recommend using pyperf for benchmarking. Otherwise, you will see random performance changes caused by random factors, including ASLR. https://pypi.org/project/pyperf/ https://pyperf.readthedocs.io/en/latest/ Regards, -- Inada Naoki <songofacandy@gmail.com>
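For reference, a minimal pyperf version of one of the micro-benchmarks might look like the sketch below (the benchmark name and sample dict are illustrative, not taken from the actual benchmark script). pyperf runs the statement in several isolated worker processes and reports a mean with standard deviation:

    import pyperf

    runner = pyperf.Runner()
    runner.timeit("dict(d), size 8, int keys",
                  stmt="dict(d)",
                  setup="d = {k: k for k in range(8)}")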

Well, I tried it. The result is interesting: https://github.com/Marco-Sulla/cpython/blob/master/frozendict/test/bench_pyp... The main speedups are confirmed, and maybe even accentuated. But some others are diminished or have vanished. The most notable is pickle.dumps(). I have to say I don't really know why pickle.dumps() is faster on my PC under "normal" conditions; I wrote the implementation of pickle.dumps() in _pickle.c by quickly copy/pasting the dict and frozenset implementations. pickle.loads(), indeed, is really slow. Furthermore, it seems that pyperf has not disabled ASLR: after `sudo python -m pyperf system tune`, ASLR is still in "Full randomization" mode. On Wed, 22 Jul 2020 at 07:48, Inada Naoki <songofacandy@gmail.com> wrote:

On Wed, Jul 22, 2020 at 4:29 PM Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
Furthermore, it seems that pyperf has not disabled ASLR: after `sudo python -m pyperf system tune`, ASLR is still in "Full randomization" mode.
You are right. pyperf doesn't disable ASLR, because code performance is affected by code layout. Instead, pyperf runs the benchmark multiple times in isolated processes and measures statistics. Victor Stinner, the author of pyperf, wrote a lot of information about measuring performance. It's well worth reading before benchmarking. https://vstinner.readthedocs.io/benchmark.html Regards, -- Inada Naoki <songofacandy@gmail.com>

The article is really interesting. Anyway, I think pyperf was developed mainly for macro benchmarks: its goal is to make different benchmarks, run on different machines or at different times, comparable. What I'm doing, on the contrary, is a micro benchmark: I want to check whether an algorithm is faster in a particular case (an immutable dict). I ran the benchmarks many times, and even if the absolute values are not stable, the delta between dict and frozendict is consistent. Anyway, I think that also adding the sigma to the output is a good idea: if the sigma is too high, maybe the PC is under load and the benchmark is not reliable. Furthermore, I noticed that frozendict is a little slower at checking whether a key is present in the dictionary, when the key is not present, the dictionary has only string keys and the dictionary is small. I suppose that small dicts with only string keys are the majority, so I'll take a look. On Wed, 22 Jul 2020 at 10:03, Inada Naoki <songofacandy@gmail.com> wrote:
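If one stays with timeit, the sigma idea mentioned above could be as simple as keeping all the repeat() results and reporting their standard deviation next to the minimum (a sketch, not the actual bench.py code):

    import statistics
    import timeit

    t = timeit.Timer("d.copy()", setup="d = {k: k for k in range(1000)}")
    loops, _ = t.autorange()
    times = [x / loops for x in t.repeat(repeat=10, number=loops)]
    print(f"min = {min(times):.3e} s   sigma = {statistics.stdev(times):.3e} s")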

Did you study PEP 416 (frozendict) and PEP 603 (frozenmap)? -- Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

pyrsistent.PMap and PRecord may be worth a look:

https://pyrsistent.readthedocs.io/en/latest/api.html#pyrsistent.PMap :

pmap() to create an instance.

Was originally written as a very close copy of the Clojure equivalent but was later rewritten to closer re-assemble the python dict. This means that a sparse vector (a PVector) of buckets is used. The keys are hashed and the elements inserted at position hash % len(bucket_vector). Whenever the map size exceeds 2/3 of the containing vectors size the map is reallocated to a vector of double the size. This is done to avoid excessive hash collisions.

This structure corresponds most closely to the built in dict type and is intended as a replacement. Where the semantics are the same (more or less) the same function names have been used but for some cases it is not possible, for example assignments and deletion of values.

PMap implements the Mapping protocol and is Hashable. It also supports dot-notation for element access.

Random access and insert is log32(n) where n is the size of the map.

... https://pyrsistent.readthedocs.io/en/latest/intro.html#pyrsistent :

throughout its lifetime and need not worry that somewhere five stack levels below you in the darkest corner of your application someone has decided to remove that element that you expected to be there.

Pyrsistent is influenced by persistent data structures such as those found in the standard library of Clojure. The data structures are designed to share common elements through path copying. It aims at taking these concepts and make them as pythonic as possible so that they can be easily integrated into any python program without hassle. On Wed, Jul 22, 2020, 12:28 PM Guido van Rossum <guido@python.org> wrote:

participants (6): Antoine Pitrou, Guido van Rossum, Inada Naoki, Marco Sulla, Todd, Wes Turner