[New-bugs-announce] [issue44555] Dictionary operations are LINEAR for any dictionary (for a particular code).

Daniel Fleischman report at bugs.python.org
Fri Jul 2 17:20:05 EDT 2021


New submission from Daniel Fleischman <danielfleischman at gmail.com>:

The following code takes quadratic time on the size of the dictionary passed, regardless of the dictionary (explanation below):

```
def slow_dictionary(d):
    while len(d) > 0:
        # Remove first element
        key = next(iter(d))
        del d[key]
```

The problem is that when an element is deleted a NULL/NULL placeholder is set in its place (https://github.com/python/cpython/blob/818628c2da99ba0376313971816d472c65c9a9fc/Objects/dictobject.c#L1534) and when we try to find the first element with `next(iter(d))` the code needs to skip over all the NULL elements until it finds the first non-NULL element (https://github.com/python/cpython/blob/818628c2da99ba0376313971816d472c65c9a9fc/Objects/dictobject.c#L1713).

I'm not sure of what is the best way to fix it, but note that simply adding a field to the struct with the position of the first non-NULL element is not enough, since a code that always deletes the SECOND element of the dictionary would still have linear operations.

An easy (but memory-wasteful) fix would be to augment the struct PyDictKeyEntry with the indices of the next/previous non empty elements, and augment _dictkeysobject with the index of the first and last non empty elements (in other words, maintain an underlying linked list of the non empty entries). With this we can always iterate in O(1) per entry.

(I tested it only on version 3.9.2, but I would be surprised if it doesn't happen in other versions as well).

----------
components: Interpreter Core
messages: 396880
nosy: danielfleischman
priority: normal
severity: normal
status: open
title: Dictionary operations are LINEAR for any dictionary (for a particular code).
type: performance
versions: Python 3.10, Python 3.11, Python 3.6, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue44555>
_______________________________________


More information about the New-bugs-announce mailing list