Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

Nov. 5, 2017

...
On Nov 4, 2017, at 7:04 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
When I asked Damien George about this for MicroPython, he indicated
that they'd have to choose between guaranteed order and O(1) lookups
given their current dict implementation. That surprised me a bit
(since PyPy and CPython both *saved* memory by switching to their
guaranteed order implementations, hence the name "compact dict
representation"), but my (admittedly vague) understand is that the
presence of a space/speed trade-off in their case has something to do
with MicroPython deliberately running with a much higher chance of
hash collisions in general (since the data sets it deals with are
naturally smaller).
So if we make the change, MicroPython will likely go along with it,
but it may mean that dict lookups there become O(N), and folks will be
relying on "N" being consistently small due to memory constraints (but
some typically O(N) algorithms will still become O(N^2) when run on
MicroPython).
I don't think that situation should change the decision, but I do
think it would be helpful if folks that understand CPython's dict
implementation could take a look at MicroPython's dict implementation
and see if it might be possible for them to avoid having to make that
trade-off and instead be able to use a naturally insertion ordered
hashmap implementation.
I've just looked at the MicroPython dictionary implementation and think they won't have a problem implementing O(1) compact dicts with ordering.

The likely reason for the confusion is that they are already have an option for an "ordered array" dict variant that does a brute-force linear search.  However, their normal hashed lookup is very similar to ours and is easily amenable to being compact and ordered.

See:  https://github.com/micropython/micropython/blob/77a48e8cd493c0b0e0ca2d2ad58a...

Pretty much any implementation hashed lookup of keys and values is amenable to being compact and ordered.  Whatever existing logic that looks up an entry becomes a lookup into a table of indices which in turn references a sequential array of keys and values.  This logic is independent of hashing scheme or density, and it has no effect on the number of probes or collision rate.

The cost is an extra level of indirection and an extra array of indices (typically very small). The benefit is faster iteration over the smaller dense key/value array, overall memory savings resulting in improved cache utilization, and the side-effect of remembering insertion order.

Summary:  I think MicroPython will be just fine and if needed I will help create the patch that implements compact-and-ordered behavior.

Raymond

Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

Raymond Hettinger