[Python-Dev] More compact dictionaries with faster iteration

Maciej Fijalkowski fijall at gmail.com
Sun May 19 16:09:01 CEST 2013

On Sun, May 19, 2013 at 7:27 AM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> On May 15, 2013, at 4:32 AM, Christian Tismer <tismer at stackless.com> wrote:
> What is the current status of this discussion?
> I'd like to know whether it is a considered alternative implementation.
> As far as I can tell, I'm the only one working on it (and a bit slowly at
> that).
> My plan is to implement it for frozensets to see how it works out.
> Frozensets are a nice first experiment for several reasons:
> * The current implementation is cleaner than dictionaries
>    (which have become more complicated due to key-sharing).
> * It will be easy to benchmark (by racing sets vs frozen sets)
>    for an apples-to-apples comparison.
> * There is no need to have a list-like over-allocation scheme
>    since frozensets can't grow after they are created.
>    That will guarantee a significant space savings and
>    it will simplify the coding.
> * I wrote the code for setobject.c so I know all the ins-and-outs.
> There is also a discussion in python-ideas right now where this
> alternative is mentioned, and I think especially for small dicts
> as **kwargs, it could be a cheap way to introduce order.
> The compaction of keys and values into a dense array was
> intended to save space, improve cache performance, and
> improve iteration speed.  The ordering was just a side-effect
> and one that is easily disturbed if keys ever get deleted.
> So a compacted dict might be a cheap way to introduce order
> for kwargs, but it would need special handling if the user decided
> to delete keys.
> BTW, I'm +1 on the idea for ordering keyword-args.  It makes
> it easier to debug if the arguments show-up in the order they
> were created.  AFAICT, no purpose is served by scrambling them
> (which is exacerbated by the new randomized hashing security feature).
> Raymond

The completely ordered dict is easy to get too - you mark deleted
entries instead of removing them (then all the keys are in order) and
every now and then you just compact the whole thing by removing all
the delted entries, presumably on the resize or so.


More information about the Python-Dev mailing list