[Python-Dev] More compact dictionaries with faster iteration

Mon Dec 10 17:15:11 CET 2012

On 10/12/12 20:40, Armin Rigo wrote:

> As a side note, your suggestion also enables order-preserving
> dictionaries: iter() would automatically yield items in the order they
> were inserted, as long as there was no deletion.  People will
> immediately start relying on this "feature"...  and be confused by the
> behavior of deletion. :-/

If we want to avoid the attractive nuisance of iteration order being
almost, but not quite, order-preserving, there is a simple fix: when
iterating over the dict, instead of starting at the start of the table,
start at some arbitrary point in the middle and wrap around. That will
increase the cost of iteration slightly, but avoid misleading behaviour.

I think all we need do is change the existing __iter__ method from this:

     def __iter__(self):
         for key in self.keylist:
             if key is not UNUSED:
                 yield key

to this:

     # Untested!
     def __iter__(self):
         # Choose an arbitrary starting point.
         # 103 is chosen because it is prime.
         n = (103 % len(self.keylist)) if self.keylist else 0
         for key in self.keylist[n:] + self.keylist[:n]:
             # I presume the C version will not duplicate the keylist.
             if key is not UNUSED:
                 yield key

This mixes the order of iteration up somewhat, just enough to avoid
misleading people into thinking that this is order-preserving. The
order will depend on the size of the dict, not the history. For
example, with keys a, b, c, ... added in that order, the output is:

len=1  key=a
len=2  key=ba
len=3  key=bca
len=4  key=dabc
len=5  key=deabc
len=6  key=bcdefa
len=7  key=fgabcde
len=8  key=habcdefg
len=9  key=efghiabcd

which I expect is mixed up enough to discourage programmers from
jumping to conclusions about dicts having guaranteed order.

-- 
Steven