[Python-Dev] Hashing proposal: change only string-only dicts

"Martin v. Löwis" martin at v.loewis.de
Wed Jan 18 18:52:23 CET 2012


Am 18.01.2012 13:30, schrieb Barry Warsaw:
> On Jan 18, 2012, at 08:19 AM, Martin v. Löwis wrote:
> 
>> My concern is not about breaking doctests: this proposal will also break
>> them. My concern is about applications that assume that hash(s) is
>> stable across runs, and we do have reports that it will break
>> applications.
> 
> I am a proponent of doctests, and thus use them heavily.  I can tell you that
> the issue of dict hashing (non-)order has been well known for *years* and I
> have convenience functions in my own doctests to sort and print dict
> elements.

Indeed. So that breakage may actually be less than people expect.

As for cases that still rely on dict order: none of the proposed
solutions preserve full compatibility in dict order. The only solution
(not actually proposed so far) is to add an AVL tree into the hash
table, to track keys that collide on hash values (rather than hash
slots). Such a tree would be only used if there is an actual collision,
which, in practical dict usage, never occurs.

I've been seriously considering implementing a balanced tree inside
the dict (again for string-only dicts, as ordering can't be guaranteed
otherwise). However, this would be a lot of code for a security fix.
It *would* solve the issue for good, though.

Regards,
Martin


More information about the Python-Dev mailing list