[Python-Dev] Counting collisions for the win

Mon Jan 23 07:41:51 CET 2012

On 23 January 2012 16:49, Lennart Regebro <regebro at gmail.com> wrote:

> On Mon, Jan 23, 2012 at 06:02, Paul McMillan <paul at mcmillan.ws> wrote:
> >> We may use a different salt per dictionary.
> >
> > If we're willing to re-hash everything on a per-dictionary basis. That
> > doesn't seem reasonable given our existing usage.
>
> Well, if we get crazy amounts of collisions, re-hashing with a new
> salt to get rid of those collisions seems quite reasonable to me...

Actually, this looks like it has the seed of a solution in it. I haven't
scrutinised the following beyond "it sounds like it could work" - it could
well contain nasty flaws.

Assumption: We only get an excessive number of collisions during an attack
(directly or indirectly).
Assumption: Introducing a salt into hashes will change those hashes
sufficiently to mitigate the attack (all discussion of randomising hashes
makes this assumption).

1. Keep the current hashing (for all dictionaries) i.e. just using
hash(key).

2. Count collisions.

3. If any key hits X collisions change that dictionary to use a random salt
for hashes (at least for str and unicode keys). This salt would be
remembered for the dictionary.

Consequence: The dictionary would need to be rebuilt when an attack was
detected.
Consequence: Hash caching would no longer occur for this dictionary, making
most operations more expensive.
Consequence: Anything relying on the iteration order of a dictionary which
has suffered excessive conflicts would fail.

4. (Optional) in 3.3, provide a way to get a dictionary with random salt
(i.e. not wait for attack).

Tim Delaney
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20120123/96fe8680/attachment.html>