[Python-Dev] Counting collisions for the win

Steven D'Aprano steve at pearwood.info
Fri Jan 20 05:00:48 CET 2012

Victor Stinner wrote:

> The last solution is very simple: count collision and raise an
> exception if it hits a limit. ...
> According to my basic tests, a limit of 35 collisions
> requires a dictionary with more than 10,000,000 integer keys to raise
> an error. I am not talking about the attack, but valid data.

You might think that 10 million keys is a lot of data, but that's only about 
100 MB worth. I already see hardware vendors advertising computers with 6 GB 
RAM as "entry level", e.g. the HP Pavilion starts with 6GB expandable to 16GB. 
I expect that there are already people using Python who will unpredictably hit 
that limit by accident, and the number will only grow as computers get more 

With a limit of 35 collisions, it only takes 35 keys to to force a dict to 
raise an exception, if you are an attacker able to select colliding keys. 
We're trying to defend against an attacker who is able to force collisions, 
not one who is waiting for accidental collisions. I don't see that causing the 
dict to raise an exception helps matters: it just changes the attack from 
"keep the dict busy indefinitely" to "cause an exception and crash the 

This moves responsibility from dealing with collisions out of the dict to the 
application code. Instead of solving the problem in one place (the built-in 
dict) now every application that uses dicts has to identify which dicts can be 
attacked, and deal with the exception.

That pushes the responsibility for security onto people who are the least 
willing or able to deal with it: the average developer, who neither 
understands nor cares about security, or if they do care, they can't convince 
their manager to care.

I suppose an exception is an improvement over the application hanging 
indefinitely, but I'd hardly call it a fix.

Ruby uses randomized hashes. Are there any other languages with a dict or 
mapping class that raises on too many exceptions?


More information about the Python-Dev mailing list