[Python-Dev] Algoritmic Complexity Attack on Python

Sat, 31 May 2003 09:17:16 -0400

At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote:

>The python community will have to judge whether the performance
>difference of the current hash is worth the risk of the attack.

Note that the "community" doesn't really have to judge.  An individual 
developer can, if they have an application they deem vulnerable, do 
something like this:

class SafeString(str):
     def __hash__(self):
         # code to return a hash code

safe = SafeString(string_from_untrusted_source)

and then use only these "safe" strings as keys for a given dictionary.  Or, 
with a little more work, they can subclass 'dict' and make a dictionary 
that converts its keys to "safe" strings.

As far as current vulnerability goes, I'd say that the most commonly 
available attack point for this would be CGI programs that accept POST 
operations.  A POST can supply an arbitrarily large number of form field keys.

If you can show that the Python 'cgi' module is vulnerable to such an 
attack, in a dramatic disproportion to the size of the data transmitted 
(since obviously it's as much of a DoS to flood a script with a large 
quantity of data), then it might be worth making changes to the 'cgi' 
module, or at least warning the developers of alternatives to CGI (e.g. 
Zope, Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a 
good idea.

But based on the discussion so far, I'm not sure I see how this attack 
would produce an effect that was dramatically disproportionate to the 
amount of data transmitted.