At 05:02 PM 5/30/03 -0500, Scott A Crosby wrote:
The python community will have to judge whether the performance difference of the current hash is worth the risk of the attack.
Note that the "community" doesn't really have to judge. An individual developer can, if they have an application they deem vulnerable, do something like this: class SafeString(str): def __hash__(self): # code to return a hash code safe = SafeString(string_from_untrusted_source) and then use only these "safe" strings as keys for a given dictionary. Or, with a little more work, they can subclass 'dict' and make a dictionary that converts its keys to "safe" strings. As far as current vulnerability goes, I'd say that the most commonly available attack point for this would be CGI programs that accept POST operations. A POST can supply an arbitrarily large number of form field keys. If you can show that the Python 'cgi' module is vulnerable to such an attack, in a dramatic disproportion to the size of the data transmitted (since obviously it's as much of a DoS to flood a script with a large quantity of data), then it might be worth making changes to the 'cgi' module, or at least warning the developers of alternatives to CGI (e.g. Zope, Quixote, SkunkWeb, CherryPy, etc.) that alternate hashes might be a good idea. But based on the discussion so far, I'm not sure I see how this attack would produce an effect that was dramatically disproportionate to the amount of data transmitted.