Defensive programming

Sun Jun 1 13:23:41 EDT 2003

Paul Rubin <http://phr.cx@NOSPAM.invalid> wrote in message news:<7xptlxq2yz.fsf at ruckus.brouhaha.com>...
> Jack Diederich <jack at performancedrivers.com> writes:
> > The author brought this up on python-dev and the 'bots did indeed
> > disect the arguments.  The verdict was that it is interesting but not a
> > problem (or at least a solvable problem) in practice.  Specific problems are 
> > best solved at specific layers, and DoS attacks that eat CPU are best solved
> > at the operating system level.
> 
> I think there's something to be said for using collision resistant
> hashes whenever any kind of hash is needed.  Why fall back on the OS
> to get you out of trouble, when you can avoid getting in trouble in
> the first place?  I haven't read the papers yet though.

What in the world are collision resistant hashes?  It seems to me that
a certain number of collisions are all but guaranteed for any given
hash table size, depending on the range of the input.  For example, if
you are hashing strings consisting of size 11 consisting of [A-Z] (DOS
style file names, for example) into a table of size 5693 (a reasonable
value, being a prime and all), then I would expect there to be a
sequence of filenames that could generate a collision on every insert.
 The simple reason is that there are 5693 spaces in the table and 3.67
x 10^15 possible inputs, so at least 6.44 x 10 ^ 11 inputs must hash
to one of those 5693 spaces in the table by the pigeonhole principle.

Of course, the hash table would have to grow many times if you tried
to insert 6.44 x 10 ^ 11 filenames into it, but that would just chew
more CPU.