[Python-Dev] Counting collisions for the win
Guido van Rossum
guido at python.org
Fri Jan 20 19:20:08 CET 2012
This is the first objection I have seen against collision-counting that
On Fri, Jan 20, 2012 at 7:55 AM, Frank Sievertsen <pydev at sievertsen.de>wrote:
> I still see at least two ways to create a DOS attack even with the
> I assumed that it's possible to send ~500KB of payload to the
> 1. It's fully deterministic which slots the dict will lookup.
> Since we don't count slot-collisions, but only hash-value-collisions
> this can be exploited easily by creating strings with the hash-values
> along the lookup-way of an arbitrary (short) string.
> So first we pick an arbitrary string. Then calculate which slots it will
> visit on the way to the first empty slot. Then we create strings with
> hash-values for these slots.
> This attack first injects the strings to fill all the slots that the
> one short string will want to visit. Then it adds THE SAME string
> again and again. Since the entry is already there, nothing will be added
> and no additional collisions happen, no exception raised.
> $ ls -l super.txt
> -rw-r--r-- 1 fx5 fx5 520000 20. Jan 10:19 super.txt
> $ tail -n3 super.txt
> $ wc -l super.txt
> 90000 super.txt
> $ time python -c 'dict((unicode(l[:-1]), 0) for l in open("super.txt"))'
> real 0m52.724s
> user 0m51.543s
> sys 0m0.028s
> 2. The second attack actually attacks that 1000 allowed string
> comparisons are still a lot of work.
> First I added 999 strings that collide with a one-byte string "a". In
> some applications a zero-byte string might work even better. Then I
> can add a many thousand of the "a"'s, just like the first attack.
> $ ls -l 1000.txt
> -rw-r--r-- 1 fx5 fx5 500000 20. Jan 16:15 1000.txt
> $ head -n 3 1000.txt
> $ wc -l 1000.txt
> 247000 1000.txt
> $ tail -n 3 1000.txt
> $ time python -c 'dict((unicode(l[:-1]), 0) for l in open("1000.txt"))'
> real 0m17.408s
> user 0m15.897s
> sys 0m0.008s
> Of course the first attack is far more efficient. One could argue
> that 16 seconds is not enough for an attack. But maybe it's possible
> to send 1MB, have zero-bytes strings, and since for example django
> does 5 lookups per query-string this will keep it busy for ~80 seconds on
> my pc.
> What to do now?
> I think it's not smart to reduce the number of allowed collisions
> AND count all slot-collisions at the same time.
> Python-Dev mailing list
> Python-Dev at python.org
> Unsubscribe: http://mail.python.org/**mailman/options/python-dev/**
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Python-Dev