[Python-Dev] Algoritmic Complexity Attack on Python
Scott A Crosby
scrosby@cs.rice.edu
29 May 2003 18:29:45 -0500
On Thu, 29 May 2003 18:31:56 -0400, "Tim Peters" <tim@zope.com> writes:
> [Jeremy Hylton]
> >> I just too a minute too look at this. I downloaded the python-attack
> >> file from your Web site. I loading all the strings and then inserted
> >> them into a dictionary. I also generated a list of 10,000 random
> >> strings and inserted them into a dictionary.
>
> [Scott Crosby]
> > Ok. It should have taken almost a minute instead of .08 seconds in the
> > attack version. My file is broken. I'll be constructing a new one
> > later this evening. If you test perl with the perl files, you'll see
> > what should have occured in this case.
>
> Note that the 10,000 strings in the file map to 400 distinct 32-bit hash
> codes under Python's hash. It's not enough to provoke worst-case behavior
Yes. Jeremey has made me aware of this. I appear to have made a
mistake when inserting python's hash code into my program that finds
generators. The fact that I find so many collisions, but I don't have
everything colliding indicates what the problem is.
> in Python just to collide on the low-order bits: all 32 bits contribute to
> the probe sequence (just colliding on the initial bucket slot doesn't have
> much effect).
Correct. My program, as per the paper, generates full 32 bit hash
collisions. It doesn't generate bucket collisions.
> As is, it's effectively creating 400 collision chains, ranging in
> length from 7 to 252, with a mean length of 25 and a median of 16.
Yes. I don't know python so I had no test program to verify that my
attack file was correct. Its not. Now that I have one, I'll quash the
bug and release a new file later this evening.
Scott