[Python-Dev] Algoritmic Complexity Attack on Python

Tim Peters tim_one@email.msn.com
Thu, 29 May 2003 23:00:35 -0400


[Scott A Crosby]
> These ways require me having the ability to feed a program, an
> expression, or a regular expression into the victim's python
> interpreter.

I think you underestimate the perversity of regular expressions in
particular.  Many programs use (fixed) regexps to parse input, and it's
often possible to construct inputs that take horribly long times to match,
or, especially, to fail to match.  Blowing the hardware stack (provoking a
memory fault) is also usually easy.  The art of writing robust regexps for
use with a backtracking engine is obscure and difficult (see Friedl's
"Mastering Regular Expressions" (O'Reilly) for a practical intro to the
topic), and regexps are ubiquitous now.

> The attack I discuss only require that it hash some arbitrary input by
> the attacker, so these attacks apply in many more cases.

While a regexp attack only requires that the program parse one user-supplied
string <wink>

>>     http://www.cs.dartmouth.edu/~doug/mdmspe.pdf

> This is a very cool paper in exactly the same vein as ours. Thanks.

It is a cool paper, and you're welcome.  I don't think it's in the same
vein, though -- McIlroy presented it as an interesting discovery, not as "a
reason" for people to get agitated about programs using quicksort.  The most
likely reason you didn't find references to it before is because nobody in
real life cares much about this attack possibility.

>> I'm uninterested in trying to "do something" about these.  If
>> resource-hogging is a serious potential problem in some context, then
>> resource limitation is an operating system's job, and any use of
>> Python (or Perl, etc) in such a context should be under the watchful
>> eyes of OS subsystems that track actual resource usage.

> I disagree. Changing the hash function eliminates these attacks on
> hash tables.

It depends on how much access an attacker has, and, as you said before,
you're not aware of any specific application in Python that *can* be
attacked this way.  I'm not either.

In any case, the universe of resource attacks is much larger than just
picking on hash functions, so plugging a hole in those alone wouldn't do
anything to ease my fears -- provided I had such fears, which is admittedly
a stretch <wink>.  If I did have such fears, I'd want the OS to alleviate
them all at once.