[Python-Dev] Algorithmic Complexity Attack on Python

Guido van Rossum guido@python.org
Sun, 01 Jun 2003 09:16:07 -0400


> [Guido]
> > Of course, such programs are already vulnerable to changes in the hash
> > implementation between Python versions (which has happened before).

[Tim]
> Last Thursday, Jim and I speculated about adding a persistent set
> type to Zope, implemented via an IOBTree under the covers, mapping
> int hash codes to lists of elements.  I'm sure someone else has
> already done something "like that"; I know I have in the past, but
> only to run monstrously large hash-function evaluation tests, where
> I needed to accumulate results across multiple runs.

I hope I can dissuade you from using Python's default hash codes; they
also differ across platforms with different word lengths.  I don't
know enough about the requirements for this proposed persistent set
type; perhaps it would be acceptable to require that the set elements
define a method returning a unique identifier that is an immutable
object built only from Python built-in types?
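
Roughly the kind of thing I have in mind, as a sketch only: the names
(stable_id, stable_int_key, PersistentSetSketch) are made up, and it
assumes ZODB's IOBTree is importable.

    import zlib
    from BTrees.IOBTree import IOBTree  # assumes the ZODB BTrees package

    def stable_int_key(obj):
        # Derive a deterministic 31-bit int from the element's own
        # stable_id(), e.g. ('user', 42), instead of calling hash().
        # This assumes the identifier's repr() is stable across
        # releases and platforms (built-in types only).
        ident = obj.stable_id()
        return zlib.crc32(repr(ident).encode('utf-8')) & 0x7fffffff

    class PersistentSetSketch:
        """Maps int keys to lists of elements, as in Tim's description."""

        def __init__(self):
            self._buckets = IOBTree()

        def add(self, obj):
            key = stable_int_key(obj)
            bucket = self._buckets.get(key, [])
            if obj not in bucket:
                bucket.append(obj)
                self._buckets[key] = bucket  # reassign so the BTree notices

        def __contains__(self, obj):
            return obj in self._buckets.get(stable_int_key(obj), [])

The point is only that the key is computed from data the application
controls rather than from hash(), so it survives a change of Python
version, platform word size, or hash algorithm.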

> It's curious that persistence schemes are fragile in areas Python
> doesn't really care about, like how objects of different types
> compare to each other, how class instances that don't define __cmp__
> compare to each other, and in exactly which bit pattern
> hash(whatever) returns.

I doubt this is a coincidence.  When designing Python's run time, I
was very well aware of process boundaries and lifetime, and used them
to limit the scope of universal truths.  I wasn't really thinking of
persistence.

> In Python 3 it may be nice to define the results of such things more
> carefully in platform- and release-independent ways -- or maybe to
> raise exceptions instead of making results up, forcing users to
> confront the issues explicitly.  Lots of tradeoffs here ...

For comparisons, I've long considered the current "everything can be
ordered with respect to everything else" paradigm broken, and for 3.0
I plan to drop it: == and != will be the only comparisons allowed
between objects of different types that don't specifically cater to
cross-type comparisons (the default being unequal if the types
differ).  Ordering comparisons between such types will default to an
exception.
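
Concretely, the intent is that a session would look something like
this (mixed-type equality quietly answers "not equal", mixed-type
ordering raises):

    >>> 1 == 'abc'
    False
    >>> 1 != 'abc'
    True
    >>> 1 < 'abc'
    Traceback (most recent call last):
      ...
    TypeError: '<' not supported between instances of 'int' and 'str'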

But this won't affect hashing; I don't think I'd like to see the hash
function fixed by the language standard, so it may still vary between
releases (or between runs, for Python implementations that follow
Scott's recommendation).
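
The essence of Scott's recommendation, as I understand it, is to key
the hash with a per-process random value, along these lines (a toy
sketch only, not a proposal for CPython's actual string hash):

    import os

    _SECRET = int.from_bytes(os.urandom(8), 'little')  # chosen once per run

    def randomized_str_hash(s):
        # Starting from a per-run secret means an attacker cannot
        # precompute colliding keys; the mixing step itself is just a
        # simple multiply-and-xor, kept deliberately small here.
        h = _SECRET
        for ch in s:
            h = ((h * 1000003) ^ ord(ch)) & 0xffffffffffffffff
        return h ^ len(s)

The observable consequence is the one that matters for persistence:
the same string hashes to different values in different runs, so the
value must never be written to disk.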

--Guido van Rossum (home page: http://www.python.org/~guido/)