[Python-Dev] Algoritmic Complexity Attack on Python
Tim Peters
tim.one@comcast.net
Sun, 01 Jun 2003 22:12:11 -0400
[Guido]
> I hope I can dissuade you from using Python's default hash codes;
It's easy to dissuade me; it's Jim we have to worry about <wink>.
> they differ across platforms with different word lengths, too. I don't
> know enough about the requirements for this proposed persistent set
> type; perhaps it would be acceptable to require that the set elements
> define a method that returns a unique identifier which is an immutable
> object using only Python built-in types?
I hope so, but don't know the use cases well enough to guess whether that
would be considered good enough.
> ...
> I doubt this is a coincidence. When designing Python's run time, I
> was very well aware of process boundaries and lifetime, and used them
> to limit the scope of universal truths. I wasn't really thinking of
> persistence.
Neither was I <wink>.
> ...
> For comparisons, I've long considered the current "everything can be
> ordered with respect to everything else" paradigm broken, and for 3.0
> I plan to break this, allowing only == and != comparisons between
> objects of different types that don't specifically cater to cross-type
> comparisons (the default being unequal if the types differ). Ordering
> comparisons will default to an exception.
That would go a long way toward keeping people out of trouble with
persistent BTree-based structures.
> But this won't affect hashing; I don't think I'd like to see the hash
> function fixed by the language standard, so the hash function may
> still vary between releases (or between runs, for Python
> implementations that follow Scott's recommendation).
For Python 3 I hope we (you) can consider another line of flexibility too:
sometimes when I build a hash table, I want an __eq__ that isn't "the
natural" __eq__ for the key objects. For example, using a dict to partition
objects by equivalence class wants to use the equivalence relation of
interest at the moment. This implies a specific dict may want to use a
"non-standard" __hash__ too. Hiding the real objects in wrapper objects
works for this purpose, of course, but sometimes it would be a lot more
convenient to pass hash and eq functions to a dict's constructor.