Sets in Python

Bryan Olson fakeaddress at nowhere.org
Thu Sep 20 05:58:05 CEST 2007


Karthik Gurusamy wrote:
> While it's easy to explain the behavior, I think the decision to dis-
> allow mutable items as keys is a bit arbitrary. 

Furthermore, it's not really true.

     class Blurf (object):
         def __init__(self, intval):
             self.seti(intval)
         def seti(self, intval):
             self.i = intval


Blurf sure looks mutable, yet Python will happily use Blurfs as
dict keys. The trick is that the dict key is the Blurf's identity,
not its state. We could switch to using state as the key by
implementing methods __hash__ and either __cmp__ or __eq__.


[...]
> why not allow similar behavior with lists/other sequence/even other
> collections. As long as two objects compare equal the hash-result must
> be the same. 

That's a good rule to follow in programming one's own classes.
Keep __hash__, __cmp__ and __eq__ consistent.


> I guess this takes us to defining the equality operation
> for lists-- which I think has a very obvious definition (ie same
> length and the ith element of each list compare equal).

Already done. The "==" operator compares lists by their contents;
the "is" operator, by identity.

> So if the list changes, it will result in a different hash and we will
> get a hash-miss. I doubt this is in anyway less intuitive than dis-
> allowing mutable items as keys.

If it's any consolation, you can build your own:

     class HashableList (list):
         def __hash__(self):
             return tuple(self).__hash__()

You'd probably also want to implement slicing.


Bad news: Python 3000 has no immutable type for byte-strings.
The new bytes type cannot serve for dict keys or set members.
Many things one would want to hash are unhashable -- for
example, the results of the hash functions in hashlib.


-- 
--Bryan



More information about the Python-list mailing list