Sets in Python

Karthik Gurusamy kar1107 at gmail.com
Wed Sep 19 22:58:03 CEST 2007


On Sep 19, 6:16 am, Sion Arrowsmith <si... at chiark.greenend.org.uk>
wrote:
> sapsi  <saptarshi.g... at gmail.com> wrote:
> > Why can't lists be hashed?
>
> Several people have answered "because they're mutable" without
> explaining why mutability precludes hashing. So:
>
> Consider a dict (dicts have been in Python a *lot* longer than
> sets, and have the same restriction) which allowed lists as
> keys:
>
> d = {}
> k = [1, 2]
> d[k] = None
>
> Now, if I were to do:
>
> k.append(3)
>
> what would you expect:
>
> d.keys()
>
> to return? Did d magically rehash k when it was modified? Did d[k]
> take a copy of k, and if so, how deep was the copy (consider
> d[[1, k]] = None followed by a modification to k)? Leaving the hash
> unchanged and relying on collision detection to resolve won't work,
> since you may go directly for d[[1, 2, 3]] and not spot that
> there's already an entry for it since it's been hashed under [1, 2].
>
> "Practicality beats purity" and the design decision was to simply
> sidestep these issues by disallowing mutable dict keys. And as the
> set implementation is based on the dict implementation, it applies
> to sets to.

While it's easy to explain the behavior, I think the decision to dis-
allow mutable items as keys is a bit arbitrary. There is no need for
dict to recompute hash (first of all, a user doesn't even need to know
if underneath 'hashing' is used -- the service is just a mapping
between one item to another item).

Since we know hashing is used, all that is needed is, a well-defined
way to construct a hash out of a mutable. "Given a sequence, how to
get a hash" is the problem. If later the given sequence is different,
that's not the dict's problem.

>>> d = {}
a = 10
>>> d[a] = 'foo'
>>> d[5+5] = 'bar'
>>> d[10]
'bar'

aren't the '5+5' which is 10, is different from the previous line's
a?.. so
why not allow similar behavior with lists/other sequence/even other
collections. As long as two objects compare equal the hash-result must
be the same. I guess this takes us to defining the equality operation
for lists-- which I think has a very obvious definition (ie same
length and the ith element of each list compare equal).

So if the list changes, it will result in a different hash and we will
get a hash-miss. I doubt this is in anyway less intuitive than dis-
allowing mutable items as keys.

Karthik


>
> --
> \S -- si... at chiark.greenend.org.uk --http://www.chaos.org.uk/~sion/
>    "Frankly I have no feelings towards penguins one way or the other"
>         -- Arthur C. Clarke
>    her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump





More information about the Python-list mailing list