Tuple question

Bengt Richter bokr at oz.net
Sun Sep 5 21:35:23 CEST 2004

On Sun, 05 Sep 2004 10:35:43 -0400, Roy Smith <roy at panix.com> wrote:

>I asked:
>> How is a class instance's mutability any less of disqualifier for
>> key-ness than a list's mutability?
>Benjamin Niemann <pink at odahoda.de> wrote:
>> a = [1, 2, 3]
>> b = [1, 2, 3]
>> if a == b:
>> 	print "List equality is based on content"
>Tuple (and string) equality is based on content too.  So what?  I can 
>give my data class an __eq__ method, and then my class instance equality 
>would also based on content.
>So, to restate my original question, why should my mutable, 
>content-based-eqality class instance be a valid dictionary key, when a 
>list is not?  Which part of a list's behavior makes it inherently 
>unusable as a key?  I'm not asking about design philosophy, I'm asking 
>about observable behavior.

I don't think a list is _inherently_ unusable, but an immutable sequence
is usable in a different way because of what can be assumed.
I suspect it has something to do with optimizing lookup. For immutables,
equal id should mean equal hash and equal value. If id's are not equal,
hashes only need to be computed once for an immutable, since they can
be cached in the immutables's internal representation (trading a little
space for computation time). If hashes are equal but id's are not, you
either have duplicate tuples/immutables or a rare collision. If you are
forced to compare values in the rare-collision case, I guess you are down
to comparing vectors of pointers, and comparison would then be similar
for tuples and lists. Different lengths would be early out non-equal. Etc.

It does seem like you could allow lists as keys, but it would mean
a performance hit when using them, even if you managed to get type-dependent
dispatching in the internal logic for free. You could still cache a list hash
internally, but you would have to invalidate it on list mutation, which would
add cost to mutation. Optimization tradeoffs ripple in surprising ways, and only
pay off if they are good in real usage patterns. Not easy to get right.

As it is, you could take a big hit and subclass dict to fake it, or you could
sublass list to provide tuple-like hashing and comparing ;-)

Bengt Richter

More information about the Python-list mailing list