Dictionary keys (again) (was Re: lambda)

Nick Coghlan ncoghlan at iinet.net.au
Thu Jan 20 04:49:16 EST 2005


Antoon Pardon wrote:
> As you said yourself, people use shortcuts when they express themselves.
> With 'mutable key' I mean a mutable object that is used as a key.
> That doesn't contradict that the key itself can't change because
> it is inaccessible.

Yeah - this was the point of terminology that was getting confused, I think. And 
your interpretation was closer to the strict sense of the words.

>>Well, the Zen of Python states "In the face of ambiguity, refuse the temptation 
>>to guess".
> 
> But python does guess in the case of the value. If I want to establish a
> relationship between "foo" and [3, 5, 8] then I can be in big trouble
> if that list gets mutated in something else.

However, the actual promise a Python dict makes is that it maps from keys by 
value to values by identity.

That is, "k1 == k2" implies "d[k1] is d[k2]". An identity test on mutable 
objects is entirely unambiguous (hence they can be values), but comparison is 
not (hence they cannot be keys). Restricting keys to immutable objects 
eliminates the ambiguity. Copying keys would also prevent mutation and eliminate 
the ambiguity. The restriction approach is a *definite* performance enhancement 
over the copying approach, though (this is probably the origin of the mantra 
"the restriction to mutable keys is a performance enhancement"). The resulting 
semantics of the restriction to immutable objects are also significantly less 
surprising than those of the copying approach.

An identity dict would make the promise that "k1 is k2" implies "d[k1] is 
d[k2]". In this case, the mutability of the key is irrelevant, so the behaviour 
is unambiguous without any additional restrictions.

> But they don't have to be immutable within the dictionary itself, that
> is just an implementation detail. The only thing that matters is that
> the dictionary can reproduce the relationship. How that is done is
> immaterial. 

Given the Python interpreter's current definition of "immutable" as "defines 
__hash__", it's a little hard to have an unhashable object as a key :)

>>In short, sane hash tables require immutable keys, and how mutable keys acquire 
>>the requisite immutability is going to be application dependent.
> 
> 
> No they don't, that is just the most easy implementation. If some
> implementation would constantly change the internal objects but 
> consulting the dictionary wouldn't show any different effect then
> that would be fine too. And even in this case a mutation
> from the outside would probably cause havoc because it would ruin
> the sanity of the internal structure.

If you insist: s/immutable/effectively immutable/

And no data type can be expected to tolerate having their invariants violated by 
external code messing with internal data structures.

>>Provision of a safe_dict or identity_dict would merely allow a programmer to 
>>state their intent at the time the container is created, rather than having to 
>>state it whenever keys are generated or referenced.
> 
> Well that would be a good thing IMO.

True - I'm not sure how that 'merely' snuck in there :)

> Good enough in any case for me to
> spend some time analyzing the requirements. Whether I'll actually
> implement something depends on how much time I have and how often I
> need it.

My preference is for the identity dict - its performance would actually be 
*better* than that of a standard dict in the domains where it applies 
(annotation and classification of objects are the two main uses I can see for 
such a data structure). This aligns quite well with the stated goal of the 
collections module (providing data structures that are less general purpose than 
the standard builtins, but deliver better performance for their particular use 
cases)

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at email.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://boredomandlaziness.skystorm.net



More information about the Python-list mailing list