
2010/6/1 Benjamin Peterson <benjamin@python.org>:
> My contention is that an identity dictionary or at least a dictionary with custom hash and keys is a useful primitive that should be in the standard library. However, I also see its advantage in avoiding bad performance of id() based identity dicts in non-CPython implementations.
It is useful to let the implementation optimize it whenever there is a moving GC, as in Jython and IronPython, where id() is also expensive. (Basically, a mapping has to be maintained for every object on which id() is called.)
Here is how I designed this for my language:

You can request the ObjectId of a given object. As long as an ObjectId corresponding to that object is still alive, you always get the same one back; otherwise it can be GC'ed and later created afresh. ObjectIds are hashable and comparable (with an arbitrary ordering). Hash values and the ordering are preserved while ObjectIds are kept alive, but they may change if ObjectIds are created afresh. An ObjectId contains an integer index which is unique among the ObjectIds alive at the same time.

You can make a dictionary with a specified key function. It is internally backed by something equivalent to an f(k) -> (k, v) dict. A dictionary with the ObjectId constructor as the key function is an identity dictionary; it works because the dictionary keeps both k and f(k) alive.

An advantage of this scheme is that with a moving GC the id mapping must be maintained only for objects whose ObjectIds the program keeps alive. A disadvantage is that the program must be careful not to use ObjectIds in a way which does not keep them alive yet expects consistent hashing and ordering. In particular, a key-mapped dict which stored only (k, v) pairs and computed f(k) on the fly would not work. Also, ObjectIds cannot be used to generate printable unique identifiers that stay valid without keeping the ObjectIds alive, as in Python's default repr.
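To make this concrete, here is a rough sketch in CPython terms. All the names are mine, and it leans on CPython's id() and reference counting in exactly the places where a moving-GC runtime would substitute its own internal bookkeeping:

import itertools
import weakref


class ObjectId:
    """Sketch of an ObjectId (the names here are mine, not an existing API).

    While an ObjectId for a given object is alive, for_object() returns that
    same ObjectId, so its hash and ordering stay consistent.  Once every
    ObjectId for the object has been collected, a later request creates a
    fresh one, possibly with a different index.
    """

    _counter = itertools.count(1)
    _live = {}        # id(obj) -> weakref to the currently live ObjectId

    __slots__ = ("_index", "_obj", "__weakref__")

    def __init__(self, index, obj):
        self._index = index
        self._obj = obj          # keeps obj alive so id(obj) cannot be reused

    @classmethod
    def for_object(cls, obj):
        key = id(obj)            # CPython shortcut; a moving-GC VM would use
                                 # its own internal identity bookkeeping here
        ref = cls._live.get(key)
        oid = ref() if ref is not None else None
        if oid is None:
            oid = cls(next(cls._counter), obj)
            cls._live[key] = weakref.ref(
                oid, lambda _r, key=key: cls._live.pop(key, None))
        return oid

    # Hashing and ordering are based on the index, so they are stable only
    # for as long as this ObjectId stays alive.
    def __hash__(self):
        return hash(self._index)

    def __eq__(self, other):
        return isinstance(other, ObjectId) and self._index == other._index

    def __lt__(self, other):
        return self._index < other._index


class KeyedDict:
    """A mapping with a user-supplied key function, backed by f(k) -> (k, v)."""

    def __init__(self, key):
        self._key = key
        self._data = {}          # f(k) -> (k, v); keeps both k and f(k) alive

    def __setitem__(self, k, v):
        self._data[self._key(k)] = (k, v)

    def __getitem__(self, k):
        return self._data[self._key(k)][1]

    def __delitem__(self, k):
        del self._data[self._key(k)]

    def __contains__(self, k):
        return self._key(k) in self._data

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return (k for k, _v in self._data.values())


# An identity dictionary is then just a KeyedDict keyed by ObjectId:
d = KeyedDict(key=ObjectId.for_object)
a, b = [], []                    # equal lists, but distinct objects
d[a] = "first"
d[b] = "second"
assert d[a] == "first" and d[b] == "second" and len(d) == 2

Because the backing dict stores (k, v) under f(k), it keeps both the key and its ObjectId alive, which is what keeps the hashing stable; a variant that stored only (k, v) and recomputed f(k) on each lookup would break, as noted above.

--
Marcin Kowalczyk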