[Python-ideas] An identity dict
Benjamin Peterson
benjamin at python.org
Tue Jun 1 23:17:25 CEST 2010
Raymond Hettinger <raymond.hettinger at ...> writes:
> Benjamin, could you elaborate of several points that are unclear:
>
> * If id() is expensive in PyPy, then how are they helped by the code in
> http://codespeak.net/svn/pypy/trunk/pypy/lib/identity_dict.py
> which uses id() for the gets and sets and contains?
At the top of that file, it imports from the special module __pypy__ which
contains an optimized version of the dict.
>
> * In the examples you posted (such
as http://codespeak.net/svn/pypy/trunk/pypy/tool/algo/graphlib.py ),
> it appears that PyPy already has an identity dict, so how are they helped by
adding one to the collections module?
My purpose with those examples was to prove it as a generally useful utility.
>
> * Most of the posted examples already work with regular dicts (which check
identity before they check equality) -- don't the other implementations already
implement regular dicts which need to have identity-implied-equality in order to
pass the test suite? I would expect the following snippet to work under all
versions and implementations of Python:
>
>
> >>> class A:
> ... pass
> >>> a = A()
> >>> d = {a: 10}
> >>> assert d[a] == 10 # uses a's identity for lookup
Yes, but that would be different if you have two "a"s with __eq__ defined to be
equal and you want to hash them separately.
>
> * Is the proposal something needed for all implementations or is it just an
optimization for a particular, non-CPython implementation?
My contention is that an identity dictionary or at least a dictionary with
custom hash and keys is a useful primitive that should be in the standard
library. However, I also see its advantage in avoiding bad performance of id()
based identity dicts in non-CPython implementations.
It is useful to let the implementation optimize it any time there is moving GC
as in Jython and IronPython where id also is expensive. (Basically a mapping has
to be maintained for all objects on which id is called.)
More information about the Python-ideas
mailing list