
Raymond Hettinger <raymond.hettinger@...> writes:
Benjamin, could you elaborate of several points that are unclear:
* If id() is expensive in PyPy, then how are they helped by the code in http://codespeak.net/svn/pypy/trunk/pypy/lib/identity_dict.py which uses id() for the gets and sets and contains?
At the top of that file, it imports from the special module __pypy__ which contains an optimized version of the dict.
* In the examples you posted (such
as http://codespeak.net/svn/pypy/trunk/pypy/tool/algo/graphlib.py ),
it appears that PyPy already has an identity dict, so how are they helped by adding one to the collections module?
My purpose with those examples was to prove it as a generally useful utility.
* Most of the posted examples already work with regular dicts (which check
identity before they check equality) -- don't the other implementations already implement regular dicts which need to have identity-implied-equality in order to pass the test suite? I would expect the following snippet to work under all versions and implementations of Python:
>>> class A: ... pass >>> a = A() >>> d = {a: 10} >>> assert d[a] == 10 # uses a's identity for lookup
Yes, but that would be different if you have two "a"s with __eq__ defined to be equal and you want to hash them separately.
* Is the proposal something needed for all implementations or is it just an
optimization for a particular, non-CPython implementation? My contention is that an identity dictionary or at least a dictionary with custom hash and keys is a useful primitive that should be in the standard library. However, I also see its advantage in avoiding bad performance of id() based identity dicts in non-CPython implementations. It is useful to let the implementation optimize it any time there is moving GC as in Jython and IronPython where id also is expensive. (Basically a mapping has to be maintained for all objects on which id is called.)