[Python-Dev] gc ideas -- sparse memory

Steven D'Aprano steve at pearwood.info
Sat Dec 4 11:54:31 CET 2010


Martin v. Löwis wrote:
>> I'm afraid I don't follow you. Unless you're suggesting some sort of
>> esoteric object system whereby objects *don't* have identity (e.g. where
>> objects are emergent properties of some sort of distributed,
>> non-localised "information"), any object naturally has an identity --
>> itself.
> 
> Not in Java or C#. In these languages it is possible to determine
> whether two references refer to the same object. However, objects don't
> naturally have a distinct identification (be it an integer or something
> else).

Surely even in Java or C#, objects have an *identity* even if the 
language doesn't provide a way to query their distinct *identification*. 
An object stored at one memory location is distinct from any other 
object stored at a different memory location at that same moment in time, 
regardless of whether or not the language gives you a convenient label 
for that identity. Even if that memory location can change during the 
lifetime of the object, at any one moment, object X is a different 
object from every other object.

The fact that we can even talk about "this object" versus "that object" 
implies that objects have identity.

To put it in Python terms, if the id() function were removed, it would 
no longer be possible to get the unique identification associated with 
an object, but you could still compare the identity of two objects using 
`is`.
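
As a small illustration of that point (the variable names are just for 
the example), identity can be compared with `is` without ever asking for 
an identification number:

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal to a, but a separate object
c = a           # a second name for the very same object

same_ab = a is b     # False: two distinct objects
same_ac = a is c     # True: one object, two names
equal_ab = a == b    # True: equality is a different question from identity
```

Nothing here calls id(); the identity comparison works without it.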

Of course, I'm only talking about objects. In Java there are values 
which aren't objects, such as ints and floats. That's irrelevant for our 
discussion, because Python has no such values.


> If you really want to associate unique numbers with objects in these
> languages, the common approach is to put them into an identity
> dictionary as keys.
> 
>> It seems counter-productive to me to bother with an identity function
>> which doesn't meet that constraint. If id(x) == id(y) implies nothing
>> about x and y (they may, or may not, be the same object) then what's the
>> point?
> 
> See James' explanation: it would be possible to use this as the
> foundation of an identity hash table.

I'm afraid James' explanation didn't shed any light on the question for 
me. It seems to me that Java's IdentityHashValue [sic -- I think the 
correct method name is actually System.identityHashCode] is equivalent 
to Python's hash(), not to Python's id(), and claiming it is related to 
identity is misleading and confusing.

I don't think I'm alone here -- it seems to me that even among Java 
programmers, the same criticisms have been raised:

http://bugs.sun.com/bugdatabase/view_bug.do?bug%5Fid=6321873
http://deepakjha.wordpress.com/2008/07/31/interesting-fact-about-identityhashcode-method-in-javalangsystem-class/

Like hash(), IdentityHashCode doesn't make any promises about identity 
at all. Two distinct objects could have the same hash code, and a 
further test is needed to distinguish them.
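
To make the "further test" concrete, here is a rough sketch of an 
identity-keyed table in Python. The names IdentityTable and weak_hash are 
made up for this example, and this is not a claim about how Java 
implements anything: the hash only selects a bucket, and `is` does the 
actual distinguishing.

```python
class IdentityTable:
    """Toy identity-keyed table (hypothetical, for illustration only)."""

    def __init__(self, hash_func, nbuckets=8):
        self._hash = hash_func
        self._buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, key):
        # The hash value is only used to pick a bucket.
        return self._buckets[self._hash(key) % len(self._buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for entry in bucket:
            if entry[0] is key:      # identity test, not equality
                entry[1] = value
                return
        bucket.append([key, value])

    def get(self, key):
        for k, v in self._bucket(key):
            if k is key:             # the "further test"
                return v
        raise KeyError(key)


def weak_hash(obj):
    # Deliberately poor hash: every object collides.
    return 0


x, y = [], []                # equal, colliding, yet distinct objects
table = IdentityTable(weak_hash)
table.set(x, "x")
table.set(y, "y")
```

Even though x == y and both land in the same bucket, table.get(x) and 
table.get(y) return different values, because the lookup falls back on 
`is`. The hash value by itself says nothing about identity.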


>> Why would you bother using that function when you could just use
>> x == y instead?
> 
> Because in a hash table, you also need a hash value.

Well, sure, in a hash table you need a hash value. But I was talking 
about an id() function.

So is that it? Is IdentityHashValue (or *Code, as the case may be) just 
a longer name for hash()?



-- 
Steven


