[Python-3000] callable()

Nick Coghlan ncoghlan at gmail.com
Tue Jul 25 15:53:37 CEST 2006


Andrew Koenig wrote:
>> All of which is a long-winded way of saying "calculation of an object hash
>> should be both cheap and idempotent" :)
> 
> Actually, I disagree -- I don't see why there's anything wrong with a hash
> being expensive to calculate the first time you do it.

True, but if you cache the result, the amortized cost may still work out to be 
cheap.

> For example, consider a string type in which the hash algorithm examines
> every character of the string.  Those characters had to get there in the
> first place, so the total time spent computing the hash is no more than a
> constant multiple of the time spent creating the string.  Nevertheless, it
> seems reasonable to me to defer the effort of computing the hash until you
> know that it's needed -- that is, until the first time you are asked to
> compute the hash.
> 
> If you're going to say that computing the hash isn't expensive compared with
> dealing with the string itself, then I'll reply that computing the hash of a
> CD isn't expensive either, if you compare it with dealing with the CD itself
> :-)

The difference between the two cases is that when you create the string 
object, you have all the information you need to calculate the hash when you 
need it, so there's no problem with deferring the actual calculation.
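
A minimal sketch of that deferred calculation, assuming a hypothetical
LazyText wrapper class (the class and its attribute names are made up for
illustration and are not how the built-in string type is implemented):

```python
class LazyText:
    """Hypothetical immutable string wrapper that hashes itself on demand."""

    __slots__ = ("_chars", "_hash")

    def __init__(self, chars):
        self._chars = str(chars)  # everything the hash needs is captured here
        self._hash = None         # nothing computed yet

    def __eq__(self, other):
        if not isinstance(other, LazyText):
            return NotImplemented
        return self._chars == other._chars

    def __hash__(self):
        if self._hash is None:
            # Work proportional to the length, done at most once per object.
            self._hash = hash(self._chars)
        return self._hash
```

The first hash() call pays the full cost and every later call returns the
cached value, which is safe only because _chars never changes after
__init__.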

With your CD example, you need an external resource (the CD itself) in order 
to calculate the hash - in that case, you can't safely defer the hash 
calculation until the first time you know you need it, since you don't know 
whether or not you'll have access to the physical CD at that point. In such a 
case, an application would need to either hash the CD at object creation time, 
or else use the default hash() for the in-memory objects representing the CDs 
and keep auxiliary data structures that map the physical CD hashes to those 
objects once the hashes are available.
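
The second option might look roughly like this. It is only a sketch: the CD
class, the cds_by_content_hash dict and both helper functions are
hypothetical names, and the content hash is assumed to be computed elsewhere
while the disc is in the drive.

```python
class CD:
    """Hypothetical in-memory record for a physical disc.

    No __eq__ or __hash__ is defined, so instances use the default
    identity-based hash, which never needs the disc itself.
    """

    def __init__(self, label):
        self.label = label


# Auxiliary mapping: content hash of the physical disc -> CD object.
cds_by_content_hash = {}


def record_content_hash(cd, content_hash):
    """Register a disc's content hash once the disc has actually been read."""
    cds_by_content_hash[content_hash] = cd


def find_by_content_hash(content_hash):
    """Return the CD registered under this content hash, or None."""
    return cds_by_content_hash.get(content_hash)
```

The CD objects can go into sets and dicts straight away, and the expensive
content hash lives only in the side table, filled in whenever the disc
happens to be readable.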

Having __hash__ depend on attributes which are not defined at object creation 
time is just asking for trouble. This aligns with the concept that the hash 
should be based solely on immutable aspects of the object, which _have_ to be 
defined at object creation time :)
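
To illustrate the kind of trouble meant here, a small sketch with a
hypothetical Late class whose hash depends on a disc_id attribute that is
only filled in after the object has been created:

```python
class Late:
    """Hypothetical class whose hash depends on an attribute set later."""

    def __init__(self):
        self.disc_id = None  # not known at creation time

    def __eq__(self, other):
        return isinstance(other, Late) and self.disc_id == other.disc_id

    def __hash__(self):
        return hash(self.disc_id)


obj = Late()
seen = {obj}         # stored under hash(None)
obj.disc_id = 42     # the hash changes while the object sits in the set
found = obj in seen  # the lookup now uses the new hash value
```

In CPython the set compares the stored hash with the new one before anything
else, so found ends up False even though obj is still physically in the set.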

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------
             http://www.boredomandlaziness.org

