[Python-3000] callable()
Nick Coghlan
ncoghlan at gmail.com
Tue Jul 25 14:53:45 CEST 2006
Andrew Koenig wrote:
>> In both cases, __hash__ is not idempotent, and is thus an abomination.
>
> Why do you say it's not idempotent? The first time you call it, either it
> works or it doesn't. If it doesn't work, then you shouldn't have called it
> in the first place. If it does work, all subsequent calls will return the
> same result.
>
>> Case
>> 1 is a perverse programmer -- well known to be capable of abominations.
>
> What is perverse about case 1? I'm not being disingenuous here; I really
> don't know. I am assuming, of course, that the object in question never
> changes the value of its component once constructed.
I wouldn't call case 1 perverse, but I would call it buggy if x.partialhash()
wasn't idempotent, or if it used the same hash cache as the full hash function.
E.g. there are no state consistency problems with the following:

    def __init__(self):
        self._fullhash = None
        self._partialhash = None

    def partialhash(self, init_hash=None):
        if self._partialhash is None:
            # work it out and set it
        return self._partialhash

    def __hash__(self):
        if self._fullhash is None:
            selfpart = self.partialhash()
            self._fullhash = self.y.partialhash(selfpart)
        return self._fullhash

Alternatively, the __hash__ function could be written in a transactional
style, backing out the call to the partial hash if the hash of the
subcomponent failed:

    def __init__(self):
        self._fullhash = None

    def __hash__(self):
        if self._fullhash is None:
            selfpart = self.partialhash()
            try:
                self._fullhash = self.y.partialhash(selfpart)
            except:
                self._clearpartialhash()
                raise
        return self._fullhash

Either way, if the __hash__ function can fail in a way that can leave the
object in an inconsistent state, then that's a bug in the implementation of
the __hash__ function.
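
As a rough, runnable illustration of the separate-cache approach, here is a
minimal sketch. The class names Node and Leaf, the attribute x, and the way the
seed is mixed in are all assumptions made for the example, not anything from
the original discussion. It shows that a failed call to the subcomponent's
partialhash() leaves the outer object in a consistent state, so a later call
can still succeed:

```python
class Leaf:
    """Hypothetical subcomponent; its partial hash fails until value is set."""

    def __init__(self, value=None):
        self.value = value

    def partialhash(self, init_hash=0):
        if self.value is None:
            raise RuntimeError("necessary attribute not set")
        return hash((init_hash, self.value))


class Node:
    """Hypothetical container with separate caches for full and partial hash."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._fullhash = None     # cache used only by __hash__
        self._partialhash = None  # cache used only by partialhash()

    def partialhash(self, init_hash=0):
        # Cache only the part that depends on this object alone, then mix in
        # the caller's seed, so repeated calls always agree.
        if self._partialhash is None:
            self._partialhash = hash(self.x)
        return hash((init_hash, self._partialhash))

    def __hash__(self):
        if self._fullhash is None:
            selfpart = self.partialhash()
            # If this raises, _fullhash stays None and _partialhash is still
            # a valid cached value, so nothing needs to be rolled back.
            self._fullhash = self.y.partialhash(selfpart)
        return self._fullhash
```

With leaf = Leaf() and node = Node("a", leaf), hash(node) raises RuntimeError
and leaves node._fullhash as None; once leaf.value is set, hash(node) succeeds
and returns the same value on every subsequent call.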
For case 2, the problem is the idea of using the hash of the entire CD as the
__hash__ of the object that represents that CD in memory, and then making the
retrieval of that data a side effect of attempting to hash the object.
Touching an IO device or the network to compute the hash of an in-memory data
structure sounds like an incredibly bad idea. If that information is an
important enough part of the object's identity to be included in its hash, it
needs to be retrieved before the object can be considered fully created, and
it should NOT be done as a side effect of trying to hash the object. Instead,
if the attribute is not set, the hash operation should simply fail with
something like RuntimeError("necessary attribute not set"). Or you can be
stricter, and make the attribute mandatory at object creation time.
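
A minimal sketch of that suggestion follows. The class name CD, the attribute
content_digest, the method load_digest() and its read_disc argument are all
hypothetical, and SHA-256 is just one arbitrary choice of digest. The point is
only that the expensive retrieval is an explicit step, while __hash__ itself
stays cheap and fails if the data has not been supplied:

```python
import hashlib


class CD:
    """Hypothetical in-memory representation of a CD."""

    def __init__(self, title, content_digest=None):
        self.title = title
        # Supplied at creation time or via load_digest(); never fetched
        # as a side effect of hashing.
        self.content_digest = content_digest

    def load_digest(self, read_disc):
        # Explicit I/O step. read_disc is any callable returning bytes
        # (an assumption for this sketch).
        self.content_digest = hashlib.sha256(read_disc()).hexdigest()

    def __eq__(self, other):
        if not isinstance(other, CD):
            return NotImplemented
        return (self.title, self.content_digest) == (
            other.title,
            other.content_digest,
        )

    def __hash__(self):
        # Cheap: only looks at attributes already in memory.
        if self.content_digest is None:
            raise RuntimeError("necessary attribute not set")
        return hash((self.title, self.content_digest))
```

The stricter variant would drop the default for content_digest and reject None
in __init__, so that an unhashable CD object can never exist.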
All of which is a long-winded way of saying "calculation of an object hash
should be both cheap and idempotent" :)
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org