Question about __hash__

Felix Thibault felixt at
Thu May 4 21:30:40 EDT 2000

I have been using classes like these to simplify searching through lists:

class BeginsWith(Pattern):
    def __init__(self, *beginning):
        self.indices = range(len(beginning))
        self.beginning = beginning

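    def __cmp__(self, other):
        # Compare item by item against the stored beginning.
        try:
            for index in self.indices:
                if other[index] != self.beginning[index]:
                    return cmp(other[index], self.beginning[index])
            return 0
        except IndexError:
            # other is shorter than the beginning, so it cannot match.
            return 1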

    __rcmp__ = __cmp__

class Any(Pattern):
    def __cmp__(self, other):
        return 0
    __rcmp__ = __cmp__
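
For illustration, a minimal sketch of the kind of list search these classes allow. It is written for Python 3, where __cmp__ and cmp no longer exist, so it assumes __eq__ in their place; the stand-in Pattern base class and the sample tags list are hypothetical and not taken from the code above.

```python
class Pattern:
    """Hypothetical stand-in for the Pattern base class, which is not shown."""
    # Note: defining __eq__ without __hash__ leaves instances unhashable
    # in Python 3, which is fine for a search-only sketch.


class BeginsWith(Pattern):
    def __init__(self, *beginning):
        self.beginning = beginning

    def __eq__(self, other):
        # Equal when `other` starts with the stored items.
        try:
            return all(other[i] == item
                       for i, item in enumerate(self.beginning))
        except (IndexError, TypeError):
            # Too short, or not a sequence at all: no match.
            return False


class Any(Pattern):
    def __eq__(self, other):
        # Matches anything it is compared with.
        return True


# Hypothetical [tagname, where_i_open, where_i_close] entries.
tags = [['table', 0, 40], ['tr', 5, None], ['table', 12, None]]

# Find the first 'table' entry that is still open (close position is None).
open_table = tags.index(['table', Any(), None])

# Find the first entry whose tag name is 'tr'.
first_tr = tags.index(BeginsWith('tr'))
```

Here list.index does the looping, and the pattern objects decide what counts as a match through their equality method.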

At the moment I'm looping through a list of (TextTools parsings of) the
html tags in a document to build another list of [tagname, where_i_open,
where_i_close] lists. This isn't as straightforward as I thought it was:
it looks like I will sometimes need to search the list I'm building for
all the tags with a certain name that are still open, that is, entries
that look like
             ['table', Any(), None]
say. I want to cache these searches in a dictionary so I don't repeat
them over and over on the same lists, so I have to add __hash__ methods
to my Pattern subclasses. Right now I'm calculating the hash value ahead
of time:

      self.hv = hash(self.__class__) + hash(initargs) 

(calculated in the class body with initargs = () for classes like Any and
 in __init__ for classes like BeginsWith)

       def __hash__(self):
           return self.hv
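
A minimal sketch of a precomputed hash used for dictionary keys, written for Python 3 and under stated assumptions: the Pattern base class, the last_match attribute and the cached value are all hypothetical, and the hash is taken from a single tuple rather than by adding two hash values as above. The __eq__ shown only covers pattern-to-pattern equality for key lookup, not the matching behaviour of __cmp__.

```python
class Pattern:
    """Hypothetical base class; the original Pattern is not shown."""

    def __init__(self, *initargs):
        self.initargs = initargs
        # Computed once, from values that never change afterwards.
        # Hashing one tuple lets the interpreter combine the parts.
        self.hv = hash((self.__class__, initargs))
        # Mutable bookkeeping that deliberately does not feed the hash.
        self.last_match = None

    def __hash__(self):
        return self.hv

    def __eq__(self, other):
        # Assumption: patterns are equal when class and arguments agree.
        if not isinstance(other, Pattern):
            return NotImplemented
        return (self.__class__ is other.__class__
                and self.initargs == other.initargs)


class BeginsWith(Pattern):
    pass


class Any(Pattern):
    pass


cache = {}
key = ('table', Any(), None)      # a tuple, since lists cannot be dict keys
cache[key] = [2]                  # hypothetical cached search result

# Changing an attribute that is not part of hv or __eq__ ...
key[1].last_match = ['table', 12, None]

# ... leaves lookups with a freshly built, equal key unaffected.
result = cache[('table', Any(), None)]
```

Because neither hv nor __eq__ looks at last_match, changing it after the key is stored does not disturb the lookup.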

So my questions are:
       1) Is there another way I should be calculating hash values besides
          adding them, so I can be sure I won't get an overflow? Or am I
          being paranoid?

       2) If these classes have (instance) attributes that do change (i.e.,
          another attribute that stores the last successful match), have
          I done enough to be sure that this doesn't affect key matching?

       3) The documentation mentions that if you define __hash__ you need to
          define __cmp__. Does that mean I need to make sure that instances
          compare to each other as equal if they compare to the same objects
          as equal ? (Which wouldn't be the case with BeginsWith like I have
          it now)
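
To make question 3 concrete, a small Python 3 sketch (with __eq__ standing in for __cmp__; the class name Matcher is hypothetical) of what happens when two keys have the same hash and compare equal to the same list, but not to each other.

```python
class Matcher:
    """Hypothetical pattern whose equality means 'other starts with prefix'."""

    def __init__(self, *prefix):
        self.prefix = prefix
        self.hv = hash((Matcher, prefix))

    def __hash__(self):
        return self.hv

    def __eq__(self, other):
        try:
            return all(other[i] == p for i, p in enumerate(self.prefix))
        except (IndexError, TypeError):
            # A Matcher is not subscriptable, so two Matchers never match.
            return False


a = Matcher('table')
b = Matcher('table')
row = ['table', 3, None]

both_match_row = (a == row) and (b == row)   # each matches the same list
match_each_other = (a == b)                  # but they do not match each other

cache = {a: 'cached result'}
found_with_b = b in cache   # same hash, unequal keys
found_with_a = a in cache   # the identical object is still found
```

In this sketch the dictionary finds the entry through a but not through b, since a lookup needs both an equal hash and an equal key.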

