Question about __hash__

Felix Thibault felixt at dicksonstreet.com
Thu May 4 21:30:40 EDT 2000


I have been using classes like these to simplify searching through lists:

class BeginsWith(Pattern):
    def __init__(self, *beginning):
        self.indices = range(len(beginning))
        self.beginning = beginning

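    def __cmp__(self, other):
        # Compare only the leading items of other; 0 means "matches".
        try:
            for index in self.indices:
                if other[index] != self.beginning[index]:
                    return cmp(other[index], self.beginning[index])
            else:
                return 0
        except IndexError:
            # other is shorter than the pattern, so it cannot match.
            return 1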

    __rcmp__ = __cmp__


class Any(Pattern):
    def __cmp__(self, other):
        return 0
    __rcmp__ = __cmp__
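
For readers on a current interpreter, here is a rough sketch of how such patterns can drive list searches. It assumes Python 3, where __cmp__ and __rcmp__ no longer exist, so __eq__ stands in for them. The empty Pattern base class and the sample tags list are hypothetical placeholders, and catching TypeError is an addition that the classes above do not make.

```python
# Python 3 sketch: __eq__ replaces the __cmp__/__rcmp__ pair used above.
class Pattern:
    """Hypothetical stand-in for the base class, which the post does not show."""


class BeginsWith(Pattern):
    def __init__(self, *beginning):
        self.beginning = beginning

    def __eq__(self, other):
        # Match when the leading items of other equal the stored beginning.
        try:
            return all(other[i] == item for i, item in enumerate(self.beginning))
        except (IndexError, TypeError):
            # other is too short, or is not a sequence at all (assumed handling)
            return False


class Any(Pattern):
    def __eq__(self, other):
        # Matches everything.
        return True


# Made-up sample data in the [tagname, where_i_open, where_i_close] layout.
tags = [['p', 0, 3], ['table', 5, None], ['td', 9, 12], ['table', 14, 20]]

# list.index compares element by element, so Any() acts as a wildcard.
first_open_table = tags.index(['table', Any(), None])

# A pattern can also be compared against each entry directly.
tables = [tag for tag in tags if tag == BeginsWith('table')]
```

Note that in Python 3 a class that defines __eq__ without __hash__ becomes unhashable, so these sketch classes could not yet be used as dictionary keys.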

Right now I'm looping through a list of (TextTools parsings of) the
HTML tags in a document to build another list of [tagname, where_i_open,
where_i_close] lists. This isn't as straightforward as I thought it would
be. It looks like I will sometimes need to search the list I'm building
for all the tags with a certain name that are still open, that is, entries
that look like
             ['table', Any(), None]
say. I want to cache these searches in a dictionary so I don't repeat
them over and over on the same lists, so I have to add __hash__ methods
to my Pattern subclasses. At the moment I'm calculating the hash value
ahead of time:

      self.hv = hash(self.__class__) + hash(initargs) 

(calculated in the class body with initargs = () for classes like Any and
 in __init__ for classes like BeginsWith)

with:
       def __hash__(self):
           return self.hv
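
One possible variation on the precomputed value, sketched under the assumption of a current Python 3 interpreter: hash a single tuple of the class and the constructor arguments instead of adding two hashes. The Pattern base class and the last_match attribute below are hypothetical names used only for illustration.

```python
class Pattern:
    """Hypothetical base class; the real Pattern is not shown in the post."""

    def __init__(self, *initargs):
        # Hash one tuple of (class, constructor arguments) rather than
        # summing two separate hash values.
        self.hv = hash((self.__class__, initargs))
        # Mutable state that is deliberately left out of the hash.
        self.last_match = None

    def __hash__(self):
        return self.hv


class BeginsWith(Pattern):
    pass


class Any(Pattern):
    pass


p = BeginsWith('table')
before = hash(p)
p.last_match = ['table', 5, None]   # changing this attribute...
after = hash(p)                     # ...leaves the hash untouched
```

Because hv is computed once from the class and the arguments, later changes to other attributes cannot alter it. Two instances built with the same arguments get the same hv, but they will only find each other in a dictionary if they also compare equal.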

So my questions are:
       1) Is there another way I should be calculating hash values besides
          adding them, so I can be sure I won't get an overflow? Or am I
          being paranoid?

       2) If these classes have (instance) attributes that do change (i.e.,
          another attribute that stores the last successful match), is what
          I've done enough to be sure that this doesn't affect key matching?

       3) The documentation mentions that if you define __hash__ you need to
          define __cmp__. Does that mean I need to make sure that instances
          compare to each other as equal if they compare to the same objects
          as equal? (Which wouldn't be the case with BeginsWith as I have
          it now.)
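
To make question 3 concrete, here is a small sketch, again assuming Python 3 with __eq__ in place of __cmp__. A dictionary lookup first compares hash values and then tests equality, so a cached search is only found again if the second pattern instance both hashes the same and compares equal to the first. The isinstance branch is an assumption about how pattern-to-pattern comparison could be added; it is not part of the classes in this post.

```python
class BeginsWith:
    def __init__(self, *beginning):
        self.beginning = beginning
        self.hv = hash((self.__class__, beginning))

    def __eq__(self, other):
        if isinstance(other, BeginsWith):
            # Assumed addition: two patterns are equal when they were
            # built from the same arguments.
            return self.beginning == other.beginning
        try:
            # Pattern versus data: match on the leading items only.
            return all(other[i] == item for i, item in enumerate(self.beginning))
        except (IndexError, TypeError):
            return False

    def __hash__(self):
        return self.hv


# Cache a search result under one pattern instance...
cache = {BeginsWith('table'): [1, 3]}

# ...and retrieve it with a different but equivalent instance.
hit = cache.get(BeginsWith('table'))

# The pattern still matches ordinary data with ==.
pattern = BeginsWith('table')
matches_data = (pattern == ('table', 5, None))
```

Without the isinstance branch, comparing two BeginsWith instances would fall through to indexing the other pattern and report "not equal", so the lookup above would miss even though the hashes agree. The reverse caveat also holds: the pattern compares equal to ('table', 5, None) but does not share its hash, so using a pattern to look up plain data keys in a dictionary cannot be relied on.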

Thanks!
   Felix





More information about the Python-list mailing list