get word base

John Hunter jdhunter at nitace.bsd.uchicago.edu
Sat Jun 29 10:48:51 EDT 2002


Thanks Daniel and Bengt; excellent suggestions both.

I have installed both packages and put them into a head-to-head
competition.  On a VERY LIMITED sample, it appears to me that wn does
better than stemmer.  The only error (i.e., not what I expected) in wn
was with 'walking', which stemmer got right.  So I merged them together
for the mother of all morphological root finders.

import Stemmer, wntools
st = Stemmer.Stemmer('english')

def wnroot(w):
    return wntools.morphy(w, "noun") or \
           wntools.morphy(w, "verb") or \
           wntools.morphy(w, "adjective") or \
           wntools.morphy(w, "adverb")

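def stemmer_or_wn(word):
    root = wnroot(word)
    # wnroot gives None when wn does not know the word, so only trust
    # a root that is non-empty and actually differs from the input
    if root and root != word:
        return root
    return st.stem(word)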

words = ['sent', 'walking', 'thoughts', 'rakes', 'eaten', 'tried']

print 'words   : ', words
print 'stemmer : ', st.stem( words )
print 'wntools : ', [wnroot(w) for w in words]
print 'combo   : ', [stemmer_or_wn(w) for w in words]


~/python/examples $ python wordnet_demo.py
words   :  ['sent', 'walking', 'thoughts', 'rakes', 'eaten', 'tried']
stemmer :  ['sent', 'walk', 'thought', 'rake', 'eaten', 'tri']
wntools :  ['send', 'walking', 'thought', 'rake', 'eat', 'try']
combo   :  ['send', 'walk', 'thought', 'rake', 'eat', 'try']
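
For anyone who wants to play with the merge logic without installing
either package, here is a minimal sketch of the same "first answer that
changes the word wins" idea.  Everything in it is a hypothetical
stand-in: IRREGULAR, lookup_root, strip_suffix and first_change are
names made up for illustration and are not part of Stemmer or wntools,
and the toy suffix stripper is far cruder than a real stemmer.

```python
# Hypothetical stand-ins: a tiny exception table plays the role of the
# dictionary lookup, and a crude suffix stripper plays the role of the
# stemmer.  Neither is the real Stemmer or wntools API.
IRREGULAR = {'sent': 'send', 'eaten': 'eat', 'tried': 'try'}

def lookup_root(word):
    """Return the root from the exception table, or None if unknown."""
    return IRREGULAR.get(word)

def strip_suffix(word):
    """Very rough stemmer: drop a trailing 'ing' or 's' from longer words."""
    for suffix in ('ing', 's'):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

def first_change(word, finders):
    """Try each finder in order; return the first non-empty result that
    differs from the input, or the word itself if none changes it."""
    for find in finders:
        root = find(word)
        if root and root != word:
            return root
    return word

words = ['sent', 'walking', 'thoughts', 'rakes', 'eaten', 'tried']
roots = [first_change(w, [lookup_root, strip_suffix]) for w in words]
print(roots)
```

Putting the dictionary-style lookup first mirrors the ordering in the
combo above: the lookup handles irregular forms, and the suffix
stripper only gets a turn when the lookup has nothing new to say.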

Thanks for the help,
John Hunter


