get word base
John Hunter
jdhunter at nitace.bsd.uchicago.edu
Sat Jun 29 10:48:51 EDT 2002
Thanks Daniel and Bengt; excellent suggestions both.
I have installed both packages and put them into a head-to-head
competition. It appears to me, on a VERY LIMITED sample, that wn does
better than stemmer. The only error (i.e., not what I expected) in wn
was with 'walking', which stemmer got right. So I merged them together
for the mother of all morphological root finders.

import Stemmer, wntools

st = Stemmer.Stemmer('english')

def wnroot(w):
    # Try each part of speech in turn; the first non-empty result wins.
    return wntools.morphy(w, "noun") or \
           wntools.morphy(w, "verb") or \
           wntools.morphy(w, "adjective") or \
           wntools.morphy(w, "adverb")

def stemmer_or_wn(word):
    # Prefer the WordNet root when one was found and it differs from the
    # input; otherwise fall back on the stemmer.
    root = wnroot(word)
    if root and root != word:
        return root
    return st.stem(word)

words = ['sent', 'walking', 'thoughts', 'rakes', 'eaten', 'tried']
print 'words : ', words
print 'stemmer : ', st.stem(words)
print 'wntools : ', [wnroot(w) for w in words]
print 'combo : ', [stemmer_or_wn(w) for w in words]

~/python/examples $ python wordnet_demo.py
words : ['sent', 'walking', 'thoughts', 'rakes', 'eaten', 'tried']
stemmer : ['sent', 'walk', 'thought', 'rake', 'eaten', 'tri']
wntools : ['send', 'walking', 'thought', 'rake', 'eat', 'try']
combo : ['send', 'walk', 'thought', 'rake', 'eat', 'try']
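
The head-to-head result above can be tallied with a short script. This is only a sketch: the three output lists are copied from the run above, while the `expected` list is an assumption, inferred from the remark that 'walking' was the only miss for wn. The helper `misses` is hypothetical and not part of either package. Unlike the script above, it is written for a current Python 3 interpreter and needs neither Stemmer nor wntools.

```python
# Outputs copied verbatim from the run above.
words = ['sent', 'walking', 'thoughts', 'rakes', 'eaten', 'tried']
results = {
    'stemmer': ['sent', 'walk', 'thought', 'rake', 'eaten', 'tri'],
    'wntools': ['send', 'walking', 'thought', 'rake', 'eat', 'try'],
    'combo':   ['send', 'walk', 'thought', 'rake', 'eat', 'try'],
}

# ASSUMPTION: the roots that were hoped for; not produced by either package.
expected = ['send', 'walk', 'thought', 'rake', 'eat', 'try']

def misses(got, want, inputs):
    """Return (input word, returned root) pairs where got differs from want."""
    return [(w, g) for w, g, e in zip(inputs, got, want) if g != e]

miss_table = {name: misses(got, expected, words)
              for name, got in results.items()}

for name in ('stemmer', 'wntools', 'combo'):
    print('%-8s: %d miss(es) %s' % (name, len(miss_table[name]), miss_table[name]))
```

On this six-word sample the tally is three misses for stemmer, one for wntools, and none for the combination, which is far too small a sample to generalize from.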
Thanks for the help,
John Hunter