SpellChecker
Peter Otten
__peter__ at web.de
Wed May 20 05:36:25 EDT 2009
abosalim wrote:
> I used this code. It works fine, but only on a single word, not on a
> whole text. I want to extend this code to correct a text file, not
> only a word, but I don't know how. If you can help, please let me
> know.
import re
import sys

def correct(word, _lookup={"teh": "the"}):
    """
    Replace with Norvig's implementation found at
    http://norvig.com/spell-correct.html
    """
    return _lookup.get(word.lower(), word)

def correct_word(word):
    corrected = correct(word)
    if corrected != word:
        if word.istitle():
            corrected = corrected.title()
        if word.isupper():
            corrected = corrected.upper()
        print >> sys.stderr, "correcting", word, "-->", corrected
    return corrected

def sub_word(match):
    return correct_word(match.group())

def correct_text(text):
    return re.compile("[a-z]+", re.I).sub(sub_word, text)

if __name__ == "__main__":
    text = "Teh faster teh better TEH BIGGER"
    print "original:", text
    print "corrected:", correct_text(text)
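
For the text-file case asked about above, one possible extension is to run
the same substitution over a file line by line. The following is only a
sketch under assumptions: it uses Python 3 syntax, the name correct_file and
its parameters are hypothetical (they are not part of the code above), and
the one-entry lookup table is still a placeholder for a real corrector such
as Norvig's.

```python
import re

# Placeholder table standing in for a real spelling corrector (assumption).
_LOOKUP = {"teh": "the"}
_WORD = re.compile(r"[a-z]+", re.I)

def correct(word):
    # Return the known correction, or the word unchanged if there is none.
    return _LOOKUP.get(word.lower(), word)

def correct_word(word):
    corrected = correct(word)
    if corrected != word:
        # Carry the capitalisation of the original word over to the fix.
        if word.istitle():
            corrected = corrected.title()
        if word.isupper():
            corrected = corrected.upper()
    return corrected

def correct_text(text):
    # Substitute every run of ASCII letters with its corrected form.
    return _WORD.sub(lambda match: correct_word(match.group()), text)

def correct_file(src_path, dst_path, encoding="utf-8"):
    # Hypothetical helper: read src_path line by line and write the
    # corrected text to dst_path, so the whole file is never held in memory.
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            dst.write(correct_text(line))
```

Calling correct_file("input.txt", "output.txt") (both file names are
examples) would leave the input untouched and write the corrected copy next
to it. Working line by line is safe here because the pattern never matches
across a newline.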
Peter
PS: Don't you get bored if you have all your code written for you?