SpellChecker

Peter Otten __peter__ at web.de
Wed May 20 11:36:25 CEST 2009


abosalim wrote:

> I used this code. It works fine, but only on a single word, not on a
> whole text. I want to extend this code to correct a text file, not
> only a word, but I don't know how. If you can help, please let me
> know.

import re
import sys

def correct(word, _lookup={"teh": "the"}):
    """
    Replace with Norvig's implementation found at

    http://norvig.com/spell-correct.html
    """
    # Placeholder: look up the lowercased word in a tiny table and
    # return the word unchanged when there is no entry for it.
    return _lookup.get(word.lower(), word)

def correct_word(word):
    corrected = correct(word)
    if corrected != word:
        # Carry the capitalization of the original over to the correction.
        if word.istitle():
            corrected = corrected.title()
        if word.isupper():
            corrected = corrected.upper()
        print("correcting", word, "-->", corrected, file=sys.stderr)
    return corrected

def sub_word(match):
    # Callback for sub(): receives a match object, returns the replacement.
    return correct_word(match.group())

def correct_text(text):
    # A "word" is any run of ASCII letters; every one goes through sub_word().
    return re.compile("[a-z]+", re.I).sub(sub_word, text)

if __name__ == "__main__":
    text = "Teh faster teh better TEH BIGGER"
    print("original:", text)
    print("corrected:", correct_text(text))
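
Since the question was about a text file: correct_text() takes any string,
so one way to handle a file is to read it in one go, correct it, and write
the result somewhere else. The following is only a rough sketch for
Python 3. The name correct_file() and its parameters are made up for
illustration, UTF-8 is an assumed encoding, and the small correct_text()
here is a simplified stand-in (it knows only "teh" and does not preserve
capitalization) so that the snippet runs on its own.

```python
import re

def correct_text(text):
    # Simplified stand-in for the correct_text() above: it only knows
    # "teh" and always substitutes a lowercase "the".
    def fix(match):
        word = match.group()
        return "the" if word.lower() == "teh" else word
    return re.sub("[a-z]+", fix, text, flags=re.I)

def correct_file(in_path, out_path, encoding="utf-8"):
    # Hypothetical helper: read the whole file, correct it, and write
    # the corrected text to a second file. The encoding is an assumption.
    with open(in_path, encoding=encoding) as infile:
        text = infile.read()
    with open(out_path, "w", encoding=encoding) as outfile:
        outfile.write(correct_text(text))
```

You would call it as correct_file("draft.txt", "draft.corrected.txt"),
with whatever file names you actually have. Reading everything at once
should be fine for ordinary documents; for very large files it may be
better to loop over the lines and correct them one by one.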


Peter

PS: Don't you get bored if you have all your code written for you?




