String comparison

S.Selvam Siva s.selvamsiva at
Mon Jan 26 07:30:05 CET 2009

Thank You Gabriel,

On Sun, Jan 25, 2009 at 7:12 AM, Gabriel Genellina
<gagsl-py2 at>wrote:

> On Sat, 24 Jan 2009 15:08:08 -0200, S.Selvam Siva <s.selvamsiva at>
> wrote:
>> I am developing a spell checker for my local language (Tamil) using Python.
>> I need to generate a list of alternative words for a misspelled word from
>> the dictionary of words. The alternatives must be as close as possible to
>> the misspelled word. As we know, ordinary string comparison won't work here.
>> Any suggestion for this problem is welcome.
> I think it would be better to add Tamil support to some existing library like
> GNU aspell:

That was my plan earlier, but I am not sure how aspell integrates with other
editors. I had better ask on the aspell mailing list.

> You are looking for "fuzzy matching":
> In particular, the Levenshtein distance is widely used; I think there is a
> Python extension providing those calculations.
> --
> Gabriel Genellina
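
As a quick illustration of fuzzy matching without any extension, the standard
library's difflib module can pick close matches out of a word list. Note that
difflib scores similarity by matching subsequences, which is not the
Levenshtein distance, and the word list below is a made-up sample, not a real
dictionary. A minimal sketch:

```python
import difflib

# Made-up sample word list; a real checker would load the Tamil dictionary.
dictionary = ["ape", "apple", "peach", "puppy"]

# n caps the number of suggestions; cutoff (0..1) drops weak matches.
suggestions = difflib.get_close_matches("appel", dictionary, n=3, cutoff=0.6)
print(suggestions)
```

Raising cutoff gives fewer but closer suggestions; whether this ranking suits
Tamil words would need to be checked against real data.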

The following code served my purpose (thanks to some unknown contributors):
def distance(a,b):
    # c[i,j] holds the edit distance between a[:i] and b[:j]
    c = {}
    n = len(a); m = len(b)

    for i in range(0,n+1):
        c[i,0] = i
    for j in range(0,m+1):
        c[0,j] = j

    for i in range(1,n+1):
        for j in range(1,m+1):
            x = c[i-1,j]+1      # deletion
            y = c[i,j-1]+1      # insertion
            if a[i-1] == b[j-1]:
                z = c[i-1,j-1]      # characters match, no cost
            else:
                z = c[i-1,j-1]+1    # substitution
            c[i,j] = min(x,y,z)
    return c[n,m]

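d = distance(a, b)  # a and b are the two words being compared
print("d = %d" % d)
longer = float(max(len(a), len(b)))
shorter = float(min(len(a), len(b)))
r = ((longer - d) / longer) * (shorter / longer)
# r ranges between 0 and 1 (assumes at least one word is non-empty)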
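
The score above can be used to build the list of alternatives by ranking every
dictionary word against the misspelled one. The sketch below is only one way
to do it: the names score and suggest and the sample word list are
hypothetical, and scanning the whole dictionary like this may be too slow for
a large word list.

```python
def distance(a, b):
    # Same dynamic-programming recurrence as the function above.
    n, m = len(a), len(b)
    c = {}
    for i in range(n + 1):
        c[i, 0] = i
    for j in range(m + 1):
        c[0, j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            c[i, j] = min(c[i - 1, j] + 1,         # deletion
                          c[i, j - 1] + 1,         # insertion
                          c[i - 1, j - 1] + cost)  # match or substitution
    return c[n, m]


def score(a, b):
    # Similarity in [0, 1], using the formula from the post.
    longer = float(max(len(a), len(b)))
    shorter = float(min(len(a), len(b)))
    if longer == 0:
        return 1.0  # assumption: two empty strings count as identical
    d = distance(a, b)
    return ((longer - d) / longer) * (shorter / longer)


def suggest(word, dictionary, limit=5):
    # Hypothetical helper: rank dictionary words by score, best first.
    ranked = sorted(dictionary, key=lambda w: score(word, w), reverse=True)
    return ranked[:limit]


words = ["apple", "ape", "peach", "puppy"]  # made-up sample dictionary
print(suggest("appel", words, limit=2))
```

For Tamil text the words should be Unicode strings, so that the comparison
works on code points rather than encoded bytes.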
