for x,y in word1, word2 ?

Timothy Grant timothy.grant at gmail.com
Mon Aug 11 21:17:56 CEST 2008


On Mon, Aug 11, 2008 at 12:13 PM, Dave Webster <caseyweb at gmail.com> wrote:
> Thanks, Timothy.  I'm pretty sure that there is no such thing as a "beautiful"
> implementation of double-metaphone but I would personally like to have a copy
> of your python implementation.  I have a fairly elegant version of the original
> metaphone algorithm I wrote myself (in PERL, many years ago) but I've never
> found the time to reverse-engineer the original C++ code for double-metaphone and
> "pythonize" it.
>
> On Mon, Aug 11, 2008 at 2:08 PM, Timothy Grant <timothy.grant at gmail.com> wrote:
>> On Mon, Aug 11, 2008 at 8:44 AM, Casey <Caseyweb at gmail.com> wrote:
>>> My first thought is that you should be looking at implementations of
>>> Hamming Distance.  If you are actually looking for something like
>>> SOUNDEX you might also want to look at the double metaphone algorithm,
>>> which is significantly harder to implement but provides better
>>> matching and is less susceptible to differences based on name origins.
>>>
>>
>> I responded in the thread of the poster's original message on this
>> subject, but will do the same here. I have a horribly ugly version of
>> the double-metaphone algorithm in python that does work, and may be of
>> some use in solving this problem.
>>
>> --
>> Stand Fast,
>> tjg. [Timothy Grant]
>>
>
This is truly cringe-worthy, and pretty much a direct port of the C++
code. It needs unit tests (which are on my "to-do someday" list) but
even though it's ugly it does work and I have managed to do real work
with it.
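
For the simpler Hamming Distance suggestion quoted above, here is a
minimal sketch. It is not taken from the attached DMetaph.py, and the
function name hamming_distance is just an illustrative choice. It
assumes both words have the same length and uses zip() to walk the two
words in step, which is roughly what the subject line is asking for:

```python
def hamming_distance(word1, word2):
    """Count the positions at which two equal-length strings differ."""
    if len(word1) != len(word2):
        # Hamming distance is only defined for sequences of equal length.
        raise ValueError("Hamming distance needs strings of equal length")
    # zip() pairs the characters position by position:
    # (word1[0], word2[0]), (word1[1], word2[1]), ...
    return sum(1 for x, y in zip(word1, word2) if x != y)
```

Note that "for x, y in word1, word2" on its own would not do this; it
iterates over the tuple (word1, word2) and tries to unpack each whole
word into x and y.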


-- 
Stand Fast,
tjg. [Timothy Grant]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DMetaph.py
Type: text/x-python
Size: 24353 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-list/attachments/20080811/679d65c6/attachment.py>

