sandj.williams at gte.net
Thu Dec 21 00:57:41 CET 2000
Skip Montanaro wrote:
> Python 2.0 no longer ships with a soundex module. Sometime ago, Tim Peters
> and Fred Drake each cooked up replacements written in Python. I merged them
> together into a single module which is available from
> If you have any questions or comments on the module, please send them my
Soundex routines traditionally return a fixed number of characters--the NDIGITS
in your routine. That's 'cause the system was developed before computers.
I've found you can get good results with a variable-length soundex string--the more
information you give the routine (first names, middle names, prefixes and
suffixes) the better/smaller the result set.
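
For anyone who wants to play with the idea, here is a rough sketch of a
variable-length key. It is not the module discussed in this thread: the names
variable_soundex and fixed_soundex are made up for illustration, it is written
for Python 3, and the handling of h and w (skipped without separating
duplicate codes) is an assumption borrowed from the usual Soundex rules.

```python
# Sketch only: NOT the module discussed in this thread. variable_soundex and
# fixed_soundex are hypothetical names; the h/w handling is an assumption.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in group:
        CODES[letter] = digit


def variable_soundex(text):
    """Return the first letter plus one digit per coded consonant, unpadded."""
    # Keep ASCII letters only, so spaces, commas and hyphens are ignored.
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    key = [letters[0].upper()]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # skipped; does not separate two equal codes
        digit = CODES.get(ch, "")  # vowels and y have no code
        if digit and digit != prev:
            key.append(digit)
        prev = digit  # a vowel resets prev, so n-a-n yields 55
    return "".join(key)


def fixed_soundex(text, ndigits=3):
    """Traditional form: first letter plus exactly ndigits digits, zero padded."""
    key = variable_soundex(text)
    if not key:
        return ""
    return (key + "0" * ndigits)[:ndigits + 1]


print(fixed_soundex("Mao"))              # M000
print(variable_soundex("Mao Tse-Tung"))  # M32352
```

The only difference between the two functions is the final truncate-and-pad
step, which is exactly the information the fixed-length form throws away.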
Store the full soundex key as a varchar in your database and use the SQL LIKE
operator to do the retrieval.
This is particularly useful with one syllable surnames--you really need to add
more to the name to get anything useful. (Mao Tse-Tung == M000 vs. M32352, you
be the judge).
print get_soundex('van der tamp')
print get_soundex('van der tamp, albert')
print get_soundex('van der tamp, albert c, lieutenant colonel, Phd')
V5363514163243553245413 <== this is the full key stored as a varchar in your
So a retrieval like V53635141632435532454% will return all the lieutenant
colonel albert c van der tamps in your database, whether they have PhDs or not.
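
A small sketch of that retrieval, using an in-memory SQLite database from the
Python 3 standard library. The table name people, its columns, and the helpers
build_demo_db and find_by_prefix are all invented for the example. The two
keys not quoted in this message (the rows without "Phd" and with the bare
surname) were worked out by hand under the usual Soundex letter codes, so
treat them as assumptions.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory table holding names and their full keys."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, soundex_key VARCHAR(64))")
    rows = [
        # Key quoted in the message above.
        ("van der tamp, albert c, lieutenant colonel, Phd",
         "V5363514163243553245413"),
        # Hand-derived keys (assumptions, not taken from the module's output).
        ("van der tamp, albert c, lieutenant colonel",
         "V53635141632435532454"),
        ("van der tamp", "V536351"),
        # Key quoted in the message above.
        ("Mao Tse-Tung", "M32352"),
    ]
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    return conn


def find_by_prefix(conn, prefix):
    """Return names whose stored key starts with prefix, via LIKE 'prefix%'."""
    cur = conn.execute(
        "SELECT name FROM people WHERE soundex_key LIKE ? "
        "ORDER BY soundex_key",
        (prefix + "%",),
    )
    return [row[0] for row in cur]


conn = build_demo_db()
for name in find_by_prefix(conn, "V53635141632435532454"):
    print(name)
```

A shorter prefix such as V536351 widens the search to every van der tamp in
the table, so the caller decides how much of the key to match on.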