Steve Williams sandj.williams at
Thu Dec 21 00:57:41 CET 2000

Skip Montanaro wrote:

> Python 2.0 no longer ships with a soundex module.  Sometime ago, Tim Peters
> and Fred Drake each cooked up replacements written in Python.  I merged them
> together into a single module which is available from
> If you have any questions or comments on the module, please send them my
> way.

Soundex routines traditionally return a fixed number of characters--the NDIGITS
in your routine.  That's 'cause the system was developed before computers.

I've found you can get good results with a variable length soundex string--the more
information you give the routine (first names, middle names, prefixes and
suffixes) the better/smaller the result set.

Store the full soundex key as a varchar in your database and use the SQL LIKE
operator to do the retrieval.

This is particularly useful with one syllable surnames--you really need to add
more to the name to get anything useful.  (Mao Tse-Tung == M000 vs. M32352, you
be the judge).
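
Here is a minimal sketch of a variable-length soundex, in case it helps to see
the idea spelled out.  It is not the get_soundex from Skip's module; the name
variable_soundex and the handling of spaces and punctuation (skipped, and
treated as separators like vowels) are my assumptions.  It uses the usual
letter-to-digit table and simply never pads or truncates the key.

```python
# Usual Soundex letter groups; vowels, h, w, y and non-letters get no code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def variable_soundex(name):
    """Return the first letter plus one digit per coded consonant, unpadded."""
    text = name.lower()
    # Skip ahead to the first alphabetic character; it is kept as a letter.
    start = 0
    while start < len(text) and not text[start].isalpha():
        start += 1
    if start == len(text):
        return ""
    key = [text[start].upper()]
    prev = CODES.get(text[start], "")
    for ch in text[start + 1:]:
        if ch in "hw":
            continue  # h and w do not separate two equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            key.append(code)
        # Vowels and punctuation reset prev (assumption), so a repeated
        # consonant after them is coded again.
        prev = code
    return "".join(key)


print(variable_soundex("Mao Tse-Tung"))        # M32352
print((variable_soundex("Mao") + "000")[:4])   # M000, the traditional padded form
```

The traditional fixed-length key is just this key padded with zeros and cut to
four characters, which is why everything after the surname is lost.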

For example,
    print get_soundex('van')
    print get_soundex('van der tamp')
    print get_soundex('van der tamp, albert')
    print get_soundex('van der tamp, albert c, lieutenant colonel, Phd')
V5363514163243553245413   <== this is the full key stored as a varchar in your database

So a retrieval like V53635141632435532454% will return all the lieutenant
colonel albert c van der tamps in your database, whether they have PhDs or not.
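
The retrieval can be tried out with a rough sketch like the following, using
Python's sqlite3 module and an in-memory database.  The table name, column
names and the find_by_prefix helper are made up for illustration.  The first
key is the one shown above; the other two are my own, worked out by hand under
the same rules, so treat them as sample data only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the full variable-length key is stored as a varchar.
conn.execute("CREATE TABLE people (name TEXT, soundex_key VARCHAR(64))")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [
        ("van der tamp, albert c, lieutenant colonel, Phd",
         "V5363514163243553245413"),
        # Illustrative rows; keys computed by hand, not taken from the post.
        ("van der tamp, albert c, lieutenant colonel",
         "V53635141632435532454"),
        ("van der tamp, albert", "V5363514163"),
    ],
)


def find_by_prefix(conn, prefix):
    """Return names whose stored key starts with the given prefix."""
    cur = conn.execute(
        "SELECT name FROM people WHERE soundex_key LIKE ? "
        "ORDER BY soundex_key",
        (prefix + "%",),  # keys hold one letter and digits, so no escaping
    )
    return [row[0] for row in cur]


matches = find_by_prefix(conn, "V53635141632435532454")
for name in matches:
    print(name)
```

Both lieutenant colonels come back, with or without the PhD, while the shorter
"van der tamp, albert" row does not, because its key is shorter than the
prefix being searched for.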
