[Doc-SIG] soundex module status?

Skip Montanaro skip@mojam.com
Mon, 28 Jun 1999 11:32:11 -0400 (EDT)


    Fred> But it's just too obscure to be considered "standard".  It doesn't
    Fred> appear to exactly match any description I've seen of the algorithm
    Fred> (it produces more result than it should!).

It does indeed produce longer results than other implementations.  Barring
any bugs, it looks like it could be brought into "spec" simply by lopping
off the last two characters of its output.  I compared its output with the
examples in the Perl soundex module documentation at

    http://language.perl.com/newdocs/lib/Text/Soundex.html

Apart from the two extra characters, the only difference I found was in the
codes for Lukasiewicz and Lissajous.  The Perl version yields L222 for both,
while the soundex module yields L200.  I think the Perl version has a
bug, because duplicate digits should be avoided, though I haven't got
Knuth's algorithm to refer to.  NIST's C implementation at

    http://physics.nist.gov/cuu/Reference/soundex.html

does avoid duplicate digits.
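
The L222 versus L200 split can be reproduced with two readings of the
"duplicate digits" rule.  The following is only a rough Python sketch: the
function names are hypothetical, and neither function is taken from the
soundex module, from Text::Soundex, or from the NIST code.  The first reading
collapses only adjacent letters that share a code (vowels separate repeats,
H and W do not), which is how the rule is commonly described.  The second
reading never emits a digit a second time, which is an assumption about what
could produce L200.

```python
# Letter-to-digit table used by most published Soundex descriptions.
CODES = {}
for digit, letters in enumerate(
        ["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex_adjacent(name, length=4):
    """Reading 1: drop a digit only when it repeats the previous code.

    Vowels reset the previous code, so a repeat after a vowel is kept;
    H and W are skipped without resetting it.
    """
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    result = name[0]
    prev = CODES.get(name[0], "")
    for ch in name[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":
            prev = code
    return (result + "0" * length)[:length]


def soundex_no_repeats(name, length=4):
    """Reading 2 (assumed): never emit a digit that was already used."""
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    result = name[0]
    seen = {CODES.get(name[0], "")}
    for ch in name[1:]:
        code = CODES.get(ch, "")
        if code and code not in seen:
            seen.add(code)
            result += code
    return (result + "0" * length)[:length]


for name in ("Lukasiewicz", "Lissajous"):
    # Prints L222 and L200 for each of the two names.
    print(name, soundex_adjacent(name), soundex_no_repeats(name))
```

Under the first reading both names come out as L222, and under the second
both come out as L200, so the disagreement is consistent with a difference in
how repeats are suppressed rather than in the letter table.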

    Fred> There's no reason to doubt its module-hood; if someone tells
    Fred> Guido they'll adopt it, I'm sure he'll consider it a welcome
    Fred> offering.

I'll be happy to "adopt" the module, especially if that will keep it "in the
fold".  I tried sending mail to the original author, but it bounced (not
really surprising).  Should I do something formal to take it over?  Perhaps
add my name to the code and send it in for update?

Skip