I asked a few of my friends about the SOUNDEX issues that were brought up on this list a while ago, and got the following interesting answers. I deleted all headers and greetings in the interests of brevity, but can provide follow-up information for anyone who is really interested.

Michael

---------------------------- Text of forwarded message -----------------------

... I don't think it is correct to attribute it to Knuth ... I read somewhere that the algorithm was devised by a monk in the 16th century.

---------------------------- Text of forwarded message -----------------------

Although you specifically asked for NON soundex routines, I sent you a copy of a REXX version that I believe is fairly tailorable as to the 'values' of each letter. It seems to me that if you use a different pattern of values for each language (assuming the language is known), then you can 'stress the importance' of different groups of letters. So if you let

alphabet ="ABCDEFGHIJKLMNOPQRSTUVWXYZ ", and for English, let
alphaval ="01230120022455012673010702",

and for French some other pattern (say with the 'M' and 'N' more differentiated), perhaps you can get to where you are going.

---------------------------- Text of forwarded message -----------------------

I used to do some work with soundex in record linkage. Seems to me it predates Knuth. I found the application rules unnecessarily complicated and inconsistent. The methods I used in the end were quite interesting, using actual entropic weights for the identifiers.

The note from whoever was talking about the first 7 voiced consonants made me laugh -- neither Maaori nor Chinese have too many of those :-)

Which 22 languages should be covered? Is the input sound or text? Is _name_ really all you have? What are the costs of false positives vs. false negatives? Is a unique answer required, or would it be adequate to produce the n most likely hits for human selection?
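As a rough illustration of the tailorable letter-value idea in the second forwarded message, here is a minimal Python sketch (not the REXX routine itself, which is not reproduced here). The function names, the 4-character code length, and the "French" variant are all assumptions made for the example; the English value string is the one quoted in the message, with the trailing space in the quoted alphabet dropped so that both strings have 26 characters. Note that the quoted table is not the classic Soundex table (S and X are given 7), and that this simplified coder treats H and W like vowels.

```python
# Hypothetical sketch of a Soundex-style coder with a swappable value table.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# English values as quoted in the forwarded message (assumed to align
# one-to-one with the 26 letters above).
ENGLISH_VALUES = "01230120022455012673010702"


def make_table(alphabet, values):
    """Pair each letter with its value digit."""
    if len(alphabet) != len(values):
        raise ValueError("alphabet and values must have the same length")
    return dict(zip(alphabet, values))


def encode(name, table, length=4):
    """Keep the first letter, then append the value of each later letter,
    skipping '0' values and values equal to the previous letter's value."""
    letters = [c for c in name.upper() if c in table]
    if not letters:
        return ""
    code = letters[0]
    prev = table[letters[0]]
    for c in letters[1:]:
        value = table[c]
        if value != "0" and value != prev:
            code += value
        prev = value
    # Pad with zeros or truncate to a fixed length.
    return (code + "0" * length)[:length]


english = make_table(ALPHABET, ENGLISH_VALUES)

# Invented "French" variant: give N its own value so M and N stop colliding.
french = dict(english)
french["N"] = "8"

print(encode("Robert", english), encode("Rupert", english))
print(encode("Martin", english), encode("Martim", english))
print(encode("Martin", french), encode("Martim", french))
```

With the English table, "Martin" and "Martim" receive the same code; with the invented French table they no longer do, which is the kind of per-language tuning the message suggests.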