
My non-Tamil-speaking friends often tease me about the typical Tamil spellings of non-Tamil words.

Tamil does not distinguish phonologically between voiced and unvoiced consonants;
phonetically, voice is assigned depending on a consonant’s position in a word.
Of the 18 or so official languages of India, Tamil seems to be the only one whose script cannot represent all of these sounds.

Let us take an example.
The name Padma is pronounced with ‘Pad’ as in pud-dle and ‘ma’ as in ma-ll.
Typical Tamil renderings range from padma -> padhma -> badhma -> bathma -> badma.

This is because Tamil has just one letter to represent ‘Pa’, ‘Pha’, ‘Ba’ and ‘Bha’.
So it is not surprising to find the name Brinda transformed into:

Brinda -> Birundha -> Pirundha -> Piruntha.

But wait, though the spellings are indeed different, isn't the sound “more or less” the same?

If we can assign some value to the “phonetics”, then we can perhaps determine whether Brinda and Piruntha indeed sound the same!

This is where Soundex comes in!

Let us look at the names and their Soundex values. The first character of each code is the starting letter of the name.
The remaining three digits encode the consonant sounds that follow. The closer two codes are, the more similar the names sound.

Brinda = B653
Birundha = B653
Pirundha = P653
Piruntha = P653

padhma = P350
badma = B350
bathma = B350
badhma = B350
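
Since Tamil spellings often swap the first letter (P for B), it can help to compare the digit part of two codes separately from the leading letter. The helper names below are hypothetical and only meant as a rough sketch; they take codes that have already been computed, such as the ones listed above.

```javascript
// Hypothetical helpers for comparing two 4-character Soundex codes.
// They work on codes that are already computed, e.g. "B653" and "P653".

// True when the whole code, leading letter included, is identical.
function sameCode(codeA, codeB) {
  return codeA === codeB;
}

// True when only the three digits match, ignoring the leading letter.
// This tolerates swaps such as B/P at the start of a name.
function sameDigits(codeA, codeB) {
  return codeA.slice(1) === codeB.slice(1);
}

console.log(sameCode("B653", "P653"));   // Brinda vs Piruntha, full code
console.log(sameDigits("B653", "P653")); // Brinda vs Piruntha, digits only
console.log(sameDigits("P350", "B350")); // padhma vs badma, digits only
```

Ignoring the leading letter is a loosening of the usual comparison, so it will also pair names that merely share consonant sounds after different first letters.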

Here is a JavaScript implementation of Soundex.
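
What follows is a minimal sketch rather than a definitive implementation. It assumes the simplified variant in which vowels and h, w, y are skipped before repeated codes are collapsed, because that variant reproduces every value listed in this post. The stricter American Soundex rule, which codes a sound twice when a vowel separates two letters with the same code, would give K455 for Kuhlmann and T655 for Thurman instead. The names CODES and soundex are my own choices.

```javascript
// Consonant groups and their Soundex digits. Vowels and h, w, y have no code.
const CODES = {
  b: "1", f: "1", p: "1", v: "1",
  c: "2", g: "2", j: "2", k: "2", q: "2", s: "2", x: "2", z: "2",
  d: "3", t: "3",
  l: "4",
  m: "5", n: "5",
  r: "6",
};

// Simplified Soundex (assumption: uncoded letters never separate duplicates).
function soundex(name) {
  const letters = name.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  const first = letters[0];
  let digits = "";
  // Start from the first letter's own code so an immediate repeat is skipped.
  let previous = CODES[first] || "";

  for (const ch of letters.slice(1)) {
    const code = CODES[ch];
    if (code === undefined) continue; // uncoded letter: skip, keep `previous`
    if (code !== previous) digits += code;
    previous = code;
  }

  // Keep the first letter plus three digits, padding with zeros if needed.
  return (first.toUpperCase() + digits + "000").slice(0, 4);
}

for (const name of ["Brinda", "Piruntha", "padhma", "badma"]) {
  console.log(name + " = " + soundex(name));
}
```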

Let us take names from different cultures.

(1) Kuhlmann and Kulamagal

Kuhlmann = K450
Kulamagal = K452

(2) Thurman and Duraimurugan

Thurman = T650
Duraimurugan = D656

A name by any other spelling sounds just as sweet, doesn't it?

But some variations are too much to ask for,
like Lakshmi -> Letchumi:

Lakshmi = L250
Letchumi = L325

Applications use Soundex to overcome spelling differences in names and similar fields. To find out whether an applicant has a previous insurance policy, the database search is often performed on Soundex codes to pull up the possible matches. Most databases provide this function out of the box.
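
The lookup described above can be sketched in memory. Everything here is hypothetical: the policy records, the field names, and the helpers buildIndex and findCandidates. The code field stands in for a Soundex column that a real database would compute and store itself.

```javascript
// Hypothetical policy records; `code` plays the role of a stored Soundex column.
const policies = [
  { id: 1, name: "Brinda", code: "B653" },
  { id: 2, name: "Birundha", code: "B653" },
  { id: 3, name: "Piruntha", code: "P653" },
  { id: 4, name: "Lakshmi", code: "L250" },
];

// Group records by Soundex code so a lookup is a single Map access.
function buildIndex(records) {
  const index = new Map();
  for (const record of records) {
    if (!index.has(record.code)) index.set(record.code, []);
    index.get(record.code).push(record);
  }
  return index;
}

// Return every record sharing the given code, or an empty list.
function findCandidates(index, code) {
  return index.get(code) || [];
}

const policyIndex = buildIndex(policies);
const matches = findCandidates(policyIndex, "B653");
console.log(matches.map((record) => record.name).join(", "));
```

The matches are only candidates. Piruntha is missed here because its code starts with P, so a search that should tolerate a changed first letter would have to compare the digits alone.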