This page uses content from Wikipedia and is licensed under CC BY-SA.

Tamil script

Languages Tamil
Time period c. 700 – present[1]
Sister systems Dhives akuru
Direction Left-to-right
ISO 15924 Taml, 346
Unicode alias Tamil

The Tamil script (தமிழ் அரிச்சுவடி; Tamiḻ ariccuvaṭi; [t̪ɐmɨɻ] [aɾiˈt͡ʃːuʋaɖi]) is an abugida used by Tamils and Tamil speakers in India, Sri Lanka, Malaysia, Singapore and elsewhere to write the Tamil language. It is also used to write the liturgical language Sanskrit, with the help of consonants and diacritics not represented in the Tamil alphabet.[2] Certain minority languages such as Saurashtra, Badaga, Irula, and Paniya are also written in the Tamil script.[3]


Diverging evolution of Tamil Brahmi script (center column) into the Vatteluttu alphabet (leftmost column) and the Tamil script (rightmost column)

The Tamil script has 12 vowels (உயிரெழுத்து; uyireḻuttu; "soul-letters"), 18 consonants (மெய்யெழுத்து; meyyeḻuttu; "body-letters") and one special character, the ஃ (ஆயுத எழுத்து; āyutha eḻuttu; the Tamil version of the visarga). ஃ is pronounced "akku" (அக்கு) and is classified in Tamil grammar as neither a consonant nor a vowel.[4] However, it is listed at the end of the vowel set. The script is syllabic, not alphabetic. The complete script therefore consists of the 31 letters in their independent form and an additional 216 combinant letters (18 consonants × 12 vowels), for a total of 247 combinations (உயிர்மெய்யெழுத்து; uyirmeyyeḻuttu; "soul-body-letters") of a consonant and a vowel, a mute consonant, or a vowel alone. The combinant letters are formed by adding a vowel marker to the consonant. Some vowels require the basic shape of the consonant to be altered in a way that is specific to that vowel. Others are written by adding a vowel-specific suffix to the consonant, others a prefix, and still others both a prefix and a suffix. In every case, the vowel marker is different from the standalone character for the vowel.
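The letter counts above can be checked with a short Python sketch. It assumes the letters are built from code points of the Unicode Tamil block, and the names `VOWELS`, `CONSONANTS`, `VOWEL_SIGNS` and `build_inventory` are illustrative only; it is not taken from any particular library.

```python
# Code points from the Unicode Tamil block (assumed inventory for this sketch).
VOWELS = [chr(cp) for cp in (0x0B85, 0x0B86, 0x0B87, 0x0B88, 0x0B89, 0x0B8A,
                             0x0B8E, 0x0B8F, 0x0B90, 0x0B92, 0x0B93, 0x0B94)]
CONSONANTS = [chr(cp) for cp in (0x0B95, 0x0B99, 0x0B9A, 0x0B9E, 0x0B9F, 0x0BA3,
                                 0x0BA4, 0x0BA8, 0x0BAA, 0x0BAE, 0x0BAF, 0x0BB0,
                                 0x0BB2, 0x0BB5, 0x0BB4, 0x0BB3, 0x0BB1, 0x0BA9)]
AYTHAM = "\u0B83"   # the special character
PULLI = "\u0BCD"    # dot that marks a mute (pure) consonant

# Dependent vowel signs for a, ā, i, ī, u, ū, e, ē, ai, o, ō, au.
# The first entry is empty because "a" is inherent in the consonant letter.
VOWEL_SIGNS = ["", "\u0BBE", "\u0BBF", "\u0BC0", "\u0BC1", "\u0BC2",
               "\u0BC6", "\u0BC7", "\u0BC8", "\u0BCA", "\u0BCB", "\u0BCC"]


def build_inventory():
    """Return (independent letters, combinant letters) as lists of strings."""
    pure_consonants = [c + PULLI for c in CONSONANTS]
    independent = VOWELS + pure_consonants + [AYTHAM]
    combinant = [c + sign for c in CONSONANTS for sign in VOWEL_SIGNS]
    return independent, combinant


independent, combinant = build_inventory()
print(len(independent), len(combinant), len(independent) + len(combinant))
```

The three printed numbers correspond to the 31 independent letters, the 216 combinant letters and the total of 247.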

The Tamil script is written from left to right.


Historical evolution of Tamil writing from the earlier Tamil Brahmi near the top to the current Tamil script at bottom.

The Tamil script, like the other Brahmic scripts, is thought to have evolved from the original Brahmi script.[5] The earliest inscriptions which are accepted examples of Tamil writing date to a time just after the Ashokan period. The script used by such inscriptions is commonly known as the Tamil-Brahmi, or "Tamili script", and differs in many ways from standard Ashokan Brahmi. For example, early Tamil-Brahmi, unlike Ashokan Brahmi, had a system to distinguish between pure consonants (m, in this example) and consonants with an inherent vowel (ma, in this example). In addition, according to Iravatham Mahadevan, early Tamil Brahmi used slightly different vowel markers, had extra characters to represent letters not found in Sanskrit, and omitted letters for sounds not present in Tamil such as voiced consonants and aspirates.[5] Inscriptions from the 2nd century use a later form of Tamil-Brahmi, which is substantially similar to the writing system described in the Tolkāppiyam, an ancient Tamil grammar. Most notably, they used the puḷḷi to suppress the inherent vowel.[6] The Tamil letters thereafter evolved towards a more rounded form, and by the 5th or 6th century, they had reached a form called the early vaṭṭeḻuttu.[7]

The modern Tamil script does not, however, descend from that script.[8] In the 6th century, the Pallava dynasty created a new script for Tamil, and the Grantha alphabet evolved from it, with letters added from the Vaṭṭeḻuttu alphabet for sounds not found in Sanskrit.[9] In parallel with the Pallava script, a new script (the Chola-Pallava script, which evolved into the modern Tamil script) emerged in Chola territory. Its glyphs developed along the same lines as those of the Pallava script, but it did not evolve from it. By the 8th century, the new scripts had supplanted Vaṭṭeḻuttu in the Chola and Pallava kingdoms, which lay in the northern portion of the Tamil-speaking region. However, Vaṭṭeḻuttu continued to be used in the southern portion of the Tamil-speaking region, in the Chera and Pandyan kingdoms, until the 11th century, when the Pandyan kingdom was conquered by the Cholas.[10]

With the fall of the Pallava kingdom, the Chola dynasty pushed the Chola-Pallava script as the de facto script. Over the next few centuries, the Chola-Pallava script evolved into the modern Tamil script. The Grantha script and its parent script notably influenced the Tamil script. The use of palm leaves as the primary medium for writing also led to changes in the script. The scribe had to be careful not to pierce the leaves with the stylus while writing, because a leaf with a hole was more likely to tear and decayed faster. As a result, the use of the puḷḷi to distinguish pure consonants became rare, with pure consonants usually being written as if the inherent vowel were present. Similarly, the vowel marker for the kuṟṟiyal ukaram, a half-rounded u which occurs at the end of some words and in the medial position in certain compound words, also fell out of use and was replaced by the marker for the simple u. The puḷḷi did not fully reappear until the introduction of printing, but the marker for the kuṟṟiyal ukaram never came back into use, although the sound itself still exists and plays an important role in Tamil prosody.

The forms of some of the letters were simplified in the 19th century to make the script easier to typeset. In the 20th century, the script was simplified even further in a series of reforms, which regularised the vowel markers used with consonants by eliminating special markers and most irregular forms.

Relationship with other Indic scripts

The Tamil script differs from other Brahmi-derived scripts in a number of ways. Unlike every other Brahmic script, it does not regularly represent voiced or aspirated stop consonants, as these are not phonemes of the Tamil language, even though voiced and fricative allophones of stops do appear in spoken Tamil. Thus the character க் k, for example, represents /k/ but can also be pronounced [ɡ] or [x] based on the rules of Tamil grammar. A separate set of characters appears for these sounds when the Tamil script is used to write Sanskrit or other languages.

Also unlike other Brahmic scripts, the Tamil script rarely uses typographic ligatures to represent conjunct consonants, which are far less frequent in Tamil than in other Indian languages. Where they occur, a conjunct is written as the character for the first consonant, followed by the puḷḷi to suppress its inherent vowel, and then the character for the second consonant. There are a few exceptions, namely க்ஷ kṣa and ஸ்ரீ śrī.
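In terms of Unicode code points, the rule for conjuncts can be illustrated with a minimal Python sketch. The helper name `conjunct` is hypothetical, and whether a given sequence is drawn as a ligature (as for kṣa) or as two separate glyphs is decided by the font, not by the string.

```python
PULLI = "\u0BCD"  # TAMIL SIGN VIRAMA, suppresses the inherent vowel


def conjunct(first, second):
    """Join two consonant letters as first + puḷḷi + second."""
    return first + PULLI + second


# k + ṣa: stored as three code points, usually displayed as one ligature.
ksha = conjunct("\u0B95", "\u0BB7")

# p + pa: stored the same way, displayed as two separate glyphs.
ppa = conjunct("\u0BAA", "\u0BAA")

for text in (ksha, ppa):
    print(text, [f"U+{ord(ch):04X}" for ch in text])
```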

ISO 15919 is an international standard for the transliteration of Tamil and other Indic scripts into Latin characters. It uses diacritics to map the much larger set of Brahmic consonants and vowels to the Latin script. Because English is written in the Latin script, ISO 15919 can be used to render Tamil words in English text.
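As a rough illustration of such a transliteration, the following Python sketch maps the 18 basic consonants and 12 vowels to their ISO 15919 letters. It is a simplified sketch, not a complete implementation of the standard: it assumes NFC-normalised input, ignores the Grantha letters and the special character ஃ, and the function name `transliterate` is made up for this example.

```python
CONSONANTS = {
    "\u0B95": "k", "\u0B99": "ṅ", "\u0B9A": "c", "\u0B9E": "ñ",
    "\u0B9F": "ṭ", "\u0BA3": "ṇ", "\u0BA4": "t", "\u0BA8": "n",
    "\u0BAA": "p", "\u0BAE": "m", "\u0BAF": "y", "\u0BB0": "r",
    "\u0BB2": "l", "\u0BB5": "v", "\u0BB4": "ḻ", "\u0BB3": "ḷ",
    "\u0BB1": "ṟ", "\u0BA9": "ṉ",
}
VOWELS = {
    "\u0B85": "a", "\u0B86": "ā", "\u0B87": "i", "\u0B88": "ī",
    "\u0B89": "u", "\u0B8A": "ū", "\u0B8E": "e", "\u0B8F": "ē",
    "\u0B90": "ai", "\u0B92": "o", "\u0B93": "ō", "\u0B94": "au",
}
VOWEL_SIGNS = {
    "\u0BBE": "ā", "\u0BBF": "i", "\u0BC0": "ī", "\u0BC1": "u",
    "\u0BC2": "ū", "\u0BC6": "e", "\u0BC7": "ē", "\u0BC8": "ai",
    "\u0BCA": "o", "\u0BCB": "ō", "\u0BCC": "au",
}
PULLI = "\u0BCD"


def transliterate(text):
    """Transliterate basic Tamil letters to ISO 15919 (simplified sketch)."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == PULLI:
                i += 1                      # pure consonant, no vowel
            elif nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])
                i += 1                      # consonant + explicit vowel sign
            else:
                out.append("a")             # inherent vowel
        elif ch in VOWELS:
            out.append(VOWELS[ch])          # independent vowel
        else:
            out.append(ch)                  # anything else passes through
        i += 1
    return "".join(out)


print(transliterate("\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"))  # the word "Tamil"
```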


Mangulam Tamil Brahmi inscription in Mangulam, Madurai district, Tamil Nadu dated to Tamil Sangam period c. 400 BCE to c. 200 CE.
Explanation for Mangulam Tamil Brahmi inscription in Mangulam, Madurai district, Tamil Nadu dated to Tamil Sangam period c. 400 BCE to c. 200 CE.
Left: Tampiran Vanakkam (Doctrina Christum) was the first book in Tamil, printed on 20 October 1578. Right: An earlier book in Tamil printed in 1781.

Basic consonants

Consonants are called the "body" (mei) letters. They are classified into three categories: vallinam (hard consonants), mellinam (soft consonants, including all nasals), and idaiyinam (medium consonants).

There are some lexical rules for formation of words. The Tolkāppiyam describes such rules. Some examples: a word cannot end in certain consonants, and cannot begin with some consonants including r-, l- and ḻ-; there are two consonants for the dental n - which one should be used depends on whether the n occurs at the start of the word and on the letters around it. (Historically, one n was pronounced as an alveolar consonant, as is still true in Malayalam.)

The order of the alphabet (strictly, abugida) in Tamil closely matches that of languages that are close to it both geographically and linguistically, reflecting the common origin of their scripts from Brahmi.

Table: Tamil consonants.[11]

Consonant ISO 15919 Category IPA
க் k vallinam [k], [ɡ], [x], [ɣ], [h]
ங் ṅ mellinam [ŋ]
ச் c vallinam [t͡ʃ], [d͡ʒ], [ʃ], [s], [ʒ], [z]
ஞ் ñ mellinam [ɲ]
ட் ṭ vallinam [ʈ], [ɖ], [ɽ]
ண் ṇ mellinam [ɳ]
த் t vallinam [t̪], [d̪], [ð]
ந் n mellinam [n̪]
ப் p vallinam [p], [b], [β]
ம் m mellinam [m]
ய் y idaiyinam [j]
ர் r idaiyinam [ɾ]
ல் l idaiyinam [l]
வ் v idaiyinam [ʋ]
ழ் ḻ idaiyinam [ɻ]
ள் ḷ idaiyinam [ɭ]
ற் ṟ vallinam [r], [t], [d]
ன் ṉ mellinam [n]

Grantha consonants used in Tamil

Spoken Tamil has incorporated many phonemes that were not part of the Tolkāppiyam classification. The letters used to write these sounds, known as "grantha", are used as part of Tamil. They are taught from elementary school and are incorporated in the Tamil Nadu Government encoding called Tamil All Character Encoding (TACE16).

Table: Grantha consonants in Tamil.[11]

Consonant ISO 15919 IPA
ஜ் j [d͡ʒ]
ஶ் ś [ɕ], [ʃ]
ஷ் ṣ [ʂ]
ஸ் s [s]
ஹ் h [h]
க்ஷ் kṣ [kʂ]

There is also the compound ஸ்ரீ (śrī), equivalent to श्री in Devanagari.

In recent times, four combinations of basic Tamil letters have come to be used to represent the sounds of the English letters 'f', 'z' and 'x', as well as the 'kh' sound found in some East Asian names. They are used for writing English and Arabic names and words in Tamil. The combinations are ஃப for f, ஃஜ for z, ஃஸ் for x, and ஃக் for kh. For example: Asif = அசிஃப், Azaarudheen = அஃஜாருதீன், Rex = ரெஃஸ், Genghis Khan = செங்கிஸ் ஃகான்.[citation needed]

There has also been an effort to differentiate voiced and unvoiced consonants through superscripted and subscripted integers, with one, two, three and four standing for the unvoiced, unvoiced aspirated, voiced and voiced aspirated consonants, respectively. This was used to transcribe Sanskrit words in Sanskrit-Tamil books.[12][13] For example: க₁ = ka, க₂ = kha, க₃ = ga, க₄ = gha. This extension of the Tamil script is not yet recognized by Unicode. The number should be placed immediately next to the root consonant character; depending on the order of the code points, Unicode currently renders such syllables as கௌ₂ or க₂ௌ.


Vowels

Vowels are also called the 'life' (uyir) or 'soul' letters. Together with the consonants (mei, the 'body' letters), they form compound, syllabic (abugida) letters that are called 'living' letters (uyir mei, i.e. letters that have both 'body' and 'soul').

Tamil vowels are divided into short and long (five of each type) and two diphthongs.

Isolated form

Table: Tamil vowels (Isolated form).[11]

Vowel ISO 15919 IPA
அ a [ʌ]
ஆ ā [ɑː]
இ i [i]
ஈ ī [iː]
உ u [u], [ɯ]
ஊ ū [uː]
எ e [e]
ஏ ē [eː]
ஐ ai [ʌj]
ஒ o [o]
ஓ ō [oː]
ஔ au [ʌʋ]

Compound form

Using the consonant 'k' as an example:

Formation Compound form ISO 15919 IPA
க் + அ க ka [kʌ]
க் + ஆ கா kā [kɑː]
க் + இ கி ki [ki]
க் + ஈ கீ kī [kiː]
க் + உ கு ku [ku], [kɯ]
க் + ஊ கூ kū [kuː]
க் + எ கெ ke [ke]
க் + ஏ கே kē [keː]
க் + ஐ கை kai [kʌj]
க் + ஒ கொ ko [ko]
க் + ஓ கோ kō [koː]
க் + ஔ கௌ kau [kʌʋ]

The special letter ஃ (called akh) is the visarga. It traditionally served a purely grammatical function, but in modern times it has come to be used as a diacritic to represent foreign sounds. For example, ஃப is used for the English sound f, which is not found in Tamil.

The long (nedil) vowels are about twice as long as the short (kuṟil) vowels. The diphthongs are usually pronounced about one and a half times as long as the short vowels, though some grammatical texts place them with the long (nedil) vowels.

As can be seen in the compound form, the vowel sign can be added to the right, left or both sides of the consonants. It can also form a ligature. These rules are evolving and older use has more ligatures than modern use. What you actually see on this page depends on your font selection; for example, Code2000 will show more ligatures than Latha.

There are proponents of script reform who want to eliminate all ligatures and let all vowel signs appear on the right side.

Unicode encodes the characters in logical order (always the consonant first), whereas legacy 8-bit encodings (such as TSCII) prefer the written order. Converting from one encoding to the other therefore requires reordering; it is not sufficient simply to map one set of code points to the other.
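The reordering step can be sketched in Python as follows. This is only an illustration of the idea, under the assumption that a left-side vowel sign moves in front of the single character that precedes it; it does not produce TSCII bytes, and the names `LEFT_SIGNS`, `SPLIT_SIGNS` and `to_written_order` are hypothetical.

```python
# Vowel signs that are drawn to the left of the consonant (e, ē, ai).
LEFT_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}

# Two-part vowel signs (o, ō, au): one piece on each side of the consonant.
SPLIT_SIGNS = {
    "\u0BCA": ("\u0BC6", "\u0BBE"),
    "\u0BCB": ("\u0BC7", "\u0BBE"),
    "\u0BCC": ("\u0BC6", "\u0BD7"),
}


def to_written_order(text):
    """Rearrange logical-order Tamil text into left-to-right written order."""
    out = []
    for ch in text:
        if ch in LEFT_SIGNS and out:
            out.insert(len(out) - 1, ch)      # move sign before its consonant
        elif ch in SPLIT_SIGNS and out:
            left, right = SPLIT_SIGNS[ch]
            out.insert(len(out) - 1, left)    # left piece before the consonant
            out.append(right)                 # right piece after it
        else:
            out.append(ch)                    # already in written order
    return "".join(out)


logical = "\u0B95\u0BCB"   # ka + vowel sign ō, as stored in Unicode
written = to_written_order(logical)
print([f"U+{ord(ch):04X}" for ch in written])
```

A converter to a written-order encoding would apply a step like this first and only then map each piece to the target code page.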

Compound table of Tamil letters

The following table lists vowel (uyir or life) letters across the top and consonant (mei or body) letters along the side, the combination of which gives all Tamil compound (uyirmei) letters.


Tolkāppiyam consonants

Consonant ISO 15919 a ā i ī u ū e ē ai o ō au
க் k க கா கி கீ கு கூ கெ கே கை கொ கோ கௌ
ங் ṅ ங ஙா ஙி ஙீ ஙு ஙூ ஙெ ஙே ஙை ஙொ ஙோ ஙௌ
ச் c ச சா சி சீ சு சூ செ சே சை சொ சோ சௌ
ஞ் ñ ஞ ஞா ஞி ஞீ ஞு ஞூ ஞெ ஞே ஞை ஞொ ஞோ ஞௌ
ட் ṭ ட டா டி டீ டு டூ டெ டே டை டொ டோ டௌ
ண் ṇ ண ணா ணி ணீ ணு ணூ ணெ ணே ணை ணொ ணோ ணௌ
த் t த தா தி தீ து தூ தெ தே தை தொ தோ தௌ
ந் n ந நா நி நீ நு நூ நெ நே நை நொ நோ நௌ
ப் p ப பா பி பீ பு பூ பெ பே பை பொ போ பௌ
ம் m ம மா மி மீ மு மூ மெ மே மை மொ மோ மௌ
ய் y ய யா யி யீ யு யூ யெ யே யை யொ யோ யௌ
ர் r ர ரா ரி ரீ ரு ரூ ரெ ரே ரை ரொ ரோ ரௌ
ல் l ல லா லி லீ லு லூ லெ லே லை லொ லோ லௌ
வ் v வ வா வி வீ வு வூ வெ வே வை வொ வோ வௌ
ழ் ḻ ழ ழா ழி ழீ ழு ழூ ழெ ழே ழை ழொ ழோ ழௌ
ள் ḷ ள ளா ளி ளீ ளு ளூ ளெ ளே ளை ளொ ளோ ளௌ
ற் ṟ ற றா றி றீ று றூ றெ றே றை றொ றோ றௌ
ன் ṉ ன னா னி னீ னு னூ னெ னே னை னொ னோ னௌ

Grantha compound table

Grantha consonants

Consonant ISO 15919 a ā i ī u ū e ē ai o ō au
ஶ் ś ஶ ஶா ஶி ஶீ ஶு ஶூ ஶெ ஶே ஶை ஶொ ஶோ ஶௌ
ஜ் j ஜ ஜா ஜி ஜீ ஜு ஜூ ஜெ ஜே ஜை ஜொ ஜோ ஜௌ
ஷ் ṣ ஷ ஷா ஷி ஷீ ஷு ஷூ ஷெ ஷே ஷை ஷொ ஷோ ஷௌ
ஸ் s ஸ ஸா ஸி ஸீ ஸு ஸூ ஸெ ஸே ஸை ஸொ ஸோ ஸௌ
ஹ் h ஹ ஹா ஹி ஹீ ஹு ஹூ ஹெ ஹே ஹை ஹொ ஹோ ஹௌ
க்ஷ் kṣ க்ஷ க்ஷா க்ஷி க்ஷீ க்ஷு க்ஷூ க்ஷெ க்ஷே க்ஷை க்ஷொ க்ஷோ க்ஷௌ

Numerals and symbols

Apart from the digits 0–9, Tamil also has numerals for 10, 100 and 1000. Symbols for day, month, year, debit, credit, as above, rupee and numeral are present as well.

0 1 2 3 4 5 6 7 8 9 10 100 1000
௦ ௧ ௨ ௩ ௪ ௫ ௬ ௭ ௮ ௯ ௰ ௱ ௲

day ௳ month ௴ year ௵ debit ௶ credit ௷ as above ௸ rupee ௹ numeral ௺ time quantity
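The numeral characters can be explored with Python's standard `unicodedata` module. The sketch below assumes the Unicode code points U+0BE6 to U+0BEF for the digits and U+0BF0 to U+0BF2 for 10, 100 and 1000; the helper `to_tamil_digits` is a hypothetical name and only handles non-negative integers written positionally.

```python
import unicodedata

TAMIL_DIGIT_ZERO = 0x0BE6  # the ten digits occupy U+0BE6..U+0BEF


def to_tamil_digits(n):
    """Write a non-negative integer positionally with Tamil digits."""
    return "".join(chr(TAMIL_DIGIT_ZERO + int(d)) for d in str(n))


year = to_tamil_digits(2024)
print(year, int(year))  # int() accepts any Unicode decimal digits

# The dedicated signs for 10, 100 and 1000 carry their own numeric values.
for cp in (0x0BF0, 0x0BF1, 0x0BF2):
    ch = chr(cp)
    print(f"U+{cp:04X}", unicodedata.name(ch), unicodedata.numeric(ch))
```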

In Unicode

The Unicode range for Tamil is U+0B80–U+0BFF. Grey areas indicate non-assigned code points. Most of the non-assigned codepoints are designated reserved because they are in the same relative position as characters assigned in other South Asian script blocks that correspond to phonemes that don't exist in the Tamil script.
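The block range can be used directly in code. The following Python sketch tests block membership and lists the assigned code points according to whatever Unicode version the running interpreter ships with, so the exact count may differ between Python releases; the function names are illustrative.

```python
import unicodedata

TAMIL_BLOCK = range(0x0B80, 0x0C00)  # U+0B80..U+0BFF inclusive


def in_tamil_block(ch):
    """True if the single character ch lies in the Tamil block."""
    return ord(ch) in TAMIL_BLOCK


def assigned_code_points():
    """Code points of the block that are assigned (category other than Cn)."""
    return [cp for cp in TAMIL_BLOCK
            if unicodedata.category(chr(cp)) != "Cn"]


assigned = assigned_code_points()
print(len(assigned), "assigned of", len(TAMIL_BLOCK))
print(in_tamil_block("\u0B95"), in_tamil_block("k"))
```

For instance, U+0B96 to U+0B98 are left unassigned: other South Asian script blocks place kha, ga and gha there, letters that the Tamil script does not have.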

Like other South Asian scripts in Unicode, the Tamil encoding was originally derived from the ISCII standard. Both ISCII and Unicode encode Tamil as an abugida. In an abugida, each basic character represents a consonant and a default vowel. Consonants with a different vowel, or bare consonants, are represented by adding a modifier character to a base character. Each code point representing a similar phoneme is encoded in the same relative position in each South Asian script block in Unicode, including Tamil. Although Unicode represents Tamil as an abugida, all the pure consonants (consonants with no associated vowel) and syllables in Tamil can be represented by combining multiple Unicode code points, as can be seen in the Unicode Tamil Syllabary below.

In Unicode 5.1, named sequences were added for all Tamil pure consonants and syllables. Unicode 5.1 also has a named sequence for the Tamil ligature SRI (śrī), ஶ்ரீ. The name of this sequence is TAMIL SYLLABLE SHRII, and it is composed of the Unicode sequence U+0BB6 U+0BCD U+0BB0 U+0BC0.
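The code point sequence quoted above can be inspected with Python's `unicodedata` module, as in this small sketch. The helper `describe` is a hypothetical name; it only lists the individual characters and does not look up the named sequence itself.

```python
import unicodedata

# The four code points of TAMIL SYLLABLE SHRII, in logical order.
SHRII = "\u0BB6\u0BCD\u0BB0\u0BC0"


def describe(text):
    """Return (code point, character name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(SHRII):
    print(code_point, name)
```

The output shows a consonant, the puḷḷi (named VIRAMA in Unicode), a second consonant and a vowel sign, which is the same consonant + puḷḷi + consonant pattern used for other conjuncts.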

Table: Tamil block in Unicode (official Unicode Consortium code chart, PDF), as of Unicode version 10.0.

Programmatic access

  • Tamil script can be manipulated using the Python library called open-Tamil.[14]
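As an illustration of the kind of processing such a library performs, the following sketch splits a Tamil string into letters using only the Python standard library. It is not the open-Tamil API; the function `split_letters` is hypothetical and relies on the assumption that vowel signs and the puḷḷi have a Unicode general category starting with "M" (combining mark).

```python
import unicodedata


def split_letters(text):
    """Group each base character with the combining marks that follow it."""
    letters = []
    for ch in text:
        if letters and unicodedata.category(ch).startswith("M"):
            letters[-1] += ch        # vowel sign or puḷḷi joins previous letter
        else:
            letters.append(ch)       # a new base character starts a new letter
    return letters


word = "\u0BA4\u0BAE\u0BBF\u0BB4\u0BCD"   # the word "Tamil", five code points
print(split_letters(word))                # three letters: ta, mi, ḻ
```

Counting letters this way gives three for the word above, whereas `len(word)` reports five code points.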


  1. ^ Rajan, K. (December 2001). "Territorial Division as Gleaned from Memorial Stones". East and West. Istituto Italiano per l'Africa e l'Oriente (IsIAO). 51 (3/4): 363. Retrieved 30 December 2016.  (table showing Tamil in row for the 601–800 period)
  2. ^ Allen, Julie (2006), The Unicode 5.0 Standard (5 ed.), Upper Saddle River, NJ: Addison-Wesley, ISBN 0-321-48091-0  at p. 324
  3. ^ Lewis, M. Paul, ed. (2009), Ethnologue: Languages of the World (16th ed.), Dallas, Tex.: SIL International, retrieved 2009-08-28 
  4. ^ University of Madras Tamil Lexicon, page 148: "அலியெழுத்து [ aliyeḻuttu n ali-y-eḻuttu . < அலி¹ +. 1. The letter , as being regarded as neither a vowel nor a consonant; ஆய்தம். (வெண்பாப். முதன்மொ. 6, உரை.) 2. Consonants; மெய்யெ ழுத்து. (பிங்.)."]
  5. ^ a b Mahadevan 2003, p. 173.
  6. ^ Mahadevan 2003, p. 230.
  7. ^ Mahadevan 2003, p. 211.
  8. ^ Mahadevan 2003, p. 209.
  9. ^ Mahadevan 2003, p. 213.
  10. ^ Mahadevan 2003, p. 212.
  11. ^ a b c Steever 1996, pp. 426–430.
  12. ^ []
  13. ^ []
  14. ^ "Open-Tamil 0.65 : Python Package Index". 


  • Mahadevan, Iravatham (2003), Early Tamil Epigraphy from the Earliest Times to the Sixth Century A.D., Harvard Oriental Series, Volume 62, Cambridge: Harvard University Press, ISBN 0-674-01227-5 
  • Steever, Sanford B. (1996), "Tamil Writing", in Bright, William; Daniels, Peter T., The World's Writing Systems, New York: Oxford University Press, pp. 426–430, ISBN 0-19-507993-0
