Name Origin: Pāli means "text", a reference to the canonical scriptures of Buddhism.

Classification. Indo-European, Indo-Iranian, Middle Indo-Aryan (early stage).

Overview. A popular language of the early historical period of India that became the liturgical language of Theravada Buddhism.

Status. Extinct. Pali is documented from the 3rd century BCE. It ceased to be spoken in the first millennium CE but persisted as an international means of communication among followers of Theravada Buddhism.

Distribution. Pali was probably spoken in the west of northern India; from there it spread in the 3rd century BCE to Sri Lanka, where it became a religious language. As such, it was later propagated to South East Asia.

Oldest Documents. They are the verse portions of the Tipitaka ('three baskets'), the canonical scriptures of Theravada Buddhism (also known as Hinayana), transmitted orally from the 3rd century BCE until they were written down towards the end of the millennium.


Vowels (10). The vowel system of Pali is more symmetrical than that of Sanskrit, having 5 short vowels (a, i, u, e, o) and 5 long vowels (a:, i:, u:, e:, o:).


Short and long e, as well as short and long o are allophones; before a consonant cluster e: and o: are shortened.

The syllabic liquids of Sanskrit (ṛ, ṝ, ḷ) are replaced by a, i or u, and the Sanskrit diphthongs ai and au by e: and o:, respectively.

Consonants (33). Pali preserves the whole range of stops and nasals from Sanskrit, including a complete set of voiceless stops (unaspirated and aspirated) and voiced stops (unaspirated and aspirated). Stops and nasals are articulated at five different places, being classified as labial, dental, retroflex, palatal and velar. The palatal stops are, in fact, affricates. The retroflex stops ɖ/ɖʰ become ɭ/ɭʰ in intervocalic position (allophones). Pali also preserves the 4 Sanskrit glides and liquids (j, r, l, w). In contrast, the three sibilants (fricatives) of Sanskrit are reduced to one (s). Pali has another fricative sound, the glottal ɦ.


A number of consonantal restrictions, not present in Sanskrit, developed in Pali:

  1. Final consonants are avoided.

  2. No more than two consonants can follow a short vowel, and no more than one can follow a long vowel.

  3. There is assimilation of consonant clusters (rakta > ratta, sapta > satta).
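The assimilation in restriction 3 can be illustrated with a minimal Python sketch. It rests on a simplifying assumption that is not stated in the text: only clusters of two unlike plain stops are handled, and the first stop takes the form of the second. The function name `assimilate_stop_clusters` and the stop inventory `PLAIN_STOPS` are hypothetical, and aspirated or retroflex stops and other cluster types are deliberately left out.

```python
# Plain (unaspirated, non-retroflex) stops in Latin transliteration.
# Assumption: this reduced inventory is enough for the two examples in the text.
PLAIN_STOPS = set("kgcjtdpb")


def assimilate_stop_clusters(word: str) -> str:
    """Make the first of two unlike adjacent plain stops identical to the second."""
    chars = list(word)
    for i in range(len(chars) - 1):
        first, second = chars[i], chars[i + 1]
        if first in PLAIN_STOPS and second in PLAIN_STOPS and first != second:
            chars[i] = second  # regressive assimilation: kt > tt, pt > tt
    return "".join(chars)


if __name__ == "__main__":
    for sanskrit in ("rakta", "sapta"):
        print(sanskrit, ">", assimilate_stop_clusters(sanskrit))
```

The sketch reproduces only the two pairs quoted above; actual Pali sound change involves many more cluster types than this rule covers.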

Sandhi (sound changes at the juncture of words) is not obligatory as in Sanskrit and is limited mainly to vowels; consonantal sandhi has disappeared.

Scripts and Orthography. Pali uses different scripts in different countries: Sinhala in Sri Lanka, Devanāgarī in India, Burmese in Myanmar, Khmer in Cambodia. The Latin script with diacritical marks is also employed:

*the aspirated stops and affricates are written as digraphs (pʰ = ph, dʰ = dh, etc).

*the retroflex stops ʈ , ɖ are written ṭ , ḍ

*the affricates tʃ , dʒ are represented as c , j

*the glottal fricative ɦ is written h

*the nasals ɳ , ɲ , ŋ are represented as ṇ , ñ , ṅ

*the retroflex liquid ɭ is written ḷ

*the glide w is written v
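The conventions listed above amount to a small symbol-to-symbol mapping, which can be sketched in Python as follows. The table `IPA_TO_LATIN` and the function `romanize` are hypothetical names introduced for illustration. Only the correspondences given in the list are encoded; any symbol not mentioned there (including the glide j, whose Latin spelling the list does not give) is passed through unchanged.

```python
# Correspondences taken from the list above (IPA symbol -> Latin transliteration).
IPA_TO_LATIN = {
    "tʃ": "c", "dʒ": "j",            # affricates
    "ʈ": "ṭ", "ɖ": "ḍ",              # retroflex stops
    "ɳ": "ṇ", "ɲ": "ñ", "ŋ": "ṅ",    # nasals
    "ɭ": "ḷ",                        # retroflex liquid
    "ɦ": "h",                        # glottal fricative
    "w": "v",                        # glide
    "ʰ": "h",                        # aspiration, giving digraphs such as ph, dh
}

# Try longer keys first so that "tʃ" is matched before any single symbol.
_KEYS = sorted(IPA_TO_LATIN, key=len, reverse=True)


def romanize(ipa: str) -> str:
    """Convert an IPA string to Latin transliteration using the table above."""
    out = []
    i = 0
    while i < len(ipa):
        for key in _KEYS:
            if ipa.startswith(key, i):
                out.append(IPA_TO_LATIN[key])
                i += len(key)
                break
        else:
            out.append(ipa[i])  # symbol not covered by the list: keep as-is
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for sample in ("pʰ", "dʰ", "tʃ", "dʒʰ", "dʰamma"):
        print(sample, "=", romanize(sample))
```

Longest-match-first lookup is used so that the two-symbol affricates are not split into their parts.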


  1. Nominal. Pali is an inflective language using a variety of grammatical cases to indicate the relative function of nouns, adjectives and pronouns, a process called declension. All adjectives agree with their nouns/pronouns in case, gender, and number.

  1. gender: masculine, neuter, feminine.

  1. number: singular, plural. The dual number of Sanskrit has disappeared.

  1. case: nominative, vocative, accusative, instrumental, dative, ablative, genitive, locative. The dative of Sanskrit has been almost eliminated and its functions assumed by the genitive; it is restricted to expressing purpose.

  2. As a consequence of the loss of final consonants and the disappearance of ṛ, the declension system has been largely reduced to the following paradigms:

  3. a (masculine, neuter), ā (feminine), i  (masculine, neuter, feminine), ī  (masculine, feminine), u (masculine, neuter, feminine), ū (masculine, feminine). Consonant declension is vestigial and of marginal importance (as an example we show that of a stem ending in n).


  1. The a-neuter and i-neuter stems have one single form for both the nominative and accusative cases; they differ from the masculine ones only in the nominative singular and nominative/accusative plural. For example, rūpa ('form') has nominative/accusative singular rūpaṁ and nominative/accusative plural rūpāni; akkhi ('eye') has nominative/accusative singular akkhi(ṁ) and nominative/accusative plural akkhīni or akkhī.

  1. The i-feminine stem declension is the same as that of the ī-stem.

  1. The u-declensions are identical to the i-declensions except in one form of the nominative plural; for example, in bhikkhu ('monk') it is bhikkhū or bhikkhavo.

  1. pronouns: personal, demonstrative, interrogative, indefinite, relative, reflexive.

  2. Personal pronouns are genderless and lack a form for the 3rd person; the demonstrative pronoun sa, which distinguishes gender, is used instead (see below). They have many variant forms.


  1. There are four demonstrative pronouns (two proximal and two distal) which distinguish gender and are inflected in all cases (except vocative). Their nominative forms are:


  1. There is a single interrogative pronoun ('who?'/'what?'). Its nominative forms are: ko (ms), ke (mp), ki (ns), kāni (np), kā (fs), kā/kāyo (fp).

  1. The indefinite pronouns are formed by adding the particles ci(d), api or cana to the interrogative pronoun; for example, koci (ms) means 'any', 'some', 'anyone'. When they are preceded by na ('not'), the meanings are 'none', 'no one', 'nothing'. These particles may also be added to interrogative adverbs to give them an indefinite sense: kudā ('when'), kudācana ('sometimes'), etc.

  1. The relative pronoun has these nominative forms: yo (ms), ye (mp), ya/yad (ns), yāni (np), yā (fs), yā/yāyo (fp).

  1. Attā ('self'), whose declension was shown above, is used as a reflexive pronoun. Others, like saya and sāma, are invariable.

  1. Verbal. The Old Indo-Aryan verbal system has been reorganized and simplified. All finite and most non-finite forms now derive from the present stem. The dual number was lost and the middle voice survives only as a relic. In addition, the three main past tenses of Sanskrit (aorist, imperfect and perfect) merged into a single past tense, the preterite.

  1. person and number: 1s, 2s, 3s; 1p, 2p, 3p.

  1. tense: present, preterite, future, conditional.

  2. Conjugations are based on the present stem (the third person singular of the present indicative minus the personal ending -ti). There are two conjugation types: in one, the stem ends in a, and in the other in e or ā (rarely in ī or o). The first type includes the Old Indo-Aryan thematic classes 1, 4, 6, and the second the athematic ones (see Sanskrit).

  1. The present is formed by adding 'primary endings' to the stem which are almost identical to those of Sanskrit (see below). The future uses the same stem and personal endings, being distinguished from the present by the infix -issa-.

  1. Most preterite forms derive from the Sanskrit sigmatic aorist, though now attached to the present stem. When the stem ends in a short vowel the aorist infix is -i-, and when it ends in a long vowel it is -si-. The Sanskrit past marker, the prefix a- called the 'augment', is optional. The preterite's endings derive from the Sanskrit 'secondary endings'.

  1. The conditional is a sort of preterite of the future; thus, it is formed from the future by replacing its personal endings with those of the preterite and by prefixing the stem with the 'augment'.

  1. For example, the conjugation of pac ('cook') [conjugation table not reproduced].
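The formation of the present and future described above can be sketched in Python for a stem ending in a, using pac ('cook'). Several points are assumptions rather than statements from the text: the primary endings are supplied from standard Pali grammar (-mi, -si, -ti, -ma, -tha, -nti), the stem-final a is lengthened before -mi and -ma, and the stem-final a is dropped before the future infix -issa-. All function names are hypothetical, and the preterite and conditional are not covered.

```python
# Primary endings; assumed from standard Pali grammar, not listed in the text.
PRIMARY_ENDINGS = {
    "1s": "mi", "2s": "si", "3s": "ti",
    "1p": "ma", "2p": "tha", "3p": "nti",
}


def present_stem(third_singular: str) -> str:
    """Present stem = 3rd person singular present minus the ending -ti."""
    if not third_singular.endswith("ti"):
        raise ValueError("expected a 3rd person singular form ending in -ti")
    return third_singular[:-2]


def attach(stem: str, ending: str) -> str:
    """Add a primary ending; a is lengthened before -mi and -ma (assumption)."""
    if stem.endswith("a") and ending in ("mi", "ma"):
        stem = stem[:-1] + "ā"
    return stem + ending


def conjugate_present(third_singular: str) -> dict:
    stem = present_stem(third_singular)
    return {slot: attach(stem, end) for slot, end in PRIMARY_ENDINGS.items()}


def conjugate_future(third_singular: str) -> dict:
    stem = present_stem(third_singular)
    if not stem.endswith("a"):
        raise ValueError("this sketch only handles stems ending in a")
    future_stem = stem[:-1] + "issa"  # infix -issa- replaces the stem-final a
    return {slot: attach(future_stem, end) for slot, end in PRIMARY_ENDINGS.items()}


if __name__ == "__main__":
    print(conjugate_present("pacati"))
    print(conjugate_future("pacati"))
```

Starting from the third person singular mirrors the definition of the present stem given earlier; stems of the second conjugation type (ending in e or ā) would need separate handling.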

  1. mood: indicative, optative, imperative.

  1. voice: active and passive.

  2. There was a gradual eclipse of the middle voice. The middle personal endings of Sanskrit were lost or remained as a relic. The passive voice is formed by adding the affix -ya-/-īya-/-iyya- to the root or to the present stem plus (usually) active personal endings.

  1. secondary (derivative) conjugations: causative, denominative.

  2. Derivative conjugations add an infix to the root to extend its meaning.

  1. The causative, marked by the infixes -(p)aya- or -(p)e-, conveys the idea of 'to cause to', 'to make to'. For example, gameti, the causative of the verb gam ('to go'), means 'to make go', 'to send'.

  1. Denominative verbs derive from a nominal root by adding the affixes -ya-, -aya-, -āya-, -īya- or, in some cases, by adding the personal endings directly to the root (without affix). They give a noun the notions 'to be or act like', 'to make', 'to experience'. For example, from the noun sadda ('sound') derives the denominative saddāyati ('makes a sound').

  1. The desiderative and intensive of Sanskrit subsist only in a few historical forms.

  1. non-finite forms: infinitive, gerund, and several participles, including past active and past passive, present active and gerundive.

  1. The infinitive is usually formed by adding the suffix -(i)tum to the present stem or in historical forms to the root. Another infinitive suffix, inherited from Vedic, is -tave.

  1. The gerund is formed by adding a variety of suffixes, either to the present stem or to the root, namely -tvā, -(t)ya (both inherited from Sanskrit), -tu, -tūna, -(i), -(i)yānā, -eyya, -a. It expresses an action that happens before another one and is, thus, generally followed by another verbal form.

  1. The past passive is the most important participle. It is formed by adding -(i)ta or, sometimes, -na to the root or to the present stem. There are frequent irregularities and, often, two participles exist for the same verb. It frequently serves as an adjective but it can also function as a verb, most often in passive constructions though some participles can have an active meaning and replace a finite verb.

  1. The past active participle is formed from the past passive by adding the suffix -va(nt) to it. For example: bhuttava(nt) meaning 'having eaten'.

  1. The gerundive, expressing obligation or necessity, is formed with a variety of suffixes added to the verbal root or to the present stem: -tabba, -teyya, -tayya, -tāya, -anīya, -ya, -a.

  1. The present participle is formed by adding -nt to the present stem. The suffix -māna derived from the Sanskrit middle present participle is also found but with an active sense.     


Word order is quite free, though not totally random. There is extensive and productive use of compound words, as in Sanskrit, though inordinate length is avoided. Nominal sentences (without a verb) and participial constructions are frequent.


There are three classes of Pali words. The most numerous are those that descend from Sanskrit and have undergone sound changes according to Pali phonology. A second class consists of words borrowed directly from Sanskrit without change. Other words have no obvious Sanskrit counterparts; some of them might derive from unknown Old Indo-Aryan dialects, while others might be borrowings from local languages.

Basic Vocabulary

one: eka

two: dvi

three: ti

four: catur

five: pañca

six: cha

seven: satta

eight: aṭṭha

nine:  nava

ten: dasa

hundred: sata

father: pitar, pitu

mother: mātar, mātu

brother: bhātar, bhātu

sister: bhaginī

son: putta

daughter: duhitā, dhītar, dhītā, dhītu

head: sira

eye: akkhi

foot: pāda

heart: hadaya

tongue: jivhā

Key Literary Works

All of them are Buddhist in nature and most of them belong to the Tipitaka Canon, although a few, like the Milindapanha, are postcanonical. All are anonymous.

100 BCE-100 CE   Digha-nikaya (Collection of long [discourses])

  1. It forms one of the main sections of the second basket of the Tipitaka, grouping the long discourses attributed to the Buddha, some of them cornerstones of Buddhism.

100 BCE-100 CE    Dhamma-pada (The way of truth)

  1. A very popular, poetic epitome of the Buddhist teachings.

100 BCE-100 CE    Thera-gatha (Songs of the ancient [monks])

  1. An important social and moral document in which Buddhist monks tell who they were and how and why they converted to Buddhism.

100 BCE-100 CE    Theri-gatha (Songs of the ancient [nuns])

  1. Poems similar to the previous ones but composed by Buddhist nuns.

100 CE-300 CE    Jataka (Births)

  1. Narratives in prose of the 547 previous lives of the Buddha, illustrating his character and progressive spiritual perfection. They developed from poems, mostly short, contained in the Canon.

300 CE    Milinda-panha (Questions of Milinda)

  1. The sage Nagasena answers challenging questions about the Buddhist doctrine posed by Milinda, better known as the Indo-Greek king Menander.

Further Reading

-Pali Literatur und Sprache. W. Geiger. Trübner (1916). Translated into English as Pali Grammar by Pali Text Society (1994).

-Indo-Aryan. From the Vedas to Modern Times. J. Bloch. Adrien-Maisonneuve (1965). Translated from the original French edition of 1934.

-Introduction to Pali. A. K. Warder. Pali Text Society (1963).

-'Aśokan Prakrit and Pali'. T. Oberlies. In Indo-Aryan Languages, 161-203. G. Cardona & D. Jain (eds). Routledge (2003).

-'Middle Indic'. S. W. Jamison. In The Ancient Languages of Asia and the Americas, 33-49. R. D. Woodard (ed). Cambridge University Press (2008).

