
Alternative Names: Romany, Romanes, Gypsy.

Name Origin: Derived from rom, the self-designation of the speakers of the language, meaning "men". Romani, then, is "the [language] of men."

Classification: Indo-European, Indo-Iranian, Modern Indo-Aryan, Central.

Overview. Romani is the language of the Rom people, who migrated from India in the 9th-10th centuries CE, reaching Europe around the 12th-13th centuries and every continent in the 20th. They were thought to come from Egypt and for that reason were called Gypsies. Romani is the only Indo-Aryan language spoken exclusively outside South Asia, though it has been deeply affected by its prolonged contact with European languages. It doesn’t have a written literature.

Distribution and Speakers: The total number of Romani speakers is estimated at around 4-5 million. They live mainly in the Balkans and Russia, though they have spread to all five continents. The following countries have the largest numbers of Romani speakers:

Czech Republic

Status. Several European countries grant Romani minority status, but not those where the Rom are most numerous; there, the Rom and their language still face discrimination. Many Rom communities have lost their mother tongue.

Varieties. There are more than 50 Romani dialects, which can be classified into five groups: Balkan, Vlax (originating in Romania), Central (Czech Republic, Slovakia, Poland, Hungary), Northern (northern and western Europe), and Northeastern (Poland, Baltic countries, Russia). Some dialects, like British and Iberian Romani, are now extinct.

Oldest Documents

  1542. Egipt Speche is a list of 13 Romani sentences with English translation, the first attestation of the language.

  1570. Clene Gijpta Sprake includes 53 Romani words and sentences translated into Low German.

  1597. More than 50 Romani words were translated into Latin and printed under the title De Nubianis erronibus quos Itali Cingaros appellant eorumque lingua.


Vowels (5-6). Five vowels are common to all dialects: a, e, i, o, u. In some dialects there is an additional central vowel ə or ɨ. Western dialects generally distinguish between short and long vowels.

Consonants (25). Romani preserves the entire Indo-European set of stops and affricates (non-aspirated voiceless, aspirated voiceless, non-aspirated voiced). Unlike other Indo-Aryan languages, it has lost the voiced aspirated stops, as well as the retroflex stops that those languages acquired. Eastern dialects tend to palatalize consonants under Slavic influence. Most fricatives are relatively recent innovations derived from contact with neighboring Slavic languages.


Stress: generally final in native words but penultimate in many borrowed ones.

Script and Orthography

Romani is mainly an oral language. However, there is a growing number of publications in Romani using the scripts of the regions and countries of origin with some adjustments to Romani phonetics.

  1. [tʃ] is usually written č.

  2. [tʃʰ] is usually written čh.

  3. [dʒ] is usually written dž.

  4. [ʃ] is usually written š.

  5. [ʒ] is usually written ž.

  6. aspirated stops are written as digraphs: ph, th, kh.
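The correspondences above amount to a small lookup table. As a minimal sketch, assuming a broad IPA transcription as input (the name `ipa_to_romani` and the longest-match strategy are illustrative choices, not part of any published orthographic standard), the respelling could be done like this:

```python
# Sound-to-letter correspondences taken from the list above.
# The table name and the longest-match approach are illustrative assumptions.
IPA_TO_GRAPHEME = {
    "tʃʰ": "čh",
    "tʃ": "č",
    "dʒ": "dž",
    "ʃ": "š",
    "ʒ": "ž",
    "pʰ": "ph",
    "tʰ": "th",
    "kʰ": "kh",
}

# Try longer IPA sequences first so that "tʃʰ" is not split into "tʃ" + "ʰ".
_KEYS = sorted(IPA_TO_GRAPHEME, key=len, reverse=True)


def ipa_to_romani(ipa: str) -> str:
    """Respell a broad IPA string; symbols not in the table pass through unchanged."""
    out = []
    i = 0
    while i < len(ipa):
        for key in _KEYS:
            if ipa.startswith(key, i):
                out.append(IPA_TO_GRAPHEME[key])
                i += len(key)
                break
        else:
            # No table entry starts here: keep the character as it is.
            out.append(ipa[i])
            i += 1
    return "".join(out)
```

For instance, `ipa_to_romani("tʃʰavo")` gives `čhavo` and `ipa_to_romani("pandʒ")` gives `pandž`. Regional publications that use other scripts would need their own tables.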

Morphology. Like other Modern Indo-Aryan languages, Romani has two declensional cases (nominative and oblique) but European loanwords are marked, by agglutination, for another five cases due to Greek influence. Adjectives usually (but not always) agree with their nouns. Exceptionally among Modern Indo-Aryan languages, Romani has a definite article.

  Nominal

  gender: Romani distinguishes between animate and inanimate nouns; animates can be masculine or feminine. Most masculine nouns end in -o and most feminine ones in -i or -a. Nouns ending in a consonant may be either masculine or feminine.

  Masculine:

  čhavo ('boy')

  šero ('head')

  kher ('house')

  Feminine:

  romni ('woman')

  cipa ('skin')

  suv ('needle')

  number: singular, plural. Masculine nouns ending in -o change it to -e; those ending in a diphthong add -ja, and those ending in a consonant add -a. Feminine nouns ending in -i drop it and add -ja; those ending in a diphthong add -ja, and those ending in a consonant add -a (or, rarely, -ja).

  Masculine:

  čhavo ('boy') → čhave ('boys')

  kher ('house') → khera ('houses')

  mui ('mouth') → muija ('mouths')

  Feminine:

  romni ('woman') → romnja ('women')

  bar ('fence') → bara ('fences')

  goi ('sausage') → goija ('sausages')
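The plural rules just described can be written as a short decision procedure. The following is only a sketch: the function name `pluralize` is hypothetical, a final vowel followed by i is assumed to spell a diphthong, and endings for which no rule is stated (for example feminines in -a) are rejected rather than guessed.

```python
VOWELS = "aeiou"


def pluralize(noun: str, gender: str) -> str:
    """Return the nominative plural of a noun; gender is "m" or "f".

    Covers only the endings described in the text. A final vowel + i is
    treated as a diphthong (an assumption about the spelling, e.g. "mui").
    """
    ends_in_diphthong = len(noun) >= 2 and noun[-1] == "i" and noun[-2] in VOWELS
    if ends_in_diphthong:
        return noun + "ja"            # mui -> muija, goi -> goija
    if noun[-1] not in VOWELS:
        return noun + "a"             # kher -> khera, bar -> bara
    if gender == "m" and noun.endswith("o"):
        return noun[:-1] + "e"        # čhavo -> čhave
    if gender == "f" and noun.endswith("i"):
        return noun[:-1] + "ja"       # romni -> romnja
    raise ValueError(f"no plural rule given for {noun!r} ({gender})")
```

The rare feminine consonant stems that take -ja instead of -a are not modelled; they would have to be listed as lexical exceptions.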

  case: nominative, oblique, vocative.

  These three cases are marked by suffixes attached directly to the nominal base. Gender and number are indicated at this level. The oblique case is used to mark an animate direct object. If the direct object is an inanimate noun, its form is identical to the nominative.

  Nouns belong to different declension classes according to gender and ending. The classes are masculine ending in a consonant, -o, -i, -os or -u, and feminine ending in a consonant, -i or -a.

  The vocative case is formed with suffixes attached directly to the noun root: -a (masc. sg.), -eja (masc. pl.), -(j)a/-(j)e (fem. sg.), -ale(n) (fem. pl.).

  Besides these three, there are other cases, which are marked by suffixes attached to the oblique stem: dative (-ke/-ge), locative (-te/-de), ablative (-tar/-dar), instrumental (-sa/-ha/-ca) and genitive (-k[er]/-g[er]). Postpositions and adverbs trigger these secondary markers.

  The dative case has a benefactive meaning; the locative indicates location in space or time; the ablative denotes origin; the instrumental has comitative and instrumental functions. The genitive marker attaches to the possessor noun and agrees in gender and number with the possessed noun, like an adjective:

  rakles-k-i daj
  boy’s mother
  rakles: oblique of 'boy'; k: genitive; i: feminine singular; daj: nominative of 'mother'

  rakles-k-o dad
  boy’s father
  k: genitive; o: masculine singular; dad: nominative of 'father'

  rakles-k-e phrala
  boy’s brothers
  k: genitive; e: plural; phrala: nominative of 'brothers'
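The three phrases above differ only in the agreement ending that follows the genitive marker. A minimal sketch of that pattern, assuming the oblique stem is supplied by the caller and modelling only the short -k- variant of the marker (the function name `genitive` is hypothetical):

```python
# Agreement endings, as for adjectives: -o (masc. sg.), -i (fem. sg.), -e (plural).
AGREEMENT = {
    ("m", "sg"): "o",
    ("f", "sg"): "i",
    ("m", "pl"): "e",
    ("f", "pl"): "e",
}


def genitive(oblique_stem: str, head_gender: str, head_number: str) -> str:
    """Build a segmented genitive form that agrees with the possessed noun.

    Only the -k- variant is modelled; the choice between -k-/-g- and the
    longer -ker-/-ger- forms is deliberately left out of this sketch.
    """
    ending = AGREEMENT[(head_gender, head_number)]
    return f"{oblique_stem}-k-{ending}"


# The possessed noun (the head) decides the ending, not the possessor.
phrase = genitive("rakles", "f", "sg") + " daj"
```

Here `phrase` is `rakles-k-i daj`; passing `("m", "sg")` or `("m", "pl")` yields the forms used with dad and phrala.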

  adjectives: generally agree with their nouns in gender and number. They end in -o when used with masculine singular nouns, in -i with feminine singular nouns, and in -e with plural ones, regardless of gender. Unstressed adjectives and those ending in a consonant are usually invariable.

  pronouns: personal, reflexive, possessive, demonstrative, interrogative, relative.

  Personal pronouns refer only to animate nouns. Unlike other Indo-Aryan languages, Romani has 3rd person pronouns distinct from the demonstrative ones. The 3rd person pronoun distinguishes gender, but only in the singular. There is also a 3rd person reflexive pronoun that lacks a nominative.

  The possessive pronouns are:


  Demonstrative pronouns show great dialectal variation. They exhibit a four-way contrast along two dimensions: proximal vs. remote and general vs. specific. In the Kalderaš dialect, for example:

  1. kadava (proximal, general): 'this'

  2. kakava (proximal, specific): 'this', among others

  3. kodova (remote, general): 'that'

  4. kukova (remote, specific): 'that', among others

  They inflect for gender, number, and case, but their case markers differ from those of other pronouns.

  Interrogative pronouns and adverbs are kon ('who?'), so, with variants ho/o ('what?'), sav-/hav- ('which?'), kaj ('where?'), kana ('when?'), sar ('how?').

  The interrogatives kon, so (ho), sav- (hav-) and kaj may be employed as relative pronouns. Kaj is used extensively, while so is restricted to inanimate objects and sav- to animate agents. Kon is used in one of its non-nominative forms: kas ('whom'), kaske ('to whom'), kasa ('with whom'), kasko ('whose').

  articles: exceptionally among Indo-Aryan languages, Romani has a definite article, which probably arose through Greek contact. It inflects for case, gender and number. The nominative masculine singular is o in all dialects; the nominative feminine singular, however, varies between i and e depending on the dialect. The plural le is the same for both genders.

  Romani uses the numeral 'one' (jekh) as an inflected indefinite article:

  dikhlas jekhe gazeš
  'He saw a man'
  jekhe: article in the oblique; gazeš: 'man' in the oblique

  Verbal

  Verbal morphology is mostly suffixal. There are two stems, present and perfective, and two sets of personal markers, one for the present stem and the other for the perfective stem. The present stem is identical to the verb root. The perfective stem adds a perfective marker (-d(j)-, -l-, -t-) to the root.

  The verb is formed by adding suffixes to the lexical root in sequential order. Up to six slots can be filled:

  Verbal slots                  ker-dj-ov-el-a ('It will be done')
  1. lexical root               ker ('do')
  2. loan adaptation
  3. perfective stem marker     -d-
  4. passive/causative, etc.    -jov- (passive)
  5. subject concord            -el- (3rd sg.)
  6. tense/modality             -a (future marker)
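The slot table lends itself to a simple record type in which unused slots stay empty. This is an illustrative sketch only; the class `VerbSlots`, its field names and the function `assemble` are assumptions made for the example, and no phonological adjustment between slots is attempted.

```python
from dataclasses import dataclass


@dataclass
class VerbSlots:
    """The six suffix slots in order; an empty string means the slot is unused."""
    root: str
    loan_adaptation: str = ""
    perfective: str = ""
    valency: str = ""    # passive, causative, etc.
    concord: str = ""    # subject person and number
    tense: str = ""      # tense/modality


def assemble(slots: VerbSlots) -> str:
    """Concatenate the filled slots in their fixed order, dropping hyphens."""
    parts = [
        slots.root,
        slots.loan_adaptation,
        slots.perfective,
        slots.valency,
        slots.concord,
        slots.tense,
    ]
    return "".join(part.strip("-") for part in parts)


# The example from the table: root + perfective + passive + 3rd sg. + future.
form = assemble(VerbSlots(root="ker", perfective="-d-", valency="-jov-",
                          concord="-el-", tense="-a"))
```

With the values from the table, `form` is `kerdjovela`, matching ker-dj-ov-el-a once the hyphens are removed.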

  Loan adaptations are morphological markers that are attached to European loan roots.

  The tense markers are -a for the future and -as (or -ahi in some dialects) for the imperfect and pluperfect. Particles and auxiliaries, external to the verb, may precede or follow it to express tense or modality:

  1. ka(m), ma(m), l-, jav-: future tense.

  2. s-: stative present.

  3. sin-, ther-: perfect tense.

  4. te, bi, li: conditional.

  person and number: 1s, 2s, 3s; 1p, 2p, 3p.

  aspect: imperfective (including habitual and continuous actions) and perfective (completed activities). The imperfective aspect is expressed in the present stem and derived tenses; it is not morphologically marked. The perfective aspect is expressed in the preterite and derived tenses; it is marked by -d(j)-, -l- or, more rarely, -t-.

  tense: present, future, imperfect, preterite, pluperfect.

  The present, future and imperfect derive from the present (imperfective) stem. The preterite and pluperfect derive from the perfective (preterite) stem.

  The present is formed by adding the personal endings directly to the present stem, which may end in a consonant or (less frequently) in a vowel. In some dialects there is a 'long' form of the present that includes the tense marker -a (kerava, keresa, etc.), which may also be used to express the future. There are several conjugational classes in the present, according to the vowel connecting the 3rd singular ending to the stem. The verb 'to be' is irregular.


  [Present-tense conjugation table; its forms are segmented into root, connecting vowel and personal markers.]

  The preterite (or aorist) is formed by adding a different set of personal endings to the perfective stem. The pluperfect adds the marker -as (or -ahi in some dialects) to the preterite; it also serves as a counterfactual. The imperfect is formed by adding the same marker to the present tense. The future is formed, in some dialects, by adding the tense marker -a to the present and, in others, by placing the particle ka(m) or, more rarely, ma(m) before the present tense forms.


  [Conjugation table; its forms are segmented into root, perfective marker, personal markers and tense markers.]
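Because the future, imperfect and pluperfect are all built on an already inflected form, their derivation can be sketched as plain suffixation. The functions below are hypothetical helpers; they take the inflected present or preterite as input (kerav is the short form behind the long form kerava cited earlier, and dikhlas is the preterite glossed 'he saw' in the article example) and use the -as variant of the marker rather than -ahi.

```python
FUTURE_MARKER = "a"
PAST_MARKER = "as"   # -ahi in some dialects; not modelled here


def future_long(present: str) -> str:
    """Future in dialects that add -a to the present ('long' form)."""
    return present + FUTURE_MARKER


def future_particle(present: str, particle: str = "ka") -> str:
    """Future in dialects that place a particle before the present."""
    return f"{particle} {present}"


def imperfect(present: str) -> str:
    """Imperfect: the marker -as added to the present."""
    return present + PAST_MARKER


def pluperfect(preterite: str) -> str:
    """Pluperfect: the same marker added to the preterite."""
    return preterite + PAST_MARKER
```

For example, `future_long("kerav")` returns `kerava`, while `future_particle("kerav")` returns `ka kerav`; which strategy applies depends on the dialect.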

  mood: indicative, imperative, subjunctive, conditional.

  The present subjunctive has no marker. In many dialects it is not distinguished morphologically from the indicative, but in dialects that have a present-future 'long form' ending in -a, the subjunctive differs from it.

  The imperative consists of the present stem in the singular and subjunctive forms in the plural.

  In most Balkan dialects, the future particle ka combines with the imperfect to form a conditional mood, which can also be expressed by the particles te, bi or li preceding the pluperfect forms.

  non-finite forms: infinitive, present participle, perfective participle.

  Romani lacks an inherited infinitive, though it has developed a 'new infinitive' which differs among dialects. It is generally formed with the verbal root plus the personal marker for the 3rd singular present (-el). Thus phen-el ('to say'). In many cases the new infinitive is introduced by the particle te: te phenel.

  The perfective (or past) participle consists of the perfective stem with adjectival inflection, e.g. ker-d-o (masc. sg.), ker-d-i (fem. sg.), ker-d-e (pl.). Perfective participles of transitive verbs agree with the patient (the recipient of the action); those of intransitive verbs agree with the subject.

  The present participle or gerund is used to show the simultaneity of two actions. It has two forms, one inflected (-nd-/-ind-) and the other non-inflected. The inflected form is non-perfective.
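The new infinitive and the perfective participle both follow regular templates, which the sketch below spells out. The function names are hypothetical, the perfective marker is passed in by the caller because the text does not say which verbs take -d-, -l- or -t-, and dialectal differences in the infinitive are ignored.

```python
# Adjectival endings used by the participle: -o, -i, -e.
ADJECTIVAL_ENDINGS = {
    ("m", "sg"): "o",
    ("f", "sg"): "i",
    ("m", "pl"): "e",
    ("f", "pl"): "e",
}


def new_infinitive(root: str, with_particle: bool = False) -> str:
    """Root plus the 3rd singular present marker -el, optionally after 'te'."""
    form = root + "el"
    return f"te {form}" if with_particle else form


def perfective_participle(root: str, marker: str, gender: str, number: str) -> str:
    """Perfective stem (root + marker) plus adjectival inflection."""
    return root + marker + ADJECTIVAL_ENDINGS[(gender, number)]
```

Thus `new_infinitive("phen", with_particle=True)` gives `te phenel`, and `perfective_participle("ker", "d", "f", "sg")` gives `kerdi`.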


The Subject-Object-Verb (SOV) word order of Indo-Aryan tends to shift to VO under Greek influence, with the subject preceding or, alternatively, following the verb (SVO or VSO). Accordingly, Romani uses prepositions instead of postpositions. Pronominal direct objects precede pronominal indirect ones. The copula usually appears in final position. Interrogative sentences do not have a distinctive word order. Relative clauses tend to follow the main ones.

In the noun phrase, prepositions occupy the first slot. Determiners are placed after the preposition, followed by quantifiers, adjectives and nouns; the article is placed before attributive adjectives:

  prep.    indef. art.    adj.     noun
  ande     jekh           cikno    kheroro
  in       a              small    house-diminutive
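The fixed order of slots in the noun phrase can be illustrated by arranging labelled words according to a rank list. This is a toy sketch: the slot labels and the function `order_noun_phrase` are assumptions for the example, and each slot is assumed to hold at most one word.

```python
from typing import Dict

# Slot order described above: preposition, determiner, quantifier, adjective, noun.
SLOT_ORDER = ["preposition", "determiner", "quantifier", "adjective", "noun"]


def order_noun_phrase(words: Dict[str, str]) -> str:
    """Arrange the supplied words by slot, skipping slots that are not filled."""
    return " ".join(words[slot] for slot in SLOT_ORDER if slot in words)


# The words can be given in any order; the slot labels decide their position.
phrase = order_noun_phrase({
    "noun": "kheroro",
    "adjective": "cikno",
    "determiner": "jekh",
    "preposition": "ande",
})
```

Here `phrase` comes out as `ande jekh cikno kheroro`, the example glossed above.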


The inherited lexicon is small compared to that of other Indo-Aryan languages. Apart from some early borrowings from Iranian and Armenian, most loanwords are from Greek.

Basic Vocabulary

Aspirated stops and affricates are rendered by digraphs (ph, th, kh, etc.), the affricates tʃ, dʒ are rendered č and dž, and the fricatives ʃ, ʒ are rendered š, ž.

one: jekh

two: duj

three: trin

four: štar

five: pandž

six: šov

seven: efta (borrowed from Greek)

eight: oxto (borrowed from Greek)

nine: enja (borrowed from Greek)

ten: deš

hundred: šel

father: dad

mother: daj

brother: phral

sister: phen

son: čhavo

daughter: čhaj

head: šero

eye: jakh

foot: pindro

heart: ilo

tongue: čhib

© 2013 Alejandro Gutman and Beatriz Avanzati

