
Classification: Altaic?, Turkic, Northeastern (Siberian branch), Northern group.

Yakut is a member of the Turkic family. The external classification of Turkic is disputed. Many consider it one of the three divisions of the Altaic phylum but for others its relationship with Tungusic and Mongolic is not proven. Yakut belongs to the northern group of the Siberian branch along with the very similar Dolgan. It is also closely related to neighboring south Siberian languages Tuvan and Khakas.

Overview. Yakut is the easternmost Turkic language and, with Dolgan, also the northernmost. The Yakuts migrated from their homeland in the area of Lake Baikal to the remotest regions of northeast Siberia, becoming geographically isolated from other Turkic peoples. Their language thus preserved a number of archaisms and, on the other hand, acquired some innovative features through prolonged contact with Mongolic and Tungusic. For example, Yakut retains the long vowels of Proto-Turkic but has changed some common Turkic consonants; it has lost some grammatical cases but developed new ones; and its core vocabulary is still mostly of Turkic origin, although it has adopted many foreign words.

Distribution and Speakers. Yakut is spoken in northeastern Siberia, mainly in the Sakha Autonomous Republic of the Russian Federation, an extremely large territory (more than 3 million square km) with a very low population density. The republic has one million inhabitants, of whom one-third are Yakuts. The total number of Yakut speakers is around 450,000.

Status. Yakut is widely used within the Sakha Autonomous Republic by native speakers and as a second language by speakers of the Tungusic languages Evenki and Even as well as by speakers of Yukaghir (an isolate). However, Russian is used in higher education.

Varieties. Differences between Yakut dialects are rather small, so they are mutually intelligible. There are central dialects, including Aldan and eastern and western Lena, as well as peripheral dialects, including a northeastern group influenced by Even and a northwestern group influenced by Evenki. Dolgan is closely related to Yakut and is considered by some scholars to be one of its dialects.

Oldest Documents

A Yakut list of words appeared in "Noord en Oost Tartarije", a book published in 1692 by the Dutch traveller Nicolaas Witsen. Another word-list was collected by von Strahlenberg who remained in Siberia for 13 years as a prisoner of war and who published it in 1730. The first Yakut grammar was published in 1851 by the scholar Otto N. Böhtlingk with the title "Über die Sprache der Jakuten" (About the language of the Yakuts).


Vowels (16): Yakut has preserved the long vowels of Proto-Turkic. Its vowel system is completely symmetrical regarding height (8 high and 8 low vowels), backness (8 front and 8 back), roundness (8 unrounded and 8 rounded) and length (8 short and 8 long):



  1. The symbols are those current in writing in the Cyrillic alphabet; those of the International Phonetic Alphabet are indicated between brackets.

The most usual transliteration of Yakut vowels into the Latin alphabet is as follows:

  1. [a], [e], [i], [o], [u] = a, e, i, o, u, respectively.

  2. [y], [œ], [ɯ] = ü, ö, ï, respectively.

  3. long vowels are indicated by a macron: ā, ē, ī, etc.
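The symmetry of the 16-vowel system can be checked mechanically. The following is a minimal sketch, assuming the Latin transliteration given above; the feature labels and the function names `vowel_inventory` and `count_by` are illustrative and not taken from any published description.

```python
import unicodedata
from itertools import product

# Latin transliteration of the 8 short vowels, keyed by
# (height, backness, rounding). The labels are illustrative.
SHORT_VOWELS = {
    ("high", "front", "unrounded"): "i",
    ("high", "front", "rounded"): "ü",
    ("high", "back", "unrounded"): "ï",
    ("high", "back", "rounded"): "u",
    ("low", "front", "unrounded"): "e",
    ("low", "front", "rounded"): "ö",
    ("low", "back", "unrounded"): "a",
    ("low", "back", "rounded"): "o",
}

MACRON = "\u0304"  # combining macron, used here to mark a long vowel


def vowel_inventory():
    """Return rows of (symbol, height, backness, rounding, length)."""
    inventory = []
    for features, length in product(SHORT_VOWELS, ("short", "long")):
        symbol = SHORT_VOWELS[features]
        if length == "long":
            symbol = unicodedata.normalize("NFC", symbol + MACRON)
        inventory.append((symbol, *features, length))
    return inventory


def count_by(inventory, position, value):
    """Count rows whose field at `position` equals `value`."""
    return sum(1 for row in inventory if row[position] == value)


if __name__ == "__main__":
    for row in vowel_inventory():
        print(row)
```

Each of the four features splits the inventory into two halves of eight, which is what the description of the system as completely symmetrical amounts to.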

Yakut has also 4 diphthongs: front unrounded [ie], front rounded [yœ], back unrounded [ɯa], and back rounded [uo].

Vowel harmony. Yakut, like all Turkic languages, has vowel harmony, which governs the distribution of vowels within a word, opposing front to back vowels and rounded to unrounded ones.

    Any vowel can occur in the first syllable of a word. If it is a front vowel, all the subsequent vowels must also be front. If it is a back vowel, all the other vowels must also be back. Thus, all the vowels of a word belong to the same class (back or front), and the vowels of suffixes vary according to the class of the vowels in the stem.

    If the first vowel of a word is rounded, the following high vowels must also be rounded. A following low vowel is rounded only after the low rounded vowels o and ö; after the high rounded vowels u and ü it stays unrounded (as in ülelē 'work').
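The front/back part of vowel harmony is easy to test on transliterated words. The sketch below is an illustration only: it works on the Latin transliteration used on this page, ignores rounding harmony, and the function names are hypothetical. Loanwords may well fail the check.

```python
import unicodedata

FRONT = set("ieüö")
BACK = set("ïauo")
MACRON = "\u0304"  # combining macron that marks long vowels


def strip_length(word):
    """Drop macrons so long vowels compare equal to their short counterparts."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return unicodedata.normalize("NFC", decomposed.replace(MACRON, ""))


def vowel_classes(word):
    """Return 'front' or 'back' for each vowel of the word, in order."""
    classes = []
    for ch in strip_length(word):
        if ch in FRONT:
            classes.append("front")
        elif ch in BACK:
            classes.append("back")
    return classes


def obeys_backness_harmony(word):
    """True if all vowels of the word belong to the same class."""
    return len(set(vowel_classes(word))) <= 1


if __name__ == "__main__":
    for w in ("keler", "ülelē", "körsüs", "xahan", "tuox"):
        print(w, obeys_backness_harmony(w))
```

The words in the usage loop are all taken from examples elsewhere on this page.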

Consonants (21): The [ʃ], [ʒ] and [z] fricative sounds of Proto-Turkic developed into [s]. The affricates [] and  [] are not present in native Turkic words.


  1. When the velar voiceless stop is accompanied by back vowels it is realized as [χ]; when it is accompanied by front vowels it is realized as [k]. When the velar voiced stop is accompanied by back vowels it is realized as [ʁ]; when it is accompanied by front vowels it is realized as [g].

Script and Orthography

In 1922 a Latin alphabet-derived script was created by S. A. Novgorodov. It was replaced by a new Latin alphabet in 1929 which, in turn, was replaced by a Cyrillic one in 1939. The Cyrillic alphabet has 40 letters (their equivalents in the International Phonetic Alphabet are given between brackets):



  1. The letters highlighted in color are used only in foreign words.


The transliteration into the Latin alphabet of native sounds used here is as in Stachowski & Menz (1998) except for the uvular fricatives [χ] and [ʁ] which we represent, respectively, as x and . We, like them, don’t differentiate between [l] and [ł] transliterating both as l. On the other hand, we distinguish between the two glides [j] [] transliterating them, respectively, as [y] and []. Long vowels are written with a macron (ā, ī, ū, etc).

Suffix variants. Due to sound harmony, all vowels and some consonants of suffixes change according to the preceding sounds. This is indicated with capital letters as follows:

  1. A indicates a low vowel that is realized as e [e] after a front unrounded vowel, as ö [œ] after the front rounded vowel ö, as a [a] after a back unrounded vowel or as o [o] after the back rounded vowel o. After the high rounded vowels ü and u it stays unrounded (e and a, respectively).

  2. I indicates a high vowel that is realized as i [i] after a front unrounded vowel, as ü [y] after a front rounded vowel, as ï [ɯ] after a back unrounded vowel or as u [u] after a back rounded vowel.

  3. L may be realized as l, t, d, or n. B may be realized as b, p or m.

  4. T may be realized as t, d, l or n. G may be realized as g, k, , x or ŋ.
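The vowel part of this notation can be sketched in code. The following is a minimal illustration, not a full morphological model. It assumes that A is rounded only after the low rounded vowels o and ö (consistent with ülelē, where e follows ü), that suffix consonants are supplied already resolved in lower case, and that long archiphonemes such as Ī are out of scope. The names `FEATURES`, `last_vowel` and `attach` are hypothetical.

```python
import unicodedata

MACRON = "\u0304"  # combining macron that marks long vowels

# (backness, rounded, low) for each short vowel of the transliteration
FEATURES = {
    "i": ("front", False, False), "ü": ("front", True, False),
    "e": ("front", False, True), "ö": ("front", True, True),
    "ï": ("back", False, False), "u": ("back", True, False),
    "a": ("back", False, True), "o": ("back", True, True),
}
HIGH = {("front", False): "i", ("front", True): "ü",
        ("back", False): "ï", ("back", True): "u"}
LOW = {("front", False): "e", ("front", True): "ö",
       ("back", False): "a", ("back", True): "o"}


def last_vowel(text):
    """Return the last vowel of `text`, ignoring vowel length."""
    plain = unicodedata.normalize("NFD", text).replace(MACRON, "")
    plain = unicodedata.normalize("NFC", plain)
    for ch in reversed(plain):
        if ch in FEATURES:
            return ch
    raise ValueError("no vowel found in %r" % text)


def attach(stem, suffix):
    """Attach a suffix whose vowels are written A (low) or I (high)."""
    word = stem
    for ch in suffix:
        if ch in "AI":
            backness, rounded, low = FEATURES[last_vowel(word)]
            if ch == "I":
                ch = HIGH[(backness, rounded)]
            else:
                # Assumption: A rounds only after the low rounded o / ö.
                ch = LOW[(backness, rounded and low)]
        word += ch
    return word


if __name__ == "__main__":
    print(attach("tik", "tAr"))   # causative of 'sew'
    print(attach("kör", "sIs"))   # reciprocal of 'see'
    print(attach("keler", "Im"))  # aorist participle + possessive 1sg
```

Each vowel is resolved against the word built so far, so a suffix with several vowels harmonizes left to right. The consonant alternations (L, B, T, G) are left out because their conditioning is not spelled out here.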


  1. Nominal

  1. gender: there is no grammatical gender.

  1. number: singular and plural.

  2. Plurality is indicated by -LAr.

  1. possession: is marked by suffixes, one for each person and number, which are attached to the noun.

  1. case: nominative, accusative, partitive, dative, ablative, instrumental, comitative, comparative.

  2. Yakut has lost some cases, such as the genitive and the locative, that are common to the majority of Turkic languages, but it has developed some new ones. The dative has taken over the function of the locative. The new cases are the partitive, the instrumental, the comitative and the comparative. Cases are marked by suffixes which are subject to sound harmony. In the plural, the suffix -LAr precedes the case marker.


  1. pronouns: personal, possessive, demonstrative, interrogative, indefinite, reflexive.

  2. Yakut uses pronouns more frequently than other Turkic languages. They are inflected in several cases. The personal pronouns are:


  1. There are two collective pronouns: bihikki (you/he + me) and ehikki (he/they + you), both declined as nouns.

  1. Possessive pronouns are formed with personal pronouns plus the possessive suffix -e: miene ('mine'), eiene ('yours' sg.), kiniene ('his/hers/its'), bihiene ('ours'), ehiene ('yours' pl.), kinienere ('theirs').

  1. Demonstrative pronouns distinguish three deictic degrees (this/that/that remote) and have emphatic forms (exactly this/exactly that):


  1. Interrogative pronouns are kim ('who?'), tuox ('what?'), xaya ('which?'). Other interrogative words are xaydax ('how?'), xanna ('where?'), xahan ('when?'), too ('why?'). For yes/no questions the particles duo or dū are used.

  1. Indefinite pronouns are formed with the particles ere or eme plus the interrogative pronoun, e.g. kim ere ('somebody'), kim eme ('somebody, anybody').

  1. Reflexive pronouns are formed with beye plus possessive suffixes e.g. beyem ('myself'), beyete ('himself'), etc.

  1. compounds: nominal compounds of the type noun + noun are frequent. In one variety, the first element indicates the material of the second noun. In another one, the second noun carries a possessive suffix to highlight the genitive relationship with the first one.

  1. Verbal. A finite verb form has a verbal stem + tense-aspect or mood marker suffixes + personal marker.

  1. person and number: 1s, 2s, 3s; 1p, 2p, 3p. They are indicated by pronominal or possessive suffixes.

  1. tense-aspect: present, simple past, perfect past, imperfect, pluperfect, prospective (future).

  2. Aspect and tense are intimately linked. They are marked by specific suffixes while person and number are indicated by pronominal markers or possessive suffixes. There are affirmative and negative conjugations.

  3. The endings added to the stems to form the affirmative forms of the main tenses are as follows (the first component is the aspect-tense marker, the second is the personal ending):


  1. The imperfect and pluperfect are compound tenses. The imperfect is built with the aorist participle + the copula in the past, or with the aorist participle + possessive suffixes. For example:

  2. keler (aorist participle of 'come') + ete (past copula) = keler ete ('He was coming')

  3. keler (aorist participle of 'come') + -im (possessive 1sg) = kelerim ('I was coming')

  1. The pluperfect is formed with -BIt  plus the past copula.

  1. mood: indicative, imperative, conditional, presumptive, necessitative.

  2. The presumptive expresses conjecture or guess, the necessitative need or obligation.

  1. The imperative endings are: -Īm (1s), zero (2s), -TIn (3s), -IAx (1p inclusive), -IAIŋ (1p exclusive), -Iŋ (2p), -TInnAr (3p).

  1. There are two conditionals marked with two different suffixes. The conditional I uses the irreal suffix -TAr, the conditional II uses the real suffix -TAX.

  1. The presumptive is formed with the suffix -ĪhI plus pronominal markers.

  1. The present necessitative (‘must, should’) is formed with the aorist participle plus the suffix -LĀx plus pronominal markers.

  1. voice: active, middle, passive, cooperative-reciprocal, causative.

  2. The passive voice is formed with the suffix -n added to vowel stems, and -(I)lIn added to consonant stems:

  1. ahā (eat)  →   ahan (with reduction of the second vowel)

  2. tik (sew)  →   tigilin

  1. The middle (reflexive) voice is formed with -(I)n and shortening of the last vowel of the stem.

  1. The cooperative-reciprocal uses the suffixes -(I)s/-sIs. For example:

  1. ülelē (work) → üleles (work together)

  2. kör (see) → körsüs (see each other).

  1. The causative uses one of three suffixes: -t (after vowels), -TAr (after consonants), -(I)Ar (after some consonant stems). For example:

  1. sanā (think) → sanat (make somebody think).

  1. tik (sew) → tikter (make somebody sew).
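For vowel stems, the voice derivations illustrated above share one pattern: the final long vowel is shortened and a single consonant is added. The sketch below covers only that pattern, using the three vowel-stem examples given in this section. Consonant stems are deliberately rejected because their suffix variants depend on alternations not modelled here. The function and table names are hypothetical.

```python
import unicodedata

MACRON = "\u0304"  # combining macron that marks long vowels
VOWELS = set("aeiïoöuü")

# Suffix consonant added to a vowel stem for each voice
VOWEL_STEM_SUFFIX = {
    "passive": "n",      # ahā -> ahan
    "cooperative": "s",  # ülelē -> üleles
    "causative": "t",    # sanā -> sanat
}


def shorten_final_vowel(stem):
    """Turn a final long vowel (written with a macron) into its short form."""
    decomposed = unicodedata.normalize("NFD", stem)
    if decomposed.endswith(MACRON):
        decomposed = decomposed[:-1]
    return unicodedata.normalize("NFC", decomposed)


def derive(stem, voice):
    """Derive a voice form from a vowel stem; reject consonant stems."""
    short = shorten_final_vowel(stem)
    if short[-1] not in VOWELS:
        raise ValueError("only vowel stems are handled in this sketch")
    return short + VOWEL_STEM_SUFFIX[voice]


if __name__ == "__main__":
    print(derive("ahā", "passive"))
    print(derive("ülelē", "cooperative"))
    print(derive("sanā", "causative"))
```

Decomposing to NFD first means the macron is always the last code point of a final long vowel, including vowels that also carry a diaeresis.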

  1. non-finite forms: participles (aorist, perfect, future, nondum facti), converbs.

  2. The aorist participle signals a habitual action; it is formed with -Ar after consonants and -Īr after vowels.

  1. The perfect participle signals a completed action that has an effect on the present; it is made by the addition of the suffix -BIt.

  1. The prospective or future participle ends in -IAx.

  1. The nondum facti participle signals an action that has not yet taken place and it is formed with the converb in -A plus ilik.

  1. There are several converbs with different meanings. The one in -An  means 'and', the one in -A functions as an adverb. The converb in -BAkkA means 'without doing', that in -ĀrI 'in order to', that in -Āt 'as soon as', etc.


    Yakut, like all Turkic languages, has a basic Subject-Object-Verb word order which, nevertheless, can be changed to highlight a certain topic. Subjects and their predicates agree in number; if the subject is a collective noun in the singular, the predicate is marked as plural.

It employs postpositions (corresponding to English prepositions, but placed after the words they govern) to specify and refine the syntactic relations established by the grammatical cases. Determiners and modifiers precede the head noun and do not agree with it in case and number. Indefinite direct objects are unmarked and definite direct objects are marked with the accusative.

    The subject of a relative clause is unmarked for case, and relative clauses based on participles precede the head noun. Comparative constructions use the suffix -TĀAr, and these constructions may be replaced by the ablative.


The basic lexicon of Yakut consists of native Turkic words but it has also many words of unknown origin probably due to early contacts with Paleoasiatic languages. Most borrowings come from Mongolic and Tungusic languages including many everyday terms. A much later influence is Russian.

Basic Vocabulary

Further Reading

  1. Über die Sprache der Jakuten. O. Böhtlingk. St. Petersburg (1851).

  2. 'Yakut'. M. Stachowski & A. Menz. In The Turkic Languages. L. Johanson & É.Á. Csató (eds), 417–433. Routledge (1998).

  3. Yakut Manual. J. R. Krueger. Uralic and Altaic Studies 21. Indiana University Publications (1962).

© 2013 Alejandro Gutman and Beatriz Avanzati
