One of the tangraphs I mentioned last night

4874 2lõ 'wide'

has its own alphacode yum in David Boxenhorn's system. Can it be broken down further?

Nishida (1966: 344) regarded the top half of yum as his radical 028 'metal' (= David's tex). However, the 'metal' radical tex is distinct from the top half of yum. Compare 5037 1biəʳ 'knife' (texgux < 'metal' + 'small') which immediately follows 4874 (yum) in Nishida's dictionary:

The third stroke of tex is フ whereas the third stroke of yum is 一.

The right side of

3537 2thwã 'the Chinese surname 段 Thwan' (cirzukgux)

resembles both 5037 and 4874. 3537 may have

2014 1thã 'beach, sands' (ciryumcok)

as an abbreviated phonetic. zuk is yum minus its bottom half (con). Should yum be analyzed as zuk + con?

Two Tangraphic Sea analyses imply that zuk could be broken down even further into bio (the so-called 'horned hat') + box ('wood'; the four strokes under the 'hat'):


4808 1khwẹ 'to expand, enlarge' (biopandex) =

4874 2lõ 'wide' (yum; semantic) +

5864 1zie 'extensive, wide, vast' (pandex; semantic)


4809 1bẽ 'wide, vast, flat' (biopanges) =

4874 2lõ 'wide' (yum; semantic) +

5888 1giew 'broad, wide, extensive' (panges; semantic)
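The two analyses above can be checked mechanically. The following is a minimal sketch, assuming that every alphacode is a run of three-letter radical codes (true of all the examples in this post, but an assumption about the system as a whole). The table `HYPOTHETICAL_PARTS` and the function names are mine, introduced only for illustration; the table encodes the tentative breakdowns yum = zuk + con and zuk = bio + box, not established facts.

```python
# Hypothetical sub-analyses discussed in this post (not part of the alphacode system).
HYPOTHETICAL_PARTS = {
    "yum": ["zuk", "con"],  # yum minus its bottom half (con) is zuk
    "zuk": ["bio", "box"],  # implied by the two Tangraphic Sea analyses
}


def split_alphacode(code):
    """Split an alphacode into three-letter radical codes (assumed unit length)."""
    if len(code) % 3 != 0:
        raise ValueError(f"alphacode length is not a multiple of 3: {code!r}")
    return [code[i:i + 3] for i in range(0, len(code), 3)]


def expand(radical):
    """Return the radical together with all of its hypothetical sub-parts."""
    parts = {radical}
    for sub in HYPOTHETICAL_PARTS.get(radical, []):
        parts |= expand(sub)
    return parts


def accounted_for(compound, sources):
    """True if every radical of the compound occurs in some source tangraph,
    either directly or as a hypothetical sub-part of one of its radicals."""
    available = set()
    for source in sources:
        for radical in split_alphacode(source):
            available |= expand(radical)
    return all(radical in available for radical in split_alphacode(compound))


# 4808 = 4874 (yum) + 5864 (pandex); 4809 = 4874 (yum) + 5888 (panges)
print(accounted_for("biopandex", ["yum", "pandex"]))
print(accounted_for("biopanges", ["yum", "panges"]))
```

Both checks succeed only because bio is reachable from yum through the hypothetical table. Without that table, bio would be unaccounted for, which is exactly why the Tangraphic Sea analyses imply that bio is part of yum.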

I don't think zuk originated as a compound of bio plus box. There is nothing 'wooden' about width. Instead, I regard zuk and bio as abbreviations of yum, which was based on the sinograph 寬 'wide':


The bio-like part of zuk corresponds to 宀, the box-like part to 艹, and con to 見 plus a dot.

yum is unabbreviated in three tangraphs:


2014 1thã 'beach, sands' (ciryumcok) =

2725 1ʔwɔ̣ 'round' (cirtuecin; cir = 'water')

4874 2lõ 'wide' (yum) +

2107 1tsəiʳ 'land' (giigircok)

5636 2lõ (second half of 1ŋɛɛ-2lõ 'live in peace') (yumcun; why is cun 'speech' on the right?)

5637 2lõ (second half of 1tã-2lõ 'upright and outspoken') (yumdim)

The analyses of the last two tangraphs are unknown. yum is obviously their phonetic.

PROPOSED TANGUT COGNATES IN ZHANG-ZHUNG

Recently I've been writing about the possibility of Khitan (a variety of Para-Mongolic) and Uyghur loanwords in Tangut. Neither Khitan nor Uyghur is related to Tangut, and I doubt they are related to each other. The Khitan numerals have not yet been fully reconstructed, but the presumed Proto-Mongolic ancestors of some of them are nothing like Uyghur (or Tangut) numerals.

On the other hand, the extinct Zhang-zhung language (ZZ) is related to Tangut, but the exact nature of its relationship is uncertain. Tangut is thought to belong to the Qiangic branch of Sino-Tibetan. In "Zhang-zhung and Qiangic languages", Guillaume Jacques evaluated proposed cognates shared by ZZ and Qiangic languages (such as Tangut). If such cognates are valid and not found in other Sino-Tibetan branches, they may be shared innovations that are evidence for grouping ZZ and Qiangic together:

| Innovation shared by Zhang-zhung and Qiangic | | Absence of innovation shared by Zhang-zhung and Qiangic |
|---|---|---|
| Zhang-zhung | Qiangic languages | Other Sino-Tibetan languages |
| | Tangut, Qiang, rGyalrong, etc. | Chinese, Tibetan, Burmese, etc. |

Guillaume only found four "problematic" nonmorphological innovations shared by ZZ and Qiangic. These resemblances could be coincidental and are insufficient to propose a close relationship between the two.

9.4.21:59: Thus the above table could be revised as

| Presence of Qiangic innovation | Absence of Qiangic innovation |
|---|---|
| Qiangic languages | Other Sino-Tibetan languages |
| Tangut, Qiang, rGyalrong, etc. | Chinese, Tibetan and Zhang-zhung, Burmese, etc. |

The most interesting of the four proposed shared innovations for me was ZZ lgyam 'wide' (?; gloss dubious*) which resembles these Tangut synonyms:


0034 2lõ 'wide' (fexcirqes; fex = 'top'?) < ? + 2029 2lõ 'country' (cirqes; phonetic)

4874 2lõ 'wide' (yum; see my next post for more on this radical)

My reconstructions are simply revisions of Gong's reconstructions in Li Fanwen (2008: 7, 771). Gong and Li Fanwen (1986: 446, 450) reconstructed them as homophones, and they are in the same homophone group in Precious Rhymes of the Tangraphic Sea (PRTS) without any dot separating them (see p. 4 of this PDF). However, they are in separate homophone groups in Homophones, and Nishida (1966: ) and Sofronov (1968 II: 283, 318) did not reconstruct them as homophones:

| LFW number | Nishida 1966 | Sofronov 1968 | Li Fanwen 1986 | Gong 2008 | This site |
|---|---|---|---|---|---|
| 0034 | lɔɦ | 2ldwon | 2lɑ̃ | 2low | 2lõ |
| 4874 | lõ | 2lwon | 2lɑ̃ | 2low | 2lõ |

Were these originally two unrelated words that merged into one in the PRTS dialect? Was a separator dot accidentally omitted from PRTS? Or did the two words share a common root? If Tangut had ld- (a possibility most recently argued by Tai 2008), could it partly originate from a prefix-root initial sequence *C-l-?

Pre-Tangut *lamH > 4874 2lõ

Pre-Tangut *C-lamH > 0034 2ldõ?

If the two Tangut words were not related, which one, if any, was ZZ lgyam related to?

Guillaume proposes three possible interpretations of ZZ lgy-:

1. "It might be an attempt at representing a lateral palatal *ʎ"

2. "it could be the result of a metathesis from a cluster such as *k-lj-"

3. "the -g- could be an epenthetic consonant, in the same way as -g- in Tb [Tibetan] words such as brgyad < *p-rjat (Li 1969)."

In Tangut, l- is normally followed by Grade III -ɨ- rather than Grade IV -i-**, implying that it was velarized *[ɫ]. Perhaps ZZ lgy- represented something like *ɫɣɨ- or even *ɫɣi- from an earlier Tangut-like *lɨ-. But note that the proposed Tangut cognate 2lõ lacks -ɨ-.

*9.4.21:56: Andrew West pointed out that ZZ lgyam may simply be "a borrowing or corruption of Tibetan rgyas". He led me to the entry for lgyam in Dan Martin's ZZ dictionary (2010: 68):

This is a rather dubious entry since it occurs in Zhu and in Mdzod, ch. 8, only as part of the technical term pra lgyam dub, which corresp. to Tib. phra rgyas dug.

Tibetan rgyas cannot be cognate to Tangut 2lõ since Tibetan rgy- does not correspond to Tangut l-.

**9.4.2:30: In rhyme groups that distinguish between Grades III and IV, l- is generally in Grade III:

| Rhyme group | II | IV | VI | VII |
|---|---|---|---|---|
| Grade III | lɨi (x 17) | lɨa (x 8), lɨaa (x 1) | lɨə (x 20) | lɨe (x 1) |
| Grade IV | (none) | lia (x 4) | (none) | lie (x 11) |

Its voiceless counterpart lh- has the opposite pattern without any exceptions:

| Rhyme group | II | IV | VI | VII |
|---|---|---|---|---|
| Grade III | (none) | (none) | (none) | (none) |
| Grade IV | lhi (x 11) | lhia (x 4) | lhiə (x 12) | lhie (x 4) |
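The counts in the two tables above can be tallied to make the asymmetry explicit. This is only a sketch: the figures are copied from the tables, while the dictionary layout and the function name `grade_totals` are my own illustrative choices.

```python
# Syllable counts per rhyme group, copied from the two tables above.
COUNTS = {
    "l": {
        "III": {"II": 17, "IV": 8 + 1, "VI": 20, "VII": 1},  # lɨa (x 8) + lɨaa (x 1)
        "IV": {"II": 0, "IV": 4, "VI": 0, "VII": 11},
    },
    "lh": {
        "III": {"II": 0, "IV": 0, "VI": 0, "VII": 0},
        "IV": {"II": 11, "IV": 4, "VI": 12, "VII": 4},
    },
}


def grade_totals(initial):
    """Sum the counts for each grade across all four rhyme groups."""
    return {grade: sum(by_group.values()) for grade, by_group in COUNTS[initial].items()}


for initial in ("l", "lh"):
    totals = grade_totals(initial)
    share = totals["III"] / (totals["III"] + totals["IV"])
    print(f"{initial}-: Grade III = {totals['III']}, Grade IV = {totals['IV']}, "
          f"Grade III share = {share:.0%}")
```

On these figures l- is in Grade III in 47 of 62 cases, whereas lh- is in Grade IV in all 31 cases.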

This suggests that l- and lh- did not only differ in voicing. l- may have been velarized whereas lh- may have been palatalized or palatal.

"THE STATE AND THE BUDDHIST SANGHA: XIXIA STATE (982-1227)"

I found this article by Kychanov when I Googled "uighur tangut". Here are all its references to the Uyghur with links added (emphasis mine):

THE Tangut State came into being in the late 10th century, surrounded by powerful Buddhist centers: the Chinese ones—Dunhuang, Wutaishan and Helanshan, the Tibetan ones—Amdo and Liangzhow, and the Uighur ones—Ganzhow, Shazhow and Turfan. [...] The independent Tangut state system, even in a multi-national context, required creation of the written language for its native tongue for the purposes of conducting business correspondence (1036); further, this language was used to translate the Buddhist canon into Tangut. In 1038 a special committee was formed whose task was to translate the sutras into the Tangut language. Yuanhao [his Mandarinized name], the ruler of the Great Xixia [Mandarin for 'Tangut'] State, took personal control over the committee’s activities, which, according to some Chinese sources, was consulted by Uighur monks who had some experience in translating Buddhist texts from Chinese into their own language.


The Tangut State was a multinational one, the main nationalities being the Mi—the Tanguts, the Han—the Chinese, and the Bodpa—the Tibetans and the Uighurs.


Notably, the sources that we are familiar with at present do not make any mention of mixed Tibetan-Tangut or Uighur communities, though we would be justified in expecting both to exist.

I have long been curious about the Uighur component of the Tangut state. I remain in the dark about it, though this article sheds much light on other matters: e.g.,

Thus, ideally the future superior Buddhist hierarch in Xixia was to be familiar with the holy texts in the Sanskrit, Tangut, Chinese and Tibetan languages.

All languages I've studied. Unfortunately, I'm not familiar with the Buddhist canon in any language, so I wouldn't qualify. Oh well ...

The [monastic] candidate had to be able “to voice the sounds of Sanskrit purely and clearly,” and “do the ritual bowing” while reading the holy texts.

I presume "the sounds of Sanskrit" were actually the Tangutized sounds of Sanskrit which don't perfectly match the actual sounds of Sanskrit in any Tangut reconstruction I've seen. Would Panini himself have pleased the Tangut judges?

DID TANGUT HAVE MONGOLIC NUMERALS?


After reading my posts on Khitan loanword(s?) in Tangut (part 1 / part 2), David Boxenhorn suggested that Odic Tangut or Tangut B (the hypothetical language underlying the more obscure structures of some tangraphs) could have been Para-Mongolic (Khitan? Xianbei? a third language?). I would not be surprised if some Tangut surnames were of PM origin, but am agnostic (not pessimistic!) about any further PM loanwords in Tangut.

The only convenient list of Odic Tangut words that I have on hand is Guillaume Jacques' 2006 list of Tangut month names. (The same list can be found in Nevsky 1960 II: 537.) The names for all twelve months end in the disyllabic Odic Tangut word


Some of the names are analyzable whereas others (in bold) are disyllabic numerals that are treated as synonyms for the regular monosyllabic numerals in dictionaries. None resemble Janhunen's (2003) reconstructions of Proto-Mongolic numerals (which should be similar to Para-Mongolic numerals) or Uyghur numerals (source; added 9.2.19:36; Uyghurs lived in the Tangut Empire and Uyghur is another untapped source of loans into Tangut):

| Number of month | Standard Tangut numeral | Odic Tangut name | Gloss of OT name | Proto-Mongolic | Uyghur |
|---|---|---|---|---|---|
| 1 | 1lew | 1kiew 1siw | new year (lit. 'year new') | *nike/n | bir |
| 2 | 1niəə | 2riəʳ 1lọ | PERF-two = 'paired'? | *jiri/n; *koxar ~ *koyar | ikki |
| 3 | 1sọ | 2lheʳ 2giu | three | *gurba/n | üch |
| 4 | 1lɨəəʳ | 1kwe 1ŋwəʳ | four | *dörbe/n | töt |
| 5 | 1ŋwə | 2tʃɨəʳ 2ləu | five | *tabu/n | besh |
| 6 | 1tʃhɨiw | 1ʒɨiw 1vəi | six | *jirguxa/n < *jir '2' + guxa/n '3' | alte |
| 7 | 1ʃɨa | 1ŋwəʳ 1kạ | seven | *doluxa/n | yette |
| 8 | 1ʔiaʳ | 1niəə 1lɨəəʳ | two-four = 2 x 4 | *na(y)ima/n | sekkiz |
| 9 | 1giəə | 1lɨəəʳ 1ŋwə | four-five = 4 + 5 | *yersü/n | toqquz |
| 10 | ɣạ | 1niəə 1ŋwə | two-five = 2 x 5 | *xarba/n | on |
| 11 | ɣạ 1lew | 1ŋwə 1tʃhɨiw | five-six = 5 + 6 | (none listed) | on bir |
| 12 | ɣạ 1niəə | 2diə 1kieʳw | PERF-cold = 'became cold'? | | on ikki |

The OT names for the second and twelfth months have different perfective prefixes. 2riəʳ has no known direction whereas 2diə indicates inward direction.

The OT numerals for 'three' through 'seven' don't look like numerals in any language I know of. Are they remnants of some language (Tangut B?) without any living relatives? Or did they originate as descriptions that were reinterpreted as numerals after their original meanings were forgotten?
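By contrast, the names for the eighth through eleventh months are transparent arithmetic on Standard Tangut numerals. The sketch below simply verifies the glosses in the table; the numeral values and the choice of operation for each month come from the table, while the data layout and the name `check_month` are mine.

```python
import operator

# Standard Tangut numerals used in the analyzable month names (from the table).
NUMERALS = {"1niəə": 2, "1lɨəəʳ": 4, "1ŋwə": 5, "1tʃhɨiw": 6}

# (month, first syllable, second syllable, operation) as glossed in the table.
ANALYSES = [
    (8, "1niəə", "1lɨəəʳ", operator.mul),   # two-four = 2 x 4
    (9, "1lɨəəʳ", "1ŋwə", operator.add),    # four-five = 4 + 5
    (10, "1niəə", "1ŋwə", operator.mul),    # two-five = 2 x 5
    (11, "1ŋwə", "1tʃhɨiw", operator.add),  # five-six = 5 + 6
]


def check_month(month, first, second, op):
    """True if combining the two numerals with op yields the month number."""
    return op(NUMERALS[first], NUMERALS[second]) == month


for month, first, second, op in ANALYSES:
    print(month, first, second, check_month(month, first, second, op))
```

Note that the operation is not marked in the names themselves: multiplication and addition are both expressed by plain juxtaposition, so the reading has to be inferred from the month number.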

9.2.9:48: It's curious that the OT words for 'four' and 'seven' have the same syllable 1ŋwəʳ written with completely different tangraphs:


second half of 'four' <> first half of 'seven'

1ŋwəʳ is attested as an independent word for 'seven' but not 'four'.

The only OT numeral that might be related to a Standard Tangut numeral is 1ʒɨiw 1vəi 'six'. 1ʒɨiw looks like a derivative of 1tʃhɨiw 'six' with a lenited initial due to a lost prefix:

1ʒɨiw < *1Cɯ-ʒɨiw < *1Cɯ-ɨiw < 1Cɯ-tʃhɨiw

But unlike 1tʃhɨiw 'six', 1ʒɨiw cannot stand by itself as an independent word. It must be followed by the syllable 1vəi whose origin is unknown.

TANGUT EVIDENCE FOR PALATALIZATION IN KHITAN (PART 2)

Guillaume Jacques proposed that the word represented by the phonetic for the first half of 1tʃhɨə 1tã 'Khitan'


4361 1tʃhɨə (first half of 'Khitan') (boxbeehel) =

4389 1tã (transcription tangraph) (boxyinbaxbelcin) +

1374 1tʃhɨə 'that' (beehel)

could also be a loan from Khitan. 1tʃhɨə 'that' and 1tʃhɨə 1tã 'Khitan' share the same sound correspondences:

| Gloss | Tangut | Khitan |
|---|---|---|
| 'Khitan' | 1tʃhɨə 1tã | qid.ún |
| 'that' | 1tʃhɨə | qi |

Do those Tangut forms reflect palatalization of a uvular turned velar followed by depalatalization of the following vowel in an unwritten Khitan dialect that the Tangut had contact with?

*tʃɨ < *tʃi < *tɕi < *ki < *qi
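Read from right to left, the chain above is an ordered series of sound changes. Here is a minimal sketch that replays it, assuming each change can be modeled as a plain string replacement applied in chronological order. That is a crude stand-in for real conditioned sound changes, and the rule list and the name `derive` are my own, not anything proposed by Guillaume.

```python
# Hypothesized changes in chronological order (oldest first), read off the chain above.
RULES = [
    ("q", "k"),      # uvular turns velar
    ("ki", "tɕi"),   # velar palatalizes before i
    ("tɕ", "tʃ"),    # alveolopalatal becomes palatoalveolar
    ("tʃi", "tʃɨ"),  # following vowel depalatalizes
]


def derive(form):
    """Apply each rule once, in order, and return every intermediate stage."""
    stages = [form]
    for old, new in RULES:
        form = form.replace(old, new)
        stages.append(form)
    return stages


print(" > ".join(derive("qi")))
```

The order matters: if depalatalization of the vowel applied before palatalization of the velar, *ki would never meet the condition for becoming *tɕi.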

Such a development has a partial parallel in Mandarin:

zhi [tʂr̩] < *tʂɨ < *tʂi < *tɕi < *ki

The aspiration in the Tangut forms may reflect allophonic aspiration of voiceless initial obstruents in Khitan (as in English):

[tʃh] = /tʃ/

Guillaume, however, pointed out that

[...] it is not likely that the Tanguts would have only borrowed a demonstrative without borrowing much other vocabulary.

I would add that it's easier to find coincidental lookalikes for short words. Tangut 1tʃhɨə also resembles Late Middle Chinese 此 *tshɨ < tsheʔ 'this' (not 'that'!) and premodern Korean 져 tsyə < 뎌 tyə 'that'.

It is possible, though, that Khitan loanwords in Tangut are waiting to be discovered. We should look for Tangut words without a Qiangic etymology and resembling Mongolic forms. For instance, Tangut njijr2 'face' [= 2nieʳ in my reconstruction] reminds of MM [Middle Mongolian] ni'ur "face" and might have been borrowed from Khitan.

But how can one account for the vowel correspondence e : u? Could the Khitan form have had a vowel like ö or ü that had palatalized to assimilate with the first vowel i? Hence 2nieʳ could be from a Khitan niör. There is no ö in my Tangut reconstruction or in any other reconstruction I have seen, and the substitution of e for foreign ö can also be found in Japanese: e.g., rentogen 'X-ray' < Röntgen.

For future work, an area where Khitan loanwords could be found are the lists of personal names, several hundreds of which are attested in the book Mixed Characters (Terent'ev-Katanskij & Sofronov 2002). The Tangut Imperial family originally had the family name 拓跋 Tuòbá (Middle Chinese takbɛt), a Xiānbēi (Para-Mongolic) name. Therefore for other Tangut names might also have a Para-Mongolic or even Khitan origin. For instance, one wonders whether the Tangut name

.jɨ2rjir2 [= 2jiə 2riʳ in my reconstruction]

(in Chinese known as 野利 yělì, among other things the clan name of

2jiə 2riʳ 1dʒwɨu 1lɨi [Yyrir Jwuli, in Chinese known as 野利仁榮 Yeli Renrong; Renrong is a translation of his personal name 'humane prosperous'.]

the inventor of the Tangut script) is not a transcription of the Khitan word <i.ri> "name, title" (K. [Kane] p. 108; this word has been loaned into Middle Korean, see Shimunek 2007: 75).

I don't have Shimunek (2007) and I can't find anything like 이리 iri 'name' in Yu (1964)'s Middle Korean dictionary. What was the MK word?

9.1.5:31: Grinstead (1972: 13) linked the name of the creator of the Tangut script with the Khitan 耶律 Yelü clan:

Wittfogel tells us that 'a member of the Yelü family, Yelü Tulübu, was on the committee for [Khitan] script reform'. (History of Chinese Society: Liao, p. 577, note 24.) This could indicate, through more than a century, from 920 A.D. [when the Khitan Large Script was created] to 1036 A.D. [when the Tangut script was introduced].

TANGUT EVIDENCE FOR PALATALIZATION IN KHITAN (PART 1)

I apologize to Guillaume Jacques for forgetting that he first proposed Khitan-internal palatalization in the source of Tangut

1tʃhɨə 1tã 'Khitan' (last seen in line 78 of the Golden Guide)

in his 2010 review of Daniel Kane's 2009 book on Khitan. This word corresponds to qid.ún in Daniel Kane's reconstruction. The acute accent indicates a Khitan small script character for ún

distinct from that for un:

It is not known whether ún and un were phonetically distinct.

I suspect qi was [qɨ] with a nonpalatal vowel matching the nonpalatal vowels of Tangut 1tʃhɨə and Late Middle Chinese *khɨt, the first half of 契丹 *khɨttan 'Khitan'.

The Korean names for the Khitan (거란 Kŏran < *kətan and Sino-Korean 글단 [契丹] kŭlttan < *kɨrʔtan) also have a nonpalatal first vowel.

Guillaume mentioned a Tibetan name for the Khitan with an unusual voiced initial and optional velar final: Ge-tan(g). He proposed "that this name was borrowed by way of another language (possibly Uighur) and that it is not a genuine loan from the Khitan language into Tibetan." (What was the Uighur name for the Khitan?)

The various foreign versions of 'Khitan' have high and/or mid first vowels. This may reflect Khitan-internal variation.

All foreign versions of the name have a low nonlabial second vowel whereas the name has -ún rather than -an in the Khitan small script. Does this discrepancy also reflect Khitan-internal variation?

Next: Is that from Khitan?

