I think it's awkward to mix IPA in English text and unreasonable to expect people to pronounce it, so I have been using the following romanization for Tangut names based on my reconstruction:


Initials are the same as in my reconstruction, with the following exceptions:

- The velar nasal is ng-.

- The alveopalatals are ch- chh- j- sh- zh-.

- The glottals are h- gh-; glottal stop is only written between syllables as an apostrophe: (C)V + V = (C)V'V.


Rhymes are highly simplified according to six principles:

1. Grades are ignored. Vowels are all 'unbent' as in most Tibetan transcriptions: u, i, a, y, e, o.

2. y represents the central vowel. Cf. the romanization of Russian ы as y. Arakawa reconstructed this vowel as capital I. Perhaps it was always high, though Tibetan transcriptions with nonhigh vowels (-a, -e, -o) lead me to think it was sometimes nonhigh.

3. Length is ignored. I'm not certain that the distinction which Gong (and in turn I) reconstruct as length actually involved length.

4. Nasality is indicated with a final -n. Cf. the use of -n to indicate nasality in French.

5. The lax-tense distinction is ignored. I originally wanted to indicate tenseness with -q like Arakawa, but that could be misinterpreted as a final uvular or glottal stop rather than an indication of vowel quality. I decided to follow the Tibetan transcriptions which ignore this distinction completely.

6. Retroflexion is indicated with a final -r as in some Tibetan transcriptions.
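The rules above can be sketched as a small Python function. This is only a rough illustration, not a complete converter: the function names are hypothetical, the IPA sources for most initials (tʃh, dʒ, ʃ, ʒ, ŋ, ɣ) are assumptions, and the 'unbending' table is a partial one inferred from the reconstruction/romanization pairs that appear in these posts (e.g., 2ʔiew = Ew, 1tʃɨẹ = Che, 2pəụ = pu). Symbols not covered by the tables pass through unchanged.

```python
import re
import unicodedata

# Initial substitutions. Only "tʃ" -> "ch" and the dropped initial glottal stop
# are attested in the posts' examples; the other IPA sources are assumptions.
INITIALS = [
    ("tʃh", "chh"),
    ("tʃ", "ch"),
    ("dʒ", "j"),
    ("ʃ", "sh"),
    ("ʒ", "zh"),
    ("ŋ", "ng"),
    ("ɣ", "gh"),
    ("ʔ", ""),  # glottal stop is only written between syllables
]

# Partial 'unbending' table (principle 1), inferred from attested pairs only.
UNBENT = [
    ("iew", "ew"),
    ("ie", "e"),
    ("ɨe", "e"),
    ("əu", "u"),
    ("əi", "i"),
    ("ɛ", "e"),
]

VOWELS = "uiayeo"
TILDE = "\u0303"      # combining tilde: nasality
DOT_BELOW = "\u0323"  # combining dot below: tenseness
RETROFLEX = "ʳ"


def romanize_syllable(ipa):
    """Simplify one reconstructed syllable such as '1tʃɨẹ'."""
    s = unicodedata.normalize("NFD", ipa).lstrip("12")  # drop the tone digit
    nasal = TILDE in s
    retroflex = RETROFLEX in s
    # Principle 5: tenseness is ignored; nasality and retroflexion are re-added below.
    s = "".join(ch for ch in s if ch not in (TILDE, DOT_BELOW, RETROFLEX))
    # Principle 3: length (written as a doubled vowel) is ignored.
    s = re.sub(r"([aeiouəɨɛɔ])\1", r"\1", s)
    for src, dst in INITIALS:
        if s.startswith(src):
            s = dst + s[len(src):]
            break
    for src, dst in UNBENT:
        s = s.replace(src, dst)
    # Principles 6 and 4: final -r for retroflexion, then final -n for nasality.
    return s + ("r" if retroflex else "") + ("n" if nasal else "")


def romanize_name(syllables):
    """Join syllables, writing an apostrophe for (C)V + V = (C)V'V."""
    out = ""
    for syl in map(romanize_syllable, syllables):
        if out and syl and out[-1] in VOWELS and syl[0] in VOWELS:
            out += "'"
        out += syl
    return out.capitalize()


print(romanize_name(["1tʃɨẹ", "2ʔõ"]))   # Che'on
print(romanize_name(["2ʔiew", "1diẹ"]))  # Ewde
print(romanize_syllable("2kõʳ"))         # korn
```

Normalizing to NFD first means the tense underdot and the nasal tilde can be detected whether the input uses precomposed or combining characters.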

Here is a list of rhymes in simplified romanization:

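Rhyme cycle | Gong's rhyme group | Rhyme numbers | Simplified romanization
I. Plain / II. Tense | I | Plain: 1-7; Tense: 61-62 |
I. Plain / II. Tense | II | Plain: 8-14; Tense: 68-70 |
I. Plain / II. Tense | III | Plain: 15-16 | -in
I. Plain / II. Tense | IV | Plain: 17-24; Tense: 66-67 |
I. Plain / II. Tense | V | Plain: 25-27 | -an
I. Plain / II. Tense | VI | Plain: 28-33; Tense: 71-72 |
I. Plain / II. Tense | VII | Plain: 34-40; Tense: 63-64 |
I. Plain / II. Tense | VIII | Plain: 41-43; Tense: 65, 76 (sic) |
I. Plain / II. Tense | IX | Plain: 44-46, 48, 49 | -ew
I. Plain / II. Tense | IX | Plain: 47 | -iw
I. Plain / II. Tense | X | Plain: 50-55; Tense: 73-75 |
I. Plain / II. Tense | XI | Plain: 56-60 | -on
I. Plain / II. Tense | XII | Plain: 104 | -un
I. Plain / II. Tense | XII | Plain: 105 | -wa
III. Retroflex | I | 80-81 | -ur
III. Retroflex | II | 82-84, 99, 101 (sic) | -ir
III. Retroflex | IV | 85-89 | -ar
III. Retroflex | VI | 90-92, 100 (sic) | -yr
III. Retroflex | VII | 77-79 | -er
III. Retroflex | IX | 93-94 | -ewr
III. Retroflex | X | 95-96, 102-103 (sic) | -or
III. Retroflex | XI | 97-98 | -orn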

I have followed Gong's classification of rhymes 76 and 99-105 even though they are not in the order that most of the other rhymes are in. Sofronov (1968 I: 138) regarded them as a fourth cycle distinct from his third cycle (77-98 rather than Gong's and my 77-103).
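As a rough illustration of this classification, the rhyme numbers in the table above can be transcribed into a lookup. The names `RHYME_TABLE` and `classify` are hypothetical; the ranges are copied from the table, including the out-of-order rhymes 76 and 99-105.

```python
# (cycle, Gong's rhyme group, inclusive rhyme-number ranges), copied from the table.
RHYME_TABLE = [
    ("plain", "I", [(1, 7)]),
    ("tense", "I", [(61, 62)]),
    ("plain", "II", [(8, 14)]),
    ("tense", "II", [(68, 70)]),
    ("plain", "III", [(15, 16)]),
    ("plain", "IV", [(17, 24)]),
    ("tense", "IV", [(66, 67)]),
    ("plain", "V", [(25, 27)]),
    ("plain", "VI", [(28, 33)]),
    ("tense", "VI", [(71, 72)]),
    ("plain", "VII", [(34, 40)]),
    ("tense", "VII", [(63, 64)]),
    ("plain", "VIII", [(41, 43)]),
    ("tense", "VIII", [(65, 65), (76, 76)]),
    ("plain", "IX", [(44, 49)]),
    ("plain", "X", [(50, 55)]),
    ("tense", "X", [(73, 75)]),
    ("plain", "XI", [(56, 60)]),
    ("plain", "XII", [(104, 105)]),
    ("retroflex", "I", [(80, 81)]),
    ("retroflex", "II", [(82, 84), (99, 99), (101, 101)]),
    ("retroflex", "IV", [(85, 89)]),
    ("retroflex", "VI", [(90, 92), (100, 100)]),
    ("retroflex", "VII", [(77, 79)]),
    ("retroflex", "IX", [(93, 94)]),
    ("retroflex", "X", [(95, 96), (102, 103)]),
    ("retroflex", "XI", [(97, 98)]),
]


def classify(rhyme):
    """Return (cycle, Gong's rhyme group) for a rhyme number from 1 to 105."""
    for cycle, group, ranges in RHYME_TABLE:
        if any(lo <= rhyme <= hi for lo, hi in ranges):
            return cycle, group
    raise ValueError(f"rhyme {rhyme} is not in the table")


print(classify(76))   # ('tense', 'VIII')
print(classify(99))   # ('retroflex', 'II')
print(classify(104))  # ('plain', 'XII')
```

Every rhyme from 1 to 105 falls into exactly one row, so the lookup also serves as a check that the table is complete.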

The reconstruction of 97-98 as nasalized retroflex is dubious since such vowels are extremely rare. The only language I know of with nasalized retroflex vowels is Kalasha: e.g., Kalasha kõʳ sounds like Tangut 0039 2kõʳ 'tooth' (simplified romanization: korn). Unlike Tangut, Kalasha has a full set of nasalized retroflex vowels: aʳ̃, iʳ̃, uʳ̃, eʳ̃, õʳ. Are rhymes 97-98 the last survivors of a larger set of earlier nasalized retroflex vowels in Tangut or should they be reconstructed differently: e.g.,

Source | Rhyme 97 | Rhyme 98
Sofronov 1963 | -ʷo | -i̭ʷo
Nishida 1964 | -ɔr | -?
Hashimoto 1965 | -awN | -jawN
Sofronov 1968 | -ụo | -i̭ụo
Huang Zhenhua 1983 | -ïor | -?
Li Fanwen 1986 | -ɔ̣ | -ǐɔ
Gong 1997 | -owr | -jowr
Arakawa 1999 | -o:r | -(y)wor
This site | -õʳ | -iõʳ
Tibetan transcriptions | Tai Chung Pui: r-...-u (7), r-...-o (4), r-...-a (1; sic), -u (2); Nishida: -ong |
Chinese transcriptions | Nonnasal: 謀, 箇, 左, 移作; Final nasal (nasality may have been lost by the Tangut period): o |

Neither rhyme was ever used to transcribe Sanskrit, so Sanskrit must not have anything that sounds like them.

The Tibetan transcription r-...-a was probably meant to be r-...-u or r-...-o, but its vowel symbol was left out or lost by accident. -a is the default vowel if there is no vowel symbol.

The complete absence of transcriptive evidence for 98 has led scholars to reconstruct it as a variant of an adjacent rhyme: e.g., I reconstruct 98 as the Grade IV counterpart of Grade I 97 -õʳ.

THE GOLDEN GUIDE: LINE 52: TANGRAPHS 256-260

52: Why are these three surnames way ahead of basic tangraphs like 2541 2dzwio 'person' which doesn't appear until line 119?

Tangraph number | 256 | 257 | 258 | 259 | 260
Li Fanwen number | 0997 | 2073 | 3366 | 3230 | 2867
My reconstructed pronunciation | 2ʔiew | 1diẹ | 1bɛ | 2lhiẹ | 2kɛ
Tangraph gloss | the surname Ew; 1st half of the surname Ewde | 2nd half of the surname Ewde | the surname Be | 1st half of Lhe-surnames | suffix for persons; 'noble'; 2nd half of the surname Lheke
Word | 256-257: the surname Ewde | 259-260: the surname Lheke
Translation | Ewde, Be, Lheke

256: Were the ancestors of the Ew(de) highly skilled at something? 0856 'basic' in the analysis of 0997 may refer to ancestors who are a basis for their descendants:


0997 2ʔiew 'Ew(de)' (duudim) =

0856 2məʳ 'basic' (duupux) +

3267 2tʃɨəəʳ 'skill, artistry' (dexhoodim)

257: 2073 consists of dex 'person' plus gao which never has anything to its right. gao appears in tangraphs with 'filling' semantics, but not all tangraphs with gao have something to do with 'filling' and vice versa.

258 is a straightforward semantic + phonetic combination:


3366 1bɛ 'Be' (dexhoe) =

2888 2mə 'surname' (dexpux; semantic) +

1121 2bəi 'the surname Bi' (hoe; phonetic)

259: 3230 consists of dex 'person' plus the right half yeo of the homophonous phonetic 2937 2lhiẹ 'country' (baeciryeo; cir = 'water'):


yeo by itself is in turn an independent tangraph

1034 1kwo 'the surname Kwo' (similar to the pronunciation of Chn 國 'country', translation equivalent of 2937 2lhiẹ, though it's not derived from 2937 in Tangraphic Sea)

260: 2867 consists of dex 'person' plus boxbildek, a unique combination not to be confused with

4481 1ʃɨə 'to go' (fisbildek)

Were the Lheke the 'Noble Lhe'? Or is Kychanov's (2006: 727) gloss 'noble' based on the assumption that 2867 was an adjective modifying 2628 'man' in

2628-2867 1gooʳ-2kɛ 'gentleman'

rather than a suffix according to Li Fanwen (2008: 470)?

THE GOLDEN GUIDE: LINE 51: TANGRAPHS 251-255

51: This line consists of a list of Tangut surnames. I have romanized the surnames in a very simplified form of my lay notation without -q for tense vowels. I'll explain this notation in detail later.

Tangraph number | 251 | 252 | 253 | 254 | 255
Li Fanwen number | 1518 | 3437 | 5654 | 3878 | 1633
My reconstructed pronunciation | 1tʃɨẹ | 2ʔõ | 2dẹ | 2mie | 2pəụ
Tangraph gloss | 1st half of the surname Che'on | the surname On; 2nd half of the surname Che'on | the surname De | the surname Me; 1st half of the surname Mepu | old; aged; 2nd half of the surname Mepu
Word | 251-252: the surname Che'on | 254-255: the surname Mepu
Translation | Che'on, De, Mepu

251: The analysis of 1518 makes no obvious phonetic or semantic sense:


1518 1tʃɨẹ 'the first half of the surname Che'on' (fambalciebil) =

3198 1ləụ 'the surname Lu' (dexweu) +

5031 1 'By, name of the ancestor of the black-headed Tangut' (biobalbaeduucin) +

2879 2dzo 'the surname Dzo' (dexboxbilciebil)

Were the Che'on related to the Lu and the Dzo and descended from By?

The analyses of the other four tangraphs in this line are unknown.

252: 3437 consists of geo 'sage' plus por, a radical of unknown function.

253: 5654 consists of pik 'hand', a vertical line (bae), and her 'slant'.

254: 3878 (vixful) is the only tangraph with the radical vix. The right radical (ful) is phonetic and probably an abbreviation of

2mie 'to not exist' (gerful)

from the previous line.

255: 1633 represents a Tangut word 'old' as well as the second half of a surname. Could 3878-1633 'Mepu' originate from a phrase 'Old Me' (as opposed to the surname Me written with 3878 by itself)?

Why do Tangut surnames generally require their own unique tangraphs? Why didn't the Tangut simply recycle tangraphs for homophonous syllables whenever possible? There are twelve other tangraphs pronounced 2mie. Why not use one of them for the first half of the surname Mepu? Why create a tangraph that is not only unique to that name but also contains a unique radical vix?

THE GOLDEN GUIDE: LINE 50: TANGRAPHS 246-250

This line ends the first fourth of the Golden Guide. The following lines are quite different.

Incredibly, the extremely basic tangraph 1918 1mi 'not' doesn't appear until this line. The related verb 1mie 'to not exist' also doesn't turn up until now.

50: Topic (Number (246) + Noun (247)) + Comment (Subject: (248 + 249) + Verb (250))

Tangraph number | 246 | 247 | 248 | 249 | 250
Li Fanwen number | 0966 | 5243 | 1918 | 4279 | 2194
My reconstructed pronunciation | 2khiə | 2sie | 1mi | 1vəə | 1mie
Tangraph gloss | ten thousand | people | not | to bend over; to subdue | to not exist
Translation | There were none among the myriad masses who did not submit.

A more literal translation would be 'Ten thousand people - the non-submissive were nonexistent.'

246: 0966 has a nonsensical analysis. How is 'tooth' relevant? And 3135 is obviously derived from 0966, not the other way around:


0966 2khiə 'ten thousand' (bixbaebumcin) =

0974 1phə 'tooth' (bixbaehelcin) +

3135 2vɨã 'transcription tangraph' (dexbixbaebumcin)

3135 sounds like Chinese 萬 'ten thousand'.

The real analysis of 0966 may be bixbae + bumcin, but what would be the sources?

247: 5225 may be phonetic in 5243, though the match isn't very good. 0359 'thousand' is presumably supposed to imply lots of people.


5243 2sie 'people' (guofos) =

5225 2ʃɛw 'color' (guofox; loan from Middle Chinese 色 *ʂək before Tangut *-k > -w) +

0359 1təụ 'thousand' (bulfos)

248: 1918 is so much more complex than its Chinese equivalent 不 'not'. I cannot believe that 'hinder' and 'stupid' predated 'not':


1918 1mi 'not' (ciadiocok) =

3497 2ləụ 'to hinder; obstacle' (ciapax) +

5208 1gɛɛ 'stupid' (pikdiocok)

249: Is submission a way of becoming familiar (getting close to, advancing toward) the powerful?


4279 1vəə 'to bend over' (boxdexfua; loan from Chinese 服?) =

4398 1tiẽ 'to advance; powerful' (boxfoltun) +

2183 1ʃɔ̃ 'familiar' (dexfua)

250: The common verb 2194 'to not exist' does not exist in the Tangraphic Sea, but is in Precious Rhymes of the Tangraphic Sea:


2194 1mie 'to not exist' (gerful) =

4932 2riʳ 'single, lonely' (biogerful; semantic) +

0746 2giõ 'stuporous, comatose' (fulduucin; ?)

This analysis is improbable because 0746 has no apparent relevance and 2194 must predate 4932 'lonely' (< 'not have anyone'?). Surprisingly, the analysis of 4932 doesn't contain 2194:


4932 2riʳ 'single, lonely' (biogerful) =

5050 2riʳ 'wealth, money' (biogeodexceu with geo instead of ger!; phonetic) +

0469 2miọ 'lonely, wifeless' (jeiful; semantic)

There is, however, another tangraph for 'lonely' that does have 2194 in its analysis:


4924 2lɨaa 'lonely; solitary' (biodexful) =

4951 2lɨaa 'frontier; border' (biodexgik; phonetic) +

2194 1mie 'to not exist' (gerful; semantic)

THE GOLDEN GUIDE: LINE 49: TANGRAPHS 241-245

49: Object (241 + 242) + Adverb (243 + 244) + Verb (245)

Tangraph number | 241 | 242 | 243 | 244 | 245
Li Fanwen number | 0773 | 2430 | 2748 | 1737 | 3844
My reconstructed pronunciation | 2biuu | 2bio | 2tʃɨa | 1ka | 1dʒɨẹ
Tangraph gloss | reward; award | to punish; to penalize | morals; virtue | even; equal | to go; to send; to move; to strive; to use
Translation | Rewards and punishments are carried out in a virtuous and fair fashion.

241: 0773 consists of bos 'sound' plus jae ('language' with 艹 on top) which occurs nowhere else. What does 'sound' have to do with 'award'?

242: 2430 might have this analysis:


2430 2bio 'to punish' (jinbuldex) =

3551 2niõ 'evil' (jinquu) +

5334 1dzwə 'to arrest, catch' (pikbuldex; 'hand' on left)

243: I have no idea what the analysis of 2748 is. I wonder if it's related to 2089, second syllable of the reduplicative verb 2lɨə-2lɨi 'to dote on':

The only differences between 2748 and 2089 are the length of 二 and the height of ヒ:


244: 1737 is a phonetic-semantic compound:


1737 1ka 'equal' (feahal) =

0822 1ka 'collapse' (feabie) +

1576 2kạ < *S-ka-H 'equality' (feahalfir)

1737 and 1576 are cognates and 1576 must be based on 1737, rather than the other way around. The analysis of 1576 is unsurprising, though I don't understand the function of 2396:


1576 2kạ 'equality' (feahalfir) =

1737 1ka 'equal' (feahal) +

2396 2dzəəu 'to sit' (dexdexfir; structure is analogous to Chn 坐 - two 'people' atop 'earth')

245: In Chinese, 行 'to go' can also mean 'carry out', and I assume 3844 'to go' also has the same range of meaning.

I also assume 3844 predates 5067 which has a less basic meaning:


3844 1dʒɨẹ 'to go' (doejam) =

3989 1ʔie 'upward motion/optative prefix' (doehal) +

5067 1vɔ 'to send' (hanjam)

7.7.0:05: I wonder if 5067 is cognate to 5071 1vɨi 'to dispatch' and even 5113 1vɨi 'to do, make', all written with doe 'to make' (Nishida 1966: 242) on the left:


5071 is the source of the left side of 5067:


5067 1vɔ 'to send' (hanjam) =

5071 1vɨi 'to dispatch' (handao) +

3844 1dʒɨẹ 'to go' (doejam)

THE GOLDEN GUIDE: LINES 47-48: TANGRAPHS 231-240

47: A list of verbs: (231-232), (233-234), (235)

Tangraph number | 231 | 232 | 233 | 234 | 235
Li Fanwen number | 3692 | 0342 | 4059 | 0269 | 1736
My reconstructed pronunciation | 1bə | 1dziə | 1tʃɨẽ | 1khiəə | 1tʃɨu
Tangraph gloss | to cast, to throw | to lose | to examine | to gather | to do, to work, to manage
Word | 231-232: to throw away | 233-234: to investigate
Translation | Discarding, investigating, managing,

231: Li Fanwen (2008: 597) translated 3692-0342 as 丢棄 'to discard' but Nie and Shi translated it as 取舍 'accept or reject' (lit. 'take-discard'). My translation is more literal.

3692 has a strange analysis consisting of two nearly homophonous phonetics and no semantic element like 'hand':


3692 1bə 'to throw' (dextiebilbil) =

3304 2bə 'a place name' (dexfamceudex) +

4908 1bə 'rite; protocol' (tiebilbilcin)

232: How does 'not placing half' suggest 'to lose'?


0342 1dziə 'to lose' (herfomful) =

5449 1tị 'to place' (pikhercin) +

0074 1khwə 'half' (baaden = fom) +

2194 1mie 'not' (gerful)

233: How does 'abandoning basics' suggest 'to examine'?


4059 1tʃɨẽ 'to examine' (geipikfal) =

4018 2tʃhɨi 'basic' (geidao) +

5671 1thiə 'to abandon' (pikfal)

Is gei derived from Chn 本 'basic'?

234: I suspect 4236 was derived from 0269 rather than the other way around:


0269 1khiəə 'to gather' (belgoscok) =

0823 1khiəə 'shivery' (belgosfeo; phonetic) +

4236 2ʔa 'to gather, to concentrate' (boxbelgoscok; semantic)

235: See line 46.

48: Objects ((236) + (237)) + Object ((238) + (239)) + Verb (240)

236 and 237 seem synonymous with 238 and 239. The latter two may have been added to fill up the line.

Tangraph number 236 237 238 239 240
Li Fanwen number 0993 3668 5578 5158 3268
My reconstructed pronunciation 1lhew 1lị 1nwiə 1siọ 1tʃhwɨẽ
Tangraph gloss to herd, to graze; a surname to plant; to cultivate; agriculture to herd agriculture; farming to prohibit, to restrain, to forbid; to warn; to administer
Translation In charge of herding and farming.

236: Could 0993 be a distortion of Chn 牧 'to herd'? I think it's rather unlikely that 0993 is derived from 1021 and that the left vertical line of 5578 is in 0993.


0993 1lhew 'to herd' (bombaecie) =

1021 1lhew 'miscellaneous' (bombaegax; phonetic) +

5578 1nwiə 'to herd' (voxpok; semantic) +

1514 1lɨəə 'to suppress' (hilbalcieces; semantic?)

1021 is, of course, derived from 0993:


1021 1lhew 'miscellaneous' (bombaegax) =

0993 1lhew 'to herd' (bombaecie) +

5851 2nieʳ 'all kinds of' (taxgak)

237: 3668 'agriculture' looks like 'earth' plus 'hand':


3668 1lị 'agriculture' (gespik) =

2627 2lɨə̣ 'land' (gesgir; semantic) +

5159 1kwie 'convulsion' (pikfil; why?)

238: The analysis of 5578 is unknown, but it may be derived from 4708 1ʃɛ 'to guide' plus 3719 2lhew 'to herd' (obviously cognate to 0993 1lhew 'to herd'):


If so, where do the remaining parts of 5578 (the bottom left and the upper right ユ) come from?

239: 5158 consists of 'hand' plus fur which is always on the right:


5158 1siọ 'agriculture' (pikfur) =

5802 2dəəi 'to plow' (pikdui; semantic) +

0252 1siọ 'long and thin' (tunfur; phonetic)

240: I follow Nie and Shi's translation of 3268 as 管 'to be in charge':


3268 1tʃhwɨẽ 'to be in charge' (dexhilcieces) =

5341 2riʳ 'to prohibit' (pikfoldex; semantic) +

1514 1lɨəə 'to suppress' (hilbalcieces; semantic)

THE GOLDEN GUIDE: LINES 45-46: TANGRAPHS 221-230

45: Subject (221-222) + Location (223-224) + Verb (225)

Tangraph number | 221 | 222 | 223 | 224 | 225
Li Fanwen number | 5212 | 0760 | 3370 | 1408 | 3708
My reconstructed pronunciation | 1pi | 2dziẽ | 2gəu | 1lhiooʳ | 1phia
Tangraph gloss | to discuss | to judge; to decide | generally; together; common | place; site; market | to break off, to sever, to cut off, to stop, to curb; to judge?
Word | 221-222: prime minister | 223-224: law court
Translation | The prime minister judges at court.

221: Why do the tangraphs for 'discuss' have pik 'hand' in them?


5212 1pi 'to discuss' (pikbui) =

4464 1lɨə 'to discuss' (puxpik) +

1014 1ŋwəəu 'speech' (buicun; 'language' + 'language')

I don't know the semantic difference, if any, between the two language radicals (bui and cun).

222: 0760 is a phonetic-semantic compound:


0760 2dziẽ 'to judge' (besfalcin) =

1630 2dziẽ 'to carve' (bespik; phonetic) +

2708 1riaʳ 'to manage' (ciafalcin; semantic; falcin is not an independent tangraph)

223: 3370-1408 'law court' < 'common place' is a calque of Chinese 公堂. The analysis of 3370 is unknown.

224: The function of the left two-thirds of 1263 in 1408 is unknown:


1408 1lhiooʳ 'place' (puebeejox) =

1535 1lo 'to gather' (pueqaldex; phonetic; last seen in line 43) +

1263 1khee 'recreation; game' (beejoxbee)

225: Neither Kychanov (2006: 468) nor Li Fanwen (2008: 599) defined 3708 as 'to judge', but I follow Nie and Shi in translating it as 裁 'to judge'. I suspect that the Tangut used 'to cut' for 'to judge', just as the Chinese used 裁 'to cut' for 'to judge'. Also cf. Chn 斷 'to cut off' > 'to decide'.

What is the left side of 3351 doing in 3708?


3708 1phia 'to cut off' (dexcoipik) =

3351 1ɣiw 'to call' (dexfamcurgii) +

4459 2bia 'to break' (coipik)

3708 is a transitive/causative verb derived from intransitive/noncausative 4459 via a voiceless prefix *K-:

*k-b- > *xb- > *bh- > ph-

Perhaps this prefix was actually *S-, but I have already derived tense vowels rather than aspirates from *S-.

The radical coi in 3708 and 4459 resembles Chinese 扌'hand' but unlike 扌, it generally appears in the middle of characters. 4459 is the only instance of coi on the left. The function of coi, if any, is unknown.

46: I can't make much sense of this. Is it a series of afterthoughts describing the prime minister in the previous line?

Verb (226-227), Adjective (228), Object (229) + Verb (230)

Tangraph number | 226 | 227 | 228 | 229 | 230
Li Fanwen number | 2305 | 0474 | 1474 | 1183 | 5043
My reconstructed pronunciation | 1vɔ | 2go | 2ʔɨəʳ | 2dạ | 2dzwiə
Tangraph gloss | to clear away | to get rid of | diligent | matter; affair; thing | to resolve
Word | 226-227: to clear away
Translation | Clearing away, diligent, resolving matters.

226: 2305-0474 is a synonym compound.

Why does 2305 contain vil 'wind'? Because 'clear away' is like 'blow away'?


2305 1vɔ 'to clear away' (vilfun) =

2302 1lɨə 'wind' (vildexcok) +

1836 1tʃiẽ 'correct' (funham)

227: 0474 is the mirror image of its synonym 2305:


0474 2go 'to get rid of' (funvil) =

1836 1tʃiẽ 'correct' (funham) +

2302 1lɨə 'wind' (vildexcok)

I was expecting a circular derivation of 0474 from 2305 and/or vice versa and was surprised.
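Circular derivations of this kind can be found mechanically if the analyses are treated as a graph. Here is a minimal sketch, assuming each analysis is stored as a list of source tangraphs keyed by Li Fanwen number; the names `ANALYSES` and `mutual_derivations` are hypothetical, and the data are limited to a few analyses quoted in these posts.

```python
# Tangraph -> source tangraphs named in its analysis (Li Fanwen numbers).
ANALYSES = {
    "0993": ["1021", "5578", "1514"],
    "1021": ["0993", "5851"],
    "1737": ["0822", "1576"],
    "1576": ["1737", "2396"],
    "3844": ["3989", "5067"],
    "5067": ["5071", "3844"],
    "2305": ["2302", "1836"],
    "0474": ["1836", "2302"],
}


def mutual_derivations(analyses):
    """Return sorted pairs of tangraphs that each appear in the other's analysis."""
    pairs = set()
    for whole, parts in analyses.items():
        for part in parts:
            if whole in analyses.get(part, []):
                pairs.add(tuple(sorted((whole, part))))
    return sorted(pairs)


print(mutual_derivations(ANALYSES))
# [('0993', '1021'), ('1576', '1737'), ('3844', '5067')]
```

2305 and 0474 do not show up in the output because both are analyzed in terms of 2302 and 1836 rather than each other.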

228: What does 3355 'wrestling' have to do with 1474 'diligent'? Li Fanwen (2008: 246) translated 3355 as 力 'strength', the semantic element in 勤 'diligent', but his entry for 3355 has the translation 相撲 'wrestling' (the same characters as 'sumo'!).


1474 2ʔɨəʳ 'diligent' (folfeijiu) =

3355 2lə 'wrestling' (dexfolwex) +

1571 2ʔɨəʳ 'platform' (feijiu; phonetic)

Nishida (1966: 243) translated the radical fol as 齊しい 'equal', but that meaning seems irrelevant here. fol is vaguely similar to the top of 齊 plus the H-like shape at the bottom.

229: I don't know what the analysis of 1183 'matter' is, but it appears in the analysis for 1736 'to do' (not the default verb for 'to do'):


1736 1tʃɨu 'to do' (heryee) =

1183 2dạ 'matter' (herwir) +

5593 1bɔɔ 'to look at' (yeecor; why is this relevant?)

1183 and 1736 are translation equivalents of Chinese 事 as a noun and verb.

Does the lost analysis of 1183 include 1736?

230: Envoys (0457) are sent to speak (4902) and resolve (5043) matters (1183).


5043 2dzwiə 'to resolve' (bioboajal) =

4902 1ŋwəəu 'to speak' (biodexbelcin) +

0457 1ʃɨə 'envoy' (boajal; loan from Chinese 使)

