Since R47 is the palatal counterpart of R46, it's not surprising that it occurs almost entirely before acute initials. The following table lists all initials preceding R46 organized by Homophones chapter number. Initials that also precede R47 are in bold. Initials unique to R47 are in bold red.

p k
kh tɕh
g dz
s ɕ x
ɣ z ʑ
j l

(My consonants are identical to Gong's with one exception: I have reinterpreted his ʔj as j.)

Neither R46 nor R47 follows labiodentals (II), dentals (III), or 'retroflexes' (IV). The absence of dentals suggests that they might have palatalized before the high vowels of those rhymes. External comparison may confirm a dental stop origin for the palatal affricate in 'six':

Tangut tɕhiw R47 1.46 : Proto-rGyalrong *trɔk : Written Tibetan drug

I reconstruct 'six' in pre-Tangut as *K-trik. Note that rGyalrong varieties (Japhug, Somang, Zbu) also have kV-prefixes for 'six' but this may be coincidental. (A correlation between Tangut aspiration from *K- and kV-prefixation in rGyalrong has yet to be found.) The pre-Tangut vowel *i doesn't match the other languages* but there is no Tangut-internal evidence for a rounded vowel. The final *-k became *-w (Gong 1995) unless it is actually Tangut -iw that corresponds to the rounded vowel in the other languages.

Initials from the grave-only chapters (I and V) occur exclusively before R46. There are only two examples of labials before R46. Did all other *Pɨw-like syllables merge with *P(j)u-syllables in R2 and R3? Did labial-initial R46 syllables have some pre-Tangut segment that prevented them from shifting to R2 and R3?

Initials in the remaining chapters (VI-IX) may be followed by both R46 and R47.

Alveolar sibilants s (VI) and z (IX) are unique to Grade IV R47. Similarly, *s and *z are in Grade IV but not Grade III in the Middle Chinese rhyme table Yunjing.

However, the alveolar affricate dz (VI) is in both Grade III R46 and Grade IV R47, unlike MC *dz which is only in Grade IV in Yunjing. Such odd behavior is not entirely surprising for Tangut dz which must have had some peculiar characteristic justifying the placement of all dz-syllables into the Mixed Categories volume of Tangraphic Sea. Could Grade III dz be from an acute-grave cluster like *sg, unlike Grade IV dz which was originally acute?

(But it is difficult to believe that all ten instances of dzɨw R46 could have had *sg- in pre-Tangut. If *sgɨw occurred ten times, I would expect *skɨw to be even more common, and such an *skɨw would develop into tsɨw R46 which is unattested. Moreover, *s-clusters condition tense vowels, but dzɨw R46 < *sgɨw has a lax vowel. It's possible that one layer of *s-prefixation conditioned tense vowels and an earlier layer didn't, though the existence of a nontensing layer would need to be confirmed by word families with alternations such as dz- [< *s-g-] ~ g-.)

All palatals except for tɕh (VII) and ʑ (IX) occur only in nonpalatal Grade III R46, though I would expect them to be strongly associated with palatal Grade IV R47. Perhaps Tangut palatals were postalveolars. The Middle Chinese counterparts of tɕh and ʑ were absent from Grade IV in Yunjing. At least one instance of ʑ may be from an affricate that lenited after a prefix:

TT2880 SIXTH ʑiw R47 1.46 < *Cɯ-tɕiwk < *Cɯ-truk

cf. TT3448 SIX tɕhiw R47 1.46 < *k-tɕiwk < *k-truk

(SIXTH presumably contains an ordinal prefix, but not all ordinals are cardinals with lenited initials.)

I didn't expect the grave consonant ɣ (VIII) to be followed by Grade IV R47 as well as Grade III R46. I would have expected it to palatalize to j, which oddly never occurs before R47. (Or did *-iw dissimilate to -ɨw before j [and other palatals]?)

In Middle Chinese, ɣ occurs in Grade IV but not in Grade III. In this case, there is no reason to expect parallelism since MC ɣ-initial Grade IV syllables were once anti-palatal / pro-grave 'emphatic' syllables, whereas Tangut Grade IV syllables were presumably always pro-palatal and anti-grave (unless my derivation of 'six' is correct).

Perhaps the two instances of ɣiw R47

TT0265 MASTICATE ɣiw R47 1.46

TT3874 CALL ɣiw R47 1.46

had developed a secondary grave initial from a grave prefix plus an acute root initial (e.g., *k-j-; cf. Vietnamese gi- [17th c. [ɟ]?] < *kj-).

A more likely possibility is that the unexpected ɣ- in those two words reflects a foreign origin.

CALL could be a borrowing of northwestern Tang Dynasty Chinese 叫 *kjew plus a prefix conditioning lenition:

*Cɯ-kiw > ɣiw

but the vowels (Tangut i, Chn e) do not match. It's tempting to make this etymology work by changing R47 to -iew, but the Tibetan transcriptions of R47 end in -i(H), not -e. However, if Tangut lacked -iew, -iw would be the closest native equivalent to a foreign *-jew.

A foreign origin is less probable for MASTICATE. The closest Chinese word I can find is 喫 'eat' (Middle Chinese *khek), which was transcribed in Tibetan as khyig (implying *khjik) in the pre-Tangut period. There would be no phonetic problems if a Tangut prefix were added to account for lenition:

*Cɯ-khik > ɣiw

but the semantic match is not perfect.
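The four derivations proposed above (SIX, SIXTH, CALL, MASTICATE) can be replayed as a small set of ordered rewrite rules. This is only a sketch: the function name `derive`, the rule ordering, and the `LENITION` table are my own illustrative assumptions, and the rules cover nothing beyond those four forms.

```python
# Hypothetical rule set covering only the four derivations in the text.
LENITION = {"kh": "ɣ", "tɕ": "ʑ", "k": "ɣ"}  # root initials after a *Cɯ- prefix


def derive(pre_tangut: str) -> list[str]:
    """Return the stages from a hyphenated pre-Tangut form to the Tangut syllable."""
    stages = [pre_tangut]

    # Stage 1: *tr > *tɕ, and *-uk > *-iwk (the vowel change assumed for 'six').
    s = pre_tangut.replace("tr", "tɕ").replace("uk", "iwk")
    if s != stages[-1]:
        stages.append(s)

    prefix, root = s.split("-", 1)

    # Stage 2: prefix effects. *k- aspirates the affricate; *Cɯ- lenites the initial.
    if prefix == "k" and root.startswith("tɕ"):
        root = "tɕh" + root[2:]
    elif prefix == "Cɯ":
        for src in ("kh", "tɕ", "k"):  # test "kh" before "k"
            if root.startswith(src):
                root = LENITION[src] + root[len(src):]
                break

    # Stage 3: final *-k is lost after -iw, or becomes -w after -i.
    if root.endswith("iwk"):
        root = root[:-1]
    elif root.endswith("ik"):
        root = root[:-2] + "iw"

    stages.append(root)
    return stages


for form in ("k-truk", "Cɯ-truk", "Cɯ-kiw", "Cɯ-khik"):
    print(" > ".join(derive(form)))
```

The point of the sketch is only that one lenition rule after *Cɯ- is enough to produce both ʑ- in SIXTH and ɣ- in CALL and MASTICATE; it does not decide whether those etymologies are right.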

The following table compares the distribution of Tangut initials preceding R47 with the distribution of their Middle Chinese counterparts in Grades III and IV in Yunjing. Differences between Tangut and Middle Chinese are in bold red.


                                     s    z    dz   tɕh  ʑ    ɣ
less palatal  Tangut Grade III R46   no   no   yes  yes  yes  yes
              MC Grade III           no   no   no   yes  yes  no
more palatal  Tangut Grade IV R47    yes  yes  yes  yes  yes  yes
              MC Grade IV            yes  yes  yes  no   no   yes

(Note that I interpret Yunjing extremely literally: e.g., I regard 悉 MC *sit > Tangut period NW Chn ?*si [a transcription of siw R47 1.46 'new'] as Grade IV because it appears in the Grade IV row. 悉 is normally regarded as Grade III in spite of its placement in Yunjing.)
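The comparison between the two languages can also be stated as sets, which makes the mismatches easy to list. This is a minimal sketch based on the distributions described in the prose above; the names `TANGUT`, `MC`, and `mismatches` are hypothetical, and the presence of MC tɕh and ʑ in Grade III is my inference from the statement that they are absent from Grade IV.

```python
INITIALS = ["s", "z", "dz", "tɕh", "ʑ", "ɣ"]

# Initials attested before each Tangut rhyme, as described in the text.
TANGUT = {
    "III": {"dz", "tɕh", "ʑ", "ɣ"},            # R46
    "IV": {"s", "z", "dz", "tɕh", "ʑ", "ɣ"},   # R47
}

# Middle Chinese counterparts by Yunjing row (Grade III tɕh/ʑ is an inference).
MC = {
    "III": {"tɕh", "ʑ"},
    "IV": {"s", "z", "dz", "ɣ"},
}


def mismatches(grade: str) -> list[str]:
    """Initials whose presence in this grade differs between Tangut and MC."""
    return [c for c in INITIALS if (c in TANGUT[grade]) != (c in MC[grade])]


for grade in ("III", "IV"):
    print(grade, mismatches(grade))
```

Under these assumptions the mismatches are dz and ɣ in Grade III, and tɕh and ʑ in Grade IV, which are the four cases discussed individually above.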

If the grades had similar qualities in both languages, it would not be unreasonable to expect similar distributions of initials. But that assumes the initials in the two languages are similar, and that the equations between Tangut rhymes, Tangut grades, and Chinese grades are correct. One could also doubt the existence of grades in Tangut, but I think it is difficult to question the evidence for grades in Gong's "A Hypothesis of Three Grades and Vowel Length Distinction in Tangut". The exact nature of the grades remains open to question. Both Gong and Arakawa regard both R46 and R47 as Grade III, whereas I think it's possible to posit a fourth Tangut grade roughly equivalent to the fourth Chinese grade. I am not certain that my approach is correct and I would like to test it in future posts.

*The *i of pre-Tangut *K-trik does match the *i of northwestern Tang Dynasty Chinese *liwk (the source of Kan-on riku), but its *iw is from an earlier *u. Perhaps the Tangut word for 'six' had undergone a parallel but independent evolution:

tɕhiw < *k-tɕiwk < *k-truk

A RESPECT-FUL VOWEL

While looking for all instances of subscript H + vowel in Sofronov (1968), I found one very strange Tibetan transcription which could correspond to two tangraphs:

RESPECTFUL and the name of a tree (WOOD [semantic] atop RESPECTFUL [phonetic])

Rhyme: R46 1.45
Transcription: གཽ་ gau

Reconstructions:
Sofronov & Kychanov: -jы̂
Nishida: Gjəw
Hashimoto: -jäw
Sofronov: ŋêɯ
Huang: -iau
Li Fanwen: gjəu
Gong: gjiw
Arakawa: -eeu
This site: gɨw

(G is Nishida's symbol for an unknown velar initial. Hyphens indicate that a scholar's reconstruction of the initial is unknown.)

This is the only instance of -au that I have ever seen. I have never found any instances of -ཻ -ai. I assume that -au and -ai are letters devised for Sanskrit transliteration since they do not appear in native Written Tibetan words. They look like stacked o and e. I doubt that -au in gau is an error for o, since a scribe would have to exert extra effort to write o twice (= au) instead of once.

All other Tibetan transcriptions of R46 lack a labial vowel: -i, -iH, -iŋ (with a nasal!). -i and -au cover the three points of the a-i-u vowel triangle.

The Chinese rhymes corresponding to R46 in transcriptions can be grouped into 4 classes on the basis of reconstructed northwestern Tang Dynasty readings (ignoring tones):

1. *-ɨw and *-iw (the majority), *-iwk > Tangut period ?*-iw

2. *-it > *-ir > Tangut period ?*-i

3. *-jew

4. *-əw

Most readings have *ɨ/*i/*j followed by *w. In Tibetan transcriptions of NW Tang Chinese, -Hu and -Hi were used to represent final *-w and *-j, but neither letter sequence appears in Tibetan transcriptions of Tangut. Does this mean that Tangut had no final glides (as in Sofronov and Kychanov's reconstruction), or that -Hu and -Hi were no longer sufficiently glidelike in local Tibetan pronunciation? A remote possibility is that both Tangut and Tangut period NW Chinese had no final glides, though I know of no modern colloquial (i.e., substratal) readings that have -V instead of -VG.

My reconstruction -ɨw has a vowel that is close to all the vowels in the Tibetan and Chinese transcriptions:

front central
high i ɨ
mid e ə

Moreover, the nonpalatality of ɨ is confirmed by the fact that many transcriptive sinographs belong to Grade III in the rhyme tables rather than Grade IV. Man'yougana and Sinoxenic evidence indicate that Grade IV was more palatal than Grade III whose rhymes may have been partially or wholly nonpalatal. And if my variation on Gong's Tangut grade hypothesis is correct, R46 -ɨw is exactly where I'd expect a Grade III (high nonpalatal vowel) rhyme to be - between Grade II R45 -ɛw (low vowel) and Grade IV R47 -iw (high palatal vowel). R47 occurs only after acute consonants with only two exceptions that I'll examine next time.

LONG VOWELS IN TIBETAN TRANSCRIPTIONS OF TANGUT?

Here are all instances of subscript H + vowel that I found in Sofronov (1968):


Rhyme  Tone/rhyme  Gong    This site    Transcription with subscript H  Other transcriptions
1      2.1         ?lu     ?(pə-)ləu    bluH                            lu, lwu
17     1.17        pa      pa           paH                             pa
17     1.17        swa     swa          swaH                            swa
20     2.17        ?sja    ?sia         saH
21     1.21        kjaa    kɨaa         kaH                             ka
21     1.21        khjaa   khɨaa        khaH
37     1.36        sjij    sie          gseH                            se
40     1.39        njiij   (kɯ-)niee    ñeH                             ne, gneH, gñeH
48     2.41        ?seew   ?(kə-)seew   seH                             gseH
85     1.80        raʳ     raʳ          raH                             ra

(Sofronov's list of transcriptions includes fanqie and Homophones chapter numbers for the transcribed tangraphs but not the tangraphs themselves. I have supplied Gong's reconstructions on the basis of the fanqie. Most rising tone tangraphs have no known fanqie, so I am not certain if I have supplied the correct reconstructions. My guesses are preceded by question marks.)

Although subscript H represents vowel length in the Tibetan transcription of Sanskrit, it doesn't correlate with Gong's reconstructed vowel length in Tangut (which I have tentatively carried over into my revision of his reconstruction). Both short and long-vowelled syllables were transcribed with subscript H. Moreover, if subscript H was intended to represent vowel length, it should have appeared more than ten times and with more than three vowels (u, a, e). I have no idea what it's supposed to stand for in Tangut.
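The two counts behind that argument (how many subscript-H syllables are long in Gong's system, and how many different vowels carry subscript H) can be checked mechanically against the ten rows listed above. This is a rough sketch; treating a doubled vowel letter in Gong's notation as "long" is my simplifying assumption, and the names `ROWS` and `is_long` are hypothetical.

```python
import re

# (Gong's reconstruction, Tibetan transcription with subscript H), one pair per row.
ROWS = [
    ("?lu", "bluH"), ("pa", "paH"), ("swa", "swaH"), ("?sja", "saH"),
    ("kjaa", "kaH"), ("khjaa", "khaH"), ("sjij", "gseH"), ("njiij", "ñeH"),
    ("?seew", "seH"), ("raʳ", "raH"),
]

# Assumption: a doubled vowel letter (aa, ii, ee ...) marks a long vowel.
DOUBLED_VOWEL = re.compile(r"([aeiou])\1")


def is_long(gong_form: str) -> bool:
    return bool(DOUBLED_VOWEL.search(gong_form))


long_rows = [tib for gong, tib in ROWS if is_long(gong)]
short_rows = [tib for gong, tib in ROWS if not is_long(gong)]

# The vowel letter is the one immediately before the final H.
vowels_with_h = {tib[-2] for _, tib in ROWS}

print("long: ", long_rows)
print("short:", short_rows)
print("vowels with subscript H:", sorted(vowels_with_h))
```

Both groups are non-empty, so subscript H cannot simply be a length mark in Gong's sense, and only u, a, and e ever carry it.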

Is the combination of e and subscript H ever used in Tibetan? Unicode 5.1 does not have a codepoint for that combination. I've only seen it in Tangut transcription. I believe Sanskrit e is always transliterated in Tibetan as e without subscript H even though it is long [ee].
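The encoding question can be probed with Python's `unicodedata` module. This sketch assumes that subscript H is encoded as U+0F71 (the same sign as in ཱ -aH) and that e is U+0F7A; it relies on whatever Unicode version ships with the Python in use, which will be newer than 5.1. The helper name `precomposed_with` is hypothetical.

```python
import unicodedata

SUBSCRIPT_H = "\u0f71"  # assumed encoding of subscript H
VOWEL_E = "\u0f7a"      # assumed encoding of the e vowel sign


def precomposed_with(mark: str) -> dict[int, list[str]]:
    """Tibetan-block characters whose decomposition directly contains `mark`."""
    hexcode = f"{ord(mark):04X}"
    found = {}
    for cp in range(0x0F00, 0x1000):
        parts = unicodedata.decomposition(chr(cp)).split()
        if hexcode in parts:
            found[cp] = parts
    return found


print(unicodedata.name(SUBSCRIPT_H), "+", unicodedata.name(VOWEL_E))
for cp, parts in precomposed_with(SUBSCRIPT_H).items():
    print(f"U+{cp:04X}", unicodedata.name(chr(cp)), "=", " ".join(parts))
```

If the assumptions hold, the listed precomposed characters combine U+0F71 with other vowel signs but never with e, so e plus subscript H can only be written as a sequence of two separate signs.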

The absence of subscript H in the tense vowel rhyme cycle and its near-total absence in the retroflex vowel cycle may be significant, or it may merely be an artifact of the lower frequency of tense and retroflex vowels relative to lax vowels.

There is a bit of external evidence for the pronunciation of the Tangut syllable transcribed as paH. One of the two tangraphs associated with the syllable pa R17 1.17


represents a borrowing of 波 Middle Chinese *pa 'wave' with a level tone. (波 'wave' and 寶 MC *pawʔ > Tangut period ?*po were in turn used to transcribe Tangut pa R17 1.17.)

I suspect that MC level tone was associated with long vowels, so 波 'wave' might have been [paa]. Its long vowel [aa] might correspond to the -aH of the Tibetan transcription paH for pa R17 1.17.

Jñaanagupta, a speaker of late 6th century northwestern Chinese, used 波 'wave' to transcribe Sanskrit paa and vaa. (But earlier speakers of NW Chinese used 'wave' to transcribe both long and short-vowelled syllables. See Coblin [1994: 128].)

In the immediate pre-Tangut period, 'wave' was transcribed as pa in Tibetan. I have never seen any Tibetan transcriptions of Chinese with subscript H.


I don't know, but I can guess.

I've never found any explanation in Nishida's 1964 book. But in the introduction to his reconstruction of the Tangut rhyme system (p. 41), he wrote that

... it is utterly inconceivable that the 97 level tone rhymes and 86 rising tone rhymes of the Tangut language represented simple vowels.

Thus he proposed three oppositions:

1. the presence or absence of medial -j-

2. the presence or absence of medial -w-

3. a four-way opposition of zero coda, -ɦ, nasalization, or retroflexion

(Oddly, he did not mention tension at this point, and I wonder if he wrote this section before he reconstructed R61-R79 with tense vowels on pp. 58-62.)

Nishida did not cite any specific evidence for his unusual coda -ɦ.

I have always assumed that he was influenced by the letter འ (transliterated on this site as H) which frequently appears at the end of Tibetan transcriptions of Tangut syllables. In Written Tibetan, final -H is an orthographic device representing the absence of a coda: e.g., dgaH [dga]. Its function in Tangut transcription is unknown. It does not correspond to

- either the level or rising tone

- any of the cycles of Tangut rhymes (plain, tense, retroflex)

- any consonant in another Sino-Tibetan language, as far as I know

(I doubt that Tangut is the only ST language that preserved a coda lost in all of its relatives.)

- Gong's proposed vowel length distinction*: e.g.,

Tibetan transcription: Only with -H
Gong's short vowel rhymes: R52, R53, R58
Gong's long vowel rhymes: R33, R59

Tibetan transcription: With or without -H
Gong's short vowel rhymes: R1, R3, R4, R9, R10, R11, R17, R19, R20, R28, R30, R31, R34, R36, R37, R43, R44, R46, R47, R50, R51, R56, R57, R61, R62, R64, R66, R67, R70, R71, R72, R79, R80, R82, R84, R85, R92, R93, R95
Gong's long vowel rhymes: R5, R7, R13, R21, R38, R40, R100

Tibetan transcription: Only without -H
Gong's short vowel rhymes: R2, R25, R29, R35, R41, R42, R46, R69, R75, R77, R81, R83, R87, R90, R96, R97
Gong's long vowel rhymes: R12, R14, R23, R24, R101

- or even to Nishida's -ɦ:

Tibetan transcription: Only with -H
Nishida's -ɦ-less rhymes: R33, R52, R59
Nishida's -ɦ rhymes: R53

Tibetan transcription: With or without -H
Nishida's -ɦ-less rhymes: R1, R5, R9, R10, R11, R19, R31, R34, R38, R43, R44, R46, R47, R51, R56, R57, R58, R61, R62, R64, R66, R67, R70, R71, R72, R79, R80, R82, R84, R85, R92, R93, R95, R100
Nishida's -ɦ rhymes: R3, R4, R7, R13, R17, R20, R21, R28, R30, R36, R37, R40, R50

Tibetan transcription: Only without -H
Nishida's -ɦ-less rhymes: R2, R23, R24, R25, R29, R35, R41, R42, R69, R75, R77, R81, R83, R87, R90, R96, R97, R101
Nishida's -ɦ rhymes: R12, R14
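The lack of correlation with Gong's vowel length can be made concrete by computing, for each transcription class, the share of rhymes that Gong reconstructs as long. This is a minimal sketch using the rhyme lists exactly as given above; the names `GONG`, `rhymes`, and `long_share` are hypothetical.

```python
def rhymes(text: str) -> list[str]:
    """Split a space-separated list of rhyme labels."""
    return text.split()


# Gong's short and long vowel rhymes by Tibetan transcription class.
GONG = {
    "only with -H": {
        "short": rhymes("R52 R53 R58"),
        "long": rhymes("R33 R59"),
    },
    "with or without -H": {
        "short": rhymes(
            "R1 R3 R4 R9 R10 R11 R17 R19 R20 R28 R30 R31 R34 R36 R37 R43 R44 "
            "R46 R47 R50 R51 R56 R57 R61 R62 R64 R66 R67 R70 R71 R72 R79 R80 "
            "R82 R84 R85 R92 R93 R95"
        ),
        "long": rhymes("R5 R7 R13 R21 R38 R40 R100"),
    },
    "only without -H": {
        "short": rhymes(
            "R2 R25 R29 R35 R41 R42 R46 R69 R75 R77 R81 R83 R87 R90 R96 R97"
        ),
        "long": rhymes("R12 R14 R23 R24 R101"),
    },
}


def long_share(transcription_class: str) -> float:
    """Fraction of rhymes in this class that Gong reconstructs with a long vowel."""
    cell = GONG[transcription_class]
    return len(cell["long"]) / (len(cell["short"]) + len(cell["long"]))


for name in GONG:
    print(f"{name}: {long_share(name):.2f}")
```

If -H marked length, the "only with -H" class should be all long and the "only without -H" class all short. Instead every class mixes both, so the share never reaches 0 or 1.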

5.7.1:23: Added tables of correspondences between -H in Tibetan transcriptions, Gong's vowel length, and Nishida's -ɦ. These tables do not include all 105 Tangut rhymes since some rhymes (e.g., R6) have no known Tibetan transcriptions.

*5.7.1:10: In Written Tibetan, subscript -H represents Sanskrit long vowels: e.g., ཱ -aH = [aa]. However, subscript -H is rare in Tangut transcriptions. I'll gather all the examples I can find in Sofronov (1968) next time.

SHOULD NISHIDA'S -ɦ BE REINTERPRETED AS BREATHINESS?

A comment from David Boxenhorn made me wonder if Nishida's -ɔɦ in 'bodhi' corresponded to the breathy vowel in 菩 Late Middle Chinese *pɦo ?[pɦo̤]. (See Pulleyblank 1970-71, 1978, 1984 for the arguments in favor of reconstructing ɦ in LMC.) Perhaps LMC was like Lachi which developed breathiness after proto-voiced initials (Ostapirat 2000: 84-85): e.g.,

'shoulder': Lachi pɦu ?[pɦṳ] < Proto-Kra *m-ba (cf. Proto-Tai *ʔbaa)

However, the breathy vowel in the second syllable of 'bodhi', 提 LMC *tɦjej ?[tɦje̤j], corresponds to Nishida's R43 -jẽ with nasalization, not his R40 -jeɦ.

If there is a stratum of Chinese loans in Tangut with unaspirated voiceless initials followed by an -ɦ in Nishida's reconstruction, it has yet to be found. Gong ("西夏語中的漢語借詞") discovered that Chinese *voiced initials were borrowed either as voiced or as aspirated initials in Tangut depending on stratum with the exceptions of *z-, *ʑ-, and *ɣ-:

*b- in earlier stratum of Chinese loans > Tangut b-

*ph- or *pɦ- (< *b) in later stratum of Chinese loans > Tangut ph- (not p-)

*z- in earlier stratum of Chinese loans > Tangut z-

*s- or *sɦ- (< *z) in later stratum of Chinese loans > Tangut s- (not sh-, since Tangut had no aspirated s)

*ʑ- in earlier stratum of Chinese loans > Tangut ʑ-

*ɕ- or *ɕɦ- (< *ʑ-) in later stratum of Chinese loans > Tangut ɕ- (not ɕh-, since Tangut had no aspirated ɕ-)

*ɣ- in earlier stratum of Chinese loans > Tangut ɣ-

*x- or *xɦ- (< *ɣ-) in later stratum of Chinese loans > Tangut x- (not xh-, since Tangut had no aspirated x-)
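The two strata can be summarized as a lookup table. This sketch covers only the four initials listed above; the names `LOAN_INITIALS` and `tangut_initial` are hypothetical, and no claim is made about initials that the list does not mention.

```python
# Chinese voiced initial -> Tangut initial, by loan stratum (from the list above).
LOAN_INITIALS = {
    "earlier": {"b": "b", "z": "z", "ʑ": "ʑ", "ɣ": "ɣ"},
    "later": {"b": "ph", "z": "s", "ʑ": "ɕ", "ɣ": "x"},
}


def tangut_initial(chinese_initial: str, stratum: str) -> str:
    """Look up the Tangut reflex of a Chinese voiced initial in a given stratum."""
    return LOAN_INITIALS[stratum][chinese_initial]


for initial in ("b", "z", "ʑ", "ɣ"):
    print(initial, tangut_initial(initial, "earlier"), tangut_initial(initial, "later"))
```

In the later stratum only the stop comes out aspirated (ph-), since Tangut had no aspirated fricatives to borrow the other three with.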

I have found only two cases of Chinese voiced (or breathy) initials corresponding to Nishida's -ɦ:

TT4434 'slanting': 斜 *sɦje > Nishida seɦ R37 1.36 (my sie)

TT3104 'Buddhist monk': 和 *xɦwa > Nishida xwɑɦ R17 1.17 (my xwa)

According to Pulleyblank, Chinese voiceless obstruents and voiced sonorants did not develop breathiness*, yet there are examples of such Chinese initials corresponding to Nishida's sonorants followed by -ɦ: e.g.,

TT4921 'wave': 波 *pa > Nishida pɑɦ R17 1.17 (my pa)

TT2644 'record': 記 *kɰi > Nishida kiɦ R11 1.11 (my ki)

TT5194 'gruel': 糜 *mɰe > Nishida meɦ R37 1.36 (my mie)

TT4012 'south': 南 *nam > Nishida nɑɦ R17 1.17 (my na)

TT1205 (second half of 虹蜺 'rainbow'): *ŋjej > Nishida neɦ R37 1.36 (my nie; the initial seems strange, but *ŋ- > n- fronting before palatals also occurred in the ancestor of standard Mandarin: the Md reading of 蜺 is ni, not yi, which would be the expected reflex of *ŋjej.)

Why would the Tangut add an extra -ɦ to such loanwords? I don't think those loanwords - or any words in Tangut - had final -ɦ.

Next: Why did Nishida reconstruct -ɦ in Tangut?

*5.6.00:02: Unlike Late Middle Chinese, Lachi developed breathiness after voiced sonorants as well as voiced obstruents: e.g.,

'yam': Lachi mɦɑ < Proto-Kra *məl (cf. Proto-Tai *mən)

Could lost preinitials account for exceptions such as Lachi (not mɦɒ) 'bear' < Proto-Kra *C-me (cf. Proto-Tai *hmwi)?

WHAT ANNOUNCES THE SUN?

Bodhi, which was borrowed into Tangut as


TT2134 SUN** R43 1.42

I was excited when I discovered that this word was reconstructed in Nishida (1964: 223) as

pɔɦ ljẽ

with a medial -l- which corresponded to Sanskrit dh. For a moment, I hoped that this word was evidence for medial lenition after full syllables as well as presyllables. However, TT2134 had a nonliquid initial because it was listed in chapter III (dentals) of Homophones. If it did have initial l-, it should have been in chapter IX (liquids).

I assume the l was a typo, since TT2134 appears in Nishida (1966: 476) as tjẽ with t-.

A t- is unexpected because it doesn't match Chinese *d-/*th- or Sanskrit dh. The voiceless initial p- of the first syllable has a similar problem, and there are other issues depending on reconstruction:

Sanskrit: bodhi

Chinese: 菩提
Early Middle Chinese: *bo dej (*dej was the closest equivalent to dhi.)
Late Middle Chinese: *pɦo tɦjej
Tangut period NW Chinese: *phu thjej

Tangut:
Nishida 1966: pɔɦ tjẽ
Sofronov 1968: po tɪ̭e
Li Fanwen 1986: pjuo tje (tjẽ)
Gong 1997: po tjɨj
Arakawa 1999: po tjẽ
This site: po tiẽ or po tiej

Syllable initials: Why doesn't Tangut have ph-th- or b-d-?

Final of first syllable (R51): Why would Tangut add an extra -ɦ (Nishida 1966)? Why would Tangut borrow -o as -juo (Li)?

Final of second syllable (R43): Why would Tangut have a nasal vowel (Nishida, Li, Arakawa, AMR) or a nonpalatal vowel ɨ (Gong)? (Is there any non-Tangut evidence for a nonpalatal vowel in the Tangut period NW Chinese pronunciation of 提?)

Perhaps the creator of the word did not choose to imitate Sanskrit or Chinese as closely as possible because he wanted to coin a vague soundalike containing morphemes with meanings reminiscent of 'bodhi (enlightenment)'. But if that were his intent, why didn't he choose TT4495 'bright, intelligent', a homophone of TT2134 'sun', for the second syllable? Moreover, the relevance of TT2783 'report' to 'bodhi' is unclear. Nishida (1966: 473) glossed TT2783 as 覺らせる satoraseru 'awaken', but this may be a guess based on its use in 'bodhi'. No other scholar has such a gloss:

Nevsky (1960 II: 375): сообщать 'to inform'

Li Fanwen (1986: 211) and Shi et al. (2000: 124): 报 'report'

Li Fanwen (1997): 'report, announce'

*TT2783 is probably a borrowing from Chinese 報 'report; announce' which was *pawh in Early Middle Chinese and *pow in NW Late Middle Chinese (see Numoto, "Sino-Japanese Kana Usage", p. 77). It's not clear whether the Tangut borrowed the Chinese word before or after it developed a round vowel in the northwest. In modern NW dialects, words that rhyme with 'report' end in -o, -ɔ, or -au (the latter may be conservative or due to external influence [cf. standard Mandarin -ao]).

The tangraph for TT2783 may be a calque for 报, a simplified variant of 報. TT2783 and 报 both have 'hand' on the left side. But I don't know if 报 existed when tangraphy was created.

**The glosses for TT2134 vary:

Nevsky (1960 II: 614): "источник света и тепла; солнце" (source of light and heat; sun)

Nishida (1966: 476): つまる 'choke, clog, fill'

Li Fanwen (1986: 275): 光 'light'

Li Fanwen (1997): 'sun'

Shi et al. (2000: 116): 发光热 'emit light and heat'

