Siamese kaw 'nine' looks like Cantonese kaw 'nine', but the former is not a loan from Cantonese because the a in the two words has different origins. In the previous entry, I reconstructed Proto-Tai 'nine' as *kɔu. Cantonese kaw 'nine', on the other hand, may come from something like southern Middle Chinese *kɨw (cf. Sino-Vietnamese cửu [kɨw] 'nine'). Siamese a is a lowered *ɔ, whereas Cantonese a is an *ɨ that lowered a little over a millennium ago (i.e., probably after the breakup of Proto-Tai).

Since a Chinese loanword for 'nine' can be reconstructed at the Proto-Tai level, and since other Chinese loanwords in PT look quite old (e.g., 'five'*), it is generally** better to compare Chinese loans in PT with (Late) Old Chinese than with forms from modern Chinese languages.

The early Old Chinese word for 'nine' could have been *kwəʔ or *kuʔ. If it was *kwəʔ, it probably shifted to *kuʔ and remained unchanged in late Old Chinese without any mid vowel. So why wasn't the word borrowed into PT as *ku?

PT *-ɔu is identical to the late OC reflex of OC *u after emphatics. Could PT *kɔu reflect a southern late OC *kɔuʔ < *kouʔ < *kuʔ < early OC *Cʌ-kuʔ with an emphasis-triggering low-vowelled presyllable?  (But I know of no other evidence for an emphatic variant of 'nine' in Old Chinese.)

An emphatic solution can't be applied to another odd correspondence involving PT *-ɔu because LOC palatal *tɕ must come from an early OC nonemphatic (in this case, *t):

'master': PT *tɕɔu : 主 LOC *tɕuoʔ < *Cɯ-toʔ

I would rather not claim that PT metathesized the LOC diphthong. Maybe LOC *-uo was phonetically *[uow], but I know of no other evidence for a final *-w.

*Proto-Tai *ha < *hŋ- 'five' has a low vowel like Old Chinese *ŋaʔ instead of a mid vowel like Middle Chinese *ŋoʔ.

Li (1977: 251) cited a Sui form ŋo resembling MC *ŋoʔ. Although Li seemed to assume that the Sui and PT forms share an ancestor (i.e., a Chinese loan into Proto-Kam-Tai), I wonder if Middle Chinese *ŋoʔ was borrowed into Sui after its ancestor Proto-Kam-Sui split from Proto-Tai. Did PKT *a become *o in Sui? I doubt it, since Sui -aa corresponds to Siamese -aa < PT *-a (Ostapirat 2000: 4-8):

| Gloss | Sui | Siamese |
| --- | --- | --- |
| eye | ndaa | taa |
| dog | hmaa | maa < PT *hma |
| sesame | ʔŋaa | ŋaa |
| thick | ʔnaa | naa < PT *hna |
| grandmother | jaa | jaa |

If Sui inherited a PKT word for 'five', I would expect ŋaa instead of ŋo.
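The expectation above can be checked mechanically. Here is a minimal Python sketch, assuming only the five Sui : Siamese pairs cited from Ostapirat. The helper names are hypothetical, the "rhyme" is crudely taken to be the last two characters of each form, and the Siamese form haa 'five' (tone omitted) is my own addition for illustration:

```python
# Sui : Siamese pairs cited above (Ostapirat 2000: 4-8), tones omitted.
PAIRS = {
    "eye": ("ndaa", "taa"),
    "dog": ("hmaa", "maa"),
    "sesame": ("ʔŋaa", "ŋaa"),
    "thick": ("ʔnaa", "naa"),
    "grandmother": ("jaa", "jaa"),
}


def rhyme(form, length=2):
    """Return the last `length` characters as a crude stand-in for the rhyme."""
    return form[-length:]


def rhymes_match(pairs):
    """True if every Sui form ends in the same rhyme as its Siamese counterpart."""
    return all(rhyme(sui) == rhyme(siamese) for sui, siamese in pairs.values())


def expected_sui(siamese, sui_initial):
    """Predict an inherited Sui form by keeping the Siamese rhyme (hypothetical helper)."""
    return sui_initial + rhyme(siamese)


print(rhymes_match(PAIRS))        # True
print(expected_sui("haa", "ŋ"))   # ŋaa
```

The sketch only restates the correspondence; it says nothing about whether Sui ŋo is a loan, which still has to be argued from the Chinese side.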

**Most northern Tai languages have -u in 'nine' from PT *-ɔu, but Wuming has kau (a recent loan from Cantonese kaw?) and Qiangjiang has kjuu, which looks like my southern Middle Chinese *kɨw (Li 1977: 291) and is probably not inherited from Proto-Tai.  Hence Qiangjiang kjuu is not evidence for Li Fang-kuei's *-j- in 'nine'. David Strecker (1983: 71) wrote:

The medial -j- [in 'nine' in Qiangjiang and other northern Tai dialects which have kjuu or khjuu] might go back to Li's Proto-Tai medial but I suspect that these dialects have simply reborrowed the word from modern Chinese dialects.

Strecker (1983: 41) reconstructed Proto-Tai 'nine' as *kaw ~ *kuu. *kaw looks like *kɑuʔ, which is what my late OC *kɔuʔ would become in very late OC and Middle Chinese. And of course his *kuu looks like my Middle Chinese *kuʔ. He cited Sarawit's reconstruction of *kəəw which doesn't match any of my Chinese reconstructions but still lacks Li Fang-kuei's *-j-.

LI FANG-KUEI'S PROTO-TAI DIPHTHONGS

I'm always looking at vowel systems, hoping to learn something that I can apply to my reconstruction of the Tangut vowel system.

0. The basic Proto-Tai vowels

In A Handbook of Comparative Tai (1977: 297), Li Fang-kuei reconstructed a nine-vowel system for Proto-Tai:

*i *ɨ *u
*e *ə *o
*ɛ *a *ɔ

Li also mentioned an alternative solution by Mary Elizabeth S. Sarawit* in her 1973 PhD dissertation: a seven-vowel system that is like my pre-Tangut six-vowel system if schwa is omitted:

*i *ɨ *u
*e (*ə) *o
*a

(Sarawit also reconstructed long versions of all seven vowels. My six pre-Tangut vowels also have long versions that I have tentatively carried over from Gong's reconstruction in spite of the lack of evidence for Tangut vowel length.)

The nine vowels of Li's system combine to form a large number of diphthongs. He distinguished between diphthongs with unaccented and accented high first elements. Li used a subscript inverted breve to represent lack of accentuation. To avoid font issues, I will write his unaccented high vowels as glides: *ɰ, *j, *w. These glides have very restricted distributions:

1. *ɰV sequences

*ɰa(i) *ɰɔ

I would expect a *ɰo to fill up the back vowel column. Could *ɰɔ have been *ɰo?


*ɰɔ (or *ɰo?) has the following reflexes:

| Proto-Tai | Siamese (SW Tai) | Lungchow (Central Tai) | Po-ai (Northern Tai) |
| --- | --- | --- | --- |
| *ɰɔ (or *ɰo?) | ɔɔ | oo or ɨ | ɨ(ɨ) |

Proto-Tai *ɔ also became Lungchow oo, so the upper mid height of Lungchow oo is not strong evidence for the upper mid vowel in my *ɰo.

2. *jV sequences

*je *jəu *jo
*jɛ

*j seems to precede only nonhigh vowels.

The absence of *ja is strange because *a is such a common vowel. I would reinterpret *jɛ as *ja (cf. Li's Proto-Northern Tai *ia < his Proto-Tai *jɛ). *jɛ (or *ja?) has the following reflexes:

| Proto-Tai | Siamese (SW Tai) | Lungchow (Central Tai) | Po-ai (Northern Tai) |
| --- | --- | --- | --- |
| *jɛ (or *ja?) | ɛɛ | ee | ii |

Lungchow has no ɛ(ɛ), so its ee could be from *ɛɛ < *jɛ < *ja. Li reconstructed as a source of Lungchow ee.

*jəu is an isolated oddity. I don't understand why Li reconstructed a *j in it because its modern reflexes are all nonpalatal (but see my next entry!):

| Proto-Tai | Siamese (SW Tai) | Lungchow (Central Tai) | Po-ai (Northern Tai) |
| --- | --- | --- | --- |
| *jəu | au | au | uu |

Perhaps the *j is due to the fact that two of the words with this diphthong are Chinese loanwords which would have a *j in Li's Old and Middle Chinese reconstructions. However, I reconstruct those Chinese words without *j:

| Gloss | Sinograph | Old Chinese (LFK) | Middle Chinese (LFK) | Late Old Chinese (AMR) | Proto-Tai (LFK) | Proto-Tai (AMR; see below) | Siamese (SW Tai) | Lungchow (Central Tai) | Po-ai (Northern Tai) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| nine | 九 | *kjəgwx | *kjəu | *kuʔ | *kjəu | *kɔu | kau | kau | kuu |
| master | 主 | *tjugx | *tɕju | *tɕuoʔ | *tɕjəu | *tɕɔu | tɕau | tɕau | ɬuu |

I think *jəu might have been *ɔu, which would fill a gap in Li's nearly full set of -Vu diphthongs:

*iu *ɨu
*eu *əu *ou
*ɛu *au (no *ɔu)
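The gap can also be found by brute force. The following is a small Python sketch, assuming the nine vowels and the seven -Vu diphthongs as I have transcribed them in this post (asterisks dropped); the function name is hypothetical:

```python
# Li's nine Proto-Tai vowels as referred to in this post (my transcription).
VOWELS = ["i", "ɨ", "u", "e", "ə", "o", "ɛ", "a", "ɔ"]

# Li's -Vu diphthongs, written without the asterisk.
ATTESTED_VU = {"iu", "ɨu", "eu", "əu", "ou", "ɛu", "au"}


def missing_vu(vowels, attested):
    """List vowels V with no attested diphthong Vu, skipping V = u itself."""
    return [v for v in vowels if v != "u" and v + "u" not in attested]


print(missing_vu(VOWELS, ATTESTED_VU))  # ['ɔ']
```

The only unfilled slot is the one I would like *jəu to occupy, which is suggestive but of course not proof.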

Proto-Tai *ou became oo in Po-ai. Perhaps that left a gap that was filled by *ɔu:

Proto-Tai *ɔu > pre-Po-ai *ou

This new *ou later became Po-ai uu.

Changing *jɛ to *ja and reinterpreting *jəu without *j results in a symmetrical triangular chart:

*je *jo
*ja

3. *wV sequences

*wi *wɨ
*wə(i) *wo
*wa *wɔ

I'm surprised there is no *we or *wɛ. Perhaps words that Li would reconstruct with labiovelar + *ɛ were velar + *wɛ: e.g., *khwɛn 'suspend'. (I can't find any examples of *Kwe-type words in Li's chapter on labiovelars.)

4. Why move proto-vowels around?

I assume that reconstructed vowel systems should resemble attested vowel systems which tend to have systematic (and sometimes even symmetrical) patterns.

*Whatever happened to Mary Elizabeth S. Sarawit? This is the only reference I've ever seen to her. Li doesn't list her dissertation in his bibliography. Maybe I can order it. David Strecker (1983) discussed Sarawit's vowel system at length.

3 X 3: THE TANGUT VOWEL SYSTEM?

I can't settle on a reconstruction of the Tangut vowel system. Here's what I think it might be at the moment:

Pre-Tangut had six basic vowels (disregarding length):

*i *ɨ *u
*e *a *o

Although this looks (almost) exactly like the early Old Chinese vowel system (depending on whether one writes the sixth OC vowel as *ɨ, *ɯ, or *ə), there is no guarantee that pre-Tangut and early OC had the same vowels in the same etyma.

0. The 3 x 3 grid

In Tangut, each of these vowels developed up to eight variants conditioned by surrounding segments. The nine 'daughters' of each pre-Tangut vowel can be described with a 3 x 3 grid if length and nasalization are ignored:

| | Plain | Tense | Retroflex |
| --- | --- | --- | --- |
| High (Grade III [-palatal] / IV [+palatal]) | | | |
| Mid (Grade I) | | | |
| Low (Grade II) | | | |

I have replaced my earlier high/velarized/low distinction with a high/mid/low distinction since I have never heard of velarized retroflex or velarized tense vowels. Of course, "never heard of" does not mean 'does not exist'. (I felt quite awkward writing R78 as -eɣr in "Stellar Changes" and changed it to -ɛr.)

The terms 'high', 'mid', and 'low' are relative: e.g., 'low' ɛ is actually a lower mid vowel, but it is lower than its 'mid' counterpart e and its 'high' counterpart ie.

The identification of Grade II as the low series was modelled after Schuessler's late OC reconstruction (though Schuessler characterized his reconstruction's low vowels as 'retroflex'). Many scholars reconstruct lower-mid and/or low vowels in Grade II for Middle Chinese. It is not clear what the characteristic of Grade II was in Tangut period northwestern Chinese, but TPNWC Grade II must have sounded like Tangut Grade II because of the strong correlation between Grade II in transcriptions and borrowings observed by Nishida and Gong.

In the following grids, I assume that the lowness of Grade II more or less continued from late OC to the Tangut period in northwestern Chinese. Vowels that have not changed since pre-Tangut are in bold. Each High series vowel has up to three subvariants* that I have omitted so that the basic pattern can be clearly seen. I have also omitted long and nasal variants.

1. The u-grid

Plain Tense Retroflex
High (Grade III) u u
Mid (Grade I) ou oụ ou
Low (Grade II) (theoretically ʊ; does this row exist?)

For now, I follow Gong and reconstruct no Grade II u which would have been [ʊ]. I am still uncertain about how to reconstruct the u-rhymes.

2. The i-grid

| | Plain | Tense | Retroflex |
| --- | --- | --- | --- |
| High (Grade IV) | i | ị | ir |
| Mid (Grade I) | ei | eị | eir |
| Low (Grade II) | ɪ | ɪ̣ | ɪr |

Only this grid and the o-grid (#6 below) have no lacunae.

3. The a-grid

| | Plain | Tense | Retroflex |
| --- | --- | --- | --- |
| High (Grade III) | ɨɐ | ɨɐ̣ | ɨɐr |
| Mid (Grade I) | ɐ | ɐ̣ | ɐr |
| Low (Grade II) | a | (none) | ar |

I am the least certain about this grid because I think Grade I a is more common than Grade II a, and I'd rather use a less exotic symbol to represent low a. The absence of tense low ạ is also suspicious. If Tangut was like Middle Chinese, then Grade I a should be [ɑ] and Grade II a should be [æ].

4. The ɨ-grid

| | Plain | Tense | Retroflex |
| --- | --- | --- | --- |
| High (Grade III) | ɨ | ɨ̣ | ɨr |
| Mid (Grade I) | əɨ | əɨ̣ | əɨr |
| Low (Grade II) | ə | (none) | ər |

This grid has the same problem as the a-grid: the vowels are more exotic in Grade I than in Grade II, though it should be the other way around. Perhaps Grade I ɨ was [ə] and Grade II ɨ was [ʌ]. If one wants the other grid members to be back like [ʌ], high ɨ could be [ɯ] and mid ɨ could be [ɤ].

Note that both achromatic vowels (a and ɨ) have no tense low variants.
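To keep track of lacunae as the reconstruction changes, the grids can be stored as data. Below is a minimal Python sketch for the two achromatic grids only, assuming the cell values given above; the dictionary layout and the function name are my own hypothetical choices, and None stands for a cell marked "(none)":

```python
# Each grid maps a height row to its (plain, tense, retroflex) daughters.
# None marks a cell given as "(none)" in the grids above.
GRIDS = {
    "a": {
        "high": ("ɨɐ", "ɨɐ̣", "ɨɐr"),
        "mid": ("ɐ", "ɐ̣", "ɐr"),
        "low": ("a", None, "ar"),
    },
    "ɨ": {
        "high": ("ɨ", "ɨ̣", "ɨr"),
        "mid": ("əɨ", "əɨ̣", "əɨr"),
        "low": ("ə", None, "ər"),
    },
}
SERIES = ("plain", "tense", "retroflex")


def lacunae(grid):
    """Return (height, series) pairs for every empty cell of one grid."""
    return [
        (height, SERIES[i])
        for height, row in grid.items()
        for i, cell in enumerate(row)
        if cell is None
    ]


for vowel, grid in GRIDS.items():
    print(vowel, lacunae(grid))
# a [('low', 'tense')]
# ɨ [('low', 'tense')]
```

Adding the other four grids to the same dictionary would make it easy to compare their gaps with these two.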

5. The e-grid

| | Plain | Tense | Retroflex |
| --- | --- | --- | --- |
| High (Grade IV) | ie | iẹ | ier |
| Mid (Grade I) | e | (none) | er |
| Low (Grade II) | ɛ | ɛ̣ | ɛr |

I don't know why tense mid is missing. Although Gong reconstructed R63 as Grade II -iẹj (my -ɛ̣), perhaps it should be Grade I -ẹ. A lacuna in the low tense box would match the lacunae in the u, a, and ɨ-grids. I suspect that some low tense vowels merged with mid tense vowels.

6. The o-grid

| | Plain | Tense | Retroflex |
| --- | --- | --- | --- |
| High (Grade III) | uo | uọ | uor |
| Mid (Grade I) | o | ọ | or |
| Low (Grade II) | ɔ | ɔ̣ | ɔr |

7. The basic vowel inventory

If diphthongs are broken up and length, nasality, tenseness, and retroflexion are ignored, Tangut has at least a dozen vowels:

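| | Front | Front | Central | Back |
| --- | --- | --- | --- | --- |
| Labiality | - | + | - | + |
| High | i | y (see here*) | ɨ | u |
| Not quite high | ɪ | | | (ʊ?) |
| Upper mid | e | | ə | o |
| Lower mid | ɛ | | ɐ | ɔ |
| Low | | | a | |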

This chart includes all vowel symbols in this post. Uncertain vowels are in parentheses.

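| | Front | Front | Central | Back | Back |
| --- | --- | --- | --- | --- | --- |
| Labiality | - | + | - | - | + |
| High | i | y (see here*) | (ɨ?) | (ɯ?) | u |
| Not quite high | ɪ | | | | (ʊ?) |
| Upper mid | e | | (ə?) | (ɤ?) | o |
| Lower mid | ɛ | | (ɐ?) | (ʌ?) | ɔ |
| Low | (æ?) | | (a?) | (ɑ?) | |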

If the above chart had more [+labial] front and central vowels, it would be a nearly complete list of IPA vowel symbols! A large vowel inventory is inevitable because Tangut had 105 rhymes and the Tibetan transcriptions of Tangut have almost no codas.

*The subvariants of High series vowels can be arranged in a 2 x 2 grid:

| | Kaikou: [-labial] first element | Hekou: [+labial] first element |
| --- | --- | --- |
| Grade III | ɨV | uV |
| Grade IV | iV | yV |

Note that the hekou Grade III variant of u(u) is simply u(u) (not uu[u]) and the kaikou Grade IV variant of i(i) is simply i(i) (not ii[i]). All other Grade III and IV vowels are diphthongs combining a high vowel with a nonlow vowel. y only occurs in diphthongs.

STELLAR CHANGES

In "ʔo-ʔo", I proposed that Tangut R51 -o was partly from *-aw. Here's a possible instance of R51 corresponding to Matisoff's Proto-Tibeto-Burman *-aaw:

TT1086 子/兒 CHILD no R51 2.42 < *naw-H

cf. PTB *naaw 'younger sibling', Jingpho nāu (the macron indicates a high tone, not a long vowel), Lushai, Thado, and Tangkhul nau (see Matisoff [2003: 225-226] for other cognates)

(but the semantics are imperfect and the Tangut vowel is short; why isn't the Tangut form noo R54 2.45?)

Could Tangut -e rhymes be at least partly from earlier *-aj? The title refers to Tangut e-words which may correspond to -aj elsewhere:

TT0207 STAR gịe (Gong: gjịj) R64 1.61 < *Cɯ-ge

cf. PTB *graaj, Written Burmese krai 'star'

(Why doesn't the Tangut form have vowel retroflexion corresponding to *-r-?*)

(It's not clear how TT0036 STAR *gɨ̣ R72 2.61 < *C-gɨ-H is cognate, unless it's from an earlier *gɨ̣e; the rhyme -ɨ̣e is absent from my reconstruction and the medial -ɨ- may be a trace of a feature in a lost presyllable.)

TT1674 CHANGE le (Gong: lej) R34 2.30 < *le-H

TT5477 CHANGE lhe (Gong: lhej) R34 2.30 < *k-le-H

TT2073 CHANGE lie (Gong: ljij) R37 1.36 < *Cɯ-le

cf. PTB *laj ~ *lej, Lushai lei 'change', Pwo lai 'exchange'

Old Chinese 移 *laj 'move', 易 *lek < ?*laj-k 'change'

It's not possible to use Chinese loanwords as evidence for deriving R34 -e from *-aj, since Tibetan transcriptions indicate that northwestern Chinese monophthongized *aj to an e-like vowel before the Tangut period. Thus Chinese loanwords such as TT4119 xe (Gong: xej) R34 1.33 'sea' could be from NW Chn *xaj (premonophthongization) or *xe (postmonophthongization). In modern Xi'an, 'sea' is xɛ. The monophthongization of *-aj may have spread from NW Chinese to Tangut.

*Other Qiangic languages have an r or retroflex obstruents in their words for 'star':

rGyalrong: ʑŋgri [Japhug], tsu-rî [Somang]

Pumi: ɖɨ [Dayang], ɖʐə [Jinghua and Taoba]

Ronghong Qiang: ʁɖʐə

Tangut did not have tense retroflex vowels, so some tense-vowelled syllables may go back to pre-Tangut *CCrV or *CCVr: e.g.,

| Stage | Form | Change |
| --- | --- | --- |
| Stage 1 | *Cɯ-gre | |
| Stage 2 | *Cɯ-grie | root vowel bent up to match height of presyllable vowel |
| Stage 3 | *C-grie | loss of presyllable vowel |
| Stage 4 | *ggrie | preinitial assimilated to root initial and merged into a single tense consonant gg- (cf. Korean) |
| Stage 5 | *ggie | loss of *-r- after tense consonants |
| Stage 6 | *ggịe | spread of tenseness into vowel |
| Tangut | gịe R64 1.61 | loss of tense consonants; tenseness in vowel became phonemic |
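The ordering of these stages matters, so it may help to run them as ordered string rewrites. This is only a sketch under my own assumptions: each stage is reduced to one literal substring replacement that fits this single word, the stage labels are paraphrases, and the tense vowel is written as i plus a combining dot below (U+0323), normalized at the end:

```python
import unicodedata

# Ordered (label, old, new) rewrites paraphrasing stages 2-6 and the final Tangut step.
STAGES = [
    ("root vowel bent up after high presyllable", "gre", "grie"),
    ("loss of presyllable vowel", "Cɯ-", "C-"),
    ("preinitial assimilates to root initial", "C-g", "gg"),
    ("loss of -r- after tense consonant", "ggr", "gg"),
    ("tenseness spreads into vowel", "ggie", "ggi\u0323e"),
    ("loss of tense consonant", "gg", "g"),
]


def derive(form):
    """Apply every stage in order and return all intermediate forms (NFC-normalized)."""
    forms = [form]
    for _label, old, new in STAGES:
        forms.append(forms[-1].replace(old, new))
    return [unicodedata.normalize("NFC", f) for f in forms]


print(" > ".join(derive("Cɯ-gre")))
```

Swapping the fourth and fifth rewrites (dropping *-r- before the preinitial has assimilated) would leave the *-r- in place, which is one way to see why *-r- loss has to follow the creation of the tense consonant in this account.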

Even if 'star' lacked a tense vowel, it would still not have had a retroflex vowel. Retroflex e-vowels

R77 -er (Gong: -ejr)

R78 -ɛr (Gong: -iejr) (see my next post)

R79 -ier (Gong: -jijr)

did not occur after velars. Perhaps *KEr-syllables merged with plain-vowelled KE-syllables.

ʔO-ʔO

The second syllable of Tangut

lɨ̣̣ ʔo 'rabbit'

is mysterious for three reasons. Its initial and rhyme are as uncertain as its etymology.

Here are several reconstructions of that second syllable and its homophones (e.g., 'enter') in Homophones group VIII.84-85 (44B31-44B52*). If a scholar has not reconstructed an initial, I have written a hyphen (e.g., Hashimoto).

Nishida 1964 ɣɔɦ
Hashimoto 1965 -ɔwN
Sofronov 1968 ʔo
Huang 1983 -uẽ or -uɐ̃
Li Fanwen 1986 wjuo
Gong 1997 ʔo
Arakawa 1999 (in Kotaka's online glossary) ʔo

Until now, I agreed with Sofronov, Gong, and Arakawa, but now I'm not so sure.

Was the initial of VIII.84-85 a glottal stop?

I suspect it might have been (ʔ)w- for four reasons:

1. The fanqie initial speller of VIII.84

TT4472 ʔou (Sofronov and Gong: ʔu) R1 2.1

was used as an initial speller for

TT0234 wõ (Sofronov: wõ, Gong: wow) R56 1.54

in the labiodental section of Homophones. This suggests that TT4472 and VIII.84 had initial (ʔ)w-. (Reconstructing an initial glottal stop would explain why TT4472 and VIII.84 were listed in the glottal section of Homophones.)

2. I don't have Nevsky's (1926) list of Tibetan transcriptions with me, but according to my database, 'enter' was transcribed in Tibetan as wo, not ʔo. 

(4.9.20:54: Now that I have Nevsky's list in front of me, I see that he wrote the Tibetan letter wa without a vowel o [#222]. Is his romanization wo correct? His dictionary [1960 II: 166] lists no Tibetan transcription for 'enter'.)

Nishida (1964: 129) listed two transcriptions for the homophone group VIII.85: wo and d-woH. Li Fanwen (1986: 166) listed two transcriptions for TT4176 in VIII.84 (wo and d-wo) and three transcriptions for TT2903 in VIII.85 (wu, wo, d-woH).

3. In the Pearl, the second syllable of 'rabbit' was transcribed in Chinese with 訛 instead of, say, 蛾. In Middle Chinese, 訛 was *ŋwa whereas 蛾 was *ŋa. It's not clear how 訛 and 蛾 were pronounced in Tangut period northwestern Chinese, but if medial *-w- was still distinctive in that rhyme class, their readings may have been *wɔ and *ɔ. (In modern Xi'an, they have merged as ŋɤ.)

4. Probable external cognates of 'enter' do not have an initial glottal stop:

Written Tibetan Hong ?[ɣoŋ] 'come'

Written Burmese wang 'enter; go; come in'

Old Chinese *waŋʔ 'go'

(The first two are from Gong, "The System of Finals in Proto-Sino-Tibetan".)

Note, however, that Chn 翁 'old man' was borrowed into Tangut as

TT0224 (Gong: ʔo) R51 1.49

It's not clear whether the Chinese original had a glottal stop like Middle Chinese *ʔəwŋ or a w- like standard Mandarin weng [wəŋ] at the time of borrowing.

It would have been nice if the initial of the second syllable of 'rabbit' were ɣ, so I could derive it from an earlier intervocalic *-k- and link the disyllabic word lɨ̣̣ ʔo to these Qiangic forms:

Muya/Minyag ʐi vø

Queyu ʐi ko

(from Matisoff, " 'Brightening' and the place of Xixia")

There is no ɣo R51 in Gong's reconstruction, and I wonder if earlier *ɣo R51 merged with wo R51. (Gong's reconstruction does contain ɣow R56, ɣiow R57, ɣjow R58, and ɣor R95, so I cannot claim that all *ɣ- became w- before o-type vowels.)

Was the final (R51) of VIII.84-85 simply -o?

R51 was certainly o-like, since it was generally transcribed with Tibetan -o (Nishida 1964: 56).

Although Nishida noted that an unspecified R51 graph was used to transcribe Sanskrit dhaa, this is not strong evidence for reconstructing R51 as -ɔ. I suspect that R51 was used to transcribe a Chinese 陀 *thɔ which was used to transcribe Skt dhaa centuries earlier when it was pronounced *da.

Nishida's final -ɦ does correlate with the -H in some Tibetan transcriptions of R51. Could Tib -H represent the nasalization that Huang reconstructed? Perhaps it did in some cases if the transcribed Tangut dialect maintained a distinction between nasalized and nonnasalized versions of R51. External evidence indicates that R51 had nonnasal-final as well as nasal-final origins:

TT1100 POISON do R51 1.49

TT3225 POISON do R51 2.42

cf. Japhug rGyalrong tɤ ndɤɣ < *-ɔk, Written Tibetan dug, Late Old Chinese *douk < *luk

(this word for 'poison' may be a late OC form borrowed into other languages; see Sagart 1999)

TT0240 BRAIN no R51 2.42

cf. Japhug rGyalrong tɯ rnoʁ < *-oq, Old Chinese *nuʔ

R51 words include loans from Chinese without nasal codas (Gong, "西夏語中的漢語借詞"):

TT0683 CHARIOT ko R51 1.49 < 車 *kø < MC *kɨə

TT4297 PUDDLE pho R51 1.49 < 泊 *ph- < MC *bak

TT4628 READ do R51 1.49 < 讀 MC *dok

According to Gong, the nasal-final sources of R51 are Proto-Sino-Tibetan *-aŋ, *-am, *-əm: sequences of achromatic vowels followed by grave nasals. Since two of these rhymes have low vowels, I wonder if R51 -o was phonetically lower mid [ɔ].

The table below shows how the various sources of R51 might have merged:

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Tangut
*-o *-o *-o *-o -o ?[-ɔ]
*-o(ɰ)k *-oɰ *-ow
*-a(ɰ)ŋ *-a(ɰ)ŋ *-ãɰ *-aw *-ɔ
*-a(ɰ)k *-aɰ
*-a(w)m *-a(w)m *-ãw

I presume that 翁 MC *ʔəwŋ 'old man' was borrowed as pre-Tangut *ʔə(ɰ)ŋ.

*It's not clear why the two tangraphs of VIII.85

TT2903 WOMB (Gong: ʔo) R51 1.49

TT2386 CONSIST-OF (Gong: ʔo) R51 1.49

are separated from the 16 tangraphs of VIII.84 in both Homophones (44B51-52) and Tangraphic Sea (F56B52-61). Nishida and Gong reconstruct groups 84 and 85 identically.

I suspect the difference lies in their initials. The fanqie initial speller for VIII.85

TT3149 (Gong: ʔu) R1 2.1

points to ʔ- whereas VIII.84 may have had (ʔ)w- (see above).

OF RATS AND RABBITS

Yesterday, it occurred to me that Old Chinese voiceless sonorants (and voiceless aspirated obstruents?) could have originated from *k-C- clusters: e.g.,

'rat': 鼠 *kɯ-naʔ > *hnaʔ > late OC *ɕɨaʔ

for *n- cf. Tangut ṇoɣ R74 2.63 < *Cɤ-no(k)-H

'rabbit': 兔 *k-las > *hlas > late OC *thah

for *l- cf. Tangut lɨ̣̣ R72 2.61 ʔo R51 2.42 < *C-lɨ-H-ʔ-...-H

Both words might have the same *k-prefix as 蜘蛛 ?*kɯ-reto 'spider' (see "Rhotic Harmony?" below).

Note that I reconstructed the OC word for 'rabbit' with *k- instead of *kɯ-. A high-vowelled prefix would deemphasize the root:

*kɯ-las > *klas > *hlas > *hlah > *hlɨah > late OC *ɕɨah (not attested)

Tangut *C-lɨ-H could be indigenous or be a borrowing from a nonemphatic Chinese form. The high vowel could either reflect a late OC *ɨa or be the result of Tangut-internal vowel harmony:

*Cɯ-las > *C-lɨah > *ḷɨ̣h > lɨ̣̣ R72 2.61

(I can't reconstruct the Tangut prefix as *k- since I think *kl- may have become Tangut lh-.)

Subscript dots indicate tenseness: *Cl- > *ḷ- (tense l).

The pre-Tangut source of the second syllable of lɨ̣̣ ʔo 'rabbit' is unknown. It could be *ʔoH, *ʔaŋH, *ʔamH, or *ʔəmH.

(See Gong, "The System of Finals in Proto-Sino-Tibetan". Gong reconstructed PST *-u as a source of Tangut -o, since his earlier *u is equivalent to my *o.)

REVIVING DEAD VELAR CLUSTERS

In Siamese phonology, stop-final syllables are 'dead' and other syllables are 'live'. This terminology can also be applied to other languages with similar phonologies.

Middle Chinese word families may contain mixtures of 'dead' and 'live' syllables: e.g.,

惡 MC *ʔak 'evil' (dead) ~ *ʔoh 'hate' (live)

These alternations are reconstructed as *-C ~ *-C-s in Old Chinese: e.g.,

惡 OC *ʔak 'evil' ~ *ʔak-s 'hate'

The final cluster *-ks fused to *-x and merged with *-h. In Middle Chinese, *-h disappeared after conditioning breathy phonation (去聲 'departing tone') in the preceding vowel: *-Vh > *-V̤.

(For typing convenience, I represent this breathy phonation as *-h in MC, even though MC no longer has final glottals.)

Although (nearly*) all Tangut syllables were 'live', comparative evidence indicates that pre-Tangut had 'dead' syllables which were 'brought to life': e.g.,

'one': *kʌ-tik > *kʌ-dəik > *kʌ-lek > *leɣ > *leɰ > lew R44 1.43

cf. Written Tibetan gcig < *k-tik, Written Burmese tac < *tik

(The Tibetan transcriptions gliH, gli, kli may represent a nonstandard Tangut dialect which underwent different sound changes [4.9.21:07: e.g., *kʌ-l- > *ɣl-].)

I reconstruct *-H as the pre-Tangut source of the Tangut 'rising tone' (breathy phonation?). *-H is a cover symbol for unknown glottals (probably *-h < *-s and possibly also *-ʔ). If a 'level tone' word that was once 'dead' has a 'rising tone' cognate, I reconstruct the latter with a stop followed by *-H: e.g., *-k-H. Since clusters ending in glottal stops such as *-k-ʔ would be extremely difficult to pronounce, I suspect that *-H was still *-s when it was added to stop-final roots: e.g.,

'same; one and the same': *kʌ-tik-s > lew R44 2.38

(the root is *tik 'one'; see above)

'leopard': *rʌ-tsi/ek-s > zerw R93 2.78

(Tangut has no form implying a simple *-k coda, but such a form is implied by Japhug rGyalrong kɯ-rtsɤɣ < *-k, Somang rGyalrong kə-ɕtɕík, and Written Tibetan gzig [not gtsig!**] < *-k)

The velar component of *-k-s left a trace (-w) in Tangut, whereas it disappeared in Chinese:

| Earlier coda | Tangut | Middle Chinese |
| --- | --- | --- |
| *-k | -w + 'level tone' (live) | *-k + 'entering tone' (dead) |
| *-k-s | -w + 'rising tone' (live) | -Ø + 'departing tone' (live) |
| *-s | -Ø + 'rising tone' (live) | -Ø + 'departing tone' (live) |

Without looking at word families and sinographic structure (i.e., the presence of a stop-final phonetic), there is no way to tell whether an MC departing tone open syllable is from OC *-k-s or *-s.
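The asymmetry between the two languages can be made explicit by inverting the table. Here is a small Python sketch, assuming the three codas and outcomes listed above (with the Middle Chinese outcome of *-s taken to be the same as that of *-k-s); the names are hypothetical:

```python
from collections import defaultdict

# Outcomes of the earlier codas as (final, tone), following the table above.
TANGUT = {
    "-k": ("-w", "level"),
    "-k-s": ("-w", "rising"),
    "-s": ("-Ø", "rising"),
}
MIDDLE_CHINESE = {
    "-k": ("-k", "entering"),
    "-k-s": ("-Ø", "departing"),
    "-s": ("-Ø", "departing"),
}


def sources_by_outcome(outcomes):
    """Invert a coda -> outcome mapping into outcome -> list of codas."""
    inverted = defaultdict(list)
    for coda, outcome in outcomes.items():
        inverted[outcome].append(coda)
    return dict(inverted)


def ambiguous(outcomes):
    """Keep only the outcomes that have more than one possible source."""
    return {o: s for o, s in sources_by_outcome(outcomes).items() if len(s) > 1}


print(ambiguous(TANGUT))          # {}
print(ambiguous(MIDDLE_CHINESE))  # {('-Ø', 'departing'): ['-k-s', '-s']}
```

In this toy model every Tangut outcome points back to a single source, while the Middle Chinese departing tone does not, which is why word families and phonetic series are needed on the Chinese side.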

I've been struggling to figure out how pre-Tangut *-k-s became Tangut -w + 'rising tone'. Here are my latest attempts to bridge the two stages:

Solution 1: Velar chain shift

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Stage 6 Tangut
*-Vk *-Vx *-Vɣ *-Vɰ -Vw + 'level tone'
*-Vk-s *-Vx *-V̤x *-V̤ɣ (?*-V̤ɰ) *-V̤ɰ -Vw + 'rising tone'
*-Vs *-Vh *-V̤h *-V̤ -V + 'rising tone'

The parallels with Chinese end after the fusion of *-k-s to *-x. In pre-Tangut, vowels developed breathy phonation (*..) before final *-x and *-h (whereas in Chinese, *-x and *-h probably merged into *-h which then conditioned breathy phonation). Velar codas switched places in a chain shift:

*-k > *-x > *-ɣ > (?*-ɰ)
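One way to read a chain shift is that each coda moves one slot along the chain per step, and a coda that has just moved is not moved again within the same step. A minimal Python sketch of that reading follows; the chain itself is from the line above, but the one-slot-per-step model and the function names are my own assumptions:

```python
# The chain of velar codas, in order (asterisks and hyphens omitted).
CHAIN = ["k", "x", "ɣ", "ɰ"]

# Map each link to the next one; the last link has nowhere further to go.
STEP = {old: new for old, new in zip(CHAIN, CHAIN[1:])}


def shift_once(coda):
    """Move a coda one slot along the chain; codas outside the chain stay put."""
    return STEP.get(coda, coda)


def shift_all(codas):
    """Shift every coda in the same step, each from its own starting point."""
    return [shift_once(c) for c in codas]


print(shift_all(["k", "x", "h"]))  # ['x', 'ɣ', 'h']
```

Because all codas move in the same step, *-k and *-x stay distinct after the shift instead of collapsing into one coda.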

Solution 2: Velar glide-stop cluster

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Stage 6 Tangut
*-Vk *-Vɰk *-Vɰk *-Vɰʔ *-Vɰ -Vw + 'level tone'
*-Vk-s *-Vɰk-s *-Vɰx *-Vɰh *-V̤ɰh *-V̤ɰ -Vw + 'rising tone'
*-Vs *-Vs *-Vh *-V̤h *-V̤ -V + 'rising tone'

In this solution, Tangut -w is not directly from *-k but is a trace of a velar glide *-ɰ- that developed before *-k. Final velar obstruents shifted to glottals before being deleted.

The variety of northwestern late Middle Chinese underlying the Kan-on stratum of Sino-Japanese probably had clusters with velar glides borrowed as Old Japanese *u: e.g.,

亡 OC *maŋ > NW LMC *mbwaɰŋ 'to die', borrowed as OJ period Kan-on *mbaũ (now [boo])

Tangut could have developed such velar glides under the influence of its neighbor NW LMC.

Solution 3: Late glottal stop affixation

This solution assumes that glottal stop was a source of the Tangut 'rising tone' and that this was added to *-k after it lenited (stage 5; in bold). Subscript tilde indicates creaky phonation.

Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Stage 6 Stage 7 Tangut
*-Vk *-Vx *-Vɣ *-Vɰ *-Vɰ *-Vɰ *-Vɰ -Vw + 'level tone'
*-Vɰ-ʔ *-V̰ɰʔ *-V̰ɰ -Vw + 'rising tone'
*-Vʔ, *-Vʔs, *-Vs *-Vʔ *-V̰ʔ *-V̰ -V + 'rising tone'

Solution 4: Posttonogenetic flipflop

This is the least likely of the four solutions: After Tangut developed tones, some -w 'level tone' words became -w 'rising tone' words by analogy with tonal alternations arising from pretonogenetic affixation. If zerw R93 2.78 'leopard' arose from a late tonal change, its 'level tone' source zerw R93 1.87 must have disappeared before a tangraph could be devised for it.

*Precious Rhymes of the Tangraphic Sea lists a few 'entering tone' tangraphs which may have represented 'dead' syllables.

**Matisoff (2003: 657) reconstructed the Proto-Tibeto-Burman root for 'leopard' as *zik. However, rGyalrong forms imply Proto-rGyalrong root(s) with a voiceless initial:

Japhug kɯ-rtsɤɣ < PGR ?*tsek

Somang kə-ɕtɕík < PGR ?*tɕek

Zbu qə-sɐ̂ < PGR ?*s- (This may not be cognate with the other two words, since I would expect an affricate initial and -ôx or -ə̂x.)

Ronghong Qiang has which superficially resembles the Zbu form. (Huang and LaPolla [2003: 334] derived this from a Proto-Tibeto-Burman *sik.)

It's no longer clear to me whether Tangut z- in 'leopard' is from a lenited *(t)s- or an original *z-, though on the basis of the extremely limited evidence presented here, it seems that Proto-Qiangic (and its descendant Tangut) had a voiceless initial root.

RHOTIC LENITION?

On this blog, I've proposed that Tangut voiced initials corresponding to non-Tangut voiceless initials arose through intervocalic lenition:

| Gloss | Pre-Tangut | Tangut (AMR) | Cf. |
| --- | --- | --- | --- |
| head | *Cu-ku | ɣu R4 1.4 | Japhug rGyalrong tɯ-ku |
| pillow | *Cɯ-ke/oN | ɣiẽ R42 1.42, ɣuõ R58 1.56 | Jph tɤ-mkɯm |
| leopard | *rʌ-tsi/ek-H | zerw R93 2.78 | Jph kɯ-rtsɤɣ |
| one | *kʌ-tik | lew R44 1.43 | Tib. transcription kli; Written Tibetan gcig < *k-tik, Written Burmese tac < *tik |
| be able to | *sɯ-pi-H | wị R70 2.60 | Jph kɯ-spa |

Last night, I realized that pre-Tangut *rj may have also lenited intervocalically:

'eight': *ʔɯ-rja > ʔjar R87 1.82

cf. Somang rGyalrong wu-rjat, Written Burmese hrac (< Proto-Lolo-Burmese *ʔ-rit [Matisoff 2003: 56])

'stand': *ʔɯ-ra > *ʔɯ-rɰa > *ʔɯ-rja > ʔjar R87 1.82

cf. Written Burmese rap (< Proto-Lolo-Burmese *ʔ-rap [Matisoff 2003: 609]), Old Chinese 立 *rəp

There may have been intermediate z-stages after postrhotic vowel retroflexion: e.g., *ʔɯ-ʑjar.

Just now, I realized that pre-Tangut *r may have lenited between a presyllable vowel and *i:

'hundred': *ʔɯ-riH > *ʔɯ-ʑir > ʔjir R84 2.72

cf. Somang rGyalrong pə-rjâ

Not all syllables of the type ʔjVr may go back to *ʔɯ-r(j)V. The retroflexion in the vowel could reflect an earlier root-final *-r: *ʔɯ-jVr, though I can't find any comparative evidence to support that.

RHOTIC HARMONY?

(I'm writing this on a plane and am relying almost entirely on my memory, so there may be even more errors than usual.)

Until recently, I used to make a three-way distinction between dental, prefixable dental, and dental cluster initials in sinographic phonetic series: e.g.,

| Phonetic series | Old Chinese | Middle Chinese (if emphatic) | Middle Chinese (if nonemphatic) |
| --- | --- | --- | --- |
| Dental | *t- | *t- | tɕ- |
| Prefixable dental | *(r-)t- | nonprefixed > *t-; prefixed > *ʈ- | nonprefixed > *tɕ-; prefixed > *ʈ- |
| Dental cluster (prefix + root initial or root cluster) | *t(-)r- | *ʈ- | ʈ- |

If this were correct:

知 should be a *tr- dental cluster phonetic, since its derivatives were pronounced with MC retroflex initials.

朱 should be an *(r-)t- prefixable dental phonetic, since its derivatives were pronounced with MC palatal and retroflex initials.

The word for 'spider' is written with both those phonetics: 蜘蛛. Since its MC reading was *ʈie ʈuo, its OC reading should be *tre r-to. Yet this is unlikely, since

- disyllabic OC words often have the pattern XAXB, so I would expect *tre tro or *rte rto.

- it would be strange to add an affix r- to a meaningless second syllable of a disyllabic morpheme

So for a while, I reconstructed the word as OC *tre tro with a reduplicated *tr-root. I regarded the use of a *(r-)t-phonetic 朱 for a *tr-syllable 蛛 as anomalous and possibly even inevitable, since I couldn't think of any other phonetics pronounced *tro.

However, today I realized that another solution was possible:

- The original root was *reto

- A prefix *tɯ- was added to this root: *tɯ-reto

(Or was the root trisyllabic with a reduced first vowel: *tVreto > *tɯreto?)

- Without a prefix, *reto would have developed emphasis because its vowels were nonhigh. However, a high-vowelled prefix preserved the root's original nonemphatic quality.

- The spelling 蜘蛛 was devised when the word was *t(ɯ)-reto; 朱 *to 'vermillion' was a perfect phonetic because it was homophonous with the *-to of 'spider'

- *t(ɯ)-reto was compressed to *treto

- Later on, the second syllable came to have the same initial as the first:

*treto > *tretro


*ʈeto > ʈeʈo

It's also possible that vowel retroflexion was present in the first syllable and spread to the second:

*trerto > *trertror


*ʈerto > ʈerʈor

(If this retroflexion persisted into Middle Chinese, it may have been nonphonemic since it would have automatically followed any retroflex initial.)

Was this an isolated change that made the word look like other reduplicative words? Or was it a rhotic harmony rule that affected disyllabic morphemes with rhotic first syllables and nonrhotic second syllables?

[+rhotic] [-rhotic] > [+rhotic] [+rhotic]

This rhotic harmony would be similar to the first of the two emphatic harmony rules that affected most disyllabic morphemes:

[+emphatic] [-emphatic] > [+emphatic] [+emphatic]

[-emphatic] [+emphatic] > [-emphatic] [-emphatic]

Did the deemphasizing rule have a rhotic counterpart?

[-rhotic] [+rhotic] > [-rhotic] [-rhotic]
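If both rhotic rules existed, they would jointly amount to the first syllable's value spreading to the second, just as the two emphatic rules do. The following Python sketch shows that reading; it is only a restatement of the rule schemas above, and the function names are hypothetical:

```python
def spread_from_first(first, second):
    """Return the harmonized pair: the second syllable copies the first.

    `second` is accepted only to mirror the rule's input; its value never survives.
    """
    return (first, first)


def show(feature, pair):
    """Format a pair of feature values, e.g. '[+rhotic] [-rhotic]'."""
    return " ".join(f"[{'+' if value else '-'}{feature}]" for value in pair)


for pair in [(True, False), (False, True)]:
    print(show("rhotic", pair), ">", show("rhotic", spread_from_first(*pair)))
# [+rhotic] [-rhotic] > [+rhotic] [+rhotic]
# [-rhotic] [+rhotic] > [-rhotic] [-rhotic]
```

The same function reproduces the emphatic rules if "rhotic" is replaced with "emphatic", which is the parallel I want to test against a database of disyllabic words.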

I'd like to compile a database of disyllabic OC words to test these hypotheses.

ADDENDUM: OC has a *k(V)-prefix in count nouns (Sagart 1999: 106). I could reconstruct such a prefix for 'spider' if *k(V)-r- > *tr-. But this conflicts with other cases of *k(V)r- becoming *k-. Perhaps *C(V)r-sequences fused differently at different stages.

Stage 1 Stage 2 Stage 3 Stage 4
Early fuser *kVr- *kr- *tr- *ʈ-
Late fuser *kVr- *kr- *k-
Early fuser *tVr- *tr- (> *dr-) *r- *l-
Late fuser *tVr- *tr- *ʈ-

'Spider' would be an early fuser:

*k(V)-reto > *kreto > *treto

Part of the table can be summarized as a chain shift:

*kVr- > *kr- > *tr- > *r-

(Also cf. the Written Burmese animal prefix k-. Matisoff [2003: 138-139] regarded it as a loan from Mon-Khmer: cf. Vietnamese con 'child'.)

Ø ɣ ɨ i

An emergency came up and I don't know when I'll be able to blog again.

I want to tie up one loose end before I go on hiatus. You may have noticed that I reconstructed a contrast between medial -ɨ- and -i- in Tangut in part 3 of "What's Geng On?". That was a glimpse of my latest hypothesis about the grades of Tangut:

| Tangut grade | Gong: Kaikou | Gong: Hekou | AMR: Kaikou | AMR: Hekou |
| --- | --- | --- | --- | --- |
| I | -Ø- | -w- | lowered vowel | -w- + lowered vowel |
| II | -i- | -iw- | velarized lowered vowel | -w- + velarized lowered vowel |
| III | -j- | -jw- | -ɨ- + raised vowel | -u- + raised vowel |
| IV | -j- | -jw- | -i- + raised vowel | -y- + raised vowel |

(In the Middle Chinese rhyme tables, hekou 'closed mouth' syllables had labial glides or vowels. All other syllables were kaikou 'open mouth'.)

Unlike Gong, I have restored a distinction between Grade III and IV which enabled me to project my current interpretation of the four grades of Middle Chinese onto Tangut. I hope to test this hypothesis after I return.

"PART OF MY LIFE IS SAVING LIFE": DITH PRAN (1942-2008)

Cambodia's been on my mind a lot lately, and not just because of its language. While one part of my mind was considering the possible parallels between vowel warping in Khmer, Old Chinese, and Tangut, another was thinking about Sihanouk's role in the tragedy of modern Cambodian history. I just finished reading Milton Osborne's Sihanouk: Prince of Light, Prince of Darkness yesterday afternoon. I didn't expect to learn that one of the most vocal survivors of the Khmer Rouge's genocidal regime had died only a day later.

I first learned of Dith Pran when I saw him depicted by the late 吳漢 Haing S. Ngor in The Killing Fields 23 years ago. 15 years ago, I read his biography, The Death and Life of Dith Pran. But his story didn't end there. He founded The Dith Pran Holocaust Awareness Project, Inc. Here's his official biography:

"I'm a one-person crusade. I must speak for those who did not survive and for those who still suffer ... Like one of my heroes, Elie Wiesel, who alerts the world to the horrors of the Jewish holocaust, I try to awaken the world to the holocaust of Cambodia, for all tragedies have universal implications.

"Part of my life is saving life. I don't consider myself a politician or a hero. I'm a messenger. If Cambodia is to survive, she needs many voices."

- Dith Pran

May he rest in peace.

The Khmer Rouge's rule ended almost 30 years ago. But genocide continues in other parts of the world.

May the voices continue to speak.

