The "MUST" in "A Must-See Tangut Article" belongs to rhyme 43, which Gong reconstructed as -jɨj. Its Chinese source, 定 'certainly', had the following readings over time:

Old Chinese | Late Old Chinese | Early Middle Chinese | Late Middle Chinese | Tangut period NW Chinese | Modern NW Chinese
My reconstruction (generic except for NW): *deŋs | *deiŋh | *deŋh | *tɦjejŋ (?*tɦjeɲ) > NW *tɦjẽj | *thjẽj (*thjij in Gong's reconstruction) | Not reconstructions: tiə̃ in several dialects and tiŋ in Xi'an*
Coblin (1994), NW Chn only: *dėŋ | *dɨŋ | *dɨŋ > *dieŋ
Tibetan transcriptions of NW Chn: deng, de
Khotanese Brahmi transcriptions of NW Chn: thye, thyai
Kan-on stratum of Sino-Japanese: tei | n/a

Emphatic *d lost its emphasis, becoming a plain *d and then *th. I have tentatively accepted Pulleyblank's (1984, 1991) reconstruction of a *tɦ- stage between the *d and *th stages.

Coblin's (1991: 33) *ė is a "non-front ... rather high" vowel. He suggested ɨ as an alternate transcription for his 'Old NW Chinese' stage, which I would call very late NW OC. So perhaps his Sui-Tang NW Chn *dɨŋ is phonetically identical to his *dėŋ. In any case, a *[ɨ] matches Gong's ɨ in R43, but does not match the vowels in most other reconstructions of that Tangut rhyme:

Rhyme | Tibetan transcriptions | Sofronov and Kychanov 1963 | Nishida 1964 | Hashimoto 1965 | Sofronov 1968 | Huang Zhenhua 1983 | Li Fanwen 1986 | Gong 1997 | Arakawa 1999
R43 | -e, -eH | -iɐ' | -jẽ | -jejN | -ɪ̭aiC > -ɪ̭e | -ïən, -ïə̃ | -iẽ | -jɨj | -jẽ

Next: How would I reconstruct R43?

2.16.12:41: The "Sofronov 1963" form is taken from Grinstead (1972: 266). I have not seen Sofronov and Kychanov's (1963) Исследования по фонетике тангутского языка since I returned it to the University of Hawaii library in 2000. I don't know what the apostrophe represents.

2.16.12:45: Sofronov (1968 I: 137) reconstructed R43 as -jaiC > -je. However, his reconstruction of TT1355 CERTAINLY has medial -ɪ̭-, not -j- (1968 II: 308).

*2.16.12:57: These correspondences indicate that Xi'an is not descended from the NW dialects transcribed by the Tibetans and Khotanese (Coblin 1994), though it could contain substratal elements from those dialects:

Generic EMC | Tibetan transcriptions of NW LMC | Khotanese transcriptions of NW LMC | Modern Xi'an
*-eŋ | -(y)eng, -(y)e, -yi (x 1) | -(y)e, -(y)ai | -iŋ
*-en | -en, -yen, -yan | -yeṃnä (x 5), -iṃnä (x 2), -yenä (once) | earlier -iæ̃, now -iã
*-em | -yam (only 3 examples) | -yeṃmä (sole example) |

Khotanese represented nasalization, not a nasal consonant, and Khotanese ä was *[ə].

The transcriptions indicate that the NW LMC equivalent of the generic EMC *-eŋ category was something like *-(j)ẽ. I have never heard of a final nasalized vowel 'regrowing' an earlier nasal coda. Thus I conclude that the Xi'an form tiŋ must have been brought in from the east (cf. standard Mandarin [tiŋ]).

It is not clear whether tiə̃ with a nasalized vowel in other modern NW dialects is a retention or an import that has lost its final nasal.

A MUST-SEE TANGUT ARTICLE

David Boxenhorn found K. Ju. Solonin's "Tangut Chan Buddhism and Guifeng Zong-mi", published in 1988, the year that I first heard of the Tangut. On a plane to Japan, I watched the movie adaptation of Inoue Yasushi's novel 敦煌 Tonkou (Dunhuang), partly set in the Tangut Empire. I was totally unaware that I would start working on Tangut eight years later. At the time I was too interested in Japanese to devote my attention to another language. I did not even know the name 'Tangut', as the movie used the name Seika, the Japanese pronunciation of the Chinese exonym 西夏 'Western Xia'.

The twentieth anniversary of that first encounter is in five months. I wonder how I should celebrate it.

In the meantime, I'd like to examine the Tangut Buddhist terms listed in Solonin's article. The title of this post refers to the analysis of the tangraph in endnote 57 for 禪 Chan, better known in the West by its Japanese reading Zen:


TT1356 ZEN ɕjã 1.26

= all of TT1355 CERTAINLY djɨj 2.37 (Grinstead translated this as MUST) +

= EYE = left of TT2857 OBSERVE bioo 1.53

ɕjã 'Zen' is a relatively late borrowing from northwestern Chinese since it has a voiceless initial. In the immediate pre-Tangut period, this Chinese word was transcribed into Tibetan as shan, zhan, and Hzhan (Coblin 1994: 323). Jpn Zen retains the voicing of Middle Chinese *dʑien.

djɨj 'certainly' is an earlier borrowing predating the devoicing of initial obstruents in NW Chinese. Coblin (1994: 441) reconstructed its Sui/early Tang reading as *dɨŋ, which is close to Gong's reconstruction. However, I don't think the Chinese word had a nonpalatal vowel since it was probably the source of Sino-Japanese (Kan-on stratum) tei. (*dɨŋ would have become Kan-on chuu < *tiũ.) Coblin reconstructed *dėŋ and *dieŋ for the periods before and after Kan-on was borrowed. I prefer to think that the word had an *e vowel from OC all the way up to late MC, even though this vowel doesn't match Gong's ɨ.

The tangraph CERTAINLY seems to be a deformation of the sinograph 定:

- the 宀 roof has been converted into Tangut WOOD

- the horizontal strokes have been turned 90 degrees to become Tangut NOT on the left

- the remaining strokes on the bottom have been rearranged beneath WOOD

CERTAINLY is probably part of ZEN because the Chinese term for abstract meditation is 禪定 'Zen' + 'certainly' (here, 'settle' - i.e., settling the mind). The compound is redundant, as 禪 'Zen' is from Sanskrit dhyaana (or some Middle Indic form like jhyaana) 'meditation'.

The element EYE in OBSERVE oddly does not appear in the tangraph for 'eye':

TT1850 EYE mej 1.33

The EYE element appears in some ocular tangraphs but not all of them. Such inconsistencies make me question the semantocentric interpretation of tangraphy, though I don't deny the existence of semantic compounds.

The Chinese equivalent of OBSERVE, 觀, can also refer to meditation (a different kind of 'seeing'; cf. Eng insight, which does not entail literal vision). Thus ZEN (meditation) was written as MEDITATION + MEDI(TATION). Only in tangraphic math could MEDITATION x 1 equal (MEDITATION x 1) + (MEDITATION x 0.5).

IM-Pɯ-RAS-SIVE METAL SKIN

In "A Lu-dicrous Error", I reconstructed 69f, v, y-z identically:

GSR numbers | Sinographs | OC | MC
69f, v, y-z | 慮, 鑢, 金+膚 | *(Cɯ-)ras | *lɨəh

After I uploaded the post, I realized that 金+膚 'inlay' might have had a labial-initial presyllable like its phonetic 膚 OC *pɯ-ra 'skin'. (金 is the semantic element 'metal'.)

This brings up the question: if A contains B as phonetic, and B was pronounced as *X-Y, did A necessarily have a presyllable or at least a preinitial resembling *X-?

'Inlay' has another spelling, 鑢 69v, with 慮 69f 'consider' as phonetic. The latter seems to have an open-vowel root *ra if it is related to Written Tibetan bgro-ba 'consider' (proposed by Unger in Hao-ku 20) and Lushai ruatF 'consider' (Schuessler 2007: 368). So should I reconstruct *pɯ- for all members of this subseries?

*pɯ-ra 'skin'

phonetic in 金+膚 *pɯ-ras 'inlay'

also written 鑢 *pɯ-ras 'inlay'

whose phonetic is

*pɯ-ra-s 'consider'

all of the above lost *pɯ- except for 'skin', whose presyllable fused with *r:

early OC *pɯ-ra > *pra > late OC *pɨa > MC *puə

I am not sure. I wonder if the two spellings of 'inlay' represented two forms:

金+膚 *pɯ-ras

*(Cɯ)-ras with a non-labial-initial prefix, or no prefix at all

Textual analysis may reveal semantic differences between 金+膚 and 鑢 that could be attributed to different affixation. 'Inlay' could be cognate to 'skin', since inlaying is the process of decorating the surface (skin) of something (presumably metal in this case).

GET A 首 *QLUH

Here's another lu-etymology.  (The first is "廬 Lu-dicrous" but not ludicrous - I hope.)

For years, I was bothered by Sagart's proposal of a relationship between 首 OC *hluʔ 'head' and Proto-Austronesian *qulu (in Blust's reconstruction; Sagart reconstructed *qulu(h)). This is because I assumed that a uvular initial triggered emphasis, so *qulu should have become an emphatic OC *hluʔ. However, the actual OC word is nonemphatic. I have now found that my latest hypotheses can cover this case, even though I didn't have 'head' in mind when I devised them:

1. OC either inherited *qulu(h) from Proto-Sino-Austronesian (if such a language existed) or borrowed it from Proto-Austronesian.  (I prefer the latter possibility.)

2. The first vowel was unstressed and reduced (i.e., it lost its rounding):

*quluh > *qɯlúh

3. The first vowel was lost:

*qɯlúh > *qluh (no need to indicate accent if there is only one vowel left)

4. *u, being a high vowel, was automatically nonemphatic at this stage unless preceded by a nonhigh vowel.  Its immediately neighboring segments (*l and *h) were also nonemphatic.

*q was an emphatic consonant that did not match the following nonemphatic segments, so it became nonemphatic *k:

*qluh > *kluh

5. This early *kl-cluster became a voiceless lateral:

*kluh > *hluh

Cf. my proposal of Tangut lh- from earlier *kl-.

Later on, early *kV-l- became new *kl-clusters which did not develop into *hl- (except possibly in Proto-Min, whose *hl- may come from this newer *kl-).

6. At some point, original *-h merged with original *-ʔ to become glottal stop:

*...uh > *...uʔ

This change had nothing to do with any of the others, so it could have taken place at any point from step 1 through step 5: e.g., OC speakers might have borrowed PAN *-h as *-ʔ if they had no *-h in their language.
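The six steps above amount to an ordered list of rewrite rules, which can be sketched in Python. This is only an illustration: the regular expressions, the rule labels, and the `derive` helper are my own simplifications, stress is left unmarked, and the coda merger is arbitrarily placed last even though it could have applied earlier.

```python
import re

# Ordered rewrite rules for 'head'. Each pattern is a simplification made up
# for this sketch; it only has to handle the single form *quluh.
RULES = [
    ("step 2: unstressed first vowel loses its rounding", r"^qu(?=l)", "qɯ"),
    ("step 3: reduced first vowel is lost", r"^qɯ(?=l)", "q"),
    ("step 4: *q deemphasized before nonemphatic segments", r"^q(?=l)", "k"),
    ("step 5: early *kl- becomes a voiceless lateral", r"^kl", "hl"),
    ("step 6: *-h merges with glottal stop", r"h$", "ʔ"),
]

def derive(form):
    """Apply each rule once, in order, and return every intermediate stage."""
    stages = [form]
    for _label, pattern, replacement in RULES:
        form = re.sub(pattern, replacement, form)
        stages.append(form)
    return stages

stages = derive("quluh")
print(" > ".join("*" + stage for stage in stages))
```

Reordering the entries in `RULES` is a quick way to see which steps depend on each other: step 6 can move freely, but step 5 has to follow step 4.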

Two problems:

First, 首 OC *hluʔ 'head' is somehow connected to Proto-Tai *kləuʔ 'head'*, the source of Siamese เกล้า klaaw.  Chinese-like PT words tend to look like borrowings from late OC, yet I believe that *kl- was simplified to *hl- very early.  So why does this word have *kl-?  Five possibilities:

1. The word is an early loan from Chinese.

2. The word is a loan from a late OC dialect that preserved *kl-.

3. The word is a borrowing from some AN language that shifted *qul- to *kl-, just like OC.

4. The word is inherited from a common ancestor of Tai and Austronesian, and Tai shifted *qul- to *kl-, just like OC.

5. The word is a lookalike and has nothing to do with OC or PAN.

I think 3-5 are unlikely, and that the rhyme points toward 2 rather than 1.

If *kləuʔ were an early loan, it should have a simple *u, not a diphthong. Simple early OC vowels 'bent' into late OC diphthongs, which in turn became Middle Chinese glide-vowel sequences. At some point, OC *u became something like MC *ɨw (cf. how this rhyme was borrowed as Sino-Vietnamese -ưu [ɨw]). PT *əu may be an attempt to imitate a foreign *ɨw. Perhaps there was a southern late OC dialect whose word for 'head' was *klɨwʔ, combining a relatively archaic initial with an innovative final.

Another possibility which I believed in until tonight was that PT *əu reflected late OC *ɑu, which was bent from an earlier emphatic *u.  In a dialect with different synharmonic rules, 'head' might have become an emphatic word due to its emphatic initial:

*quluh > *quluh [+emph] > *kluʔ [+emph] > *klouʔ > *klɑuʔ > borrowed into PT as *kləuʔ

Second, Tangut

TT0238 ljụ 2.52 HEAD

which appears to be related to the above words for 'head', has a voiced rather than a voiceless initial.  One could argue that *ql- became *l- + tense vowel whereas *kl- became *hl-, but this is an ad hoc solution.

Moreover, Gong reconstructed a medial -j- that does not correspond to anything in the PAN, OC, or PT forms.  Perhaps this -j- could be from an *i in an earlier presyllable:

*Ci-lu > *C-lju > *l-jụ

At least the rising tone corresponds to a Chinese glottal coda, as I would expect.

*I'm going to write Proto-Tai tone C as a glottal stop since it corresponds to OC *-ʔ in Chinese loanwords, and I assume that PT speakers borrowed OC glottal stop as is.

A 廬 LU-DICROUS ERROR

Two nights ago, I wrote,

廬 'hut', homophonous with 盧 'house' in MC, could have had a different prefix in OC. It's also possible that 'hut' was the root *ra in OC and that the OC words for 'house' had prefixes attached to it.

The premise of this paragraph is incorrect. I mistakenly assumed that 廬 'hut' and 盧 'house' were homophones in Middle Chinese because they are homophones in Mandarin. They are also homophonous in Cantonese. Normally this would imply that both are from MC *lo < emphatic OC *ra. However, 廬 'hut' was actually MC *lɨə < nonemphatic OC *ra, which should have become [ly] in both languages. Apparently, Mandarin and Cantonese speakers replaced the original reading of 廬 'hut' with the reading of its phonetic 盧 'house':





廬 'hut': Cantonese lou instead of ly; Mandarin lu instead of [ly]

盧 'house': Cantonese lou; Mandarin lu





The old Mandarin reading is the expected *ly, as recorded in Zhongyuan yinyun and Menggu ziyun.

The OC forms differ only in emphasis. Here's what I think happened:

1. The original root was *ra. (My Maitreya hypothesis also allows *tra or *dra, but I have no evidence for a dental cluster.)

2. Emphasis became an automatic phonetic feature of all stressed nonhigh vowels:

*ra [-emph] > *ra [+emph]

3. A prefix with a minimal high vowel (reduced from some earlier *i, *ə, or *u?) was added to the root:

*ra 'house' > *Cɯ-ra 'hut'

4. Syllabic synharmony (cf. Proto-Slavic) deemphasized the root to match the nonemphatic prefix:

*Cɯ-ra (emphatic root) > *Cɯ-ra (nonemphatic root)

5. The prefix was lost, and the presence or absence of emphasis became unpredictable and hence phonemic:

*Cɯ-ra [Cɯra] /Cɯra/ > *ra [ra] /ra/ [-emph] 'hut'

cf. *(Cʌ-)ra [(Cʌ)ʀɑʕ] /(Cʌ)ra/ > *ra [ʀɑʕ] /ra/ [+emph] 'house'

6. In the late OC period, the emphatic distinction was replaced by a vocalic distinction after nonemphatic and emphatic vowels 'bent' in different directions:

*ra [ra] /ra/ > *rɨa [rɨa] /rɨa/ (first part of nonemphatic vowel bent upwards)

*ra [ʀɑʕ] /ra/ > *rɑ [rɑ(ʕ)] /rɑ/ (emphatic vowels bent downwards, but a, already being low, couldn't go any lower; it's unclear if emphasis remained after bent vowels became phonemic)

The 虍 'tiger' phonetic series has a mixture of emphatic and nonemphatic syllables:

GSR numbers | Sinographs | Early OC | MC
57b | 虎 琥 | *hraʔ | *xoʔ
69a-c, d, j, k, l, m, n, o, p | 虍+田, 盧, 壚, 櫨, 爐, 鑪, 籚, 纑, 顱 | *(Cʌ-)ra | *lo
69q, r-s, u | 廬, 臚, 藘 | *(Cɯ-)ra | *lɨə
69f, v, y-z | 慮, 鑢, 金+膚 | *(Cɯ-)ras | *lɨəh

(Karlgren does not use the letter w in his Grammata Serica Recensa numbers, so there is no 69w.)
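The link between presyllable type, emphasis, and the two Middle Chinese outcomes of *ra in the table can be sketched as a small lookup. This is a toy model under my own assumptions: the function names are hypothetical, the presyllable is reduced to its vowel (ɯ, ʌ, or none), and only the open syllable *ra is covered, not the *-s or *-ʔ forms.

```python
from typing import Optional

def root_is_emphatic(presyllable_vowel: Optional[str]) -> bool:
    """A stressed nonhigh root vowel is emphatic by default; only a presyllable
    with the minimal high vowel ɯ deemphasizes it through synharmony."""
    return presyllable_vowel != "ɯ"

def outcomes_of_ra(presyllable_vowel: Optional[str]) -> dict:
    """Return the late OC and MC reflexes of *(C V-)ra once the presyllable is lost."""
    if root_is_emphatic(presyllable_vowel):
        # emphatic *ra: the vowel bends downwards, giving late OC *rɑ and MC *lo
        return {"emphatic": True, "late_oc": "rɑ", "mc": "lo"}
    # nonemphatic *ra: the first part of the vowel bends upwards, giving *rɨa and MC *lɨə
    return {"emphatic": False, "late_oc": "rɨa", "mc": "lɨə"}

for vowel in (None, "ʌ", "ɯ"):
    print(vowel, outcomes_of_ra(vowel))
```

The sketch makes one consequence of the account explicit: a bare root and a root with a nonhigh-vowel presyllable are indistinguishable once the presyllable is gone, which is why the table can only write *(Cʌ-)ra.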

I think there were a lot of lost presyllables, because I think it's highly improbable that OC had ten words pronounced *ra:

69a-c 虍+田, 69d 盧 'a kind of food vessel'

69d 盧 'house'; 'hound'; 'black'

69j 壚 'black and hard soil' (cf. 69d 'black')

69k 櫨 'a kind of fruit tree'

69l 爐, 69m 鑪 'stove'

69d 盧, 69n 籚 'lance shaft'

69o 纑 'hempen threads'

69p 顱 'skull'

This list does not include words which are not attested in early OC but which might have been EOC *ra: e.g., 蘆 'reed' (cf. 69d, n 'lance shaft', which has a potential nonemphatic variant 69q 廬 *ra).

Some of these may have been *ra, whereas others may have been *ʔʌ-ra, *kʌ-ra, *tʌ-ra, *pʌ-ra, *sʌ-ra, etc.

Some presyllables may have once been the first syllables of disyllabic roots with unstressed nonhigh vowels: e.g., *Cʌ-ra < *Cará, *Cerá, *Corá.

DO HOUSES HAVE EYES? (PART 1)

Back in the 30s, Nevsky compared

TT1258 kia 1.18

which he reconstructed as *ka and glossed as дом 'house', семья 'family' with Chinese 家 'house, family' (1960 I: 186).

This word is homophonous with

TT5469 kia 1.18

which he glossed as цена 'price' and стоимость 'cost' and compared to Chn 價 'price' (1960 I: 386).

Gong's reconstruction kia is homophonous with 家 'house' and 價 'price' in his Tangut period NW Chinese reconstruction if tones are disregarded*. These words are 'Grade II' in both his Tangut reconstruction and Late Middle Chinese. In part 2, I will address the following questions:

1. Does it really make sense to apply LMC 'grade' terminology to Tangut?

2. Was grade II really correlated with *-i- in both Tangut and TPNWC?

*In Chinese, 家 'house' has a 'level' tone and 價 'price' has a 'departing' tone, whereas in Tangut, both words have a 'level' tone. These terms could be purely conventional and may not describe the actual tone contours of the words in both languages.

A 家 *KRA-ZY IDEA (PART 2)

In part 1, I asked,

Given that 豭 'pig' was a prefix-root sequence *k-ra, was its homophone 家 'house' also bimorphemic?

If I didn't know how OC *kra 'house' was written, I would see three possibilities:

1. 'house' is monomorphemic

but this would conflict with Sagart's (1999: 20) proposal of native roots without initial clusters

it could be a nonnative root, but there is no obvious known source language

2. 'house' is bimorphemic: its root is *ka (cf. 居 OC *ka(-ʔ/s) 'dwell') and it has an infix *-r- or a metathesized prefix *r-

it seems unlikely that *r-ka would be written with 豭 *k-ra, so the metathesis would have predated the graph - though other phonetic series indicate that metathesis occurred after the birth of sinography

one would expect a nominalizer affix: 'dwell' > 'where one dwells', but *r is not a nominalizer; see Sagart (1999: 111-120) for a list of its functions

3. 'house' is bimorphemic: its root is *ra and it has a concrete noun prefix *k- (see Sagart 1999: 106-107)

At present, I think the third possibility is correct. There is an OC word 盧 *ra 'house' which looks like it could be the bare root. It's also possible that 盧*ra 'house' might have had a presyllable: e.g., *kʌ-ra. There are two arguments in favor of reconstructing a velar-initial presyllable:

1. The phonetic of 盧 OC *ra 'house' is 虍, an abbreviation of 虎 OC *hraʔ 'tiger'. *hr- may come from an earlier *kr-:

External evidence: OC *hraʔ 'tiger' and its variants with *-(h)l- seem to be loans from Austroasiatic, which has *k-l words for 'tiger'; also cf. Proto-Tibeto-Burman *kla 'tiger', which is also an AA loan (Matisoff 2003: 70)

but it's not clear why AA *l corresponds to OC *r as well as *hl

Internal evidence: Jianyang Min has kho for 'tiger' instead of the expected ho

but it's not clear whether this kh- really reflects a root initial or includes a prefix: *k-hr- > kh-?

2. 盧 OC *ra 'house' is homophonous with 蘆 'reed' in Middle Chinese (both are MC *lo). I know of no attestations of 'reed' earlier than Shuowen (c. 100 AD), but it's possible that the two were also homophonous in OC.

In Jianyang and Jian'ou, 'reed' has an initial s- from Proto-Min *hl- instead of an *l- from PM *l- (Norman 1973: 233). I wonder if this *hl- originated from *kl-. (I have hypothesized that Tangut lh- may have had a similar origin: see these two posts.) While mainstream Chinese lost a *k-prefix in 'reed', pre-PM retained it:

OC *kʌ-ra 'reed'

> mainstream late OC *la > MC *lo

> pre-PM *kla > PM *hlo

(I doubt that 盧 OC *ra 'house' has survived as a colloquial word in Min, or in any modern Chinese language. But I would presume that it would have had *hl- if it had survived in PM.)

(can the surname written with 盧 be reconstructed with *hl- in PM?)

cf. OC *kra 'house'

> *kæ with prefix-root initial fusion in both mainstream late OC and PM

the PM vowel is a guess based on the mainstream form and Min forms with e (Xiamen, Chaozhou) and a (Fuzhou, Longtu) in Pulleyblank (1984: 188)

but I am not sure that the Min -a forms are not literary loans

Could 盧 OC *kʌ-ra 'house' and 家 OC *kra 'id.' be uncontracted and contracted forms of the same word?

2.11.1:21: It is also possible that 盧 OC *ra 'house' and 蘆 'reed' were not homophonous in OC. 虍-graphs have several types of MC initials which indicate a variety of OC (pre)initials:

GSR number | Sinograph | MC | OC | Gloss
57b | | *xo | *hraʔ < ?*kraʔ | tiger
69d | | *lo | *ra, *t-ra (via the 'Maitreya hypothesis'), or *Cʌ-ra | house
69f | | *lɨəh | *Cɯ-ra(ʔ)s (cognate to 知 *tɯ-re 'know'?) | think of
69g-i | | *puə | *pra < *pɯ-ra | skin
69x | | *ʈhɨə | *t-hra < *tɯ-hra (or simply *hra if nonemphatic *hr can harden to a retroflex stop, though I doubt that) | extend

家 OC *kra 'house' had a velar prefix, but 盧 'house' and 蘆 'reed' could have had nonvelar prefixes (or no prefixes at all).

( The following paragraph is wrong. See "A Lu-dicrous Error".)

廬 'hut', homophonous with 盧 'house' in MC [no!], could have had a different prefix in OC. It's also possible that 'hut' was the root *ra in OC and that the OC words for 'house' had prefixes attached to it.

