That was the best news I'm likely to hear all week. This month too. Maybe even this year.

Unfortunately I haven't seen the book yet. Google Books has no preview for it. Nonetheless I am confident that I will be impressed. I have seen Guillaume's previous work on Tangut and rGyalrong and look forward to seeing how he has built upon it. I am particularly interested to see his treatment of the following topics:

1. What shared innovations distinguish his proposed Macro-rGyalrongic group from the rest of Qiangic or - if Macro-rGyalrongic is his term for Qiangic - the rest of Sino-Tibetan?

Tonight I found the 2011 dissertation of Marielle Prins (whose rGyalrongic database I constantly use) which states on p. 21 that there is "an absence of common innovations" in Qiangic. Prins proposed that

the similarities between the Qiangic languages may be caused by diffusion rather than be genetic in nature. [...] It is more likely that the shared features of these languages are the result of contact induced structural convergence, and that the Qiangic group should be considered an areal language group rather than a group of genetically related languages. (p. 22)

I wonder what Guillaume would say about that.

I am not sure whether Prins is denying that the Qiangic languages are related at all, or if she is just rejecting Qiangic as a subgroup. The latter position need not entail a complete absence of a genetic relationship: e.g., Qiangic could consist of languages from multiple Sino-Tibetan branches which have converged. Is Prins' Qiangic like my Altaic (completely unrelated languages that have converged) or like the Balkan languages (which are from different branches of Indo-European)?

2. I assume Guillaume is still using Gong's 1997 reconstruction of Tangut which has three grades of rhymes (his III corresponds to my III and IV):

Gong's source
This site
-Ø- *-Ø- -Ø-
-Ø- + lowering of high vowel
*-r- -j-
*-j- long vowel

Gong's Tangut -j- and Old Chinese *-j- were retentions from his Proto-Sino-Tibetan *-j-. On the other hand, Guillaume does not reconstruct Old Chinese *-j-. How does he account for Gong's Tangut *-j-?

3. Pre-Tangut *a was raised and fronted ('brightened') to various degrees. I have tried to explain the multiple reflexes of *a by reconstructing presyllables with front vowels:

*CI-Ca > Ci

*CE-Ca > Cie

More recently I have hypothesized that some 'brightening' might have been conditioned by a suffix *-j: e.g.,

1749 *kwa-j > 1kwe 'hoof'

cf. Ersu nkhuɑ⁵⁵ 'id.'; more cognates at STEDT

How does Guillaume explain 'brightening' in Tangut?

4. Gong reconstructed long vowels that do not correspond to long vowels in Tangut transcriptions of Sanskrit. I am now agnostic about those vowels and reconstruct them with an abstract symbol ' to differentiate them from their much more common '-less counterparts (short vowels in Gong's reconstruction). The zero-' distinction does not seem to correspond to anything in rGyalrong; both types of Tangut vowels correspond to the same Japhug rGyalrong vowels (Jacques 2006): e.g.,

Tangut rhyme
This site
-i, -e, -o

What does Guillaume think is the source of vowel length in Gong's reconstruction? Does that length reflect a distinction lost in Japhug?

5. I reconstructed *-H as the source of the Tangut second ('rising') tone; syllables without *-H developed the Tangut first ('level') tone. This type of tonogenesis has parallels in Chinese, Tibetan, and Burmese, but not in Southern Qiang (Evans 2007). What is Guillaume's account of the origin of Tangut tones?

A *E(YE)-GRADE ROOT?

If Tangut 1new 'breast', 2niu 'to drink milk', and 2niụ 'to give milk' are from the root *√n-w that I proposed in my last post, there ideally should be other sets of -ew ~ -iu words. Unfortunately, I still haven't gotten around to looking for them, but today it did occur to me that Tangut

4684 1me 'eye'

and Old Chinese 目 *muk 'eye' might share a root *m-kʷ:

*m-e-kʷ (e-grade) > *mew > 1me

(See this series of posts on Tangut *labial-w syllables: 12.7.23 / 12.7.28 / 12.7.29).

*m-kʷ (zero-grade) > 目 *muk

This word is widespread in Sino-Tibetan (STEDT roots #33, 681, 682). It often has an -i-: e.g., Tibetan mig 'eye'. Was there an i-grade? (I have borrowed the terms e-grade and zero-grade from Indo-European studies. There is no such thing as an i-grade in Indo-European, but perhaps it existed in Sino-Tibetan.) Or was the root *mʲ-kʷ with an initial palatalized consonant that was vocalized as -i- in the zero grade in many languages?
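The grade terminology above can be made concrete with a small sketch. It assumes, purely for illustration, that a root is a pair of consonants and that a grade is the vowel (if any) placed between them. The function name `grade_form` and the hyphenated output notation are hypothetical conveniences, not an established formalism, and the i-grade line simply spells out the open question raised above.

```python
def grade_form(c1: str, c2: str, vowel: str = "") -> str:
    """Return a hyphenated reconstruction for a two-consonant root.

    An empty vowel gives the zero grade (*C1-C2); otherwise the vowel
    is inserted between the two root consonants (*C1-V-C2).
    """
    parts = [c1, vowel, c2] if vowel else [c1, c2]
    return "*" + "-".join(parts)


# Root *m-kʷ 'eye' as proposed above
print(grade_form("m", "kʷ", "e"))  # e-grade, the proposed source of Tangut 1me
print(grade_form("m", "kʷ"))       # zero-grade, the proposed source of 目 *muk
print(grade_form("m", "kʷ", "i"))  # hypothetical i-grade (an open question)
```

The sketch only generates notation; it says nothing about which grades actually existed.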

One might want to resurrect an old-fashioned reconstruction *mjuk for Old Chinese 目 *muk and view its *mj- as a reflex of *mʲ-, but I have never seen any evidence for *-j- in modern Chinese languages, and there is no trace of *-j- in Sinoxenic:

Taiwanese bak (colloq.), bok (lit.)

Cantonese muk

Mandarin mu (in earlier reconstructions, *mj- > w-, not m-)

Sino-Vietnamese mục (not *dục < *mjuk or *mʲuk)

Sino-Korean mok

Sino-Japanese moku, boku

One might also be tempted to regard 覓 Middle Chinese *mek 'to seek' as being from an e-grade *m(ʲ)ekʷ like Tangut 1me 'eye', but the earliest attestation of the word that I can find is in Yupian (c. 543 AD), so I do not know if it should be reconstructed at the Old Chinese level. It could be a later unrelated innovation that has nothing to do with m-words for 'eye'.

A *N-W WORD FAMILY?

The Tangut word

4834 2niụ < *S-nuH 'to give milk'

from my last two posts is a causative derivative of

4614 2niu < *S-nu 'to drink milk'

and I think those two words are related to


2123 1new 'breast' =

left of 3588 1new 'radish' (phonetic) +

left and center of 5275 2nɪʳ 'breast' (semantic).

Tangut *-w can be from *-k or *-w. If 2123 1new 'breast' is from *new and not *nek*, then it and the niu-words may share a root *√n-w:

root consonant 1
root consonant 2
Ø n-
Ø breast
to drink milk
to give milk

The *-w of the zero-grade root would have been pronounced as a vowel [u].

Could the grade hypothesis account for the vocalic diversity of these cognates?

Old Chinese 乳 *Cɯ-noʔ 'nipple, milk' could be from an o-grade *n-o-w. (The prefix could be *pɯ- if 孚 *phu is phonetic.)

If the above scenario is correct, are there other cases of *Cew ~ *Cu alternations in Tangut, and what is the significance of the different grades?

Li (2008: 832) regarded 5275 2nɪʳ 'breast' as a loan from Chinese. The only similar Chinese word I know of is 奶 'breast, milk'. But I cannot find any attestations of 奶 before the Qing Dynasty. If 奶 had existed in northwestern Middle Chinese, it would have been pronounced *nəjˀ, and if Tangut speakers added a *T-prefix, the resulting *T-nəjˀ would have developed into 2nɪʳ. The Old Chinese source of *nəjˀ would be *Cʌ-nəʔ which might have come from an even earlier *Cʌ-nəw-ʔ: i.e., a schwa-grade form of *√n-w. Perhaps the pre-Tangut prefix directly reflects the Old Chinese prefix if the latter had survived in the colloquial speech of the northwest during the Middle Chinese period:

OC *Tʌ-nəw-ʔ > *Tʌ-nəʔ > MC *T(ʌ)-nəjˀ > pre-Tangut *T(ʌ)-nəjˀ > Tangut 2nɪʳ

However, all that is highly speculative.

2nɪʳ could be an unrelated lookalike from a pre-Tangut source such as *Cʌ-nirH.

In any case, I cannot think of a way to derive 2nɪʳ from *√n-w within Tangut. If it is ultimately from that root, it would have to be a Chinese loanword.

*One might be tempted to reconstruct *-k since Maru has nuk⁵⁵ 'breast, milk', but Maru -k is an innovation.

A BOVINE DYNASTY? (PART 2)

Guillaume Jacques (2010) equated the second half of ngo.snuHi, the Tibetan transcription of the name of the mythical first Tangut emperor, with Tangut

2niụ < *S-nuH 'to give milk'

In 2008, Guillaume rejected the temptation to go further and equate Tibetan s- with pre-Tangut *S- (his *s-):

A bolder hypothesis would be to see in this s- a notation of the causative prefix *s- that must be reconstructed for this verb. In Tangut, nju.² [= 2niụ] is derived from nju² [= 2niu] (#4614) 'to drink milk'; the causative prefix *s- has disappeared, leaving as its only trace the 'tense voice' written with a dot below the vowel (Gong 1999). This hypothesis, however, is very improbable insofar as it would suppose that the Tibetan spelling preserves a pronunciation of Tangut older than the system reconstructed from the twelfth-century dictionaries, and therefore at least four hundred years earlier than the Tibetan texts themselves.

He regarded the Tibetan s- as "a simple orthographic artifice" since

in fourteenth-century Central Tibetan, the preinitial consonants had probably already merged, or even been lost

but I wonder if ngo.snuHi reflects a nonstandard 14th century Tangut dialect which preserved pre-Tangut *S-. Tangut may have been internally diverse, and this dialect may have been to 12th century standard Tangut what modern Cantonese (which preserves final stops) is to Tangut period northwestern Chinese (which lost final stops) or what Ladakhi (which preserves some s-clusters: examples here and here) is to 14th century central Tibetan.

The -Hi may reflect a -j which was another trait of this 14th century Tangut dialect. Summing up the differences between the two types of Tangut and their common parent pre-Tangut:

Word cow to give milk
Pre-Tangut *ŋwə(-j)-H *S-nu(-j)-H
Standard Tangut 2ŋwɪ 2niụ
Later nonstandard Tangut ŋwə or ŋ(w)o snuj
Tibetan transcription ngo snuHi

The standard dialect had a *-j suffix in 'cow' absent from the nonstandard dialect. Conversely, the nonstandard dialect had a *-j suffix in 'to give milk' absent from the standard dialect.

It is remotely possible that the -iụ of standard 2niụ could be a metathesis of *-u-j rather than a breaking of *u. But even if that were true - and I don't think it is - many or even most -iu could not be from *-u-j, as -iu regularly corresponds to Japhug rGyalrong < Proto-rGyalrong *-u (Jacques 2004: 143, 2006: 16-17). Moreover, if such a metathesis had occurred in standard Tangut, I would expect Chinese *-uj or perhaps even *-wi to correspond to Tangut -iu in very early loans. No such loans have been identified.

In any case, *-j cannot go back very far because probable cognates lack it, and nothing else leads me to believe that Tangut preserved a *-j lost elsewhere.

Next: A *n-w word family?

A BOVINE DYNASTY? (PART 1)

Guillaume Jacques (2010) equated 2339, the first syllable of the Tangut imperial surname 2ŋwɪ 1mi, with its homophone (and near-homograph) 0395 2ŋwɪ 'cow':


The shared center and right components are phonetic. The surname tangraph has 'sage' on the left, whereas 'cow' has the center of 'bear' according to Precious Rhymes of the Tangraphic Sea:


0395 2ŋwɪ 'cow' = 5605 2riẽ 'bear' + 2139 2ŋwɪ 'a kind of bird'

Without looking outside Tangut, I could reconstruct the pre-Tangut source of 2ŋwɪ 'cow' as

*Cʌ-ŋwiH (if the -w- is original) or

*Pʌ-ŋiH (if the -w- is from a presyllable)

with a low presyllabic vowel to condition the lowering of *i. However, it is unlikely that the root vowel was once *i.* Probable external cognates such as Old Chinese 牛*ŋʷə 'cow' and Written Burmese nvāḥ (< *ŋwaH?*; many more here) point to a nonfront vowel. This word was borrowed into southwestern Tai as *ŋuaA 'ox'**.

I used to reconstruct the rhyme of 2ŋwɪ 'cow' as -əi. Could 2ŋwɪ or 2ŋwəi be from *ŋʷə-i-H?

The name of the first Tangut emperor was transcribed as ngo.snuHi in Tibetan. Guillaume Jacques identified that as Tangut

0395 4834 2ŋwɪ 2niụ 'the cow gives milk /  [someone] fed milk by the cow'

whose meaning was rendered in Tibetan as

ba-la Hthung-ba

cow-DAT milk drink-NMLZ

'he who drinks milk from the cow'

ngo might be a transcription of a nonstandard Tangut *ŋwə without my proposed suffix *-i. There is no character for schwa in the Tibetan script, so Tibetan o might represent a schwa. It is also possible that a pre-Tangut *ŋwə could have become *ŋo in that dialect (whereas *-wə did not fuse into -o in standard Tangut).

*Was *ŋw- > *nw- a regular change in Proto-Lolo-Burmese? I can't remember if my unpublished reconstruction from twenty years ago had either cluster. Matisoff's (1972) reconstruction has only one word with *ŋ(w)- which has a variant initial *mw- (not *nw-!).

**Do variants with w/v- and h- reflect different sources of borrowing? See Gedney's list of forms in Hudak 2008: 95. Unfortunately I could not find the word in Pittayaporn's 2009 dissertation on Proto-Tai. It may have been excluded because it could not be reconstructed at the Proto-Tai level.

WAS THE TANGUT IMPERIAL FAMILY THE MI OF WEI?

Two years ago I saw Guillaume Jacques' derivation of the Tangut imperial surname

2339 1903 2ŋwɪ 1mi

from a hypothetical homophonous phrase

0395 4542 2ŋwɪ 1mi 'the cow feeds [someone]' / 'fed by the cow'

At the end of last month, I saw another derivation but couldn't remember what it was. I found it last night in Nishida (2010: 233):

It is very probable that the second syllable, miɦ (level 11), of ŋʷwɪ-miɦ (level 11) meaning "imperial family" was one of the corresponding cases of the [Tangut autonym] Mi. Its meaning might have been the Mi of Wei 魏.

The Tangut imperial family claimed descent from the Tuoba clan of the Northern Wei. Although this etymology is initially appealing, it has phonological problems.

First, the Tangut called the Wei

4962 2vɪ or 5574 2vɨi

rather than 2ŋwɪ. v- in those transcriptions reflects the loss of *ŋ- in the Tangut period northwestern Chinese pronunciation of 魏. Perhaps the imperial surname contains an earlier borrowing of 魏 preserving its nasal initial.

Second, the Tangut autonym

2344 2mi < *miH

has a 'rising' tone, whereas the second syllable of the surname has a 'level' tone. This tonal difference does not necessarily rule out a connection between the two names. The 'rising' tone of the autonym may be a reflex of a final glottal suffix *-H absent from the 1mi < *mi of the surname. Both 2mi and 1mi may be cognate to Tibetan mi 'person'.

A SILKEN SOURCE FOR THE RED RADICAL?

I'm surprised I was able to account for all uses of the 'red' radical

(Boxenhorn code: qie; Nishida radical 226)

in a straightforward manner in my last post. It means 'red' and/or is phonetic in all but one case (E):

A. n-phonetic B. 'red'
E. 1tʂhɨĩ 'Chen' (a family on the land of the 2nie family?) C. xŨ-phonetic in < B. 1xʊ̃ 'red' D. -iã-phonetic in < B. 2ʔiã (1st syl of 'rouge')

I am normally at a loss to explain the function of a component in one or more tangraphs containing it. For instance, I have no idea what

the right side of

1671 1nie 'red'

is doing. It is in 65 other tangraphs. I think it is phonetic in

1674 2nie (second syllable of 2mi 2nie 'younger sister')

1809 2nie (second syllable of 1ɣɤə 2nie 'few')

which are near-homophones of 1nie 'red'. But what is it doing in, say,

3528 2tho' 'to harm, endanger'

whose analysis is unknown? Did red signify danger?

Going back to the other half of 1671 'red', I think

might be derived from the seal form of the top half (幺) of the Chinese 'silk' radical 糸 on the left side of Chinese 紅 'red'. The vertical line at the top of the Chinese 'silk' radical corresponds to the horizontal line of the Tangut 'red' radical, and the two circles correspond to the two

of the Tangut 'red' radical. If the admittedly vague similarity between the two radicals is just pareidolia on my part, did the Tangut simply draw a random line pattern and declare it to be 'red' and/or nie?

A RED RADICAL

Half of the nie-tangraphs (Tangut characters) from my previous entry contained the element

(Boxenhorn code: qie; Nishida radical 226)

that appears in twenty other tangraphs. Here are all 31 qie-tangraphs. Asterisks indicate words which are only in dictionaries to the best of my knowledge.

Class Tangraphs Li Fanwen 2008 # Reading Gloss
A1 0529
1nie 2nd syl of 'be stifled to death'*
2nd syl of 'servant'
to try
2nie surname syl
2nd syl of 'kind of insect'*
2nd syl of 'kind of grass'
2nd syl of 'chin'
2nd syl of 'colored silk'*
2nd syl of 'to hide'*, 'to turn around'
bird name syl
A3 =+ 0363 1nʊ transcription
A4 =+ 1235 1nĩ red

red jade necklace*
red sand
red (Chn)
1st syl of 'rouge' (Chn)
red soil*
red wood*
2nd syl of 'rouge' (Chn)
C =+ 1741 1xõ transcription
D =+ 2049 2siã
E =+ 0298 1tʂhɨĩ surname 陳 Chen

The 31 fall into five categories:

A. qie as n(ie)-phonetic: 13 tangraphs

A1. qie as 1nie-phonetic: 4 tangraphs

A2. qie as 2nie-phonetic: 7 tangraphs

A3. qie as n-phonetic: 1 fanqie tangraph (2nie + 1tʊ = 1nʊ)

A4. qie as n-phonetic / semantic for 'red' (see category B below): 1 fanqie tangraph (1nie 'red' + 1ʔĩ = 1nĩ 'red')

B. qie as semantic abbreviation of 'red': 15 tangraphs

C. qie as xŨ-phonetic: 1 tangraph

Cf. 1402 1xʊ̃ 'red' (Chinese loanword) in category B.

The Tangraphic Sea derived 1741 1xõ from 1671 1nie 'red' (whose Chinese translation was 紅 *xʊ̃) and 3682, first syllable of 2mə 1ʔɤõ 'merit' (which could be translated into Chinese as 勳 *xiũ)

D. qie as -iã-phonetic: 1 fanqie tangraph (1si + 2ʔiã = 2siã)

E. qie as abbreviation of a surname Ne or a surname containing the syllable ne: 1 tangraph
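As a quick consistency check on the tally above, the following sketch sums the counts exactly as listed. The dictionary names are hypothetical; the numbers are copied from the category list and nothing else is assumed.

```python
# Tangraph counts per category, copied from the list above
subcategories_a = {"A1": 4, "A2": 7, "A3": 1, "A4": 1}
categories = {
    "A": sum(subcategories_a.values()),  # n(ie)-phonetic
    "B": 15,                             # semantic abbreviation of 'red'
    "C": 1,                              # xŨ-phonetic
    "D": 1,                              # -iã-phonetic
    "E": 1,                              # surname abbreviation
}

total = sum(categories.values())
print(categories["A"], total)
```

The subcategories of A add up to the 13 stated for A, and the five categories together account for all 31 qie-tangraphs.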

Was a Chen family in the Tangut Empire on the land of a Tangut family whose name contained Ne? The Tangraphic Sea derived the right half of 0298 from 2107 1tsɪʳ 'earth'.

PROXIMATE PRONUNCIATION

Yesterday I wrote about the transcription evidence and potential cognates of Tangut

1nie 'relative'

which originally may have meant 'near' (relatives being the people nearest to oneself).

Fanqie spellings expressing the pronunciation of tangraphs (Tangut characters) in terms of the initials and rhymes of other tangraphs are only available for a little over half of the 6,000+ known tangraphs. Unfortunately, no fanqie are known for either 1nie or its second ('rising') tone counterpart 2nie.

Usually first ('level') tone tangraphs have fanqie in the surviving first volume of the Tangraphic Sea, but that volume is missing some pages including those which probably contained tangraphs for 1nie and other syllables with the 36th rhyme of the first tone.

The fanqie for most second tone tangraphs is probably in the lost second volume of the Tangraphic Sea. (Some second tone fanqie are in the surviving third volume Mixed Categories.)

Homophones lists 22 characters in a homophone group mixing 1nie and 2nie. All but one (0548) can also be found in Precious Rhymes of the Tangraphic Sea which has no fanqie for any of them.

Homophones Tangraph Li Fanwen number Reading Gloss Tangraphic Sea Precious Rhymes
13B31 1723 2nie second syllable of 2ŋwəʳ 2nie 'colored silk' (only in dictionaries?) in missing second volume?
13B32 1671 1nie red in missing pages of first volume?
13B33 0547 2nie the surname Ne (occurs as a first or second syllable in disyllabic surnames but unclear if it can occur by itself); transcription character in missing second volume?
13B34 1858 2nie second syllable of 1lɨa 2nie 'to hide' (only in dictionaries?; the first half can mean 'to hide' by itself) and 1gie 2nie 'to turn oneself around, look around; the other way around'
13B35 0593 2nie second syllable of 2khwa 2nie 'a kind of grass'
13B36 0548 2nie second syllable of 1lhə 2nie 'a kind of insect' (only in dictionaries?) not in either book
13B37 1678 2nie second syllable of 2miə 2nie 'chin' in missing second volume?
13B38 1774 1nie second syllable of 2nieʳ 1nie 'servant' in missing pages of first volume?
13B41 0529 1nie second syllable of 1nie' 1nie 'to be stifled to death' (only in dictionaries?)
13B42 1732 1nie first syllable of the surname 1nie 1xɤu
13B43 0806 2nie second syllable of 2mɪ 2nie 'wind' (only in dictionaries?; the first half by itself is the name of the 'wind' trigram ☴) in missing second volume?
13B44 1674 2nie second syllable of 2mi 2nie 'younger sister' ('ritual language'? only in dictionaries?)
13B45 1809 2nie second syllable of 1ɣɤə 2nie 'few'
13B46 1926 2nie in the past
13B47 0213 1nie relative in missing pages of first volume?
13B48 2231 1nie to try, second half of 1lɨe 1nie 'emissary' (the first syllable is 'to serve' by itself), first half of 1nie 2ʔwiəʳ 'writing on silk [cf. 1723 above with a different tone], written correspondence' (the second syllable is 'writing' by itself)
13B51 2239 2nie second half of 2biu 2nie 'nightingale', first half of 2nie 2no 'cuckoo, oriole' in missing second volume?
13B52 3671 1nie first syllable of 1nie 2riaʳ 'father' (only in dictionaries?; the second half needs to be combined with either 1nie- for 'father' or -2si for 'mother') in missing pages of first volume?
13B53 5147 2nie first syllable of 2nie 1ɣa 'dog' (only in dictionaries?) in missing second volume?
13B54 3846 2nie optative prefix (< 'downward'), you (is this pronoun only in dictionaries?)
13B55 3817 2nie to present a gift (only in dictionaries?)
13B56 0638 2nie to compel, drive

Since Tangut dictionaries - both ancient and modern - are character-based, one might think 22 characters stood for 22 monosyllabic words pronounced nie, but in fact there are only three monosyllabic 1nie words and only three or four monosyllabic 2nie words:

1nie: 1. 'red', 2. 'relative', 3. 'to try'

2nie: 1. 'the surname Ne' (? - unsure if it can occur by itself), 2. 'in the past', 3. 'to present a gift', 4. 'to compel, drive'

8.21.2:42: It is tempting to try to derive the polysyllabic words from the monosyllabic nie-words, particularly since some of them are combinations of nie with monosyllabic words: e.g.,

1995 0806 2mɪ 2nie 'wind'

whose first syllable is also the Tangut name of the 'wind' trigram ☴. Could that word literally be 'red wind'? The trouble with that case and others is that 'red' is 1nie, not 2nie. One could try to salvage the etymology by proposing that 2nie in compounds is from 'red' plus an *-H suffix conditioning the second tone. But it is dangerous to build speculations atop speculations. Moreover, in this particular case, perhaps 2mɪ is an abbreviation of a monomorphemic, disyllabic 2mɪ 2nie 'wind'.

PROXIMATE PEOPLE

Today I saw Tibetan nye 'near' which brought to mind a possible Tangut cognate:

0213 1nie 'relative' (i.e., one's near relations; I covered related characters here)

This word was transcribed in Tangut period northwestern Chinese as 你 *ni. No Tibetan transcription is known, but its near-homophone

3830 2nie 'king'

with a different tone was transcribed in Tibetan as nye(H) and ne(H). (Tibetan ཉ ny- [ɲ] and ན n- are different letters.)

I normally derive Tangut rhyme 37 -ie from pre-Tangut *Cɯ-e:

*Cɯ-ne > *Cɯ-nie > nie

The -i- is a trace of the lost presyllabic high vowel *ɯ.
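The two-step derivation above can be written out as ordered rules. This is only a sketch under stated assumptions: forms are plain strings of the shape presyllable-root, the function name `derive_rhyme_ie` is hypothetical, and the two rules are a simplified reading of the derivation, not a full account of Tangut sound change.

```python
def derive_rhyme_ie(pre_tangut: str) -> list[str]:
    """Apply two ordered changes to a form like 'Cɯ-ne'.

    1. A high presyllabic vowel ɯ leaves an -i- before root-final e.
    2. The presyllable is then lost.
    """
    presyllable, root = pre_tangut.split("-", 1)
    stages = ["*" + pre_tangut]
    # Rule 1: the presyllabic high vowel conditions the medial -i-
    if presyllable.endswith("ɯ") and root.endswith("e") and not root.endswith("ie"):
        root = root[:-1] + "ie"
        stages.append("*" + presyllable + "-" + root)
    # Rule 2: loss of the presyllable leaves the attested syllable
    stages.append(root)
    return stages


print(" > ".join(derive_rhyme_ie("Cɯ-ne")))
```

Ordering matters here: if the presyllable were dropped first, nothing would remain to condition the -i-.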

However, Tibetan nye makes me wonder if Tangut -i- in 'relative' is primary rather than secondary:

*ɲe > nie = [ɲe]? [nje]?

Similar Qiangic and rGyalrongic words for 'near' (see sections 3.2 and 3.3 of this list) have palatal ȵ- (= ɲ) or dental n-. (See items #1757-1758 here for very different rGyalrong words.)

Possible Old Chinese cognates have n-:

*Cɯ-ne(j)ʔ (< *n-e-j + -ʔ?) 'near'

*Tnik (< *T- + √n-j + -k?) 'near, be familiar with'

Could those words contain e-grade and zero-grade forms of a root *n-j? Could the root-initial consonant have been *ɲ-? Were *Cɯ- and *T- the same prefix with and without a presyllabic vowel? Were *-ʔ and *-k variants of the same suffix?

SREDNJI KITAJSKI JĘZYK

I felt uncomfortable about mentioning Middle Chinese reconstructions in my last post because they may give the false impression that Chinese in the past was more homogeneous than it actually was.

It occurred to me last night that Middle Chinese is about as real as Interslavic, my favorite constructed language. If future linguists knew nothing about Russian, Polish, Serbo-Croatian, etc. - i.e., specific actual languages - Interslavic would have to do for comparisons with other European languages. The title of this post is Interslavic for 'Middle Chinese language'.

I suspect that diversity within Middle Chinese was like that between Slavic languages today. So my *kon for 昆 in this table is to real Middle Chinese forms what Interslavic koń 'horse' is to these modern Slavic words: similar but not necessarily identical. Interslavic koń happens to match the actual Polish word for horse, but its vowel is very different from that of Ukrainian кінь [kinʲ] 'horse', and it is completely different from Russian лошадь [loʂətʲ] 'horse', a loan from Turkic. If a language borrowed a word kin 'horse' from Ukrainian, it would be strange to say that kin is from a 'Slavic' koń. Yet how many would blink if I wrote that Sino-Vietnamese mã is a loan from 'Middle Chinese' 馬 *mɤaˀ? (The actual source of mã was more like *ma with a 'rising' tone in a southern late Tang variety of Middle Chinese.)

'Middle Chinese' may sound specific, but it's actually a generic term like 'Middle Indic' which could refer to Pali, Gandhari, Ardhamagadhi, etc.

Unfortunately there is no analogous established terminology for specific varieties of Middle Chinese. It is easier to type a simple name like Pali than a phrase like 'Tangut period northwestern Chinese' (TPNWC), the dialect in the Timely Pearl in the Palm that is also the source of Chinese loans in Tangut.

Tonight I momentarily considered renaming TPNWC 'Zaric' after Tangut

1ɮar 'Chinese'

but that term would make no sense to those who didn't know the Tangut word. Although my older term is more tedious, it is also more transparent.

One could think of Middle Chinese reconstructions as being as open to interpretation as Interslavic pronunciation: e.g., the ę of język 'language' in the post title could be [ʲa] ~ [ʲɛ] ~ [ʲɛ̃] ~ [ʲɔ̃] ~ [ɛ]; [ʲæ] is a suggested average.

That description of Interslavic states that "[a]ccentuation is free." Hence there is no way that one could figure out Serbo-Croatian tones from Interslavic: e.g., the falling tone of konj 'horse'. (Most of Slavic lacks tones, so Interslavic also lacks them.)

The situation is a bit different with Middle Chinese tones. The Old Chinese sources of Middle Chinese tones are known (e.g., *-ʔ in 'horse'), but their phonetic realizations are not. It is likely that *-ʔ left a trace as glottalization which disappeared at different times in different places (and is still present in today's Xiaoyi), and pitches once associated with glottalization became phonemic.

Although 'rising', the traditional name of the tone category for 'horse', suggests the tone was rising, that may not have been true in all Middle Chinese varieties, and it is certainly not true today: e.g., in Taiwanese, 馬 'horse' has a high falling tone (indicated with an acute accent in romanization!). See Sagart (1998) for more on Chinese tonal history.

I have similarly used an acute accent to indicate the 'rising' tone in Middle Chinese varieties after glottalization was lost, but that accent may imply a rising or even high tone though I am actually agnostic about its contour, so I am reluctant to use it now. Maybe it's time to dust off my tone codes.

All of the above also applies to Old Chinese except for the part about tones since Old Chinese didn't have any. Old Chinese was not uniform before the mid-first millennium AD. In fact, 揚雄 Yang Xiong (53 BC-18 AD) wrote the first Chinese dialect dictionary, 方言 Fangyan 'Areal Speech', toward the end of the Old Chinese period. I think that oddities in Chinese loans in Vietnamese and Tai may in part reflect Old Chinese diversity that has been lost. Proto-Indo-European must also have been diverse.

Speaking of Proto-Indo-European, I don't understand how Proto-Indo-European *ḱem- 'hornless' became Proto-Slavic *konjь; why didn't PIE *ḱ- become PS *s- (cf. Sanskrit śama- 'domestic'), and why didn't PIE *-m- become PS *-m-?

WHEN B IS SPELLED G

If Vietnamese mắm [mam] < *ɓamʔ 'salting' could be written with the velar-initial phonetic 禁 cấm [kəm] (see my last two entries), could labial-initial syllables be written with velar-initial phonetics in sawndip, the traditional Zhuang script, as well? I looked through Sawndip sawdenj [Traditional Zhuang script dictionary] which I admit is a problematic source* and found the following characters with velar-initial phonetics for Zhuang [p]-initial syllables:

Standard Zhuang reading IPA Semantic component Phonetic (?) component and Middle Chinese reading Zhuang reading of phonetic (?) component Meaning
boenq pon³⁵ 土 'earth' *kon (> some northern Pinghua readings with khw-; the aspiration is irregular) goen [kon³⁵] dust
bomx poːm⁴² 足 'foot' *kuŋ 'bow' (archery) no reading for 弓 in isolation; 弓 is a phonetic in goem [kom³⁵] to crouch
byaij pjaːj⁵⁵ *ŋwajʰ 'outside' (> three northern Pinghua dialects have m-!) vaih [waːj³³] to walk
byangj pjaːŋ⁵⁵ 強 'strong' *kɔŋ (> early Mandarin *khjaŋ) gangj [kaːŋ⁵⁵] hot pain?
byoq pjo³⁵ 火 'fire' *khɨak (> early Mandarin *khjaw, northern Pinghua readings khio, khyo) cog [ɕoːk³³] to bake
byouz pjow³¹ *gu caeuz [ɕaw³¹] to boil
byuk pjuk³⁵ 虫 'bug' *kok goek [kok³⁵] white ant
byuz pju³¹ 瓜 'melon' *ɣo (> some northern Pinghua readings with f-: e.g., Guilin [Yanshan zhuyuan dialect] fu) no reading for 乎 in isolation; 乎 is a phonetic for fouj [fow⁵⁵], fuj [fu⁵⁵], hued [hut³³], huz [hu³¹], ruz [ɣu³¹], and youq [jow³⁵] gourd
bywngj pjɯŋ⁵⁵ 足 'foot' *khəŋˀ haengj [haŋ⁵⁵] verb suffix
扌 'hand'

(8.18.0:54: Added Zhuang reading of phonetic component column. The title of this post should make more sense now. I was referring to how Zhuang b-syllables were written with g-phonetic components.)

At least two phonetic components may actually be semantic: e.g.,

*kuŋ 'bow' (archery) could refer to bending down in 足+弓 bomx 'to crouch'

*ŋwajʰ 'outside' could refer to going outside in 足+外 byaij 'to walk'

or it could have been chosen for a labial or labiodental initial ([w]? [v]? [m]?) close to by- [pj]

*ɣo may be a reference to its homophone, the first syllable of 葫蘆 'gourd'.

The other phonetic components are baffling. If they are really phonetics, were they chosen only for their rhymes? Or did they have labial-initial readings in local varieties of Chinese?

*Holm (2011: 2) pointed out that

The Sawndip sawdenj is a useful compendium, but it provides no information about where the dialect forms come from, so it is impossible to see any patterns in geographic variation from this source.

Moreover, all the readings in Sawndip sawdenj are in standard Zhuang, even though the characters could be from all over the Zhuang-speaking world. Hence I presume many actual readings have been converted into hypothetical standard Zhuang equivalents. Such readings are strictly speaking not readings at all, since no literate native speaker would have ever used those hypothetical readings. Nonetheless I hope those hypothetical readings are close enough to the originals for my purposes here: e.g., b-[p] readings are most likely from nonstandard [p]-readings.

WHEN B IS SPELLED C

In my last entry, I wrote about three types of nom characters for Vietnamese mắm 'salting':

1. m-phonetic characters: e.g., 𩻐 = 魚 'fish' + right of 鎫 m- 'head ornament for a horse' (Sino-Vietnamese reading unknown but presumably similar to its nom reading mâm)

2. c-phonetic characters: e.g., 鹵 'salt' + 禁 cấm 'to forbid'

3. b-phonetic characters: e.g., 酉 'liquor' + 稟 bẩm 'to receive from above'

The third type of mắm-character must have been devised at a stage when 'salting' had an initial closer to the initial of 稟 (i.e., stage 1 or 2 below):



b [ɓ] m [m]

The first type of mắm-character must date from stage 2 or 3.

The second type of mắm-character continues to baffle me. If I didn't know anything about Vietnamese or Chinese, I might propose a solution involving a labiovelar, but labiovelars did not exist in earlier Vietnamese, and 禁 never had a labiovelar or a velar-labial cluster *kw- in Chinese. Did 'salting' once have a cluster *kɓ- in Vietnamese? There is no support for *k- in other Vietic languages.

I looked for other cases of c-phonetics for syllables with *ɓ- and other labial initials in the Nom Foundation's Kiều index and only found a single example: biếng khuây 'unforgettable' was written as 更亏 in line 246 of the 1872 version of Kiều. 更 is normally read as canh 'watch of the night' and cánh 'more'. Khuây is 'forget', and I doubt 更 has semantic relevance in 更亏: why write 'unforgettable' as 'watch forget' or 'more forget'? 更更 canh cánh 'obsessed' appears earlier in the line, so I wonder if 更 for biếng later in the line is an accidental substitute for the b-phonetic character that appeared in earlier editions.

FORBIDDEN SALT

Last night I mentioned two examples of phonetics representing Vietnamese syllables with different onsets in the nom script. Here's a third.

As Vietnamese cuisine becomes more popular, more Americans are becoming familiar with nước mắm 'fish sauce'. Nước is literally 'water' and mắm is 'salting'. I do not know of any Sino-Vietnamese reading like mắm. The only similar Middle Chinese syllable was 鋄/鎫 *muamˀ 'head ornament for a horse'. I cannot find a Sino-Vietnamese (SV) reading for that rare character; in theory it should have been *vãm or, if it was borrowed earlier, *muộm. 鎫 was used as a phonetic symbol for the native Vietnamese word mâm 'tray', so its SV reading must have contained the consonant sequence m-m. Variations of its right side were used as a phonetic in nom characters for mâm 'tray' and mắm 'salting': e.g., 𩻐 mắm (with 魚 'fish' instead of 金 'metal' on the left side). ( also lists a similar character with the codepoint U+29DE0 which may be a typo for U+29ED0, the codepoint for 𩻐. U+29DE0 is for a different character 𩷠 from a source in Taiwan. I cannot find the other 𩻐-like nom character in Unicode.)

There are two other types of characters for mắm which aren't in Unicode yet, so I have to describe them in terms of their semantic and phonetic components:

variations of 鹵 'salt' + 禁 cấm 'to forbid' (the latter is also a phonetic loan for the native Vietnamese word bấm 'to press')

酉 'liquor' + variations of 稟 bẩm 'to receive from above' (more on 稟 here)

Why was mắm written with a b-phonetic? Were the latter two types of characters devised when mắm still had an initial implosive *ɓ-? (Many other Vietic languages still have b- in this word: e.g. Sơn La Muong bam³. Is their b- implosive?) And why was bấm 'to press' written with a c-phonetic 禁?

8.16.1:32: I suspect Proto-Vietic *ɓamʔ 'salting' (as reconstructed in the SEAlang Mon-Khmer Languages Project database) is a Vietic innovation. I have not found any potential true cognates in other languages in that database. Halang măm 'salt fish' and Mnong măm 'salted fish' are probably Vietnamese loans in those Bahnaric languages, and Bolyu mjaːm¹³ 'salt' may be a lookalike; its -j- matches nothing in Vietic.

BIT-TẢI-R ROOF

In my last entry, I couldn't explain why 宰 was read as tể instead of tải in Vietnamese. Today I checked various nom dictionaries and found the reading tải in Vũ Văn Kính's Bảng tra chữ nôm sau thể kỷ XVII (18, 19, 20) (Table for Finding Nom Characters after the 17th Century (18, 19, 20)). Unfortunately the book did not provide a context for tải, so I don't know if that syllable was a now-extinct Sino-Vietnamese reading or (part of) a native word. I also don't know if that reading predates the 18th century. My guess is that the taboo substitution occurred in the 18th century (hence the inclusion of the original reading tải in Vũ's book), and that most works only include the later altered reading tể and its spinoffs tẻ and tỉa. (It would be unusual for an -ai character to be used to write -e and -ia syllables, so I assume the latter two readings postdate tể.)

The Nom Foundation's Kiều index lists yet another reading in line 2873 of the 1870 version: tề. However, page 206 of its romanized text of that version has the usual reading tể.

I just realized that although characters could be used as nom phonetic symbols and components without regard for tone (e.g., 宰 tể for tề in Kiều?), all taboo deformations I have seen retained tones along with onsets*. Final consonants could be slightly changed: e.g., hoàng [hwaːŋ] became huỳnh [hwiɲ]. The hierarchies of 'loyalty' for nom phonetics and taboo deformation were slightly different:

nom phonetics: vowel quality > onsets, codas, tones**

taboo deformation: onsets, tones > codas > vowel quality

Nom phonetics were generally used for syllables with similar vowels: e.g., 宰 tể could not represent a syllable like tổ even though it had the same onset, tone, and zero coda. However, native Vietnamese words had more onsets and onset-coda sequences than Sino-Vietnamese, so there was more freedom to use phonetics to represent syllables with different onsets or codas: e.g.,

la as a phonetic with semantic 出 xuất 'to go out' in 𠚢 ra 'to go out' (there are no r-syllables in Sino-Vietnamese)

n as a phonetic with semantic 口 khẩu 'mouth' in 𠵘 mồm 'mouth' (there is no syllable môm in Sino-Vietnamese)

I'll look at another example tomorrow. Note how tone is disregarded in the latter case. 羅 la can represent là 'to be' with a different tone.

*The spellings of initial onsets could change because of quốc ngữ spelling conventions: e.g., kiểu [kiəw] became cảo [kaːw].

**8.15.1:30: Vietnamese tones historically had 3 x 2 categories. Each tone name exemplifies its tone.

voice quality                           *plain   *creaky   *breathy
*voiceless initial > *upper register    ngang    sắc       hỏi
*voiced initial > *lower register       huyền    nặng      ngã

The reconstructed category names no longer necessarily describe the modern tones: e.g., huyền is breathy, ngã is not breathy and is higher than hỏi, etc.

There seems to be a hierarchy of tonal 'loyalty' in nom:

1. Retention of original tone in phonetic symbol/component.

2. Use of phonetic for syllable with opposite-*register tone: e.g., 羅 la for là above.

3. Use of nonplain tone phonetic for syllable with any other *nonplain tone: e.g., 禮 lễ for lấy, lạy, and rẻ in Kiều. (Lễ 'ceremony' and lạy 'to bow' are in fact the same Chinese word borrowed into Vietnamese during two different periods.)

4. Use of phonetic for syllable with any tone: e.g., 永 vĩnh for the *plain tone syllable vành as well as *nonplain vắng and vạnh in Kiều. (Is there an example of a phonetic used for syllables with all six tones?)
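The four levels above can be sketched as a small classifier over pairs of tones. This is only an illustration: the function name `loyalty_level` is hypothetical, and level 2 is interpreted here as "same *voice quality, opposite *register", which is an assumption based on the single example 羅 la for là. Note that the levels in the list describe how freely a phonetic is used overall, whereas the sketch classifies one phonetic-target pair at a time, so a pair like lễ for rẻ comes out as level 2 even though 禮 as a whole is a level 3 phonetic.

```python
import unicodedata


def _nfc(s):
    # Normalize so that precomposed and decomposed diacritics compare equal.
    return unicodedata.normalize("NFC", s)


# (register, voice quality) of each tone, following the 3 x 2 table of categories.
TONES = {
    _nfc("ngang"): ("upper", "plain"),
    _nfc("sắc"): ("upper", "creaky"),
    _nfc("hỏi"): ("upper", "breathy"),
    _nfc("huyền"): ("lower", "plain"),
    _nfc("nặng"): ("lower", "creaky"),
    _nfc("ngã"): ("lower", "breathy"),
}


def loyalty_level(phonetic_tone, target_tone):
    """Return 1 (most loyal) to 4 (least loyal) for one phonetic/target pair."""
    p_register, p_quality = TONES[_nfc(phonetic_tone)]
    t_register, t_quality = TONES[_nfc(target_tone)]
    if (p_register, p_quality) == (t_register, t_quality):
        return 1  # original tone retained
    if p_quality == t_quality:
        return 2  # same *voice quality, opposite *register (assumed reading of level 2)
    if p_quality != "plain" and t_quality != "plain":
        return 3  # nonplain phonetic for a different nonplain tone
    return 4  # plain vs. nonplain mismatch


# Examples from the text: la -> là, lễ -> lấy, vĩnh -> vành
print(loyalty_level("ngang", "huyền"))
print(loyalty_level("ngã", "sắc"))
print(loyalty_level("ngã", "huyền"))
```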

The most 'loyal' phonetics have readings ending in stops. Stop-final syllables in Vietnamese can only have *creaky tones: e.g., 越 việt also represented the *creaky-tone syllables vượt, vết, and vớt, but could not represent *noncreaky tone syllables like *vVt, *vV̀t, *vV̉t, or *vṼt which were impossible in Vietnamese.

BIT-TỂ-R ROOF

I regret not paying attention to Vietnamese until I started reading Bernhard Karlgren's books after my first semester of graduate school over twenty years ago. I remember flipping through a Vietnamese-English dictionary and being astounded by all the words I could recognize because they were Chinese borrowings. (Of course the native words were totally alien to me, as I had never studied an Austroasiatic language before, much less one that was closely related to Vietnamese like a variety of Muong. I didn't even know what Muong was!) I soon learned the sound correspondences between Sino-Vietnamese and what I was more familiar with (Mandarin, Cantonese, Sino-Japanese, and Sino-Korean). Since then I've committed many Sino-Vietnamese readings to memory and can guess still others using those correspondences. When I look at Vietnamese, I can usually 'see' the characters for Chinese loans. However, there are 'blind spots': i.e., exceptional readings.

One such reading that I can't explain is tể instead of the regular reading *tải for 宰 'minister'. (The title refers to the components of the character: 宀 miên 'roof' and 辛 tân 'bitter'. Why those add up to 'minister' is a topic for another time.) 宰 belongs to the Middle Chinese 海 *-əj (> later *-aj) rhyme category which usually corresponds to three rhymes in Sino-Vietnamese:

Old borrowings: -ơi [əːj]

Later borrowings with nonlabial initials: -ai [aj]

Later borrowings with labial initials: -ôi [oj]

tể is the only instance of [e] in this category. It is probably not an archaism from Old Chinese since Middle Chinese *-əj goes back to *-ə, not *-e. I do not know of any cases of *-ai becoming -ê in Vietnamese.
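The three regular correspondences for this rhyme category can be sketched as a lookup, which makes it easy to see why tể stands out. This is a minimal sketch, not a full model of Sino-Vietnamese: the function names and the layer labels "old" and "late" are hypothetical, and the set of labial onsets (b, m, ph, v in quốc ngữ spelling) is an assumption made for the example.

```python
# Quốc ngữ onsets treated as labial here (an assumption for illustration).
LABIAL_ONSETS = {"b", "m", "ph", "v"}


def expected_sv_rhyme(layer, onset):
    """Regular Sino-Vietnamese rhyme for a Middle Chinese 海 *-əj syllable."""
    if layer == "old":
        return "ơi"
    if layer == "late":
        return "ôi" if onset in LABIAL_ONSETS else "ai"
    raise ValueError("layer must be 'old' or 'late'")


def is_regular(layer, onset, rhyme):
    """True if the attested rhyme matches the regular correspondence."""
    return expected_sv_rhyme(layer, onset) == rhyme


# 宰: the regular later reading would end in -ai (*tải); attested tể ends in -ê.
print(expected_sv_rhyme("late", "t"))
print(is_regular("late", "t", "ê"))
```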

宰 must have been read with a front vowel when it was used as a nom phonetic symbol to write the unrelated native Vietnamese words lẻ tẻ 'scattered' and tỉa 'to trim'.

Although there are modern Chinese languages in which this rhyme has become e-like, they are geographically distant from Vietnamese with the sole exception of one variety of Pinghua (Guilin Yanshan Zhuyuan, which has tse with an irregular tone). I doubt tể is a borrowing from Guilin, which is 400 miles from Hanoi.

Is tể the last survivor of a long-dead trend of monophthongization in earlier Vietnamese or the source dialect of Sino-Vietnamese? I doubt it.

8.14.1:42: Could the sui generis reading tể be the product of taboo deformation? But would 宰 have been used in a name? Tể is not in this long list of deformed readings that I just found. (The original readings are in the "Âm chính" 'main sound' columns; the altered readings are in the "Âm trại" 'mispronounced sound' columns.)

I tried looking for tể in de Rhodes' dictionary to see if tể existed in the 17th century. However, I couldn't find it or my theoretical regular *tải with any of the meanings of tể.

THOUGHT-BEARING HAPPY PROGRESS?

Thai names contain many Indic elements, so they should be transparent to me. However, they often contain surprises. For instance, last night I encountered the name

จินตหรา สุขพัฒน์ <cinthrā sukhbaḍhn˟> [tɕintaraː sukʰapʰat] (?) 'Chintara Sukapatana'

which looks like it should be from an Indic *cinta-harā sukha-baḍhana-. However, only [sukʰa] < Sanskrit/Pali sukha- 'happy' is straightforward. The remaining three components puzzle me:

- I would expect the final long vowel of Sanskrit/Pali cintā 'thought' to remain intact in compounds; this same shortening is also in regular Indic loanwords in both Thai and Khmer (so perhaps the shortening is of Khmer origin)

- apparently <hrā> is pronounced as if it were a monosyllabic native Thai word [raː] rather than the expected [haraː] from Sanskrit/Pali harā 'bearing' (f.). There is a Thai word หรา <hrā> [raː], but I don't think it is part of this name because it is an adverb 'boldly', not an adjective which should follow [tɕinta] 'thought'.

- although I'm accustomed to Pali vaḍḍhana- 'increase' (< Sanskrit √vṛdh) becoming the regular Thai word พัฒนา <baḍhnā> [pʰattʰanaː] 'progress', I didn't expect it to be clipped to [pʰat]. (์ <˟> indicates silent characters.) Was *[sukʰapʰattʰan] too long? Is the feminine ending [aː] always absent from Thai surnames?

Chintara's birth name is

จิตติมาฆ์ <cittimāgh˟> [tɕittimaː]

which has mysteries of its own. [tɕitti] is from Sanskrit/Pali citti-, a variant of cintā 'thought', but what is [maː] from Sanskrit māgha- 'name of a constellation' (> 'third lunar month' in Thai and Khmer) doing, and why was its final consonant dropped? Compare มาฆ์ <māgh˟> [maː] with เมฆ <megh> [mek] < Sanskrit/Pali megha- 'cloud' whose final <gh> [k] is not silent.

Has the phenomenon of dropping perfectly pronounceable segments in Indic loans in Thai been studied? (Some dropping is required to make Indic loans fit the constraints of Thai phonology: e.g., จันทร์ <candr˟> 'moon' is [tɕan] because Thai does not permit final consonant clusters.)

C-RUTSUBO

Li (2008: 721)  listed the first syllable of

4538 5544 2ko' 1riuʳ 'crucible'

as a borrowing from the second syllable of Chinese 坩堝 *kã ko 'crucible'.

I am skeptical of a connection between the two words for the following reasons.

First, I do not know of any cases of the rhyme of 堝 *ko borrowed as Tangut rhyme 54 -o'. I use the symbol ' to indicate that rhyme 54 was similar to rhyme 51 -o yet different in some unknown way. I only know of a single case of rhyme 54 transcribing Chinese *-o:

5388 2bo' for Chinese 摩 *mbo (Gong 2002: 436)

Chinese *-o was normally transcribed with rhyme 51 -o (see Gong 2002: 456 for many examples).

I originally was going to write that I thought it was unlikely that 坩堝 *kã ko would be cut in half by the Tangut, but in fact 堝 *ko is attested as an independent word in the Song Dynasty. I suspect it is a specialized use of 鍋 *ko 'cooking pot' written with a radical 土 to match 坩 which is attested as an independent word in the Tang Dynasty.

So my third objection is now my second: Even if 4538 is a borrowing from Chinese, what is 5544? No homophone of 5544 is an adjective that would make sense as a modifier of 4538. Here are Li's (2008) glosses for the other tangraphs pronounced 1riuʳ:

0968 'all'

1403 'complain'

2147 'sweep'

2324 'sigh'

2542 'gadfly'

2543 'hate'

2812 'cherish'

3491 'bright star'

3493 'firefly'

3737 'frivolous'

4364 'wooden framework'

4437 'auspicious'

4713 'world'

5130 'subdue'

I think Tangut 2ko' 1riuʳ 'crucible' is an indivisible disyllabic word that is a coincidental soundalike of 堝 *ko.

As for the title, Nishida (1986: 43) translated Tangut 2ko' 1riuʳ 'crucible' as Japanese rutsubo. No native Japanese word can begin with r-, so the word must be of at least partly foreign origin. I think it might be a compound of Middle Chinese 爐 *lo (borrowed into Japanese as *ro which then became ru after raising) 'stove' and the native word tsubo 'pot'.

FROZEN WHITE WATER IN THE BLACKSMITH'S CRUCIBLE

Andrew West pointed out that 4053 1ʔwọ 'ice' from my recent entries occurs three times in The Ode on Monthly Pleasures: e.g., in the 'common language' line 3B of the section on the second month. The 'ritual language' line 3A is slightly longer:

2.3A                1     2      3       4      5           6     7

Li Fanwen number    0804  4051   3052    1659   5441        4538  5544
Reading             2diə  1kiʳw  1nioʳ'  2lew   2swi        2ko'  1riuʳ
Gloss               PERF  cold   water   white  blacksmith  crucible

The 'common language' line only has a single direct match:

2.3B                1       2     3      4       5     6     7

Li Fanwen number    1490    4053  1572   0185    1452  3956  -
Reading             1tsʊʳ   1ʔwọ  1phɤõ  2nwiə   1nia  1dʐi  -
Gloss               winter  ice   white  spring  PERF  melt  -

Nishida (1986: 43) translated those lines as topic-comment sequences:

A. 'The cold, white water - a crucible of materials'

B. 'Spring melts the white water of winter'

He proposed four parallel pairs:

A1-2 'cold' : B1 'winter'

A3-4 'white water' : B2-3 'white ice'

A5 'materials' : B4 'spring'

A6-7 'crucible' : B5-6 'melts' (my 'melted')

A1 0804 2diə is a perfective prefix originally indicating motion toward the speaker. Perhaps its combination with A2 4051 1kiʳw 'cold' could be translated as 'frozen' (i.e., finished becoming cold). A 'common language' perfective prefix is not what I would expect in the 'ritual language' if the latter was an unrelated substratum language.

As far as I know, A2 4051 is only in dictionaries and this ode. If it is a 'ritual language' word, it demonstrates that 'common language' affixes can be attached to 'ritual language' vocabulary.

A3 3052 1nioʳ' is a 'common language' word for 'water' that is not very common. It is the Tangut name of the Chinese trigram ☵ for water. If the 'ritual language' were a low-prestige substratum language, I would not expect its words to be used to refer to concepts from a high-status culture. I should look into the names of the seven other trigrams; none match the common words for their concepts.

Similarly, A4 1659 2lew is a 'common language' word for 'white' that is not very common.

I do not know why Nishida translated A5 5441 2swi as 'materials'. A note in Homophones text D equates 2swi with 'iron artisan' (i.e., 'blacksmith'). 5441 is also a verb 'to (s)melt' (Kychanov and Arakawa 2006: 542), so a blacksmith was a 'smelter'. 5441 can also mean 'mother-in-law' (Li 2008: 858), but I assume the character is used for two different unrelated words. Unfortunately, I do not see this term for 'mother-in-law' in Jacques 2012 which mentions the term

3986 4893 1niə 1vɨə

A6-7 4538 5544 2ko' 1riuʳ 'crucible' is a indivisible disyllabic word which is not in the 'common language' to the best of my knowledge. Neither half can stand alone. Li (2008: 721) regarded the first syllable as a loan from the second syllable of 坩堝, but this is problematic for reasons I'll go into in my next entry.

A5-7 'blacksmith's crucible' makes little sense as a gloss for 'spring melted'. It is an odd metaphor for spring, as a crucible is much hotter. Nishida's 'crucible of materials' is even more puzzling.

I would translate the B line as a topic-comment sequence:

'As for the white ice of winter, spring melted it.'

B5 1452 1nia is a perfective prefix originally indicating downward motion, so B5-6 1452 3956 1nia 1dʐi 'down-melt' (i.e., 'melted') is reminiscent of English melt down, though they are not equivalents: the former could be translated as 'melted down' but not 'melt(s) down' and the latter has an extended meaning '(emotionally) collapse'.

MINING THE ODE ON MONTHLY PLEASURES

Andrew West pointed out that 5952 'ore, mine' from my last entry does occur outside dictionaries: e.g., in the 'ritual language' line 4A of the section on the tenth month in The Ode on Monthly Pleasures:

10.4A               1            2          3            4        5                6      7

Li Fanwen number    2992         0026       5429         2431     5952             5072   1420
Reading             2bia         2ŋwʊ       1po          2khwa    2nɤa'            1ɣiəʳ  2tʂɨụ
Gloss               the Ba clan  territory  the Po clan  Chinese  (? - see below)  make   wing

The corresponding 'common language' line does not seem to match it at first glance:

10.4B               1             2            3       4        5      6         7

Li Fanwen number    1518          3437         2344    5882     0795   3497      4999
Reading             1tʂɨẹ         2ʔõ          2mi     1ɮaʳ     2riəʳ  2lʊ̣       1giẹ
Gloss               the Che clan  the On clan  Tangut  Chinese  PERF   obstruct  scissors

Nishida (1986: 62) translated those lines as topic-comment sequences:

4A. 'As for the Po and Chinese of the Pa [= my Ba*] territory - wings made of iron'

4B. 'As for the Tangut and Chinese of the Che Hon [= my Che On**] - scissors that obstruct'

He proposed four parallel pairs:

4A1-2 'Pa [= Ba] territory' : 4B1-2 'Che Hon [= Che On]'

4A3 'Po' : 4B3 'Tangut'

4A4 'Chinese' : 4B4 'Chinese'

4A5-7 'wings made of iron' : 4B5-7 'scissors that obstruct'

Only the third pair is obvious.

I have suspected that the 'ritual language' is a non-Sino-Tibetan substratum language glossing the superstratum Sino-Tibetan 'common language'. For other interpretations of these 'languages', see Andrew's "The Myth of the Tangut Ritual Language".

But my hypothesis is problematic even for the third pair. If 4A4 2431 2khwa is a substratum word for 'Chinese' glossing the superstratum word 4B4 5882 1ɮaʳ 'Chinese', why do both words appear in the foreword to the Timely Pearl in the Palm, a text otherwise in the 'common language'? Is that a case of a substratum word borrowed by the superstratum language? Neither word has any connection to Chinese autonyms. (The Chinese autonym 漢 *xã 'Han' was borrowed as

5916 1xã

which like 2431 and 5882 is written with the unflattering components 'little' and 'insect'.)

The fourth pair only makes sense if 4A5-7 2nɤa' 1ɣiəʳ 2tʂɨụ 'wings made out of 5592' was the substratum term for 4B7 4999 1giẹ  'scissors'. 4999 can also be a verb 'cut' - is 'scissors' a derived noun 'cutter'? - but Kychanov and Arakawa 2006: 327 do not list any compound verb 3497 4999 'obstruct-cut', and a verb sequence 'obstructed and cut' is even harder to relate to the noun phrase 'wings made of 5592'.

Li (2008: 938) and Kychanov and Arakawa (2006: 296) agree that 5592 means 'ore'. (As Andrew pointed out, none of the textual examples in Li 2008 support 'mine'. Li's English gloss for 5592 seems to be a translation of his Chinese gloss 礦 which means both 'mine' and 'ore', but the Tangut word may have had a narrower meaning.) However, 'ore' might be odd in this context. Nishida translated 5592 as 'iron', even though 'iron' is a distinct word that combines with 5592 to form the phrase

5592 4995 2nɤa' 1ʂɨõ 'iron ore'.

in Homophones 14B33. Could 5592 also refer to some other metal? English ore is "partly from Old English ār brass". Also cf. the ambiguity of Japanese kane 'metal, gold'.

4B5-6 0795 3497 2riəʳ 2lʊ̣̣ 'obstructed' (0795 3497) has no substratum equivalent in line 4A. Was the superstratum verb understood even by speakers of the substratum language, or was it just omitted to make line 4A the same length as 4B?

Why would glosses have to match the lengths of the lines they glossed? Were the glosses meant to be poetry in their own right? And why place the glosses before the lines they gloss?

The '-ed' of my translation 'obstructed' corresponds to the perfective prefix 0795. Nishida (1966: 579) regarded 0795 as the Tangut equivalent of Classical Chinese 所, so I would have expected his translation to be 妨げるところ  'that which is obstructed' (cf. his translations of 0795 in 1966: 279), but that would make no sense before 'scissors'. His actual translation 妨げる 'obstructs' only corresponds to 3497.

4A6-7 5072 1420 1ɣiəʳ 2tʂɨụ 'wing made of ...' consists of words also in 'common language' texts. I don't know of any Sino-Tibetan cognates for those words. Could they be substratum loans in the superstratum language?

The first two pairs are the most baffling. When I see Tangut names, I feel as if I'm reading about characters in a TV show I've never seen. Who were the Ba, Po, Che, and On? I don't know, but I'm certain that Po is not a synonym of 2mi 'Tangut'. The names Che and On are together in Homophones 35B37, so they may have been a common collocation ('Che [and] On') or a disyllabic name ('Che'on'). Kychanov and Arakawa (2006: 315) favor the latter interpretation. Was the territory of the Ba also the land of Che and On (or Che'on)? If so, then the Po were a Tangut clan in that land, and 'the Po of the Ba territory' and 'the Tangut of Che (and?) On' refer to the same group of people. This hypothesis is impossible to explore further without learning more about these clan names.

4A2 0026 2ŋwʊ 'territory' in the 'ritual language' line is also in 'common language' texts. Is it a superstratum loan in the substratum, or vice versa? Like 'make' and 'wing' later in that line, it too has no known cognates.

*8.10.3:23: I follow Gong (1997) and Li Fanwen (1986: 218) who reconstructed the initial of 2992 as b-. Nishida (1986: 62) reconstructed it as p-; in 1966 he reconstructed it as m- (p. 396). Sofronov (1968 II: 307) reconstructed it as mb-. The character is in the labial chapter of Homophones, so there is no doubt that its reading had a labial initial of some kind. The only transcription for a member of its homophone group that I know of is *mba for

2bia 'belly'

in Timely Pearl in the Palm 19.1. I think the diacritic might indicate that the initial was b- (absent from Tangut period northwestern Chinese) rather than mb-. In any case, the Chinese transcription rules out Nishida's later reconstruction p-; it tells us that the initial was voiced (though whether it was nasal or prenasalized may be debated).

**8.10.4:04: I follow Gong (1997), Li Fanwen (1986: 425), and Sofronov (1968 II: ) who reconstructed the initial of 3437 as ʔ-. Nishida (1986: 62) reconstructed it as x-; in 1964 he reconstructed it as ɣ- (p. 134). (Nishida 1964 does not explicitly mention 3437, but it does list the reconstruction ɣõ for a rising tone rhyme 47 syllable without any homophones in chapter VIII of Homophones. 3437 is the only syllable that matches that description.) I do not know how those scholars reconstructed the initial of 3437, as its character has no transcriptions or homophones. Given that there are three basic chapter VIII initials (x-, ɣ-, ʔ-) and that the following rising tone rhyme 47 syllables can be reconstructed in Homophones chapter VIII -

xõ, ɣõ

- 3437 is likely to be ʔõ by a process of elimination. Tangut did not seem to permit xw- and ɣw- before o in native words, so xwõ and ɣwõ are unlikely. However, ʔw- was possible before o: e.g., in

4053 1ʔwọ 'ice'

which led to my interest in

2040 2nɤa' 'ice' (with 'water' on the left instead of the mystery element ヒ on the right)

and its homophones like 5952 'ore'. So perhaps 3437 was ʔwõ. Maybe it would be safest to write its reconstruction as Xõ with X representing an unknown back consonant.

AN OCTET OF ICY HOMOPHONES

The last of the Tangut words for 'ice' that I have been writing about belongs to a set of eight characters in Homophones:

Reading shared by all eight characters: 2nɤa' < *nraXH

Homophones location   Li 2008 number   Gloss
14B24                 2189             second half of 1937 2189 2kha 2nɤa' 'to stutter; sad' (both only in dictionaries)
14B25                 2249             wrist (only in Homophones; synonym of the more common word 0682 1khwiə?; not sure if it can stand alone)
14B26                 3556             to apply, smear
14B27                 2726             colored glaze
14B28                 2040             ice (only in dictionaries; not sure if it can stand alone)
14B31                 2582             mud (only in Homophones; not sure if it can stand alone)
14B32                 4765             yarn (only in Homophones and Miscellaneous Characters; not sure if it can stand alone)
14B33                 5952             ore; mine (only in dictionaries)

Out of these eight characters,

- six may only be attested in dictionaries, judging from the absence of nondictionary examples in Li (2008)

- at least three are freestanding words; the others are only found in dictionaries next to other characters

I am still surprised there can be so many 2nɤa'. 2726 'colored glaze' may be an extended usage of 2040 'ice', and 3556 'smear' and 2582 'mud' may be the same root, judging from the analysis of the former which implies 'muddy' semantics:


3556 2nɤa' 'to apply, smear' =

'water' and 'earth' = left and center of 2005 1tʂɤoʳ 'mud' +

bottom left of 4737 2ma 'apply'

But the others do not appear to be related to each other. I doubt pre-Tangut had six homophonous roots. Were those roots nonhomophonous in pre-Tangut: i.e., is 2nɤa' a merger of *nrakH, *nratH, *nrapH, etc. (if *X was a consonant)?

DISTRIBUTIO-NA-L ODDITIES IN TANGUT

I was surprised to find that

2040 2nɤa' < *nraXH

had seven homophones in Homophones. If my pre-Tangut reconstruction is correct, I would expect simple syllables to be more common than complex syllables. *na was more common than other *na-type syllables, but it's surprising that there were no *naH, *nra, *nraH, *naX, or *nraX while there were multiple *naXH and *nraXH. Nonexistent syllables are in parentheses.

Rhyme      Grade  Tangut syllable  Pre-Tangut    Number of characters per syllable
17.1.17    I      1na              *na           12
17.2.14    I      (*2na)           (*naH)        0*
18.1.18    II     (1nɤa)           (*nra)        0
18.2.15    II     (2nɤa)           (*nraH)       0
(Grade III rhymes like 19.1.9/2.16 do not normally occur after dentals like n-. But see rhyme 21 below.)
20.1.20    IV     1nia             *Cɯ-na        1
20.2.17    IV     2nia             *Cɯ-naH       3
22.1.22    I      (1na')           (*naX)        0**
22.2.19    I      2na'             *naXH         5
(23/1.X)   II     (1nɤa')          (*nraX)       0
23.2.20    II     2nɤa'            *nraXH        8
21.1.21    III    1nɨa'            *Cə-naX?      2
21.2.18    III    2nɨa'            *Cə-naXH?     3***
24.1.23    IV     (1nia')          (*Cɯ-naX)     0
24.2.21    IV     (2nia')          (*Cɯ-naXH)    0

Although it's not impossible for a language to have a complex syllable while lacking simpler, similar syllables (e.g., English has strength but not streng, trength, treng, rength, etc.), I wouldn't have predicted multiple instances of a complex syllable. Having eight 2nɤa' < *nraXH is like having eight unrelated strengths in English. I'll look at the eight 2nɤa' next time.
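The counts in the table above can be tallied with a few lines of Python to make the gaps explicit. This is only a bookkeeping sketch: the dictionary name `COUNTS` and the derived variables are hypothetical, and the figures are simply the ones in the table (my counts, not Arakawa's).

```python
# Characters per syllable, keyed by Tangut reading (tone digit + syllable).
COUNTS = {
    "1na": 12, "2na": 0,
    "1nɤa": 0, "2nɤa": 0,
    "1nia": 1, "2nia": 3,
    "1na'": 0, "2na'": 5,
    "1nɤa'": 0, "2nɤa'": 8,
    "1nɨa'": 2, "2nɨa'": 3,
    "1nia'": 0, "2nia'": 0,
}

# Syllables predicted by the system but not attested.
gaps = [syllable for syllable, n in COUNTS.items() if n == 0]

# Syllables ranked from most to least common.
ranking = sorted(COUNTS, key=COUNTS.get, reverse=True)

# Total for the anomalous Grade III rhyme 21 after n-.
grade_iii_total = COUNTS["1nɨa'"] + COUNTS["2nɨa'"]

print(gaps)
print(ranking[:2])
print(grade_iii_total)
```

The ranking shows the oddity directly: the simplest syllable 1na is the most common, but the runner-up is the most complex one, 2nɤa'.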

For now, I'll close by noting two peculiarities involving Grade III rhyme 21. First, it was placed before the Grade I and II rhymes in the Precious Rhymes of the Tangraphic Sea, disrupting the usual I-II-III-IV pattern. Second, normally Grade III rhymes do not combine with dental initials, yet there are five Grade III nɨa' (but no nɨa!). I am not happy with my pre-Tangut reconstructions for their sources; they are placeholders. See how *Cə-naXH became 2nɨa' here.

*5 according to Arakawa (1997: 30). Gong reconstructed those five as 2da.

**1 according to Arakawa (1997: 30). Gong reconstructed that syllable as 1daa (= my 1da').

***0 according to Arakawa (1997: 30), who used Nishida's reconstruction in which these three syllables had t- instead of n-.

PRE-TANGUT *NRAXH 'ICE'

The last of the three words for 'ice' in Li (2008) is

2040 2nɤa' < *nraXH

In the past I would have reconstructed it as 2nææ with a low front long vowel. It is not clear how its Grade II (i.e., -ɤ-medial < *-r- + low vowel) a-type rhyme differs from that of the more common Grade II a-type rhyme -ɤa. Since I no longer think Tangut had long vowels, I write the less common rhyme with a ' reminiscent of a prime symbol to represent its unknown distinctive feature(s). Arakawa also uses ' for this rhyme (-ya' in his system), but I do not know if he regards it as a phonetic symbol or as a notational device.

I used to think that '-vowels (my former long vowels) came from vowel-consonant sequences, as the Tangut autonym

3752 3296 2miə 2nɨa' < *mə-naXH

corresponds to Tibetan mi-nyag which must have been borrowed into Tibetan before the loss of a final obstruent *X (probably *k; more details on the development of this word here*).

If that was the case, then ideally all Tangut -V' should correspond to -VC in related languages that preserve obstruent codas. But that is not the case: e.g.,

5700 2ni' < *Ci-naXH 'nose' (not *2ni!)

corresponds to Japhug rGyalrong tɯ-ɕna and Tibetan sna 'id.' which lack final obstruent codas. Could *-X in such cases be a pre-Tangut suffix absent from other languages?

Conversely, there are cases in which non-Tangut obstruent codas correspond to zero instead of -': e.g.,

5700 1sia < *Cɯ-sa 'to kill' (not *1sia'!)

corresponds to Japhug rGyalrong kɤ-sat, Tibetan gsod-pa, and Old Chinese 殺 *ksat 'id.' Were some codas lost in pre-Tangut under certain conditions before they could condition -' in Tangut?

Both types of cases require explanation.

The problems with 2040 2nɤa' go beyond the mystery of its rhyme. I'm not even completely sure it means 'ice'. I'm surprised Kychanov and Arakawa (2006: 296) also glossed it as 'ice'; they often disagree with Li (2008). I have not seen any attestations of 2040 outside dictionaries. That is a lexicographical red flag; it means any glosses cannot be confirmed in context. Moreover, its Tangraphic Sea definition is presumably in the lost second volume. Here are the only two instances of 2040 known to me:

Homophones: 2040 4053 2nɤa' 1ʔwọ

Homophones text D note: 2040 - 3058 2765 2ɮiəʳ' 1nwie

Li (2008: 339) regarded 2040 4053 as a pair of synonymous nouns: 'ice ice', whereas Kychanov and Arakawa (2006: 296) translated it as a disyllabic verb 'turn to ice'. Given that 4053 can also mean 'frozen', another possibility is a noun-adjective phrase 'ice frozen'; each entry in Homophones has one or two clarifier characters, and the clarifier 'frozen' would distinguish 2nɤa' from its homophones (more on them later). Have Kychanov and/or Arakawa seen 2040 4053 as a verb in a text?

3058 is 'water' without a doubt, but 2765 only occurs in dictionaries. Its Tangraphic Sea entry says,

'[The character 2765 is from] the left of earth (2627) and all of blood (2734). 2765 is 0975. It is what blood uniting (2734 3591) is called.'

Unfortunately 0975 is only known from dictionaries, and the Tangraphic Sea defines it as ... 2765 and 'blood gathering (2734 0269)'.

Li (2008: 454) translated 2765 as a verb 'to swell, coagulate', but Kychanov and Arakawa (2006: 296) translated it as a noun 'coagulated blood'. I favor 'coagulate' as 3058 2765 would make no sense as a note for 'ice' if it meant 'water [and] coagulated blood'.

The verbs

3591 2ni' 'to unite' and 0269 1khiə' 'to gather'

may only have those meanings in dictionaries; their characters are attested with different meanings in nondictionary texts:

3591: 'to attack; a shield; to cover; to die'

0269: 'second half of 4059 0269 investigate; hide; rigid'

I presume these sets are unrelated homonyms apart from the noun 'shield' and the verb 'to cover'.

I could try to force three of the above words into a single word family:

2040 2nɤa' < *nra-X-H 'ice'

2765 1nwie < *Pe-nra 'to coagulate'

3591 2ni' < *Ci-nra-X-H 'to unite'

However, I would then need to account for the functions of the various affixes. Moreover, it is not possible to determine for sure whether grade IV words like 2765 and 3591 originally had *-r-. If I did not link them to 2040, I would have reconstructed them in pre-Tangut without *-r- or front-vowel prefixes to condition *a-raising: *Pɯ-ne and *niXH.

*3752 3296 2miə 2nɨa' has a strange second syllable. Normally n- does not combine with Grade III (-ɨ-medial) rhymes. Perhaps the word developed like this:

Pre-Tangut *mə-naXH

Breaking of *a before nonlow vowel *ə: *mə-nɨaXH

Breaking of *ə: *mɨə-nɨaXH

Tonogenesis: *mɨə-2nɨaX

Tone spread: *2mɨə-2nɨaX

The timing of tonogenesis relative to the vocalic changes is unknown.
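The four changes above can be sketched as a chain of string rewrites. This is only an illustration: the function name `derive_stages` is hypothetical, the rewrites are hard-coded for this one word, and the order is the one in the list (which, as just noted, is not the only possible one).

```python
def derive_stages(form="mə-naXH"):
    """Apply the four listed changes in order and return every stage."""
    stages = [form]

    # 1. Breaking of *a in the second syllable: *na > *nɨa
    form = form.replace("-na", "-nɨa", 1)
    stages.append(form)

    # 2. Breaking of *ə in the first syllable: *mə > *mɨə
    form = form.replace("mə-", "mɨə-", 1)
    stages.append(form)

    # 3. Tonogenesis: final *-H is lost and leaves tone 2 on its syllable
    first, second = form.split("-")
    if second.endswith("H"):
        second = "2" + second[:-1]
    form = first + "-" + second
    stages.append(form)

    # 4. Tone spread: the first syllable copies the tone of the second
    if second[0].isdigit():
        form = second[0] + first + "-" + second
    stages.append(form)

    return stages


for stage in derive_stages():
    print("*" + stage)
```

Moving step 3 ahead of steps 1 and 2 would give the same final stage, which is one way of seeing why the relative timing cannot be recovered from this word alone.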

The first syllable may be an unstressed form of pre-Tangut *mi 'Tangut' which became

2344 1mi (cf. Tibetan mi 'person')

The second syllable may be cognate to

0176 1nɨa' < *Cə-naX (cf. Tibetan nag-po 'black')

which also has an anomalous n-Grade III rhyme combination. Was *mə-naXH originally *mi-naX-H 'black people'? 'Black' brings to mind the term

2750 0176 1ɣɤu 1nɨa'


which is a term for one subgroup of the Tangut people.

PRE-TANGUT *TɅ-KU-H 'ICE'

The second of the three words for 'ice' in Li (2008) is

3177 2kʊʳ (also 'frozen')

which may be cognate to

4034 1kiụ < *S-ku 'cold' (adj.?; see below)

if it is from *Tʌ-ku-H with a root *ku instead of *Cʌ-kur-H with a root *kur.

The prefix *Tʌ- conditioned the lowering and the retroflexion of the root vowel:

*Tʌ-ku > *Tʌ-kʊ > *T-kʊ > *r-kʊ > *r-kʊʳ > kʊʳ (ignoring *-H; see below)

The suffix *-H conditioned the second ('rising') tone.

The semantic difference, if any, between 3177 2kʊʳ 'ice' and

4053 1ʔwọ 'ice'

is unknown (apart from the fact that 3177 can also mean 'frozen').

Unfortunately I do not know of any other pairs of the type

*Tʌ-√-H (noun) : *S-√ (adjective)

*S- is normally a verbalizing prefix but not in the adjective 4034 *S-ku 'cold' or in the pre-Tangut source of the noun 4053 1ʔwọ 'ice' which could have been *S-ʔʌ-pam (as reconstructed last week) or *S-P-ʔo (as reconstructed last night).

Perhaps I should not call 4034 *S-ku an adjective, as Li (2008: 652) does not list any instances of it as an independent word (or in any text outside a dictionary). It may occur only in the disyllabic words

4034 4051 1kiụ 1kiʳw < *S-ku T-kuk 'cold' (?)

4034 4077 1kiụ 1miẹ < *S-ku Sɯ-me 'cold' (?)

which might have originated as synonym compounds.

I am not certain about these glosses. Li (2008) does not list any attestations of 4034 4051 and 4034 4077 outside dictionaries. Both 4051 and 4077 appear as independent words for 'cold', but the latter is only in the Tangraphic Sea. I have no doubt that the meanings of 4034 4051 and 4034 4077 have something to do with cold, but without textual examples, I cannot be sure of their parts of speech.

Unlike Li (2008: 652) who regarded 4034 as an adjective 'cold', Kychanov and Arakawa (2006: 489) defined 4034 as nouns 'frost' and 'cold' (i.e., 'coldness'?). Perhaps they have seen it in contexts where those glosses are appropriate.

The Tangraphic Sea equates 4034 with

1. 4034 4051 (see above)

2. 4089 0143, lit. 'cold (?) cold (adj.)' (compound attested only in dictionaries; only second half confirmed by nondictionary textual examples)

3. 2720 'cold' (adj.; confirmed in nondictionary textual example)

4. 4077 (see above)

5. 1918 0115 'not hot' (adj.; confirmed in nondictionary textual example)

On the other hand, the meanings of 3177 can be confirmed in the nondictionary textual examples in Li (2008: 518).

AN ICY *P-REFIX?

While writing about a possible *p-prefix in the Chinese word for 'ice' in my last entry, I forgot to mention that *P- could also be a source of the -w- in Tangut 

4053 1ʔwọ < *S-P-ʔo? 'ice'

I reconstructed *P- to account for pairs of semantically and phonetically similar words such as

1829 1tsha < *Kɯ-tsa 'hot' : 1825 1tshwia < *P-Kɯ-tsa 'to heat'

in which one member has -w- and the other does not.

Ideally I would like to pair 1ʔwọ 'ice' with a word like ʔo 'ice, freeze, cold, etc.' But none of the ten words glossed as 'cold' in Li (2008) sound anything like 1ʔwọ 'ice' or ʔo. Nor do words with similar meanings like 'frigid' or 'snow'. I am hesitant to reconstruct *P- if I cannot find a -w-less potential relative, though there is no guarantee that a language must have a bare form alongside each affixed form.

Moreover, Gong (2002: 46) found that most zero ~ -w-pairs "clearly show a morphological process of forming verbs from adjectives or nouns". Hence *P-, the source of -w-, was often a verbalizing prefix. Obviously 1ʔwọ 'ice' was not a verb. Gong found only one zero ~ -w-pair whose -w-member was a noun:

3354 1ɣɤi < *Cʌ-Kri 'power' : 5307 1ɣwɤi < *Pʌ-Kri 'power'

I would add

3596 1ɣwɤi < *Pʌ-Kri 'power' (homophonous with 5307!)

to this set.

By analogy with this pair, a hypothetical ʔo that was the root of 1ʔwọ 'ice' would also mean 'ice'.

I wonder if the 'power' set actually consists of two reflexes of *Pʌ-Kri rather than *Kri with two different prefixes. I reconstruct prefixes with low vowels to account for ɣ- which is (sometimes? often? always?) from a lenited velar obstruent and -ɤ-, the reflex of *-r- after a low presyllabic vowel (see the second table in "G-*r-adation in Tangut (Part 2)"). Incidentally there is a tangraph

5309 1ʔo

that Li (2008) glossed as ... 'power'! Alas, not the 'ice' I was hoping for.

If the -w- in 1ʔwọ 'ice' is not a lenition of a prefix *P- or a medial (root-initial?) *-P-, then I wonder if Cw-clusters such as ʔw- come from original clusters or unit phonemes such as the Old Chinese *ʔʷ- reconstructed by Baxter and Sagart (2011).

S-PRƏNG FROM SOME COMMON SOURCE? (PART 2)

In my last entry, I forgot to address the fact that Old Chinese 冰 *prəŋ 'ice' had an *-r- absent from pam-words for 'ice' or 'snow' in non-Chinese Sino-Tibetan languages. I could try to explain away the *-r- as an infix or as a prefix that metathesized: *T-p- > *pr-. (Medial *-r- is so common in Old Chinese that I suspect it came from a variety of sources - *t- and *l- as well as *r- - that I symbolize as *T-.) The dissimilation of *-m to *-ŋ after *p- could have occurred before *T- moved into medial position as an *-r- that would have blocked the shift. However, this scenario would still require a pre-Shijing dissimilation, long before the dissimilation is evident in poetry.

Here's a very different scenario. In Guangyun, 冰 'ice' has two Middle Chinese readings, *pɨŋ and *ŋɨŋ. I know of no other case in which a character has both *p- and *ŋ-readings. The *ŋ-reading is homophonous with 凝 *ŋɨŋ 'to freeze'.

Was 冰 used to write two unrelated words for 'ice' which happened to have identical rhymes?

Or were the two words related? Zhengzhang reconstructed them as *pŋrɯŋ and *ŋrɯŋ. (His *ɯ is equivalent to *ə in other reconstructions.)

This internal etymology has no phonological problems if one accepts the simplification of *pŋ- to *p-, but it does raise the question of what *p- was. In this case it could be a nominalizer or even a participial prefix (a fossil of an earlier system of conjugation?): 'ice' < 'frozen'. Are there other pairs of the type

*X 'verb' : *p-X 'nominalized verb' / 'verb-ed'?

Next: More Tangut words for 'ice'.

S-PRƏNG FROM SOME COMMON SOURCE?

The title is a reference to Sir William Jones' famous phrase "sprung from some common source" which is also the title of this book I borrowed two decades ago.

Schuessler (2009: 117) reconstructed the Old Chinese word 冰 for 'ice' as *prəŋ and suggested that it may be cognate to Proto-Tibeto-Burman *pam, the source of Tangkhul Naga pʰam and Kanauri pom 'snow' and rGyalrong (variety unspecified) ta-rpam 'ice'. I don't believe in the traditional binary view of Sino-Tibetan with Chinese in one branch and all other languages in a Tibeto-Burman branch. So I do not think there ever was a Proto-Tibeto-Burman - a single ancestor of all non-Chinese Sino-Tibetan (or 'Trans-Himalayan') languages. Nonetheless, could a word like *pam be reconstructed at the Proto-Sino-Tibetan level?

If such a word existed, how did it develop into Old Chinese *prəŋ? Let's look at two changes that occurred in similar syllables:

Old Chinese *-əm dissimilated to *-uŋ in Middle Chinese after labial initials unless blocked by a medial *-r-: e.g., *pəm 'wind' (written with a character 風 containing the phonetic 凡 *bam) became Middle Chinese *puŋ.

Early Old Chinese *-əŋ assimilated to *-uŋ in Late Old Chinese after labial initials unless blocked by a medial *-r-: e.g., 夢 *məŋ 'dream' became Late Old Chinese *muŋ(h). (The final *-h is irregular, but does not concern us here.)

Middle Chinese 冰 'ice' was *pɨŋ, not *puŋ, so it could not have come from Old Chinese *pəm or *pəŋ. Schuessler's Old Chinese *-r- blocked the assimilation of *-əŋ to *-uŋ after *p-, and *-rə- regularly became Middle Chinese *-ɨ-.
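The two rules and their *-r- blocking condition can be restated as a toy function. This is a sketch under heavy assumptions: `middle_chinese_reflex` is a hypothetical name, the list of labial initials is my own simplification, and tones and final glottal stops are ignored.

```python
# Labial initials assumed for this sketch; "ph" is listed first so that it
# is matched before plain "p".
LABIALS = ("ph", "p", "b", "m")


def middle_chinese_reflex(oc):
    """Toy reflex of an Old Chinese syllable shaped labial + (r) + ə + m/ŋ."""
    initial = next((c for c in LABIALS if oc.startswith(c)), None)
    if initial is None:
        return oc  # the rules above only concern labial initials

    rest = oc[len(initial):]
    if rest.startswith("r"):
        # Medial *-r- blocks the shift to *-uŋ; *-rə- becomes *-ɨ-
        return initial + rest[1:].replace("ə", "ɨ", 1)
    if rest in ("əm", "əŋ"):
        # Without *-r-, both rhymes end up as *-uŋ after a labial
        return initial + "uŋ"
    return oc


for oc in ("pəm", "məŋ", "prəŋ", "phrəm"):
    print("*" + oc, ">", "*" + middle_chinese_reflex(oc))
```

The point of the sketch is the asymmetry: an *-r-less labial syllable loses the *-m : *-ŋ contrast, while an *-r- syllable keeps whichever coda it started with.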

冰 'ice' rhymed with *-əŋ words in two poems in Shijing (2, V, 1, 6 and 2, V, 1, 6).

To force a connection between Old Chinese *prəŋ and pam-type words elsewhere in Sino-Tibetan, I would have to claim that final *-m irregularly dissimilated to *-ŋ after *p- long before such dissimilations were first attested. Moreover, such dissimilations were blocked by *-r- which did not block dissimilation in this scenario. Old Chinese 品 *phrəmʔ 'class' regularly developed into Middle Chinese *phɨmˀ, not *phɨŋˀ.

On the other hand, Old Chinese 稟 *prəmʔ 'to receive from above' corresponds to Mandarin bing, a homophone of 冰 bing 'ice' (disregarding tone), rather than the expected *bin which would be the regular reflex of Middle Chinese *pɨmˀ. Could 稟 bing descend from an Old Chinese dialect in which *-m dissimilated to *-ŋ after *pr- - a dialect not ancestral to the dialects underlying the Middle Chinese lexicographical tradition? Is Old Chinese 冰 *prəŋ 'ice' a word from such a dialect? Or is 稟 bing the result of a later change? In Mandarin -ng sporadically assimilated to -i- by fronting to -n (e.g., 拼 pin in pinyin is from *ping), but I don't know of any cases of the reverse.

Perhaps it would be preferable to find an internal etymology for 冰 *prəŋ 'ice' instead of forcing it into a foreign mold. I'll evaluate such an etymology next time.

PRE-TANGUT S-ʔɅ-PAM 'ICE'

When writing last night's entry on pre-Tangut *Si-pa 'snow', I rediscovered a 2009 entry in which I reconstructed pre-Tangut *sʌ-pam 'ice' as well as *Si-pa 'snow'. Although I still think *Si-pa is valid five years later, *sʌ-pam cannot be correct because it would become Tangut 1vọ, not

4053 1ʔwọ

which is one of the three actual Tangut words for ice. In 2009, I accidentally reconstructed 4053 as 1vọ with a class II (labiodental) initial v- instead of a class VIII (glottal) initial ʔ-. Errors in Tangut lead to errors in pre-Tangut. I now reconstruct the pre-Tangut ancestor of 1ʔwọ as *S-ʔʌ-pam:

*S- conditioned the tenseness of the vowel (indicated with subscript dot).

The unaccented presyllabic vowel *-ʌ- was later lost, though the presyllabic initial *ʔ- remained.

The unaccented presyllabic vowel could not have been high because a high vowel would have conditioned a high vowel in the main syllable: *S-ʔɯ-pam would have become *1ʔwiọ*.

*-p- lenited to -w- between the vowels *-ʌ- and *-a-.

*-am became *-o
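The changes listed above can be chained in a short sketch. The function name `derive_ice` is hypothetical, and the ordering is an assumption: the only ordering the list itself requires is that *-p- lenited while the presyllabic vowel was still there. The tone number is left out, and the tense vowel is written with a combining dot below.

```python
import unicodedata

TENSE = "\u0323"  # combining dot below, marking a tense vowel


def derive_ice(form="S-ʔʌ-pam"):
    """Apply the listed changes in one possible order; return every stage."""
    stages = [form]

    # *-p- lenites to -w- between the vowels *-ʌ- and *-a-
    form = form.replace("ʌ-pa", "ʌ-wa", 1)
    stages.append(form)

    # *-am becomes *-o
    if form.endswith("am"):
        form = form[:-2] + "o"
    stages.append(form)

    # The unaccented presyllabic vowel is lost; presyllabic *ʔ- remains
    form = form.replace("ʌ-", "", 1)
    stages.append(form)

    # *S- conditions tenseness on the vowel and then disappears
    if form.startswith("S-"):
        form = form[2:] + TENSE
    stages.append(form)

    # Normalize so that o + combining dot becomes a single character
    return [unicodedata.normalize("NFC", s) for s in stages]


print(" > ".join(derive_ice()))
```

Running the lenition step after the loss of *-ʌ- would leave *-p- untouched in this sketch, since the rewrite looks for the vowel on its left.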

I am assuming** the root *pam is cognate to Proto-rGyalrong *lpaˠm 'ice' (as reconstructed by Jacques 2004: 249; see attested forms in item 1290 of Nagano and Prins' database). (Jacques first identified 4053 as a cognate of Japhug rGyalrong tɤ-jpʰɣom in 2003.)

Next: Does Chinese also share that root?

*8.2.0:28: There are no rhymes with Grade IV -io, -iọ, or -ioʳ after 1ʔ-. Are these chance gaps, or clues to a constraint of pre-Tangut phonological structure?

**8.2.0:52: This assumption may be false. 4053 1ʔwọ has many other potential sources:

-w- could also be original or a lenited reflex of *ph- or *b- or even *m- if nasals lenited. (So far I have not yet seen any evidence for Irish-style nasal lenition: e.g., v-/ʔw- ~ m-alternations within Tangut and/or Tangut v-/ʔw-words with m-cognates in other languages. However, I have not yet looked for such evidence. Hence I cannot rule out the possibility of nasal lenition.)

-o may also be from *-aŋ or *-o.

*ʔ- may be part of the root and might be from *q-, though I would expect a uvular to condition the Grade II medial -ɤ- that is absent in 4053 1ʔwọ.

Only *S- is certain.

PRE-TANGUT *SI-PA 'SNOW'

Blench and Post (2013) gathered words for 'snow' from 190 Sino-Tibetan (or as they prefer, Trans-Himalayan) languages and dialects and found that

there are some 30% unidentifiable forms [i.e., apparent isolates], the remainder assigned to some ten different roots, each of low frequency. In Sinitic, we find attestations of four of these roots suggesting that this may in fact represent a complex network of borrowing rather than reconstructions of great antiquity. Accordingly, the probability is low that 'snow' was part of the environment of early Sino-Tibetan speakers.

One of those unidentifiable forms was Tangut

4091 1vɨị 'snow'

That syllable could have a variety of pre-Tangut sources:

- the tenseness of its vowel (indicated by a subscript dot) is from a pre-Tangut *S-

- v- could be from *w- or from an unknown labial obstruent *-P- that lenited in intervocalic position

- the Grade III medial -ɨ- is automatic between v- and a high vowel; it need not be projected back into pre-Tangut.

- -i could be original or be from a pre-Tangut *-a that raised after a presyllabic *-i-

- if there was a coda in pre-Tangut, it was lost in this environment. Tangut generally did not permit nasalized tense vowels (from earlier *S- ... -vowel-nasal sequences) and lost all stops after *-a. That coda could not have been a final glottal and/or fricative *-H which would have conditioned a second ('rising') tone in Tangut instead of a first ('level') tone. (8.1.1:48: Nor could that coda have been *-ŋ; pre-Tangut *-aŋ became Tangut -o, not Tangut -a.)

The possibilities could be summed up as *(S(I))-Pi/a(C).
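To make the size of that search space concrete, the formula can be expanded mechanically. This is a minimal sketch that assumes one reading of the notation: the prefix is absent, *S-, or *SI-; the root vowel is i or a; and the coda C is optional. The function name `expand_candidates` is hypothetical.

```python
from itertools import product


def expand_candidates():
    """Expand *(S(I))-Pi/a(C) into every shape it abbreviates."""
    prefixes = ["", "S-", "SI-"]  # no prefix, *S-, or *S- plus a vowel
    vowels = ["i", "a"]           # original *i, or *a that later raised
    codas = ["", "C"]             # no coda, or an unspecified lost coda
    return ["*" + p + "P" + v + c
            for p, v, c in product(prefixes, vowels, codas)]


candidates = expand_candidates()
print(len(candidates), "candidate shapes:")
print(", ".join(candidates))
```

Under this reading there are twelve shapes, and a rGyalrong comparison would select just one of them (the *SI-Pa type).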

If the Tangut word is cognate to rGyalrong words for 'snow' (item 1288 in Nagano and Prins' database) such as Japhug tɤ-jpa as first proposed by Jacques in 2003, then its pre-Tangut ancestor was *Si-pa.

*Si-pa > *Si-pia > *Si-pi > *Si-βi > *Si-vi > *Svi > *vvi > *vvị > *vị > 1vɨị

The proto-rGyalrong root was something like *lpa(k). Some rGyalrong varieties (e.g., Yophyi, Marspang, Sabarkyo) have -k or even -ʔk; others like Japhug do not. I cannot confidently reconstruct *-k at the proto-rGyalrong level because Somang ta-jpâ 'snow' lacks the final *-k I would expect. I do not know whether the pre-Tangut form had *-k.

(23:39: Could pre-Tangut *-i- be from a preinitial *l-: *S-lpa > *S-jpa > *Si-pa?)

This *pa-type root may be unique to Qiangic. Some or even all of the words that Blench and Post derived from #pu[n] and #[te] van in Qiangic may actually be from *pa. But I don't think it's possible to link that root to their #pham which ends in a nasal. Could va-type words for 'snow' in Loloish be from *pa with an initial that lenited (as in Tangut)?

23:36: What about Naxi be 'snow'? Could it be from *pa with brightening of the vowel (see Jacques and Michaud) and an initial that voiced after a now-lost presyllable: *CV-p- > *CV-b- > b-?

8.1.1:39: Could Naxi b- be from *N-p- with a nasal prefix?

8.1.1:36: I forgot that I had already written about the Tangut word for 'snow' five years ago! But at least this time I included the character for that word.

SINO-TIBETAN AND/OR TIBETO-BURMAN AS SUBGROUPS OF TRANS-HIMALAYAN

Last night I rediscovered Blench and Post's 2013 paper "Rethinking Sino-Tibetan phylogeny from the perspective of North East Indian languages". After a year I have yet to fully absorb it, and here I only intend to comment on a couple of bits in it. However, before I get there, I need to outline pre-Blench/Post views on Sino-Tibetan.

The traditional view of the Sino-Tibetan family is that it consists of Chinese in the east and everything else ('Tibeto-Burman') in the west.

Tibeto-Burman: Tibetan, Burmese, Tangut ...
Chinese: Mandarin, Cantonese, Taiwanese ...

W. South Coblin (2010) observed that the late Gong Hwang-cherng's Proto-Sino-Tibetan reconstruction "was, phonologically at least, virtually the same language as Old Chinese."

Similarly, Matisoff's (2003) reconstruction of Proto-Tibeto-Burman resembles Classical Tibetan: e.g., his PTB *b-r-gyat 'eight' is almost identical to Classical Tibetan brgyad.

Are Old Chinese and Classical Tibetan really so archaic? Blench and Post would probably say no:

It cannot be emphasised too strongly that these [languages with early written records: i.e., Chinese and Tibetan] are, if not indeed irrelevant, of relatively very low significance for the reconstruction of proto-forms of a phylum the great majority of whose members have never been written and which must be far beyond the reach of epigraphy. This emphasis on 'major' languages has had another consequence: 'minor' and often poorly documented languages have generally been excluded from consideration. This is particularly true of the languages of North East India, where the way of life hardly matches the settled agricultural lifestyle depicted for Proto-Sino-Tibetan speakers. (p. 2)

In Blench and Post's model of Sino-Tibetan (or 'Trans-Himalayan') in figure 6 on page 18, at least three of the 'major' languages turn out to be the tip of just one branch of the family which I call 'Sino-Tibetan':

Trans-Himalayan (traditional Sino-Tibetan)
'Greater Nagish' | 2-11 other primary branches
'Greater Kachinic-Karenic' | Tani | Nagish
West | Kachinic | Karenic | East
'Qiangic-Sino-Tibetan-Nungish' | Tujia | Bai
North Qiangic | South Qiangic | 'Sino-Tibetan' redefined | Nungish
Sinitic | Lolo-Burmish-Naxi | Greater Tibetic (Bodish)

My placeholder names for nodes are in single quotes. I am not happy with 'Qiangic-Sino-Tibetan-Nungish' which is overly long (maybe just 'Qiangic-Nungish'?) or 'Greater Kachinic-Karenic' for the non-Tani-Nagish languages of the 'Greater Nagish' branch. ('Greater Kachinic-Karenic' and Tani are earlier and later offshoots of Nagish proper rather than sisters of Nagish proper.)

The number of primary branches is uncertain since Blench and Post are only certain about Mikir and Mruish as primary branches. Six other potential primary branches (Kamengic, Puroik, Mishmi, Miji, Hruso, Siangic) may not all be Trans-Himalayan. Blench and Post's tree has three more primary branches "for which there is apparently no data, so their position is simply a default."

The term 'Tibeto-Burman' could be recycled within the Blench-Post framework if Lolo-Burmish-Naxi and Bodish could be demonstrated to share an innovation absent from their sister Sinitic.

I don't know where Blench and Post would place Tangut.

Nishida (1976) regarded Tangut as "rather isolated from Lolo-Burmese proper", but still more closely related to Lolo-Burmese than to Sinitic or Bodish. If Tangut is Lolo-Burmish-Naxi, it would be 'Sino-Tibetan' under my new narrower definition.

More recently, Tangut has been regarded as Qiangic. Blench and Post split Qiangic into two branches. I presume Tangut would be a North Qiangic language as it is the northernmost Qiangic language. If so, it would not be 'Sino-Tibetan' in my new narrow sense.

In either case, Tangut is just part of 'Qiangic-Sino-Tibetan-Nungish' and not a primary branch.

All this reminds me of Austronesian. Just as Gong reconstructed Proto-Sino-Tibetan using just four languages (Chinese, Tibetan, Burmese, and Tangut), Dempwolff reconstructed Proto-Austronesian in 1934 using just three languages (Javanese, Tagalog, and Toba Batak). But eighty years later, we know that those three languages belong to just one branch of Austronesian, and that all the other primary branches are on Taiwan. Similarly, in the Blench-Post framework, all of Gong's four languages belong to just one branch of Trans-Himalayan, and all the other primary branches are in northeast India.

