I used to think that Old Chinese (OC) and Tangut didn't share a word for 'tooth' until yesterday when I saw Baxter and Sagart (2014)'s reconstruction of OC 齒 'tooth' as *t-[kʰ]ə(ŋ)ʔ resembling Tangut

0039 2korn1 [kõʳ] < *R-koN-H 'tooth'

OC *-ŋ is projected back from the 集韻 Jiyun (1037) alternate fanqie 稱拯 for 河東 Hedong Middle Chinese *tɕʰɨŋˀ. Similar *-ŋ ~ zero alternations are in

能  *nˁə ~  *nˁəŋ 'a kind of bear' *nˁə(ʔ) *nˁəŋ 'able, ability'

乃 *nˁə(ŋ)ʔ, phonetic in 仍 *nəŋ 'repeat'

*C.nəʔ 'ear' had a 河東 Hedong Middle Chinese reading *ɲɨŋˀ and Min reflexes with nasal vowels or final -ŋ. Should it be reconstructed with *-ŋ? Cognates like Written Tibetan rna, Written Burmese nāḥ, and Tangut

4681 1nu4 'ear'

have no nasal coda, but that may just mean that the nasal coda is a Chinese innovation. Baxter and Sagart (2014: 158) reconstructed rightward spreading of nasality from onset to coda in 'ear' and brought up the possibility of such spreading in 齒 'tooth' as well:

*t-ŋ̊əʔ > Proto-Min *kʰi(ŋ?) > Chaozhou lit. kʰi, colloq. tsʰĩ (sic)?

The forms above are from 漢語方音字匯 (the 2003 edition?) as reproduced in 小學堂; I would expect colloq. kʰĩ (which would be from Proto-Min) and lit. tsʰi (which would not be from Proto-Min). No other Min forms in 小學堂 have nasal rhymes.  The 1962 edition of 漢語方音字匯 only lists Chaozhou ki (sic). 广东闽方言语音研究 (p. 222) lists Chaozhou colloq. kʰi (sans nasalization) and lit. tsʰi.

If the true root of 齒 'tooth' had a nasal initial, it may be an ablaut variant of 牙 *ŋˁ<r>a 'tooth' sans an infix for multiple objects. However, Baxter and Sagart derive the initial of 牙 from *m-ɢˁ-. Could *t-ŋ̊- be from *t-m-ɢ-? Such m-ɢ(ˁ)-sequences are reminiscent of the *mɴɢ- that Guillaume Jacques (2014: 297) reconstructed for Proto-Japhug *-mɴɢam 'vise'. Japhug tɤ-mɢom 'vise' even has a tɤ- resembling Baxter and Sagart's *t-. Tangut *R- may partly be from *t-.

I am reluctant to link the Chinese, Japhug, and Tangut words for two reasons.

First, I cannot explain how a cluster like *mɢ- would become Tangut k-. Normally nasal-stop sequences became Tangut voiced (or prenasalized?) stops. Then again, maybe the root initial was *q-, judging from Somang tə-mkám 'vise' (Somang shifted uvulars to velars) and Written Burmese aṃ 'molar'. But that *q- would no longer match the *ɢ(ˁ)- of Old Chinese 牙 *m-ɢˁ<r>a and 齒 *t-m-ɢəʔ. Moreover, I expect *q- to condition Tangut Grade II, not Tangut Grade I: *t-qoN-H > *2korn2, not 2korn1.

Second, I cannot reconcile the possible Old Chinese coda *-ŋ with Japhug -m. Tangut -oN could be from *-am, so the rhyme is not an obstacle to a relationship between *R-koN-H (< *t-kam-H?) and tɤ-mɢom < *-mɴɢam.

NINE ELBOWS

I know little about the early Chinese script, so I didn't know that the character for the Old Chinese (OC) word 'elbow' was used to write a nearly homophonous word 'nine' until I read pages 31-32 of Baxter and Sagart (2014). The two words rhymed in both Old Chinese and Middle Chinese (MC):

'elbow': OC *t-[k]<r>uʔ > MC *ʈuʔ (now written 肘)

'nine': OC *[k]uʔ > MC *kuʔ (now written 九)

Brackets indicate "either *X, or something else that has the same Middle Chinese reflex as *X" (Baxter and Sagart 2014: 8).

Angled brackets indicate that *<r> is an infix.

One might expect those words' apparent Tangut cognates to rhyme, but they don't:

1298 1kirw4 < *R-k(r)uk 'elbow'

3113 1gy'4 < *NGəX 'nine'

*R- might be from an earlier *t- matching the Chinese *t- prefix for inalienable nouns (Baxter and Sagart 2014: 57) and the presyllable of Japhug -zgrɯ 'elbow'. In fact, the Tangut preinitial could have been a presyllable *Rɯ- with a high vowel as in Japhug. (A low presyllabic vowel would have conditioned the lowering of the main vowel: *Rʌ-uk > -erw.)

It is not possible to tell whether the Tangut word once had a medial *-r- or not, as *R-uk and *R-ruk merged as -irw.

The final *-k does not match either Old Chinese *-ʔ or the zero coda of Japhug. Is this *-k a suffix, or is the Tangut word unrelated?

Guillaume Jacques (2014: 189) suggested that 1298 'elbow' could actually be cognate to

1377 1kirw4 < *R-k(r)uk 'bad, crooked, slanting, inclined'

which might be cognate to Japhug *kɤɣ < *kɔk 'to bend' and, I would add, Old Chinese 曲 *kh(r)ok 'to bend.' I suppose 'elbow' (i.e., a bent thing) must have developed from 'bend' before it became 'crooked'.

As for 'nine', I cannot explain why pre-Tangut has *ə instead of *u. (1.10.0:: This might not be a problem for Gong Hwang-cherng who reconstructed Old Chinese 'nine' as *kjəgwx, but see Baxter and Sagart 2014's section on reconstructing the rhyme of 'nine'.)

I am tempted to derive the *-X of 'nine' from a *-t like the coda of Japhug kɯngɯt* 'nine', but *-X can also correspond to Japhug zero: e.g. (example added 1.10.0:40),

2205 1lyr' < *R-ly-X 'four' : Japhug kɯβde < *-pə-tlej 'id.'

Could *-X in 'four' be a suffix absent from Japhug and other languages (see below)?

Tangut *R- may correspond to Japhug preinitial *-t- which may be a prefix absent in Written Tibetan bzhi < *blyi 'four', Written Burmese leḥ 'four', and Old Chinese *s-li-s 'four'.

*1.10.0:15: Guillaume Jacques (2014: 158) pointed out that the -t of Japhug 'nine' was carried over from kɯrcat 'eight'.

A -t-less form is in kɯngɯ-rtsɤɣ 'nine stages'.

BAXTER AND SAGART (2014): INTRODUCTION

I first learned three months ago that William H. Baxter and Laurent Sagart's Old Chinese: A New Reconstruction (2014) had been released in the US, but I didn't get my own copy until tonight.

Although I have some doubts about the authors' reconstruction, I do agree with everything they wrote in their introduction, and I wish to make four points:

1. They reject this approach to reconstruction (p. 5):

One traditional view is that historical linguists have certain scientific procedures at hand that, if correctly applied, will produce reliable results and will not lead them into error. Conclusions resulting from the correct application of these methods may be regarded as "proved." (It follows from this view that if two scholars reach different results, one of them - at least - must have applied the methods improperly.)

I would only add that such a view assumes that the procedures are infallible. Are they? How do we know that?

2. Instead, Baxter and Sagart use the hypothetico-deductive method. They gave "the famous case of the solar eclipse of May 29, 1919" as an example of how predictions based on laws considered to be "scientifically proved" could be tested:

In the event, Einstein's theories turned out to fit the observations much more closely than Newton's (Dyson, Eddington, and Davidson 1920).

I share Baxter and Sagart's stance, and I sum it up as: theorize, test, theorize, test ...

3. Baxter and Sagart note that the English word reconstruct is problematic because it is an "accomplishment verb" with "both a process and an endpoint" rather than an "activity verb" whose endpoint is not normally presupposed. But the reconstruction of Old Chinese - and Tangut - is an ongoing process without any end so far. My thoughts about Tangut phonetics have changed a lot over the last year. I expect them to continue to change as I find new evidence and reinterpret old evidence.

4. I realized that Baxter and Sagart's "conventional transcription" for Middle Chinese is like my recent transcription of Tangut: both

are not phonetic reconstructions but conventional representations of the information about pronunciation given in Middle Chinese [and, in my case, Tangut] written sources. Accordingly, they are not preceded by asterisks; for typographical convenience, and to emphasize the fact that they are not reconstructions, they are restricted to ordinary ASCII characters (in italic type), rather than the International Phonetic Alphabet.

EVENLY WEIGHTED EVIDENCE FOR 'PRIMAL' NEUTRALIZATION

Last night I erred and thought I had found a case of -V and -V' words with homophonous -Vq derivatives. These sets from Gong (2002: 178-179, 185-187, 191) aren't quite what I was looking for, but they're close:

Set | Base noun | q-derivative verb
Set | Tangraph Li Fanwen # Reading Gloss | Tangraph Li Fanwen # Reading Gloss
1 | 1737 1ka1 to be even (v.i.) | 1576 2kaq1 to make even (v.t.)
1 | 5592 1kar'1 balance for weighing | 2907 2kaq1 to measure, weigh on a balance
1 | | 5682 to measure
2 | 5890 1ku2 loose | 2668 1kuq1 to loosen
3 | 3177 1kur1 cold | 3358 1kuq1 ice

The first set has a root *ka 'even' which became 1ka1. The other forms have affixes:

*R-ka-X > 1kar'1 'balance for weighing'

*S-ka-H > 2kaq1 'to make even'

*S-R-ka-X-H > 2kaq1 (not *2karq'!) 'to weigh on a balance'

Here I treat *X, the source of the mysterious 'prime' (') distinction in Tangut, as a suffix, but it could have been something else.
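The four derivations above can be spelled out mechanically. What follows is only a toy sketch under stated assumptions: `derive_reading` is a hypothetical helper, not part of any reconstruction; it assumes that *-H yields tone 2, *R- yields -r, *X yields the prime, and *S- yields -q while suppressing both -r and the prime. The grade is passed in by hand because nothing in this set determines it, and changes to the root itself (such as *qu > ku in the second set) are not modeled.

```python
def derive_reading(root, prefixes=(), suffixes=(), grade=1):
    """Toy spell-out of a pre-Tangut form as a Tangut reading (hypothetical helper)."""
    # Assumption: *-H is the source of tone 2; without it the tone is 1.
    tone = 2 if "H" in suffixes else 1
    if "S" in prefixes:
        # Assumption: *S- surfaces as tension (-q) and leaves no -r or prime.
        quality = "q"
    else:
        retroflex = "r" if "R" in prefixes else ""
        prime = "'" if "X" in suffixes else ""
        quality = retroflex + prime
    return f"{tone}{root}{quality}{grade}"


print(derive_reading("ka"))                          # *ka
print(derive_reading("ka", ("R",), ("X",)))          # *R-ka-X
print(derive_reading("ka", ("S",), ("H",)))          # *S-ka-H
print(derive_reading("ka", ("S", "R"), ("X", "H")))  # *S-R-ka-X-H
```

Under these assumptions the last two inputs come out identical, which is the neutralization at issue: once *S- is present, the sketch has no way to show whether *R- or *X was ever there.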

I once thought that Tangut might have had -rq rhymes with retroflexion and tension toward the end of the 105-rhyme list after the -q and -r rhymes. Nishida (1964) reconstructed rhymes 99-101 with both features. However, if  Tangut had such -rq rhymes, I would expect 'to weigh on a balance' to have a high-numbered rhyme instead of rhyme 61 -uq1 which is the first of the -q rhymes. Did pre-Tangut have *-rq rhymes that merged with *-q rhymes?

(1.8.21:08: Were *q- and/or *' < *X- difficult to pronounce with a retroflex vowel? Here is a table of possible vowel quality combinations:

V V' Vn Vn'
Vq - Vnq -
Vr Vr' Vrn

Vnq is only possible if V = e and Vn' and Vrn are only possible if V = o. Assuming these isolated rhyme types [-enq, -on', -orn] are correct [they may not be**], a few subtypes (-on'2, -enq4, -orn1) are of relatively high frequency:

Rhyme Rhyme # Tone 1 Tone 2
-on'2 59: 1.57 17 -
-on'3 60: 2.50 - 5
-on'4 2
-enq3 65: 1.62/2.55 3 7
-enq4 7 12
-enq2 75: 2.65 - 4
-orn1 94: 1.91/2.82 11 8
-orn4 95: 2.83 - 6

Is that high frequency the result of mergers?

*-anq, *-enq, *-inq, *-onq, *-unq, *-ynq > -enq?

*-an', *-en', *-in', *-on', *-un', *-yn' > -on'?

*-arn, *-ern, *-irn, *-orn, *-urn, *-yrn > -orn?

If so, why did the vowels merge in two different ways: i.e., into e in -nq-type rhymes and into o in -n' and -rn-type rhymes?

Why are -enq1, -on'1, -orn2, and -orn3 missing?

Why are some of the above rhymes only with one tone and not the other: e.g., why is there no 1-enq2?

And why is -enq2 listed far from the other -enq rhymes?)

The second set has a root *qu 'loose' (cf. Mawo Qiang qhə qhəʴ and Ergong quə quə, but probably not  Guanyinqiao Wobzi Lavrung kú*** or Japhug ɴɢu****) which became 1ku2.

1kuq1 'loosen' had a causative prefix *S-:

*S-qu > *1kuq2 > 1kuq1 (the rhyme -uq2 does not exist; it might have merged with -uq1)

The third set has a root *ku. 'Cold' had a prefix *R- not in all of its probable cognates:

Muya tu³⁵ ku⁵⁵

Wobzi rkhô

Rangtang Puxicun rGyalrong su (with a uvular!)

Ganzi Daofu Xianshuizhen rGyalrong ʂkʰur (with a final -r!)

1kuq1 'ice' had a nominalizing prefix *S-.

One might try to conflate nominalizing and denominalizing***** *S- as a 'part-of-speech-switching' prefix, but I suspect that various phonetically distinct prefixes merged as *S-: e.g., *sʌ- and *ɕɯ-, etc.

1.8.22:16: The initial of 'ice' is uncertain.

Gong (2002: 187, 191) reconstructed it as k-, but his reconstruction in Li Fanwen (2008: 544) has l-, perhaps because it is next to 2luq1-tangraphs in the Precious Rhymes of the Tangraphic Sea and may have been homophonous with them.

Shi Jinbo et al. (2000: 285) approximated the pronunciation of 'ice' with the fanqie 菊祖. I do not know the basis of that fanqie.

'Ice' is not in Homophones, so even its initial class is unknown.

*This variant of 0640 is in Gong (2002: 187).

**1.8.20:16: Arakawa reconstructed both -enq and -onq, which makes me wonder if his *-enq is from *-inq and *-enq and his *-onq is from *-unq and *-onq. (What would have happened to his *-anq and *-Inq = my *-ynq?)

-orn and -on' may be unique to my reconstruction. I mechanically derived them from Gong's -(j)owr and -i/joow, and I have little confidence in them.

***I would expect a Wobzi Lavrung form with *q-.

Perhaps I am wrong to derive Tangut Grade II velars partly from *uvulars. 'Head' is another problematic set:

2750 1ghu2 < *Cʌ-qu

Wobzi ʁú (could ʁ be a lenited *q?)

but Japhug tɯ-ku with a velar even though Japhug has q-!

****Japhug ɴɢ- is from Proto-rGyalrongic *ɴɢ-, not *nq-, so the root of ɴɢu cannot be *qu.

*****See the table at the top of the last entry for examples.

BLADE WOUNDS IN PURSUIT OF THE 'PRIME' PHONEME

Recently I have replaced Gong Hwang-cherng's long vowels with a prime symbol (') to signify a distinction of unknown nature in my Tangut reconstruction. I write its pre-Tangut source as *X.

Last night I found this instance of a V ~ V' merger, with both V and V' becoming Vq, in Jacques (2014: 255):

Base noun | q-derivative verb
Tangraph Li Fanwen # Reading Gloss | Tangraph Li Fanwen # Reading Gloss
1823 1ma4 | 4688 1maq4 to cut, pierce, bite
5702 1ma'4 wound | 5628 1maq4 to wound

(1.8.1:54: "mja¹" [= my 1ma4] for 1823 is a typo for "mjaa¹" [= my 1ma'4] in Jacques 2014. Hence 1823 and 5702 are homophones, just as 4688 and 5628 are homophones, and there was no merger of derivatives of roots with V and V'.)

5702 and 5628 share a root *ma with Japhug tɯ-ɣmaz < *-km- 'wound' and Written Tibetan rma 'wound'. Could Tangut 'prime' preserve something lost in Japhug and Written Tibetan, or is it an innovation: e.g., a trace of an affix?

I have adopted Arakawa's use of -q to indicate tension in the preceding vowel, but one can interpret it in other ways. Gong was the first to note that -q may correspond to non-Tangut sibilants. The comparisons below are from Jacques (2014), though the pre-Tangut reconstructions are mine:

0124 2luq3 < *SluH 'head' : Old Chinese 首 *l̥uʔ < *sl-? 'id.'

0385 2viq3 < *Ci-SpaH 'to be able' : Japhug spa 'to know, be able'

0527 1vaq3 < *Sɯ-pap 'tumor' : Japhug zbɤβ 'goiter'

2814 2lhiq4 < *Si-lha-H 'moon' : Japhug sla, Written Tibetan zla (< *sla) 'id.'

2878 1biq2 < *Sʌ-mbri 'willow' : Japhug ʑmbri 'id.'

Hence both Jacques and I reconstruct its pre-Tangut source as *S-.

Regardless of what *S- became in Tangut, it apparently could not exist with 'prime': e.g., the -q-derivative of 5702 1ma'4 with 'prime' is 5628 1maq4 without 'prime', not *1maq'4 with 'prime'. Did *1maq'4 ever exist? The absence of *-q' may indicate that -q and 'prime' were difficult or even impossible to pronounce together.

Next: Further incompatibilities.

WEAK EVIDENCE FOR A 'STRONG' RHYME

As I was filling out the 2015 column in my 105rhymes file (Excel / HTML), I noticed that rhyme 102 appears as Grade I -oor in Gong's 1997, 2003, and 2008 lists of rhymes but as Grade III -joor with a medial -j- in all of his reconstructions of individual readings with the sole exceptions of

2628 1goor 'man, male' = 1gor'1 in my transcription

4748 1koor 'brocade' = 1kor'1 in my transcription

The other rhyme 102 syllables are

Reading Tangraph Li Fanwen # Gloss

3386 first half of 3386 1161 1sjoor 1śjịj 'levity'

2947 strong (the name of rhyme 103 in the Tangraphic Sea)

0980 full, excessive

1746 false, fake

5944 flame, light

1126 a unit of length

2869 the surname Lhor

4247 span

The last three are in the Mixed Categories of the Tangraphic Sea along with other lh-tangraphs.

The only transcriptions of any of these tangraphs I can find are

*2ŋgo1 for 2628 in Timely Pearl 294

*2kwo1 for 4748 in Timely Pearl 256

*3lõ3 for 4247 in Timely Pearl 324

Gong may have chosen to reconstruct rhyme 102 as Grade I because two out of three Chinese transcriptions were Grade I. Yet he reconstructed -j- in all coronal-initial rhyme 102 syllables, perhaps because 4247 was transcribed with Chinese Grade III (characterized by *-j- in Gong's reconstruction of Tangut period northwestern Chinese).

If rhyme 102 had a variant -joor after coronals, that variant would be homophonous with rhyme 103 which Gong reconstructed as -joor after velar kh- and coronal n-. Why would the Tangut have placed some -joor syllables in rhyme 102 (1sjoor, 1ljoor, 1lhjoor) and others (1njoor) in rhyme 103?

Without Tibetan transcriptions with r-, I cannot be certain rhyme 102 is retroflex. The cognates in Jacques (2014: 199) do not contain r:

4247 1lhjoor 'span' : Japhug tɯ-ɟom < *-tlj- : Written Tibetan ɴdom < *N-l-

The placement of rhyme 102 toward the end of the list hints at retroflexion, though, as we know for sure that the retroflex rhymes follow the plain and tense* rhymes. If rhyme 102 was retroflex, its retroflexion must have been conditioned by an *r-prefix absent from Japhug and Tibetan. (Another possibility is that a *t-prefix cognate to Japhug tɯ- became *r-. A third possibility is to derive Arakawa's ld- for 4247 from *r-tl-.)

The Chinese transcriptions cannot tell me whether the vowel was long or short.

All I can be sure of is that rhyme 102 was something like 1-o without a rising tone counterpart 2-o. Maybe the best transcription would be -O with capitalization signifying 'something like'.

Rhyme 102 surely cannot be a simple -o or -or because those values have already been taken (rhymes 51 and 95 in Gong's reconstruction) and it is so rare. I use a prime symbol to distinguish 102 -or' from 95 -or. (-' may have been a glottal stop.)

Arakawa reconstructed rhyme 102 as -woq2 (i.e., as a second tense -wọ somehow distinct from rhyme 73, his first tense -(w)oq), projecting the medial *-w- of the Chinese transcription of 4748 onto all syllables of that rhyme. I do not know why he chose to reconstruct it as tense, as it is preceded by 22 retroflex rhymes in his reconstruction and followed by a plain rhyme -ya:n.

*The agnostic can call this second group of rhymes 'nonplain and nonretroflex' as Chinese, Tibetan, and Sanskrit transcription data cannot directly support tenseness as its defining characteristic.

TANGUT RHYME DATABASE: 5 JANUARY 2015 EDITION

I added a new column for my latest Tangut transcription to my 105rhymes file (Excel / HTML).

Expanding on what I wrote last night, the transcriptions are in the following format:

tone initial medial vowel retroflexion nasality tension prime grade

tone: 0 (unknown), 1 ('level'), 2 ('rising'), 4 ('entering')
initial: 29-32 initials plus an unwritten glottal stop:
p-, ph-, b-, m-, (f-?), v-
t-, th-, d-, n-
(lt-?), ld-, l-, lh-
ts-, tsh-, dz-, s-, z-
c-, ch-, j-, (ny-?), sh-, zh-, r-
k-, kh-, g-, ng-, h-, gh-
medial: -(y)w-
vowel: -a-, -e-, -i-, -o-, -u-, -y-
retroflexion: -r
nasality: -n
tension: -q
prime: -'
grade: -1 (I), -2 (II), -3 (III), -4 (IV)
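The slot order can be checked mechanically. Below is a minimal sketch, assuming the format is exactly the sequence of slots listed above; `INITIALS`, `READING`, and `parse_reading` are hypothetical names introduced only for illustration. It accepts grade 0 for 'unknown' as well as 1-4, treats a missing initial as the unwritten glottal stop, and checks slot order only, not which combinations actually occur. Because it covers only the listed slots, it rejects readings with a final -w such as 2sew1.

```python
import re

# Initials from the list above; the parenthesized ones (f-, lt-, ny-) are uncertain.
# Longer spellings come first so that "tsh" is tried before "ts" and "t".
INITIALS = [
    "tsh", "ts", "dz", "ph", "th", "kh", "ch", "sh", "zh",
    "ng", "ny", "gh", "lt", "ld", "lh",
    "p", "b", "m", "f", "v", "t", "d", "n", "l",
    "s", "z", "c", "j", "r", "k", "g", "h",
]

READING = re.compile(
    r"(?P<tone>[0124])"                           # 0 unknown, 1 level, 2 rising, 4 entering
    r"(?P<initial>" + "|".join(INITIALS) + r")?"  # absent = unwritten glottal stop
    r"(?P<medial>yw|w)?"
    r"(?P<vowel>[aeiouy])"
    r"(?P<retroflexion>r)?"
    r"(?P<nasality>n)?"
    r"(?P<tension>q)?"
    r"(?P<prime>')?"
    r"(?P<grade>[0-4])"                           # assumption: 0 marks an unknown grade
)


def parse_reading(text):
    """Split a transcribed reading into its slots, or return None if it does not fit."""
    match = READING.fullmatch(text)
    return match.groupdict() if match else None


for reading in ["1kar'1", "2korn1", "2khwe1", "0a0", "2sew1"]:
    print(reading, parse_reading(reading))
```

Slots that are absent from a reading come back as None, so 0a0 is returned with no initial rather than being rejected.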

The tone numbers are based on Chinese conventions. 3 would be a 'departing' tone, but no such tone is in the Precious Rhymes of the Tangraphic Sea which only lists three tones: 'level', 'rising', and 'entering'. The scare quotes indicate that the names taken from Chinese are not necessarily to be interpreted at face value: e.g., the 'level' tone may not have been level (in modern Cantonese, the level tones are falling, and one level tone in Mandarin is rising).

Medial -y- is only in rhyme 105 -ywa4 which may have been [ɥa].

The vowel quality codes (-r, -n, -q, -') are from Arakawa and are in an order that looked aesthetically pleasing to me: Vnq and Vrn are easier on my eyes than Vqn and Vnr because they are close to the English sequences Vnk and Vrn. -nq is also the sequence used by Arakawa. -rn is unique to my system; it is absent from Arakawa's: e.g., my Grade I -orn corresponds to Arakawa's Grade III -o:r. (Arakawa reconstructed vowel length which I rejected.)

I placed the prime symbol at the end to avoid implying that, for instance, o'n2 or a'r1 are o and a followed by syllabic n2 and r1. on'2 and ar'1 look more like single units (albeit one might misinterpret n' and r' as consonants distinct from n and r, though they are vowel qualities, not consonants).

If a rhyme could be Grade III or IV, I assign it to III if it follows a 'vigilant' initial (class II or VII initial or l-); otherwise I assign it to grade IV.

Not all combinations of tones, consonants, vowels, vowel qualities, and grades are possible: e.g., tension cannot coexist with retroflexion.

A NEWLY FOUND INSCRIPTION IN A NEW TRANSCRIPTION

Thanks to Andrew West for this article on a Tangut gravestone inscription from 1278 that was found in 2013. The Chinese glosses are from the article and emended by Andrew.

Left line | Right line
Tangraph Li Fanwen # Reading Chinese gloss English gloss | Tangraph Li Fanwen # Reading Chinese gloss English gloss
3799 2sew1 small (loan from Chinese *2sew4) | 1234 1then4 the Chinese surname Tian
1141 2li3 the Chinese surname *2li3 'plum' (Mandarin Li) | 0477 1zyq4 maternal surname (loan from Middle Chinese *(d)ʑi(e)ˀ?)
1531 1ga4 army | 3118 1hu1 transcription of Chinese *4fu3 'lucky' (Mandarin fu)?
2805 2bu'4 command | 0052 1zhi3 transcription of Chinese *1zhi3 'child' (Mandarin er)?
2893 2khwe1 great | 3654 0a0 monk; kin term prefix borrowed from Chinese *1a1-?
 | 0092 2ma4 mother
Left: 'Little Li, Great Commander of the Army'
Right: 'Tian family, Fu'er, Mother' (Li's wife)

Six or even seven of the eleven words are loans from Chinese.

Both the Tangut and Tangut period northwestern Chinese reconstructions are in a new format:

tone number (1-2 for Tangut, 1-4 for Chinese) + syllable + grade number (1-4)

0 indicates an unknown Tangut tone number or grade.

Tangut syllables have the following consonants which may be followed by medial -w-:

p- t- ts- c- [tʂ] k-
ph- th- tsh- ch- [tʂʰ] kh-
b- d- dz- j- [dʐ] g-
m- n-   (ny- [ɲ]?) ng- [ŋ]
(f-?) (lt- [tɬ]?) s- sh- [ʂ] h- [x] or [h]
  ld- [dɮ]? z- [ɮ] zh- [ʐ] gh- [ɣ] or [ɦ]
v- [ʋ] or [v] l-   r-  
  lh- [ɬ]  

Initial glottal stop is unwritten. Hence w- is [ʔw] contrasting with v- [ʋ] or [v]. (Vietnamese has the same distinction: e.g., oa [ʔwa] vs. va [va])

Tangut syllables have only six vowels (not including the prime symbol -' indicating a distinction of unknown nature, -n for nasalization, or -q for tension): a, e, i, o, u, y. Y is a central vowel.

I have changed my mind so many times about Tangut grades that I have decided to use this agnostic notation instead. Removing the numerals, the prime symbol, and -q results in a simplified notation suitable for lay publications.
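As a small illustration of that simplification, here is a sketch in which `simplify` is a hypothetical helper name. It assumes that digits, the prime symbol, and q never occur in a reading except as tone, grade, prime, and tension marks, which holds for the consonant and vowel inventory given above.

```python
import re


def simplify(reading):
    """Drop tone and grade numerals, the prime symbol, and tension -q (hypothetical helper)."""
    return re.sub(r"[0-9'q]", "", reading)


for reading in ["2sew1", "1zyq4", "2bu'4", "2khwe1"]:
    print(reading, "->", simplify(reading))
```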

