I have long been troubled by my reconstruction of Tangut rhyme 41

3798 1tsen1 'small' (which looks like 'person' + 'small' but is from 'few' + 'small')

because the evidence points in different directions:

1. Internal evidence

There are only fourteen known rhyme 41 tangraphs in eight homophone groups: seven in the first ('level') tone volume of the Tangraphic Sea and one with dz- in the Mixed Categories volume:

Homophone group
Tangraphic Sea 'level' tone volume


Mixed Categories of the Tangraphic Sea

The low frequency of this rhyme and the absence of a 'rising' tone counterpart *2-en1 imply that it could not have been something simple like -e.

The circles divide some but not all homophone groups. The reasoning for the implied grouping of, for instance, 1phen1 and 1den1 as distinct from 1ben1 is unknown. (I would understand if 1phen1 and 1ben1 were grouped together since both had Class I [i.e., labial] initials.)

The coexistence of v- and l- with dz- indicates that rhyme 41 must have been Grade I. They do not normally coexist in any other grade:

v-, l-




(The table above only shows the general pattern. There are exceptions.)

Tangut rhymes normally form sequences with the following order:

- Grade I-IV V

- Grade I-IV V'

- Grade I-IV Vn

Rhyme 41 is where I would expect 1en1:

- Rhymes 34-37: -e1, -e2, -e3, -e4

- Rhymes 38-40: -e'1, -e'2, -e'3/e'4

- Rhyme 41: -en1?

- Rhymes 42-43: -en2, -en3/en4

Yet as we will see, no other evidence supports a nasal vowel. (My -n indicates nasalization; it is not a consonant.)
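The expected position of rhyme 41 can be checked mechanically. Below is a minimal Python sketch that numbers the -e rhymes in the V, V', Vn order listed above. The function name, its parameters, and the assumption that rhymes 38-40 are the primed (V') set are mine for illustration only; the grade inventories are copied from the list above.

```python
def rhyme_sequence(vowel, first_rhyme, grade_sets):
    """Number rhymes consecutively through the V, V', Vn sets (hypothetical helper)."""
    suffixes = ["", "'", "n"]  # plain, primed, nasalized
    sequence = {}
    number = first_rhyme
    for suffix, grades in zip(suffixes, grade_sets):
        for grade in grades:
            # A merged grade such as "3/4" yields two alternative spellings.
            sequence[number] = "/".join(
                f"-{vowel}{suffix}{g}" for g in grade.split("/")
            )
            number += 1
    return sequence


# Grade inventories for the plain, primed, and nasalized -e sets.
E_GRADES = [["1", "2", "3", "4"], ["1", "2", "3/4"], ["1", "2", "3/4"]]
e_rhymes = rhyme_sequence("e", 34, E_GRADES)

for number, rhyme in e_rhymes.items():
    print(number, rhyme)
```

Under those assumptions the slot numbered 41 comes out as -en1, which is the internal argument in a nutshell: the label follows from the position in the sequence, not from any transcription.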

2. Chinese transcription evidence

As I already noted in my entry on line 104 of the Golden Guide,

1720 1ven1

was a transcription character for the Chinese Grade III (not I!) surname 隗 *2wi3 in the Tangut translation of Sunzi.

1720 also transcribed Chinese Grade I  外 *3wai1 in Sunzi and the Timely Pearl and Grade III (not I!) 偉 *2wi3 in the Ganying Pagoda inscription.

Sofronov (1968 II: 30) listed 1720 as a transcription of Chinese Grade I 磑 *1we1 ~ 3we1.

None of those sinographs would have been read with nasal vowels in the dialect known to the Tangut (unless the nasality of *ŋ-, the former initial of 外 and 磑, spread to the vowel - but 隗 and 偉 never had nasal initials).

No other rhyme 41 tangraphs were used to transcribe Chinese to the best of my knowledge.

In the Timely Pearl,

3798 1tsen1 'small'

was transcribed as 栽 *1tse1, which may indicate that the nasality of rhyme 41 was lost by the end of the 12th century in the author's dialect. I do not know of any other

3. Tibetan transcription evidence

All three Tibetan transcriptions known to me lack nasals:

dwi, dwe (Nishida 1964: 53; frequency of each unknown; note the absence of -w- in the reconstructions!)

-eH (Tai 2008: 216; initial consonant unknown)

4. Sanskrit transcription evidence

As far as I know, rhyme 41 tangraphs were never used to transcribe Sanskrit. That implies rhyme 41 was unlike anything in Sanskrit: e.g., it was not short i, long ī, or long e [eː] with or without a following nasal or nasalization. (Sanskrit has no short e.)

5. Comparative evidence

Guillaume Jacques (2014: 186) compared

3798 1tsen1 'small'

to Japhug xtɕi < 'id.' without a nasal, but noted the unexpected initial correspondence (cf. Somang kə-ktsî 'id.' with ts-). He derived rhyme 41 from pre-Tangut *-ij without a nasal. Is it possible that Japhug and Somang lost a final nasal?

6. Conclusion

For the time being, I weigh the internal evidence over the external evidence and write rhyme 41 as 1-en1, but I remain uneasy about nasality.

An alternative is to follow Gong and write rhymes 41-43 with -i or -y instead of -n:

41. -ei1 (instead of -en1; cf. Gong's -əj)

42. -ei2 (instead of -en2; cf. Gong's -iəj)

43. -ei3/-ei4 (instead of -en3/-en4; cf. Gong's -jɨj)

I would then change my -on rhymes to -ou or -ow (cf. Gong's -ow) so that there are no nasalized mid vowels.

However, I do not understand why Gong reconstructed final glides in those rhyme groups.

GSR 0289

In line 104 of the Golden Guide, the Tangut character


may have transcribed the Chinese surname 薛, which was pronounced something like *4se4 in the dialect known to the Tangut and as ɕie in the modern northwestern dialect of Xi'an. Those readings lack a labial segment present in the modern standard Mandarin reading Xuē [ɕye]. The [y] of Xuē corresponds to nothing in the prestige Middle Chinese dialects preserved in Chinese traditional phonological sources and in the reading traditions of Japan, Korea, and Vietnam:

Middle Chinese *siet

Phags-pa Chinese ꡛꡦ <see>

Sino-Japanese setsu

Sino-Korean sŏl; idealized Middle Sino-Korean syə́rʔ (this y is IPA [j], not [y])

Sino-Vietnamese tiết < *siət

Baxter and Sagart (2014)'s reconstructions of Grammata Serica Recensa series 0289 also lack labial segments:

Old Chinese
Old Chinese (this site)
Middle Chinese (this site)
0289a *s.ŋat
*sɯ.ŋat *siet

to control, correct, govern
0289d-e spec. of plant; place name (i.e., where the plant grows?)
0289g *ŋ(r)at *
(was *C- = *s-?)
concubine’s son
0289j (shoots from) tree stump

(Karlgren [1957: 89] wrote,

The alternation s- :ng- in this series is probably a trace of some Archaic [i.e., Old Chinese] initial consonant combination.

and Baxter, Sagart, and I would agree.)

Hence I thought the [y] of 薛 Xuē [ɕye] might be a local Mandarin innovation, but it isn't. Forms with labial vowels coexist with ɕie-type forms in all branches of Chinese: e.g.,

Jin: 并州 Bingzhou ɕieʔ (lit.), ɕyəʔ (colloq.)
Wu: 常山 Changshan ɕyʌʔ

Hui: 祁門 Qimen syɐ̆

Gan: 都昌 Duchang siol

Xiang: 雙峰 Shuangfeng ɕya (lit.), se (colloq.)


Southern Min: 雷城 Leicheng soi

Northern Min: 石陂 Shibei sye

Yue: 高要 Gaoyao sit (new), syt (old)

Ping: 南寧 Nanning ɬyt

Hakka: 惠州 Huizhou syet

Unclassified languages also have a mix of labial and nonlabial forms: e.g., in 富川 Fuchuan, the 七都 Qidu dialect has si but the 八都 Badu dialect has suɐi.

I don't see any obvious pattern here. In Bingzhou the labial form is colloquial (i.e., likely to be native or at least from an earlier layer of borrowing), but in Shuangfeng, it is literary. (I suppose that if the labial form is an innovation, it must originate outside Shuangfeng.) Labiality is so widespread that it must have been present in some earlier prestige dialect(s), albeit not those recorded in the mainstream phonological tradition.

Nonlabial forms once (?) existed in Beijing Mandarin itself. Giles (1892: 449) listed 薛 under the reading* xiē and gave xiě (with a different tone) and xuē as alternate readings. Are xiē and xiě now extinct? Do any 薛 families today call themselves Xiē or Xiě?

*I converted Giles' romanization to pinyin for ease of comparison.

THE GOLDEN GUIDE: LINE 104: TANGRAPHS 516-520

104. I have a long list of topics, and I can't make up my mind about what to write about next, so I'll fall back on the Golden Guide. Two lines to the last surname ...

Tangraph number | 516 | 517 | 518 | 519 | 520
Li Fanwen number | 1720 | 1456 | 4686 | 0881 | 4760
My transcription | 1ven1 | 1chhi2 | 1khwan4 | 2se4 | 1an1
Tangraph gloss | the surname Ven | the surname Chhi; Sanskrit chi, che? | the surname Khwan; transcription of Chinese 郡 *3khwin3 'administrative region' | the surname Se | the surname An
Word | the surname 隗 Wei (*2wi3) or 韋 Wei (*1wi3) | the surname 翟 Zhai (*4chhe2) | the surname 權 Quan (*1khwan2) | the surname 薛 Xue (*4se4) | the surname 安 An (*1an1)
Translation | Wei, Zhai, Quan, Xue, An

Now I have Kotaka's six-part series on the Golden Guide on hand, so I'll use that as reference in addition to Nie Hongyin and Shi Jinbo's article. Kotaka's notes make me realize how poorly understood the relationship between the phonologies of Tangut, Tangut period northwestern Chinese, and Sanskrit still is.

516: I suppose this analysis of 1720 somehow describes the Ven family:


1720 1ven1 = right of 1105 1khon4 'to give' + left of 5659 1ver1 'luxuriant'

Grade I 1720 appears in the Tangut translation of Sunzi as a transcription of the Chinese Grade III surname 隗 *2wi3 which had no nasality. Why wasn't 隗 transcribed as Grade III *vi3?

Nie and Shi identified 1720 as the Chinese Grade III surname 韋 *1wi3 which also had no nasality. Kotaka noted that 韋 was transcribed as

5287 1vi1 (with Grade I, not III!)

in The Forest of Categories, so he thought 1720 was unlikely to be 韋. The fact that *wi3 (with different tones) was transcribed as 1ven1 and 1vi1 may indicate that Chinese *-i3 was unlike either Tangut -en1 or -i1 and had no exact match in Tangut.

517: 1456 is a fanqie character:


1456 1chhi2 = 1796 1chhuq3 'to lure' + 4972 1chi2 'to amuse'

The Tangraphic Sea states that 1456 is for transcribing mantras. Arakawa (1997: 116) thought 1456 might represent Sanskrit chi or che, but Sanskrit ch- was normally transcribed as Tangut tsh-, not chh-. Moreover, the rhyme -i2 is not in Arakawa's table of attested Sanskrit transcriptions, implying that -i2 was somehow unlike Sanskrit short i, long ī, or long e [eː]. (Sanskrit has no short e.)

The use of 1456 1chhi2 for Chinese *4chhe2 may imply that neither Tangut -i2 nor Tangut -e2 precisely matched Chinese *-e2.
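Since 1456 is a fanqie character, its reading can be assembled from its two spellers. The following is a minimal Python sketch of that assembly, not a full model of Tangut fanqie. It assumes my transcription format (tone digit, initial, rhyme ending in a grade digit), treats y as a vowel letter, and takes the tone from the second speller; the function names are hypothetical.

```python
import re

# Tone digit, then initial consonant letters, then the rhyme (vowel onward).
SYLLABLE = re.compile(r"^(\d)([^aeiouy]*)(.+)$")


def split_syllable(transcription):
    """Split e.g. '1chhuq3' into ('1', 'chh', 'uq3')."""
    match = SYLLABLE.match(transcription)
    if match is None:
        raise ValueError(f"unparseable transcription: {transcription!r}")
    return match.groups()


def fanqie(initial_speller, rhyme_speller):
    """Combine the initial of the first speller with the tone and rhyme of the second."""
    _, initial, _ = split_syllable(initial_speller)
    tone, _, rhyme = split_syllable(rhyme_speller)
    return f"{tone}{initial}{rhyme}"


print(fanqie("1chhuq3", "1chi2"))  # spellers of 1456
print(fanqie("2y4", "1an1"))       # spellers of 4760 (zero or glottal initial)
```

The same function covers the analysis of 4760 further down in this entry, where the first speller contributes an empty (zero or glottal stop) initial.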

518: 4686 is a semantic compound:


4686 'administrative region' = 4719 2keq2 'boundary' + 2725 1wo2 'circle'

4686 1khwan4 is a poor vocalic match for Chinese 郡 *3khwin3 'administrative region'.

4686 appears in Sunzi as a transcription character for 權 *1khwan2 which can be a surname. The vowel types match but the grades (Tangut IV, Chinese II) don't.

519: The analysis of 0881 is unknown.

The left side may not be phonetic since I cannot find any se4-graphs containing it. Perhaps it describes the Se family: e.g., if it is short for

4773 2luq3 'silk',

the Se might have been known as sellers of silk (and Se brings to mind Latin sericum 'silk', though the similarity is probably coincidental).

The right side must be from 2888 2my1 'surname'.

Nie and Shi identified 0881 as a transcription of Chinese 薛 *4se4, but Kotaka pointed out that 薛 was transcribed as

3683 2sa4 (first syllable of 3683 2532 2sa4 1de4 'day after tomorrow'; 1de4 is 'day')

in The Forest of Categories.

520: 4760 is a fanqie character:


4760 1an1  = top of 4940 2y4 '' + bottom left and center of 4685 1an1

4940 represented either an initial glottal stop or zero. Homophones defined 4760 as a surname and its homophone 4685 as a place name, but 4760 also appeared in the place name

4760 3628 1869 1an1 1ghwan4 1po1 '安原堡 *1an1 1(ngg)wan3 1po1 Anyuan Fortress'

The correspondence of Tangut gh- to Chinese *ngg- or zero should be investigated.

MEETING TANGUT CROWS

Do non-Chinese Sino-Tibetan languages have different sets of consonants corresponding to the (labio)velars and (labio)uvulars that Baxter and Sagart (2014) reconstructed? I would like to see the comparative evidence in Pan Wuyun's "喉音考" (On laryngeals, 1997) mentioned in Baxter and Sagart's 2010 paper on uvulars. Baxter and Sagart focused on Chinese-internal evidence, though they did point out that they thought

Written Tibetan g- corresponded to Old Chinese *ɢʷ- (which I used to reconstruct as *w-)

Written Burmese ဟောင်း <hoŋḥ> 'old' might be cognate to 公 'father, ruler' (< 'elder'?) which they now reconstruct as *C.qˁoŋ (*Cə.qˁoŋ in 2010)

cf. the velar-velar correspondence of

Written Burmese ကိုး  <kuiḥ> : Old Chinese *kuʔ 'nine'

It would be especially nice if comparative evidence supported the pharyngealized/nonpharyngealized distinction in Baxter and Sagart's velars and uvulars.

I proposed that uvulars may be one source of Tangut Grade II based on these comparisons:

2750 1ghu2 < *ɢu 'head' : Old Chinese 后 *ɢˤ(r)oʔ 'sovereign', Written Tibetan mgo 'head'

4046 1khi2 < *CI-qha 'bitter' : Zhongu qʰɐⁿde 'bitter', Ronghong Qiang qʰɑ(q)

but Old Chinese *kʰˤaʔ has a velar!

Could Tangut clarify whether Old Chinese 烏 *qˤa  ~ 鴉 *qˤra 'crow' and 迓 *ŋˤ<r>ak-s (*m-[qʰ](r)ak-s?) 'to meet'* had uvulars?

Nishida (1964: 203) and Grinstead (1972: 114) glossed

1550 3110 2ka1 0jiq3  < *kra or *qa  + S-ji(-H) or *SI-ja(-H)

as 'crow', but the Chinese gloss in the Timely Pearl is 老鴟, literally 'old bird of prey' ('old' is a common noun prefix; cf. English old expressing familiarity rather than age). In any case, 2ka1 is Grade I rather than Grade II *2ka2 which I would expect for a cognate of Old Chinese *qˤ(r)a.

2ka1 can appear by itself, but 0jiq3 is a bound morpheme. (0 indicates an unknown tone.)

Other Tangut words for 'crow' cannot be cognate to 烏/鴉:

2261 0176 2on4 1na'3

2261 is also the second syllable of 1ta1 2on4 'swallow'; apparently an on was a kind of bird, and a crow was a 'black on'.

2262 2114 1jwon3 1leq2

The graph for 2114 is derived from 2262 'bird' plus 0176 'black'.

As for 迓 *ŋˤ<r>ak-s (*m-[qʰ](r)ak-s?) 'to meet', the only vague match I could find was Grade II

4040 1khu'2 'to invite'

which could be from *khru-X, *qhu-X, or *qhru-X. *qhru is the most likely since its cluster matches that of Japhug qru 'to invite'**.

*X (the pre-Tangut source of the mysterious phonemic attribute that I write with an apostrophe as a convenient substitute for a prime symbol) must be an affix***, as cognates identified by Guillaume Jacques (2014: 59) lack it:

2791 2khu4 < *Cɯ-qhru-H 'to call, invite'

3254 2khu4 < *Cɯ-qhru-H 'imperial edict'

(The presyllable *Cɯ- conditioned Grade IV and the *-H conditioned the second tone.)
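The conditioning just described can be written out as a small rule set. This is a sketch under stated assumptions, not a general pre-Tangut-to-Tangut converter: it only encodes what this entry proposes (*Cɯ- gives Grade IV, *-H gives the second tone, *-X gives the apostrophe, a uvular or *-r- without a presyllable gives Grade II). The default first tone, the Grade I fallback, the merger of q with k, and the deletion of medial r are my simplifications for illustration, and the function name is hypothetical.

```python
def tangut_reflex(pre_tangut):
    """Predict a Tangut transcription from a hyphenated pre-Tangut form."""
    parts = pre_tangut.lstrip("*").split("-")
    has_high_presyllable = parts[0] == "Cɯ"
    if has_high_presyllable:
        parts = parts[1:]
    root, suffixes = parts[0], parts[1:]

    if has_high_presyllable:
        grade = 4                      # *Cɯ- conditioned Grade IV
    elif root.startswith("q") or "r" in root:
        grade = 2                      # uvular or *-r- as a source of Grade II
    else:
        grade = 1                      # fallback: an assumption, not argued here

    tone = 2 if "H" in suffixes else 1   # *-H conditioned the second tone
    prime = "'" if "X" in suffixes else ""
    segments = root.replace("q", "k").replace("r", "")
    return f"{tone}{segments}{prime}{grade}"


for form in ["*Cɯ-qhru-H", "*qhru-X", "*khru-X", "*qhu-X"]:
    print(form, ">", tangut_reflex(form))
```

All three candidate sources of 4040 come out identically, which is why the choice among them has to rest on external comparisons such as Japhug qru.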

I don't think any of these khu-words are related to the Chinese word, as the rhymes cannot be reconciled.

Guillaume compared 4040 to Written Burmese ကြို  <krui> 'to meet someone on arrival' which has <k-> corresponding to Japhug q-. If Japhug q- corresponded to Old Chinese *q-, does that mean *q- remained as a stop in Written Burmese clusters?

*q- > h- in ဟောင်း <hoŋḥ> 'old'

*qr- > kr- in ကြို  <krui> 'to meet someone on arrival'

*1.24.4:48: The phonetic series of 迓 is largely uvular, and 迓 may be related to 御 *m-[qʰ](r)aʔ 'to ward off', so perhaps 迓 could be reconstructed with a uvular root initial.

**1.24.4:57: Although Guillaume Jacques (2014: 58) proposed Japhug qru as a cognate, he reconstructed the pre-Tangut form of 4040 as *khjoo without a uvular or *-r-. His *-j- is a carryover from Gong Hwang-cherng's Tangut reconstruction.

***1.24.4:57: I conventionally write *-X as a suffix, but it could have been a prefix or infix.

A 'DENTAL' UVULAR SERIES

At the end of my last entry, I wrote,

Perhaps 鴉 *q(r)a once had an *m-prefix for animals that justified the choice of 牙 *m-ɢˤ<r>a 'tooth' as a phonetic but vanished without a trace.

Baxter and Sagart (2014) reconstructed most other members of Grammata Serica Recensa series 0037 牙 'tooth' with the structure *N-Qra:

Old Chinese (B&S)
Old Chinese (this site)
Middle Chinese

*m-ɢ<r>a *ŋæ

*[N]-qʰraʔ *ŋæˀ covered galleries

*m-ɢ<r>a *ŋæ shoot, sprout

*ŋ<r>ak-s *ŋæʰ to meet
kind of musical instrument
proper, refined

to walk slowly
2nd syllable of mountain name
interrogative particle

I added 0047 since it too contains 牙 'tooth' as a phonetic. (Karlgren did not recognize that. Hence he placed it in a separate series. It seems that the trend is to combine his series. I can't think of any series of his that has been split by later scholars.)

Notes on individual members:

0037a 牙 *m-ɢˤ<r>a: I long assumed that 'tooth' had *ŋ- and might have been a loan from a Southeast Asian ŋa-word for 'ivory'. But if Baxter and Sagart are correct, then the word could have spread throughout Southeast Asia after *m-ɢˁ- fused into a nasal. Or *m-ɢˁ- was borrowed as a nasal. In either case, the only non-Chinese support I know of for a medial liquid is Bahnar ŋə-la 'ivory' (from Schuessler 2007: 550). I could not find that word in SEAlang's Bahnaric data which contains other words that appear to be reflexes of Shorto's (2006) Proto-Mon-Khmer *[m]laʔ 'ivory' and *bluək 'id.' I presume Shorto considered ŋa-words to be borrowings. Perhaps Chinese traders lacking their own word for tusks called them 牙 'teeth' and Southeast Asians borrowed that word for 'tusk'. It is unlikely the semantic shift went in the other direction: i.e., Southeast Asians sold ŋa 'tusks' to the Chinese who then adopted that word for 'teeth'.

0037b is a variant of 0037a.

0037c 庌 *[N]-qʰˤraʔ 'covered galleries' belongs to the same word family as ⾑ *qʰˤraʔ (this site: *qʰraʔ) 'cover'.

0037d 芽: Is this the word for 'teeth' applied to plants? Are sprouts like teeth growing from the earth?

0037e 訝: Not listed. Synonymous with 0037f 'to meet', so I supposed it was also *ŋˤ<r>ak-s.

Can also mean 'astonished'. Could 'astonished' be *[N]-qʰˤrak-s which would be in the same word family as 虩 *qʰrak 'to fear'?

0037f 迓 *ŋˤ<r>ak-s: I would rather not reconstruct a velar word in a uvular series, but I think Baxter and Sagart did so because it belongs to a velar word family with

0699d 迎 *ŋ<r>aŋ(-s) 'to meet'

0766n' 輅 (no reconstruction given; *ŋˤ<r>ak-s since 0766 is a mostly velar series*) 'to meet'

0788a 屰 *ŋrak 'to go against'

(23:36: Also cf. Written Burmese ငြား <ŋrāḥ> 'to meet'.)

Then again, Schuessler (2009: 551) thought 0037f was cognate to

0060l 御 *m-[qʰ](r)aʔ 'to ward off'

Should all of those words be reconstructed with the same root initial, and if so, should that initial be velar or uvular?

0037g 雅 *[N-ɢ]ˤraʔ ~ *N-ɢˤraʔ: I guess Baxter and Sagart are less sure about how to reconstruct 'kind of musical instrument' than 'proper, refined' because they regard the latter as cognate to 夏 *N-ɢˤraʔ 'great' whereas the etymology of the former is unknown (at least to me), so there is no word-family evidence to favor a specific preinitial or root initial.

0047a 邪 *sə.ɢA ~ *sə.la ~ *ɢ(r)A ~ *[ɢ](r)A: Until now I would have reconstructed something like *sɯ-ŋlja to accommodate its various readings and 牙 which I would have reconstructed as *ŋra or *rŋa.

(My *ja is equivalent to Baxter and Sagart's cover symbol *A for an *a which has an unusual Middle Chinese reflex. See pages 223-224 of their book. I once considered reconstructing a seventh vowel, but as they pointed out, there is no rhyme evidence for one.)

I think they feel compelled to reconstruct 邪 *sə.la 'to walk slowly' with a liquid because it is an alternate spelling of

0062p 徐 *sə.la 'to walk slowly'

in the Classic of Poetry - or at least the text as we have it now. How old is the use of 邪 for 'walk slowly'? Could it postdate the merger of *ɢ- (my *sɯ-ɢ-) and *l- as *j-? There is no doubt that 0062 is a lateral series.

WHY DO SOME CHINESE CROWS HAVE 'TEETH'?

Last night I mentioned the near-homophony of Old Chinese
*Cɯ-qa (Baxter and Sagart: *[ʔ]a) > Middle Chinese *ʔɨə 'in'

*qa (Baxter and Sagart: *qˤa) > Middle Chinese *ʔo 'crow'

which were written with variations of a drawing of a crow and are regarded as members of the same phonetic series (Grammata Serica Recensa 0061). Ideally I'd want them to have the same initial consonant, though Baxter and Sagart reconstruct them with different consonants while leaving the option of *q- open for 於. (Brackets indicate uncertainty; in this case, *[ʔ] means 'either *ʔ or something else that has the same Middle Chinese reflex as *ʔ': i.e., *q.)

The Middle Chinese readings of the 烏/於 phonetic series only have initial *ʔ-, so without additional evidence, there is no way to tell whether that Middle Chinese *ʔ- was from Old Chinese *ʔ- or *q-.

Transcriptions such as 烏弋山離 'Alexandria' and 烏桓 'Avar' from the Records of the Grand Historian (c. 100 BC) tell us that 烏 was something like *a toward the end of the first millennium BC, though the fine details are uncertain: e.g., was the initial consonant in the underlying dialect(s) zero, a glottal stop (with or without pharyngealization?), or even a pharyngeal fricative *ʕ- (cf. how Arabic ʕ- is borrowed as phonemic zero in English)? They do not rule out the possibility of another initial consonant at an earlier period: e.g., *q-.

I favor *q- because it enables me to regard

*qa (Baxter and Sagart: *qˤa) > Middle Chinese *ʔo 'crow'

*qra (Baxter and Sagart: *qˤra) > Middle Chinese *ʔæ 'crow'

as members of the same *qa-word family.

By establishing that 牙 was a uvular series, Baxter and Sagart solved the mystery of why Middle Chinese 牙 *ŋæ is phonetic in 鴉 whose reading probably never had a nasal.

If Schuessler (2007: 83, 517) is correct, 鴉 never had *-r- (which is unlikely to be an infix in 'crow'*) and its Middle Chinese low vowel is an archaism. Hence 烏 and 鴉 could have been homophones, and Mandarin 烏鴉 wuya 'crow' would be a reduplication if pronounced in Old Chinese as *qa qa.

Moreover, *qa matches the global sound-symbolic archetype for 'crow': e.g., Sanskrit kāka-. (See Wiktionary for more examples. Thai kaa 'crow' could be a borrowing from Chinese, though it could also be the independent product of sound symbolism.)

*1.22.4:52: Old Chinese *-r- indicates double or multiple objects (Baxter and Sagart 2014: 58) and would be out of place in 'crow' unless 鴉 *qra once meant something like 'a flock of crows'.

If 鴉 'crow' had *-r-, then 烏 *qa and 鴉 *qra were a word family only in the weak sense that they were based on the same sound symbolism (cf. the English 'word family' of gl-words: gleam, etc.). I would not consider *qra to be a derivative of a root √*qa.

Perhaps 鴉 *q(r)a once had an *m-prefix for animals that justified the choice of 牙 *m-ɢˤ<r>a 'tooth' as a phonetic but vanished without a trace.

GIVING IN TO THE UVULAR HYPOTHESIS

For over a decade I had a vague notion that Old Chinese had distinct velar and uvular phonetic series, but I never worked out any criteria to sort out which was which (beyond assuming that uvular series normally retained or conditioned low[ered] vowels - an assumption I retain today). Baxter and Sagart (2014) present such criteria, building upon Pan Wuyun's 1997 proposal: e.g., phonetic series mixing velar and laryngeal initials in Middle Chinese (MC) pronunciation such as 1173/1189/1190 were once uvular in Old Chinese (OC).

1173a 公 MC *koŋ < OC *C.qˁoŋ (this site: *C.qoŋ) 'father, prince; impartial'

cf. Karlgren's *kuŋ, Schuessler's *klôŋ 'prince', *kôŋ 'impartial'

Schuessler compared *klôŋ 'prince' to Khmer ខ្លោង <khloŋ> 'chief' and *kôŋ to Written Tibetan dgung 'middle'

1173g 瓮 MC *ʔoŋʰ < OC *qˁoŋ-s (this site: *qoŋs; earlier  'earthen jar'

cf. Karlgren's *ʔuŋ, Schuessler's *ʔôŋ

1189a 妐 MC *tɕuoŋ < OC *t-qoŋ (this site: *tɯ.qoŋ) 'father-in-law'

cf. Karlgren's *tjuŋ, Schuessler's *toŋ

1190a 松 MC *zuoŋ < OC *sə.ɢoŋ (this site: *sɯ.ɢoŋ) 'pine'

cf. Karlgren's *dzjuŋ, Schuessler's *s-loŋ

Karlgren (1957) was unable to unite the three series, and Schuessler only united two of the three (1173 and 1190) using *-l- as a bridge, but Baxter and Sagart managed to link all three as a uvular series.
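The criterion for a uvular series (a mix of velar and laryngeal initials in Middle Chinese) is simple enough to state as a test. Here is a minimal Python sketch of it. The membership of the two sets is my own assumption about what counts as 'velar' and 'laryngeal' in my Middle Chinese notation, and the function name is hypothetical; the four initials are read off the Middle Chinese forms listed above.

```python
VELARS = {"k", "kʰ", "g", "ŋ"}       # assumed inventory of MC velar initials
LARYNGEALS = {"ʔ", "x", "ɣ"}         # assumed inventory of MC laryngeal initials


def mixes_velar_and_laryngeal(mc_initials):
    """True if a phonetic series has both velar and laryngeal MC initials."""
    initials = set(mc_initials)
    return bool(initials & VELARS) and bool(initials & LARYNGEALS)


# MC initials of the four sample members of series 1173/1189/1190.
gong_series = {"公": "k", "瓮": "ʔ", "妐": "tɕ", "松": "z"}

print(mixes_velar_and_laryngeal(gong_series.values()))
```

The check only flags a series as a candidate; the palatal and sibilant initials of 妐 and 松 still need the preinitials *t- and *sə- to be explained.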

Once I would have more or less agreed with Schuessler and reconstructed *-l- in all three series:

*kloŋ, ʔloŋs, tɯ-loŋ, sɯ-loŋ

But as Baxter and Sagart (2009: 233) noted, medial *-l- is not supported by comparative evidence (aside from Schuessler's Khmer comparison). For instance, Written Burmese ဟောင်း <hoŋḥ> 'old', a possible cognate of 公 'father, prince', lacks <l>. I have already mentioned -l-less Written Tibetan dgung 'middle', a possible cognate of 公 'impartial'. Furthermore, the 公 series does not have most of the Middle Chinese initials typical of uncontroversial lateral series:

MC *d- < OC *lˁ-

MC *ɕ- < OC *l̥-

MC *tʰ- < OC *l̥ˁ-
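The negative argument can be sketched the same way: look for the three diagnostic initials in a series and report which ones turn up. This is only an illustration in Python; the dictionary simply restates the three correspondences above, and the function name and sample inputs are hypothetical.

```python
# MC initial -> OC lateral source, as listed above.
LATERAL_DIAGNOSTICS = {"d": "*lˁ-", "ɕ": "*l̥-", "tʰ": "*l̥ˁ-"}


def lateral_evidence(mc_initials):
    """Return the diagnostic lateral initials found in a series, with their OC sources."""
    return {
        initial: LATERAL_DIAGNOSTICS[initial]
        for initial in mc_initials
        if initial in LATERAL_DIAGNOSTICS
    }


print(lateral_evidence(["k", "ʔ", "tɕ", "z"]))   # sample members of the 公 series
print(lateral_evidence(["d", "j"]))              # an invented series with MC *d-
```

For the four sample members of the 公 series the result is empty, which is the point: nothing in them calls for a lateral.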

Baxter and Sagart (2014: 171) solved an old mystery for me: the etymology of Taiwanese 予 hōo [hɔ˧] 'to give' whose proto-Min initial is *ɣ-. Until Sunday morning I thought hōo could have been a substratum word of unknown origin. But Baxter and Sagart regarded it as a descendant of 與 'to give' which they reconstructed as *m-q(r)aʔ. *m-q- fused into *ɢ- which weakened to *ɣ-. My reconstruction of 與 as *Cɯ-laʔ (whose *C may have been *k-*) could not be related to the Taiwanese word. Moreover, the phonetic series of 與, like that of 公, also lacked the three Middle Chinese initials that I listed earlier. Baxter and Sagart's approach accounts for more facts. So at last I give in and accept their reconstruction as a point of departure.

1.21.4:05: I forgot to mention the reason I italicized in in the title of the article. My attempt to integrate the Baxter-Sagart hypothesis with my own ideas about the origins of 'emphasis' in Chinese requires me to reconstruct high-vowel presyllables in type B ('nonemphatic') syllables with uvulars: e.g.,

與 OC *mɯ-q(r)aʔ (cf. my old *Cɯ-laʔ) > MC *jɨə

One problem with my hypothesis is that I am required to reconstruct presyllables in type B function words which were more likely to be monosyllabic: e.g.,

於 OC *Cɯ-qa (B&S: *[ʔ]a) > MC *ʔɨə 'in'

於 is a variant of a drawing of OC 烏 *qa (B&S: *qˤa) 'crow', so I assume the two words were nearly homophonous.

于 OC *Cɯ-q(r)aʔ (B&S: *ɢʷ(r)a) > MC *wuo 'in'

Compare the synonymous type A syllable 乎 *ɢa (B&S: *ɢˤa) 'in' which became Middle Chinese *ɣo without a high vowel characteristic of type B syllables in Middle Chinese.

Baxter and Sagart's reconstructions of the first two words for 'in' are simpler, but their system has many more uvulars than mine.

1.21.4:12: After finishing the body of my post, I realized that 與 'to give' and 予 'to give' (whose character represents Taiwanese hōo) were both *Cɯ-laʔ in my old reconstruction, whereas they now have different consonants in Baxter and Sagart's reconstruction (and in my revision of my reconstruction):

與 B&S: *m-q(r)aʔ, this site: *mɯ-q(r)aʔ

予 B&S: *laʔ, this site: *Cɯ-laʔ

Is it a coincidence that these two verbs rhyme?

*In my old reconstruction, 與 *Cɯ-laʔ 'to give' may have been homophonous with 舉 *kɯ-laʔ 'to raise' whose *k- survives today in Cantonese geoi [kɵy˨˥]. Could 'to raise' have become 'to give' as in the case of Japanese ageru 'to raise, give'?

THE (K)HÍ-STORY OF HƠI

For many years I have been puzzled by hơi [həːj] 'gas, air, breath, odor', an early Chinese loan in Vietnamese corresponding to Sino-Vietnamese 气/氣* khí [xi] which was borrowed later. Although this word now has h- in Cantonese, that fricative is from *kʰ- which independently weakened to [x] in Vietnamese sometime between the 17th century (when it was [kʰ] and first romanized as kh-) and modern times. Schuessler (2009: 305) reconstructed *kʰ- for Late and Early Old Chinese. How can Vietnamese hơi be older than khí (as indicated by its rhyme and tone**) yet have a seemingly newer initial?

Yesterday morning the solution came to me as I read Baxter and Sagart (2014)'s section on 气. On page 170, Baxter and Sagart reconstructed 'breath' as

*C.qʰəp-s > *C.qʰət-s > *kʰət-s

which then became Late Old Chinese *kʰɨəs, Early Middle Chinese *kʰɨ(ə)jʰ, and Late Middle Chinese *kʰɨi. That last form was borrowed as khí.

The uvular root is preserved (possibly with an infix) as

*qʰ(r)əp > Middle Chinese *xip 'to inhale'

A "tightly attached preinitial *C-" (p. 169) conditioned the fronting of uvular *qʰ- to velar *kʰ-. If that preinitial (probably a prefix in the case of 气***) were not present, *qʰ- would have weakened to *x-. And that is what I think happened to the source of hơi in the Chinese dialect spoken in Vietnam under Chinese rule:

*qʰəp-s > *qʰət-s > *χəs > *xəjʰ

The h- of hơi is from that *x- and has nothing to do with Cantonese h- from *kʰ-.
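The two paths differ only in whether the preinitial is there to trigger fronting, which is easy to see if the changes are applied in order. Below is a minimal Python sketch. The rule list restates only the stages given in the two derivations above (it stops at *kʰət-s and *xəjʰ and does not model later vowel bending), and the rule labels and function name are my own for illustration.

```python
# Ordered rules: (label, function). Each maps a form to its next stage.
RULES = [
    ("*-p-s > *-t-s",
     lambda f: f.replace("əp-s", "ət-s")),
    ("*C.qʰ- > *kʰ- (fronting after a tightly attached preinitial)",
     lambda f: f.replace("C.qʰ", "kʰ")),
    ("bare *qʰ- > *χ-, *-t-s > *-s",
     lambda f: f.replace("qʰ", "χ").replace("ət-s", "əs") if f.startswith("qʰ") else f),
    ("*χ- > *x-, *-s > *-jʰ",
     lambda f: f.replace("χ", "x").replace("əs", "əjʰ") if f.startswith("χ") else f),
]


def derive(form):
    """Return every distinct stage a form passes through."""
    stages = [form]
    for _label, rule in RULES:
        changed = rule(stages[-1])
        if changed != stages[-1]:
            stages.append(changed)
    return stages


print(" > ".join("*" + s for s in derive("C.qʰəp-s")))  # source of khí
print(" > ".join("*" + s for s in derive("qʰəp-s")))    # proposed source of hơi
```

Because the fronting rule applies before the weakening rule, a form with the preinitial never reaches the fricative stage; without the preinitial, the fronting rule has nothing to apply to.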

I hypothesize that Old Chinese *ə partly bent upward to *ɨə after velars but remained unbent after uvulars****. (It may have lowered and backed to *ʌ.) The vowel stayed unbent even after uvular *χ- fronted to velar *x-. Hơi preserves that unbent vowel.

*The complex variant 氣 is well known in the West as Mandarin qi / ch'i and Japanese/Korean ki.

**The rhyme -ơi [əːj] is closer to Early Middle Chinese *-ɨ(ə)jʰ than to Late Middle Chinese *-ɨi which is reflected in khí.

The ngang tone of hơi [həːj] in old loans corresponds to earlier Chinese *-ʰ < *-s and the sắc tone in newer loans.

***The period in *C.qʰəp-s indicates that Baxter and Sagart (2014: 7) were "not confident that a presyllable [or preinitial] is a synchronic prefix". It does not rule out the possibility that the presyllable or preinitial was a prefix prior to Old Chinese.

In this case, I think *C- was some sort of nominalizing prefix added to *qʰ(r)əp 'to inhale'.

I do not know why Baxter and Sagart reconstructed *(r) indicating the possibility of *-r- in 'inhale' but not 'breath'. The Middle Chinese reflexes of *-rəps and *-əts are identical. Perhaps semantics make *-r- unlikely in 'breath', as 'breath' does not belong to Baxter and Sagart's (2014:57-58) three categories of *-r-infixed words:

1. actions with multiple agents, patients, or locations or repeated actions

2. intensified stative verbs

3. double or multiple objects

I am certain that there was no *-r- in the source of hơi, because *qʰrəp-s would have become *xɛ (= xeajH in Baxter and Sagart's Middle Chinese notation) and would have been borrowed into Vietnamese as *he [hɛ].

****Baxter and Sagart (2014) allow what I call upward bending after bare nonpharyngealized uvulars: e.g., 氣 originally represented a verb 'to present food' (hence its semantic component 米 'rice') which they reconstructed as

*qʰət-s > my Late Old Chinese *xəs > my early Middle Chinese *xɨ(ə)jʰ (= xj+jH in Baxter and Sagart's Middle Chinese notation)

I don't distinguish between pharyngealized and nonpharyngealized uvulars. In my reconstruction, only downward bending of higher vowels could occur after uvulars unless they were preceded by a high-vowel presyllable or converted to velars at an early stage. Hence I reconstruct 'to present food' as *Cɯ.qʰət-s.

The *C- of 'to present food' might have been identical to the *C- of 'breath', facilitating the use of the character for the former to write the latter.

'Breath' might have been *Cɯ.qʰəp-s at an earlier stage, though the unstressed presyllabic vowel must have dropped by the time uvulars weakened because *Cɯ.qʰ- became a fricative *χ-, whereas *C.qʰ- became a velar stop *kʰ-.

FOUR SERIES OR EIGHT? OLD CHINESE (LABIO)VELARS AND (LABIO)UVULARS

As I read Baxter and Sagart (2014), I am becoming increasingly convinced that they are right in many ways, yet I remain skeptical about their eight series of back consonants:

Syllable type Series Stops Nasals
B 1 Velars *k- *kʰ- *g- *ŋ̊- *ŋ-
A 2 Pharyngealized velars *kˁ- *kʰˁ- *gˁ- *ŋ̊ˁ- *ŋˁ-
B 3 Labiovelars *kʷ- *kʷʰ- *gʷ- *ŋ̊ʷ- *ŋʷ-
A 4 Pharyngealized labiovelars *kʷˁ- *kʷʰˁ- *gʷˁ- *ŋ̊ʷˁ- *ŋʷˁ-
B 5 Uvulars *q- *qʰ- *ɢ- No uvular nasals
A 6 Pharyngealized uvulars *qˁ- *qʰˁ- *ɢˁ-
B 7 Labiouvulars *qʷ- *qʷʰ- *ɢʷ-
A 8 Pharyngealized labiouvulars *qʷˁ- *qʷʰˁ- *ɢʷˁ-

I reconstruct only four series:

*K : *Kʷ : *Q : *Qʷ

Here is how Baxter and Sagart's series map onto mine:

Syllable type Baxter and Sagart This site
Stage 1 Stage 2
B *K- *Cɯ-KA *KIA
*(Cɯ-)KI *KI
A *Kˁ- *(Cʌ-)KA *QA
B *Kʷ- *Cɯ-KʷA *KwIA
*(Cɯ-)KʷI *KwI
A *Kʷˁ- *(Cʌ-)KʷA *QwA
*Cʌ-KʷI *QwAI
B *Q- *Cɯ-QA *XIA
*Cɯ-QI *XI
A *Qˁ- *(Cʌ-)QA *XA
*(Cʌ-)QI *XAI
B *Qʷ- *Cɯ-QʷA *XwIA
*Cɯ-QʷI *XwI
A *Qʷˁ- *(Cʌ-)QʷA *XwA
*(Cʌ-)QʷI *XwAI

All researchers agree that Old Chinese had two types of syllables: A and B. I consider A syllables to be 'emphatic' and B syllables to be 'nonemphatic'. Baxter and Sagart project the A/B distinction far back, whereas I think it only became phonemic after the loss of presyllables that conditioned the distinction.

Syllables with nonuvular initials and lower vowels (*A = *e, *a, *o) and syllables with uvular initials (regardless of vowel) are type A unless preceded by a high-vowel presyllable:

*Cɯ-CA > *Cɯ-CIA > *CIA

*IA represents warped lower vowels (*ie, *ɨa, *uo).

Syllables with nonuvular initials and higher vowels (*I = *i, *ə, *u) are type B unless preceded by a low-vowel presyllable:

*Cʌ-CI > *Cʌ-CAI > *CAI

*AI represents warped higher vowels (*ei, *əɨ, *ou).

*X represents weakened (former) uvulars:

*q- > *ʔ- (followed by the shift *K- > *Q- before lower vowels and warped higher vowels)

*qʰ- > *x- before higher vowels and warped lower vowels; *χ- before lower vowels and warped higher vowels

*ɢ- > *ɣ- before higher vowels and warped lower vowels; *ʁ- before lower vowels and warped higher vowels

*ɣ- later weakens to *j-, merging with *j- from stage 1 *l-, etc.
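
The weakening of the three uvulars amounts to a lookup keyed on the initial and the class of the following rhyme. The sketch below assumes the rhyme classes are written I, IA, A, and AI as in the mapping table; the names `weaken`, "emphatic", and "nonemphatic" for the two environments are my own shorthand.

```python
# Rhyme classes: I (higher), IA (warped lower), A (lower), AI (warped higher).
NONEMPHATIC_RHYMES = {"I", "IA"}
EMPHATIC_RHYMES = {"A", "AI"}

# Outcome of each stage 1 uvular in the two environments.
WEAKENING = {
    "q": {"nonemphatic": "ʔ", "emphatic": "ʔ"},
    "qʰ": {"nonemphatic": "x", "emphatic": "χ"},
    "ɢ": {"nonemphatic": "ɣ", "emphatic": "ʁ"},
}

def weaken(initial, rhyme_class):
    """Return the weakened reflex (*X) of a former uvular initial."""
    if rhyme_class in NONEMPHATIC_RHYMES:
        environment = "nonemphatic"
    elif rhyme_class in EMPHATIC_RHYMES:
        environment = "emphatic"
    else:
        raise ValueError(f"unknown rhyme class: {rhyme_class}")
    return WEAKENING[initial][environment]
```

The later shift of *ɣ- to *j- is left out of the sketch.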

In stage 1, uvulars are phonemic.

In stage 2, uvulars are allophones of velars and need not be specified: e.g., *[qɑ χɑ ʁɑ] /ka xa ɣa/.

THE SCEPTER SERIES: (LABIO)VELAR, (LABIO)UVULAR, AND/OR (LABIO)GLOTTAL?

In my last entry I mentioned GSR (Grammata Serica Recensa) phonetic series 0879 圭 which Baxter and Sagart currently reconstruct with a mix of velar and uvular initials. What if it was originally a purely labiouvular series?

Sinographs Schuessler 2009 Baxter and Sagart 2014 This site Middle Chinese
圭珪桂閨卦挂掛 *kw^- *[k]ʷˤ- *Cqʷ- *kw-
(*kw^-), *gw^- (kʷˤ-, *m-kʷˤ- [1]*) *Cqʷ-, *m-Cqʷ- (*kw-), *ɣw-
*kw^-, *gw^- *[k]ʷˤ- *Cqʷ-, *N-Cqʷ- *kw-, *ɣw-
*gw^- (*gʷˤ-) *NCqʷ- *ɣw-
*gw^-, (*ʔw^-) *Nqʷˤ-, (*qʷˤ-) *NCqʷ-, *qʷ- *ɣw-, (*ʔw-)
刲奎 *khw^- (*kʰʷˤ-) *Cqʷʰ- *kʰw-
*khw- (*kʰʷ-) *Cɯ-qʷʰ- [2]*
佳街 *k^- *[k]ˤ- *Pqʷ- *k-
厓崖涯睚 *ŋ^- *ŋˁ- *mQʷ- *ŋ-
蛙窪洼 *ʔw^- *qʷˤ- *qʷ- *ʔw-r
*ʔw^-, *w^- *qʷˤ-, *m-qʷˤ- *qʷ-, *m-qʷ- *ʔw-, *ɣw-
*ʔw^-, (*w^-) (*qʷˤ-, *Nqʷˤ-) *qʷ-, *N-qʷ- *ʔw-, (*ɣw-)
*w^-, *kw^- (*qʷˤ-, *C.qʷˤ-) *N-qʷ-, *Cqʷ- *ɣw-, *kw-

For simplicity I have ignored Old Chinese medial *-r-.

Schuessler's circumflex on vowels corresponds to Baxter and Sagart's *ˁ.

Parenthesized initials in Schuessler's column are what I would expect in his reconstruction to be the sources of Mandarin initials that are not regular reflexes of Middle Chinese initials: e.g., the w- of Mandarin 哇 wa should be from *ʔw^-, not *gw^-.

I have also placed *kw^- in parentheses since it is my attempt to fill in the missing reconstruction corresponding to 鮭 Middle Chinese *kwej and Mandarin gui.

Parenthesized initials in Baxter and Sagart's column are my attempts to fill in the gaps in their PDF according to my understanding of their system.

Unlike Baxter and Sagart, I do not reconstruct phonemic pharyngealization for uvulars.

Parenthesized Middle Chinese initials are not in the phonological tradition but are the likely sources of unexpected Mandarin initials.

Nearly all Middle Chinese readings of 0879 contain *-w-, so I presume the labiality is original. The exceptions may have lost their labiality to dissimilate from a lost labial preinitial:

*Pqʷ- > *Pq- > *k-

Middle Chinese *ŋ- of 厓崖涯睚 may be a fusion of *m- and a uvular:

*mQʷ- > *mQ- > *ŋ-

(*Qʷ- could be *qʰʷ- or *ɢʷ-; *mqʷ- [= Baxter and Sagart's *mqˁʷ-] became Middle Chinese *ɣ-, not *ŋ-; see section 4.4.2 of Baxter and Sagart 2014.)

Series which have an occasional *-w- in Middle Chinese might have secondary labiality: e.g., *pk- > *kʷ- (cf. *PC- > Cw- in Tangut). If such a change occurred, it might have predated the reductions covered in section 4.4.4 of Baxter and Sagart (2014): e.g., *pk- > *p- (not *kʷ-). Perhaps there was a chain shift:

Reduction type Phase 0 Phase 1 Phase 2 Phase 3 Phase 4
Early *pV-k- *p-k- *kʷ-
Middle *pV-k- *p-k- *p- *p- [3]*
Late *pV-k- *p-k-

In that scenario, various words with presyllables were reduced at different speeds. Early reducers developed initial *kʷ- (phase 2), middle reducers developed *p- (phase 3), and late reducers may have also developed *p-. No presyllables or preinitials remain in phase 4.

Another possibility is that *pk- > *kʷ- and *pk- > *p- occurred in different dialects. In any case, I expect reduction to have occurred in different ways at different rates in Old Chinese dialects over time.

*1. 1.18.5:54: I think the voicing of the initial of 鮭 'demon' may be from the *m-animal prefix. See Baxter and Sagart (2014: 53-54).

*2. 1.18.6:01: The high vowel of *Cɯ-qʷʰ- was preceded by a nonemphatic (i.e., nonpharyngealized) consonant; that presyllable conditioned the loss of all pharyngealization in the following syllable and the bending of *e to partly match the height of *ɯ:

*Cɯ-qʷʰeʔ [Cɯ-qʷʰˤɛˤʔˤ]? > *Cɯ-kʷʰeʔ > *Cɯ-kʷʰieʔ > Middle Chinese *khwieˀ

*3. 1.18.6:15: I don't know for sure if late reducers underwent the same change as middle reducers. Could the *p-k- remaining in phase 3 have become, say, *ph- in phase 4? (I reconstruct *K- to account for word families with aspiration alternations in Tangut.)

WHAT DO MOUNTAINS PRODUCE?

Last month I asked if Tangut and Chinese shared a word for 'mountain'. They might if Baxter and Sagart's 2014 reconstruction is correct:

4871 1ngyr1 : 山 Middle Chinese *ʂɤen < Old Chinese *s-ŋrar (B&S 2014: 148)

cf. my *ksan, Schuessler's *sran, Zhengzhang's *sreːn, Starostin's *sraːn, etc.

At that time I didn't have Baxter and Sagart's book, so I didn't know the logic behind *s-ŋrar. But now I have a copy (hence the page number above) and have seen that they think 山 'mountain' belongs to a 'slope, nearly vertical side' word family including

巘 Middle Chinese *ŋɨanˀ ~ *ŋɨenˀ < B&S OC *ŋ(r)ar(ʔ) 'hill'

厓涯崖 Middle Chinese *ŋɤe < B&S OC *ŋˁrar 'river bank, limit'

cf. my *ŋre, Schuessler's *ŋrê, Zhengzhang's *ŋreː, etc.

顏 Middle Chinese *ŋɤan < B&S OC *C.ŋˁrar 'face, forehead'

They quote the 釋名 Shi ming 'Explaining Names' (200 AD; translation theirs):


"'Mountain' is 'river bank'; it produces things."

I'm not sure what 'it produces things' means. Does that phrase refer to things growing on a mountain?

I am still skeptical of *s-ŋrar for four reasons:

1. 山 transcribed the [(k)san] of Alexandria in 烏弋山離 from the Records of the Grand Historian (c. 100 BC), three centuries before Shi ming. Obviously *s-ŋrar was not the reading underlying that spelling. Did the Shi ming dialect preserve a *-ŋ- in 山 that was lost in the dialect of the Records of the Grand Historian?

2. Is 涯 a phonetic gloss for 山 in Shi ming? Bodman (1954: 111) regarded

產 Middle Chinese *ʂɤenˀ < Old Chinese *s-ŋrarʔ (B&S 2014: 148)

in that passage as the phonetic gloss of 山.

3. 厓涯崖 belong to GSR (Grammata Serica Recensa) phonetic series 0879 圭 for open syllables, so I would not expect them to end in *-r.

4. Moreover, phonetic series 0879 mostly has stop initials, suggesting that the nasal of 厓涯崖 is from an earlier *NK-cluster. I know of no reason to reconstruct such clusters in 巘 'hill' and 顏 'forehead' which may share a root-initial *ŋ-.

1.17.5:07: At least I finally understand why Baxter and Sagart think the root initial of 山 was nonpharyngealized. They point out on p. 395 that 經典釋文 Jingdian shi wen 'Explaining the Text of the Classics and the Canon' (late 6th c. AD) has another Middle Chinese reading *ʂɨen whose vocalism indicates a nonpharyngealized initial. Jingdian shi wen lists various older readings for -words indicating earlier nonpharyngealized initials corresponding to later readings that have what appear to be regular reflexes of pharyngealized initials. Given that trend, it is likely that 山 once had a nonpharyngealized initial, though it is also possible that 山 had coexisting variants with different initials. I would prefer to be agnostic and write the pharyngealization of the initial of 山 in parentheses: *s-ŋ(ˁ)rar.

SO NEAR, YET SO FAR: A REAP-PRÈS-SAIL

For a long time I assumed that Vietnamese gần [ɣən] 'near' was borrowed from 近 Late Old Chinese (LOC) *gɨənh or Early Middle Chinese (EMC) *gənʰ 'to approach'. The 'softened' Vietnamese initial [ɣ] is not a direct retention of Chinese *g; it reflects a lost presyllable: [ɣən] < *CV-gən. Baxter and Sagart (2014: 118) reconstructed 近 'to approach' in Old Chinese (OC) as

*s-N-kərʔ-s

*s-: 'valence increaser' (here, v.i. > v.t.?; see B&S 2014: 56; how does this differ from *-s?)

*N-: 'intransitivizer'? (implying that *kərʔ was transitive?; see B&S 2014: 54)

also in 近 *N-kərʔ 'near'

*kərʔ: root

Words written with the phonetic 斤 *kər rhymed in the *-r category in the Classic of Poetry; see Starostin (1989: 580-581).

I would expect the root to occur elsewhere as a transitive verb given that 近 'near' has the intransitivizer prefix *N-. But as far as I know, there is no root *kərʔ in Old Chinese words other than 近 'near' / 'approach'.

I would rather reconstruct the root with a simple initial *g-. The complex sequence *s-N-k- seems to be motivated by a problematic comparison with Vietic (see below).

On the other hand, the earliest Sino-Japanese reading kon < *kən for 近 may reflect a dialect with an unprefixed root initial *k-.

*-s: 'outwardly directed action' (here, v.i. > v.t.; see B&S 2014: 59)

> LOC *-h and EMC *-ʰ

I suppose the double marking *s- ... -s of a transitive verb is like English embolden; *s-N-kərʔ-s is to 'en-near-en'.

Baxter and Sagart (2014: 118) compare OC *s-N-kərʔ-s to Rục tŋkɛɲ whose initial tŋk- matches their *s-N-k-. (Rục has no s-preinitials, so t- is the closest possible match of OC *s-.) The finals, however, do not match even if one keeps in mind that Old Chinese *-r had shifted to *-n (but not palatal *-ɲ!) at the time of borrowing. Old Chinese did not have any palatal codas, yet a palatal coda is reconstructible in Proto-Vietic *t-kəɲ (without *-ŋ-; is the velar nasal secondary in that variety of Rục?*) and even in Muong forms such as Hoa Binh kʰəɲ¹.

I conclude that Proto-Vietic *t-kəɲ and Old Chinese 近 *gərʔ(-s) (or *N-kərʔ(-s) if Sino-Japanese k- preserves a root-initial *k-) are unrelated lookalikes unless one can explain why Late OC/EMC *-n was borrowed as Proto-Vietic *-ɲ. Are there other examples of Chinese *-n corresponding to Proto-Vietic *-ɲ?

1.15.23:40: The semantics of Vietic 'near' and Old Chinese *s-N-kərʔ-s 'to approach' do not quite match. I cannot derive the Proto-Vietic form from Old Chinese *N-kərʔ 'near' because the Vietic forms do not reflect an LOC final glottal stop or EMC glottalization, whereas the Vietnamese huyền tone in early loanwords can correspond to Chinese voiced initials followed by reflexes of *-s (B&S 2014: 382).

1.15.23:48: Then again, is it a coincidence that *t-kəɲ is apparently unique to Vietic within Mon-Khmer and similar to a Chinese word? Could it be a borrowing unlike the native word Proto-Vietic *s-ɗəː 'near' which was inherited from Proto-Mon-Khmer *t₂ɗəh (Shorto 2006)? (The absence of *-h in Proto-Vietic is irregular.)

*Other Rục forms at SEAlang lack the velar nasal: təkiɲ¹, ckìːɲ ~ ckɨ̀ːɲ.

AN EXPENSIVE RANGE REVISITED: GSR 0540

On Monday I proposed that Old Chinese (OC) phonetic series GSR (Grammata Serica Recensa) 540 貴 had uvular initials. That might explain why

貴 Middle Chinese *kujʰ < OC *Cɯ-quj-s 'precious, expensive' (Baxter and Sagart: *kuj-s)

could be phonetic in

隤 Middle Chinese *dwəj < OC *N-rˁuj < *Nʌ-ruj 'exhausted' (Baxter and Sagart: *N-rˁuj)

See Baxter and Sagart (2014: 122) for MC *d- < OC *N-rˁ-

and its homophones 穨 'bald' and 僓 'natural, easy, gentle'. 僓 has another reading:

Middle Chinese *xwɛjʰ < OC *r̥ˁuj-s < *Nʌ-r̥uj-s (Baxter and Sagart: *qʰrˁuj-s?)

See Baxter and Sagart (2014: 116) for MC *x- < OC *r̥ˁ-

I will refer to the two readings of 僓 as 僓a and 僓b.

If OC emphatic *rˁ was uvular *[ʀ] or *[ʁ], it would make sense to write *rˁ-words with uvular phonetics like 貴 for *Quj-syllables. Its voiceless counterpart *r̥ˁ could have been *[ʀ̥] which later became *[χ] in western dialects.

The emphasis of *rˁ and *r̥ˁ was conditioned by a low-vowel presyllable at some point before those words were written with a uvular phonetic:

*Nʌ-r- > *Nˁʌ-r- > *Nˁʌˁ-r- > *Nˁʌˁ-rˁ-

*Nʌ-r̥- > *Nˁʌ-r̥- > *Nˁʌˁ-r̥- > *Nˁʌˁ-r̥ˁ-

Pharyngealized allophones of consonants in the vicinity of low vowels became phonemic when those low vowels were lost:

隤穨僓a *Nʌ-ruj /Nʌruj/ [Nˁʌˁʀˁʊˁjˁ] > *N-rˁuj /Nrˁuj/ [Nˁʀˁʊˁjˁ]

僓b *Nʌ-r̥uj-s /Nʌr̥ujs/  [Nˁʌˁʀ̥ˁʊˁjˁsˁ] > *r̥ˁuj-s /r̥ˁujs/ [ʀ̥ˁʊˁjˁsˁ]

The presyllable of 隤穨僓a was reduced to *N-:

*Nʌ-rˁ- > *N-rˁ- > *N-lˁ- > *lˁ- > *d-

or *N-lˁ- > *nd- > *d-?

The presyllable of 僓b was lost before it could reduce to *N-.

I could also dispense with presyllabically conditioned emphasis and simply reconstruct primary uvular *ʀ and *ʀ̥:

隤穨僓a *N-ʀuj

僓b *ʀ̥uj-s

But do those two added phonemes illuminate anything else about Old Chinese, or are they just ad hoc devices that simplify these reconstructions?

隤 'exhausted' is cognate to 儽 Middle Chinese (MC) *lwəjʰ 'exhausted' which could be from

*Cʌ-ruj-s or *ʀuj-s

If the 'exhausted' word family originally had a uvular initial, I would expect its phonetic series to be purely or at least predominantly uvular. However, series 577 畾 in Schuessler (2009: 294) is about evenly split between words that could be reconstructed with *ʀ- (11) and words that couldn't (9), and it contains no words that could be reconstructed with voiceless *ʀ̥-. I would rather reconstruct 577 畾 (and by extension 隤) with original *r- that became uvular if preceded by a low vowel or if the only vowel in its syllable belonged to the low class (e.g., *o but not *u):

畾 OC *ruj > MC *lwi ~ OC *Cʌ-ruj > *ʀuj > *lwəj 'raised path between fields'

累 (phonetic 畾 abbreviated to 田 'field')

OC *ruj > MC *lwi 'to bind'

OC *Cɯ-rojʔ > MC *lwieˀ 'to accumulate'

OC *Cɯ-roj-s > MC *lwieʰ 'to implicate'

OC *rojʔ > MC *lwaˀ 'naked'

Regardless of whether Old Chinese ever had primary uvular rhotics, I think some sort of uvular rhotic better accounts for the unusual range of the phonetic 貴 than previous solutions:

Karlgren (1957: 145): The reading 僓a arose "through confusion with 544a [i.e., 隤]."

but if 隤 is not in the phonetic series of 貴, what is the semantic function of 貴 'expensive' in 隤 'exhausted'?

Schuessler (2009: 294) regarded 貴 as a partial (i.e., rhyme-only phonetic) in 僓a.

Zhengzhang Shangfang reconstructed 貴 as *kluds and 隤 as *l'uːl.

If 貴 had a medial *-l-, it must have disappeared by the time 貴霜 'Kushan' was written in the Book of Han (111 AD).

(1.14.23:23: A medial *-l- would rule out Written Tibetan gus-po 'expensive' as a cognate.)

The *-d and *-l in the codas do not match (the suffix *-s is not a problem), and this mismatch is unnecessary since there is no evidence for *-d (or *-t) in the 貴-series.

EAGLES SWALLOWING THE CENTER

Baxter and Sagart (2014: 101) reconstructed the Old Chinese phonetic series 0718 央 and 0370 因 with initial glottal stops. That brought to mind two etymologies whose sources I can't remember:

- Siamese กลาง klaːŋ A1 < Proto-Tai *klaːŋ A 'middle' < Chinese 央 'center' (Schuessler 2007: 585 called this a "traditional association" without citing a specific author)

- Siamese กลืน klɯːn A1 'to swallow' < Chinese 咽 'gullet, to swallow'*

There is also a widespread (K)lAŋ-type word for 'eagle' in Mon-Khmer that has been related to Old Chinese 鷹 'eagle' in phonetic series 0890 (Schuessler 2007: 574; also cf. Downer's Proto-Hmong-Mien *klâŋ² - how did Ratliff reconstruct that word?).

We know for certain that 央咽鷹 all had initial glottal stop in Middle Chinese (MC). I once thought the external data above might have pointed to an earlier *q(ɯ-)l- in Old Chinese:




*ɯ later conditioned vowel raising in 央 and 鷹.

Without a preceding *ɯ, *i lowered to *ei and then *e after *q-.

Now I am skeptical for the following reasons:

1. The phonetic series of 央咽鷹 only have glottal stop initials in MC; they have no MC initials that unambiguously point to uvulars in the Baxter-Sagart system.

2. The phonetic series of 央 "lack[s] word-family contacts with velars or uvulars" (Baxter and Sagart 2014: 101).

3. Baxter and Sagart reconstructed 烟 ~ 煙 'smoke' as *[q]ˁi[n], possibly because it might share a root √*q-[n] with 熏 *qʰu[n] 'to smoke'. If it does, then 因 might be a uvular series. However, 煙 also belongs to a pure MC glottal stop series, and I can't think of any other examples of *i ~ *u ablaut.

咽 'to swallow' is cognate to 嚥 'to swallow' which belongs to yet another pure MC glottal stop series (GSR 0243). (Coincidentally, the verb 嚥 'to swallow' is homophonous with the noun 燕 'swallow' as in English!)

Shan oddly has ʔɯn A1 'to swallow' with a glottal stop matching Chinese rather than *k- which regularly corresponds to Siamese kl-.

4. Baxter and Sagart reconstructed the series of 鷹 with initial *[q](r)-, possibly because of the k-words for 'eagle', though that series only has glottal stop initials in MC. I am unaware of any velar or uvular word-family contacts for that series.

5. I know of no Chinese-internal evidence for medial *-l- in the phonetic series of 央咽鷹嚥.

6. Pittayaporn (2009) reconstructed *q- and *kl- but not *ql- in Proto-Tai. Did Proto-Tai speakers hear Chinese *ql- (assuming that cluster is correct!) and approximate it as *kl-?

(1.14.4:01: Pittayaporn reconstructed *qr- and *kr-, so I would expect *ql- in addition to *kl-. Perhaps *ql- could be reconstructed on the basis of cognate sets not in Pittayaporn's dissertation. If Proto-Tai had *ql-, then Chinese *ql- should have been borrowed as *ql- and not as the *kl- of 'middle' [and 'to swallow'?]. I doubt that Siamese kl- could be from *ql- because *q(r)- became Siamese aspirated kʰ-. I predict that *ql- would have also become a Siamese aspirate kʰ- or kʰl-.)

7. Unfortunately, Pittayaporn (2009) did not reconstruct a Proto-Tai word for 'to swallow'. The initial cluster of Wuming klwaŋ A1 may point to Proto-Tai *klw- with a *-w- absent from Chinese, but it may not be related, as I would expect its rhyme to be -ɯn (cf. Wuming xɯn A2 'night' and ʔɯn B1 'other' corresponding to Siamese คืน khɯːn A2 and อื่น ʔɯːn B1).

8. Siamese ɯː (and similar vowels in other Tai languages) and Old Chinese *i in 咽 'to swallow' cannot be reconciled, though the former would match Old Chinese if it fronted to *e between coronals in 嚥:

*qlən-s > *qlen-s

However, that is highly improbable since 燕 already belonged to the *-en (not *-ən!) rhyme class in the Classic of Poetry, long before Proto-Tai borrowed from Chinese.

One could claim that Proto-Tai borrowed from an archaic dialect of Chinese preserving *ə, but that wouldn't work either because the fronting of *ə occurred long after the composition of the contents of the Classic of Poetry. The *e of 燕 (and most likely 嚥) is primary, not secondary.

The only way that 嚥 could have a schwa is if GSR 0243 燕 was a mixed *e ~ *ə series like GSR 0227 員 which is sui generis in Schuessler (2009).

In conclusion, I think these are cases of vague similarity among monosyllables rather than truly related words.

*1.14.3:21: Old Chinese 咽 'to swallow' has a *-s suffix absent from 咽 'gullet'. That suffix should correspond to Tai tone B1, but all Tai forms in Hudak (2008: 159) have tone A1 with the exception of Western Nung which has C2. If the Tai word is from Chinese, it must be based on the raw root 'gullet' (which could have become a verb 'to swallow' through zero derivation in the source dialect).

AN EXPENSIVE RANGE: GSR 0540

As far as I know, in Semitic, uvulars and velars are not mixed in orthography or in morphology. 'Emphatic' and 'nonemphatic' consonants* are also not mixed. Yet Baxter and Sagart (2014)'s reconstruction of Old Chinese (OC) contains such mixtures: e.g.,

忌 *m-k(r)ək-s ~ 誡 *kˁrək-s 'to warn'

禦 *m-qʰ(r)aʔ ~ 戶 *m-qˁaʔ 'to stop'

phonetic series GSR (Grammata Serica Recensa) 540:

*kuj-s 'precious, expensive'

*qʰˁuj-s 'to wash the face' (with 面 'face' on the left)

*[ɢ](r)uj 'to leave, reject' (with 辶 'walking' on the left)

Here's how I would handle those cases:

忌 *m-k(r)ək-s ~ 誡 *Cʌ-krək-s 'to warn'

The emphasis of 誡 is secondary and due to harmony with the low presyllabic vowel:

*Cʌ-k- > *Cˁʌ-k- > *Cˁʌ-kˁ- = *Cˁʌ-q- > *q-

*Cʌ- may have been *mʌ-, a longer version of the prefix of 忌 *m-k(r)ək-s.

禦 *mɯ-qʰ(r)aʔ ~ 戶 *m-qaʔ 'to stop'

Original *q- normally conditions emphasis.

The high presyllabic vowel blocked emphasis and conditioned the fronting of *qʰ-:

*mɯ-qʰ- > *mɯ-kʰ- > *mkʰ- > *ŋkʰ- > *ŋ-

I am not sure that *m-qʰ- became *ŋ-, as I don't know of any firm evidence for aspiration**.

Here is another solution: 禦 *mɯ-ɢ(r)aʔ ~ 戶 *ɢaʔ 'to stop'

I don't know of any voiceless-initial members of the 'stop' family, so I reconstructed the root with a voiced initial *ɢ-.

Once again, the high presyllabic vowel blocked emphasis and conditioned the fronting of a uvular:

*mɯ-ɢ- > *mɯ-g- > *mg- > *ŋg- > *ŋ-

See Baxter and Sagart (2014: 132) for the evidence for the shift of *mɢˁ- (= my *mɢ-) to *ŋ-.
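
The chain *mɯ-ɢ- > *mɯ-g- > *mg- > *ŋg- > *ŋ- can be replayed as an ordered list of rewrite rules. This is a minimal sketch: the rule names are descriptive labels that I have added, the function `derive` is hypothetical, and the optional medial *(r) is left out of the test form for simplicity.

```python
# Ordered rewrite rules for *mɯ-ɢ- > *mɯ-g- > *mg- > *ŋg- > *ŋ-.
# Each entry is (label, input string, output string); labels are illustrative only.
RULES = [
    ("uvular fronting after high-vowel presyllable", "mɯ-ɢ", "mɯ-g"),
    ("loss of presyllabic vowel", "mɯ-g", "mg"),
    ("nasal assimilation to velar", "mg", "ŋg"),
    ("cluster simplification", "ŋg", "ŋ"),
]

def derive(form):
    """Apply each rule once, in order, and return every distinct stage."""
    stages = [form]
    for _label, old, new in RULES:
        changed = stages[-1].replace(old, new)
        if changed != stages[-1]:
            stages.append(changed)
    return stages

print(" > ".join(derive("mɯ-ɢaʔ")))
```

A form without the presyllable, such as ɢaʔ, passes through unchanged because the first rule requires the high-vowel presyllable.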

One problem with this solution is that 戶 *ɢaʔ no longer has the voiceless initial that would be optimal for a phonetic in 所 *s-qʰ<r>aʔ (= my *sɯ-qʰ<r>aʔ) 'place'. But voiced-initial characters can be phonetics in characters with voiceless onsets: e.g., GSR 0027:

*ɢʷ(r)aj 'to do'

*C.qʷ(r)aj 'a name'

*[m]-qʷʰˤaj 'false'

I don't know why 譌 doesn't have a voiced root initial like 僞 *m-ɢʷ(r)aj-s 'to falsify'; no Middle Chinese reflex for OC *m-qʷʰˤ- is listed in Baxter and Sagart (2014: 130).

I reconstruct GSR 540 as a uvular series with a preinitial conditioning the fronting of *q- to *k- and a presyllable conditioning the lenition of *-ɢ- to *j-:

*C.quj-s > MC *kujʰ 'precious, expensive'

*qʰuj-s > MC *xwəjʰ 'to wash the face' (with 面 'face' on the left)

*Cɯ.[ɢ](r)uj > MC *jwi 'to leave, reject' (with 辶 'walking' on the left)

Is there any word-family evidence pointing to an original velar initial for 貴?

The earliest transcription evidence I know of for 貴 is 貴霜 for Kushan in the Book of Han (111 AD), but that only tells us that 貴 had *k- by the second century AD. It could have had a uvular initial in the past.

*I have long considered Arabic uvular q to be the 'emphatic' counterpart of velar k. Jakobson (1957: 515-518, cited in Baxter and Sagart 2014: 383) also considered Arabic pharyngeal ħ to be the 'emphatic' counterpart of velar x. Watson (2002: 44) added a third consonant to this group: Arabic pharyngeal ʕ as the 'emphatic' counterpart of glottal ʔ.

In Baxter and Sagart (2014)'s OC reconstruction, both uvulars and velars can be either emphatic or nonemphatic: e.g., there is a four-way contrast *k : *kˁ : *q : *qˁ.

I know of no language with a similar four-way phonemic contrast. I don't even know of any language with in addition to q. According to Youssef (2006: 13), Cairene Arabic has [k kˁ q] in its phonetic inventory but not [qˁ]. He regarded [k kˁ] as allophones of /k/ on page 39. On page 40 he did posit as a "surface realization" which, despite the term, is apparently an intermediate stage between an "underlying representation" and a "phonetic form" in his theory.

The only two pharyngealized velars in UPSID are in Shilha (probably equivalent to q in the phonemic inventory at Wikipedia) and ŋˁ in !Xu. No languages in UPSID have *kʰˁ, *gˁ, *ŋ̊ˁ, *kʷˁ, *kʷʰˁ, *gʷˁ, *ŋʷˁ, or *ŋ̊ʷˁ  which are in Baxter and Sagart's OC reconstruction. Even Cairene Arabic which is full of pharyngealized consonants has only two pharyngealized velars [kˁ xˁ] which are allophones of /k x/ and are hence not phonemic (Youssef 2006: 39).

UPSID only has one language (Rutul) with two (qʰˁ, *ɢˁ) of the six pharyngealized uvulars in Baxter and Sagart's OC reconstruction: *qˁ, *qʰˁ, *ɢˁ, *qʷˁ, *qʷʰˁ, *ɢʷˁ.

Even as I finally read Baxter and Sagart's book, I remain skeptical about reconstructing such a large number of velars and uvulars. I think half of those phonemes are short-lived and secondary, and I'm not even sure they were all phonemes.

**I'm not sure if Baxter and Sagart (2014: 129) intended to say that the 'stop' words are cognate to *qʰ-words for 'place'.

THE LOST LASO

A year ago I wrote about the Tangut name transcribed in Chinese as 老索 *law so 'old cord'. In his latest post, Andrew West proposed that it may be "a transcription of the otherwise unattested Tangut family name  [*la so]":

'Laso' = 5044 1la1 'Tangut surname La' + 2670 2so1 'man'

1la1 may be from *law which in turn may be from *lak. The Chinese transcription may reflect a nonstandard Tangut dialect in which the name was *Lawso.

I think his reconstruction is plausible. Both halves are attested as elements in Tangut family names listed in Miscellaneous Characters:

'Lewla' = 4788 1lew1 'Tangut surname Lew' + 5044 1la1

'Solwo' = 2670 2so1 + 1595 1lwo1 'dim (light), dusky' (a metaphorical adjective?)

Also, the first half transcribed Chinese 老, though that does not guarantee that a Chinese speaker would transcribe 5044 1la1 with 老. (I initially thought the second half transcribed Chinese 索, but I was wrong*.)

I'd like to study Tangut family name structure.

1.12.3:21: ADDENDUM: Although I assumed that 索 was read *so in the Chinese dialect underlying the transcription 老索, there are other possibilities.

In Phags-pa Chinese, 索 had two readings:

ꡛꡓ  <saw> <  Middle Chinese *sak

ꡚꡗ <shay> < Middle Chinese *ʂɤak ~ *ʂɤek

Jiyun lists a Middle Chinese reading *soʰ (< Old Chinese *sak-s) that would have become *ꡛꡟ <su> in Phags-pa Chinese.

So (pun unintended!) in theory the second syllable might have been standard Tangut sa (< *saw preserved in a dialect?), she (< *shai preserved in a dialect?), or su.

However, I favor *so, as most modern Mandarin readings of 索 including those in the northwest (Coblin 1994: 383) seem to be descended from *so(ʔ) rather than *saw, *ʂaj, or *su.

1.12.3:01: Although Li Fanwen (2008: 439) glossed

2670 2640 2so1 1pho1

in the Tangut translation of the Golden Light Sutra as 索訶 *soxo < *sakxa (the *-k- is inexplicable), a transcription of the first half of Sanskrit sahāpati 'lord of the saha world' (i.e., this world), I think it is actually a transcription of the first syllable of Chinese 娑婆 *sopho < *saba, a transcription of the first half of Sanskrit sabhāpati, a variant of sahāpati. The bh is a hypercorrection by Middle Indic speakers who knew that their h was a lenition of Sanskrit (i.e., Old Indic) bh and who mistakenly assumed that the h of sahā was a lenition of bh.

I do not think Tangut 2so1 1pho1 was a transcription of Sanskrit sahā, as I would expect a combination of the normal Sanskrit transcription characters

*1693 0165 1sa4 0ha0 (0 = tone and grade unknown).

Other Chinese transcriptions of Sanskrit sahā are

娑訶 *soxo < *saxa

沙訶 *ʂɤa xo < *ʂɤa xa (沙 may be a simplification of 娑).

and would have been transcribed in Tangut with a second syllable like ho.

A DENTAL KÕʳ-NNECTION?

I used to think that Old Chinese (OC) and Tangut didn't share a word for 'tooth' until yesterday when I saw Baxter and Sagart (2014)'s reconstruction of OC 齒 'tooth' as *t-[kʰ]ə(ŋ)ʔ resembling Tangut

0039 2korn1 [kõʳ] < *R-koN-H 'tooth'

OC *-ŋ is projected back from the 集韻 Jiyun (1037) alternate fanqie 稱拯 for 河東 Hedong Middle Chinese *tɕʰɨŋˀ. Similar *-ŋ ~ zero alternations are in

能  *nˁə ~  *nˁəŋ 'a kind of bear' *nˁə(ʔ) *nˁəŋ 'able, ability'

*nˁə(ŋ)ʔ, phonetic in 仍 *nəŋ 'repeat'

*C.nəʔ 'ear' had a 河東 Hedong Middle Chinese reading *ɲɨŋˀ and Min reflexes with nasal vowels or final -ŋ. Should it be reconstructed with *-ŋ? Cognates like Written Tibetan rna, Written Burmese nāḥ, and Tangut

4681 1nu4 'ear'

have no nasal coda, but that may just mean that *-ŋ is a Chinese innovation. Baxter and Sagart (2014: 158) reconstructed rightward spreading of nasality from onset to coda in 'ear' and brought up the possibility of such spreading in 齒 'tooth' as well:

*t-ŋ̊əʔ > Proto-Min *kʰi(ŋ?) > Chaozhou lit. kʰi, colloq. tsʰĩ (sic)?

The forms above are from 漢語方音字匯 (the 2003 edition?) as reproduced in 小學堂; I would expect colloq. kʰĩ (which would be from Proto-Min) and lit. tsʰi (which would not be from Proto-Min). No other Min forms in 小學堂 have nasal rhymes.  The 1962 edition of 漢語方音字匯 only lists Chaozhou ki (sic). 广东闽方言语音研究 (p. 222) lists Chaozhou colloq. kʰi (sans nasalization) and lit. tsʰi.

If the true root of 齒 'tooth' had a nasal initial, it may be an ablaut variant of 牙 *ŋˁ<r>a 'tooth' sans an infix for multiple objects. However, Baxter and Sagart derive the initial of 牙 from *m-ɢˁ-. Could *t-ŋ̊- be from *t-m-ɢ-? Such m-ɢ(ˁ)-sequences are reminiscent of the *mɴɢ- that Guillaume Jacques (2014: 297) reconstructed for Proto-Japhug *-mɴɢam 'vise'. Japhug tɤ-mɢom 'vise' even has a tɤ- resembling Baxter and Sagart's *t-. Tangut *R- may partly be from *t-.

I am reluctant to link the Chinese, Japhug, and Tangut words for two reasons.

First, I cannot explain how a cluster like *mɢ- would become Tangut k-. Normally nasal-stop sequences became Tangut voiced (or prenasalized?) stops. Then again, maybe the root initial was *q-, judging from Somang tə-mkám 'vise' (Somang shifted uvulars to velars) and Written Burmese aṃ 'molar'. But that *q- would no longer match the *ɢ(ˁ)- of Old Chinese 牙 *m-ɢˁ<r>a and 齒 *t-m-ɢəʔ. Moreover, I expect *q- to condition Tangut Grade II, not Tangut Grade I: *t-qoN-H > *2korn2, not 2korn1.

Second, I cannot reconcile the possible Old Chinese coda *-ŋ with Japhug -m. Tangut -oN could be from *-am, so the rhyme is not an obstacle to a relationship between *R-koN-H (< *t-kam-H?) and tɤ-mɢom < *-mɴɢam.

NINE ELBOWS

I know little about the early Chinese script, so I didn't know that the character for the Old Chinese (OC) word 'elbow' was used to write a nearly homophonous word 'nine' until I read pages 31-32 of Baxter and Sagart (2014). The two words rhymed in both Old Chinese and Middle Chinese (MC):

'elbow': OC *t-[k]<r>uʔ > MC *ʈuʔ (now written 肘)

'nine': OC *[k]uʔ > MC *kuʔ (now written 九)

Brackets indicate "either *X, or something else that has the same Middle Chinese reflex as *X" (Baxter and Sagart 2014: 8).

Angled brackets indicate that *<r> is an infix.

One might expect those words' apparent Tangut cognates to rhyme, but they don't:

1298 1kirw4 < *R-k(r)uk 'elbow'

3113 1gy'4 < *NGəX 'nine'

*R- might be from an earlier *t- matching the Chinese *t- prefix for inalienable nouns (Baxter and Sagart 2014: 57) and the presyllable of Japhug -zgrɯ 'elbow'. In fact, the Tangut preinitial could have been a presyllable *Rɯ- with a high vowel as in Japhug. (A low presyllabic vowel would have conditioned the lowering of the main vowel: *Rʌ-uk > -erw.)

It is not possible to tell whether the Tangut word once had a medial *-r- or not, as *R-uk and *R-ruk merged as -irw.

The final *-k does not match either Old Chinese *-ʔ or the zero coda of Japhug. Is this *-k a suffix, or is the Tangut word unrelated?

Guillaume Jacques (2014: 189) suggested that 1298 'elbow' could actually be cognate to

1377 1kirw4 < *R-k(r)uk 'bad, crooked, slanting, inclined'

which might be cognate to Japhug *kɤɣ < *kɔk 'to bend' and, I would add, Old Chinese 曲 *kh(r)ok 'to bend.' I suppose 'elbow' (i.e., a bent thing) must have developed from 'bend' before it became 'crooked'.

As for 'nine', I cannot explain why pre-Tangut has *ə instead of *u. (1.10.0:: This might not be a problem for Gong Hwang-cherng who reconstructed Old Chinese 'nine' as *kjəgwx, but see Baxter and Sagart 2014's section on reconstructing the rhyme of 'nine'.)

I am tempted to derive the *-X of 'nine' from a *-t like the coda of Japhug kɯngɯt* 'nine', but *-X can also correspond to Japhug zero: e.g. (example added 1.10.0:40),

2205 1lyr' < *R-ly-X 'four' : Japhug kɯβde*-pə-tlej 'id.'

Could *-X in 'four' be a suffix absent from Japhug and other languages (see below)?

Tangut *R- may correspond to Japhug preinitial *-t- which may be a prefix absent in Written Tibetan bzhi < *blyi 'four', Written Burmese leḥ 'four', and Old Chinese *s-li-s 'four'.

*1.10.0:15: Guillaume Jacques (2014: 158) pointed out that the -t of Japhug 'nine' was carried over from kɯrcat 'eight'.

A -t-less form is in kɯngɯ-rtsɤɣ 'nine stages'.

BAXTER AND SAGART (2014): INTRODUCTION

I first learned three months ago that William H. Baxter and Laurent Sagart's Old Chinese: A New Reconstruction (2014) had been released in the US, but I didn't get my own copy until tonight.

Although I have some doubts about the authors' reconstruction, I do agree with everything they wrote in their introduction, and I wish to make four points:

1. They reject this approach to reconstruction (p. 5):

One traditional view is that historical linguists have certain scientific procedures at hand that, if correctly applied, will produce reliable results and will not lead them into error. Conclusions resulting from the correct application of these methods may be regarded as "proved." (It follows from this view that if two scholars reach different results, one of them - at least - must have applied the methods improperly.)

I would only add that such a view assumes that the procedures are infallible. Are they? How do we know that?

2. Instead, Baxter and Sagart use the hypothetico-deductive method. They gave "the famous case of the solar eclipse of May 29, 1919" as an example of how predictions based on laws considered to be "scientifically proved" could be tested:

In the event, Einstein's theories turned out to fit the observations much more closely than Newton's (Dyson, Eddington, and Davidson 1920).

I share Baxter and Sagart's stance, and I sum it up as: theorize, test, theorize, test ...

3. Baxter and Sagart note that the English word reconstruct is problematic because it is an "accomplishment verb" with "both a process and an endpoint" rather than an "activity verb" whose endpoint is not normally presupposed. But the reconstruction of Old Chinese - and Tangut - is an ongoing process without any end so far. My thoughts about Tangut phonetics have changed a lot over the last year. I expect them to continue to change as I find new evidence and reinterpret old evidence.

4. I realized that Baxter and Sagart's "conventional transcription" for Middle Chinese is like my recent transcription of Tangut: both

are not phonetic reconstructions but conventional representations of the information about pronunciation given in Middle Chinese [and, in my case, Tangut] written sources. Accordingly, they are not preceded by asterisks; for typographical convenience, and to emphasize the fact that they are not reconstructions, they are restricted to ordinary ASCII characters (in italic type), rather than the International Phonetic Alphabet.

EVENLY WEIGHTED EVIDENCE FOR 'PRIMAL' NEUTRALIZATION

Last night I erred and thought I had found a case of -V and -V' words with homophonous -Vq derivatives. These sets from Gong (2002: 178-179, 185-187, 191) aren't quite what I was looking for, but they're close:

Set Base noun q-derivative verb
Tangraph Li Fanwen # Reading Gloss Tangraph Li Fanwen # Reading Gloss

1737 1ka1 to be even (v.i.)

1576 2kaq1 to make even (v.t.)

5592 1kar'1 balance for weighing

2907 to measure, weigh on a balance


5682 to measure



5890 1ku2 loose

2668 1kuq1 to loosen

3177 1kur1 cold

3358 ice

The first set has a root *ka 'even' which became 1ka1. The other forms have affixes:

*R-ka-X > 1kar'1 'balance for weighing'

*S-ka-H > 2kaq 'to make even'

*S-R-ka-X-H > 2kaq (not *2karq'!) 'to weigh on a balance'

Here I treat *X, the source of the mysterious 'prime' (') distinction in Tangut, as a suffix, but it could have been something else.
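The affix correspondences in the first and third sets can be sketched as a toy function. This is only an illustration of the derivations listed above, not a full sound-change model: the name derive is hypothetical, grades are ignored, the link between *-H and tone 2 is inferred from the examples rather than stated as a rule, and changes to the root itself (such as *q > k in the second set) are not handled.

```python
AFFIXES = {"S", "R", "X", "H"}


def derive(pre_tangut):
    """Toy derivation of a grade-less Tangut reading from a hyphenated
    pre-Tangut form such as '*S-R-ka-X-H'."""
    parts = pre_tangut.lstrip("*").split("-")
    roots = [p for p in parts if p not in AFFIXES]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root in {pre_tangut!r}")
    root = roots[0]
    affixes = set(parts) & AFFIXES

    # Assumption drawn from the examples: forms with *-H have tone 2,
    # the others tone 1.
    tone = "2" if "H" in affixes else "1"

    if "S" in affixes:
        # *S- yields tension (-q), which leaves no trace of *R- or *-X.
        quality = "q"
    else:
        quality = ("r" if "R" in affixes else "") + ("'" if "X" in affixes else "")
    return tone + root + quality


for form in ["*ka", "*R-ka-X", "*S-ka-H", "*S-R-ka-X-H", "*R-ku", "*S-ku"]:
    print(form, ">", derive(form))
```

The point of the sketch is that *S-ka-H and *S-R-ka-X-H come out identical, which is the neutralization discussed in this entry.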

I once thought that Tangut might have had -rq rhymes with retroflexion and tension toward the end of the 105-rhyme list after the -q and -r rhymes. Nishida (1964) reconstructed rhymes 99-101 with both features. However, if  Tangut had such -rq rhymes, I would expect 'to weigh on a balance' to have a high-numbered rhyme instead of rhyme 61 -uq1 which is the first of the -q rhymes. Did pre-Tangut have *-rq rhymes that merged with *-q rhymes?

(1.8.21:08: Were *q- and/or *' < *X- difficult to pronounce with a retroflex vowel? Here is a table of possible vowel quality combinations:

V V' Vn Vn'
Vq - Vnq -
Vr Vr' Vrn

Vnq is only possible if V = e and Vn' and Vrn are only possible if V = o. Assuming these isolated rhyme types [-enq, -on', -orn] are correct [they may not be**], a few subtypes (-on'2, -enq4, -orn1) are of relatively high frequency:

Rhyme Rhyme # Tone 1 Tone 2
-on'2 59: 1.57 17 -
-on'3 60: 2.50 - 5
-on'4 2
-enq3 65: 1.62/2.55 3 7
-enq4 7 12
-enq2 75: 2.65 - 4
-orn1 94: 1.91/2.82 11 8
-orn4 95: 2.83 - 6

Is that high frequency the result of mergers?

*-anq, *-enq, *-inq, *-onq, *-unq, *-ynq > -enq?

*-an', *-en', *-in', *-on', *-un', *-yn' > -on'?

*-arn, *-ern, *-irn, *-orn, *-urn, *-yrn > -orn?

If so, why did the vowels merge in two different ways: i.e., into e in -nq-type rhymes and into o in -n' and -rn-type rhymes?

Why are -enq1, -on'1, -orn2, and -orn3 missing?

Why are some of the above rhymes only with one tone and not the other: e.g., why is there no 1-enq2?

And why is -enq2 listed far from the other -enq rhymes?)

The second set has a root *qu 'loose' (cf. Mawo Qiang qhə qhəʴ and Ergong quə quə, but probably not  Guanyinqiao Wobzi Lavrung kú*** or Japhug ɴɢu****) which became 1ku2.

1kuq1 'loosen' had a causative prefix *S-:

*S-qu > *1kuq2 > 1kuq1 (the rhyme -uq2 does not exist; it might have merged with -uq1)

The third set has a root *ku. 'Cold' had a prefix *R- not in all of its probable cognates:

Muya tu³⁵ ku⁵⁵

Wobzi rkhô

Rangtang Puxicun rGyalrong su (with a uvular!)

Ganzi Daofu Xianshuizhen rGyalrong ʂkʰur (with a final -r!)

1kuq1 'ice' had a nominalizing prefix *S-.

One might try to conflate nominalizing and denominalizing***** *S- as a 'part-of-speech-switching' prefix, but I suspect that various phonetically distinct prefixes merged as *S-: e.g., *sʌ- and *ɕɯ-, etc.

1.8.22:16: The initial of 'ice' is uncertain.

Gong (2002: 187, 191) reconstructed it as k-, but his reconstruction in Li Fanwen (2008: 544) has l-, perhaps because it is next to 2luq1-tangraphs in the Precious Rhymes of the Tangraphic Sea and may have been homophonous with them.

Shi Jinbo et al. (2000: 285) approximated the pronunciation of 'ice' with the fanqie 菊祖. I do not know the basis of that fanqie.

'Ice' is not in Homophones, so even its initial class is unknown.

*This variant of 0640 is in Gong (2002: 187).

**1.8.20:16: Arakawa reconstructed both -enq and -onq, which makes me wonder if his *-enq is from *-inq and *-enq and his *-onq is from *-unq and *-onq. (What would have happened to his *-anq and *-Inq = my *-ynq?)

-orn and -on' may be unique to my reconstruction. I mechanically derived them from Gong's -(j)owr and -i/joow, and I have little confidence in them.

***I would expect a Wobzi Lavrung form with *q-.

Perhaps I am wrong to derive Tangut Grade II velars partly from *uvulars. 'Head' is another problematic set:

2750 1ghu2 < *Cʌ-qu

Wobzi ʁú (could ʁ be a lenited *q?)

but Japhug tɯ-ku with a velar even though Japhug has q-!

****Japhug ɴɢ- is from Proto-rGyalrongic *ɴɢ-, not *nq-, so the root of ɴɢu cannot be *qu.

*****See the table at the top of the last entry for examples.

BLADE WOUNDS IN PURSUIT OF THE 'PRIME' PHONEME

Recently I have replaced Gong Hwang-cherng's long vowels with a prime symbol (') to signify a distinction of unknown nature in my Tangut reconstruction. I write its pre-Tangut source as *X.

Last night I found this instance of V' becoming Vq in Jacques (2014: 255):

Base noun q-derivative verb
Tangraph Li Fanwen # Reading Gloss Tangraph Li Fanwen # Reading Gloss

1823 1ma4

4688 1maq4 to cut, pierce, bite

5702 1ma'4 wound

5628 to wound

(1.8.1:54: "mja¹" [= my 1ma4] for 1823 is a typo for "mjaa¹" [= my 1ma'4] in Jacques 2014. Hence 1823 and 5702 are homophones, just as 4688 and 5628 are homophones, and there was no merger of derivatives of roots with V and V'.)

5702 and 5628 share a root *ma with Japhug tɯ-ɣmaz < *-km- 'wound' and Written Tibetan rma 'wound'. Could Tangut 'prime' preserve something lost in Japhug and Written Tibetan, or is it an innovation: e.g., a trace of an affix?

I have adopted Arakawa's use of -q to indicate tension in the preceding vowel, but one can interpret it in other ways. Gong was the first to note that -q may correspond to non-Tangut sibilants. The comparisons below are from Jacques (2014), though the pre-Tangut reconstructions are mine:

0124 2luq3 < *SluH 'head' : Old Chinese 首 *l̥uʔ < *sl-? 'id.'

0385 2viq3 < *Ci-SpaH 'to be able' : Japhug spa 'to know, be able'

0527 1vaq3 < *Sɯ-pap 'tumor' : Japhug zbɤβ 'goiter'

2814 2lhiq4 < *Si-lha-H 'moon' : Japhug sla, Written Tibetan zla (< *sla) 'id.'

2878 1biq2 < *Sʌ-mbri 'willow' : Japhug ʑmbri 'id.'

Hence both Jacques and I reconstruct its pre-Tangut source as *S-.

Regardless of what *S- became in Tangut, it apparently could not exist with 'prime': e.g., the -q-derivative of 5702 1ma'4 with 'prime' is 5628 1maq4 without 'prime', not *1maq'4 with 'prime'. Did *1maq'4 ever exist? The absence of *-q' may indicate that -q and 'prime' were difficult or even impossible to pronounce together.

Next: Further incompatibilities.

WEAK EVIDENCE FOR A 'STRONG' RHYME

As I was filling out the 2015 column in my 105rhymes file (Excel / HTML), I noticed that rhyme 102 appears as Grade I -oor in Gong's 1997, 2003, and 2008 lists of rhymes but as Grade III -joor with a medial -j- in all of his reconstructions of individual readings with the sole exceptions of

2628 1goor 'man, male' = 1gor'1 in my transcription

4748 1koor 'brocade' = 1kor'1 in my transcription

The other rhyme 102 syllables are

Reading Tangraph Li Fanwen # Gloss

3386 first half of 3386 1161 1sjoor 1śjịj 'levity'

2947 strong (the name of rhyme 103 in the Tangraphic Sea)

0980 full, excessive

1746 false, fake

5944 flame, light

1126 a unit of length

2869 the surname Lhor

4247 span

The last three are in the Mixed Categories of the Tangraphic Sea along with other lh-tangraphs.

The only transcriptions of any of these tangraphs I can find are

*2ŋgo1 for 2628 in Timely Pearl 294

*2kwo1 for 4748 in Timely Pearl 256

*3lõ3 for 4247 in Timely Pearl 324

Gong may have chosen to reconstruct rhyme 102 as Grade I because two out of three Chinese transcriptions were Grade I. Yet he reconstructed -j- in all coronal-initial rhyme 102 syllables, perhaps because 4247 was transcribed with Chinese Grade III (characterized by *-j- in Gong's reconstruction of Tangut period northwestern Chinese).
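The "two out of three" tally can be checked mechanically. The snippet below is a small sketch that assumes the grade is the final digit of each transcription as written above; the dictionary simply restates the three Timely Pearl transcriptions.

```python
from collections import Counter

# Chinese transcriptions of rhyme 102 tangraphs, keyed by Li Fanwen number.
transcriptions = {
    2628: "*2ŋgo1",
    4748: "*2kwo1",
    4247: "*3lõ3",
}

# The grade is taken to be the last character of each transcription.
grades = Counter(int(reading[-1]) for reading in transcriptions.values())
majority_grade, count = grades.most_common(1)[0]
print(f"Grade {majority_grade} in {count} of {len(transcriptions)} transcriptions")
```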

If rhyme 102 had a variant -joor after coronals, that variant would be homophonous with rhyme 103 which Gong reconstructed as -joor after velar kh- and coronal n-. Why would the Tangut have placed some -joor syllables in rhyme 102 (1sjoor, 1ljoor, 1lhjoor) and others (1njoor) in rhyme 103?

Without Tibetan transcriptions with r-, I cannot be certain rhyme 102 is retroflex. The cognates in Jacques (2014: 199) do not contain r:

4247 1lhjoor 'span' : Japhug tɯ-ɟom < *-tlj- : Written Tibetan ɴdom < *N-l-

The placement of rhyme 102 toward the end of the list hints at retroflexion, though, as we know for sure that the retroflex rhymes follow the plain and tense* rhymes. If rhyme 102 was retroflex, its retroflexion must have been conditioned by an *r-prefix absent from Japhug and Tibetan. (Another possibility is that a *t-prefix cognate to Japhug tɯ- became *r-. A third possibility is to derive Arakawa's ld- for 4247 from *r-tl-.)

The Chinese transcriptions cannot tell me whether the vowel was long or short.

All I can be sure of is that rhyme 102 was something like 1-o without a rising tone counterpart 2-o. Maybe the best transcription would be -O with capitalization signifying 'something like'.

Rhyme 102 surely cannot be a simple -o or -or because those values have already been taken (rhymes 51 and 95 in Gong's reconstruction) and it is so rare. I use a prime symbol to distinguish 102 -or' from 95 -or. (-' may have been a glottal stop.)

Arakawa reconstructed rhyme 102 as -woq2 (i.e., as a second tense -wọ somehow distinct from rhyme 73, his first tense -(w)oq), projecting the medial *-w- of the Chinese transcription of 4748 onto all syllables of that rhyme. I do not know why he chose to reconstruct it as tense, as it is preceded by 22 retroflex rhymes in his reconstruction and followed by a plain rhyme -ya:n.

*The agnostic can call this second group of rhymes 'nonplain and nonretroflex' as Chinese, Tibetan, and Sanskrit transcription data cannot directly support tenseness as its defining characteristic.

TANGUT RHYME DATABASE: 5 JANUARY 2015 EDITION

I added a new column for my latest Tangut transcription to my 105rhymes file (Excel / HTML).

Expanding on what I wrote last night, the transcriptions are in the following format:

tone + initial + medial + vowel + retroflexion + nasality + tension + prime + grade

- tone: 0 (unknown), 1 ('level'), 2 ('rising'), 4 ('entering')

- initial: 29-32 initials plus unwritten glottal stop:
p-, ph-, b-, m-, (f-?), v-
t-, th-, d-, n-
(lt-?), ld-, l-, lh-
ts-, tsh-, dz-, s-, z-
c-, ch-, j-, (ny-?), sh-, zh-, r-
k-, kh-, g-, ng-, h-, gh-

- medial: -(y)w-

- vowel: -a-, -e-, -i-, -o-, -u-, -y-

- retroflexion: -r

- nasality: -n

- tension: -q

- prime: -'

- grade: -1 (I), -2 (II), -3 (III), -4 (IV)

The tone numbers are based on Chinese conventions. 3 would be a 'departing' tone, but no such tone is in the Precious Rhymes of the Tangraphic Sea which only lists three tones: 'level', 'rising', and 'entering'. The scare quotes indicate that the names taken from Chinese are not necessarily to be interpreted at face value: e.g., the 'level' tone may not have been level (in modern Cantonese, the level tones are falling, and one level tone in Mandarin is rising).

Medial -y- is only in rhyme 105 -ywa4 which may have been [ɥa].

The vowel quality codes (-r, -n, -q, -') are from Arakawa and are in an order that looked aesthetically pleasing to me: Vnq and Vrn are easier on my eyes than Vqn and Vnr because they are close to the English sequences Vnk and Vrn. -nq is also the sequence used by Arakawa. -rn is unique to my system; it is absent from Arakawa's: e.g., my Grade I -orn corresponds to Arakawa's Grade III -o:r. (Arakawa reconstructed vowel length which I rejected.)

I placed the prime symbol at the end to avoid implying that, for instance, o'n2 or a'r1 are o and a followed by syllabic n2 and r1. on'2 and ar'1 look more like single units (albeit one might misinterpret n' and r' as consonants distinct from n and r, though they are vowel qualities, not consonants).
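The format can be made concrete with a small parser. The following is a minimal sketch, not part of the 105rhymes file: the name parse_reading is hypothetical, the optional -w coda is an assumption added to cover readings like 2sew1 that appear in these entries, and grade 0 is accepted for 'unknown' as in 0a0. It checks only the order of the symbols, not whether a given combination actually occurs.

```python
import re

# Initials from the format description, longest first so that "tsh" is
# tried before "ts" and "t".
INITIALS = sorted(
    "p ph b m f v t th d n lt ld l lh ts tsh dz s z "
    "c ch j ny sh zh r k kh g ng h gh".split(),
    key=len,
    reverse=True,
)

READING = re.compile(
    r"(?P<tone>[0124])"
    r"(?P<initial>" + "|".join(INITIALS) + r")?"  # absent = unwritten glottal stop
    r"(?P<medial>y?w)?"
    r"(?P<vowel>[aeiouy])"
    r"(?P<coda>w)?"  # assumption: glide coda as in 2sew1
    r"(?P<retroflexion>r)?"
    r"(?P<nasality>n)?"
    r"(?P<tension>q)?"
    r"(?P<prime>')?"
    r"(?P<grade>[0-4])"
)


def parse_reading(reading):
    """Split a reading such as "1kar'1" into its labeled parts."""
    match = READING.fullmatch(reading)
    if match is None:
        raise ValueError(f"not in the expected format: {reading!r}")
    # Replace None (absent optional parts) with empty strings.
    return {name: value or "" for name, value in match.groupdict().items()}


for example in ["1kar'1", "2khwe1", "1tsen1", "1zyq4", "0a0"]:
    print(example, parse_reading(example))
```

Because the quality codes are matched in the fixed order -r, -n, -q, -', a string like 1ka'r1 is rejected, which mirrors the reason for writing the prime symbol last.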

If a rhyme could be Grade III or IV, I assign it to III if it follows a 'vigilant' initial (class II or VII initial or l-); otherwise I assign it to grade IV.

Not all combinations of tones, consonants, vowels, vowel qualities, and grades are possible: e.g., tension cannot coexist with retroflexion.

A NEWLY FOUND INSCRIPTION IN A NEW TRANSCRIPTION

Thanks to Andrew West for this article on a Tangut gravestone inscription from 1278 that was found in 2013. The Chinese glosses are from the article and emended by Andrew.

Left line Right line
Tangraph Li Fanwen # Reading Chinese gloss English gloss Tangraph Li Fanwen # Reading Chinese gloss English gloss
3799 2sew1 small (loan from Chinese *2sew4) 1234 1then4 the Chinese surname Tian
1141 2li3 the Chinese surname *2li3 'plum' (Mandarin Li) 0477 1zyq4 maternal surname (loan from Middle Chinese *(d)ʑi(e)ˀ?)
1531 1ga4 army 3118 1hu1 transcription of Chinese *4fu3 'lucky' (Mandarin  fu)?
2805 2bu'4 command 0052 1zhi3 transcription of Chinese *1zhi3 'child' (Mandarin  er)?
2893 2khwe1 great 3654 0a0 monk; kin term prefix borrowed from Chinese *1a1-?
Left: 'Little Li, Great Commander of the Army'
Right: 'Tian family, Fu'er, Mother' (Li's wife)
0092 2ma4 mother

Six or even seven of the eleven words are loans from Chinese.

Both the Tangut and Tangut period northwestern Chinese reconstructions are in a new format:

tone number (1-2 for Tangut, 1-4 for Chinese) + syllable + grade number (1-4)

0 indicates an unknown Tangut tone number or grade.
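As a rough illustration of this tone + syllable + grade layout, the sketch below splits a reading into its three parts and also produces the stripped-down form mentioned at the end of this entry (no numerals, no prime symbol, no -q). The function names split_reading and simplify are hypothetical, and the optional leading asterisk is only there so that Chinese forms such as *2sew4 can be read the same way.

```python
import re

# optional asterisk, tone digit, syllable, grade digit
READING = re.compile(r"(\*?)([0-4])(.+)([0-4])")


def split_reading(reading):
    """Return (tone, syllable, grade); 0 stands for an unknown tone or grade."""
    match = READING.fullmatch(reading)
    if match is None:
        raise ValueError(f"not in tone + syllable + grade format: {reading!r}")
    _, tone, syllable, grade = match.groups()
    return int(tone), syllable, int(grade)


def simplify(reading):
    """Drop the numerals, the prime symbol, and -q from a Tangut reading."""
    _, syllable, _ = split_reading(reading)
    return syllable.replace("'", "").replace("q", "")


for reading in ["2sew1", "2bu'4", "1zyq4", "0a0"]:
    print(reading, split_reading(reading), simplify(reading))
```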

Tangut syllables have the following consonants which may be followed by medial -w-:

p- t- ts- c- [tʂ] k-
ph- th- tsh- ch- [tʂʰ] kh-
b- d- dz- j- [dʐ] g-
m- n-   (ny- [ɲ]?) ng- [ŋ]
(lt- [tɬ]?) s- sh- [ʂ] h- [x] or [h]
ld- [dɮ]? z- [ɮ] zh- [ʐ] gh- [ɣ] or [ɦ]
v- [ʋ] or [v] l-   r-  
  lh- [ɬ]  

Initial glottal stop is unwritten. Hence w- is [ʔw] contrasting with v- [ʋ] or [v]. (Vietnamese has the same distinction: e.g., oa [ʔwa] vs. va [va])

Tangut syllables have only six vowels (not including the prime symbol -' indicating a distinction of unknown nature, -n for nasalization, or -q for tension): a, e, i, o, u, y. Y is a central vowel.

I have changed my mind so many times about Tangut grades that I have decided to use this agnostic notation instead. Removing the numerals, the prime symbol, and -q results in a simplified notation suitable for lay publications.

A CAPRINE QUARTET

The last tangraph (Tangut character) in last night's entry on ovine characters was

2969 2ŋəʳ 'sheep, goat'

It was unique among tangraphs for sheep, but it has a near-lookalike among tangraphs for goats:

1-2. 1189 1104 2gwã 2te  'wild goat' (Li Fanwen 2008; Nishida 1964: 'a kind of sheep')

The top component of those two tangraphs

is shared with 121 other characters without any single common phonetic or semantic denominator: e.g.,

0745 2vɨe 'the surname syllable Ve'

0900 2vɨi 'under, below, bottom'

0901 2nɤẽ 'dirt, filth'

0911 2kɑ 'arduous, difficult, hard, tough'

0916 1tshu 'conceited, restrained'

The analysis of 1104 is unknown, but 1189 is derived from the top of 1422 1sy' 'deer' plus all of 5181 1tsɑʳ 'beast, animal':


5181 in turn has a circular derivation with 5167 1ɬa 'deer':


Why derive 'goat' or 'sheep' from 'deer'?

The center component of 2969 'sheep, goat' and 1104, the second half of 'sheep' or 'wild goat', is shared with


3. 2367 1tshə 'goat' =

left and center of 3454 2tshe 'sheep' +

right of 1neʳ 'wild animal'

whose right side has the same source as the left side of


2969 2ŋəʳ 'sheep, goat' (and probably the second half of 1189 1104 2gwã 2te 'wild goat' or 'a kind of sheep' as well)

3454 2tshe may be from 2367 1tshə plus *-j.

3454 and 2367 share their left and center components with

4. 3768 2tʂɨụ 'goat, lamb' (analysis unknown)

whose right side is in only two other tangraphs:

1852 2tʂɨụ 'to be able, to dare' (analysis unknown)


3805 1tʂɨụ 'that' =

right of first half of 5883 2tsha 1ʐɨiw 'to bully and humiliate' (the second half is 'to invade' by itself) +

right of 3768 2tʂɨụ 'goat, lamb' (analysis unknown)

Since all three are nearly homophonous, their shared component

is clearly phonetic.

OTHER OVINE TANGRAPHS

Yesterday I wrote about the tangraph (Tangut character) for the calendrical term 2mø 'sheep':


Given that tangraphy (the Tangut script) is supposed to be largely semantic, it is surprising that the other ten 'sheep' tangraphs do not share any single element with each other or 2mø:

Tangraph Li Fanwen number Reading Gloss Notes Type
3452 2ʔe sheep
left side shared with 2550 2ʔe 'banquet'; phonetic A
3470 1mə 1tʂɨa 1mʌ 'patron god of sheep'
left side phonetic; 1mʌ is homophonous with 3513 1mʌ and may be an adjective 'heavenly'

1588 1tʂɨa B
3454 2tshe

left side phonetic/semantic?; also in
2367 1tshə 'goat'
2910 1tshə 'lamb'
3127 1tshə (second half of 2ɮəʳ 1tshə  'waterfowl'; 2ɮəʌʳ is 'water', so is the word literally 'watergoat' or 'waterlamb'?)
5959 1pʌʳ 'lamb'
2tẹ right side shared with 2910 1tshə 'lamb'
sounds like a possible *S-prefixed derivative of the second half of 1189 1104 2gwã 2te  'wild goat'
5557 2ga
old sheep (only in dictionaries) left from 2910 1tshə 'lamb' (semantic)
right from 0590 2ɮø̣ 'longevity' (semantic)
1nɪʳ one-year-old sheep (only in dictionaries)

looks like 5557 plus a right-hand component, but analyzed as:
left from 3452 (semantic; above)
center from 0590 2ɮø̣ 'longevity' (semantic)
right from 1996 2de 'to drink'  (why?)
5004 1gyʳ no semantic components shared with 5095 in spite of being a synonym
top and bottom right from 5007 2gyʳ 'to lie down' (phonetic)
bottom left from 0276 2no 'child' (semantic)

4925 1nwew
six-year-old sheep (only in dictionaries) top and bottom left from 4971 1ʂwɨi 'year' (semantic)
bottom right from 3194 ɬʌ (tone unknown) 'full' (semantic)
sheep, goat
left from 0558 1neʳ 'wild animal' (semantic; also at right of 2367 1tshə 'goat')
right from
all of 1153 1dʐə 'skin' (semantic; according to Precious Rhymes of the Tangraphic Sea) or
the second half of 1189 1104 2gwã 2te  'wild goat' (semantic; according to Combined Homophones and Tangraphic Sea)

Types A-C have right-hand components similar but not identical to the right side of 5504:




The significance of the different top parts is unknown.

Types B-D share a component

(coincidentally?) resembling Chinese 羊 'sheep' with 48 other tangraphs which lack ovine semantics: e.g.,

0079 ~ 0676 1vɨe 'to go'

0123 dʐɨi (tone unknown) 'to go'

0561 2dʐɨo 'to help' (< Chn 助)

0787 1kɑ̃ 'to expel' (< Chn 赶)

Type E tangraphs have a 'horned hat' in common, but according to Tangraphic Sea analyses, that 'hat' has two different sources:

5095 1nɪʳ 'one-year-old sheep' = 5007 2gyʳ 'to lie down' + 0276 2no 'child'
5004 1nwew 'six-year-old sheep' = 4971 1ʂwɨi 'year' + 3194 ɬʌ (tone unknown) 'full'

Type F is unique.

1.3.1:05: Or is it? I will look at 'goat' tangraphs next.

A EWE-NIQUE COMPONENT?

2015 is the year of the 2mø 'sheep':

whose tangraph (Tangut character) consists of two halves of unknown function:


The second half is so rare that I don't think it's in my Tangut radical fonts. I thought it was unique to 5504 'sheep', but it's also in one variant of 3007 2Taʳ 'net' (dental initial unknown):


The first variant of 3007 with 亠 at the top right has a unique right side:

Are these rare right-hand components single units or compounds whose parts are abbreviations of other tangraphs?

=(< ?)+(< ?)?

=(< ?)+(< ?)?

The bottom half of those components, Kychanov and Arakawa's radical B299, is in 23 tangraphs (their 5096-5118). They have no obvious single phonetic or semantic common denominator, and none sound like 5504 2mø 'sheep' or 3007 2Taʳ 'net', so B299 cannot be phonetic in those two tangraphs.

B299 looks like the right-hand version of Li Fanwen (1987) radical 71 (in Andrew West's numbering):

which is in non-right-hand positions in eight tangraphs:

Tangraph Li Fanwen number Reading Gloss Notes
0083 1vɪ dragon semantic component in 4234 (below)
1188 2ŋɑ egg only in dictionaries
4234 1vɪ dragon plant longan (< Chinese 'dragon eye'); 'dragon' with 'wood' on top
5441 2swi mother-in-law center and right from 1188 'egg' (above)
5496 2bə cheek variant of 5510 which has 'flesh' on the left instead of LFW radical 71; I assume 'flesh' is original
5529 2ʂwɨi the surname Shi share a common phonetic
5530 1ʂwɨi first half of 5530 4761 1ʂwɨi 1ʂwɨa 'in a soft low voice'
5546 2ʂwɨi second half of 3278 5546 1tshi 2ʂwɨi 'food'

Complementary distribution is no guarantee of allography. Conversely, the facts that

- none of the LFW 71 tangraphs appear in known analyses of B299 tangraphs

- none of the B299 tangraphs sound like ʂwɨi or (except maybe 5696 1bɛ 'smallpox')

do not necessarily rule out a relationship between the two: e.g., the two radicals might be interchangeable in tangraphic analyses that have not yet been (re)discovered. (If only we could see what Nevsky saw before his texts were taken away!)

Nishida radical 241


the left side of 2mø 'sheep', can appear in initial, medial, and final position in 109 other tangraphs. None of the 109 others sound like 2mø, so radical 241 must be semantic and be an abbreviation of some other tangraph ... unless it is a Tangut B phonetic component in 5504.

Next: Other ovine tangraphs.

THE GOLDEN GUIDE: LINE 103: TANGRAPHS 511-515

103. I can't remember what I was supposed to blog about next, so I'm going to stick with tradition and finish the year with a line from the Golden Guide. Three lines to the last surname ...

Tangraph number 511 512 513 514 515
Li Fanwen number 3601 0914 4305 3801 0775
My reconstructed pronunciation 2khew 1gɤe' 1tsũ ?kø̃ 1gy
Tangraph gloss opening, entrance a place name Ge the surname 宗 Zong (*tsũ) 'ancestor' second half of 1kə ?kø̃ 'dung beetle' transcription character
Word the surname 寇 Kou (*khew) the surname 崖 Ya (*ŋgɤe) the surname 姜 Jiang (*kø̃) the surname 虞 Yu (*ŋgy)
Translation Kou, Yi(ng?), Zong, Jiang, Yu

511: 3601 is a borrowing of Tangut period northwestern Chinese 口 *khew 'mouth' (albeit with a specialized meaning), but its character is a much more complex semantic compound:


3601 2khew 'opening' =

left of 2627 2lɨə̣ 'earth' +

all of 4866 1kɤe 'incomplete' (why?)

The presumably semantic function of 4866 is unclear. Would 'complete earth' be solid and lack an opening?

512: I presume the sources of 0914 somehow describe Ge, wherever that was: e.g., Ge was high and its earth looked as if it had been kneaded by the gods.


0914 1gɤe' 'a place name Ge' =

top of 1166 1tswə 'to knead, rub' +

right of 0464 2so 'high'

Oddly 0914 appears as a transcription of Chinese 櫻 *ʔĩ in the Timely Pearl, which may be why Nie and Shi (1995) regarded 0914 as a transcription of the homophonous Chinese surname 嬰.

513: I presume 5635 is phonetic in 4305 despite the mismatch in initial voicing:


4305 1tsũ 'ancestor' =

top of 4250 1si 'wood' +

all of 5635 2dzu, second half of 2me 2dzu 'steed'

Unlike other 'horse' words,

1379 5635 2mɛ 2dzu 'steed' (only in dictionaries; a 'ritual' word?)

was written without the 'horse' radical:

Why wasn't that radical in all equestrian tangraphs?

514: The analysis of 3801 is unknown, but it looks like a transparent compound of the radicals for 'insect' and 'small':


I am uncertain about the tone of 3801. It is listed in the second tone volume of Tangraphic Sea, even though its placement in Homophones implies that it is a first tone tangraph.

515: 0775 is a straightforward semantophonetic compound:


0775 1gy 'transcription character' =

left of 1586 1ɣɤị 'sound' +


Mentioning Marathi व्ह् <vh> last night reminded me of a topic I've had in mind since I briefly mentioned Verner's Law two weeks ago: the segmental traces of accent in Avestan.

Avestan has an h before rk and rp that doesn't correspond to any consonant in other languages: e.g.,

'wolf': A hrk-; cf. Sanskrit vṛ́ka, Russian volk

'body': A hrp-; cf. Latin corpus, Sanskrit kṛ́p- 'beauty'

Moreover, Avestan sometimes has a ṣ (< *hrt?) where an rt is expected: e.g.,

'battle': A ana-; cf. Sanskrit pṛ́tanā

Did Avestan preserve an h - a laryngeal? - lost elsewhere in Indo-European? No, it turns out that h and ṣ were conditioned by an accent preserved by Sanskrit but unwritten in Avestan:

*ṛ́ > *ə́r > əhr before *k, *p

*ṛ́t > *ə́rt (> *əhrt?) > əṣ 

*r also became hr (and merged with *t to become ṣ) after other vowels: e.g.,

*árt (> *ahrt?) > aṣ in maya- 'man'; cf. Sanskrit mártya-

See chapter III, section 2 in Beekes (1988) for problematic cases in which Avestan accent does not agree with Sanskrit accent; the Sanskrit accent may not necessarily be conservative.

What is the phonetic motivation for those changes? I suspect hr was voiceless [r̥]. Why would *r devoice after a stressed vowel?

Similarly, why would Germanic spirants (e.g., þ) remain voiceless after an originally accented vowel but become voiced elsewhere?

'brother': *bhréʕtēr > Gothic broþar; cf. Sanskrit bhrā́tar-

'father': *pʕtḗr > Gothic fadar; cf. Sanskrit pitár-

If the Proto-Indo-European accent was a high tone, there may have been a correlation between high tones and voicelessness vaguely reminiscent of that found in Cantonese. However, higher Cantonese tones are associated with historically* voiceless initials that precede them, whereas Indo-European high tones were associated with voiceless consonants that followed them. Moreover, the Cantonese correlation involves all historically voiceless consonants, whereas the Indo-European correlations only apply to a limited number of consonants - just one in the case of Avestan! Lastly, there is one other accent-related shift in Avestan that does not involve voicing - or does it? *h (< *s) became x́ before y and an accented vowel:

'to do homage': nəmax́ya-; cf. Sanskrit namasyá-

Could the spirant sign transliterated as x́ have been voiced [ɣʲ] (assimilating to voiced palatal y [j]), just as Germanic spirants voiced before accented vowels?

*Modern Cantonese has many voiceless initials before vowels with low tones. Those initials are historically voiced: e.g., 唐 tʰɔːŋ ˨˩ <  *dɑŋ 'Tang (Dynasty)'.

