Amaravati: Abode of Amritas

18.7.7.14:15: THEY KANTU BE COGNATES II: A BIG MISTAKE IN TANGUT ETYMOLOGY

Middle Chinese (MC) *d- has two main Early Old Chinese (EOC) sources: *d- and *l- before the 'lower' vowels¹ *a/e/o. (I list all other sources here.²)

In theory, MC 大 *da̤(j) 'big' < EOC *lats (last seen in my last post) could have had *d- or *l- in EOC, and in fact, the word has been reconstructed in Old Chinese with both initials (*d- by Schuessler 2009 and *lˁ- [= my *l-] by Baxter and Sagart 2014). Two pieces of evidence point toward *l-:

- an alternate spelling as 世 *Hɯ-lap-s (Baxter and Sagart 2014: 109; they reconstruct *l̥ap-s)

*H- indicates a consonant that conditions aspiration or devoicing: *Hɯ-l- > *l̥-.

The use of 世 could indicate that 'big' was really *laps or that 世 was chosen to write *lats after *-ps merged with *-ts. Unfortunately there are no known cognates that could point to *-t or *-p.

- the 古丈 Guzhang subvariety of 瓦鄉 Waxiang, a variety that preserves *l- in 'type A' syllables with 'lower(ed)' vowels, has /lu 22/ 'big' with initial /l-/ (Baxter and Sagart 2014: 109).

I am guessing /u/ is from *-as and not *-ats.

For some time I thought EOC *lats (or *laps?) was cognate to Tangut

𘜶

4456 2leq3 'big'

and would have correlated the pre-Tangut *S- which conditioned -q (tenseness) with an aspirating prefix *H- that made *lats into 太 *H-lats > MC *tʰa̤j 'great'.

Now I see that a proto-Sino-Tibetan *l-word for 'big' based on those Chinese and Tangut words is impossible. The rhymes do not match. Converting Jacques' pre-Tangut reconstructions into my system, I posit three possible sources for *2leq3:

1. *Sɯ.leH

2. *Sɯ.leŋH
3. *Sɯ.laŋH

Lining up the components of the EOC and pre-Tangut forms:

EOC	H (= s?)	(V?)*	*l	*a	t or p	*s
Pre-Tangut	*S	*ɯ	*l	a or e	Ø or ŋ	*H
Match	✓	(✓?)	✓	✓ or ✕	✕	✓

In theory, EOC 太 'great' could have been *sɯ-lats with a high presyllabic vowel that was lost before it could trigger partial vowel raising in the main syllable.

The presyllables might match, but there is no way pre-Tangut *-e, *-eŋ, or *-aŋ can be reconciled with EOC *-ats or *-aps. So I can only regard the pre-Tangut and EOC words for 'big' as lookalikes.
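The rhyme mismatch can be checked mechanically. The following is only a minimal sketch: the names are hypothetical, the final *-s and *-H are left out because the table already treats them as matching, and 'match' is naively taken to mean identity of segments, which is an assumption rather than a claim about how real correspondences work.

```python
from itertools import product

# Candidate rhymes as (vowel, coda) pairs, without the final *-s / *-H.
EOC_RHYMES = [("a", "t"), ("a", "p")]                     # *-at(s), *-ap(s)
PRE_TANGUT_RHYMES = [("e", ""), ("e", "ŋ"), ("a", "ŋ")]   # *-e(H), *-eŋ(H), *-aŋ(H)


def compare(eoc, pre_tangut):
    """Compare one EOC rhyme with one pre-Tangut rhyme, slot by slot."""
    return {
        "vowel": eoc[0] == pre_tangut[0],
        "coda": eoc[1] == pre_tangut[1],
    }


# Every possible pairing of an EOC candidate with a pre-Tangut candidate.
pairs = list(product(EOC_RHYMES, PRE_TANGUT_RHYMES))

# Pairings in which both the vowel and the coda agree.
full_matches = [p for p in pairs if all(compare(*p).values())]

# Pairings in which at least the vowel agrees (only those with *-aŋ).
vowel_matches = [p for p in pairs if compare(*p)["vowel"]]

# Pairings in which the coda agrees.
coda_matches = [p for p in pairs if compare(*p)["coda"]]
```

Under these assumptions the vowel agrees only when the pre-Tangut source is *-aŋ, and the coda never agrees, which is the 'a or e' / ✕ pattern in the table.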

¹EOC had two sets of vowels:

quality	palatal	neutral	neutral	labial
stress	+	+	-	+
higher	*i	*ə	*ɯ	*u
lower	*e	*a	*ʌ	*o

This happens to be identical to the higher/lower eight-vowel system I reconstruct for Early Korean apart from the inclusion of stress which is irrelevant to Korean phonology. Higher/lower systems are a trait of northeast Asian languages (EOC, Mongolic, Tungusic, Korean³, and possibly Tangut - but not Tibetan to the west, Burmese to the south, or Japanese across the sea to the east).

*ɯ and *ʌ are cover symbols for 'unknown unstressed higher vowel' and 'unknown unstressed lower vowel'. They are based on the Korean higher and lower minimal vowels which really were *ɯ and *ʌ. It seems that EOC had *i as an unstressed higher vowel at an early point, and it is possible that the unstressed subsystem was triangular: *a/i/u. If that was the case, *u has left no traces of its labiality on the following syllable, whereas *i has left traces of its palatality, and *a = *ʌ has triggered partial vowel lowering and pharyngealization.

EOC *d- and *l- before the 'higher' vowels *ə/i/u have palatalized Middle Chinese reflexes *dʑ- and *j-:

時 EOC *də > MC *dʑɨ 'time'

慎 EOC *dins > MC *dʑi̤n 'careful'

受 EOC *duʔ > MC *dʑṵ 'to receive'

怡 EOC *lə > MC *jɨ 'cheerful'

引 EOC *linʔ > MC *jḭn 'to draw a bow'

誘 EOC *luʔ > MC *jṵ 'to lead, influence'

The last two examples might have had EOC *ɟ- (me), *j- (Schuessler), or *z- (Karlgren), but let's go with a currently mainstream *l- for now.

There is no such palatalization before the 'lower' vowels (unless a higher-vowel presyllable preceded during the period of height harmony; see below).
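The conditioning described in this note can be written as a small lookup. This is a minimal sketch with hypothetical names; it covers only plain *d- and *l- before stressed vowels and deliberately ignores presyllables, so it does not model the height-harmony exception.

```python
# Stressed vowel classes as given in the note above.
HIGHER_VOWELS = {"ə", "i", "u"}
LOWER_VOWELS = {"a", "e", "o"}

# MC reflex of an EOC initial, keyed by (EOC initial, vowel height).
MC_INITIAL = {
    ("d", "higher"): "dʑ",  # 時 *də > *dʑɨ
    ("l", "higher"): "j",   # 怡 *lə > *jɨ
    ("d", "lower"): "d",    # no palatalization before lower vowels
    ("l", "lower"): "d",    # 大 *lats > *da̤(j)
}


def vowel_height(vowel):
    """Classify a stressed EOC vowel as 'higher' or 'lower'."""
    if vowel in HIGHER_VOWELS:
        return "higher"
    if vowel in LOWER_VOWELS:
        return "lower"
    raise ValueError(f"not a stressed EOC vowel: {vowel!r}")


def mc_initial(eoc_initial, eoc_vowel):
    """Return the MC initial for EOC *d- or *l- before the given vowel."""
    return MC_INITIAL[(eoc_initial, vowel_height(eoc_vowel))]
```

For instance, `mc_initial("l", "u")` gives `"j"` as in 誘, while `mc_initial("l", "a")` gives `"d"` as in 大.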

²All other sources of MC *d- (converted from Baxter and Sagart's 2014 reconstruction):

1. EOC *nasal preinitial or *nasal-ʌ-presyllable + *t-

奠 EOC *N-ten-s > MC *de̤n 'to be fixed (v.i.)'

奠 EOC *m-ten-s > MC *de̤n 'to set forth (v.t.)'

突 EOC *mʌ-tʰut > MC *dot 'to burst through'

毒 EOC *mʌ-duk > MC *dok 'to poison'

with a longer version of the volitional prefix in 奠 EOC *m-ten-s 'to set forth'

2. EOC *Cʌ.d/l-

道 EOC *kʌ.luʔ > *kʌ.lʌuʔ > MC *da̰w 'way'

cf. Proto-Hmong-Mien *kləuʔ 'way', a borrowing from Chinese

3. EOC *Cɯ.d/l- + *a/e/o > *C.d/l- + *a/e/o

The presyllabic higher vowel was lost by the period of height harmony, so it could not trigger partial raising of the vowel of the main syllable.

4. EOC *mV- + *r- > Early Middle Old Chinese (MOC) *mr- > Late MOC *d-

There were two waves of *mV.r- simplification: this one (1) and a later one (2):

Stage \ Simplification wave	1	2
1. EOC	*mV.r-	*mV.r-
2. Early MOC	*mr-	*mV.r-
3. Late MOC	*d-	*mr-
4. MC	*d-	*m-

Examples of the two waves:

Wave 1: 逮 EOC *mʌ.rəp-s > Early MOC *mrʌəts > Late MOC *dəts > MC *də̤j 'to reach to'

Like Schuessler, I would normally prefer to reconstruct *l- instead of *r-, but for the moment I want to make the Baxter-Sagart *r- work within my system. For the logic behind *r-, see Baxter and Sagart (2014: 133-134).

Wave 2: 埋 EOC *mʌ.rə > Late MOC *mrʌə > MC *mɛj 'to bury'

I could treat cases like

萏 EOC *CV-romʔ > MC *də̤m, second syllable of MC 菡萏 *ɣə̤mdə̤m 'lotus flower'

cf. Baxter and Sagart 2014's *rˁomʔ which should normally become MC *la̰m, not *da̰m!

as examples of wave 1 with presyllabic *m-, though once again I would prefer to reconstruct *l- instead of *r-.

Baxter and Sagart (2014: 134) posit different developments in different dialects instead of two waves within the same language.

5. EOC *N.r- + *a/e/o

蕩 EOC *N.raŋʔ > MC *da̰ŋ 'to beat furiously (heart)'

Again, I would normally prefer to reconstruct *l- instead of *r-, but for the moment I want to make the Baxter-Sagart *r- work within my system. This word is not in Schuessler (2009), but it would have *l- in that book's system.
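The two waves of *mV.r- simplification under source 4 differ only in timing, which is easy to see when the onsets are laid out stage by stage. A minimal sketch, with hypothetical names and with onsets only (the rest of the syllable is not modeled):

```python
STAGES = ["EOC", "Early MOC", "Late MOC", "MC"]

# Onset at each stage for each simplification wave, in STAGES order.
WAVES = {
    1: ["mV.r", "mr", "d", "d"],
    2: ["mV.r", "mV.r", "mr", "m"],
}


def onset_at(wave, stage):
    """Return the onset of the given wave at the given stage."""
    return WAVES[wave][STAGES.index(stage)]


def derivation(wave):
    """Return the chain of distinct onsets, skipping stages with no change."""
    steps = []
    for onset in WAVES[wave]:
        if not steps or steps[-1] != onset:
            steps.append(onset)
    return " > ".join(f"*{onset}-" for onset in steps)
```

Both waves pass through *mr-, but wave 1 reaches it early enough to harden into *d-, whereas wave 2 reaches it only in Late MOC and ends up as MC *m-.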

³I hesitate to say 'Koreanic' since I do not know if non-Korean Koreanic languages also had higher/lower vowel systems. Korean height vowel harmony seems to be an internal innovation dating long after EOC; it may be due to contact with Jurchen to the north.

18.7.1.23:59: THEY KANTU BE COGNATES

While looking for cognates of Vietnamese trai ~ giai 'boy' outside Vietic for my last post, I discovered a Kantu noncognate ʔandrus 'male, man' (L-Thongkum 2001¹). If a layman saw these three words and were asked to pick the one word not related to the other two, they'd choose nara-:

Kantu ʔandrus 'man'

Ancient Greek andrós 'man (genitive singular)'

Sanskrit náras 'man (nominative singular)'

But of course the last two are from Proto-Indo-European *ʕnḗr 'man'. The Ancient Greek nominative singular anḗr is almost identical apart from the epenthetic -a-.

The direct Sanskrit cognate of anḗr is nā́ 'man'. The loss of *ʕ- and the shift of *ḗ to ā́ are regular; the loss of *-r is not² (compare with PIE *dʰwṓr > Skt dvār 'door' which retains *-r but has a different irregularity - d- instead of dh-).

Sanskrit nár-a- is an extended version of the same word with an -a- suffix.

The Kantu word has a compressed variant ndrus. Kantu is a Katu dialect; other Katu varieties and Souei have

- shifted -s to -jh

(cf. Old Chinese *-ts, *-ps > Late Old Chinese *-s > Early Middle Chinese *-jh)³

- lost the nasal

- or added a prefix

assuming that they derive from Sidwell's (2005) Proto-Katuic *ʔndruːs 'male, man':

Katu (Triw) ʔandruːjh 'male, man'

Katu (Phuong) trus ~ padrɨjh 'boy, man'

why two different rhymes? different dialects?

Katu (An Diem) padruːjh 'boy, man'

Souei kantruah 'male, man'

¹Why isn't this word visible when I view L-Thongkum (2001) using the "build custom dictionary" option in the SEAlang Mon-Khmer database?

²This loss is regular for r̥-stems such as nr̥ (nominative singular nā́), but it is not regular for Sanskrit as a whole (hence the *-r-retention in 'door').

³18.7.6.22:21: An even more relevant parallel is in Vietnamese:

*-s > *-ɕ > *-jh > hỏi/ngã tone (depending on voicing of the *onset) + /j/ as in mũi 'nose'

Thavung mús 'nose' (Premsirat 2000) retains the original *-s. Ruc muːʃ 'nose' (Phu 1998) is like my intermediate stage *-ɕ between *-s and *-jh; another Ruc form, muᵊh (Phu 1998), has no final palatal segment.

In Chinese, primary and secondary *-s generally had two different reflexes which are like those of Vietnamese *-h and *-s:

Early Old Chinese	*-s	*-ks	*-ts	*-ps
Middle Old Chinese	*-s	*-s	*-ts	*-ts
Late Old Chinese	*-h	*-h	*-s	*-s
Middle Chinese	'departing tone'	'departing tone'	*-j + 'departing tone'	*-j + 'departing tone'

The general pattern of mergers (four categories into two) is clear, but the phonetic details are not: e.g., perhaps *-ks became *-x and merged with *-h from *-s at the Middle Old Chinese stage:

Early Old Chinese	*-s	*-ks	*-ts	*-ps
	*-h	*-x
Middle Old Chinese	*-h	*-h	*-ts	*-ts
Late Old Chinese	*-h	*-h	*-s	*-s
Middle Chinese	'departing tone'	'departing tone'	*-j + 'departing tone'	*-j + 'departing tone'

The two scenarios above need not be mutually exclusive, as they - and others I have not yet imagined - could represent what happened in different varieties of Old Chinese.
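The first scenario can be sketched as three successive rewrite steps. The rule tables and function name are hypothetical labels for illustration, and the phonetic values are only those of that scenario, not a claim about what actually happened in any one variety.

```python
# One rewrite table per transition, first scenario only.
EOC_TO_MOC = {"s": "s", "ks": "s", "ts": "ts", "ps": "ts"}
MOC_TO_LOC = {"s": "h", "ts": "s"}
LOC_TO_MC = {"h": "departing tone", "s": "j + departing tone"}

STEPS = [EOC_TO_MOC, MOC_TO_LOC, LOC_TO_MC]


def develop(eoc_coda):
    """Return the coda at each stage: EOC, MOC, LOC, MC."""
    history = [eoc_coda]
    for step in STEPS:
        history.append(step[history[-1]])
    return history


# The four EOC categories collapse into two MC outcomes.
MC_OUTCOMES = {develop(coda)[-1] for coda in EOC_TO_MOC}
```

For example, `develop("ps")` returns `["ps", "ts", "s", "j + departing tone"]`, the path taken by *-ps if 'big' was originally *laps.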

Late Old Chinese secondary *-s may have been phonetically *[ɕ], a simplification of *[tɕ] < *[ts] < *[ts] and *[ps]. I am unaware of any evidence for reconstructing an affricate as a source of Vietnamese *-s.

The high-frequency Early Old Chinese word 大 *lats 'big' has modern reflexes with and without [j]: e.g., Mandarin dà < *las and dài < *lats. (7.17.8:57: Thanks to Gong Xun for pointing out that Taiwanese tuā is in fact from *lats and not *las despite the superficial similarity of tuā to Mandarin dà.)

I am not aware of evidence pointing toward a dialect in which all *-ts (and *-ps?) merged with *-s and *-ks as *-s, though there is no a priori reason for doubting that such a massive merger could happen.

High-frequency words may be subject to greater erosion, so perhaps *lats had an abbreviated variant *las that became the ancestor of standard Mandarin dà (as opposed to standard Mandarin dài < *lats).

Tangut Yinchuan font copyright © Prof. 景永时 Jing Yongshi
Tangut character image fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2018 Amritavision
