09.11.21.23:57: WHY DOESN'T MIDDLE KOREAN HAVE DOZENS OF CLUSTERS?
Alexander Vovin (1996) reconstructed 14 Proto-Korean consonants:
|*p||*t||*c (= my *ts)||*k|
|*l *r||*ɲ *y|
Although *s is missing from his tables of reconstructions, I have added it because I consider its absence to be an accident. So I am pretty sure his total at the time was 15.
He later rejected S. Robert Ramsey's *b *d *g and I don't think he reconstructs *ɲ anymore. Subtracting those four consonants leaves only 11:
Only 9 or 10 of these could appear initially. I doubt that *r- ever appeared in initial position. Korean initial r- is only in loanwords. There is no evidence for Korean initial *l-, though Altaic-type languages can have initial l-: e.g., Manchu and Mongolian.
On the other hand, all 11 could appear medially. Therefore if *CVC- sequences were contracted to *CC-, early Korean could have had up to 99 different clusters:
(9 consonants) x (11 medial consonants) = 99
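The arithmetic above can be checked by brute-force enumeration. This is only a sketch: the labels `C1` to `C11` are placeholders I made up to stand in for the 11 consonants, and the choice of which 9 may be initial is arbitrary.

```python
from itertools import product

# Placeholder labels only: these stand in for the 11 consonants, not for
# any actual reconstructed segments.
MEDIALS = [f"C{i}" for i in range(1, 12)]   # all 11 can appear medially
INITIALS = MEDIALS[:9]                       # assume only 9 can appear initially


def possible_clusters(initials, medials):
    """List every *C1C2- that *C1VC2- could yield if the vowel were lost."""
    return list(product(initials, medials))


clusters = possible_clusters(INITIALS, MEDIALS)
print(len(clusters))  # 99
```

If 10 consonants could be initial rather than 9, the same function would give 110 possible clusters.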
However, if Cy-clusters are ignored, Middle Korean (MK) only had eleven clusters:
|s-clusters and geminate ss||sp-||st-||sn-||ss-||sk-|
Where did all the other clusters go?
I have already discussed how *k- and *h-clusters became MK aspirates and how nasal-obstruent sequences might have become MK voiceless obstruents.
(11.22.3:10: If initial *l- existed, it might have merged with *n-: e.g., *lk- and *nk- could have merged into *nk- which simplified to MK k-.)
Other speculations:
Simplification of obstruent + sonorant > obstruent: e.g.,
|*pl-||*pr- or *py-?||p- or py-?|
|*pn-||*pr- or *n-?||p- or n-?|
Similar tables could be written for *t-, *ts-, and *k-.
Absorption of *t(s)- into following obstruent:
*yVC- could have
- never contracted
- merged with yaC- and yəC- depending on vowel height
- become iC-
- become C- (cf. the Czech pronunciation of js- as [s])
I'm out of time, so I'll discuss *sC-clusters tomorrow.
09.11.19.2:07: DID KOREAN HAVE PRENASALIZED OBSTRUENT INITIALS?
In my 1999 PhD dissertation and 2003 book, I reconstructed prenasalized obstruents in Old Japanese corresponding to the voiced obstruents of modern Japanese:
I have excluded modern Japanese allophones: e.g., the [ŋ] pronunciation of /g/.
OJ prenasalized obstruents generally did not appear in initial position except in Chinese loanwords. Most modern Japanese words with voiced obstruent initials are loanwords. Native modern Japanese words with voiced obstruent initials have usually lost an initial vowel: e.g., de- < OJ inde- 'emerge'.
I have been wondering if early Korean not only had NC clusters like OJ but also had them in initial position. Such clusters would result from vowel loss:
*NVC- > *NC-
I would expect these clusters to become oral obstruents in Middle Korean:
*Nvh might have simplified to m- or n- depending on whether *N- was *m- or *n-.
The vowels most likely to be lost are the 'minimal' vowels *ʌ and *ɯ. I discussed an instance of possible *ɯ-loss in my last entry:
'big': PK *hɯkɯ- > MK khɯ-
The above scenario predicts that Middle Korean would have few or no words beginning with
nasal + minimal vowel + obstruent
Is this actually the case? I quickly counted entries in Yu's (1964) Middle Korean dictionary:
The above figures include forms over a wide span of time. A strict count excluding late forms and variants (e.g., mʌsʌm for mʌzʌm) might reveal a different pattern. In any case, it is clear that
nasal + minimal vowel + grave obstruent
is rare (and nonexistent if the grave obstruent is p).
The five exceptions with -k- are really only two, and their -k- may be from earlier clusters:
- four derivatives of the root mɯk- 'heavy' < ?*mɯCk-
- mɯkɯyəti- 'to spoil, rot' (in 類合 Yuhap, 1576); modern mulkhŭrŏji- cannot be a descendant of that word and implies PK ?*mɯlkkɯləti-.
Are all -t-, -s-, -ts- after nasal + minimal vowel sequences also from earlier clusters? I am particularly hesitant to derive all 27 (!) instances of mɯs- from mɯCs-. I have no idea why mɯs- is so common.
Perhaps nasal + minimal vowel + grave consonant sequences
*mɯp-, *mʌp-, *nɯp-, *nʌp-
*mɯk-, *mʌk-, *nɯk-, *nʌk-
were more prone to vowel loss than nasal + minimal vowel + coronal consonant sequences
*mɯt-, *mʌt-, *nɯt-, *nʌt-
*mɯs-, *mʌs-, *nɯs-, *nʌs-
*mɯts-, *mʌts-, *nɯts-, *nʌts-
though I don't know why.
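The kind of count described above could be automated along the following lines. This is a rough sketch, not the procedure I actually used: the pattern treats only p, k, t, s, ts as obstruents (so z is excluded, which is an assumption), and the sample is just the handful of forms quoted in this entry, not a dictionary.

```python
import re
from collections import Counter

# nasal + minimal vowel + obstruent at the start of a form.
# "ts" is listed before "t" and "s" so that the affricate is matched whole.
PATTERN = re.compile(r"^[mn][ʌɯ](ts|p|k|t|s)")
GRAVE = {"p", "k"}


def classify(form):
    """Return 'grave', 'coronal', or None if the form does not fit the pattern."""
    match = PATTERN.match(form)
    if match is None:
        return None
    return "grave" if match.group(1) in GRAVE else "coronal"


# Forms quoted in this entry, used purely as a smoke test (not a real count).
sample = ["mɯk-", "mɯkɯyəti-", "mʌsʌm", "mʌzʌm", "khɯ-"]
tally = Counter(classify(form) for form in sample)
print(tally)
```

A stricter count would also need a date for each form so that late variants could be filtered out before tallying.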
In my last post, I mentioned how Alexander Vovin (1996) reconstructed Proto-Korean *tk and *kt as sources of Middle Korean th. Other Middle Korean aspirates are also from PK *k-clusters in his reconstruction:
MK ph < *pk, *kp
MK tsh < *tsk, *kts
MK kh < *kk
Note that not all *k-clusters became aspirates:
MK (p)sk (nonaspirates) < *(p)sk
Korean aspirates in Chinese loanwords may or may not reflect Chinese aspirates. This is a complex issue that I should address elsewhere.
The clusters above presumably come from earlier *CVC sequences. For example, Vovin (1996) reconstructed PK *kɯkɯ- 'big' as the ancestor of MK khɯ- 'id.' This word is attested in 雞林類事 Jilin leishi (1104) in Chinese transcription as
very Late Middle Chinese (vLMC) *xəʔkən.
vLMC lacked the syllables *xɯ and *kɯn, so its schwa is not evidence for an early Korean schwa in 'big'.
vLMC *xəʔkən could have represented an early MK *xkɯ-n or *hkɯ-n (with a preaspirated initial) or *hɯkɯ-n. (*-n is an adjective ending.) I don't know of any language in which kk > xk, so I would prefer to reconstruct *hɯkɯ-n.
Vovin proposed that Proto-Tungusic *k became Manchu x. Could a similar change have occurred in Korean? But if PK *k became MK h, where would MK k have come from?
If Korean clusters came from *CVC sequences, there must have been *CVC sequences of the type *NVk-. Vovin proposed that medial clusters became single nonleniting consonants: PK *-Nk- > MK -k-. If this shift is extended to initial consonants, then MK initial k- could also be from PK *Nk- < *NVk-, filling the gap left by PK *k-:
|Stage 1: Early Proto-Korean||*NVk-||*k-|
|Stage 2: Vowel loss||*Nk-||*k-|
|Stage 3: Spirantization of old *k-||*Nk-||*h-|
|Stage 4: *N-loss; new k- in MK||k-||h-|
*NVk- > *Nk- > *k- > h-
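The four stages only keep the two initials apart if they apply in the order given. A minimal sketch of that ordering, using the abstract symbols N (any nasal) and V (any vowel) rather than real forms; the rule tuples and the `derive` helper are my own illustrative names:

```python
import re

# N = any nasal, V = any vowel; these are abstract symbols, not real forms.
VOWEL_LOSS = (r"^NV(?=k)", "N")      # Stage 2
SPIRANTIZATION = (r"^k", "h")        # Stage 3: only affects old bare *k-
NASAL_LOSS = (r"^N(?=k)", "")        # Stage 4

PROPOSED_ORDER = [VOWEL_LOSS, SPIRANTIZATION, NASAL_LOSS]


def derive(form, stages=PROPOSED_ORDER):
    """Apply each stage once, in order, and return every intermediate form."""
    history = [form]
    for pattern, replacement in stages:
        form = re.sub(pattern, replacement, form)
        history.append(form)
    return history


print(derive("NVk"))  # ['NVk', 'Nk', 'Nk', 'k']
print(derive("k"))    # ['k', 'k', 'h', 'h']
# With *N-loss before spirantization, the two initials would merge as h-:
print(derive("NVk", [VOWEL_LOSS, NASAL_LOSS, SPIRANTIZATION]))  # ['NVk', 'Nk', 'k', 'h']
```

In other words, the nasal has to protect *k- until spirantization is over; otherwise no MK k- would be left.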
There is one huge problem with this proposal. k is very common in MK and modern Korean. In Yu's (1964) MK dictionary, the k- section (including some sk-) is 120 pages long (15% of the text), but the h- section is only 42 pages long. Although PK *Nk- may be a source of MK k-, I doubt that PK initial *Nk- < *NVk- really outnumbered PK *k- by a ratio of three to one. So instead of trying to derive MK h from PK *k, I will follow Vovin (1996) and reconstruct *h and *k as separate phonemes in PK:
|*NVk- > *Nk-|
If PK had *h, MK aspirates may also derive from PK *Ch- and *hC- clusters; e.g.,
MK khɯ- < PK *hɯkɯ- 'big'
*h- and *k- clusters must not have been numerous in PK because Yu (1964) has relatively few entries with MK aspirate initials:
|MK aspirate initial||PK sources||Number of pages||% of text|
|tsh-||*tsVk-, *kVts-, *tsVh-, *hVts-||16||2.1|
|kh-||*kVk-, *hVk-, *kVh-||4||0.5|
|th-||*tVk-, *kVt-, *tVh-, *hVt-||11||1.4|
|ph-||*pVk-, *kVp-, *pVh-, *hVp-||11||1.4|
Although kh- has one fewer source than other aspirates, it is only a third as common as th- and ph- and a fourth as common as tsh-. Yet the k- section may be the largest in the book. Why is kh- so uncommon? Did some of its proposed PK sources actually become MK h-?
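The proportions quoted above follow directly from the page counts. A quick check, using only the figures already given for Yu (1964); the dictionary name `PAGES` and the helper `ratio` are just illustrative:

```python
# Page counts per initial section in Yu (1964), as cited in this entry.
PAGES = {"tsh-": 16, "kh-": 4, "th-": 11, "ph-": 11, "k-": 120, "h-": 42}


def ratio(a, b):
    """How many pages section a has for each page of section b."""
    return PAGES[a] / PAGES[b]


print(round(ratio("kh-", "th-"), 2))   # 0.36, roughly a third
print(round(ratio("kh-", "tsh-"), 2))  # 0.25, a fourth
print(round(ratio("k-", "h-"), 2))     # 2.86, roughly three to one
```

Page counts are of course only a crude proxy for the number of entries.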
David Boxenhorn asked me if pre-Tangut *tC- could have merged with *pC- or *kC-. My initial answer was no. Then I remembered that I proposed mergers of *t- with *s- or *r-:
Table 1. *t- and *s- merger scenario
|Pre-Tangut stage 1: consonant clusters||Pre-Tangut stage 2: tense consonants||Tangut: tense vowels|
Table 2. *t- and *r- merger scenario
|Pre-Tangut stage 1: consonant clusters||Pre-Tangut stage 2: metathesis||Pre-Tangut stage 3: medial *-r-||Tangut: retroflex vowels|
*t-tV would still merge with *s-tV in this scenario.
Later, I thought of a third scenario. What if preinitial *t- survived in some morphemes whose original root initials had become -w- or -h-?
Table 3. Preinitial > initial > -w-/-h- chain scenario
Pre-Tangut stage 1: consonant clusters
|Pre-Tangut stage 2: lenition||
Pre-Tangut stage 3: metathesis and merger
*t-p-, *t-ph-, *t-b-
*t-k-, *t-kh-, *t-g-
*k-t-, *k-th-, *k-d-
I was influenced by Alexander Vovin's (1996) reconstruction of Proto-Korean *tk and *kt as sources of Middle Korean th.
*p-th- and *p-d- became thw- and dw-, not tw-.
*t-b- might have become *db- > *dβ- > dw-.
*t-g- might have become *dg- > *dɣ- > d- or even the dh- in Tibetan transcriptions of Tangut (which may represent a nonstandard dialect with a d- : dh-distinction).
If the preinitial-to-initial chain scenario is correct, some Tangut morphemes with initial tw- < *t-P- should have labial-initial cognates and some Tangut morphemes with initial th- < *t-K- should have velar-initial cognates. Do such cognates exist?
09.11.16.3:06: DOES PROTO-RGYALRONG NEED BOTH *L- AND *J-?
Guillaume Jacques reconstructed both on pp. 333-334 of his 2004 PhD dissertation, but I think only one might be necessary. They occur in complementary distribution in his reconstruction:
Both became j- in Japhug except before *-ʑ-. *l- to *j- shifts are in bold.
If *l- were reconstructed instead of *j-, the only oddity would be *ll-:
I would tentatively reconstruct *cl- instead of *ll-, since Guillaume (2004: 324) wrote
[Japhug] jl- pourrait venir de *cl- ou de *jl- en PGR [Proto-rGyalrong]. Ce groupe pourrait avoir plusieurs origines, comme le suggère la correspondance avec le zbu rj- et lɟ- [ʎɟ- on p. 323].
Oddly, he does not mention PGR *cl- again, and his table on p. 334 lists PGR *jl- as the sole source of Japhug jl-. Here's a table of correspondences for Japhug jl- based on Guillaume's table 261:
|Proto-rGyalrong (my guesses)||Japhug||Somang||Zbu|
|*cl-||jl-||jl-||lɟ- or ʎɟ-|
Here's how my PGR forms might have developed into their modern reflexes:
PGR *cl- > *ɟl- > *ʑl- > Japhug and Somang jl-
PGR *cl- > *lc- > Zbu lɟ- or ʎɟ-
PGR *lr- > *jr- > Japhug and Somang jl-
PGR *lr- > *rl- > Zbu rj-
(but where does Japhug jr- come from?)
09.11.15.15:39: PRE-TANGUT PREINITIALS: GUILLAUME JACQUES' PROPOSAL (PART 1)
I don't have time to go through Guillaume's 2009 paper 原始西夏語的前置音 "Pre-Tangut Pre-initials" in detail right now, so I'll just provide a simple comparative table. Non-Tangut preinitials are merely listed to show the preinitial inventories of related languages: e.g., one should not assume that all labial preinitials are descendants of a single Proto-Sino-Tibetan prefix *p- (or that preinitials are prefixes rather than parts of roots).
Table 1. Preinitials and presyllables in selected Sino-Tibetan languages
|Sagart's Old Chinese (1999)||Classical Tibetan||Proto-rGyalrong (Jacques 2004)||Japhug rGyalrong||Mawo Qiang||Pre-Tangut (this site)||Pre-Tangut (Jacques 2009)||Tangut reflex of pre-Tangut preinitial or presyllable|
|*p-||b-||*p-||p-, w-, f-, β-||(none; *ɸ-, *β- merged with *x-, *ɣ-?)||*p-||*p-||medial -w-|
|*t-||d-||*t- (rare; only before p-)||(*t- merged with *r-?)||no equivalent||(same as *r-?)|
|*s-||s-||*s-, *ɕ-||s-, z-, ɕ-, ʑ-, ʂ-||s-, z-, ʂ-||*s-||*S-||tense vowel|
|infix *-r- (= my preinitial *r-)||r-||*r-||r-||r-, hr-||*r-||*r-||retroflex vowel|
|l-||*l-, *j-||l-, j- (< *l-, *j-)||*l-||no equivalent||Grade III/Grade IV medial|
|*k-||g-||*k-||k-, x-, ɣ-||x-, ɣ-||*k-||no equivalent||aspirate initial|
|*m-, *N-||m-, H- [N]||*m-, *n-||m-, n-||m-||(*N-)||no equivalent||voiced obstruent initial|
|*q-||(*q- > *ɣ- > H-?)||*q-||χ-, ʁ-||χ-, ʁ-||(*q- merged with *k-?)||no equivalent||(same as *k-?)|
In every one of these languages, preinitials are only a subset of initials: e.g., all of the above languages distinguish unaspirated and aspirated initials but not unaspirated and aspirated preinitials,* and none has affricate preinitials.
I did not know about Guillaume's reconstruction until last Thursday, and I have been reconstructing preinitials and presyllables on this blog at least as far back as 2007. So I was happy to see that we reconstruct three preinitials more or less identically:
*p-
*r-
*s- (Guillaume's *S-)
Gong (1999) proposed *s-, but I reconstructed *p- and *r- independently of Guillaume.
Guillaume's preinitial *C- corresponds to my presyllable *CV-. In my reconstruction, obstruents lenite in intervocalic position: e.g.,
*CV-pV > *CV-bV > *CV-βV > *βV > vV
*CV-tV > *CV-dV > *CV-lV > lV
*CV-(t)sV > *CV-(d)zV > *CV-zV > zV
*CV-kV > *CV-gV > *CV-ɣV > ɣV
I got this idea from Vietnamese and Korean intervocalic lenition, though Korean never lost the first vowel of the conditioning environment:
Vietnamese:
*VpV > *VbV > *VβV > βV > vV
*VtV > *VdV > VdʲV > dʲV > zV (orthographic dV)
*VcV > *VɟV > ɟV > zV (orthographic giV)
*VkV > *VgV > *VɣV > ɣV (orthographic gV)
Korean:
*VpV > *VbV > *VβV > VwV
*VtV > *VdV > VrV
*V(t)sV > *V(d)zV > VzV > VV
*VkV > *VgV > *VɣV > VV
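The pre-Tangut chains given earlier (*CV-pV > ... > vV and so on) can be restated as three ordered steps: voicing, weakening, then loss of the presyllable. Here is a small sketch that regenerates them; the table names `VOICE`, `WEAKEN`, `LATER` and the function `lenition_chain` are my own labels, and I write the affricate as plain ts rather than (t)s.

```python
VOICE = {"p": "b", "t": "d", "ts": "dz", "k": "g"}     # step 1: intervocalic voicing
WEAKEN = {"b": "β", "d": "l", "dz": "z", "g": "ɣ"}     # step 2: lenition
LATER = {"β": "v"}                                      # change after presyllable loss


def lenition_chain(root_initial):
    """Spell out the pre-Tangut chain for a root with the given initial."""
    voiced = VOICE[root_initial]
    weak = WEAKEN[voiced]
    chain = [f"*CV-{root_initial}V", f"*CV-{voiced}V", f"*CV-{weak}V"]
    if weak in LATER:
        # The lenited consonant changes once more after the presyllable is gone.
        chain += [f"*{weak}V", f"{LATER[weak]}V"]
    else:
        chain.append(f"{weak}V")
    return " > ".join(chain)


for initial in ("p", "t", "ts", "k"):
    print(lenition_chain(initial))
```

The Vietnamese and Korean chains differ only in the contents of the lookup tables and in whether the first vowel is lost.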
Four of the preinitials I have reconstructed have no equivalents in Guillaume's reconstruction:
1. *t-: I reconstructed this to fill out the subsystem of stop preinitials. Since Tangut has many syllables with retroflex vowels and I found it difficult to believe that they all originated from syllables with *r, I wondered if some originated from earlier *TVCV syllables:
Stage 1: *tVCV, *thVCV, *dVCV
Stage 2: *tVCV (merger of dental stops before unstressed vowel)
Stage 3: *t-CV (loss of unstressed vowel)
Stage 4: *d-CV (voicing of *t-; assimilation before a voiced C?)
Stage 5: *r-CV (lenition of *d)
Stage 6: *CrV (metathesis to avoid awkward *r-C-sequence)
Stage 7: *CrVʳ (retroflexion of vowel after *r)
Stage 8: *CVʳ (loss of *r)
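To make sure the eight stages really do funnel all three Stage 1 shapes into a single retroflex-vowel syllable, here is a sketch that applies them as ordered rewrite rules. The strings are the abstract templates from the list above (C and V are cover symbols), and the regular expressions are my own encoding of each stage, not part of the reconstruction itself.

```python
import re

# One (label, pattern, replacement) triple per change, i.e. Stages 2 to 8.
STAGES = [
    ("merger of dental stops", r"^(th|d)(?=VCV$)", "t"),
    ("loss of unstressed vowel", r"^tV(?=CV$)", "t-"),
    ("voicing of *t-", r"^t-", "d-"),
    ("lenition of *d", r"^d-", "r-"),
    ("metathesis", r"^r-C", "Cr"),
    ("retroflexion of vowel", r"^CrV$", "CrVʳ"),
    ("loss of *r", r"^Cr", "C"),
]


def stages_of(template):
    """Return the shape of the syllable at Stages 1 through 8."""
    shapes = [template]
    for _label, pattern, replacement in STAGES:
        template = re.sub(pattern, replacement, template)
        shapes.append(template)
    return shapes


for start in ("tVCV", "thVCV", "dVCV"):
    print(" > ".join(stages_of(start)))
```

All three inputs end up as CVʳ, which is the point of the scenario: the dental preinitial leaves no trace other than the retroflex vowel.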
However, the rarity of *t- in Proto-rGyalrong and the absence of t- in Mawo Qiang make me wonder whether pre-Tangut really had *t-. Did their common ancestor merge Proto-Sino-Tibetan *t- with another preinitial?
2. *k-: I reconstructed this to account for aspirated initial members of word families. I was inspired by the development of aspiration in Korean
*kC-, *Ck- > Ch-
though *Ck- is not a source of Tangut aspirates:
Table 2. Tangut aspiration alternations and their pre-Tangut sources
|p- ~ ph-||*p- ~ *k-p-|
|t- ~ th-||*t- ~ *k-t-|
|ts- ~ tsh-||*ts- ~ *k-ts-|
|tʃ- ~ tʃh-||*tʃ- ~ *k-tʃ-|
|k- ~ kh-||*k- ~ *k-k-|
If pre-Tangut had a *q- (cf. the uvular preinitials of rGyalrong and Qiang), a similar table could be written substituting *q- for *k-.
3. *N-: I reconstructed this to account for voicing alternations in word families, but I now view these as zero ~ *k-alternations:
Table 3. Tangut voiced ~ aspirate alternations and their pre-Tangut sources
|Tangut alternation||My previous pre-Tangut||My current pre-Tangut|
|b- ~ ph-||*N-ph- ~ *ph-||*b- ~ *k-b-|
|d- ~ th-||*N-th- ~ *th-||*d- ~ *k-d-|
|dz- ~ tsh-||*N-tsh- ~ *tsh-||*dz- ~ *k-dz-|
|dʒ- ~ tʃh-||*N-tʃh- ~ *tʃh-||*dʒ- ~ *k-dʒ-|
|g- ~ kh-||*N-kh- ~ *kh-||*g- ~ *k-g-|
|l- ~ lh-||*N-lh- ~ *lh-||*l- ~ *k-l-|
*k- lenited to *x- (cf. Qiang), metathesized, and devoiced the adjacent voiced obstruent: e.g.,
*k-b- > *x-b- > *bx- > *bh- > ph-
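Put together, a single *k- prefix would yield the aspirate member of both kinds of alternation. A small sketch of the outcome, skipping the intermediate stages; the two lookup tables and the function name `with_k_prefix` are illustrative, and treating l ~ lh like the obstruents is simply read off Table 3.

```python
# Devoicing of a voiced obstruent next to lenited *k- (l is not affected).
DEVOICE = {"b": "p", "d": "t", "dz": "ts", "dʒ": "tʃ", "g": "k"}
# Aspirated counterpart of each plain initial.
ASPIRATE = {"p": "ph", "t": "th", "ts": "tsh", "tʃ": "tʃh", "k": "kh", "l": "lh"}


def with_k_prefix(initial):
    """Tangut reflex of pre-Tangut *k- plus the given root initial."""
    plain = DEVOICE.get(initial, initial)
    return ASPIRATE[plain]


for initial in ("p", "b", "tʃ", "dʒ", "l"):
    print(f"{initial}- ~ {with_k_prefix(initial)}-  <  *{initial}- ~ *k-{initial}-")
```

Voiced and voiceless root initials share one outcome here, which is why no second prefix is needed to account for the attested alternations.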
If *N- existed, I would expect it before nonaspirates as well as aspirates:
Table 4. Hypothetical Tangut voiced ~ nonaspirate alternations and their pre-Tangut sources
|Hypothetical Tangut alternation||Pre-Tangut|
|b- ~ p-||*N-p- ~ *p-|
|d- ~ t-||*N-t- ~ *t-|
|dz- ~ ts-||*N-ts- ~ *ts-|
|dʒ- ~ tʃ-||*N-tʃ- ~ *tʃ-|
|g- ~ k-||*N-k- ~ *k-|
But Gong (1988) did not find such alternations, and I cannot imagine why *N- would only attach to aspirates. So it is simpler to reconstruct a single prefix *k- attaching to both voiced and voiceless obstruents instead of two prefixes *k- and *N- in complementary distribution.
Although I suspect pre-Tangut did have a preinitial nasal like its relatives, such a nasal cannot be reconstructed on the basis of known alternations.
4. *l-: Until last night, I assumed that if Tangut had an *l-, it merged with *r-:
Table 5. *l-/*r-merger scenario 1
|Stage 1||Stage 2: merger||Stage 3: metathesis||Stage 4: retroflexion||Stage 5: medial loss|
Table 6. *l-/*r-merger scenario 2
|Stage 1||Stage 2: metathesis||Stage 3: merger||Stage 4: retroflexion||Stage 5: medial loss|
But more recently, I wonder if *l- could be a source of medial -ɨ- (Grade III) and -i- (Grade IV) in Tangut:
Table 7. *l- as source of Grade III/IV
|Stage 1||Stage 2: lenition||Stage 3: metathesis|
|Grade III initial||*l-CV||*j-CV||CɨV|
|Grade IV initial||CiV|
The shift of *l- to j- occurred in rGyalrong: e.g.,
Proto-rGyalrong *lp- > jp- in Japhug tɤ-jpa, Somang ta-jpâ 'snow'
cf. Tangut vị (Grade III/IV neutralized) < *si-lpa 'id.' (derivation here**)
However, Proto-rGyalrong *l- doesn't always correspond to Tangut Grade III or IV: e.g.,
PGR *lpaˠm > Japhug tɤ-jphɣom, Somang tə-rpâm, Zbu tɐ-lvɐ́m 'ice'
cf. Tangut vọ (Grade I) < *sʌ-pam 'id.' (derivation here***)
Guillaume reconstructed a distinction between PGR *l- and *j-, but I think a single *l- is sufficient. I'll explain why next time.
*It is possible that preinitials had allophonic aspiration in proto-languages: cf. Khmer:
[khmae] /kmae/ 'Khmer' (with allophonic aspiration of the preinitial /k/)
**One possible derivation of 'snow':
*si-lpa > *si-pla > *si-pia > *si-bia > *si-βia > *si-βi > *s-βi > *zβi > *ββi > *ββị > *βị > vị
I assume that vowel harmony simplified *ia to *i before *i in a presyllable.
It's also possible that no *l- was involved in the 'brightening' of the vowel from *a to i:
*si-pa > *si-pia > *si-bia > *si-βia > *si-βi > *s-βi > *zβi > *ββi > *ββị > *βị > vị
***One possible derivation of 'ice':
*sʌ-pV > *sʌ-bV > *sʌ-βV > *s-βV > *zβV > *ββV > *ββṾ > *βṾ > vọ
I have left out the development of the rhyme since I don't know its chronology relative to the development of the initial and tenseness, with one exception: tense nasal *-ọ̃ merged with tense oral *-ọ (unlike *-õ and *-o, which remained distinct):
*-ạm > *-ạ̃m > *-ạ̃w > *-ɔ̣̃ > *-ọ̃ > -ọ
I reconstruct a low vowel in the presyllable to condition Grade I. A high vowel would have led to Grade III or IV. I used to reconstruct low vowels in presyllables to condition Grade II instead of I, but I doubt that was correct. I also used to think Grade II might be velarized. But currently I have no idea how Grade II originated.
Table 8. Presyllabic origins of the four Tangut grades
|Tangut||My old pre-Tangut||My current pre-Tangut|
|Grade III||*Cɯ-, *Ci- (grade determined by initial of root syllable)|
The above scheme is modelled after my Old Chinese reconstruction:
Table 9. Presyllabic origins of the four Middle Chinese grades (extremely simplified)
|Middle Chinese||Old Chinese||Example|
|Grade I||*Cʌ-||MC *kej < OC *Cʌ-ki|
|Grade II||*rʌ-||MC *kɛj < OC *rʌ-ki|
|Grade III||*rɯ-||MC *kɨi < OC *r(ɯ)-ki|
|Grade IV||*Cɯ-||MC *ki < OC *(Cɯ)-ki|
Note that MC *kej is in fact Grade IV in Yunjing because the grade system was devised after *kej > *kjej and the latter was classified as having a palatal rhyme like Grade IV *-i. But historically *kej and *ki have very different origins that are obscured by this later conflation. Grade 'IV' *kej < OC *Cʌ-ki has much more in common with Yunjing Grade I *ka < OC *(Cʌ-)ka than Yunjing Grade IV *ki < OC *(Cɯ)-ki.
The *-ej > *-jej change may not have occurred in the south, as there is no trace of *-j- in most southern Chinese languages or Vietnamese or Siamese loans from southern Chinese:
Table 10. Forms of 'chicken'
|Sinograph||Early MC||Late (non-southern) MC||Meixian Hakka||Cantonese||Xiamen Min||Chaozhou Min||Fuzhou Min||Vietnamese||Siamese|
|雞||*kej||*kjej||kai||kaj||lit. ke; colloq. kue < *koj < *kaj||koi||kie||kê < *kee (not giê < *kjee)||ไก่ kaj|
However, *-j- could have dropped without a trace in Hakka and Cantonese, so kaj may be from *kjaj.
Pre-Vietnamese *kj- could have simplified to *k- before front vowels. gi- before front vowels appears to be a lenition of *c and *(t)s rather than a reflex of *kj-:
giết < *CV-chết 'kill' (cf. chết 'die')
giêng < *CV-chiêng 'first of the year' < 正 MC *tɕieŋ
giống < *CV-chống 'breed' < 種 MC *tɕuoŋʔ
giếng < *CV-(t)siếng 'well' < 井 MC *tsieŋʔ
Siamese ไก่ kaj is probably a very old loan predating the development of *-j- before *-ej. The Xiamen and Chaozhou forms may also be archaisms. Literary Xiamen ke is a loan from literary MC *kjej.