08.11.22.23:57: IS 乎 A CONTRACTION?
Old Chinese has three locative prepositions which are nearly homophonous in modern Mandarin but were phonetically distinct until recently*:
於 Md yu2 < formerly yu1 < Middle Chinese *ʔɨə
乎 Md hu1 < formerly hu2 < Middle Chinese *ɣo
于 Md yu2 < Middle Chinese *wuo
I will discuss 于 in my next post.
I think the first two are derived from an Old Chinese *Nɯ-qɑ. (Beckwith** would be tempted to compare *Nɯ-qɑ with Japanese naka 'inside'.) 於 lost its presyllable after undergoing deemphatic harmonization, whereas 乎 absorbed its presyllable into its initial:
於 Middle Chinese *ʔɨə < *ʔɨa < *Nɯ-ʔɨa < *Nɯ-qɑ
(For now I'm assuming that original uvulars backed to glottals intervocalically.)
乎 Middle Chinese *ɣo < *ɣɑ < *ɢɑ < *ɴɢɑ < *ɴqɑ < *Nɯ-qɑ
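The two chains above can be replayed mechanically as ordered string rewrites. This is only a minimal sketch: the rule lists are read directly off the derivations for 於 and 乎, the helper name `replay` is hypothetical, and plain substring replacement stands in for real conditioned sound changes (deemphatic harmonization and the backing of *q to *ʔ are collapsed into one step, as in the chain for 於).

```python
# Ordered rewrites read off the two derivations above (illustrative only).
RULES_YU = [            # 於
    ("qɑ", "ʔɨa"),      # deemphatic harmonization plus *q > *ʔ between vowels
    ("Nɯ-", ""),        # presyllable lost
    ("ɨa", "ɨə"),       # *ʔɨa > Middle Chinese *ʔɨə
]
RULES_HU = [            # 乎
    ("Nɯ-q", "ɴq"),     # presyllable absorbed into the initial
    ("ɴq", "ɴɢ"),       # voicing after the nasal
    ("ɴɢ", "ɢ"),        # nasal lost
    ("ɢ", "ɣ"),         # stop becomes fricative
    ("ɑ", "o"),         # *ɣɑ > Middle Chinese *ɣo
]

def replay(form, rules):
    """Apply each rewrite once, in order, and return every stage."""
    stages = [form]
    for old, new in rules:
        form = form.replace(old, new)
        stages.append(form)
    return stages

print(" > ".join("*" + s for s in replay("Nɯ-qɑ", RULES_YU)))
print(" > ".join("*" + s for s in replay("Nɯ-qɑ", RULES_HU)))
```

Both runs start from the same *Nɯ-qɑ, which is the point of the proposal: the two prepositions differ only in what happened to the presyllable.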
The graph 於 is a variant of 烏 OC *Cʌ-qɑ 'crow'. The latter is also written 鴉 OC *Cʌ-qɑ with 牙 OC *Nɢjɑ (?*Nqjɑ) as phonetic***. 烏 and 鴉 could be reduplications:
*qʌ-qɑ < *qɑ-qɑ 'the bird that makes the caw-caw noise'
乎 is phonetic in 呼 MC *xo(h) and 嘑 MC *xo which may be from OC *Cʌ-qhɑ(-s) or *sq(h)ɑ(-s) (if OC *sq(h)- did not become MC *kh-).
虖, a variant of 乎, might have a redundant phonetic 虍 MC *xo < *Cʌ-qhla 'tiger'. *qhl- could be a contraction of an earlier *qV-l-: cf. other OC words for 'tiger':
於兔, 於菟, 於(虎+兔), 於檡 OC *qɑ-lɑ
苟竇 OC *qoʔ-los
*I do not understand how 於 came to be homophonous with 于. 於 is a high-frequency word, so it is unlikely that people would change its tone. Moreover, other 於-graphs (淤瘀箊) are still pronounced with tone 1. I would expect obscure graphs to be misread like more common graphs, but in this case a common graph has been misread without any obvious motivation (beyond confusion with 于).
The first tone of 乎 Md hu1 may be by analogy with 呼 Md hu1.
**Christopher Beckwith believes that Chinese and Japanese are genetically related, whereas I am not sure if Chinese has any relatives outside of Sino-Tibetan.
08.11.22.14:20: SLANTING, SYANTING, OR SGANTING?
I have long been bothered by the graphs
牙 Middle Chinese *ŋæ 'tooth'
邪 Middle Chinese *ziæ 'crooked, evil'; also jæ 'question particle' (with alternate graph 耶)
I want to think that the former is phonetic in the latter. But how can I reconcile their completely different initials?
External evidence may point to *ŋ- at some earlier stage of 'tooth'. Many Southeast Asian languages have *ŋa-like words for 'tusk, ivory' (see Schuessler 2007: 550 for a list). It's not clear whether Old Chinese borrowed the word from the ancestor of one of these languages.
Unfortunately, there is no early external evidence for 邪. (邪 was used to write the first syllable of 'Yamato' in the 3rd century AD, long after the graph was devised.)
My old *-r-/*-l- solution
A few years ago, I might have reconstructed the Old Chinese readings of the two graphs as
牙 OC *ŋra
邪 OC *sŋla (> MC *ziæ), OC *la (> 耶 MC jæ)
with a common phonetic denominator *ŋLa (*L = a liquid).
I would have cited 斜 'slanting' (presumably cognate to 邪 'crooked') with the phonetic 余 OC *la as evidence for *-l- in 邪 OC *sŋla. However, 斜 is a Late Old Chinese graph devised at a time when 余 OC *la had shifted to LOC *jɨa. Both 斜 and 邪 were LOC *ziæ, which is close to 余 LOC *jɨa. But that does not necessarily mean that 斜 and 邪 were once phonetically similar to 余 OC *la.
Moreover, there is almost no external evidence for a medial *-r- in 牙. Bahnar ŋəla 'tusk, ivory' resembles OC *ŋra, but it is probably a compound of two synonyms: one borrowed and one native. And I don't know if the -h- in Proto-Hmong *ŋha or Lushai ŋho 'tusk' is comparable to OC *-r-.
I have been inclined to reconstruct 牙 as OC *r-ŋa, with a prefix absent from the non-Chinese forms with simple *ŋ-. OC could have added a prefix to a borrowing from another language, or other languages could have borrowed the bare root from OC. 牙 OC *r-ŋa resembles a Pulleyblank-style reconstruction of 邪 as *s-ŋʲa with a palatalized velar nasal. But I am not convinced OC had such a phoneme. See Sagart (1999: 34-36) for a critique of Pulleyblank's OC *ŋʲ-. I also do not know if Pulleyblank would allow alternations between plain and palatalized velar nasals in a xiesheng series.
On Thursday night, I came up with two alternate solutions to the 牙/邪 problem. Neither is satisfying.
牙 OC *ŋja
邪 OC *s(ŋ)ja (> MC *ziæ), OC *ja (> 耶 MC jæ)
For some time, I've thought that some pure type B MC *j-series (e.g., 邪/耶) should be reconstructed with OC *j- instead of Sagart's (1999) *l- or his more recent (2007) *ɢ-. I've also reconstructed OC *-ja (cf. Starostin's OC *-ia) as the source of MC *-iæ. OC *-a without a medial yod became MC *-ɨa. (Sagart has no *-j- medial, so his OC *-a unpredictably split into MC *-iæ and *-ɨa.)
I assume that OC initial *j- was always nonemphatic* but suspect that there was a medial *-j- in emphatic syllables: e.g., 八 OC *pret < ?*pʌ-rjat 'eight' (cf. Written Tibetan brgyad). The MC reflexes of medial emphatic *-j- and *-r- would be indistinguishable: e.g., MC *ŋæ could be from either *ŋja or *ŋra. Perhaps both medials had merged into *-ɣ- in Middle Old Chinese (comparable to the *-h- in some of the related foreign words?). Central vowels following the velar medial *-ɣ- fronted before *-ɣ- was lost:
*-ɣa > *-ɣæ > *-æ
*-ɣə > *-ɣɛ > *-ɛ
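The merger and fronting just described can be sketched as three ordered steps. This is a hedged illustration, not a full reconstruction: the function name `derive` and the split of a syllable into initial, medial, vowel and coda are assumptions made for the example, and the input is assumed to be an emphatic syllable (the only environment in which the merger is proposed).

```python
# Central vowels front after the velar medial; values taken from the two rules above.
FRONTING = {"a": "æ", "ə": "ɛ"}

def derive(initial, medial, vowel, coda=""):
    """Return the stages of an emphatic syllable with medial *-j- or *-r-."""
    stages = [initial + medial + vowel + coda]
    if medial in ("j", "r"):                 # step 1: both medials merge into *-ɣ-
        medial = "ɣ"
        stages.append(initial + medial + vowel + coda)
    if medial == "ɣ" and vowel in FRONTING:  # step 2: central vowel fronts after *-ɣ-
        vowel = FRONTING[vowel]
        stages.append(initial + medial + vowel + coda)
    if medial == "ɣ":                        # step 3: *-ɣ- is lost
        medial = ""
        stages.append(initial + medial + vowel + coda)
    return stages

for medial in ("j", "r"):
    print(" > ".join("*" + s for s in derive("ŋ", medial, "a")))
```

Both *ŋja and *ŋra end up as *ŋæ, which is why the Middle Chinese reading alone cannot decide between the two medials.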
One huge problem is the absence of external evidence for medial *-j- in 牙. The only foreign form with a palatal component in Schuessler (2007: 550) is (Written?) Mon ŋek (but the SEAlang Mon dictionary has initial n- and final -h: Old Mon gneh, neh, Modern Mon nèh).
Also, if Old Chinese roots lacked initial clusters (Sagart 1999: 20), then this solution requires a velar nasal prefix in 牙 and perhaps 邪:
牙 OC *ŋ-ja
邪 OC *s-(ŋ-)ja (> MC *ziæ), OC *ja (> 耶 MC jæ)
However, there is no evidence for such a prefix. (It is possible that the generic nasal prefix *N- [Sagart 1999: 74-78] could have been *ŋ- rather than *n- or *m-, but its exact point of articulation is unknown.)
The second alternate solution incorporates Sagart's (2007) hypothesis of a voiced uvular stop as a source for MC *j-:
牙 OC *Nɢja
邪 OC *sɢja (> MC *ziæ), OC *ɢja (> 耶 MC jæ)
Non-Chinese languages could have borrowed the cluster *Nɢ- as *ŋ-.
First, there are several MC *ŋæ(ʔ/h) words written with 牙 as phonetic: 庌芽訝迓雅. 訝 and 迓 'meet' belong to a large MC *ŋ- word family whose members were written with several different phonetics (see Schuessler 2007: 551 for a list). Is it likely that most 牙-series words and the members of the 'meet' family all had an initial cluster *Nɢ- in OC?
Perhaps *N- should be interpreted as a cover symbol for more than one nasal prefix: one word could have had *n-ɢ-, another *m-ɢ-, etc. But why are there no members of the 'meet' family with reflexes of the bare root initial, or the root initial plus a nonnasal prefix?
Maybe the root 'meet' had emphatic or nonemphatic *ŋ- (conditioned by presyllables) and the 牙-series mixed *Nɢ- ?[ɴɢ] with *ŋ- ?[ɴ]. Could other nasal-initial phonetic series contain OC *nasal-stop clusters?
The only 牙-graph without MC *ŋ- is 鴉 MC *ʔæ which could be from Sagart's *qqa (my *Qʌ-qɑ) with irregular retention of a low vowel (instead of the expected backing and raising to MC *-o; see Schuessler 2007: 83, 517).
Second, my reconstruction does not allow uvulars to become MC type B initials unless they are 'deemphasized' by nonemphatic presyllables: e.g.,
邪 OC *sɯ-ɢjɑ > *sɯ-ʁjɑ > *sɯ-ɣja > *sɣja > *zja > MC *ziæ
邪/耶 MC *jæ is a question particle. Grammatical words tend to be short and simple, so I would rather not reconstruct them with presyllables:
邪/耶 OC *Cɯ-ɢjɑ > *Cɯ-ʁjɑ > *Cɯ-ɣja > *ɣja > *ja > MC *jæ
Is there any language in which *ɢ- regularly became j-?
Perhaps there is a way out: 邪/耶 MC *jæ and another question particle 與/歟 MC *jɨə are probably contractions of 也乎 OC *ljajʔ-ɢɑ (Pulleyblank 1995: 9). Maybe their presyllables derived from nonemphatic 也 *ljajʔ:
也乎 OC *ljajʔ-ɢɑ > *li-ɢɑ > *li-ɢjɑ > *lɯ-ɢjɑ > *lɯ-ʁjɑ > *lɯ-ɣja > *ɣja > *ja > 邪/耶 MC *jæ
with palatality of presyllable spreading into following syllable
也乎 OC *ljajʔ-ɢɑ > *li-ɢɑ > *lɯ-ɢɑ > *lɯ-ʁɑ > *lɯ-ɣɨa > *ɣɨa > *jɨa > 與/歟 MC *jɨə
without palatality of presyllable spreading into following syllable
Next: Is 乎 OC *ɢɑ also a contraction?
*What would happen if *j- were preceded by an emphatic presyllable? Would the glide have backed and strengthened to a velar fricative?
*Cʌ-j- > *Cʌ-j- > *Cʌ-ɣ- > *ɣ-
But I can't think of any xiesheng series with MC *j- ~ *ɣ- alternations.
Some Old Chinese reconstructions I have seen (e.g., Sagart's) don't have initial *j-. j (position unspecified) occurs in 83.81% of the languages in the UCLA Phonological Segment Inventory Database. If Old Chinese had an aspirated labiouvular *qhw- (in only 4 UPSID languages), I would expect it to have a less exotic *j-. Although I did not reconstruct an OC *j- for years due to the influence of Starostin and Sagart, I have come to suspect that I was wrong, and I was pleased to see that Schuessler (2007) reconstructed OC *j- (though he and I may not necessarily reconstruct it in the same words).
For at least a year, I've reconstructed OC *j- in phonetic series with pure type B Middle Chinese readings: e.g., 羊 'sheep' (GSR732):
|Sinograph||MC initial||MC type||My OC initial||Sagart's (2007) initial||Sagart's (1999) initial|
If GSR732 were originally an emphatic (type A) series, I would expect at least one MC type A reading, but there are none.
Moreover, my reconstruction forbids uvulars from directly becoming MC type B initials. If I reconstructed GSR732 as a uvular series, I would need nonemphatic presyllables for all its OC readings: e.g., 羊 *Cɯ-ɢɑŋ.
Finally, the probable external cognates that Sagart listed have (*)j-:
羊 OC *jaŋ 'sheep' : Proto-Tibeto-Burman *jaŋ
洋 OC *jaŋ 'great expanse of water' : Written Tibetan yangs-pa 'wide, broad, large'
祥 OC *jaŋ 'auspicious' : Written Tibetan གཡང g-yang (not gyang!) 'happiness, blessing, prosperity'
I suspect that WT g-yang is yang plus a presyllable *[gə]. Although g-y- looks like the *ɢ- Sagart reconstructed in 祥, it is the -y- that should be compared with the OC root initial (my *j- or Sagart's *ɢ- [and his earlier *l-]).
Could a Proto-Sino-Tibetan cluster *ɢj- have been the ultimate source of WT g-y- and MC *j-? Maybe uvulars shifted to velars before medial *-j- in OC. But *j requires fewer changes than *ɢj-.
In this paper, Laurent Sagart derived Middle Chinese velar-glottal alternation series from Old Chinese uvular series. Sagart proposed two types of uvulars, type A (my *emphatic) and type B (my *nonemphatic). He used doubled letters to represent type A consonants:
Sagart *qq- (type A) : my *q-
Sagart *q- (type B) : (no equivalent)
I don't reconstruct type B uvulars in Old Chinese since I know of no language which has both emphatic and nonemphatic uvulars. According to Islam Youssef (2006), Cairene Arabic consonants all have emphatic and nonemphatic versions with the exception of q which can only behave like an emphatic. I assume that Old Chinese had a consonant inventory consisting of emphatic/nonemphatic pairs plus inherently emphatic uvulars and labiouvulars:
(This set is not unique to any of my three solutions.)
Sagart reconstructed six pairs of (labio)uvulars (rewritten here in a notation closer to mine, apart from graphic gemination):
He generally reconstructed presyllables where I don't (and vice versa). (His paper does not list Middle Chinese reflexes of all possibilities, so I have had to guess some of them. Guesses are indicated with question marks.)
|Sagart||My 'third strike' reconstruction||Middle Chinese||Middle Chinese type|
Not all sources of Middle Chinese initials are listed: e.g., MC *k- can also come from OC *k-.
I reconstruct more presyllables than Sagart does because I don't allow uvulars to become Middle Chinese type B initials unless they were 'deemphasized' by a nonemphatic presyllable:
*Cɯ-Q- > *Cɯ-K- > *K-
Unlike Sagart, I think MC *ʔ-/*x- can also originate from velars: e.g.,
影 MC *ʔɨɛŋʔ
< my OC *ʔɯ-kraŋʔ
cf. Sagart's OC *qraŋ
I reconstruct the 京 series (GSR755) as velar (and hence nonemphatic) because all MC readings in this series are type B:
|Sinograph||MC initial||MC type||My 'third strike' OC (presyllable-)initial||Sagart's OC (presyllable-)initial|
|憬||*kw-||*p-kr- or *kwr-||*Cə-qwr-|
If GSR755 were originally an emphatic series, I would expect at least some type A MC readings. Moreover, Sagart's reconstruction requires more presyllables than mine. His reconstruction only has one reading with a bare, presyllableless initial (影), whereas mine has several (京景勍鯨黥).
I also think some of Sagart's voiced uvulars could be reconstructed as OC *j-. I'll discuss this in detail next time.
Even the title of my last post was wrong - it needed a comma between "Yes" and "I"!
Can I come up with a better account of 可-words? Three strikes and I'm out. This time I dispense with a second stage of harmonization:
|Graph||Stage 1||Stage 2||Stage 3||Stage 4||Stage 5||Stage 6||Stage 7|
|阿||*Qʌ-qɑj||*Qqɑj||*ʔɑj||*ʔɑ||e [ɤ], a|
Stage 1: Early or pre-Old Chinese: *Q- could be *q- (which was always emphatic) or *ʔ-. *C- and *C- are consonants other than *q- or ʔ-. There is no way to know whether 哥 had a presyllable or not; if it did, that presyllable must have been emphatic with a low unstressed vowel.
Stage 2: Emphasis harmonization.
Stage 3: Height harmonization.
Stage 4: Presyllable vowel loss. It's also possible that presyllables were simply dropped from 哥 and 奇.
Stage 5: Late Old Chinese: The clusters *Qq- and *ʔk- may have merged into *ʔq- which simplified to *ʔ-.
Stage 6: Middle Chinese.
Stage 7: Mandarin.
This scenario makes the following predictions for Late Old/Middle Chinese *k- ~ *ʔ- phonetic series:
1. There should be more type A readings since these series originally had emphatic core syllables.
2. There should be few *ʔ-readings in such series because most members of those series would have no presyllables or presyllables with nonglottal, nonuvular initials.
3. There should be more type A *ʔ-readings in such series because type A *ʔ- has more sources (< *Qʌ-q- and *ʔʌ-q-) than type B *ʔ- (< *ʔɯ-q-).
Here are the frequencies of Late Old/Middle Chinese initials for 可 series readings listed in Karlgren (1957: 19-20):
(*g- can only be a type B initial and *ɣ- can only be a type A initial. Type B *ŋ- may be from *Nɯ-q-. Type A *x- may be from *Qʌ-qh-.)
The statistics violate my predictions:
1. Type B readings outnumber type A readings by a ratio of 2 to 1. Maybe this was an original nonemphatic series with an archetypal reading *kaj (instead of emphatic *qɑj).
2. One-fourth (11/44) of the readings have initial glottal stop.
3. Type B *ʔ-readings outnumber type A *ʔ-readings by a ratio of 2.66 to 1.
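The arithmetic behind these three points can be checked in a few lines. This is a rough sketch under stated assumptions: the totals 44 and 11 come from the text, but the split of the eleven glottal-stop readings into 8 type B and 3 type A is inferred from the stated 2.66 : 1 ratio rather than taken from the frequency table, and the variable names are hypothetical.

```python
# Counts for the 可 series; 44 and 11 are stated in the text.
total_readings = 44
glottal_readings = 11

# ASSUMPTION: 8 + 3 = 11 is the only whole-number split matching "2.66 to 1".
type_b_glottal, type_a_glottal = 8, 3

glottal_share = glottal_readings / total_readings
b_to_a_glottal = type_b_glottal / type_a_glottal

# Prediction 3 expects type A glottal-stop readings to outnumber type B ones.
prediction_3_holds = type_a_glottal > type_b_glottal

print(f"glottal-stop share: {glottal_share:.2%}")
print(f"type B : type A among glottal-stop readings = {b_to_a_glottal:.3f} : 1")
print(f"prediction 3 holds: {prediction_3_holds}")
```

Under that assumed split the ratio is 8/3, which truncates to the 2.66 quoted above.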
Am I out yet? Maybe not. The 后-series with *x- (the fricative counterpart of *ʔ-) behaves largely as I would predict. In fact, it has no type B readings at all; its archetypal reading may have been emphatic *qo. I am surprised it has no glottal stop-initial readings:
|LOC/MC||*k-||*kh-||*ɣ- (< ?*N(ʌ)-q-)||*x- (< ?*Qʌ-qh-)||Total|
But looking at only two series proves nothing. Do *k- ~ *ʔ/x- phonetic series as a whole behave as I would expect?
I'm not satisfied with my attempt to outline the history of words written with 可, so I'm going to try again. This time I am only going to discuss the readings of four graphs:
|Graph||Stage 1||Stage 2||Stage 3||Stage 4||Stage 5||Stage 6||Stage 7||Stage 8||Stage 9||Stage 10||Stage 11|
||*ʔɑj||*ʔɑ||e [ɤ], a|
Stage 1: Early or pre-Old Chinese: All four readings consisted of the syllable *qɑj with or without a presyllable. The underlining indicates 'emphasis' (pharyngealization): */qɑj/ was phonetically *[qˁɑˁjˁ].
Stage 2: The nonemphasis of the presyllable *Cɯ- spread into the following syllable so that the entire word would be nonemphatic. Emphatic uvular *q- became its nonemphatic velar counterpart *k-. Emphatic back *ɑ became its nonemphatic central counterpart *a.
Stage 3: *a partly bent upward to assimilate with the high vowel of the preceding presyllable:
|Stage 1||Stage 2: emphatic harmony||Stage 3: partial height harmony|
|*ɯ ... ɑ||*ɯ ... a||*ɯ ... ɨa|
|nonemphatic followed by emphatic||both nonemphatic||both nonemphatic|
|high followed by low||high followed by low||high followed by high-low diphthong|
Stage 4: The unstressed presyllable *Cɯ- was lost or was fused with *k-:
*Cɯ-k- > *Ck- > *kk- > *k-
Stage 5: Presyllabic prefixes were added to 阿 and 椅.
Stage 6: Medial *-q- backed to *-ʔ- intervocalically.
Stages 7-8: Harmonization (cf. stages 2-3).
Stage 9: Late Old Chinese: Loss of prefixes and phonemic emphasis.
Stage 10: Middle Chinese.
Stage 11: Mandarin: Old emphatic syllables have a velar or zero initial whereas nonemphatic syllables have palatal initials.
I don't like this scenario either because it requires prefixation (stage 5) that happens to follow the loss of prefixes (stage 4) and two waves of harmonization (stages 2-3 and 7-8). Although a lot could have happened in the fifteen or so centuries between Early and Late Old Chinese, such repetition is implausible. (However, if harmonization were a continually active process during most of the Old Chinese period, there would be no need to posit two waves of it.) I'll post a third scenario tomorrow.
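Stages 2-4 of this scenario can be sketched as three small functions applied in order. This is a minimal illustration under stated assumptions: a word is modelled as a (presyllable, core) pair, a presyllable ending in ɯ is taken to be nonemphatic, emphasis is represented only by segment quality (uvular q and back ɑ versus velar k and central a) because the underlining is not available in plain text, and the function names are hypothetical.

```python
# Emphatic segments and their nonemphatic counterparts (stage 2).
DEEMPHASIS = {"q": "k", "ɑ": "a"}

def is_nonemphatic(pre):
    return pre.endswith("ɯ")

def stage2(pre, core):
    """Emphatic harmony: a nonemphatic presyllable deemphasizes the core."""
    if is_nonemphatic(pre):
        core = "".join(DEEMPHASIS.get(ch, ch) for ch in core)
    return pre, core

def stage3(pre, core):
    """Partial height harmony: *a becomes *ɨa after a high presyllable vowel."""
    if is_nonemphatic(pre):
        core = core.replace("a", "ɨa", 1)
    return pre, core

def stage4(pre, core):
    """The unstressed presyllable *Cɯ- is lost."""
    return ("", core) if is_nonemphatic(pre) else (pre, core)

def derive(pre, core):
    """Return the written form at stages 1 through 4."""
    forms = []
    word = (pre, core)
    for step in (None, stage2, stage3, stage4):
        if step is not None:
            word = step(*word)
        forms.append("-".join(part for part in word if part))
    return forms

print(" > ".join("*" + f for f in derive("Cɯ", "qɑj")))
print(" > ".join("*" + f for f in derive("", "qɑj")))
```

A bare emphatic core passes through all three stages unchanged, while the same core after *Cɯ- comes out with a velar initial and a high-low diphthong.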
I'll try to give a quick example of what my article is about.
Chinese characters containing the phonetic element 可 'can, able, may' do not sound alike in Mandarin: e.g.,
阿 e, a
奇 ji, qi
But these syllables presumably were phonetically similar when their graphs were first devised. Why did they diverge so greatly from a common prototype?
Old Chinese had two types of syllables, 'emphatic' ('type A', indicated with underlining) and 'nonemphatic' ('type B', not underlined). In my paper, I suggested that Old Chinese had emphatic harmony: mixed-emphasis sequences (BA, AB) were converted to emphatic (AA) or nonemphatic (BB) sequences.
The words written with 可 originally consisted of type A syllables preceded by unstressed type A or B presyllables. Unstressed type B presyllables (*Cɯ-, *Nɯ-) shifted following type A syllables to type B:
|Character||Stage 1||Stage 2||Stage 3||Stage 4||Stage 5||Stage 6||Stage 7|
Stage 1: Early or pre-Old Chinese: All six contained Type A core syllables with uvular initials (q-, qh-, ɢ-) associated with Type A.
Perhaps all seven once contained *qaj if 可 *qhajʔ < *sqajʔ and 何 *ɢaj < *Nqaj.
Stage 2: Core syllables developed high vowels (ɨ) if preceded by type B presyllables (*Cɯ-, *Nɯ-).
Stage 3: Some presyllables lost their unstressed vowels.
Stage 4: Former presyllables, now reduced to consonants, fused with following consonants. The type of the first consonant determined the type of the cluster: e.g., type B *C- + type A *q- = type B *kk-. The type A vowel *ɑ shifted to type B *a after type B clusters and *ɨ which became an exclusively type B vowel.
At some point around stage 4, medial *-q- was reduced to a glottal stop between vowels.
(Another possibility: *ʔ- originated from *qq- < *ʔ-q- < *ʔV-q-.)
Stage 5: Late Old Chinese: The type A/B distinction was no longer phonemic. (But it may have still been phonetic: e.g., the LOC phoneme */k/ could have had an allophone *[q] before the former type A vowel *ɑ.) Clusters fused into single initials. Presyllables were lost, leaving glottal stop initials as traces of their former existence.
Stage 6: Middle Chinese: LOC *-j was lost after *-ɑ-. LOC *-aj fused into MC *-e.
Stage 7: Mandarin: MC *-ɑ became Md -e except in Md 阿 a which has survived basically intact from Middle Chinese. MC *-ɨe simplified to Md -i, and MC velars (*k-, *g-) palatalized to Md j- [tɕ] and q- [tɕh] before -i.
I don't really want to promote my own article, so I won't even name it.
I do want to publicly thank my longtime linguistic collaborator John Bentley, my blogging buddy Sarah, and her friend Rebecca Tillman for helping me with the statistical section of my article. I did thank them in print, but some of them may never see my paper, so I thought it would be a good idea to express my gratitude in a more accessible location.
I also want to thank fellow blogger David Boxenhorn for the last five years of discussions on topics including 'emphasis' in Chinese, the seemingly endless Tangut mystery, and much more. If I could rewrite my paper today, it might be a lot more convincing because of the additional Arabic parallels that David and I have found since 2005.