In Old Chinese, there were two types of syllables, A (emphatic) and B (nonemphatic). This opposition was originally conditioned by the height of the first vowel but later became phonemic after the loss of presyllables:

*CVlow-CV, *CVlow > *type A syllable (indicated by underlining)

e.g., *Cʌ-CV, *Ca > *CV (after loss of presyllable *Cʌ-), *Ca

*CVhigh-CV, *CVhigh > *type B syllable (indicated by lack of underlining)

e.g., *Cɯ-CV, *Ci > *CV (after loss of presyllable *Cɯ-), *Ci

(Mid *ə behaved like a high vowel and could have been *ɨ. Crothers' (1978: 105) survey of vowel systems found that /a i u ɨ e o/ is the second most common vowel system after /a i u e o/ and doesn't list /a i u ə e o/ as a common vowel system.)

Vowels 'bent' differently depending on syllable type. The first halves of type A high vowels were pushed down whereas the first halves of type B nonhigh vowels were pushed up:

Early Old Chinese vowel          *i    *e    *ə    *a    *o    *u
Late Old Chinese Type A vowel    *ei   *e    *ə    *a    *o    *ou
Late Old Chinese Type B vowel    *i    *ie   *ɨə   *ɨa   *uo   *u
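The vowel 'bending' can be written out as a lookup table. This is only a minimal sketch: the names BENDING, bend, and changed are hypothetical, and the *ə entries are inferred from forms cited in this post (type A *tək, type B *dʑɨə) rather than taken from any published table.

```python
# Early Old Chinese vowel -> Late Old Chinese reflex, keyed by syllable type.
# The *ə entries are an inference from forms such as type A *tək and
# type B *dʑɨə; everything else follows the table in the text.
BENDING = {
    "A": {"i": "ei", "e": "e", "ə": "ə", "a": "a", "o": "o", "u": "ou"},
    "B": {"i": "i", "e": "ie", "ə": "ɨə", "a": "ɨa", "o": "uo", "u": "u"},
}

def bend(vowel, syllable_type):
    """Return the Late Old Chinese reflex of an Early Old Chinese vowel."""
    return BENDING[syllable_type][vowel]

def changed(syllable_type):
    """List the vowels whose quality changes in the given syllable type."""
    return [v for v, reflex in BENDING[syllable_type].items() if v != reflex]

print(changed("A"))  # the high vowels
print(changed("B"))  # the nonhigh vowels
```

Listing only the vowels that change makes the complementary pattern visible: type A alters the high vowels, type B the nonhigh ones.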

Type A vowels after medial *-r- generally lowered (*a fronted) whereas *-r- before type B vowels shifted to *-ɨ-:

Early Old Chinese *-rV sequence *ri *re *rə *ra *ro *ru
Late Old Chinese Type A vowel *ɛi *ɔu
Late Old Chinese Type B vowel *ɨi *ɨe *ɨə *ɨa *ɨo *ɨu

Some initial consonants also developed differently depending on syllable type:

Early Old Chinese consonant          *t-    *th-    *hl-   *hn-   *d-    *l-   *n-   *hr-   *hŋ-   *g(w)-   *w-
Late Old Chinese Type A consonant    *t-    *th-    *th-   *th-   *d-    *d-   *n-   *x-    *x-    *ɣ(w)-   *ɣw-
Late Old Chinese Type B consonant    *tɕ-   *tɕh-   *ɕ-    *ɕ-    *dʑ-   *j-   *ɲ-   *ʈh-   *ɕ-    *g(w)-   *w-

Type A *x- and *ɣ- were probably uvular *[χ] and *[ʁ] which later fronted to velars.

One might expect type A *k- [q] to have become *x- just as *g- became *ɣ-, but it remained a stop and is still a stop in modern Chinese languages.

Fortition is common among type A consonants whereas palatalization and lenition are common among type B consonants. One exception to this generalization is the fortition of type B *hr- to *ʈh- (Sagart 1999: 40-42). Perhaps type B *hr- was a voiceless retroflex flap *[ɽ̊] that merged with the retroflex stop *ʈh-.

None of the above tables are exhaustive. I only intend to show some of the key differences between type A and B segments.

With a few exceptions, Late Old Chinese syllables contain initials and vowels of the same type. Hence *dɨək would be unexpected since it combines type A *d- with type B *ɨə. Yet Vietnamese được 'get' seems to be from LOC *dɨək, even though 'get' was 得 LOC *tək < OC type A *tək.
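The expectation that initials and vowels agree in type can be checked mechanically. The sketch below is an illustration under assumptions: the segment sets are copied from the tables in this post (which are not exhaustive), labialized and post-*-r- variants are ignored, and the function names are hypothetical.

```python
# Type-diagnostic Late Old Chinese segments, taken from the tables in this post.
# Segments outside these sets (e.g. *k-, *p-) are left unclassified.
TYPE_A_INITIALS = {"t", "th", "d", "n", "x", "ɣ"}
TYPE_B_INITIALS = {"tɕ", "tɕh", "ɕ", "dʑ", "j", "ɲ", "ʈh", "g", "w"}
TYPE_A_VOWELS = {"ei", "e", "ə", "a", "o", "ou"}
TYPE_B_VOWELS = {"i", "ie", "ɨə", "ɨa", "uo", "u"}

def segment_type(segment, type_a, type_b):
    """Return 'A', 'B', or None for a single initial or vowel."""
    if segment in type_a:
        return "A"
    if segment in type_b:
        return "B"
    return None

def classify(initial, vowel):
    """Return (initial type, vowel type, verdict) for one LOC syllable."""
    i = segment_type(initial, TYPE_A_INITIALS, TYPE_B_INITIALS)
    v = segment_type(vowel, TYPE_A_VOWELS, TYPE_B_VOWELS)
    if i is None or v is None:
        verdict = "unclassified"
    elif i == v:
        verdict = "regular"
    else:
        verdict = "mixed"
    return i, v, verdict

# Forms cited in this post, pre-segmented by hand into initial and vowel.
for form, initial, vowel in [("tək", "t", "ə"), ("dɨək", "d", "ɨə"),
                             ("dʑɨə", "dʑ", "ɨə"), ("dʑə", "dʑ", "ə")]:
    print(form, classify(initial, vowel))
```

Segmenting by hand keeps the sketch honest: automatic segmentation of forms like *tɕh- would need its own rules.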

Vietnamese thời 'time' seems to be from Middle Chinese *ʑəj < LOC *dʑə with a type B initial and a type A vowel instead of the expected 時 MC *dʑɨ < LOC *dʑɨə < OC type B *də.

Last night, I mentioned Vietnamese thiệt 'real, true' which also seemed to combine a type B initial with a type A vowel. I proposed that it was borrowed from Southern Late Middle Chinese 2, a dialect distinct from the dialect that was the source of most Sino-Vietnamese readings. I initially thought that type A initial / type B vowel mixes like *dɨək 'get' were from a third southern dialect, but there are other explanations. Even mainstream Chinese has one important word that has a type mix:

地 MC *dih (combining a type A initial *d- with a type B vowel *i) < LOC *dieh 'earth'

Theoretically this word should be

type A MC/LOC *deih < OC *lajs

or type B MC/LOC *jieh < OC *lajs

Perhaps MC *dih is simply an irregular contraction of LOC *deih. However, Vietnamese địa 'earth' seems to be from the type A/B mixture *dieh. Maybe *dieh is from OC *Cʌ-Cɯ-lajs. The high vowel presyllable *Cɯ- conditioned a type B vowel but the low vowel presyllable *Cʌ- conditioned a type A consonant.

The two versions of 得 'get' could have had different prefixes:

OC *Cʌ-tək > *tək > mainstream LOC/MC *tək > SV đắc

OC *N-tək > southern variant LOC/MC *dɨək > Viet được

The *d- of LOC/MC *dɨək could be an archaism in a common word like the [t] of the standard Mandarin genitive particle 的 de [tə] from OC type B *tə. (The regular reflex of OC *tə became LOC *tɕɨə and is now standard Mandarin zhi with a different character 之.)

But a prefix-based explanation cannot account for the southern MC *ʑəj underlying Viet thời 'time', since it has the same type B initial as its mainstream counterpart:

? > LOC *dʑə > southern variant MC *ʑəj > SV thời

*də > LOC *dʑɨə > MC *dʑɨ > southern late MC *ʑi > SV thì

How could the vowel of the southern variant become type A if it was immediately preceded by a type B consonant?

3.28.0:56: Here are more examples of Vietnamese emphatic oddities:

Type A initial + type B vowel

đương 'in the act of' < MC *tɨaŋ

the regular đang < MC *taŋ < OC *taŋ also exists

Đường 'Tang Dynasty' < MC *dɨaŋ

the regular Đàng < MC *daŋ < OC *laŋ also exists

Type B initial + type A vowel

doan 'fate, existence' < MC *jwan

the regular duyên < MC *jwien < OC *lon also exists

I thought doan might be from LOC *jwɨan, but I would expect doàn with a huyền tone characteristic of the LOC-based loan stratum.

Not all oddities seem to be type A/B mixtures. I'll write about 'reversed type' and possible ablaut variants next.

ARE THESE REALLY ALTERNATING RHYMES IN VIETNAMESE?

Thompson (1965: 70-71) listed vowel and rhyme alternations in addition to the consonant alternations I wrote about last night. The most unusual alternation is

-ật ~ -iệt ~ -ực

which as far as I know only appears in one set of words:

thật ~ thiệt ~ thực 'true, real, really'

All three words can be written with the same nom graph:

寔 Southern Late MC *ʑɨk < MC *dʑɨək < OC *dək 'this; really'

And thật and thực can be written as

實 Southern Late MC *ʑət < MC *ʑit < OC *mlit 'fruit; full, solid, real, really'

寔 and 實 are now both shí in Mandarin and sat in Cantonese, even though they were not homophonous in Old Chinese.

What's going on here?

thực is the expected Sino-Vietnamese reading corresponding to Southern Late MC *ʑɨk via Old Vietnamese *ʑɨk.

thật is the expected Sino-Vietnamese reading corresponding to Southern Late MC *ʑət via Old Vietnamese *ʑət. Cantonese sat is also from SLMC *ʑət.

寔 and 實, once very different in Old Chinese, had phonetically converged in Middle Chinese and become interchangeable synonyms with interchangeable graphs.

Although thiệt is a nom reading for 寔, it is not a Sino-Vietnamese reading. Yet it too is a Chinese loanword. It belongs to a class of Chinese loanwords which sound as if they were amalgams of descendants of emphatic and nonemphatic Old Chinese syllables:

Nonemphatic OC *mlit > SLMC *ʑət > SV thật

Hypothetical emphatic OC *mlit > *mleit > *met > SLMC *mjet > hypothetical SV miệt

thiệt has the initial of thật and the rhyme of the hypothetical miệt.
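The 'amalgam' can be spelled out by splitting each syllable into onset and rhyme. This is a rough sketch: the ONSETS list is a small illustrative subset of Vietnamese onsets, and split_syllable and amalgam are hypothetical helper names.

```python
import unicodedata

# A small, illustrative subset of Vietnamese onsets in the modern spelling;
# digraphs come first so that "th" is matched before "t".
ONSETS = ["th", "nh", "tr", "ch", "m", "t", "n", "d"]

def split_syllable(syllable):
    """Split a Vietnamese syllable into (onset, rhyme)."""
    s = unicodedata.normalize("NFC", syllable)  # keep tone marks precomposed
    for onset in ONSETS:
        if s.startswith(onset):
            return onset, s[len(onset):]
    return "", s

def amalgam(initial_source, rhyme_source):
    """Combine the onset of one syllable with the rhyme of another."""
    onset, _ = split_syllable(initial_source)
    _, rhyme = split_syllable(rhyme_source)
    return onset + rhyme

print(amalgam("thật", "miệt"))  # thiệt
```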

These 'amalgams' may in fact simply reflect a second SLMC dialect which had undergone different sound changes or had different vowels: e.g.,

OC *mlit > SLMC2 *ʑjet > Viet thiệt


OC *mlit > *mleit > *ʑet > SLMC2 *ʑjet > Viet thiệt


OC *mlet (with -e- instead of -i-) > SLMC2 *ʑjet > Viet thiệt

Has this second SLMC dialect left any traces in modern Chinese languages?

Next: Did Vietnamese also borrow from a third SLMC dialect?

VIETNAMESE INITIAL ALTERNATIONS IN THOMPSON (1965)

Last night, I wrote about a d- ~ nh- alternation in Vietnamese. That was only one of 14 alternations listed on p. 70 of Thompson's A Vietnamese Grammar. Here are the others with my speculations about their origins. These speculations need to be tested with comparative evidence and nom graphs with phonetics implying earlier initials. For simplicity, I use voiceless symbols to represent both voiced and voiceless consonants in initial position: e.g., *k- = *k- and *g-. I assume all preinitial obstruents are voiceless, so *k-c- could be *k-c- or *k-ɟ- but not *g-c- or *g-ɟ-. *C- is an unspecified oral preinitial consonant.

1. c- ~ tr- < *k- ~ *kl-

2. ch- ~ x- < *c- ~ *ch- < *c- ~ *k-c-

3. d- ~ đ- < *t- ~ *C-t-

4. d- ~ gi- < *j- ~ *k-j-

5. d- ~ n- < *C-t- ~ *ʔ-t-

cf. Viet nước < Proto-Vietic *ʔdaak 'water' (Thomon [never heard of it] and Tum in Laos have d-)

6. gi- ~ tr- < *k-j- ~ *C-l-

One nom graph for giai ~ trai 'male human' is 男+皆 < 男 'man' + 皆 *kjaaj, implying *kjai ~ *klai; *-j- may be from *-l- via *-ʎ-

but one nom graph for giời ~ trời < Middle Vietnamese blời 'sky' is 巴+俐 < 巴 *p- + 俐 *li; could the alternation be from *gj- ~ *bl- < *k-j- ~ *p-l-? Note that *bj- would become t-, not gi-.

7. l- ~ nh- < *mɲ- < *m(-)l-

e.g., lẽ ~ nhẽ < Middle Vietnamese mlẽ ~ mnhẽ < 理 Old Chinese *mʌ-rəʔ 'reason'

but lăm ~ nhăm 'five' (after mười ~ mươi 'ten') is also năm in isolation with an n- corresponding to an initial oral stop in other Vietic languages implying Proto-Vietic *ʔd- (cf. 'water' above); did n- irregularly palatalize to nh- [ɲ] after the -i of 'ten', then denasalize to *ʎ- > l-?

8. ng- ~ ngo- < *ŋ- ~ *p-ŋ-

but why aren't there more C- ~ Cw- alternations from *C- ~ *p-C-?

9. nh- ~ r- < *N-r- or *N-l- ~ *r-

10. s- ~ th- < *ʂ- ~ *ɕ- (after *Cr- > *ʂ- but before *ɕ- > th-)

11. s- ~ tr- < *Cr- (~ *Cl-?)

12. s- ~ x- < Middle Vietnamese [ʂ] ~ [ɕ] < *Cr- ~ *ch- (!)

The ultimate sources of s- and x- are too different to alternate. Could this alternation be of recent origin and reflect the merger of s- and x- as [s] in northern Vietnamese? Or did it originate at the Middle Vietnamese stage? Note that both th- and x- were [ɕ] during different periods:

Spelling   Modern Hanoi   Middle Vietnamese   Old Vietnamese
th         [tʰ]           [tʰ]                *ɕ, *tsh, *th
x          [s]            [ɕ]                 *ch

3.25.23:50: I used to reconstruct OV *sh- (aspirated s) instead of ɕ-. *sh- would have hardened to th- just as *s- hardened to t-. It's possible that *ɕ- < *ch- could be confused with *sh-. However, sh is a very unusual sound. The only language with sh that I know of is Burmese which developed it from *ch. UPSID lists only three languages with aspirated fricatives: Burmese, its neighbor Sgaw Karen, and Mazahua in Mexico. All three languages have sh. No language in UPSID has any other aspirated fricatives: e.g., fh, vh, zh, xh, ɣh, etc.

13. th- ~ x- < *tsh- ~ *ch-

thanh and xanh < *tshɛɲ ~ *chɛɲ 'blue/green' are newer and older borrowings of 靑 Middle Chinese *tsheŋ [tsʰɛjŋ]. The x- < *ch- version was borrowed before Vietnamese incorporated the Chinese affricate *tsh- into its phonemic inventory.

E-NH-IGMA OF THE MONOCAUSAL NASAL

Readers who know Chinese, Sino-Japanese, and/or Sino-Korean readings for 一/壹 'one' may have been surprised by the Sino-Vietnamese reading nhất with initial nh- [ɲ] (the spelling pattern is from Portuguese):

Graph   Old Chinese   Middle Chinese   Mandarin   Cantonese   Sino-Japanese   Sino-Korean   Sino-Vietnamese
一/壹   *ʔit          *ʔit             yi         yat         ichi, itsu      il            nhất (also nhứt)

This word was also borrowed into Thai as เอ็ด ʔet, 'one' as a final digit in สิบเอ็ด sip ʔet 'eleven', etc. (สิบ sip 'ten' is also a Chinese loanword.) Vietnamese is the only language I know of in (South)east Asia which has a nasal initial for this Chinese word. (But I bet this word would also have an initial nasal if it had been borrowed into Muong.)

Similarly, Sino-Vietnamese has an unusual initial nh- in 因 'cause':

Graph   Old Chinese   Middle Chinese   Mandarin   Cantonese   Sino-Japanese   Sino-Korean   Sino-Vietnamese
因      *ʔin          *ʔin             yin        yan         in              in            nhân

Other MC (and OC) *ʔin graphs are also read as nhân: e.g.,



烟 'smoke' also has an SV reading yên [ʔiən] < Late MC *ʔjen < OC *ʔin < *Cʌ-ʔin.

Yet 婣 MC *ʔin < OC *ʔin (like 因, etc.) 'affinity by marriage' has the SV reading yên [ʔiən] without an initial nasal. (The unexpected rhyme -ên instead of -ân may reflect Late MC *ʔjen < OC *ʔin or be due to influence from 淵 SV uyên [ʔwiən] sharing the same phonetic.)

Moreover, 印 MC *ʔinh < OC *ʔins 'seal' has been borrowed twice into Vietnamese, first as in and then as ấn [ən] (cf. Cantonese yan). Although Vietnamese has no nh-reading for 印, the CUHK Cantonese database lists a second Cantonese reading I didn't know about, ngan. I suspect this is the unetymological ng- that alternates with initial zero in Cantonese. Such an ng- didn't exist when SV was borrowed because I know of no SV glottal stop ~ ng- alternations corresponding to Cantonese zero ~ ng-.

There are no other instances of SV nh- corresponding to MC *ʔ-. How did MC *ʔ- end up as SV nh- in only three syllables (一/壹 nhất ~ nhứt and 因 nhân and its homophones other than 婣)? One might propose that nh- in 'one' is from a Vietnamese nasal prefix plus *ʔ-, but why would such a prefix attach to no other morphemes apart from 因 and nearly all of its homophones?

3.25.3:15: There are native Vietnamese words with d- ~ nh- alternations (Thompson 1965: 70):

dơ ~ nhơ 'dirty'

dện ~ nhện 'spider' (cf. Muong jeɲ, Nguon jen)

Vietnamese d- is from *j-. Comparative evidence implies that the nasal variant of 'spider' is secondary - perhaps from *N-jện with a nasal prefix that fused with the initial *j- of the root?

Vietnamese nhấc, nhắc 'lift' have no Vietnamese variants with d- < *j-, but they do have Muong, Arem, and Ruc cognates with j-.

Perhaps 一/壹 'one' and 因 'cause', etc. were originally *jất ~ jứt and *jân which are close to Cantonese yat and yan.

Maybe no prefix was involved. In Lao, Proto-Tai *j- merged with *ɲ-, whereas *ʔj- shifted to j- (Li Fang-kuei 1977: 178, 181). Perhaps there was a chain shift in Lao:

*ʔj- > *j- > ɲ-

The nasalization of PT *j- was regular in Lao, whereas early Vietnamese *j- could have been sporadically nasalized.
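A chain shift only keeps the two series apart if it is applied as one simultaneous mapping (or with *j- > ɲ- completed first). The following sketch, with hypothetical function names, contrasts that with applying the two changes in the opposite order, which would wrongly merge *ʔj- and *j-.

```python
# The two Lao correspondences cited above, as a simultaneous mapping.
SHIFT = {"ʔj": "j", "j": "ɲ"}

def shift_simultaneous(initial):
    """Apply the chain shift as one simultaneous mapping."""
    return SHIFT.get(initial, initial)

def shift_wrong_order(initial):
    """Apply *ʔj- > *j- first and *j- > ɲ- second: both series end up as ɲ-."""
    if initial == "ʔj":
        initial = "j"
    if initial == "j":
        initial = "ɲ"
    return initial

for proto in ["ʔj", "j", "ɲ"]:
    print(proto, "->", shift_simultaneous(proto),
          "| wrong order:", shift_wrong_order(proto))
```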

The initial nasals of 一/壹 and 因, etc. could have had different origins. The *j- of 因 *jân (and *jện 'spider'?) may have nasalized to assimilate to the nasal coda:

[jən] > [jə̃n] > [ɲə̃n] > [ɲən]
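The assimilation steps can be traced for any syllable. This is a minimal sketch under assumptions: syllables are given pre-segmented as (onset, vowel, coda), vowel nasality is tracked with a flag, and the names NASAL_CODAS, nasalize_initial, and show are hypothetical.

```python
NASAL_CODAS = {"m", "n", "ɲ", "ŋ"}

def nasalize_initial(onset, vowel, coda):
    """Return the stages a syllable passes through under the proposed steps."""
    stages = [(onset, vowel, coda, False)]              # starting point
    if coda in NASAL_CODAS:
        stages.append((onset, vowel, coda, True))       # vowel nasalized by coda
        if onset == "j":
            onset = "ɲ"
            stages.append((onset, vowel, coda, True))   # *j- assimilates to ɲ-
        stages.append((onset, vowel, coda, False))      # vowel nasality lost
    return stages

def show(stage):
    """Render a stage as a string, marking a nasalized vowel with a tilde."""
    onset, vowel, coda, nasal = stage
    return onset + vowel + ("\u0303" if nasal else "") + coda

print(" > ".join(show(s) for s in nasalize_initial("j", "ə", "n")))
print(" > ".join(show(s) for s in nasalize_initial("j", "ə", "t")))
```

The second call returns a single unchanged stage, since the rule has nothing to spread from when the coda is oral.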

However, assimilation cannot account for the secondary nh- of syllables without nasal codas.

WITH A QUALITY DOG UNDER A COMPLETE SUN

Last night, I examined a few graphs for Vietnamese nhắt 'small'. Thanks to John Bentley for finding two more:

撎 = 扌 'hand' (how is this semantic?) + 壹 SV nhất 'one' (phonetic)

壹+少 with the same phonetic plus 少 'few' (semantic)

扌 'hand' might have been chosen to match the left-hand element of the first graph of

搮一

listed as the spelling of lắt nhắt 'tiny' at LingWiki. Was lắt nhắt ever written as 搮撎 with two 扌 hands? (Contrary to LingWiki, 搮一 is a native Vietnamese expression.)

扌 'hand' resembles 犭 'dog', the element used in graphs for animals such as

犭+朮 chuột 'mouse' (朮 SV thuật 'glutinous millet' is phonetic)

犭 'dog' is the semantic element of another nhắt graph

𤢽 nhắt as in chuột nhắt 'small mouse'

which has an unexpected phonetic 質 'quality' (SV chất) with initial ch- instead of nh-. This phonetic is homophonous with 窒 SV trất 'obstruct' (yet another graph for nhắt) in northern Vietnamese which has merged ch- and tr-.

One last graph for nhắt as in chuột nhắt


has the phonetic 日 SV nhật 'sun' with the seemingly irrelevant element 了 SV liễu 'complete'.

That online dictionary lists 2194C as the Unicode codepoint for 了+日. But 2194C is


with 孑 SV kiết 'lonely' on the left which makes more semantic sense than 了 'complete', since one could derive 'small' from 'lonely':

'lonely' > 'very few people' > 'very few' (cf. 少 'few' in 壹+少 nhắt 'small' above) > 'very small'
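A dictionary's codepoint claim can be checked directly. This is a small sketch using only Python's standard library; whether the graph displays correctly depends on having a font that covers CJK Extension B.

```python
import unicodedata

codepoint = 0x2194C          # the codepoint listed by the dictionary
ch = chr(codepoint)

# CJK unified ideographs have algorithmic names built from the codepoint.
print(f"U+{codepoint:04X}", unicodedata.name(ch))
print(ch)                                  # the graph itself, if the font has it
print(ch.encode("utf-8"))                  # its UTF-8 byte sequence
print(len(ch.encode("utf-16-le")) // 2)    # UTF-16 code units (a surrogate pair)
```

The name alone confirms that the codepoint is assigned, but the components of the graph (了 versus 孑) can only be verified by looking at the glyph or a code chart.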

Could 了+日 be derived from 孑+日?

ARRIVING IN A CAVE

Last Friday, I mentioned the Five Thousand Characters entry for 'shrew':

鼱鼩 𤝞 窒

(the third character is 犭+朮)

Sino-Vietnamese linh câu 'shrew' / native Vietnamese chuột nhắt 'small mouse' (!)

The Vietnamese translation is unusual, not only because it doesn't seem to match the Sino-Vietnamese, but also because the second character is an unusual choice for nhắt 'small'. The SV reading of 窒 'obstruct' (not 'small'!) is trất with tr-, not nh-. 窒 consists of 穴 'cave, hole' (semantic; what's being obstructed) and the phonetic 至 'arrive' (SV chí, also with an initial other than nh-).

I wonder if 窒 is an error for 壹 SV nhất 'one' (normally written 一) which would be a much better phonetic match for nhắt 'small'. This Nôm dictionary lists the similar-looking graph 噎 'choke' (note the 口 'mouth' on the left) and 一 'one' for nhắt 'small' (as in lắt nhắt 'tiny') but not 壹 'one' or 窒 'obstruct'. Neither dictionary at nomfoundation.org has an entry for nhắt.

THE LOST LIQUID OF LIFE

In my last post, I tried to account for the liquid and affricate initials of two sinographs

鼱 Cantonese, Mandarin jing ~ Sino-Vietnamese linh

靚 Cantonese jing ~ leng, Mandarin jing ~ liang (?)

with the phonetic

靑 MC *tsheŋ < OC *tsheŋ 'blue, green'

by reconstructing earlier initial sequences of affricates and liquids:

鼱 OC *tɯ-sɯ-l > *l > MC *lieŋ > SV linh in linh câu

but OC *tɯ-sɯ-l > *tsl > *ts > MC *tsieŋ > Ct, Md jing

靚 OC *tɯ-sɯ-leŋ(ʔ)-s > *leŋh > MC *lieŋh > Ct leng

but OC *N-tɯ-sɯ-leŋ(ʔ)-s > *N-tsleŋs > *dzleŋs > *dzeŋs > MC *dzieŋh > Ct, Md jing

(I have modified my reconstructions slightly since last night. I now suspect that 鼱 shares an *s-initial root with 鼪 OC *s(l)eŋ-s 'weasel', and I would like all 靑-graphs to have an *s-liquid sequence. See below.)

The l-readings resulted from the loss of presyllables before *-l- whereas the affricate readings resulted from the fusion of presyllables followed by medial *-l- loss.
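The two outcomes can be sketched as string operations on hyphenated reconstructions. This is a simplification under assumptions: the input forms omit the *-(ʔ)-s suffixes, intermediate stages such as *tsl- are collapsed into one step, and the function names are hypothetical.

```python
VOWELS = "ɯʌ"  # presyllable vowels used in this post's reconstructions

def presyllable_loss(form):
    """Drop every presyllable and keep only the root (the l-readings)."""
    return form.split("-")[-1]

def fusion(form):
    """Fuse presyllable consonants with the root, then drop medial *-l-."""
    *presyllables, root = form.split("-")
    nasal_prefix = bool(presyllables) and presyllables[0] == "N"
    if nasal_prefix:
        presyllables = presyllables[1:]
    # Presyllable vowels drop out: *tɯ-sɯ- > *ts-
    cluster = "".join(p.rstrip(VOWELS) for p in presyllables)
    if nasal_prefix:
        cluster = {"ts": "dz"}.get(cluster, cluster)   # *N-ts- > *dz-
    if cluster and root.startswith("l"):
        root = root[1:]                                # medial *-l- loss
    return cluster + root

print(presyllable_loss("tɯ-sɯ-leŋ"))   # l-reading
print(fusion("tɯ-sɯ-leŋ"))             # affricate reading
print(fusion("N-tɯ-sɯ-leŋ"))           # voiced affricate reading
```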

If similar processes existed in English:

Presyllable loss: because > 'cause (this has already happened)

Fusion: because > bcause > pcause > phause (cf. Korean ph- which may partly be from *pk-)

If 鼱 and 靚 once had *tsl-, should other graphs with the same phonetic also be reconstructed with medial *-l-: e.g., was their phonetic 靑 *tshl in OC?

靑 in turn has a phonetic on top which definitely had a liquid in it: 生 MC *ʂɨeŋ 'live' with a retroflex initial that could go back to either OC *sr- or *rs-. Schuessler (2007: 459-460) regarded 生 as cognate to both *r- and *s-words. External evidence points toward an *r-initial root: e.g., Mikir reŋ 'live'. Hence perhaps the Proto-Sino-Tibetan *s-prefix plus *r-root of 生 was reinterpreted as an *s-root with an *-r- infix, and *r-less forms like

性 OC *seŋs 'nature'

were created from 生 OC *sreŋ by analogy with *s- ~ *sr- pairs like

洗 OC *sərʔ 'wash'

洒 OC *srərʔ < *r-sərʔ 'sprinkle'

(The pair is from Sagart [1999: 112]; the reconstructions are mine.)

However, reconstructing *-l- in 靑-characters opens up another possibility. What if all graphs with 生/靑 had OC liquids?

Example graphs MC initial OC initial (Schuessler) OC initial (liquid hypothesis)
*ʂ- *sr- *sr-
*s- *s- *sl-
*ts- *ts- *t-sl-
*tsh- *tsh- < *k-s- *k-sl-
*dz- *dz- *N-sl-
鼱 and 靚 *l- *r- (not in Schuessler) *tV-sV-l-

All *(C)-(C)-r/l- sequences would fuse into fricatives or affricates whereas presyllables *sV- and *tV- would drop, leaving *l- as a new initial.

The liquid hypothesis allows me to state that all 生/靑-graphs had OC readings with a common core *s(V-)RIŋ, with *-R- = *-r- or *-l- and *I = *e and a few instances of *i (綪倩輤 *ksliŋs).

The liquid hypothesis also allows me to link 星 OC *sleŋ 'star' to the *l-ŋ 'bright' word family that 靚 'beautiful' may belong to.

(3.22.0:23: But I am inclined to agree with Schuessler [2007: 344, 356] and reconstruct the 'bright' word family with *r- instead of *l-. Maybe there were two roots for 'bright', *l-ŋ [e.g., 靚星] and *r-ŋ [e.g., 亮朗]).

One problem with the liquid hypothesis is that it requires me to reconstruct a lot of *sRIŋ syllables (written with 生/靑) and very few *sIŋ syllables (not written with 生/靑: e.g., 觲觪騂 *seŋ). It is unlikely that syllables with *sR- outnumbered syllables with a simpler *s-.

3.22.00:45: Another problem is that 生/靑-words have proposed external cognates with simple s- (Schuessler 2007: 431-432) rather than initials derived from earlier liquids: e.g.,

Graph Gloss OC (Schuessler) OC (liquid hypothesis) Written Tibetan WT gloss
green *k-sêŋ *k-sleŋ gsing-ma pastureland
clear *k-seŋ *k-sleŋ seng-po clean
星晴 clear sky *dzeŋ *N-sleŋ
clean *dzeŋh *N-sleŋ-s

If those words had roots with *sl- in Proto-Sino-Tibetan and OC, their WT cognates would have WT zl-, not s-.

Should those proposed cognates be rejected as coincidental lookalikes? Or should the *-l- of my reconstructions be rejected? I'd rather give up *-l- to keep the cognates.

