Many tones in Chinese-type languages arose from lost initial consonant distinctions: e.g., in Thai, the old initials hm- (as in the non-Thai name Hmong) and m- merged into m- (which is easier to pronounce than hm-), but words which once had different initials now have different tones due to what J. Marvin Brown (1975) called "the great tone split":

*hmaa (old tone A) = maa (rising tone) 'dog'

*maa (old tone A) = maa (mid tone) 'to come'

In Thai,

old voiceless initials (e.g., *hm-) tend to correlate with modern low tones

old voiced initials (e.g., *m-) tend to correlate with modern high tones

Using J. Marvin Brown's terminology, I call this the V-H (voiced-high) pattern. The Thai rising tone is initially lower than the mid tone in this Wikipedia chart.

But Cantonese has the opposite pattern; it's V-L (voiced-low):

old voiceless initials correlate with modern high tones

old voiced initials correlate with modern low tones

Simple models of tonogenesis (the process through which tones arise) I've seen assume that V-L is the norm. Is it? It would be interesting to see statistics on V-L vs. V-H in East Asian tonal languages.

Tonight I took a close look at the tones of 南昌 Nanchang Gan for the first time and thought that it was V-H since yang 42, 45, and 5 are all higher than yin 24, 21, and 2:

Old tone category                        ping  shang  qu  ru
"Upper" or yin (< *voiceless initial)    24    213    21  21
"Lower" or yang (< *voiced initial)      42    -      55  5

Modern tones are described using a five-level scale: 1 is lowest and 5 highest. Combinations of numbers represent contour tones: e.g., the yangping tone 42 starts at 4 (medium high) and falls to 2 (medium low).

"Upper"/yin and "lower"/yang tones are derived from earlier *voiceless and *voiced consonants. However, looking at the Chinese Wikipedia, it turns out that "upper" and "lower" were reversed in the English Wikipedia, so Nanchang is V-L after all - yang 24, 21, and 2 are all lower than yin 42, 45, and 5:

Old tone category                        ping  shang  qu  ru
"Upper" or yin (< *voiceless initial)    42    213    45  5
"Lower" or yang (< *voiced initial)      24    -      21  2

(The yin/yang reversal aside, some numbers are slightly different from those in the English Wikipedia: 45 instead of 55 and 2 instead of 21.)
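
The yin/yang comparison can be made mechanical. The following is only a minimal sketch: it assumes that "higher" means a higher starting pitch (the first digit of a tone value), which is just one possible measure, and the function names are hypothetical. The shang column is left out because Nanchang has a single shang tone.

```python
def onset(tone: str) -> int:
    """Starting pitch of a tone on the five-level scale, e.g. '42' -> 4."""
    return int(tone[0])


def pattern(yin: str, yang: str) -> str:
    """V-L if the yang (< *voiced) tone starts lower than the yin tone, V-H if higher."""
    if onset(yang) < onset(yin):
        return "V-L"
    if onset(yang) > onset(yin):
        return "V-H"
    return "tie"


def classify(table: dict) -> dict:
    """Classify each old tone category from its (yin, yang) pair."""
    return {cat: pattern(yin, yang) for cat, (yin, yang) in table.items()}


# (yin, yang) pairs as given in the two Wikipedia versions quoted in this post.
english_wikipedia = {"ping": ("24", "42"), "qu": ("21", "55"), "ru": ("21", "5")}
chinese_wikipedia = {"ping": ("42", "24"), "qu": ("45", "21"), "ru": ("5", "2")}

print(classify(english_wikipedia))
print(classify(chinese_wikipedia))
```

By the onset measure, every category in the reversed English table comes out V-H and every category in the Chinese table comes out V-L. The choice of measure matters: an average-pitch measure would call ping a tie, since 42 and 24 both average 3.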

But some other Gan dialects are partly V-H:

- In 鷹潭 Yingtan and 懷寧 Huaining, yinping is lower than yangping

- In 撫州 Fuzhou, yinru is lower than yangru

- Nanchang, 宜春 Yichun, and 洞口 Dongkou have low (yin?)shang*

- However, no dialect has V-H in the qu category

I wonder if Laurent Sagart figured out the history of Gan tones in his book Les dialectes gan.

4.22.00:52: Now I'm confused. 汉语方音字汇 (1962) lists only six tones for Nanchang Gan with only one ru category instead of two (yinru and yangru). One might conclude that Wikipedia describes a recent innovation, but the great tone split did not happen within the last fifty years! Do different Nanchang speakers have different numbers of tones?

It's been a long time since I looked at Karlgren's early 20th century data. Did he describe the tones of Nanchang?

*4.22.1:11: A quick check of a few Nanchang syllables implies that "shang" might have once been yinshang and that yangshang merged with yangqu after voiced obstruents: e.g.,

下 'bottom': Middle Chinese *ɣæ + yangshang tone = Nanchang ha + yangqu

cf. 下 'go down': Middle Chinese *ɣæ + yangqu tone = Nanchang ha + yangqu

夏 'summer': Middle Chinese *ɣæ + yangshang tone = Nanchang ha + yangqu

Note that yangshang became shang (not yangqu) after voiced sonorants: e.g.,

馬 'horse': Middle Chinese *mæ + yangshang tone = Nanchang ma + shang

cf. 把 'to hold': Middle Chinese *pæ + yinshang tone = Nanchang pa + shang

These shifts are not exotic; they also occurred in standard Mandarin though the contours of the tones are different and Mandarin lost a yin/yang distinction in qu tones:

下 'bottom': Middle Chinese *ɣæ + yangshang tone = Md xia [ɕja] + qu

cf. 下 'go down': Middle Chinese *ɣæ + yangqu tone = Md xia [ɕja] + qu

夏 'summer': Middle Chinese *ɣæ + yangshang tone = Md xia [ɕja] + qu

馬 'horse': Middle Chinese *mæ + yangshang tone = Md ma + shang

cf. 把 'to hold': Middle Chinese *pæ + yinshang tone = Md ba [pa] + shang
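
The shifts in the examples above amount to a small rule table. Here is a rough sketch that covers only the combinations illustrated in this post; the three initial classes and the function names are my own labels, and the Mandarin function simply reuses the Nanchang rule and drops the yin/yang label on qu.

```python
def nanchang_reflex(initial: str, mc_tone: str) -> str:
    """initial: 'voiceless', 'voiced_obstruent', or 'sonorant'.
    mc_tone: Middle Chinese 'shang' or 'qu'."""
    if mc_tone == "shang":
        # yangshang merged with yangqu only after voiced obstruents
        return "yangqu" if initial == "voiced_obstruent" else "shang"
    if mc_tone == "qu" and initial == "voiced_obstruent":
        return "yangqu"
    raise ValueError("combination not covered by the examples in this post")


def mandarin_reflex(initial: str, mc_tone: str) -> str:
    """Same shifts, but Mandarin has no yin/yang distinction in qu."""
    return "qu" if nanchang_reflex(initial, mc_tone) == "yangqu" else "shang"


examples = [
    ("下 'bottom'", "voiced_obstruent", "shang"),
    ("下 'go down'", "voiced_obstruent", "qu"),
    ("馬 'horse'", "sonorant", "shang"),
    ("把 'to hold'", "voiceless", "shang"),
]

for word, initial, tone in examples:
    print(word, nanchang_reflex(initial, tone), mandarin_reflex(initial, tone))
```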

4.22.21:33: The upper and lower tone categories have been fixed in the English Wikipedia article on Gan. I am not at all upset about the initial reversal; it inspired this post, and I've made similar errors myself on this blog.

WHAT'S THE DIL WITH THIS HṚD PARADIGM?

Looking at this example in the Wikipedia article on Devanagari (transliterations and transcriptions are mine)

Similarly, the sequence धड़कने <dhaṛakane> in दिल धड़कने लगा <dila dhaṛakane lagā> (the heart started beating) and in दिल की धड़कनें <dila kī dhaṛakanẽ> (beats of the heart) is identical prior to the nasalization in the second usage. Yet, it is pronounced [dʱəɽək.neː] in the first and [dʱəɽ.kənẽː] in the second.

made me wonder where the Hindi word दिल dil 'heart' came from. I would have expected a descendant of Sanskrit hṛdaya- 'heart' (the direct Sanskrit loan हृदय hriday doesn't count), but in fact that word became Hindi हिया hiyā 'heart, courage'. I can't find dil in Turner's A Comparative Dictionary of Indo-Aryan Languages.

Skt hṛdaya- is unusual in two ways.

First, why does it have initial h-? One might initially assume that there's nothing odd about h- since it corresponds to h- of its English cognate heart. But the expected correspondences are

Proto-Indo-European   Sanskrit        English
*k-                   ś- (not h-!)    h-
*gh-                  h-              g-, y- (not h!)
*gʷh-                 (g)h-           b- (not h!)


PIE *k: Skt śṛṅga- 'horn' : Eng horn

PIE *gh: Skt haṃsa- 'goose' : Eng goose; Skt hari- 'yellow' : Eng yellow

PIE *gʷh: Skt √han 'kill', ghn-anti 'they kill' : Eng bane

(4.21.13:15: Are there instances of PIE *gʷh > Skt (g)h- and Eng g- or y-? I can't find any.)

So why isn't the Sanskrit word for 'heart' śṛdaya- instead of hṛdaya-? Skt śrad-dha- 'faithful' (lit. 'heart-place') has the expected initial ś-.
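
The mismatch can be checked against the correspondence table directly. This is a small sketch, assuming only the three correspondences listed in this post (so it is nowhere near a full account of PIE reflexes); the dictionary layout and function name are hypothetical, and (g)h- is treated as "h- or gh-".

```python
# PIE initial -> sets of expected Sanskrit and English initials, from the table in this post.
CORRESPONDENCES = {
    "*k-": {"sanskrit": {"ś"}, "english": {"h"}},
    "*gh-": {"sanskrit": {"h"}, "english": {"g", "y"}},
    "*gʷh-": {"sanskrit": {"h", "gh"}, "english": {"b"}},
}


def possible_sources(skt_initial: str, eng_initial: str) -> list:
    """PIE initials that predict both the Sanskrit and the English initial."""
    return [
        pie
        for pie, reflexes in CORRESPONDENCES.items()
        if skt_initial in reflexes["sanskrit"] and eng_initial in reflexes["english"]
    ]


print(possible_sources("ś", "h"))  # śṛṅga- : horn
print(possible_sources("h", "g"))  # haṃsa- : goose
print(possible_sources("h", "h"))  # hṛdaya- : heart
```

The first two calls each find one source (*k- and *gh-), while the h- : h- pairing of hṛdaya- and heart matches none of the three rows.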

Second, why does hṛdaya- have the optional root stem hṛd- in all forms other than the "first five" (with 'strong' stems: nom. sg./du./pl. and acc. sg./du.)* according to Monier-Williams?

Case\Number             Singular   Dual                  Plural
Nominative              hṛdayam    hṛdaye (< hṛdaya-ī)   hṛdayāni
Accusative              (= nom.)   (= nom.)              hṛdayāni or hṛnd-i (with added -n-)
Stems for other cases   hṛdaya- or hṛd-

Why isn't hṛd- optional for all forms? In other words, why is there no

nom./acc. sg. *hṛt (< -d; with final devoicing)

nom./acc. du. *hṛd-ī

nom. as well as acc. pl. *hṛnd-i with intrusive -n-

Given that hṛdaya- is neuter, Whitney's grammar (section 311b)

The class of strong cases, as above defined [i.e., the first five forms], belongs only to masculine and feminine stems. In neuter inflection, the only strong cases are the nom.-acc.-pl.

implies a different paradigm in which the short stem hṛd- is optional for all forms other than the third and sixth: i.e., the nom.-acc.-pl.:

Case\Number             Singular                Dual                            Plural
Nominative              hṛdayam or hṛt (< -d)   hṛdaye (< hṛdaya-ī) or *hṛd-ī   hṛdayāni (but not hṛnd-i!)
Stems for other cases   hṛdaya- or hṛd-
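
To see exactly where the two accounts part ways, one can list the case/number slots in which the short stem hṛd- is allowed under each. This is a minimal sketch based on my reading of the two descriptions above; the set names are hypothetical labels, and the vocative is left out.

```python
CASES = ["nom", "acc", "ins", "dat", "abl", "gen", "loc"]
NUMBERS = ["sg", "du", "pl"]

# Slots that require the long stem hṛdaya- under each account (assumed reading).
MONIER_WILLIAMS_STRONG = {
    ("nom", "sg"), ("nom", "du"), ("nom", "pl"), ("acc", "sg"), ("acc", "du"),
}
WHITNEY_NEUTER_STRONG = {("nom", "pl"), ("acc", "pl")}


def short_stem_allowed(case: str, number: str, strong: set) -> bool:
    """hṛd- is optional wherever the slot is not a strong one."""
    return (case, number) not in strong


disagreements = [
    (case, number)
    for case in CASES
    for number in NUMBERS
    if short_stem_allowed(case, number, MONIER_WILLIAMS_STRONG)
    != short_stem_allowed(case, number, WHITNEY_NEUTER_STRONG)
]

print(disagreements)
```

The two accounts agree on the nominative plural and on every oblique slot; they disagree on the nominative singular and dual and on all three accusative slots.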

Why doesn't the root hṛd- have its own full paradigm? The Cologne Digital Sanskrit Dictionaries paradigm generator creates one, but Gérard Huet's generator does not (and the latter creates the dubious acc. pl.  hṛd-aḥ with a masculine ending; cf. the genuine masculine acc. pl. su-hṛd-aḥ below).

Maybe a full paradigm did exist. According to Whitney's grammar (section 397a), in theory a small class of words like hṛd- is supposed

to lack the nom. of all numbers and the accus. sing. and du. (the neuters, of course, the acc. pl. also [i.e., the sixth form, not just the first five!]) [...] But the usage in the older language is not entirely in accordance with this requirement

so it's theoretically possible that such forms could be found in the Vedas or in early Sanskrit speech which must have contained more forms than are attested in the literature.

To add to the confusion, Macdonell gave the paradigm for the compound su-hṛd- 'friend' (lit. 'good heart', m.; section 77) which always has the short stem hṛd-:

Case\Number               Singular        Dual        Plural
Nominative and vocative   su-hṛt (< -d)   su-hṛd-au   su-hṛd-aḥ
Accusative                su-hṛd-am       (= nom.)    (= nom.)
Stem for other cases      su-hṛd- only

Could hṛd- have a full paradigm only as a member of a compound?

I assume the neuter adjective a-su-hṛd- (lit. 'not-good-heart') 'having no friend' also had a full paradigm:

Case\Number                            Singular          Dual         Plural
Nominative, accusative, and vocative   a-su-hṛt (< -d)   a-su-hṛd-ī   a-su-hṛnd-i
Stem for other cases                   a-su-hṛd- only

*I would rather speak of eight forms than five by following Macdonell's grammar (section 73) and including the vocative in all numbers. But the vocative may not be considered a true case. Whitney does not mention the vocative in his sections on strong cases or words like hṛd-. Even Macdonell did not indicate that vocatives of the -declension had strong stems (section 101). He even omitted the vocatives of -declension duals and plurals! (4.21.12:03: They are identical to the nominative dual and plural.)

SOMNOLENT SADNESS

Today I learned that

ภาวะซึมเศร้า

<bhāvaḥ zïm śrau2>

phaawaʔ sɯm saw

lit. 'state sleepy sad'

is Thai for 'depression'.

ภาวะ <bhāvaḥ> phaawaʔ 'state' is from Sanskrit bhāva 'being', cognate to English be.

The native word ซึม <zïm> sɯm 'sleepy' happens to sound like the som of Latinate sleep-words like insomnia.

เศร้า <śrau2> 'sad' at first looks like it should be from Sanskrit; it has the letter ศ <ś> which is normally found only in Sanskrit loanwords. Moreover, the cluster ศร <śr> is normally found only in Sanskrit loanwords: e.g., เศรษฐ <śreṣṭha> [seettha] 'best'*.

But on the other hand, Sanskrit loanwords don't normally end in <au> or have tone markers which are normally reserved for Thai words**. Is this a native Thai word respelled to look Indic? Was it ever spelled as เส้า <sau2> or เซ่า <zau1>*** without Sanskrit ศ <ś> and silent -ร- <-r->?
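
The spellings เศร้า <śrau2>, เส้า <sau2>, and เซ่า <zau1> can all stand for the same syllable because Thai tone is read off the class of the initial consonant together with the tone marker. Here is a small sketch of that lookup for live syllables (those ending in a long vowel or a sonorant). Only the high-class + marker 2 and low-class + marker 1 rows matter here; the other rows are the standard Thai spelling rules as I understand them, and the function name is hypothetical. Markers 3 and 4 are omitted.

```python
# (consonant class, tone marker) -> tone, for live syllables only.
# Marker None = no marker, 1 = mai ek, 2 = mai tho.
TONE_RULES = {
    ("mid", None): "mid",
    ("mid", 1): "low",
    ("mid", 2): "falling",
    ("high", None): "rising",
    ("high", 1): "low",
    ("high", 2): "falling",
    ("low", None): "mid",
    ("low", 1): "falling",
    ("low", 2): "high",
}


def thai_tone(consonant_class: str, tone_marker=None) -> str:
    """Look up the tone of a live syllable from its spelling."""
    return TONE_RULES[(consonant_class, tone_marker)]


print(thai_tone("high", 2))  # ศ <ś> or ส <s> + marker 2
print(thai_tone("low", 1))   # ซ <z> + marker 1
```

Both calls give a falling tone, so the choice among the three spellings is purely orthographic.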

I thought เศร้า <śrau2> might be a Khmer loan respelled with Sanskrit ศ <ś>, but I couldn't find any Khmer word like ស្រៅ <srau>, and I can't think of any Khmer loans with tone markers.

*Why is economics called




lit. 'best-science'

in Thai? (Parentheses indicate a Thai character beneath a silencing symbol: e.g., <(ra)> = ร์ but <ra> = ร.) Is the Khmer word



seethaʔ-saah (-h < *-s)

a calque of the Indo-Thai term?

**I recall that Gedney noted an exception: Sanskrit loha 'red' (cognate to English red; by extension, 'metal': i.e., something red?) was borrowed as โล่ห์ <lo1(ha)> ~ โล่ <lo1> loo 'shield' with a tone marker. This might be evidence for borrowing foreign final -h as an early Thai *-h that developed into tone class 1 (and eventually into a falling tone). Aha, found the Gedney passage I had in mind which lists even more Indic loans with tone markers.

***ส <s> + tone marker 2 and ซ <z> + tone marker 1 (not 2) both represent [s]-syllables with falling tones.

OVERCOMING PEACEFUL PEOPLE

Today I was looking at food packaging from Hong Kong and it occurred to me that

克 Mandarin ke [khɤ], Cantonese haak 'gram' (the character means 'overcome')

安士 Mandarin anshi [anʂr̩], Cantonese onsi 'ounce' (the characters mean 'peace' and 'person')

weren't just weight units in different systems but were based on different Chinese languages. No Cantonese speaker would have borrowed gram as haak, and no Mandarin speaker would have borrowed ounce as anshi. The Mandarin word for 'ounce' is 盎司 angsi [aŋsz̩] (also less commonly spelled 盎斯*) with -s- instead of -sh-.

Mandarin 克 ke is closer to gram than Cantonese haak, but wouldn't Mandarin ge be even closer? Was the choice of 克 based on a third Chinese language which had a g-(like) reading for that character? Most Chinese readings of that character that I have seen have initial aspirated kh-, with the exceptions of Cantonese with h- (from earlier *kh-) and Yangzhou kəʔ with an unaspirated k- that could stand in for g-. I doubt 克 for 'gram' is based on Yangzhou (or Taiyuan**).

I then wondered how ancient literate Chinese would interpret 克 'gram' and 安士 'ounce'. They would be puzzled by the Arabic numerals preceding what initially appear to be 'overcome' and 'peaceful person'. Would the substitution of Chinese numeral characters enable them to figure out from context that 'gram' and 'ounce' are actually units of measurement?

*盎司 has 15.5 million hits in Google, but its homophone 盎斯 has only 434,000 hits. DeFrancis' 1996 Chinese-English dictionary only lists the former, whereas Lin Yutang's 1972 C-E dictionary only lists the latter.

4.19.00:27: Hanyu fangyin zihui (1962: 19) lists Taiyuan khəʔ ~ k as well as Yangzhou kəʔ as readings of 克. Could these instances of unaspirated k- be errors for aspirated kh-?

WERE THERE HMONG IN THE TANGUT EMPIRE?

Sitha Thor pointed out something I hadn't seen before in the Wikipedia entry on the Tangut people (emphasis mine):

Historically the "Qiang" was a summary term for the multiple ethnic groups who lived in northwest China and included the Tibetans, Han, and a small portion of the Miao/Hmong.

Did the historical 羌 Qiang ever include the Han (= Chinese proper)?

I am hesitant to equate the ancient Qiang with the modern Qiang whose language is related to Tangut.

Wikipedia has two sources for the inclusion of the Hmong among the Qiang. I can't see the first, but I did find the second at Google Books:

... [t]he [三苗] San Miao [...] were banished early in the third century B.C.E. by the legendary Emperor Shun to the region of [三危] Sanwei, later identified as present-day Gansu province [which was once Tangut territory]


1. Were the 三苗 San ('Three') Miao in any way related to the later 苗 Miao (= Hmong?)?

2. If 苗 Miao was an attempt to transcribe a Hmong (or at least non-Chinese) ethnonym, could 三 also be a transcription of a non-Chinese word rather than the Chinese word for 'three'?

3. Should "third century B.C.E." be third millennium BCE? (Giles' A Chinese-English Dictionary lists 2255 BCE as the supposed accession date of Shun.)

4. How believable is this story given that "the legendary Emperor Shun" never existed?

5. Did someone think it would be fitting for the 三苗 San Miao 'Three Miao' to go to 三危 Sanwei ('Three Dangers')?

6. Were the descendants of the San Miao still in Gansu when the Tangut Empire was established in the 11th century AD over three thousand years after Shun's mythical reign? There are no Hmong speakers in Gansu today or any evidence for them in the Tangut Empire.

Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2012 Amritavision