Old Chinese disyllabic words are of several types:

1. Compounds: 君子 *kur tsəʔ 'gentleman' < 'lord' + 'master'

Some compounds are graphically and/or phonetically obscured: e.g.,

鳳凰 *N-pəm-s waŋ 'phoenix' < 風 *pəm 'wind' + 皇 *waŋ 'sovereign'?

(functions of affixes unknown)

2. Reduplications

Total: 踧踧 *Cʌ-Tikʷ Cʌ-Tikʷ 'level and smooth'


Onset (including presyllable):

蚰蜒 *lu lan 'centipede'

蠐螬 *Cʌ-TSi Cʌ-TSu 'beetle larva'

蟰蛸 *saw T-sew 'spider' (prefix not reduplicated!)

蟋蟀 *sit T-sut 'cricket' (ditto)


Rhyme:

彷徨 *baŋ waŋ 'hesitant'

3. Neither compound nor reduplication

蝴蝶 *ga lep 'butterfly'

鷾鴯 *ʔəks nə 'swallow' (bird)

Compounds (category 1) were not subject to emphatic harmony rules just as compounds are not subject to vowel harmony in 'Altaic'-type languages: e.g., Mongolian галт тэрэг (not *галт тараг) 'train', literally 'fire cart' (a calque of Chinese 火車).

Categories 2 and 3 were subject to those rules, though there are exceptions. Words in these categories are nonbasic and tend to be within a few semantic domains: e.g., flora and fauna. I am particularly interested in category 3 words which defy emphatic harmony rules. If such words have no Chinese-internal etymologies and are not actually compounds, could they be borrowings from non-Chinese languages that have gone extinct without leaving any other traces? They remind me of vowel harmony-violating borrowings in 'Altaic'-type languages: e.g., Mongolian компутер (not *компутар) 'computer'.

Some category 3 words could even be hybrid compounds: e.g., the 蝶 *lep half of 'butterfly' has cognates in Tibeto-Burman, so it may be from a Proto-Sino-Tibetan root *lep, but perhaps the other half 蝴 *ga was a non-Chinese word for 'butterfly' that formed a redundant compound 'butterfly-butterfly'.

DID OLD CHINESE LOSE *Q-ODAS BEFORE EMPHATIC SPREADING? (PART 1)

David Boxenhorn proposed that Old Chinese (OC) might have transitioned from having only one emphatic per root (see his proposal in my previous post) to allowing emphatic spreading. What if the *-q > *-k shift is from the first period, and apparent counterexamples are due to emphatic spreading from the second period?

Before seeing if this new proposal will work in part 2, I'm going to outline the development of emphasis in OC:

1. There is no evidence for emphasis in other Sino-Tibetan languages. Hence I regard emphasis as an OC innovation - possibly due to 'Altaic' influence (see the 1994 article by Norman introducing the OC emphatic hypothesis for parallels with 'Altaic') - or did 'Altaic' acquire emphasis from OC?

2. The earliest stage of OC had no phonemic emphatics. A small number of consonants (e.g., *q) conditioned backer, lower allophones of vowels: e.g., */qa/ [qˁɑˁ] vs. */ta/ [ta] and */qi/ [qˁɪˁ] vs. */ti/ [ti]. These consonants were primary emphatics.

3. All consonants immediately before lower series vowels (*a *e *o; unstressed in presyllables) became phonetically emphatic:

*/qa/ [qˁɑˁ] (primary emphatic)

*/ta/ [tˁɑˁ] (secondary emphatic)

*/tʌ-ti/ [tˁʌˁ-ti] (secondary emphatic)


*/ti/ [ti]

*/tɯ-ta/ [tɯ-tˁɑˁ]

(8.17.1:07: Did /k/ and /q/ begin to merge at this point? The emphatic allophone of /k/ would be *[qˁ] just like /q/ [qˁ] unless OC had both *[kˁ] and *[qˁ].)

4. Emphasis spread from presyllables into root syllables:

*/tʌ-ti/ [tˁʌˁ-ti] > */tʌ-ti/ [tˁʌˁ-tˁɪˁ]

*/tɯ-ta/ [tɯ-tˁɑˁ] > */tɯ-ta/ [tɯ-ta]

Presyllables and root syllables had to match in terms of emphasis. There was also a strong tendency for disyllabic and reduplicated morphemes to match in terms of emphasis: e.g.,

蜈蚣 */ŋʷa koŋ/ [ŋʷˁɑˁ qˁɔˁŋˁ] 'centipede'

髑髏 */dok ro/ [dˁɔˁqˁ rˁɔˁ] 'skull'

鵜鶘 */le ga/ [lˁɛˁ ɢˁɑˁ] 'pelican'

螻蟈 */ro kʷək/ [ʁˁɔˁ qʷˁʌˁqˁ] 'kind of frog'

昆吾 */Cʌ-kun ŋa/ [Cˁʌˁ-qˁʊˁnˁ ŋˁɑˁ] 'name of a legendary country'

妯娌 */rluk rəʔ/ [rluk rəʔ] 'husband's brother's wife'

麒麟 */gə rin/ [gə rin] 'kirin'

鼩鼱 */Cɯ-go tseŋ/ [Cɯ-go tseŋ] 'shrew'

鴝鵒 */Cɯ-go lok/ [Cɯ-go lok] 'mynah'

芙蓉 */Cɯ-Pa loŋ/ [Cɯ-Pa loŋ] 'lotus'

璵璠 */Cɯ-la ban/ [Cɯ-la ban] 'kind of precious jade'

蜘蛛 */Cɯ-tre tro/ [Cɯ-tre tro] 'spider' (and many other reduplicative words)


蝙蝠 */pen pək/ [pˁɛˁnˁ pək] (not [pˁɛˁnˁ pˁʌˁqˁ]!) 'bat'

(即+鳥)鴒 */tsək reŋ/ [tsək rˁɛˁŋˁ] (not [tsək reŋ]!) 'wagtail'

鸕鶿 */ra dzə/ [ʁˁɑˁ dzə] (not [ʁˁɑˁ dzˁʌˁ]!) 'cormorant'

轀輬 */Cʌ-ʔun raŋ/ [Cˁʌˁ-ʔˁʊˁnˁ raŋ] (not [Cˁʌˁ-ʔˁʊˁnˁ rˁɑˁŋˁ]!) 'closed carriage in which one could lie down'

Did /raŋ/ harmonize with /ʔun/ before /ʔun/ harmonized with */Cʌ-/?

鸚鵡 */rʔeŋ Cɯ-ma/ [ʁˁɛˁŋˁ Cɯ-ma] (not [ʁˁɛˁŋˁ Cˁʌˁ-mˁɑˁ]!) 'parrot'
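Stages 3 and 4 can be restated as a toy procedure. The following is only a sketch under my own assumptions: syllables are reduced to (onset, vowel) pairs, *q stands in for the whole class of primary emphatics, and only presyllable + root structures are modeled (not the harmony tendency in disyllabic morphemes or its exceptions). All names are hypothetical.

```python
# Toy model of stages 3 and 4. The data layout and all names are
# illustrative assumptions, not part of any published reconstruction.
PRIMARY_EMPHATICS = {"q"}            # stand-in for inherently emphatic consonants
LOWER_VOWELS = {"a", "e", "o", "ʌ"}  # lower series, including presyllabic ʌ


def stage3(syllables):
    """Return one emphasis flag per (onset, vowel) syllable.

    A syllable is emphatic if its onset is a primary emphatic or its
    vowel belongs to the lower series.
    """
    return [onset in PRIMARY_EMPHATICS or vowel in LOWER_VOWELS
            for onset, vowel in syllables]


def stage4(syllables):
    """Apply stage 3, then let a presyllable impose its value on the root."""
    flags = stage3(syllables)
    if len(flags) == 2:  # presyllable + root syllable
        flags[1] = flags[0]
    return flags


if __name__ == "__main__":
    for word in ([("t", "a")], [("t", "i")],
                 [("t", "ʌ"), ("t", "i")], [("t", "ɯ"), ("t", "a")]):
        print(word, stage3(word), stage4(word))
```

For */tʌ-ti/ the sketch gives emphatic + plain at stage 3 and emphatic + emphatic at stage 4; for */tɯ-ta/ it gives plain + emphatic and then plain + plain, matching the schematic forms above.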

(8.17.2:34: 5 and 6 forgotten in haste and added.)

5. Loss/fusion of presyllables. Emphatic consonants became phonemic:

*/ta/ [tˁɑˁ] > */tˁa/ [tˁɑˁ]

*/tʌ-ti/ [tˁʌˁ-tˁɪˁ] > */tˁi/ [tˁɪˁ]

*/tɯ-ta/ [tɯ-ta] > */ta/ [ta]

*/ti/ [ti] remains unchanged.

6. Phasing out of emphasis. Vowel allophones became phonemic:

*/tˁa/ [tˁɑˁ] > /tɑ/ [tɑ]

*/tˁi/ [tˁɪˁ] > /tɪ/ [tɪ]

*/ti/ [ti] and */ta/ [ta] remain unchanged.

Next: Did *-q > *-k shift occur at 2 or 3 above?

DID OLD CHINESE HAVE *Q-ODA DISSIMILATION?

One problem with last night's proposal was my inability to explain why my Old Chinese (OC) *-q had two seemingly random reflexes: *-k or *-ʔ.

David Boxenhorn proposed that Old Chinese could have had a Semitic-like restriction of one emphatic per root. I consider OC *q to be an emphatic. (Baxter and Sagart reconstruct nonemphatic *q- and emphatic *qˁ-, which is a rare distinction found in Ubykh.) If *-q were preceded by another emphatic (*Q), it could have dissimilated by fronting to *-k:

*QVq > *QVk

e.g., 特 *Cʌ-Təq > *dˁəq > *dˁək

(I am more agnostic about the sources of voicing and aspiration in emphatic obstruents than I was last night. Hence *Cʌ- instead of *Nʌ- and *Hʌ-.)

On the other hand, if *-q were the only emphatic, it could have backed to a glottal stop:

*CVq > *CVʔ (cf. how Baxter and Sagart's initial OC *q- became *ʔ-)

e.g., 止 *təq > *təʔ (phonetic in 特 above)

This predicts that no *-q words became

*QVʔ (with *-q backing to a glottal stop instead of *-k) or

*CVk (with *-q fronting to a velar instead of backing to a glottal stop)

Unfortunately, such words did exist:

*Cʌ-Kəq > *gˁəʔ (not *gˁək!)

𠬝 *bəq > *bək (but phonetic in 服 which had two readings, the expected *bəʔ and the unexpected *bək; 服 is in turn phonetic in 箙 *bək)
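The prediction and its failures can be checked mechanically. This is a minimal sketch assuming that the only relevant variable is whether the syllable already contains another emphatic; the function names are my own, and the four test items are just the forms cited above (one of them cited without its graph).

```python
def predicted_coda(has_other_emphatic):
    """Dissimilation hypothesis: *-q fronts to *-k next to another
    emphatic and otherwise backs to a glottal stop."""
    return "k" if has_other_emphatic else "ʔ"


# (label, syllable has another emphatic?, attested later coda)
CASES = [
    ("特 *dˁək", True, "k"),
    ("止 *təʔ", False, "ʔ"),
    ("*gˁəʔ", True, "ʔ"),
    ("𠬝 *bək", False, "k"),
]


def counterexamples(cases):
    """Return the labels whose attested coda contradicts the prediction."""
    return [label for label, emphatic, attested in cases
            if predicted_coda(emphatic) != attested]


if __name__ == "__main__":
    print(counterexamples(CASES))
```

Run on these four items, the check flags *gˁəʔ and 𠬝 *bək, the same two problem cases listed above.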

Is a separate */q/ series necessary? I have long assumed that OC final */-k/ had two allophones: *[-qˁ] after emphatics (contrary to the dissimilation hypothesis) and *[-k] after nonemphatics: e.g.,

刻 */kʰˁək/ [qʰˁʌˁqˁ]

𠬝 */bək/ [bək]

Was it acceptable to use a *[-ʔ] phonetic for *[-qˁ]-syllables and vice versa, and somewhat less acceptable (though still possible) to use a *[-ʔ] phonetic for *[-k]-syllables and vice versa?

*[-ʔ] phonetic for *[-qˁ]-syllable:

止 *təʔ in 特 *dˁək [dˁʌˁqˁ] (no other readings of 止-graphs end in *-k)

*[-k] phonetic for *[-ʔ] syllable:

𠬝 *bək in 服 *bəʔ (all other readings of 𠬝-graphs end in *-k)

8.16.0:48: 𠬝 is the only example of a *[-k] phonetic for a *[-ʔ] syllable that I can find. I can't find any examples of *[-qˁ] phonetics for a *[-ʔ] syllable. Hence I've crossed out "and vice versa" above.

DID OLD CHINESE HAVE *Q-ODAS?

My last two posts on codas made me wonder if Old Chinese (OC) had *-q as well as *-p *-t *-k *-kʷ. If OC had initial *q-, could it also have had final *-q?

I first encountered the notion of OC *-q in Pulleyblank (1984: 226) who proposed it instead of *-kʷ in the 藥 rhyme category.

I wonder if *-q existed in addition to *-kʷ. Although there is no rhyme evidence for a fifth stop, perhaps the extant poetry reflects mergers that had not yet occurred when these phonetic series were created:

GSR series   Sinograph   OC                  Sinograph   OC
0934         𠬝          *bək (< *-q?)       服          *bəʔ ~ k (< *-q?)
0937                     *Nʌ-Kəʔ (< *-q?)                *Hʌ-Kək (< *-q?)
0961         止          *təʔ (< *-q?)       特          *Nʌ-Tək (< *-q?)
0977                     *ləʔ (< *-q?)                   *lək-s (< *-q-s?)
0995                     *wəʔ (< *-q?)                   *wək (< *-q?)

(8.15.0:55: 异 is another spelling of 異 'different' which is phonetic in 翼 *lək 'wing'. Why did Baxter and Sagart [2011] reconstruct 翼 'wing' as *ɢrəp with *-p? To link it with 翌 *rəp 'wing' [= their *ɢrəp]? Could the *-k of 翼 and the *-p of 翌 be from *-kʷ? More here.)

Here is a possible *-q word family (found in Sagart 1999; the reinterpretation is my own):

səq:*Hʌ-səʔ (< *-q?) 'colorful' ~ 色 *Tʌ-sək (< *-q?) 'color'

8.15.1:41: Here are two more *-q word families:

ləq:*m-lək (< *-q?) 'to eat' : 飼 *s-m-ləʔ/k-s (< *-q-s?) 'to feed'

The 司 phonetic series mostly represents open syllables or syllables ending in *-ʔ (< *-q?); 飼 is the only member that might have had *-k-.

məq:*Cʌ-mək 'black' : 晦 *Hʌ-məʔ/k-s (< *-q-s?) 'dark'

The 母 phonetic series mostly represents open syllables or syllables ending in *-ʔ (< *-q?); 晦 is the only member that might have had *-k-.

Was there a back coda chain shift?

*-q > *-ʔ > *-Ø

Could that account for phonetic series like

*mʌ-rə (< *-ʔ?) : 麥 *mʌ-rək (< *-q?)

in which *-k alternates with zero?

PHONODYNAMICS

is a term I coined tonight to refer to the study of sound changes. It's shorter than 'historical phonology'. Google returned only nine hits for it with 'linguistics'; its earliest attestation was as phonodynamic in Jannaris (1897).

Similar terms for other subfields of historical linguistics could be morphodynamics, semantodynamics, syntactodynamics, etc.

A key tool in phonodynamics is what I call phonostatistics. Earlier tonight I found Grimes (2010) which includes a synchronic study of Hungarian phonostatistics: e.g.,

At the outset, it was already known that /h/ and /ʤ/ can only appear in onsets and /x/ (an allophone of /h/) in the coda – this is common knowledge. However, the [statistical] skew present for other consonants has not been generally recognized – for example, /z/, /n/, /r/, and /g/ show strong preference for appearing in the coda, while /f/, /b/, and /v/ have strong preference for onset position. In fact, excepting /p/, all labials show a preference for onset position. This is a peculiar fact that I have not seen noted by others previously. (p. 129)

Phonologists have paid more attention to complementary distribution - e.g., to [h]- and -[x] in Hungarian - than to sounds that can be found in the same position but with different degrees of frequency. Phonodynamic studies may find correlations between frequencies and sound changes.
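Here is a minimal sketch of the kind of count I have in mind, assuming a word list already segmented into (onset, vowel, coda) triples. The five-word list is invented purely to exercise the function; it is not Grimes's data and the function name is hypothetical.

```python
from collections import Counter


def position_counts(cvc_words):
    """Map each consonant to (onset count, coda count) over CVC words."""
    onsets = Counter(onset for onset, _, _ in cvc_words)
    codas = Counter(coda for _, _, coda in cvc_words)
    consonants = sorted(set(onsets) | set(codas))
    return {c: (onsets[c], codas[c]) for c in consonants}


# Invented toy list, NOT real Hungarian data.
TOY_WORDS = [
    ("b", "a", "n"),
    ("f", "a", "z"),
    ("b", "o", "r"),
    ("v", "i", "z"),
    ("n", "e", "r"),
]

if __name__ == "__main__":
    for consonant, (n_onset, n_coda) in position_counts(TOY_WORDS).items():
        print(consonant, n_onset, n_coda)
```

A real study would then have to test whether a skew like 2 onsets to 0 codas is more than chance, which a raw count like this cannot show.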

Is an onset preference for labials universal? In my small survey of final stops, there were several languages which only permitted nonlabial stop codas. Thanks to David Boxenhorn for pointing out that Hungarian's distant relative Finnish is another. I've never seen a language that has -p as its sole final stop. (I predict such languages - if they exist - are more numerous than languages that have -c as their sole final stop. c is much less frequent than p; the former occurs in only 12% of all languages in UPSID, whereas the latter occurs in 83%. Oddly, Grimes found a preference for both /tʸ/ = /c/ and /p/ as codas in Hungarian, but the low frequency of /c/ - the second least common consonant in Hungarian - makes me wonder if the coda preference is an artifact of a sample restricted to 678 CVC words as opposed to the entire lexicon.)

Vietnamese has [p] only as a coda, though [c] can occur in both onset and coda positions.

Vietnamese stop distribution

class    implosive    pulmonic unaspirated       pulmonic aspirated
                      labial      nonlabial
stop     [ɓ ɗ]        [p]         [t c k]        [tʰ]
onset    yes          no          yes            yes
coda     no           yes         yes            no

[tʰ] was once a member of a set: *[pʰ tʰ cʰ kʰ]. The other three became fricatives [f s x] with the same distribution as [tʰ] (possible as onsets but not as codas).

Are there Muong languages with the same distributional pattern?

Å-SYMMETRY IN AVESTAN (APPENDIX: -T̰)

Is there any script in the world besides Avestan that has different characters for released t and unreleased t̰? Is there any language that has a phonemic distinction between the two? In Avestan, the distinction was subphonemic: t̰ occurred as the first consonant in clusters (t̰k-, t̰b-) and in final position (including the ends of first elements of compounds), whereas t occurred everywhere else.

"Å-symmetry" refers to the fact that no other released/unreleased pairs exist in writing. They might have existed in speech. I wrote last December,

I presume other final stops in Avestan were also unreleased, but they lacked special letters in the Avestan alphabet because final /t/ was more common. (The ablative singular and secondary third person singular suffixes end in /t/. I don't know of any suffixes ending in other stops.)

I searched through the entire Yasna for the string

p or k + space, comma, period, or exclamation mark

and found no examples. If there are any -p or -k-final words that wouldn't match any of those patterns, they must be very rare.
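For anyone who wants to repeat the search, here is a rough equivalent as a regular expression. It is a sketch that assumes a plain-text transliteration in which words are separated by spaces; the function name and the sample string (with the dummy tokens zzk and zzp) are made up for illustration.

```python
import re

# A word ending in p or k, immediately followed by a space, comma,
# period, or exclamation mark.
FINAL_P_OR_K = re.compile(r"\S*[pk](?=[ ,.!])")


def words_ending_in_p_or_k(text):
    """Return every token that ends in p or k before one of the delimiters."""
    return FINAL_P_OR_K.findall(text)


if __name__ == "__main__":
    sample = "tat̰ ahurō. zzk, zzp! mraot̰"  # zzk and zzp are dummy tokens
    print(words_ending_in_p_or_k(sample))
```

Like the original search, this misses a -p or -k word that is followed by anything else, including the very end of the text.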

The unique status of -t̰ in Avestan reminds me of how -t was the only permissible final stop in Late Middle Japanese (and only in Chinese loans), though I think -p and -k also once existed in an even earlier Japanese style of pronouncing Chinese loans.

If a language has only one final stop, what is that stop most likely to be? I doubt it's palatal -c. In modern Fuzhou, a glottal stop is the only permissible final stop.

8.13.1:39: A survey of permissible final stops

Language(s)                                                      Final stops
Khmer                                                            -p -t -c -k -ʔ
Vietnamese                                                       -p -t -c -k
Thai, Taiwanese                                                  -p -t -k -ʔ
German, Russian, Sanskrit, Korean, Cantonese                     -p -t -k
Malay (native vocabulary; see Tadmor 2009: 1061)                 -p -t -ʔ
Shao Yong's 11th century Liao Chinese (Pulleyblank 1984: 106)    -p -ʔ
Late Middle Japanese (and Avestan?)                              -t
Shanghai, Fuzhou, Burmese                                        -ʔ

That table refers to phonetic stops, not to phonemes or historical stops preserved in writing: e.g., the -b of German, the -б of Russian, the -ပ် <p> of Burmese, etc. I have also excluded final affricates.

Manchu has -b -t -k, but I suspect -b was phonetically [p].

I almost added Turkish to the -p -t -k row but according to Myers and Crowhurst,

It is worth noting that there are a few sporadic exceptions to the syllable final devoicing rule, for example Ad, a proper name, and serhad 'frontier' [which have final [d] rather than [t]].

Are there any exceptions with [b] or [g]?

Å-SYMMETRY IN AVESTAN VOWELS (CONCLUSION)

I titled this series "Å-symmetry" with A and a superscript O because Avestan diphthongs such as ao (from mraot̰ /mraut/ at the end of last Thursday's entry) lack the symmetry of their Sanskrit counterparts:

Sanskrit diphthongs

e [eː] < *ai           o [oː] < *au
ai < *āi               au < *āu

Avestan diphthongs

aē ~ ōi ~ ē /ai/       ao ~ ə̄u ~ ō /au/
āi /āi/                āu /āu/

ai and au are not the short counterparts of āi and āu; they are /a/ followed by epenthetic vowels anticipating palatal and labial segments in the following syllable* (Jackson 1892: 25-26):

/aCI/ > [aiCI] (/I/ = palatal vowel)

/aCy/ > [aiCy]

/arU/ > [aurU] (/U/ = labial vowel)

/arv/ > [aurv]
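These four rules can be written as two substitutions. The sketch below is only an approximation under my own assumptions: /I/ is taken to be i ī e ē, /U/ is taken to be u ū o ō, C is any single non-vowel other than y and v, and the restrictions on which consonants actually allow i-epenthesis are ignored. The inputs are schematic strings, not attested words.

```python
import re

# Assumed segment classes, for illustration only.
VOWELS = "aāiīuūeēoō"
PALATAL = "iīeēy"  # /I/ plus y
LABIAL = "uūoōv"   # /U/ plus v

# /aCI/ > [aiCI] and /aCy/ > [aiCy]
I_RULE = re.compile(rf"a([^{VOWELS}yv])(?=[{PALATAL}])")
# /arU/ > [aurU] and /arv/ > [aurv]
U_RULE = re.compile(rf"ar(?=[{LABIAL}])")


def epenthesize(phonemic):
    """Insert anticipatory i or u after short /a/ in a schematic string."""
    surface = I_RULE.sub(r"ai\1", phonemic)
    return U_RULE.sub("aur", surface)


if __name__ == "__main__":
    for form in ("ati", "arya", "aru", "arva", "ata"):
        print(form, ">", epenthesize(form))
```

Only short a is targeted, so a schematic string like āti passes through unchanged.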

The true short counterparts of āi and āu were not written as a matching pair like ai and au:

aē ends in a long vowel unlike its back counterpart ao

ōi has a rounded vowel unlike its back counterpart ə̄u

I would have expected

aē and aō or

ae and ao or

ōi and ōu or ēu or

ə̄i and ə̄u

Were the allophones of /ai/ and /au/ due to pressure to differentiate them from āi and āu?

aē /ai/ has a long second vowel unlike its long counterpart āi

ōi /ai/ has a round first vowel unlike its long counterpart āi

ao /au/ has a mid second vowel unlike its long counterpart āu

ə̄u /au/ has a mid first vowel unlike its long counterpart āu

There may have been more symmetry at an intermediate stage. Beekes (1988: 36) "suggest[ed] that ōi continues [the unattested] ə̄i." ə̄ in ə̄i backed to ō so the two halves of the diphthong would be polarized: i.e., maximally differentiated in terms of the horizontal dimension (o is back and i is front).

The asymmetry is also distributional:

aē tends to be in open nonfinal syllables whereas ōi tends to be in closed syllables and final position; ē is also common in final position

ə̄u is almost always in the genitive singular ending -ə̄uš of u-stems (i.e., in a subset of closed syllables rather than closed syllables in general); ō is in final position like ē but is rarer

Skjærvø (2003: 10) found only one apparent minimal pair with aē and ōi in Young Avestan:

aēm 'he' (< *ayam) : ōim 'one' (also aoim, ōyum < *aivam)

In these cases, aē and ōi are not simply /ai/ and /au/ but result from contractions of longer sequences.

*i-epenthesis generally only occurs before labials and dentals

p, b, β

t, θ, θr, δ, n, ṇt, r

and ŋh < *sy.

u-epenthesis only occurs before /r/. Was /r/ u-friendly in a way that other consonants weren't? Elmer Fudd's pronunciation of English r as [w] comes to mind: "wascally wabbit".

