Last month, I discovered Pierre Marsone's translations from the annals of the History of the Liao ending in the middle of the era known as 統和 'uniting harmony' in Chinese and as

<HEAVEN ? ?> ~ <HEAVEN ?> (large script)

<HEAVEN s.bu.o.ɣo> (small script)

The first characters in both Khitan scripts are obviously equivalent to each other though not to Chinese 統 'unite':


I am not certain that the remaining Khitan characters are equivalent to each other.


Although <s.bu.o.ɣo> in the small script means 'inherited' or 'succeeded' (Kane 2009: 63, 100, 118), it is possible that one or both of the large script words mean 'harmony' like Chinese or something else entirely.

It doesn't help that I can't find any other example of the large script character

with the element resembling Chinese 'thread' in the few texts I have on hand. That character does not appear to be a ligature of the other two characters which are in those texts*:

<PIG/AFFAIR ? ?> (耶律褀墓誌 5; <? ?> is presumably either the word in the era name or a homophone; I have not seen the second character without the third)

<c.er ? ? ...> 'wrote ...' (北大王墓誌 4; unknown if the ホ-like character is a separate word or the beginning of a word)

<... ? ? ? ...> '?' (北大王墓誌 8; word boundaries unknown; unknown if the ホ-like character is a separate word)

<? ?> '?' (北大王墓誌 11, 14, 17; surrounded by various characters on both sides, so possibly a word)

<? ?> '?' (耶律褀墓誌 5, 6 [x 2], 22 [x 4], 24, 25 [x 2], 26; surrounded by various characters on both sides, so possibly a word)

I wish I could compile a list of all Khitan large script character combinations that occur twice and do not contain any known suffixes. Such combinations are likely to be new words or stems.
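A minimal sketch of how such a list could be compiled, assuming the texts were available as sequences of character IDs (for instance N4631 numbers). The function name, the placeholder IDs, and the reading of "occur twice" as "at least twice" are my assumptions. Word boundaries are ignored because they are mostly unknown.

```python
from collections import Counter


def repeated_combinations(texts, known_suffixes, n=2, min_count=2):
    """Return n-character combinations seen at least min_count times
    that contain no character listed in known_suffixes."""
    counts = Counter()
    for text in texts:
        # Slide a window of n characters across each text.
        for i in range(len(text) - n + 1):
            counts[tuple(text[i:i + n])] += 1
    return {
        combo: count
        for combo, count in counts.items()
        if count >= min_count and not any(ch in known_suffixes for ch in combo)
    }


# Hypothetical placeholder data: each text is a list of character IDs.
texts = [
    ["A", "B", "C", "S1"],
    ["X", "A", "B", "S1"],
    ["C", "S1", "A"],
]
known_suffixes = {"S1"}  # placeholder for a real list of known suffix characters

result = repeated_combinations(texts, known_suffixes)
print(result)
```

Here the pair A B survives because it recurs and contains no suffix, while C S1 recurs but is filtered out. Longer combinations can be checked by raising n.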

*3.13.1:45: Although there are at least two other large script characters containing the left-hand component 糹 resembling Chinese 糹 'silk' (2197 and 2209 in N4631), I have not seen 糸 as an independent large script character. Moreover, I do not know of any attestations of the right-hand component as an independent large script character or as a component in any other character.

3.13.1:52: Could

  (= N4631 0965 which has 小 under ㄴ+丨 instead of a long 亅 intersecting ㄴ?)

be a variant of the phonogram <u> (= N4631 2211, which has a bottom hook?)? That wouldn't be possible if

represented something like <sbuoɣo> (division point unknown) ending in <o>.

15.3.11:23:59: A DAY 3: TONES

I will finish my survey of the forms for 女 'woman' in Chinese languages by looking at tonal categories. I will not examine tone shapes because they may be even more diverse than the vocalism: e.g., within Mandarin varieties alone, 'woman' can have tones that are high level (Jinan), high falling (Xi'an), mid falling (Wuchang), low falling (Tianchang), low falling-rising (Beijing), and low rising (Hefei). However, those are all reflexes of the same tone category (traditionally called 'rising' though its shapes vary; here I will call it 2).

In standard Mandarin, 2 is the tone that regularly developed in syllables with Middle Chinese sonorant initials (*ɳ-) and glottalization (*-ˀ):

| Category | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Old Chinese | *-Ø/m/n/r/ŋ | *-ʔ | *-s | *-p/t/k/kʷ |
| Middle Chinese | *-Ø/m/n/ŋ | *-ˀ | *-ʰ | *-p/t/k |
| Standard Mandarin: *voiceless | 1a | 2 | 3 | 1a, 1b, 2, 3 |
| Standard Mandarin: *voiced sonorant | 1b | 2 | 3 | 1a, 3 |
| Standard Mandarin: *voiced obstruent | 1b | 3 | 3 | 1b, 3 |
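The table can also be read as a lookup from initial class and Middle Chinese category to the standard Mandarin outcome. The sketch below is only an illustration: the dictionary and function names are mine, and any cell that a row of the table does not state separately is filled in on the assumption that it is shared with the neighboring row.

```python
# Standard Mandarin reflexes by Middle Chinese initial class and tone category.
# Assumption: cells not stated separately for a row are shared with the
# neighboring row (e.g. category 3 yields tone 3 for every initial class).
REFLEXES = {
    "voiceless":        {1: ["1a"], 2: ["2"], 3: ["3"], 4: ["1a", "1b", "2", "3"]},
    "voiced sonorant":  {1: ["1b"], 2: ["2"], 3: ["3"], 4: ["1a", "3"]},
    "voiced obstruent": {1: ["1b"], 2: ["3"], 3: ["3"], 4: ["1b", "3"]},
}


def mandarin_reflexes(initial_class, category):
    """Return the list of possible standard Mandarin tones."""
    return REFLEXES[initial_class][category]


# 'woman': Middle Chinese sonorant initial *ɳ- plus glottalization (category 2)
woman = mandarin_reflexes("voiced sonorant", 2)
print(woman)
```

Category 4 has several outcomes for each initial class, so the lookup returns a list rather than a single tone.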

I would expect 'woman' in other Chinese languages to have tone 2 or 2b. (-a indicates a tone conditioned by *voiceless initials and -b a tone conditioned by *voiced initials.) But in fact 'woman' also has tones in other categories:

Group\Tone 1a 12a 1b 12b 2 2a 23a 2b 3 3a 3b













Are forms with tones 1 and 3 evidence for reconstructing Old Chinese variants ending in zero and *-s? And are a-tones evidence for reconstructing a voiceless initial? Not necessarily:

- In standard Mandarin, 2 merged with 3 after *voiced obstruents. Maybe some varieties merged 2 with 3 after *voiced sonorants as well.

- In standard Mandarin, 2 has the same reflex after *voiceless initials and *voiced sonorants. Maybe some varieties developed 2a after those two classes of initials and 2b after *voiced obstruents.

- Some varieties merged 2a with 1a (12a) or 2b with 1b (12b). In some cases this may have only occurred after *voiced sonorants (and the results were labeled 1a and 1b).
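The tone labels used in this post can be unpacked mechanically. The sketch below assumes that the digits name the merged categories and that a final a or b names the conditioning initial class. Reading 23a as a merger of 2a and 3a is my extrapolation from 12a and 12b, and the function name is hypothetical.

```python
def parse_tone_label(label):
    """Split a label such as '12a' into (categories, register).

    register is 'voiceless' for -a, 'voiced' for -b, and None when the
    label carries no register suffix (e.g. plain '2').
    """
    register = None
    if label[-1] in "ab":
        register = {"a": "voiceless", "b": "voiced"}[label[-1]]
        label = label[:-1]
    # Each remaining digit names one tone category taking part in the merger.
    categories = [int(ch) for ch in label]
    return categories, register


for label in ["2", "2b", "12a", "23a"]:
    print(label, parse_tone_label(label))
```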

3.12.2:13: 馬 'horse' belongs to the same Middle Chinese tonal category (*voiced sonorant + *-ˀ) as 'woman'. The notes at xiaoxue.iis.sinica.edu.tw give some insight into its unusual tones in Wu varieties: e.g., the tone from *voiced sonorant + *-ˀ merged with 1a in 溧陽 Liyang, 3a in 衢州 Quzhou, and 3b in 崑山 Kunshan.

15.3.10:23:59: A DAY 2: RHYMES

Two days ago, I wrote about the large number of initial consonants in the forms for 女 'woman' in Chinese languages. There is even more variation among the rhymes. This table may be the widest I've ever made in HTML with fifty columns:

Group\Rhyme -i -iu -ie -iə -iou -iɤ -iɛ -iɔ -ü̥y -y -ỹ -yz -yʮ -ye -yə -yɤ -yu -u -ui -uei -uĕi -uɛi -ɯŋ -o -oi -oy -ɤŋ -ei -əu -əŋ -ɵy -øi -øy -œi -œy -ɔi -ɔy -ɐu -a -aɯ -aɤ













I didn't even include the syllabic nasal rhymes -ŋ̍ and -n̩ in Hakka. More on such rhymes below.

Let me try to make some sense out of all that. I am sure what follows will be wrong to some extent because I am making generalizations and I do not know the histories of all the individual varieties in each group.

I do at least know the history of standard Mandarin [ny] 'woman':

*Rɯ-naʔ > *Rɯ-nɨaʔ > *Rnɨaʔ > *rnɨaʔ > *nrɨaʔ > *ɳɨaʔ > *ɳɨəˀ > *ɳio > *ɳø > *ɳy > ny
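The chain above can be replayed as an ordered list of substitutions. This is only a sketch: each step is a literal string replacement chosen to reproduce the stages listed, not a general sound law, the comments are my glosses, and the reconstruction asterisks are omitted.

```python
# Ordered (old, new) substitutions; each one yields the next stage of the chain.
STEPS = [
    ("a", "ɨa"),    # *a partially raised after the high-vowel presyllable
    ("ɯ-", ""),     # presyllable loses its vowel
    ("R", "r"),     # *R- becomes *r- before *n-
    ("rn", "nr"),   # metathesis
    ("nr", "ɳ"),    # fusion into a retroflex nasal
    ("aʔ", "əˀ"),   # Middle Chinese stage
    ("ɨəˀ", "io"),
    ("io", "ø"),
    ("ø", "y"),
    ("ɳ", "n"),     # retroflex nasal becomes n-
]


def derive(form, steps=STEPS):
    """Apply each substitution in order and collect every intermediate form."""
    stages = [form]
    for old, new in steps:
        form = form.replace(old, new)
        stages.append(form)
    return stages


stages = derive("Rɯ-naʔ")
print(" > ".join(stages))
```

Because the steps are ordered, moving one of them changes the outcome: applying the last step before the metathesis and fusion steps, for example, would have nothing to act on.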

-y is the only rhyme for 'woman' found across Chinese. Many -y forms could be borrowings from Mandarin(-like) prestige dialects with -y.

Other high vowels like -i and -u and the diphthong -iu could either be from *-y or borrowings of -y in varieties lacking -y.

Falling diphthongs could be from warped high vowels: e.g., Cantonese -œy may be from *-y. Diphthongs like *-œy could lose part of their rounding and become -oi, etc.

Rising diphthongs like -iə might be partial retentions of earlier rising diphthongs like *-ɨə.

The mid vowel rhymes may be partial retentions of the second halves of earlier rising diphthongs like *-ɨə.

The low vowel rhymes may retain the height of Old Chinese *-a. Lowering is doubtful since Chinese vowels tend to raise. Could Wenzhou na* directly reflect the second half of Old Chinese *Rɯ-naʔ?

The nasality of the initial conditioned new codas in rhymes like -ɯŋ.

The nasal was all that was left in Hakka forms such as Meixian which I presume is from an earlier *n- + high vowel sequence.

I can't explain the z-type rhymes: ([zʷ] in IPA), -yz, and -yʮ. Normally such rhymes developed after sibilants, not nasals.

*Wenzhou na is labeled as literary at xiaoxue.iis.sinica.edu.tw, though the form listed as colloquial (ȵy) resembles mainstream ny-forms and is hence likely to be a borrowing.


The forms for 女 'woman' in Chinese languages may be only the tip of an iceberg of lost diversity. The northwestern Chinese dialect known to the Tangut became extinct, leaving only substratal traces in the Mandarin dialects that replaced it.

I thought northwestern Mandarin m-forms for 'woman' like Xi'an mi* might be substratal retentions, yet I know of no premodern evidence for such a word.

Could those Mandarin m-forms be borrowed from the m-forms of Jin to the east? Such an m-word for 'woman' need not have anything to do with Tangut m-words for females, as m-words for 'mother' have developed independently in many languages.

女 had a retroflex nasal *ɳ- in Middle Chinese. Coblin (1994: 102) listed two other cases of Xining m- corresponding to Middle Chinese *ɳ- before i:

尼 'nun': Xining mi : Middle Chinese *ɳi

膩 'oily': Xining mi : Middle Chinese *ɳiʰ

Did *ɳ- become m- in the northwest after the Tangut period, and were those m-forms replaced by mainstream n-forms with isolated exceptions like 'woman'? The trouble is that *ɳ- > m- before i makes no phonetic sense.

Let me put aside the above 'mi-stery' and look at the earliest attestations of 'woman' in the northwest:

Tibetan transcriptions from c. a millennium ago (Coblin 1994: 156): ji, Hji, HjI**

Preinitial H- represents a homorganic nasal.

Khotanese Brahmi transcriptions (Coblin 1994: 156): jū, ś̮ī***

Tangut transcription

4706 2ju'3 'a name character' (also 'woman', a borrowing from Chinese)

also used to transcribe the first syllable of 'Jurchen'

Alas, I have not seen any Uyghur or Arabic transcription.

Those transcriptions point to something like *ndžy in early northwestern Chinese.

I use a non-IPA symbol ž to avoid committing to *ʐ, *ʒ, or *ʑ. I use j in my Tangut transcription to similarly avoid committing to *ndʐ, *ndʒ, or *ndʑ (though I think *ndʐ is most likely). I leave out prenasalization in my Tangut transcription as it is nonphonemic and possibly even optionally absent.

The ī ~ ū variation in Khotanese indicates a high vowel like or *y absent from Khotanese. I chose front rounded *y because the Tangut could have borrowed central as the vowel that I transcribe as y (following the convention of transliterating Russian ы as y).

I do not know why a Chinese word was transcribed and borrowed with the mysterious phonetic quality that I transcribe as 'prime' (-') in Tangut. If 'prime' was glottal stop or glottalization, it might correspond to the glottalization of Middle Chinese *ɳɨəˀ which might have survived into a later period. (3.10.0:31: Emmerick and Pulleyblank (1993: 56) think Khotanese transcriptions may reflect the late survival of glottalization.)

In early Tang, *ɳɨəˀ became something like *ɳɖɨəˀ and later *ɳɖjøˀ in the northwest. (I am assuming the glottalization survived even after *jø raised and fused to *y in *ndžyˀ.) I cannot tell whether the Japanese Kan-on reading jo from an Old Japanese reading ndiyə was borrowed from a northwestern *ɳɖɨəˀ or *ɳɖjøˀ; Old Japanese ə would have been the best approximation of if it was present in the source of Kan-on.

*Presumably mislabeled as literary at xiaoxue.iis.sinica.edu.tw; the more mainstream-looking form ȵy is more likely to be colloquial.

**Capital I transliterates the mysterious gigu inversé (reversed i) of Tibetan. See Hill (2010: 116).

***I think Coblin used ś̮ to transliterate the letter that Emmerick and Pulleyblank (1993: 55) transliterated as ś’ and interpreted as [ʒ].

3.10.0:30: The transcription ś̮ī is taken from nama ś̮ī, a transcription of the phrase 男女 *nam ndžy 'man and woman'. The prenasalization of *ndžy might have been difficult to hear after the nasal coda of the first syllable.

15.3.8:23:59: A DAY

Today is International Women's Day, so I thought I'd look at the development of 女 'woman' in Chinese languages.

I'll start in ... the middle. A generic Middle Chinese form might be *ɳɨəˀ:

Its retroflex initial goes back to Old Chinese *nr-.

Its diphthong goes back to Old Chinese *a which was raised due to the presence of a preceding high-vowel presyllable that was later lost:
*Cɯ-a > *Cɯ-ɨa > *Cɯ-ɨə

Its glottalization goes back to an Old Chinese final glottal stop *-ʔ.

There are two possible Old Chinese reconstructions:

*Cɯ-nraʔ with a presyllable whose initial consonant left no trace in Middle Chinese

*Rɯ-naʔ with a coronal-initial presyllable that lost its vowel, possibly became *r- before *n- (if *R- was a consonant other than *r- such as *l- or *t-), metathesized, and fused with *n:

*Rɯ-naʔ > *Rɯ-nɨaʔ > *Rnɨaʔ > *rnɨaʔ > *nrɨaʔ > *ɳɨaʔ > *ɳɨəˀ

I favor the latter reconstruction because some modern forms point to a simple root initial *n-:

Group\Initial m- t- d- nd- n- l- nz- z- ʐ- ɲ- = ȵ- ʔɲ- j- g- ŋ- Ø-

The diagnostic forms are not those with n-, which is from *nr- (in turn from either *Cɯ-nr- or *Rɯ-n-). They are in fact those with nz-, z-, and ʐ-, which are reflexes of *Cɯ-n- (and *C- could be *R-).

Forms such as

孝義 Xiaoyi nzu (colloquial*)

舒城 Shucheng Mandarin

石樓 Shilou Jin ʐu

may be from *nɨaʔ which lost its presyllable following the partial raising of *a and developed a palatal rather than a retroflex initial:

*Rɯ-naʔ > *Rɯ-nɨaʔ > *nɨaʔ > *ɲɨaʔ > *ɲɨəˀ > nz-/z-/ʐ-

The palatal nasal ɲ- (> j- > zero) may or may not be a retention of *ɲ- from *Cɯ-n-.

I don't know if the glottal stop in ʔɲ- is real or just a notational device like the one that Zee (2003: 131) rejected. I doubt it is a trace of a presyllable.

In some Min varieties, *ɲ- seems to have backed to ŋ- which may have hardened to g- in 沙縣 Shaxian (via an intermediate prenasalized stage: *ŋ- > *ŋg- > g-).

Elsewhere, the retroflex initial *ɳ- became n- (> l- or nd- > d- > t-?).

3.9.0:51: I forgot to mention the m-forms in Jin and northwestern Mandarin (Coblin 1994: 101-102, 156; the Jin forms are not listed at xiaoxue.iis.sinica.edu.tw). I thought the northwestern Mandarin forms could be substratal words related to Tangut m-words for females:

0092 1ma4 < *Cɯ-ma-C 'mother'

0960 1meq4 < *Sɯ-me 'young girl'

3168 1my'4 < *mi-ʔ 'woman'

3209 1my'4 < *mi-ʔ, first syllable of 1my'4 2ur4 'female servant'

3334 1ma4 < *Cɯ-ma-C 'female'

5162 1my4 < *mi 'mother'

Could the m-forms in Jin be borrowings from the northwest, or are they native?

*The literary form ny is a borrowing from a dialect with n- from *nr-.

