After two quick failed attempts to find nôm support for my account of rhotacism in Vietnamese, I looked up râu 'beard' in the five editions of The Tale of Kiều at nomna.org and was puzzled by three of the characters near it on line 628 of the 1866 edition:

Nôm 毛+各 髟+娄 襖 (with 釆 simplified to 人) 衤+𢖫
Reading Mày râu nhẵn nhụi áo quần bảnh bao
Gloss brow beard smooth clothes elegant

毛+各 mày 'brow' has the semantic element 毛 mao 'hair' plus 各 các 'each' whose function is unknown. Is 各 an error for 眉 'brow'? Later editions of Kiều have 眉 'brow' or 毛 'hair' + 眉 'brow'. 眉 is read as mi in Sino-Vietnamese; mày is an earlier borrowing.

隊 normally represents Sino-Vietnamese đội < *tọj 'squad'. Why was it used for nhụi [ɲuj] in the 1866 and later editions? Is the nh- of the reduplicative word nhẵn nhụi from a pre-17th century cluster with a *t-like component? (10.27.9:57: Or did the initial of an earlier *tụi assimilate to the initial of nhẵn?)

Quần is written as 裙 (< 衤 'clothes' + 君 quân) in later editions but has the curious right-hand element 𢖫 (a variant of 忍 nhẵn that normally represents Sino-Vietnamese nhẫn 'endure' and is also the nin in Japanese 忍者 ninja) in the 1866 edition. I suspect 𢖫 is an error influenced by 忍 earlier in the line. Does it occur anywhere else? (10.27.9:41: I can't find it using the Nôm Lookup Tool.)

10.27.9:54: Although nôm only recently became extinct, its characters can be surprisingly opaque in structure. This does not necessarily mean that whoever transcribed the 1866 edition of Kiều knew structural principles that were lost just 147 years later, as some of the characters are clearly carryovers from a time when Vietnamese phonology was very different: e.g., they indicate pre-17th century clusters. Yet others may be of recent origin. The challenges of nôm make the mysteries of tangraphy all the more daunting.

NÔM EVIDENCE FOR VIETNAMESE RHOTACISM: RÂU 'BEARD'

nomna.org lists eight nôm characters for râu 'beard' (whose r- is from *CVs-; cf. răng 'tooth' from my previous entry). Each character has two components:

Left component Function Right component Function
?: vũ 'rain' phonetic: tu < *su
semantic: 'hair'
phonetic: do < *jɔ
phonetic: du < *ju
phonetic: lâu
phonetic for Chinese (!): hồ
not visible; 9 strokes - perhaps 须?

I don't know what 雨 vũ 'rain' is doing in a character representing a word that neither sounds like vũ nor has anything to do with rain. Is 雨 a distortion of 髟 'hair'? The two radicals look very different, but I can't come up with a better explanation. Regarding 雨 as a semantic element because the many hairs of a beard are like the many drops of rain is a stretch.

Once again, I'm disappointed that none of the phonetics reflect earlier initials like *ʐ-, perhaps because most, if not all, of them are of recent origin. The l-phonetic graphs must postdate the shift of *ʐ- to r-.

The d-phonetic graphs may have been created by speakers of dialects with d/r-merger: e.g., Hanoi in which d- and r- are both [z].

It is not possible to date the use of 鬍 hồ 'beard' for an unrelated Vietnamese synonym râu 'beard'.

It is possible that 髟+湏 for râu 'beard' may be a variant of 鬚 tu 'beard' used to write an unrelated Vietnamese synonym.

It is even possible that râu and tu are related words. I reconstruct the Old Chinese source of Sino-Vietnamese tu as *sɯ-no which became Late Middle Chinese *su, the source of Sino-Vietnamese tu. Suppose that *sɯ-no compressed into *so in a southern Old Chinese dialect. In fact Peiros' database has a Muong dialect labeled as "ANNG" with so for 'beard'. Ferlus (1982) cites a Muong form tho (cf. Barker 1966's Mường Khến thăng 'tooth' corresponding to Vietnamese răng 'id.').

Those Muong varieties may have dropped a presyllable whose vowel conditioned the voicing of a following *-s- in Vietnamese:

*CVs- > *CVz- > *Cz- > *Cʐ- > *ʐ- > r-
I don't know when *-o broke to -âu relative to those shifts.

That Chinese loanword (*CV-)so could have replaced a native word with descendants such as Ruc təmɨːɲ (Phu 1998) surviving elsewhere in Vietic.

It is remotely possible that 髟+湏 dates from a period when 'beard' was still *CV-so in early Vietnamese and 鬚 was still pronounced *su in Sino-Vietnamese, but I doubt it. It and its distortion 雨+須 are probably undatable semantic loans like 鬍 (see above). Those who chose to write the Vietnamese word for 'beard' with 鬚-like graphs may have been unaware of its etymological connection to 鬚.

The phonetic that I can't see might be 须 which is now the simplified form of 須 used in the PRC. I am reluctant to consider a slightly abbreviated form of 鬚 as a separate character.

NÔM EVIDENCE FOR VIETNAMESE RHOTACISM: RĂNG 'TOOTH'

nomna.org lists eight nôm characters for răng 'tooth' (whose r- is from *kVs-). Each has two components:

Left component Function Right component Function
semantic: 'tooth' phonetic: lân
semantic: 'tooth' phonetic: lăng (?)
齿 semantic: 'tooth' phonetic: lăng (?)
semantic: 'tooth' phonetic: lăng (?)
semantic: 'flesh' phonetic: lăng (?)
semantic: 'flesh' semantic: 'tooth'
phonetic: lăng (?) semantic: 'tooth'
phonetic: lăng (?) semantic: 'many'

Only one character is a semantic compound (月 'flesh' + 齒 'tooth').

月 'flesh' is not a semantic component normally found in characters for dental morphemes.

Even odder is the semantic component 多 'many' presumably referring to many teeth.

Modern nôm dictionaries list characters with components like 齿 which are identical to simplified forms now used in the PRC. Were such components actually in use in premodern Vietnam or are they anachronisms?

(10.24.23:52: nomna.org lists a variant of 齒 with 㐅 corresponding to 人 in 齿. I have not seen that variant in Chinese, Japanese, or Korean. Is it uniquely Vietnamese? Did that variant also appear as a component in nôm characters?)

Both phonetic elements (粦, 夌) have initial l-.

I don't know what the Sino-Vietnamese reading of 夌 is, but it is most likely lăng since it is an alternate, simpler spelling of 陵 lăng. There is no doubt that it was chosen for răng 'tooth' because it is a phonetic in characters with the Sino-Vietnamese reading lăng: e.g., 凌菱綾, etc.  r- is not a possible initial in Sino-Vietnamese, so l- was the best available approximation of Vietnamese r-. However, I suspect that other phonetics were chosen for this word before its initial shifted to r-. Unfortunately,

the Nôm script was reformed in the early 19th century, and existing Nôm literary classics were re-transcribed in the new reformed script. Nearly all of the original manuscripts were subsequently lost. Existing dictionaries [e.g., the online dictionary at nomna.org] are based on the new reformed script. Some older Nôm writing survives in missionary archives, dating from the 17th century, and there may well be vernacular materials circulating in some localities in which the script still represents pre-reform writing, but this has yet to be investigated. (Holm 2013: 77)

Did răng 'tooth' have earlier spellings with phonetics indicating the initials I reconstructed for its earlier stages: e.g., *gz-, *gʐ-, *ʐ-, etc.? Do any of those earlier spellings survive?

The character with 粦 lân as its phonetic may have been created by a speaker of a central or southern dialect in which *-n became [ŋ].

VIETNAMESE RHOTACISM REVISITED

In his paper on the Arapaho sound change *s- > n-, Guillaume Jacques wrote,

Uncontroversial examples of rhotacism in word-initial position are rare. Vietnamese constitutes however a probable one, as shown by Ferlus (1982):

*ks 'tooth' > *z > răng

Proto-Viet-Muong *Cs-initial clusters become r- in modern Vietnamese, probably through a spirantized intermediate stage *z.

I am accustomed to initial consonants devoicing (e.g., *g > k in Vietnamese) and generally don't understand why the reverse occurs (e.g., *s > [z] in Dutch and German), so I'd rather avoid the latter if I can. Instead of voiceless *ks becoming voiced *z, I propose that the voicing of r- is due to intervocalic lenition:

*kVs 'tooth' (cf. Thavung kasâŋ 'id.') > *kVz > răng

I also reconstruct such lenition in Tangut (though medial *-z- became initial z- - phonetically [ɮ]? - instead of r-).

Compare the development of *kVs- with that of clusters:

Proto-Viet-Muong *kVs- *ks- *kr- *gr-
Initial devoicing (after tone split*) *kr-
Intervocalic voicing *kVz-
Presyllable vowel loss *kz-
Cluster voicing assimilation *gz- *kr̥-
Sibilantization *ks- *kʂ-
Retroflexion *gʐ- *kʂ-
Cluster reduction *ʐ- *ʂ-
Rhotacism r- s- [ʂ]
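
The *kVs- column above can be replayed as an ordered list of rewrite rules. This is only a minimal sketch: the regular expressions and the names RULES and derive are illustrative assumptions, V stands for the unspecified presyllable vowel, and only the onset is modelled.

```python
import re

# Ordered rules for the *kVs- column only. The regex encoding is an
# illustrative assumption, not part of the reconstruction itself.
RULES = [
    ("intervocalic voicing", r"(?<=V)s", "z"),        # *kVs- > *kVz-
    ("presyllable vowel loss", r"V", ""),             # *kVz- > *kz-
    ("cluster voicing assimilation", r"k(?=z)", "g"), # *kz-  > *gz-
    ("retroflexion", r"(?<=g)z", "ʐ"),                # *gz-  > *gʐ-
    ("cluster reduction", r"g(?=ʐ)", ""),             # *gʐ-  > *ʐ-
    ("rhotacism", r"ʐ", "r"),                         # *ʐ-   > r-
]

def derive(onset):
    """Return the onset at every stage, applying the rules in order."""
    stages = [onset]
    for _name, pattern, replacement in RULES:
        stages.append(re.sub(pattern, replacement, stages[-1]))
    return stages

print(" > ".join(derive("kVs")))
```

The order matters: voicing has to apply while the presyllable vowel is still there. Feeding the function *ks- leaves the onset unchanged, since the voiceless path to s- [ʂ] is not modelled in this sketch.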

I don't know of any Vietnamese words going back to *ks- as opposed to *kVs-, but sau [ʂaw] 'after' may be cognate to Written Khmer <krau> 'outside' and sầm [ʂəm] 'reverberating crash' may be cognate to Written Khmer <gram> 'sound of thunder' (Gage 1985: 506).

Retroflexion occurs in Sanskrit in which /k/ + /s/ becomes [kʂ]. Pulleyblank proposed a similar change in Old Chinese.

Mường Khến thăng 'tooth' (Barker 1966) may go back to *s- without a presyllable; it has the same initial as thắc 'hair' < *s- (cf. monosyllabic Thavung sɔ́k). *kr- in Mường Khến became kh-: e.g., khau 'behind'.

(10.24.0:24: Vietnamese r- and Mường Khến th- correspond to the 15th-16th century northern Vietnamese initial transcribed as  tʂʰ- and ʐ- in the Chinese transcription of Hua-Yi yiyu [Ferlus 1982: 91]. ʐ- is an exact match for my *ʐ-. Could tʂʰ- be a compression of *gʐ-?

*gʐ- > *kʂ- > *kʰʂ- > *tʂʰ-?

Mawo Qiang has kʰʂ- but not kʂ- [Sun 1981: 29].)

Vietnamese r- has become [z] in Hanoi. This [z] is probably not a retention of earlier *z.

*Syllables with voiced initial consonants developed different allotones which became phonemic after initial devoicing: e.g.,

Before tonogenesis /kra/ [kra] /gra/ [gra]
Tonogenesis /kra¹/ [kra¹] /gra¹/ [gra²]
Initial devoicing /kra²/ [kra²]
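
The two steps can be mimicked with a toy model. In this sketch the function names are hypothetical, only the *g > k devoicing mentioned above is encoded, and the tone numbers track the phonetic allotones rather than the phonemic analysis.

```python
# Only *g > k is taken from the text; other voiced stops are left out.
DEVOICE = {"g": "k"}

def tonogenesis(syllable):
    """Pair a syllable with allotone 2 after a voiced initial, else 1."""
    tone = 2 if syllable[0] in DEVOICE else 1
    return syllable, tone

def initial_devoicing(syllable, tone):
    """Devoice the initial consonant and keep the tone unchanged."""
    return DEVOICE.get(syllable[0], syllable[0]) + syllable[1:], tone

before = ["kra", "gra"]
after_tonogenesis = [tonogenesis(s) for s in before]
after_devoicing = [initial_devoicing(s, t) for s, t in after_tonogenesis]

print(after_tonogenesis)
print(after_devoicing)
```

After the second step the two syllables have the same segments and differ only in tone, so the tone can no longer be predicted from the onset.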

The locus of distinction between the two syllables shifted from onsets to vowels.

*S > N IN ARAPAHO AND S < *N IN KHITAN

Guillaume Jacques recently uploaded a paper on the Arapaho sound change *s- > n- which "is unparalleled in the world's languages". I won't give away how he bridges the gap between Proto-Algonquian *s- and Arapaho n- (see pages 51 and 54 for his two hypotheses), but I will mention that I was reminded of Khitan

<pu.is.iń> = [pusiɲ] 'lady'

from Liao Chinese 婦人 *fužin, ultimately from Old Chinese *bəʔ 'wife' and *nin 'person'.

How did Old Chinese *n end up as s in Khitan?

1. Old Chinese nonemphatic *n became Late Old Chinese *ɲ.

2. *ɲ partially denasalized to *ɲɟ in Northeastern Late Middle Chinese.

3. *ɲɟ weakened to *ɲʑ (as reconstructed by Karlgren; UPSID has no language with such a sound).

4. *ɲʑ simplified to *ʑ.

5. *ʑ became *ž (phonetically [ʒ] or [ʐ]?).

6. Khitan had no *ž, so Liao Chinese *ž was borrowed as s. (Less Khitanized borrowings have palatal ź for Liao Chinese *ž and may be newer.)


1. Is there any non-Chinese language in which nonemphatic dentals (e.g., *n) palatalized while nonemphatic alveolars (e.g., *s) did not? (*s eventually did palatalize to [ɕ] in Mandarin, but that was long after emphasis was lost.)

2. In Northwestern Late Middle Chinese, all nasals could become prenasalized stops, but in Northeastern Late Middle Chinese, only *ɲ became a prenasalized stop. Why was *ɲ more prone to partial denasalization? (Southern Late Middle Chinese preserved *ɲ.)

3. Why was *ɲɟ the only prenasalized stop that weakened to a prenasalized fricative? Contrast it with Northwestern Late Middle Chinese *mb which did not become *mβ, a sound absent from UPSID.

4. Why wasn't Liao Chinese *ž borrowed as Khitan palatal ś?

5. Was [ɕ] an allophone of Khitan /s/ in the vicinity of /i/?

6. Was the Khitan small script character <is> read as [si] or [ɕi] in certain environments? My guess is that it was

[is] or [iɕ] word-initially: e.g.,

<is> = [is] 'nine' (cf. Mongolian yisün 'id.')

<is.g.i> = [isgi] or [iɕgi] 'clan name transcribed in Chinese as 乙室己 *yišigi'

[is] after a Ci-character: e.g., <ci.is> = [cis] (cf. Mongolian cisu 'blood')

[si] or [ɕi] after a CV-character whose vowel is not i: e.g., <pu.is.iń> = [pusiɲ] or [puɕiɲ]
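
My guess about <is> amounts to a small context-dependent rule, which can be sketched as a function. The name read_is and its argument are hypothetical; the argument is the transliteration of the preceding character, the third case is taken to mean a preceding vowel other than i, and only the plain [s] variant of each reading is returned.

```python
def read_is(previous=None):
    """Guess the reading of <is> from the preceding character's
    transliteration (None when <is> is word-initial)."""
    if previous is None:
        return "is"   # word-initial, as in <is> 'nine'
    if previous.endswith("i"):
        return "is"   # after a Ci-character, as in <ci.is>
    return "si"       # after a CV-character with another vowel, as in <pu.is.iń>

print(read_is(), read_is("ci"), read_is("pu"))
```

The sketch only picks a reading for the character itself. How the i of <ci.is> overlaps with the preceding vowel to give [cis] is left out.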

Similar rules may have applied to the Khitan small script character <iń>, which might have been read as [ɲi] or [iɲ] depending on its environment. I was reluctant to read it as [iɲ] in final position since final [ɲ] is absent from languages of the region (Mongolian, Manchu, Korean). However, if it was [in], how would it be different from <in>?

According to Kane (2009: 37), <in> was "[o]nly used in transcriptions from Chinese". I would add that <in> also represents the genitive suffix of Chinese loanwords ending in -i. This implies that final [in] was only in Chinese loanwords with or without inflection*, whereas final [iɲ] was in native Khitan words and heavily Khitanized Chinese loanwords like [pusiɲ] ~ [puɕiɲ]. I suspect final [iɲ] is from an earlier *-in since <iń> is a genitive ending like <an>, <en>, <on>, <un>, and <n> which all end in [n].

*10.23.0:32: I interpret genitives of Chinese loanwords ending in -i as stem + [n] sequences: e.g., <xoŋ di.in> 'emperor.GEN' was [xoŋdin], not *[xoŋdiːn] with a long vowel.

Interestingly, the genitive of the -final native word <śa.rí> 'court attendant' is <śa.rí.en> with <en> rather than <in> or <iń>. The -i final loanword for 'emperor' could also take the genitive ending <en>: <xoŋ di.en>.

TRACING TAKERU (PART 2)

In my 2003 paper, I traced the root of Old Japanese take-r-u 'one who is fierce' back to *takia- and proposed a chain shift:

- pre-Old Japanese *e rose to Old Japanese i
- this left a gap that was filled by *ia (monophthongized to *ɛ?) which became Old Japanese e

In short:

*ia > *ɛ? > *e > i

However, I now agree with Unger (2009: 50) that this chain shift is incorrect, though for a different reason: not all Proto-Japonic *e rose to Old Japanese i. Hence there was no gap for *ia (monophthongized to *ɛ?) to fill (though obviously the frequency of e was reduced by the raising).

Pre-Old Japanese *e was preserved in word-final position in Central Old Japanese (Frellesvig & Whitman 2008: 22, 26): e.g., *ipe 'house' became Western, Northeastern, and Central Eastern* Old Japanese ipe, not *ipi (though the final vowel did raise in Southeastern Old Japanese ipi 'id.').

According to Frellesvig & Whitman (2008: 22), "dialects/varieties [of Old Japanese] differed in the domain (word, morpheme, root) with reference to which 'final' was defined." On p. 26, they contrast Western Old Japanese kape-r- < *kape-r-  'to return' with Eastern Old Japanese kapi-r- < *kape-r- 'id.' *e did not raise in morpheme-final position in Western Old Japanese kape-r-, whereas it did raise in Eastern Old Japanese kapi-r-.

Just as Western Old Japanese kape-r- goes back to *kape-r-, perhaps Western Old Japanese take-r- goes back to *take-r-.

I propose that all native Japonic roots (e.g., *take- 'fierce') can only contain (C)V syllables. Hence all native Proto-Japonic vowel sequences only occurred at morpheme boundaries: e.g., *taka-i 'bamboo'. On the other hand, early loanwords may have had diphthongs: e.g., *mumə/ai 'plum' (possibly a loan from Middle Chinese 梅 *məj 'id.', though the first syllable is unexplained; see section 3.3 of Vovin 2005 which argued that the word is not Chinese). I regard the *-ə/ai of 'plum' as a diphthong since there is no *-i-less bound form that I would expect if the word were *mumə/a-i.

*But how can Northeastern and Central Eastern Old Japanese ipa- 'house' be derived from *ipe? Maybe this word does have to be reconstructed with *-ia after all, contrary to my proposed constraint above. *ia monophthongized to -a in the east and -e in the west. If 'house' once had *-ia, does it violate the constraint because it's a Koreanic loan as proposed by Vovin (2009: 172)? (However, there is no Koreanic-internal evidence for reconstructing final *-ia in the ancestor of Korean chip 'house'.) And is the free form ipe really from *ipia-i, just as other free forms consist of a bound form plus *-i?

See Russell (2009) for the Eastern Old Japanese ipV-forms of 'house' and Vovin (2009: 170-172) for further discussion of its origins.

TRACING TAKERU (PART 1)

In a critique of my 2003 paper, Unger (2009) pointed out two chronological contradictions:

- I claimed that pre-Old Japanese *ia became Old Japanese secondary e after pre-Old Japanese primary *e became Old Japanese i, but reconstructed secondary e in a word dating before *e became i.

- I claimed that pre-Old Japanese *au became Old Japanese secondary o after pre-Old Japanese primary *o became Old Japanese u, but reconstructed secondary o in a word dating before *o became u.

However, I did not contradict myself twice. Although I was wrong to claim that pre-Old Japanese *au became Old Japanese secondary o for the reasons I stated in my last two posts, Unger seems to have misunderstood what I wrote. According to his book,  I reconstructed pre-Old Japanese 'one who is fierce' as

*take-ro < Proto-Japonic *takia-ra-u

with secondary *e and *o (from *ia and *au), but in fact I reconstructed it as

*takɛr-o < Proto-Japonic *takiar-o

with *ɛ (not secondary *e) and primary (not secondary) *o and different (and incomplete*) segmentation. This word became Old Japanese takeru after raising.

Here is how I thought mid vowels developed between pre-Old Japanese and Old Japanese (including my erroneous monophthongization of *au; see tables 8 and 9 of my paper):

Proto-Japonic Pre-Old Japanese Old Japanese
primary *e i (unless raising was blocked)
*ia *ia or *ɛ? secondary e
primary *o u (unless raising was blocked)
*au *au or *ɔ? secondary o

The last line is wrong, but there is no internal contradiction.

10.21.0:41: Although Unger (2009: 50) wrote that "Miyake believes [the pre-Old Japanese attributive ending *-ro of 'one who is fierce'] goes back to pJ [Proto-Japonic] *ra-u (84, 123)", I did not write any such thing on pages 84 and 123 of my article.

First, the attributive ending of *takɛr-o is *-o, not *-ro. *-r- is a verb-deriving suffix that Unger (1993: 115) reconstructed as *-re- in the verb underlying 'one who is fierce'. (Attributive forms of verbs can function as nouns. See section of Vovin's A Descriptive and Comparative Grammar of Western Old Japanese.)

Second, on page 84, I did incorrectly claim that Proto-Japonic *au became Old Japanese secondary o, but did not mention the attributive ending which I reconstructed as Proto-Japonic primary *-o (not *ra-u!).

Third, on page 123, I wrote,

the [Middle Chinese] *-ɔ or *-o reading that was current for 鹵 [the character used to transcribe the final syllable of pre-Old Japanese *takɛr-o] confirms the PJ [Proto-Japonic] attributive suffix [primary] *-o [not *ra-u!]  reconstructed by Thorpe (1983).

It is ironic that Unger criticized me for errors that I did not make, while not noting the error that I did make (having overlooked a passage in his 1993 book about the lack of evidence for *au as a source of Old Japanese secondary o).

He could have also taken me to task for not indicating a morpheme boundary between the root *takɛ and the verb-deriving suffix *-r- of *takɛ-r-o, but he didn't, perhaps because he was focusing on the vowels.

