I also made an *au-kward error on p. 548 of my 1999 PhD dissertation (emphasis mine),

Two possible sources for OJ [Old Japanese] wo [= A-type o] are pre-OJ *ua and *uə [...] Two other possible sources for OJ wo are pre-OJ *au and *əu.

(I am relieved that I cannot find any similar passage in my 2003 book.)

Martin (1987: 58) wrote,

Starostin ([1975:] 279) believes [Old Japanese] Cwo [= A-type o] may have come also from Cau in some instances, since the phonogram [高] HIGH koo < kɔɔ < kau < [MC = Middle Chinese] .kaw* is used as well as the phonogram [古] OLD KWO < [MC] ˙ku/˙wo** [to write OJ A-type o]. But there seems to be no particular correlation found with the pertinent vocabulary items.

I agree with Martin. Take the Old Japanese place name KoAsi, for instance. Its spellings include both HIGH and OLD:

高志 MC *kaw tɕɨʰ

古之 MC *koˀ tɕɨ

Similarly, Old Japanese toA 'door'/'gate' was written with Middle Chinese *-aw, *-əw, and *-o graphs:

刀 MC *taw 'knife'

斗 MC *təwˀ < Late Old Chinese *toʔ 'ladle'

吐 MC *thoˀ 'to spit'

And Old Japanese moA was written as 毛 Middle Chinese *maw 'hair'.

Are these spellings evidence for *Vu as a source of Old Japanese type A o - or [Vw] as a variant pronunciation of Old Japanese A-type o? No and no.

First, most transcription evidence points to [o] as the pronunciation of Old Japanese A-type o. Why would o have an occasional variant pronunciation [aw] after k-, t-, and m- - a set of consonants with nothing in common - but not after other consonants?

Second, early Japanese writing, like Vietnamese quốc ngữ romanization, reflected a mishmash of different influences. Quốc ngữ combines Italian gi-, Portuguese nh-, s-, x-, and circumflexes for higher vowels, and innovative elements without European precedents that I know of (e.g., ư and ơ). It would be a mistake to try to read Vietnamese as if it were Italian or Portuguese. Similarly, it would be a mistake to try to read Old Japanese texts as if they were in Middle Chinese. Old Japanese writing combined existing Korean peninsular writing conventions (themselves based on earlier Chinese transcriptional conventions) with new, uniquely Japanese writing conventions based on different dialects of Middle Chinese from different places and periods.

The *-aw phonograms are carryovers from Korean peninsular writing. They are read with -o in Sino-Korean and -ow in the prescriptive Sino-Korean of the 15th century. There is no evidence for *Vu-type diphthongs or *Vw at any earlier stage of Korean. Hence they probably stood for Co-syllables in Korean peninsular writing and that usage was imported into Japan.

The *-əw phonogram 斗 'ladle' was also a carryover from Korean peninsular writing. It might have been chosen as a symbol for *to when it was still pronounced *toʔ in Chinese, and that usage may have persisted on the peninsula and later even in Japan long after *toʔ became *təwˀ in Chinese. (Sino-Korean tu and 15th century prescriptive Sino-Korean tuw reflect Late Middle Chinese *tə́w. -u(w) is the closest Korean equivalent of *-əw since Korean lacks *əu-type diphthongs and *əw.)

*10.20.0:51: Martin (1987) used a period at the lower left of Middle Chinese syllables to indicate the 'level tone'.

**10.20.1:21: Martin (1987) used a period at the upper left of Middle Chinese syllables to indicate the 'rising tone'. I don't know why he wrote ˙ku/˙wo instead of the ˙kwo or ˙kuo I would expect as his equivalent of Karlgren's Middle Chinese *ʿkuo.

10.20.1:57: ADDENDUM: I restricted myself above to discussing the three A-type syllables with Middle Chinese *-aw phonograms from Martin (1987: 58), but similar arguments also apply to these phonograms:

for OJ soA:

嗽 MC *səwʰ < Late Old Chinese *soh 'to cough'

for OJ roA:

樓 MC *ləw < Late Old Chinese *lo 'storied building'

漏 MC *ləwʰ < Late Old Chinese *loh 'to leak'

The last two are carryovers from Korean peninsular writing, and the first is similar to the homophonous Korean peninsular phonogram 漱 'to gargle'. So they are not evidence for Old Japanese *səw and *ləw. 漱, 樓, and 漏 were chosen as phonograms for *so, *Ro, and *Ro on the peninsula at a time when they were still pronounced with *-o in Chinese. Later those phonograms were exported to Japan, where they retained their sound values even after *-o broke to *-əw in Chinese. (10.20.10:12: There was no obligation to update orthographies to keep up with sound changes in Chinese.)


Today I got a copy of Unger's (2009) The Role of Contact in the Origins of the Japanese and Korean Languages which contained a critique of my 2003 paper. His book reminded me about a chain shift hypothesis that I had proposed and forgotten.

Here's part of it: pre-Old Japanese *o raised to Old Japanese u, leaving a gap that was filled by pre-Old Japanese *au which monophthongized into Old Japanese o. There is no doubt about the first part. But the second part is wrong.

The earliest strata of Sino-Japanese (collectively called 'Go-on') have both -u and -o corresponding to Middle Chinese *-o. Early Go-on readings* underwent raising whereas later Go-on readings did not: e.g.,

| Sinograph | Middle Chinese | Pre-Old Japanese | Old Japanese |
|---|---|---|---|
| | *khoʔ | *ko | ku (preraising borrowing) |
| | *koʔ | (not yet borrowed into Japanese) | ko (postraising borrowing) |
| | *lo | *ro | ru (preraising borrowing) |
| | | | ro (postraising borrowing) |

If pre-Old Japanese *au really monophthongized after raising, Middle Chinese *-aw should correspond to early Go-on -o < *-au and late Go-on -au. Yet it always corresponds to Go-on -au except after labials where it corresponds to Go-on -ou (not *-o!). Middle Chinese *-aw was a common rhyme. It is unlikely that

a. Middle Chinese *-aw syllables were never borrowed before raising and monophthongization

b. early borrowings of Middle Chinese *-aw syllables were all replaced by postraising, post-monophthongization borrowings

c. borrowings of Middle Chinese *-aw syllables were all exempt from monophthongization

It is more likely that *au-monophthongization simply never occurred, so Middle Chinese *-aw was borrowed into Japanese as -au and remained as -au until monophthongizing to -ɔː in Late Middle Japanese and ultimately raising to [oː]. Unger (1993: 27) wrote that "there is no evidence to support Ono [Susumu]'s claim for */au/ and */ua/ [as the earliest reconstructible sources] for OJ [Old Japanese] A-type o**."
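The logic of that argument can be sketched as a toy model in Python. This is only an illustration under my own simplifying assumptions: it models rhymes alone, the function names are made up for this sketch, and the ordering of the two changes is the one claimed by the hypothesis being tested, not an established fact.

```python
# Toy model of the chain shift hypothesis rejected above.
# Only rhymes are modeled; all names here are illustrative assumptions.

def raise_o(rhyme):
    """Pre-Old Japanese *o > Old Japanese u."""
    return "u" if rhyme == "o" else rhyme

def monophthongize_au(rhyme):
    """Hypothetical pre-Old Japanese *au > Old Japanese o."""
    return "o" if rhyme == "au" else rhyme

# Order claimed by the hypothesis: raising first, then monophthongization.
CHANGES = [raise_o, monophthongize_au]

def predicted_reflex(borrowed_rhyme, changes_completed):
    """Apply only the changes that had not yet happened at borrowing time."""
    rhyme = borrowed_rhyme
    for change in CHANGES[changes_completed:]:
        rhyme = change(rhyme)
    return rhyme

if __name__ == "__main__":
    for source, rhyme in (("MC *-o", "o"), ("MC *-aw", "au")):
        reflexes = [predicted_reflex(rhyme, n) for n in range(len(CHANGES) + 1)]
        print(source, "->", reflexes)
```

Under the hypothesis, a Middle Chinese *-aw syllable borrowed before monophthongization should surface with -o, and only one borrowed afterwards should keep -au. The model gets the -u/-o split for Middle Chinese *-o right, but its predicted early -o for *-aw is exactly what Go-on fails to show.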

Frellesvig (2008: 176) and Frellesvig and Whitman (2008: 39) regarded pre-Old Japanese */ua/ (but not */au/) as a source of Old Japanese A-type o; the latter gave Old Japanese kanzopəy-*** 'to count' from *kansu 'number' + apa-i- 'join' as an example on p. 19. On the other hand, Unger (1993: 99, 114) reconstructed *kana-Nsuba-pe with *uba (not *ua) as the source of 'to count'.

*10.19.1:39: The readings in the table are taken from Numoto's 1995 index of readings in Ruiju myōgishō from c. 1100 AD. Taishūkan shin Kan-Wa jiten lists a reading ku for 古 that looks like a preraising borrowing, but I am not confident that it is a genuine reading as opposed to an artificial modern creation. Is there any attested word in which 古 is read ku?

**10.19.1:19: Modern Japanese o corresponds to two or three types of vocalic entities in Old Japanese:

Type Miller-Mathias notation My reconstruction Yale notation Unger, Frellesvig, Whitman
A ô o wo wo
B ö ə o o
indeterminate o o o

The exact value of type B o is open to interpretation - it could have been ɔ, ɤ, or ʌ - but in any case it was less labial than type A o.

Pre-Old Japanese *ua might have become *wo on the way to type A o, but I doubt that type A o itself was [wo] since I don't know of any Cwo-syllables in other Altaic-type languages and later stages of Japanese. Moreover, it would be strange if Old Japanese permitted Cwo but not Cwa in native words. (kwa and gwa are in Chinese loanwords.) Is there any attested language with Cwo but without Cwa?

***10.19.0:34: All Old Japanese forms are in my reconstruction: e.g., I rewrote Frellesvig and Whitman's kazwope- as kanzopəy-.

TURKIC TENS

Wiktionary derived Russian sorok 'forty'

from Turkic кърк (“40”) by dissimilation k–k > s–k; related to Turkish kırk.

Why was s- chosen to replace k- instead of, say, p- which is grave like k-? Is there a hierarchy of probable replacements for different kinds of dissimilation? Absolute predictions for replacements may not be possible. Old Chinese *-m dissimilated after labials to *-ŋ in Middle Chinese (e.g., 風 *puŋ < *pəm), whereas Middle Chinese *-m dissimilated after labials to *-n in Cantonese (e.g., 凡 faan < *faam; cf. Meixian Hakka fam). On the other hand, I doubt that P-m would dissimilate to P-ɲ or P-ɳ. Similarly, I doubt that k-k would dissimilate to, say, zh-k. But these are uninformed guesses. Has anyone done a study of dissimilation?

Russian unrounded high y would be a closer match for Turkic unrounded high ï than Russian rounded mid o is. So why isn't the Russian word *syrok? According to Trubachev, its Old East Slavic ancestor was *sŭrkŭ with a short high ŭ corresponding to Turkic high ï. The first ŭ strengthened to o, another o developed through pleophony, and the final ŭ was lost.

Róna-Tas (1998: 74) reconstructed the Proto-Turkic numeral for 'forty' as *kïrk. It bears no resemblance to 'four'. There is no single pattern for Proto-Turkic numerals from 'twenty' to 'ninety':

| Type | one | *bir | ten | *on |
|---|---|---|---|---|
| A: no relation to *on 'ten' or any other numeral | two | *èki | twenty | *yègirmi |
| | three | *üč | thirty | *otuz |
| | four | *tȫrt | forty | *kïrk |
| | five | *bēš | fifty | *elig |
| B: no relation to *on 'ten'; + suffix *-mIš | six | *altï | sixty | *alt-mïš |
| | seven | *yeti | seventy | *yet-miš |
| C: + *on 'ten' | eight | *sekiz | eighty | *sekiz-on |
| | nine | *tokuz | ninety | *tokuz-on |

*yègirmi 'twenty' looks vaguely like *èki 'two' plus the *-mIš of 'sixty' and 'seventy', but the resemblance may be coincidental. Note that 'twenty' is used to form the numerals for 11-19 in Old Turkic: e.g., bir yigirmi 'eleven' (lit. 'one twenty').
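The two transparent types in the table above (B and C) are regular enough to sketch in a few lines of Python. The rules below are my own reading of the four forms in the table, not anything stated by Róna-Tas: I assume that a unit-final high vowel drops before *-mIš and that the suffix vowel I is ï after a back-vowel stem and i after a front-vowel stem. The function names are hypothetical.

```python
import unicodedata

# Sketch of decade types B and C from the table above.
# Assumptions (mine): a unit-final high vowel drops before *-mIš, and the
# suffix vowel I is back ï after a back-vowel stem, front i otherwise.

I_BACK = "\u00ef"                                  # ï
BACK_VOWELS = {"a", "o", "u", I_BACK}
FRONT_VOWELS = {"e", "i", "\u00f6", "\u00fc"}      # e, i, ö, ü

def _clean(form):
    """Normalize to NFC and drop the reconstruction asterisk."""
    return unicodedata.normalize("NFC", form).lstrip("*")

def type_b(unit):
    """Unit stem + *-mIš, e.g. 'six' > 'sixty'."""
    stem = _clean(unit)
    if stem[-1] in ("i", I_BACK):
        stem = stem[:-1]                           # drop the final high vowel
    vowels = [ch for ch in stem if ch in BACK_VOWELS | FRONT_VOWELS]
    suffix_vowel = I_BACK if vowels and vowels[-1] in BACK_VOWELS else "i"
    return "*" + stem + "-m" + suffix_vowel + "\u0161"   # š

def type_c(unit):
    """Unit + *on 'ten', e.g. 'eight' > 'eighty'."""
    return "*" + _clean(unit) + "-on"

if __name__ == "__main__":
    for unit in ("*alt\u00ef", "*yeti"):
        print(unit, "->", type_b(unit))
    for unit in ("*sekiz", "*tokuz"):
        print(unit, "->", type_c(unit))
```

Nothing comparable can be written for type A, which is the point: 'twenty' through 'fifty' have to be listed one by one.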

How did such a composite system develop? Are 'eighty' and 'ninety' transparent because they are innovations that had not been subjected to obscuring changes? Could *-mIš be an earlier word for 'ten'?

Erdal (2004: 220) regarded *elig 'fifty' as "identical with the word 'hand'", but Clauson (1972: 141) regarded ellig 'fifty' ("certainly with double ll") as distinct from elig 'hand'.

Could the opaque terms for 'forty' etc. - and even many numerals in general - have originated as words for sets of a certain number of items?

OF FIREWOOD AND FUR

I forgot to mention last night that I thought Xiang 炮 [pʰau⁴⁵] might be somehow related to its near-homophone 包 'bundle' (Middle Chinese *pæw). The aspiration could reflect a prefix: *k-p- > *pk- > pʰ-. (I reconstruct that same sequence of changes in Korean and Tangut.)

Today it occurred to me that the selection of 炮 'to bake, roast' for [pʰau⁴⁵] 'ten' wasn't purely phonetic; 火 could imply that the 包 'bundles' were of firewood. Did a word [pʰau⁴⁵] for 'a bundle of ten sticks' come to mean 'ten'?

Such a semantic shift is parallel to 'a bunch of forty sable pelts' becoming Russian sórok 'forty'. (What is the origin of Russian sorók with the stress on the second syllable?)

I also forgot to ask if Xiang 炮 [pʰau⁴⁵] could be followed by 'one' through 'nine': e.g., is 'eleven' (一)炮一 [(i²⁴) pʰau⁴⁵ i²⁴]?

A PHAU-PLEXING NUMERAL

On Columbus Day yesterday, I learned that

Columbus never wrote in his native language, which is presumed to have been a Genoese variety of Ligurian (his name would translate in the 16th-century Genoese language as Christoffa Corombo pron. IPA: [kriˈʃtɔffa kuˈɹuŋbu]).

That got me thinking about Mao Zedong who wrote in Classical Chinese and Mandarin but not in his native language: the Shaoshan dialect of Xiang.

Last night I discovered Wu Yunji's (2005) A Synchronic and Diachronic Study of the Grammar of the Chinese Xiang Dialects and learned about

the word 炮 [pʰau⁴⁵] 'ten' (etymology unknown) [...] used as a cardinal number only. It can be combined only with 'one', such as in 一炮只客 [i²⁴ pʰau⁴⁵ tsa²⁴ kʰɤ⁴⁵] [one - ten - CL - guest] 'ten guests', but not in 'twenty guests'. Twenty guests would be 二十只客 [ɤ²¹ sɪ²⁴ tsa²⁴ kʰɤ⁴⁵], in which [sɪ²⁴] corresponds to Mandarin 十 shí 'ten'. The word [pʰau⁴⁵] can be used a criterion to identify the Xiang dialects [...] This word occurs in most of the localities of the Xiang dialect group but is rarely found in other localities.

What is the origin of this word? Can [pʰau⁴⁵] 'ten' precede a noun by itself, or must it always be preceded by [i²⁴] 'one'? [i²⁴ pʰau⁴⁵], literally 'one ten', makes me think [pʰau⁴⁵] was originally a word referring to a set of some sort. [pʰau⁴⁵] superficially resembles Proto-Austronesian *sa-puluq 'ten' and its descendants, but I think that's just a coincidence.

Where are the "other localities" that Wu referred to?

WU WITHOUT WO, JI WITHOUT JE?

In my previous entry, I proposed that Pittayaporn's (2009) Proto-Tai *wu "could be rewritten as *wo since he has no *wo in his reconstruction, and reflexes include ɔ and o as well as u." My unstated assumption was that a language does not have wu without wo (unless it lacks the vowel o). Similarly, I assumed that a language does not have ji without je (unless it lacks the vowel e).

But are these assumptions true? Are there languages with high and mid vowels which have glide-high vowel sequences (wu, ji) but lack glide-mid vowel sequences (wo, je)?

Gong's Tangut reconstruction had wu, wo, and ji but not je. In 2007, I rewrote his jij as jej to match the Tibetan and Sanskrit transcription evidence - the first step toward my own reconstruction of Tangut which has -ɨe, -ie (phonetically [ɰe] and [je]?; below), and -wo but not wu or ji.

| Tangut rhyme | Tibetan transcription | Transcribed Sanskrit | Gong | This site (2007) | This site now |
|---|---|---|---|---|---|
| 36 | -e | -e | -jij | -jej | -ɨe |
| 37 | -(y)e | | | | -ie |

(I have excluded uncommon Tibetan transcriptions.)
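The question I am asking of any given reconstruction can be phrased as a small check. The Python sketch below is a deliberately crude illustration: the inventories are reduced to the few glide-vowel sequences mentioned above, and the pairing of wu with wo and ji with je is my own framing of the assumption, not an established typological test.

```python
# Which glide + high vowel sequences lack a glide + mid vowel partner?
# The pairing and the toy inventories are simplifications for illustration.

HIGH_TO_MID = {"wu": "wo", "ji": "je"}

def unmatched_high_sequences(sequences):
    """Return the glide-high sequences whose glide-mid partner is absent."""
    return sorted(
        high for high, mid in HIGH_TO_MID.items()
        if high in sequences and mid not in sequences
    )

# Only the sequences discussed above are listed for each reconstruction.
gong_tangut = {"wu", "wo", "ji"}        # no je
my_tangut = {"ɨe", "ie", "wo"}          # no wu or ji
pittayaporn_proto_tai = {"wu"}          # *wu but no *wo

if __name__ == "__main__":
    print(unmatched_high_sequences(gong_tangut))
    print(unmatched_high_sequences(my_tangut))
    print(unmatched_high_sequences(pittayaporn_proto_tai))
```

Gong's Tangut is flagged for ji and Pittayaporn's Proto-Tai for wu, whereas my current Tangut reconstruction passes only because it has no glide-high sequences at all.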

I think glide-high vowel sequences are less favored than glide-mid vowel sequences because glides sound almost like high vowels.

Does any language have the glide-high vowel sequence ɰɨ?

10.15.0:36: According to Wikipedia, Guaraní has [ɰɨ] in gyresia [ɰɨˈɾe̞sia] 'Greece'. (Are country names not capitalized in Guaraní?) Guaraní even has a nasalized [ɰ̃] (written g̃) which isn't in UPSID.

Does any language without mid vowels have [ji] and [wu] (as opposed to allophones of /ji/ and /wu/ with lowered vowels: [jɪ wʊ])?

DYEING IN THE SOUTH: EVIDENCE FOR EARLIER SOUTHERN CHINESE *-OM

Pittayaporn (2009: 159) reconstructed Proto-Tai 'to dye' as *ɲwuːm C and referred to Vietnamese nhuộm 'id.' as evidence for *-w-. I think his *wu could be rewritten as *wo since he has no *wo in his reconstruction, and reflexes include ɔ and o as well as u. The word is from Chinese 染 which Baxter and Sagart (2011: 94) reconstructed as *C.n[a]mʔ in Old Chinese. (The brackets indicate that *a is not the only possible vowel.) Why do Tai and Vietnamese have rounded vowels in this word? There are two possibilities:

- the Old Chinese word was *Cɯ-nomʔ (in my reconstruction); its *-o- dissimilated to *-a- before *-m in mainstream Chinese but remained intact in the southern Chinese dialect(s) that are the sources of the Tai and Vietnamese words

- the Old Chinese word was *Cɯ-namʔ (in my reconstruction); its *-a- assimilated to *-o- before *-m in the southern Chinese dialect(s) that are the sources of the Tai and Vietnamese words but remained intact in mainstream Chinese

Another word of this type is 南 'south'. Baxter and Sagart (2011: 36) reconstructed it as *nˤ[ə]m in Old Chinese, but it has a rounded vowel in Vietnamese nồm 'southern wind' and nôm 'demotic' (i.e., 'Vietnamese' = 'southern'). William Boltz pointed out to me that 扶南 (now read Funan in Mandarin) might have been a transcription of an early Khmer *bnɔm (now Phnom as in Phnom Penh). Given the Vietnamese evidence for a rounded vowel in 'south', it is likely that 扶南 was read as something like *buo-nom in southern Chinese. But should such an *-om be projected back into Old Chinese?

My guess is no. If not for the above evidence, I would reconstruct 'dye' with *-am and 'south' with *-əm in Old Chinese. I think those two rhymes merged into *-om in the dialect(s) that were the sources of Proto-Tai *ɲwoːm C and Vietnamese nhuộm, nồm, and nôm.

The palatal initial and *-w-/-u- of Proto-Tai *ɲwoːm C and Vietnamese nhuộm reflect consonant and vowel shifts in nonemphatic syllables. *namʔ would have become emphatic if not for the high-voweled presyllable *Cɯ-:

*Cɯ-namʔ > *Cɯ-nɨamʔ > *nɨamʔ > *ɲɨamʔ > southern *ɲuomʔ, *ɲiemʔ elsewhere

Compare 'to dye' with 'south' which became emphatic because of its low-voweled presyllable:

*Cʌ-nəm > *nˁəm > *nəm > southern *nom, *nəm elsewhere

In my Old Chinese reconstruction, the height of the first vowel of a word determined the presence or absence of emphasis:

unstressed *ɯ, stressed *i, *ə, *u > [-emphasis]

unstressed *ʌ, stressed *e, *a, *o > [+emphasis]

Lower vowels bent upward in [-emphasis] syllables (e.g., *a > *ɨa > *ie and *o > *uo above), whereas higher vowels bent downward in [+emphasis] syllables (e.g., *i > *ei > southern *ai as in 雞 *kai 'chicken', borrowed into Proto-Tai as *kaj B [Pittayaporn 2009: 327]; emphatic *ə may have lowered and backed to stressed *[ʌ]).
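The emphasis rule itself is simple enough to state as a lookup. Here is a minimal Python sketch of it, assuming that forms are written as in this post (an optional presyllable such as *Cɯ- or *Cʌ- followed by the main syllable) and that the first vowel symbol in the string is the one that matters. The function names are hypothetical, and the sketch covers only the assignment of emphasis, not the later vowel bending.

```python
# Emphasis assignment in my Old Chinese reconstruction, as stated above:
# the height of the first vowel of the word decides the outcome.
# Presyllable vowels (unstressed ɯ, ʌ) and main vowels use distinct symbols,
# so one lookup over the first vowel symbol is enough for this sketch.

NONEMPHATIC_VOWELS = {"ɯ", "i", "ə", "u"}   # higher vowels
EMPHATIC_VOWELS = {"ʌ", "e", "a", "o"}      # lower vowels

def first_vowel(form):
    """Return the first vowel symbol in a reconstructed form."""
    for ch in form:
        if ch in NONEMPHATIC_VOWELS or ch in EMPHATIC_VOWELS:
            return ch
    raise ValueError("no vowel found in " + repr(form))

def is_emphatic(form):
    """True if the word would develop emphasis."""
    return first_vowel(form) in EMPHATIC_VOWELS

if __name__ == "__main__":
    for form in ("*Cɯ-namʔ", "*namʔ", "*Cʌ-nəm"):
        print(form, "emphatic" if is_emphatic(form) else "nonemphatic")
```

The three test forms show why the presyllable matters: *namʔ on its own would be emphatic, *Cɯ-namʔ 'to dye' is not, and *Cʌ-nəm 'south' is emphatic even though its main vowel *ə belongs to the higher class.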

I reconstruct similar vowel classes for Tangut which underwent nearly identical bendings, though there is no evidence for emphasis in Tangut. These changes in Tangut were probably due to intense contact with Chinese implied by a large number of Chinese loanwords in Tangut.

*10.13.23:20: ADDENDUM: The word 染 is of particular interest to me since its character transcribed the Xianbei word for 'Chinese' which may be ancestral to Manchu nikan 'Chinese'.

