I've seen this Proto-Celtic word list before, but I didn't notice voiced aspirates in it until now:

*mori-steigh-(e/o-) 'sea'

*men-n-dh-e/o- (?) 'want'*

*ati-od-bher-to- (?) 'sacrifice'

Are those pre-Proto-Celtic forms? I thought Proto-Celtic lost aspiration in voiced consonants:

Proto-Indo-European *gh *dh *bh > Proto-Celtic *g *d *b
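
The correspondence above is simple enough to apply mechanically. Here is a minimal sketch that treats the aspirates as the ASCII digraphs used in the word list; the function name and mapping table are my own illustrative assumptions, and the outputs are what the rule predicts, not attested reconstructions.

```python
# Hypothetical helper: applies only the one correspondence stated above,
# PIE *gh *dh *bh > Proto-Celtic *g *d *b.
DEASPIRATION = {"gh": "g", "dh": "d", "bh": "b"}


def deaspirate(form: str) -> str:
    """Replace each voiced aspirate digraph with its plain voiced stop."""
    for aspirate, plain in DEASPIRATION.items():
        form = form.replace(aspirate, plain)
    return form


# The three forms from the word list that still show voiced aspirates.
forms = ["*mori-steigh-(e/o-)", "*men-n-dh-e/o-", "*ati-od-bher-to-"]
for form in forms:
    print(form, ">", deaspirate(form))
```

If the listed forms were truly Proto-Celtic, I would expect them to look like the right-hand side of each printed line.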

*5.14.0:42: This reminds me of Avestan mazdā- 'wisdom' < *mn̥s-dheʔ 'mind-place', though the first root is in the e-grade in Celtic.

CHU AND KRA-DAI (PART 2)

Here's my attempt to reconstruct the Old Chinese (OC) phonetic series of 楚 (Schuessler 2009 series 1-62, Karlgren 1957's series 88 plus 90) to make it fit Chamberlain's (2016) hypothesis from part 1.

The series has five types of Early Middle Chinese (EMC) readings (ignoring final consonants):

I. *sɨə- < *Cɯ-sa- (*kɯ-sa-?) (胥湑稰諝糈壻婿)

II. *ʂɨə- < *kɯ-sa- (疋疏蔬梳糈)

III. *tʂʰɨə- < *kʂʰɨa- < *kɯ-sa- (楚 only)

IV. *ŋæ- < *ŋgʐa- < *N-k-sa- (alternate reading of 疋 only)

V. *sej < *se (alternate reading of 壻婿 only)

The high-vowel presyllables of types I-III conditioned medial *-ɨ- which in turn conditioned the raising of *a to *ə.

The high-vowel presyllables of type I were lost after conditioning medial *-ɨ-, but those of types II and III fused with *s. *kɯ-s- that fused early became EMC *ʂ- via *kʂ-; *kɯ-s- that fused late became EMC *tʂʰɨə- via *kʂʰ-.

Type III *kʂʰɨaʔ might have approximated an early Kra-Dai *kraʔ, especially if it were phonetically something like [kʁaʔ].

(5.12.0:56: Or if 'Kra' were [kʐaʔ]. Cf. Polish krz [kʂ] from *kʐ- < *krʲ-. Pittayaporn 2009: 99 reconstructed *ks- as a Proto-Tai source of Proto-Southwestern Tai [and hence Siamese] *kʰr-, though he does not list any examples of Proto-Tai *ks-, and he reconstructed the Proto-Tai cognate of 'Kra' as *kraː C 'slave' with *kr- rather than *ks-. Siamese kʰaː C1 'slave' lacks the -r- that would point to medial *-s-. If *ks- became Siamese kʰr-, perhaps *kz- became *kr- and then Siamese kʰ-.)

*N- fused with *k- to form the *ŋ- of type IV.

(5.12.0:11: OC *a fronted to *æ after retroflexes.)

The *-e rhyme of type V is anomalous and unique to 壻~婿 'son-in-law'; it cannot be reconciled with the *-a rhyme of the other types.
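
To keep the five derivations straight, here is a minimal sketch that stores each chain as a list of stages (earliest first) and checks which types start from an *-a rhyme. The stages are copied from the list of types above; the dictionary and function names are hypothetical conveniences, not anyone's published notation.

```python
# Stages copied from the five types above, earliest stage first.
DERIVATIONS = {
    "I": ["*Cɯ-sa-", "*sɨə-"],
    "II": ["*kɯ-sa-", "*ʂɨə-"],
    "III": ["*kɯ-sa-", "*kʂʰɨa-", "*tʂʰɨə-"],
    "IV": ["*N-k-sa-", "*ŋgʐa-", "*ŋæ-"],
    "V": ["*se", "*sej"],
}


def show(reading_type: str) -> str:
    """Render one derivation as 'earliest > ... > EMC'."""
    return " > ".join(DERIVATIONS[reading_type])


def starts_with_a_rhyme(reading_type: str) -> bool:
    """True if the earliest stage ends in *-a (ignoring the trailing hyphen)."""
    return DERIVATIONS[reading_type][0].rstrip("-").endswith("a")


for reading_type in DERIVATIONS:
    print(reading_type, show(reading_type), starts_with_a_rhyme(reading_type))
```

Only type V fails the *-a check, which is just the anomaly described above restated as data.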

5.12.1:03: Added all examples of each type listed in Schuessler (2009: 59), plus 疋 as the sole example of type IV, which Schuessler did not list.

CHU AND KRA-DAI (PART 1)

Chamberlain (2016) proposed that the name of the state now known as 楚 Chǔ in Mandarin is the same name as Kra as in Kra-Dai. This is an ingenious idea. But does it really work?

The rhymes certainly match. 楚 ended in *-aʔ in Old Chinese, and 'Kra' in Proto-Kra-Dai was something like *kraʔ (cf. Ostapirat's Proto-Kra *kra C 'Kra' and Pittayaporn's Proto-Tai *kraː C 'slave'; I interpret the C tone category as *-ʔ like Norquest 2016).

The trouble is the initial. If 楚 had initial *kr- in Old Chinese, it would have become Early Middle Chinese †*kæʔ and Mandarin †jiǎ. But instead it became Early Middle Chinese *tʂʰɨəʔ and Mandarin chǔ [tʂʰu] with aspirated retroflex initials.

Can those initials be reconciled?

Pulleyblank (1962: 129) proposed that Old Chinese *skʰ- might have become Early Middle Chinese *tʂʰ-. Later, Pulleyblank (1965: 206) proposed Old Chinese *kʰs- as a source of Early Middle Chinese *tʂʰ-. But there is no *s in Proto-Kra-Dai *kraʔ. *s- is likely to have been in the Old Chinese reading of 楚 since nearly all readings of characters in the 疋 phonetic series began with *ʂ- or *s- in Early Middle Chinese. There is no evidence on the Chinese side directly pointing to *k- in 楚 or any other member of the 疋 phonetic series, though 疋 does have another Early Middle Chinese reading *ŋæʔ which could mechanically be derived from an Old Chinese *ŋraʔ - close to *kraʔ but with a velar nasal rather than a stop.

Next: How can I make Chamberlain's idea work?

5.11.11:56: Added reference to Pulleyblank (1965) and link to Pulleyblank (1962).

ARMENIAN, KOREAN, AND BURMESE APPROACHES TO KHITAN OBSTRUENTS

In my last entry, I wrote,

the Khitan transcribed Liao Chinese *t as both <t> and <d>

There are similar inconsistencies with other obstruents and to a lesser extent even in the spelling of native Khitan words: e.g., 'second' is spelled with both 162 <c> and 104 <dz> (Kane 2009: 115).

I originally thought that Liao Chinese and Khitan had different obstruent systems: e.g., LC had an unaspirated : aspirated distinction whereas Khitan had a voicing distinction. But that wouldn't explain the inconsistency in Khitan native words.

Today it occurred to me that Khitan might have had Armenian-style variation:

The major phonetic difference between dialects is in the reflexes of Classical Armenian voice-onset time. The seven dialect types have the following correspondences, illustrated with the t–d series:

Correspondence in initial position

Indo-European *d *dʰ *t
Erevan t
Istanbul d
Kharpert, Middle Armenian d
Malatya, SWA
Classical Armenian, Agulis, SEA t
Van, Artsakh t

But of course Khitan had only two obstruent series, not three.

Might the use of certain spellings correlate with certain locations and/or time periods? They would then reflect the obstruent series of different regional/chronological varieties of Khitan. The unspoken assumption of Khitan studies is that the language was homogeneous over a wide area for a long period, but that is unlikely.

Another possibility is that Khitan was like modern Korean in which unaspirated obstruents have voiced and voiceless allophones conditioned by different environments: Sino-Korean 德 /tək/ appears as

[dək] after a sonorant

[tək] elsewhere
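
The Korean pattern can be stated as a one-line rule. Below is a minimal sketch under simplifying assumptions of my own: only the plain stops /p t k/ are handled, vowels are counted among the sonorants, and the segment inventories are illustrative rather than a full description of Korean.

```python
# Assumed, simplified inventories for illustration only.
VOICED = {"p": "b", "t": "d", "k": "g"}
SONORANTS = set("mnŋlr") | set("aeiouəɨɯ")


def realize(morpheme: str, preceding: str = "") -> str:
    """Voice an initial plain stop when the preceding segment is a sonorant."""
    if (
        morpheme
        and preceding
        and preceding[-1] in SONORANTS
        and morpheme[0] in VOICED
    ):
        return VOICED[morpheme[0]] + morpheme[1:]
    return morpheme


print(realize("tək", "n"))  # after a sonorant
print(realize("tək", "k"))  # after an obstruent
print(realize("tək"))       # utterance-initial
```

A Khitan analogue would need the same two ingredients: a list of voicing environments and a way to tell which spelling occurs in which environment.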

Could 254.020 <d.ei> ~ 247.020 <t.ei> transcribing Liao Chinese 德 (Kane 2009: 253) have had a similar distribution?

A final possibility is that Khitan was like Burmese in which etymological voiceless consonants may be voiced in close juncture. Wheatley (2009: 729) explains that in Burmese,

[c]lose juncture is characteristic of certain grammatical environments [...] But within compounds the degree of juncture between syllables is unpredictable; the constituents of disyllabic compound nouns (other than recent loanwords) tend to be closely linked, but compound verbs vary, some with open, some with close juncture.

The above possibilities are not mutually exclusive for Khitan.

THE KHITAN EMPEROR SHENGZONG IN UNICODE

Today I discovered that lookalikes for all four Khitan large script characters for 聖宗皇帝 'Emperor Shengzong' (r. 982-1031) exist in Unicode:

𫝢伋皇帝

Of course it's only the first two characters that are interesting; they are unknown to nearly everyone literate in Chinese. The last two are identical to Chinese 皇帝 'emperor'. (5.9.23:21: Neither 𫝢 nor 伋 is in Jun Da's frequency list of 12,041 Chinese characters.)

'Emperor Shengzong' exemplifies how the Khitan large script to a Chinese eye is a mix of familiar and alien elements. The first two characters combine familiar elements

夕 'evening' + 卞 'hat' = 𫝢

亻 'person' + 及 'to reach' = 伋

in unfamiliar ways.

𫝢 turns out to be a variant of 升 'to rise', which in turn was a homophone of 聖 *šiŋ 'sage' in Liao Chinese aside from its tone. 𫝢/升 and 聖 were not homophones until the late first millennium AD, so the use of 𫝢 for 'sage' may date from the Liao dynasty and is probably not a carryover from the pre-Liao Parhae script hypothesized by Janhunen. Why didn't the Khitan simply recycle 聖 'sage' the way they recycled 皇帝 'emperor'? Was 聖 'sage' too complex for the Khitan large script which favored a low number of strokes per character?

In Chinese, 伋 is a name character of no known meaning. (It is the birth name of Confucius' grandson 子思 Zisi.) It would have been pronounced *ki in Liao Chinese and not 宗 *tsuŋ like 'ancestor'. So the reasoning for 伋 as 'ancestor' is unclear (though at least the 亻 'person' radical makes sense). Might a Khitan or even a Parhae word for 'ancestor' have sounded something like *ki?

(5.9.9:39, revised 14:16: Was 伋 a semantic compound invented by someone who might not have known about the rare character 伋? But I know of no semantic compounds unique to the Khitan large script. The closest instance I can think of is


which consists of 天 'heaven' over 土 'earth'. It is not a true semantic compound because it does not represent a word for 'heaven and earth' or 'world' (the sum of 'heaven and earth'); 土 'earth' seems to disambiguate an unknown Khitan word for 'heaven' from 天 for <tên>, a borrowing from Liao Chinese. The semantic function, if any, of 及 'to reach' in 伋 'ancestor' is less clear.

The Dictionary of Chinese Character Variants has no 伋-like variants of 宗. What I will call Janhunen's Question remains unanswered: If the Khitan wanted a script to distinguish themselves from the Chinese, why did they keep or replace characters seemingly at random? I still think the only possible answer is that they didn't do that - rather, they adapted a sister script of Chinese [Janhunen's hypothetical Parhae script]. The situation is somewhat parallel to that of Cyrillic which is related to the Latin alphabet but not derived from it; they are 'cousins', not 'daughter' and 'mother'.)

Although the shapes of 皇帝 'emperor' are uninteresting, the question of how we know their readings is worth examining. Kane (2009) reads them as <hoŋ di> (= <ghong di> in the transcription system on this site).

However, I have not found any Khitan small script phonetic spelling of the first half of 皇帝 'emperor' or any of its homophones in Chinese. I would expect such a spelling to be 340.071 <h.ong> with voiceless 340 <h> rather than voiced-initial 076 <gho>. (There is no known small script character <gh> without a vowel, and Middle Chinese *ɣ- had devoiced to *x- in Liao Chinese.) No spelling <h.ong> is in Qidan xiaozi yanjiu (1985: 460). Has such a spelling been found in the thirty-plus years since the publication of that book?

Kane (2009: 244) lists 247.339.339 <t.i.i> as a small script spelling of the second half of 皇帝 'emperor'. Unfortunately, he does not cite a source for this spelling, and it is not in Qidan xiaozi yanjiu (1985: 375). I presume <t.i.i> is from an inscription discovered after Qidan xiaozi yanjiu was written. The <t> of <t.i.i> does not necessarily invalidate Kane's reading di for 帝 since the Khitan transcribed Liao Chinese *t as both <t> and <d>, and they transcribed Liao Chinese *i as both <i> and <i.i>.

5.9.0:33: Why is the name character 伋 glossed in English as 'deceptive' at zdic.net?

5.9.0:49: Kane (2009: 181) also lists a second Khitan large script character ⿰歹卞 for 聖 'sage' with 歹 'bad' on the left instead of 夕 'evening' from Liu and Wang (2004: 27, character 150). That character has no Unicode lookalike; it is character 0177 in N4631 ("Proposal on Encoding Khitan Large Script in UCS") which does not seem to list 𫝢 from Kane (2009: 183). Where is 𫝢 attested? Regardless of whether 𫝢 is an error for ⿰歹卞 and hence not a real Khitan large script character, I have no doubt that ⿰歹卞 is a variant of the Chinese character 𫝢 and is a phonetic loan for 聖 'sage'.

I also think that 𫝢 / ⿰歹卞 <shing> may have been the inspiration for the vaguely similar Tangut character


2shen3 'sage'

whose Tangraphic Sea analysis has been lost.

5.9.22:31: Are Khitan large script characters

1054 (升 + a dot on the right)

1056 (1054 with the first stroke 丿 stretching over both vertical strokes of 廾 plus a dot on the right)

in N4631 further variants of 𫝢 / ⿰歹卞 <shing>?

5.10.1:49: Chinggeltei's 關於契丹文字的特點 (1997: 110) includes 𫝢 in its list of Khitan large script characters.

OBLIQUE AFFRICATES IN CHINESE

Today on Wikipedia I saw that standard Mandarin 斜 xie [ɕjɛ] 'oblique' corresponded to Lower Yangtze Mandarin

colloquial [tɕia]

literary [tɕiɪ]

with affricate initials. The colloquial reading preserves an earlier -a going all the way back to Old Chinese; the literary reading has an innovative raised vowel [ɪ].

The dictionary Middle Chinese initial is *z-. Other dialects of Middle Chinese might have had *dz-. In any case, the Old Chinese word began with *sɯ-, though what was between that *sɯ- and *-a is not clear: *sɯ.ɢa, *sɯ.ja, and *sɯ.la are all possible. There is no known external comparison that could narrow down the possibilities. The character 斜 has the phonetic 余 *Cɯ.la, but the character 斜 dates from Han times, and at that point *ɢ, *j, and *l might have already merged into *j. (邪 'slant' - a homophone of 斜 in Middle Chinese - may be a pre-Han spelling of the same word. But its phonetic 牙 has a velar nasal initial *ŋ-!)

My hypothetical Middle Chinese *dz- might be from *sɯ.ɢ- > *s.ɢ- > *zɢ- > *zd- > *dz-. But it's more likely that it results from a Late Old Chinese or Middle Chinese confusion of *z- with *dz-. Japanese merged *z- and *dz- into /z/ which is now [dz] initially, [z] medially, and [ddz] when geminated.

Xiaoxuetang reports affricate initials in 斜 in

Mandarin: 天長 Tianchang [tsʰ] (the sole Mandarin example on the site)

Wu: 丹陽 Danyang [dʑiɑ] ~ [dʑiɒ], etc.

(Hui: no data; NB: this 徽 Hui is not the Mandarin-speaking Muslim 回 Hui, whose name is pronounced with a different tone)

Gan: 湖口 Hukou [dʑia], etc.

Xiang: 雙峰 Shuangfeng [dʑio], etc.

Min: 廈門 Amoy [tsʰia] (colloquial; literary [sia]), etc.

Yue: Cantonese [tsʰɛ] (where long ago I first observed this affricate initial corresponding to Middle Chinese *z-; I didn't know such an initial was in Mandarin too)

Ping: 永福 Yongfu [tsʰiə], etc.

Hakka: 梅縣 Meixian [tsʰia] (colloquial; literary [sia]), etc.

The affricate initial is represented in nearly every branch. No Jin variety on that website has an affricate reading. But all but one of the unclassified varieties have an affricate initial.

It seems that literary varieties of Middle Chinese kept *z- (> modern [s]) apart from *dz- while colloquial varieties merged them to various extents.
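
The readings of 斜 listed above can be sorted mechanically into affricate and fricative initials. This is only a sketch: the readings are the ones cited above from Xiaoxuetang, but the onset list and helper names are my own assumptions and cover just the transcriptions that appear here.

```python
# Affricate onsets as transcribed in the readings quoted above (assumed list).
AFFRICATE_ONSETS = ("ts", "dz", "tɕ", "dʑ", "tʃ", "tʂ")

# Readings of 斜 per variety, as cited above.
READINGS = {
    "Mandarin (Tianchang)": ["tsʰ"],
    "Wu (Danyang)": ["dʑiɑ", "dʑiɒ"],
    "Gan (Hukou)": ["dʑia"],
    "Xiang (Shuangfeng)": ["dʑio"],
    "Min (Amoy)": ["tsʰia", "sia"],
    "Yue (Cantonese)": ["tsʰɛ"],
    "Ping (Yongfu)": ["tsʰiə"],
    "Hakka (Meixian)": ["tsʰia", "sia"],
}


def has_affricate_initial(reading: str) -> bool:
    """True if the reading begins with one of the listed affricate onsets."""
    return reading.startswith(AFFRICATE_ONSETS)


# Varieties with both an affricate and a non-affricate reading.
mixed = [
    variety
    for variety, readings in READINGS.items()
    if any(map(has_affricate_initial, readings))
    and not all(map(has_affricate_initial, readings))
]
print(mixed)
```

The two varieties with mixed readings are exactly the ones with a colloquial/literary split, which fits the idea that literary readings kept the fricative.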

5.8.13:40: For comparison, let's see if the above dialects also have affricates for Middle Chinese 徐 *zɨə 'to walk slowly; a surname':

Mandarin: 天長 Tianchang [tʃʰʮ], etc.

Wu: 丹陽 Danyang [dʑyz] (sic), etc.

Hui: 旌德 Jingde [tsʰʮ], etc.

Gan: 湖口 Hukou [dzi], etc.

Xiang: 雙峰 Shuangfeng [dy] (sic) ~ [dʑy], etc.

Min: 廈門 Amoy [tsʰi] (colloquial; literary [su]), etc.

Yue: Cantonese [tsʰœy], etc.

Ping: 永福 Yongfu [tsʰy], etc.

Hakka: 梅縣 Meixian [tsʰi], etc.

The only Jin variety with a reading is the most well-known: 太原 Taiyuan [ɕy]. 徐 is a common surname, so it must be in other Jin varieties. The absence of affricates in Jin readings of 斜 'oblique' makes me guess that 徐 also lacks affricates in the rest of Jin, but I don't know.

The unclassified varieties have a mix of initials: e.g.,

富川 Fuchuan [sy]

鍾山 Zhongshan [θy]

賀州 Hezhou [ty] (cf. the stop [d] in Shuangfeng above)

道縣 Daoxian [tso]

連州 Lianzhou [tsʰɛu]

To work out what's going on with them would require studies of their individual phonologies. It is a shame that Xiaoxuetang doesn't seem to have initial, rhyme, and tonal inventories online for each variety. In theory I could extract inventories from the data, but I don't have the time to do that right now.

HAVE A ČĪZBURGERU: ENGLISH BORROWINGS IN LATVIAN

After mentioning Latvian datums last time with its combination of a Latin neuter suffix -um and a Latvian masculine suffix -s, I was curious to see how Baltic languages dealt with a recent influx of English loans. Baltic languages and Greek are the only modern Indo-European languages I know of that still retain ancient -s suffixes in the nominative case.

I guessed that all Latvian borrowings of English consonant-final stems would be placed in the first masculine declension like datums. And that does generally seem to be the case. See these two lists. Even sibilant-final stems are assigned to that declension: e.g., bizness (which is biznes-s, not a copy of the -ss of the English spelling) and finišs (< finish + -s). I might have expected them to be assigned to the second declension with -is or the third declension with -us.

The exceptions I've seen so far end in -er in English:

adapteris < adapter

menedžeris < manager

peidžeris < pager

porteris < porter

taimeris < timer

Were they assigned to the second declension by analogy with some earlier wave of -eris loans?

Not all English -er words become -eris words in Latvian: cheeseburger has become čīzburgers (with an un-English pronunciation of burger with [u] - †čīzberger would have been closer to the English original). Maybe -burger is by analogy with hamburgers, perhaps in turn influenced by Russian <gamburger>, also with [u]? No, maybe -burger is simply based on a spelling pronunciation.
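
Sorting the loans above by declension only requires looking at the nominative ending. Here is a minimal sketch that uses the endings mentioned in this entry (-s/-š first, -is second, -us third); the function name is hypothetical, and the rule is a rough heuristic for these loans, not a general classifier of Latvian nouns.

```python
from collections import defaultdict


def declension(nominative: str) -> int:
    """Guess the masculine declension from the nominative singular ending."""
    if nominative.endswith("is"):
        return 2
    if nominative.endswith("us"):
        return 3
    if nominative.endswith(("s", "š")):
        return 1
    raise ValueError(f"no masculine nominative ending in {nominative!r}")


# Loans cited in this entry.
loans = [
    "datums", "bizness", "finišs", "adapteris", "menedžeris",
    "peidžeris", "porteris", "taimeris", "čīzburgers", "hamburgers",
]

by_declension = defaultdict(list)
for loan in loans:
    by_declension[declension(loan)].append(loan)

for number, words in sorted(by_declension.items()):
    print(number, words)
```

The check order matters: -is and -us have to be tested before the bare -s, or everything would land in the first declension.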

