Let me take a brief diversion into Khitan syntax. (I say that as if any of you could stop me!)

While looking through Kane (2009) for the umpteenth time, I noticed something odd that should have caught my eye years ago. Khitan had three equivalents of Chinese 四字功臣 'four character meritorious official' (i.e., an official whose title is written with four Chinese characters):

1. <FOUR us.g.d g.ung c.in> 'four characters meritorious official'

2. <g.ung c.in FOUR us.g.d> 'meritorious official four characters'

3. <g.ung c.in us.g.d FOUR> 'meritorious official characters four'

1 follows the modifier-modified order that is the norm in 'Altaic' languages and even in non-Altaic Chinese. 2 and 3, however, have un-'Altaic' order. 4-6 are structurally similar to 1-3:

4. <ONE us.g.en g.ung c.in> 'one character-GEN meritorious official'

5. <g.ung c.in EIGHT us.g.d> 'meritorious official eight characters'

6. <g.ung c.in us.g.d SIX> 'meritorious official characters six'

<g.ung c.in> is a Chinese loan; it is bimorphemic in Chinese but was probably monomorphemic in Khitan, so I never expect to see

*<c.in g.ung>

'official meritorious'. (Similar Chinese loans retain Chinese morpheme order in Vietnamese which has un-'Altaic' modified-modifier order.)

At first I thought 2 and 5 had mixed order -

modifier + modified: numeral + 'character'

modified + modifier: 'meritorious official' + (numeral + 'character')

- but then I realized 'four characters' in 2 and 'eight characters' in 5 could be analyzed as single syntactic units following nouns rather than as numeral-noun sequences.

How can the un-'Altaic' order in 2, 3, 5, and 6 be explained? Are there other Khitan phrases with modified-modifier order?

KHITAN SMALL SCRIPT CHARACTERS IN AISIN GIORO (2012) BUT NOT KANE (2009)

I have begun to compile a database of Khitan small script characters to facilitate my study of Aisin Gioro Ulhicun's Khitan reconstruction. So far it includes 473 characters in two numbering systems (Qidan xiaozi yanjiu/Kane's and Aisin Gioro's):

378 from Qidan xiaozi yanjiu

2 added to those 378 in Kane (2009)

90 in Andrew West's font that are not in Qidan xiaozi yanjiu or Kane (2009)

3 that are in Aisin Gioro (2012) but not in any of the above sources or N3820. (4.12.0:30: I can't see the Khitan characters in N3918R, but I assume they are the same as those in N3820, as the total number of characters has not changed.)
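As a quick arithmetic check on the total, the four source counts above can simply be summed. This is only a sketch; the dictionary keys are informal labels of my own, not names used in any of the sources.

```python
# Counts of Khitan small script characters by source, as listed above.
# The keys are informal labels of my own, not official names.
counts = {
    "Qidan xiaozi yanjiu": 378,
    "added in Kane (2009)": 2,
    "Andrew West's font only": 90,
    "Aisin Gioro (2012) only": 3,
}

total = sum(counts.values())
print(total)  # 473
```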

The three characters found only in Aisin Gioro (2012) are her numbers 109, 234, and 293:

Unfortunately, her reconstructions of their readings are in her 2011 book 契丹語諸形態の研究, which I haven't seen.

AISIN GIORO'S RECONSTRUCTIONS OF KHITAN VOWELS

After four posts in a row about consonants, it's time to look at vowels for a change.

Aisin Gioro Ulhicun has not yet publicly released a full description of her reconstruction of Khitan phonology, but I can attempt to reverse-engineer it from the fragments in her 2012 article, which gives her latest reconstructions of many Khitan small script characters in its fourth column. (Some reconstructions are in Aisin Gioro's 2011 book 契丹語諸形態の研究, which I haven't seen.) Those reconstructions have eleven vowels:

i  ï u
ö ə o
æ a ã ɑ

Other vowels (e.g., ü, ɪ, e) may be in the 2011 reconstructions I have not yet seen.

The core vowels appear to be i, ə, a, u, and o. The others appear only in restricted environments to the best of my limited knowledge:

- æ is only in closed syllables (cf. English [æ] which cannot appear in word-final position)

- ã is only in 45 <qa> ~ <qã> 'khan' (= 051 <ha> in Kane 2009; see Andrew West's list for the glyphs corresponding to each of Kane's numbers which are all from Qidan xiaozi yanjiu except for the last two)

Do any known northeast Asian languages have nasal vowels?

If we didn't know about the word 'khan' in other languages, would it be possible to reconstruct a nasal vowel?

I suppose it is possible that 'khan' is the only word in Khitan with a nasal vowel because it was borrowed from a language in which *-an became *-ã, but I wouldn't bet on that.

- ɑ is only in the closed syllable 160 <tʃɑl> (= 183 <car> in Kane 2009)

Is <ɑ> a typo for <a>? If not, why not reconstruct <tʃal>? What is the evidence for a back allophone of */a/ before a final (velarized?) */l/?

- ö is only in 324 <u> ~ <ö> (= 372 <û> in Kane 2009); it transcribes Chinese *u and in native words corresponds to Mongol ö and perhaps u (see Kane 2009: 80, 99, and 105)

- ʊ is only in 313 <ʊŋ> and 320 <tʃʊŋ> (= 357 <úŋ> and 367 <źuŋ> in Kane 2009) for Chinese loanwords

I don't see why 313 (Kane 357) can't be reconstructed as <uŋ>. I don't know what the difference was - if any - between its rhyme and the rhyme of Kane's 106/345 <uŋ>; all three characters transcribed Chinese *-uŋ. Kane (2009: 77) regarded 346 as a variant of 345, though he gave no examples of their interchangeability. Kane 181 <iúŋ> for 龍 *ljuŋ or *lyŋ might have been <üŋ> (= Aisin Gioro's 158 <juŋ>).

320 (Kane 367) is probably <ywiŋ> since it transcribed Chinese 榮 *jwiŋ (which still rhymed with *-iŋ words during the Liao Dynasty; see Kane 2009: 249) and is clearly derived from 榮:

*jwiŋ shifted to *juŋ by the Yuan Dynasty and did not develop a *z-like initial until after the Yuan Dynasty, long after the fall of the Khitan. It never had an affricate initial in Chinese, so I do not know why Aisin Gioro reconstructed 320 with <tʃ->. (This section revised 4.10.23:27.)

- ï is presumably only for Chinese loans, though I wonder if it also existed in native Khitan words. (Korean kŏran < *kətan 'Khitan' may imply a Khitan *qïtan, and Janhunen [2003: 5] reconstructed *ï in pre-Proto-Mongolic.)

Only six vowels (the core five plus æ) appear in diphthongs:

Rising diphthongs: iV = /jV/?


I have included ju since it's not clear to me how iV and jV are different in Aisin Gioro's reconstruction.

Rising diphthongs: uV = /wV/?


I could have included ui if it was /wi/, but I suspect it was the high counterpart of oi /oj/ (see below) and the mirror image of ju (see above).

Falling diphthongs: Vi = /Vj/?

  əi oi
æi ai  

Falling diphthongs: Vu = /Vw/?

  əu, (jəu)  

Comparing the four tables above, a pattern emerges:

V in iV/Vi is never nonlow and front

V in uV/Vu is always nonhigh and central

Offhand I wonder if Aisin Gioro's core vowel system is compatible with a height harmony system:

high i ə u
low æ = /e/? a o

This is like the height harmony system of Middle Korean (and my reconstructions of Old Chinese and pre-Tangut). However, the limited distribution of æ makes me wonder if it was just an allophone of /a/ or /ə/.

I don't know where ö would fit. The conflicting clues for its pronunciation - back [u] or front [ø]? - remind me of Manchu ū [ʊ], which was written like Mongolian ü.

I suppose the Chinese loan vowels ï and ʊ would fall into the high and low series.

I doubt ã and ɑ (as a phoneme distinct from /æ/ and /a/ as opposed to an allophone of /a/) ever existed.

*C(.)R-USTERS IN BLACK TAI AND BAO YEN

I concluded "S-implification in Black Tai and Bao Yen" by writing,

Without looking at the development of other Proto-Tai *C(.)r-clusters in the two languages, I cannot be confident about these reconstructions.

I already gathered all the reflexes of Proto-Tai *C(.)r-clusters in Black Tai in that post. I list them again below in a more convenient tabular format along with the corresponding reflexes in Bao Yen from Pittayaporn (2009). Unlike Pittayaporn, I distinguish between *qr- and *q.r-, and I reconstruct *c.r- instead of *cr-. I have added reflexes of *ʰr- and *r- for comparison.

Proto-Tai            Black Tai   Bao Yen
*pr-                 pʰ-/f-      pʰj-
*p.r-                t-          pʰ-
*br-                 p-          pj-
*tr-, *r.t-, *ʰr-    h-          h-
*c.r-                tʰ-         tʰ-
*kr-, *qr-           h-          kʰ-
*k.r-, *q.r-         s-          s-
*gr-                 c-          c-
*voiced C.r-         h-          r-, (l-)
*r-                  h-          r-

See "S-implification in Black Tai and Bao Yen" for more on the development of *K(.)r-clusters. Details on other types of *r-clusters follow.

Notes on Black Tai:

1. Pittayaporn (2009) has /pʰ/ corresponding to /f/ in Gedney's data in Hudak (2008). The former is more conservative:

*pr- > *pɣ- > *px- > /pʰ-/ > /f-/

2. *p.r- may have become *pr- and then *tr- after original *pr- and *tr- had been lost. This new *tr- then simplified to /t-/.

3. *br- lost all trace of its medial:

*br- > *bɣ- > *bɰ- > *bj- > *b- > /p-/

Compare with *gr- whose medial became *-j- and palatalized the preceding velar:

*gr- > *gɣ- > *gɰ- > *gj- > *kj- > /c-/

(The relative chronology of changes I do not discuss in detail is not intended to be exact: e.g., devoicing might have preceded palatalization.)

4. *r.t- merged with *tr- and perhaps became /h-/ via a dental fricative stage:

*r.t- > *tr- > *tɣ- > *tx- > *θ- > /h-/

Then again, if *kr- became a *kx- that simplified to *x-, then perhaps *tx- also simplified to *x-.

5. *voiced C.r- merged with *r-. *ʰr- and *r- may have become *x- and *ɣ- after original *x- and *ɣ- had become *kʰ- and *g- (now /kʰ/ and /k/). These new velar fricatives then backed and merged as /h/.

6. *c.r- (Pittayaporn's *cr-) may have become a third kind of *tr- dating between the other two (original *tr- and *tr- from *p.r-):

Stage                                     *p.r-    *tr-    *c.r-
Proto-Tai                                 *p.r-    *tr-    *c.r-
*r.t-/*tr- merger                         *p.r-    *tr-    *c.r-
*-r- > *-x-                               *p.r-    *tx-    *c.r-
sesquisyllabic compression; *Cx- > *x-    *pr-     *x-     *tr-
cluster assimilation                      *tr-     *x-     *tθ-
Black Tai                                 /t/      /h/     /tʰ/

Cluster assimilation required one part of a cluster to become more like the other:

*pr- > *tr- (labial to dental)

*tr- > *tθ- (voiced sonorant to voiceless obstruent)

That is my attempt to find commonality between two otherwise seemingly very different paths of change.
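The relative chronology in the table above can be sketched as a list of ordered stages, each of which rewrites an onset exactly once. This is only an illustration of why the order matters, not a claim about how the changes must be formalized: the names STAGES and derive are my own, and the onsets are written without asterisks or hyphens for convenience.

```python
# Each stage maps an onset to its successor; onsets not listed pass through unchanged.
# Stage names follow the table above; STAGES and derive are illustrative names only.
STAGES = [
    ("*r.t-/*tr- merger", {"r.t": "tr"}),
    ("*-r- > *-x-", {"tr": "tx"}),
    ("sesquisyllabic compression; *Cx- > *x-", {"p.r": "pr", "c.r": "tr", "tx": "x"}),
    ("cluster assimilation", {"pr": "tr", "tr": "tθ"}),
    ("Black Tai", {"tr": "t", "x": "h", "tθ": "tʰ"}),
]


def derive(onset):
    """Return the onset at Proto-Tai and after each later stage."""
    path = [onset]
    for _name, changes in STAGES:
        # One dictionary lookup per stage, so changes within a stage are
        # simultaneous: *pr- > *tr- does not feed *tr- > *tθ-.
        onset = changes.get(onset, onset)
        path.append(onset)
    return path


for start in ("p.r", "tr", "c.r"):
    print(" > ".join(derive(start)))
```

Because each stage is a single lookup, the *tr- that arises from *c.r- at the compression stage is too late to undergo *-r- > *-x-, and the *tr- that arises from *p.r- at the assimilation stage is too late to become *tθ-. That ordering is what keeps the three kinds of *tr- apart.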

Notes on Bao Yen:

1. *-r- became /(ʰ)j/ after labials as well as *g-. I don't understand why this didn't happen after *k- and *q-:

*pr- > /pʰj/

*br- > /pj/

*gr- > *kj- > /c/

but *kr-, *qr- > /kʰ/ (not */kʰj/)

Maybe there was a constraint against coronals + *-j-.

2. Pittayaporn's Proto-Tai *p.r- has two kinds of reflexes in Bao Yen:

/pʰj/ (like *pr-: e.g., 'shuttle of loom')

/pʰ/ (unlike *pr-: e.g., 'cucumber')

My guess is that some *p.r-words (e.g., 'shuttle of loom') compressed into monosyllables before others (e.g., 'cucumber') in pre-Bao Yen.

Here is how original *pr- and secondary *pr- might have developed:

*pr- > *pɣ- > *pʰɣ- > *pʰɰ- > /pʰj-/

*p.r- > *pr- > *pɣ- > *px- > /pʰ-/

3. Proto-Tai *ʰrwɯ:j A became Bao Yen /wi: A1/ rather than */hwi: A1/, presumably because Bao Yen does not have initial /hw/. I don't know that for a fact; I only know that /hw/ is not in any Bao Yen word in Pittayaporn's data.

Note that Proto-Tai *hw- sans *-r- became Bao Yen /pʰ/.

4. I didn't reconstruct *kx- (from *kr- and *qr-) simplifying to *x- in Bao Yen, so I won't reconstruct *tx- (from *tr- and *r.t-) simplifying to *x-. Instead, I'll have *tx- fuse into *θ- and back to /h-/ (cf. Black Tai note 4 above).

5. Unlike Black Tai, Bao Yen did not merge *ʰr- and *(voiced C.)r-. Only the former became /h/; the latter (generally?) remained *r- (Pittayaporn found a single case of /l/ < *voiced C.r- - perhaps a loanword?).

6. I'm not happy with how I bridged *c.r- and /tʰ/ in Black Tai (note 6 above), but I can't think of any better solution, and for now I recycle it for Bao Yen.

THE ORI--IN OF MOHAWK'S ONLY AFFRICATE

I rediscovered Mohawk when looking for a language lacking m and found it mentioned in Wikipedia's article on bilabial nasals. I last wrote about Mohawk five years ago after seeing it on a stop sign. Back then I didn't mention Mohawk's only affricate /dʒ/ which apparently is quite different from other obstruents judging from Wikipedia's description of Mohawk phonology:

- it is always voiced: [dz] ~ [dʒ] (depending on dialect)

- it patterns in clusters like a sonorant rather than an obstruent

One other unusual characteristic of /dʒ/ is its ability to combine with /j/ in both initial and medial position. Is /dʒj/ pronounced [dʒj] which would be very difficult to distinguish from [dʒ] without [j]? Or is /dʒj/ pronounced as palatal [dʑ] or [ɟ]?

I wonder if /dʒ/ was originally a voiced sonorant like *r which does not exist in modern Mohawk (though Proto-Iroquoian had *ɹ). I am not confident about that solution for three reasons:

1. I don't know what /dʒ/ corresponds to in other Iroquoian languages. Is it from Proto-Iroquoian *ts?

2. I don't know of any other language in which *r(j) hardened to /dʒ(j)/: *r > > *ʒ > /dʒ/ or *r > > *dʐ > /dʒ/.

3. If *r hardened to /dʒ/, I would expect other instances of fortition. Do such instances exist?

Like Mohawk, Proto-Iroquoian lacked labials other than *w. Did pre-Proto-Iroquoian undergo lenition: e.g., did *p and *m weaken to *w?

If Proto-Iroquoian had no *m, where did Cherokee /m/ come from? Loanwords?

S-IMPLIFICATION IN BLACK TAI AND BAO YEN

I've read the introduction to Ostapirat (2000) many times, but recently this passage on p. 19 jumped out at me, and not just because of the un-PC use of "inferior":

"Kra", the autonym which originally means 'human being' [...] Cf. the related form in Black Tai /saa C1/, which has been borrowed as Vietnamese /xá/ to designate various inferior ethnic groups in Vietnam

How did Kra come to have initial [s] in Black Tai and Vietnamese? (Vietnamese x is [s].)

Black Tai is spoken in northwestern Vietnam, and I initially thought that perhaps it had undergone the shift

*Cr- > *ʂ- > s-

which had also occurred in northern Vietnamese. (Southern Vietnamese still has [ʂ].) The resulting /saa C1/ was then borrowed as xá.

However, that was not the case:


*pr- > Black Tai /pʰ-/ (Pittayaporn 2009: 140) or /f-/ (Gedney 0287, 0300) but northern Vietnamese s [s]

*br- > Black Tai /p-/ (e.g., Gedney 0647)

*tr-, *kr- (and my *qr-; see below) > Black Tai /h-/ (Pittayaporn 2009: 141, 143; e.g., Gedney 0081 and 0082) but northern Vietnamese s [s]

*cr- > Black Tai /tʰ-/ (Pittayaporn 2009: 142; e.g., Gedney 0706) but northern Vietnamese s [s]?

*gr- > Black Tai /c-/ (e.g., Gedney 0160)

*qr- (= my *q.r-; see below) > Black Tai /s-/ (Pittayaporn 2009: 144; e.g., Gedney 0124)

*C.r-sequences (i.e., sesquisyllables beginning with *CVr-)

*p.r- > Black Tai /t-/ (e.g., Gedney 0345)

*k.r- > Black Tai /s-/ (e.g., Gedney 0120)

*Unknown voiced consonant + r- > Black Tai /h-/ (e.g., Gedney 0310)

Gedney numbers refer to cognates in Hudak 2008.

Pittayaporn's Proto-Tai has no *dr-, *ɟr-, or *ɢr-; these may be accidental gaps.

Pittayaporn (2009: 337) reconstructed Proto-Tai *kraː C 'slave' which should have become Black Tai */haa C1/ but is /saa C1/ as if it were from *qraː C or *k.raː C. My guess is that pre-Black Tai retained a sesquisyllabic *k.raː C that collapsed into a monosyllabic *kraː C in other early Tai varieties.

Bao Yen (Pittayaporn 2009), another Tai language in Vietnam, also seems to have retained *k.raː C. Compare its reflexes of *K(.)r- with those of Black Tai (Gedney in Hudak 2008). Exclamation marks indicate forms that would be irregular if they did not come from sesquisyllables.

Gloss            Proto-Tai (Pittayaporn 2009)                 Bao Yen         Black Tai
slave            *kraː C (my pre-BY and pre-BT *k.raː C)      /saa C1/ (!)    /saa C1/ (!)
spider           *krwaːw A (my pre-BY and pre-BT *k.rwaːw A)  /saaw A1/ (!)   /saaw A1/ (!)
to imprison      *k.raŋ A                                     /saŋ A1/        /saŋ A1/
six              *krok D (my pre-BY *k.rok D)                 /sok DS1/ (!)   /hok DS1/
to seek          *kraː A (my pre-BY *k.raː A)                 /saa A1/ (!)    /haa A1/
to sift          *qrɤŋ A (my pre-BY *q.rɤŋ A)                 /sɤŋ A1/ (!)    /hɤŋ A1/
egg              *qraj A (my pre-BT *q.raj A)                 /kʰaj B1/       /saj B1/ (!)
mountain stream  *qrwɤj C                                     /kʰuəj C1/      /huaj C1/
to laugh         *krɯəw A                                     /kʰuə A1/       /hua A1/
fish net         *kreː A                                      /kʰɛː A1/       /hɛ A1/
mortar           *grok D                                      /cok DS2/       /cok DS2/

Unlike Pittayaporn, I distinguish between *q.r- and *qr-; his *qr- corresponds to my *q.r-, whose reflexes are like those of *k.r-, and my new *qr- has reflexes like those of *kr-.

Like 'slave', 'spider' was sesquisyllabic in the ancestors of Bao Yen and Black Tai.

Pre-Bao Yen apparently retained sesquisyllabic forms of 'six', 'seek', and 'sift' that became monosyllables in pre-Black Tai and other early Tai varieties.

Conversely, pre-Black Tai retained a sesquisyllabic form of 'egg' that became a monosyllable in pre-Bao Yen and other early Tai varieties.
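The exclamation marks in the table above can be reproduced mechanically. The sketch below takes the monosyllabic reflexes (Bao Yen /kʰ/ and Black Tai /h/ for *kr- and *qr-, /s/ for *k.r-, /c/ for *gr-) as the regular baseline and flags every form whose initial does not match. REGULAR, DATA, and irregular are names of my own, and the onset labels are Pittayaporn's with the asterisks and hyphens left off.

```python
# Regular reflexes assumed as the baseline (monosyllabic *kr-/*qr-, sesquisyllabic *k.r-).
REGULAR = {
    "Bao Yen": {"kr": "kʰ", "qr": "kʰ", "k.r": "s", "gr": "c"},
    "Black Tai": {"kr": "h", "qr": "h", "k.r": "s", "gr": "c"},
}

# (gloss, Pittayaporn's onset, Bao Yen form, Black Tai form), copied from the table above.
DATA = [
    ("slave", "kr", "saa C1", "saa C1"),
    ("spider", "kr", "saaw A1", "saaw A1"),
    ("to imprison", "k.r", "saŋ A1", "saŋ A1"),
    ("six", "kr", "sok DS1", "hok DS1"),
    ("to seek", "kr", "saa A1", "haa A1"),
    ("to sift", "qr", "sɤŋ A1", "hɤŋ A1"),
    ("egg", "qr", "kʰaj B1", "saj B1"),
    ("mountain stream", "qr", "kʰuəj C1", "huaj C1"),
    ("to laugh", "kr", "kʰuə A1", "hua A1"),
    ("fish net", "kr", "kʰɛː A1", "hɛ A1"),
    ("mortar", "gr", "cok DS2", "cok DS2"),
]


def irregular(data):
    """List (gloss, language) pairs whose initial differs from the assumed regular reflex."""
    flagged = []
    for gloss, onset, bao_yen, black_tai in data:
        for language, form in (("Bao Yen", bao_yen), ("Black Tai", black_tai)):
            if not form.startswith(REGULAR[language][onset]):
                flagged.append((gloss, language))
    return flagged


for gloss, language in irregular(DATA):
    print(f"{gloss}: unexpected initial in {language}")
```

Every flagged form begins with /s/, which is what a sesquisyllabic *k.r- or *q.r- would yield in both languages.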

Here is how *K(.)r- might have simplified:

Black Tai

Proto-Tai Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Today
*kr-, *qr- *kr- *kɣ- *kx- *x- /h/
*k.r-, *q.r- *k.r- *kr- *kʂ- *ʂ- /s/
*gr- *gr- *gɣ- *gɰ- *gj- *kj- /c/

I cannot reconstruct a *kʰ-stage between *kr-, *qr- and /h-/ because Black Tai has a /kʰ-/ distinct from /h-/.

Bao Yen

Proto-Tai Stage 1 Stage 2 Stage 3 Stage 4 Stage 5 Today
*kr-, *qr- *kr- *kɣ- *kx- *kʰ- /kʰ/
*k.r-, *q.r- *k.r- *kr- *kʂ- *ʂ- /s/
*gr- *gr- *gɣ- *gɰ- *gj- *kj- /c/

The relative chronology is only approximate.

Without looking at the development of other Proto-Tai *C(.)r-clusters in the two languages, I cannot be confident about these reconstructions.

MEETING AT NIGHT: GEMINATION IN UKRAINIAN AND BELARUSIAN DECLENSION

While looking for a free online Ukrainian grammar, I found Martin Dietze's "Ukrainian Grammar Short Reference" at his herbert.the-little-red-haired-girl.org (what a domain name!) with the paradigm of зустріч 'meeting':

Case/number                            Ukrainian (зустріч)  Ukrainian (ніч)  Belarusian  Russian
Nominative/accusative singular         зустріч              ніч              ноч         ночь
Vocative singular*                     зустріче             ноче             -           -
Genitive/dative/locative singular;
nominative/vocative/accusative plural  зустрічі             ночі             ночы        ночи
Instrumental singular                  зустріччю            ніччю            ноччу       ночью
Genitive plural                        зустрічей            ночей            начэй       ночей
Dative plural                          зустрічам            ночам            начам       ночам
Instrumental plural                    зустрічами           ночами           начамі      ночами
Locative plural                        зустрічах            ночах            начах       ночах

I've included the paradigm of ніч 'night' and the paradigms of its Belarusian and Russian cognates for comparison. (The Belarusian and Russian cognates of зустріч are сустрэча and встреча, which belong to a different declension.)

Why do Ukrainian and Belarusian have stem-final gemination (чч in зустріччю, ніччю, and ноччу) in the instrumental singular? My guess is that the consonant lengthened to compensate for the loss of the vowel *ь, which Russian orthography still writes as ь:

*nočьju  > U [nʲitʃʲːu], B [notʂːu], R [notɕu]**

Compare how a vowel rather than a consonant lengthened in a similar environment in Japanese: *tiyu > *tyu > chū.

I am also reminded of Pali geminates from Sanskrit *-(C)Cy-: e.g.,

maccu- 'god of death' < mṛtyu- 'death'

maccha- < matsya- 'fish' (Does the aspiration of geminate cch indicate that Sanskrit s was aspirated [sʰ] like Korean s? See Jacques 2011 for more on aspirated fricatives.)

According to Wikipedia, there was no gemination if there were one or more consonants before *CьjV. Gemination would have resulted in a C1C2ː-cluster which doesn't exist in Ukrainian. See Wikipedia and Mayo (1993: 903) for further constraints on gemination in Ukrainian and Belarusian.
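The condition just described can be sketched as a small function over romanized proto-forms: a consonant before *ьjV lengthens unless another consonant precedes it. This is only a rough illustration under my own assumptions. The function name lengthen is made up, palatalization and vowel changes are ignored, and kostьju is a schematic input of my own, shaped like *nočьju but with a cluster before ьj, included only to exercise the blocking condition.

```python
VOWELS = set("aeiouy")
YERS = set("ьъ")


def is_consonant(ch):
    """Treat any letter that is neither a vowel nor a yer as a consonant."""
    return ch.isalpha() and ch not in VOWELS and ch not in YERS


def lengthen(form):
    """Rewrite CьjV as CːV, or as plain CV when another consonant precedes C."""
    out = []
    i = 0
    while i < len(form):
        ch = form[i]
        if is_consonant(ch) and form[i + 1:i + 3] == "ьj" and form[i + 3:i + 4] in VOWELS:
            after_consonant = i > 0 and is_consonant(form[i - 1])
            # A C1C2ː cluster is not allowed, so lengthening is blocked after a consonant.
            out.append(ch if after_consonant else ch + "ː")
            i += 3  # skip the consonant, the yer, and j
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(lengthen("nočьju"))   # nočːu
print(lengthen("lьju"))     # lːu
print(lengthen("kostьju"))  # kostu (schematic input; no lengthening after s)
```

The outputs are schematic: they show only where length appears, not the actual Ukrainian or Belarusian vowels or palatalization.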

4.6.20:39: The Ukrainian verb лити [lɪtɪ] 'to pour' has similar stem-initial gemination: e.g.,

*lьju > ллю [lʲːu] 'I pour' (cf. Belarusian and Russian лью [lʲu])

However, the stem-initial gemination of Ukrainian ссати [sːɑtɪ] and Belarusian ссаць [sːatsʲ] 'to suck' appears to be from *sъ- with *ъ instead of *ь; cf. Russian сосать [sɐsatʲ].

*Dietze did not list a vocative singular for зустріч, so I supplied one by analogy with ноче.

**Can Ukrainian, Belarusian, and Russian speakers tell each other apart by their pronunciations of ч /č/? I have never heard Belarusian, and have relied on Wikipedia's IPA guides (Ukrainian, Belarusian, and Russian) for the phonetic forms here. To my poor ear, Ukrainian and Russian ч sound similar, and both sound completely different from Mandarin j [tɕ] even though Wikipedia's IPA for Mandarin j is identical to its IPA for Russian ч.

Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2013 Amritavision