I was unaware of grammatical gender until I started studying German at age 12. I've wondered if I would have been scared of it if I hadn't been exposed to the concept at a young age. It wasn't in any language I had been exposed to up to that time: English, Hawaii Creole English, Japanese, or Mandarin. And I hadn't seen it in any East Asian language until I read Kane's 2009 book on Khitan.

Khitan words that Kane translated as English adjectives may or may not have gender concord: e.g.,

With concord (Kane 2009: 126)

<WHITE (f.) eu.ul> 'white cloud (f.)'

<WHITE♂ m.em> 'white ice (m.)'

<♂> corresponds to a dot added to the default feminine graph.

The pronunciation of

<WHITE (f.)> and <WHITE♂>

is unknown.

Without concord (?) (Kane 2009: 90, 105, 108, 126)

<ai.d bo.hu.án> 'sons' = 'male children'

cf. <ai.d> 'fathers'

<mó bo.qo> 'daughter' = 'female child'

<mó bo.hu.án> 'daughters' = 'female children'

<mó ku> 'wife' = 'female person'

cf. <mó> 'mother'

True Khitan adjectives

I hypothesize that Khitan adjectives - as opposed to all Khitan words that are translated as adjectives - have gender concord. This concord may be indicated

with distinct suffixes:

usually <-er> (m.) and <-én> (f.), but other suffixes exist:

<-d.o.ɣo> in <tad.o.ɣo> ~ <tod.o.ɣo> 'fifth'; cf. <taw> 'five'

<-qó> ~ <-qu> (m.) and <-qú> (f.) for <m.as-> 'first, eldest, great'

Are there other adjectives with these suffixes? Are there other suffixes? Are these rarer suffixes remnants of an earlier, more complex adjectival declension system?

with distinct words (?):

<m.o> 'first, eldest, great' (m.)

<GREAT> 'first, eldest, great' (f.; not known if this word is similar to <m.o> or <m.as->)

The <o> ~ <as> alternation reminds me of how Sanskrit -as becomes -o before voiced consonants and a-, though I doubt a similar process occurred in the history of Khitan.

Khitan pseudo-adjectives

I consider <ai.d bo.hu.án> to be nouns in apposition due to the double plural marking (<-d> and the irregular plural of <bo.qo> 'child').

<mó bo.hu.án>, on the other hand, is a compound noun with plural marking on the second half only. An equivalent sequence of nouns in apposition would be <mó.t bo.hu.án> with the feminine plural suffix <-t>. (Is that suffix cognate to the masculine plural suffix <-d>? Did <-d> originate as a <-t> that voiced before a following vowel?)

<mó bo.qo> and <mó ku> are also compound nouns in spite of the fact that neither is written as a single polygram: i.e., as <mó.bo.qo> or <mó.ku>.

Is the compound noun <ai bo.qo> attested, or is it redundant since <bo.qo> by itself can mean 'son' as well as 'child'?

I presume that if one wanted to specify 'father's child' or 'mother's child', one would have said

<ai.en bo.qo> 'father-GEN child'

<mó.on bo.qo> 'mother-GEN child' (is <mó.on> attested? why not <mó.ón>?)

Two (or three) case studies


<da.lỏ> 'seven' can also be translated as 'seventh' before masculine nouns. (Kane 2009: 144 lists no examples of 'seventh' before feminine nouns.) I presume 'seventh' is the first half of compound nouns, not a true adjective. But I am surprised because all ordinals from 'first' through 'sixth' and 'eighth' exhibit gender agreement. Why is 'seventh' exceptional? Were <da.lỏ.er> (m.) and <da.lỏ.én> (f.) possible? Or were there noninflecting adjectives in Khitan?

'Left' and 'right'

<ci.g.en> ~ <dzi.g.en> 'left' and <bo.ra.(i)a.an> 'right' have no known gender endings before <(u.)ur> 'division'. <-en> and <-an> are known genitive endings for consonant-final and -a stems. Are 'left' and 'right' really NOUN-GEN morpheme sequences? Were the nouns for 'left' and 'right' cig ~ dzig and bora(y)a?

Next: Paternal Aid, Annual Ice, and a Hundred Mothers

G-ÉN-D-ER ER-RORS?

I write birthday wishes not only to honor those whom I respect, but also to practice the Khitan small script and Khitan grammar. Reading about Khitan isn't enough for me; I want to use it to learn it, even if I embarrass myself in public.

All my postings on extinct languages - and especially those on TJK (Tangut / Jurchen / Khitan) - are probably full of errors. No reconstruction of a language - a highly educated guess at best - is equivalent to the real thing, though one can come close.

It's particularly hard to write in languages that lack both native speakers and established grammars. Khitan has no Panini or even a Whitney; its structure must be extracted from a limited number of texts by modern scholars who still can't completely read either of its two scripts.

A number of things about my attempts at Khitan constantly bother me. Perhaps gender agreement is at the top of the list.

Khitan has multiple past tense suffixes. More on these later. There is a pair of suffixes that indicates the gender of the subject of the verb:

<XX BORN.én> 'XX (female) was born' (cf. Kane 2009: 174)

<XY BORN.er> 'XY (male) was born' (cf. Kane 2009: 86)

There is also a construction

<XX-GEN BORN.én DAY> 'the day XX (female) was born' = 'XX's birthday' (cf. Kane 2009: 86)

(GEN = genitive)

but what is the construction for 'the day XY (male) was born'? Last night I assumed it was

<XY-GEN BORN.er DAY>

with masculine <er> instead of feminine <én>. But what if 'day' is a feminine noun whose gender-sensitive modifiers must also be feminine? Do all gender-sensitive adnominal forms (forms before nouns) match the gender of the following nouns, or do verbs and adjectives behave differently: e.g.,

VERB + gender of preceding/implied subject + NOUN

ADJECTIVE + gender of following noun + NOUN
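
The two patterns can be put side by side in a small sketch. Everything here is hypothetical: the function name, the "verb"/"adjective" labels, and above all the gender assigned to DAY, which is unknown. The sketch only shows that the two patterns give different forms when the subject and the following noun differ in gender.

```python
# Gender suffixes as given in the post: <er> (m.) and <én> (f.).
SUFFIX = {"m": "er", "f": "én"}


def adnominal_form(stem, kind, subject_gender=None, noun_gender=None):
    """Return stem + gender suffix under one of the two patterns.

    kind == "verb":      agree with the preceding/implied subject
    kind == "adjective": agree with the following noun
    """
    if kind == "verb":
        gender = subject_gender
    elif kind == "adjective":
        gender = noun_gender
    else:
        raise ValueError("kind must be 'verb' or 'adjective'")
    if gender not in SUFFIX:
        raise ValueError("gender must be 'm' or 'f'")
    return stem + "." + SUFFIX[gender]


# Male subject, and DAY assumed feminine purely for illustration.
print(adnominal_form("BORN", "verb", subject_gender="m", noun_gender="f"))
# BORN.er
print(adnominal_form("BORN", "adjective", subject_gender="m", noun_gender="f"))
# BORN.én
```

If DAY were masculine, both patterns would give <BORN.er>, so a male birthday phrase alone could not decide between them.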

Is <BORN.er DAY> attested? I can't find it in the few texts that I have on hand.

Next: Sometimes Sensitive

GRANT JONS-EN ...-ER ...-DE ESEN-ER!

Tonight's birthday wish is in one long vertical line of Khitan:

<j.o.n.s.en>: <-en> = possessive suffix
<BORN.er>: <-er> = masculine past tense suffix
<DAY.de>: <-de> = temporal suffix: 'on the day ...'
<s.en.er>: <s.en> = esen 'long life'; <-er> = object suffix

Grant Jons-en ...-er ...-de esen-er! (The readings of <BORN> and <DAY> are unknown.)

'On the day Grant Jones was born, [I wish him] long life!'

All character components except <g> have been covered in previous birthday posts, though the polygrams (component combinations) for Grant's name are new.

The Khitan small script allows writers to combine consonants in un-Khitan combinations like Gr-:

G ra
n t

The e in Jones is silent, so it's not written:

J o
n s

The component is <j> here but was <ji> in <go.ji.ra>. I assume that it was <j> before vowels and <ji> elsewhere.

LONG LIFE TO DEATH!

Today is

<m.or.t t.od.en BORN.er DAY>

Mort Todd's birthday (see the breakdown of his name below)

and this time I want to do something different. I can't help but try to translate a name like that into Khitan:

<tu.úr.bo.ń.de s.en.er>

'death-to long-life-(object)'

= '(I wish) long life to death'

tuúr-boń is an honorific word for 'died' but might also mean 'death' since -boń can also be a suffix that turns verbs into nouns (Kane 2009: 91, 155).

I wanted to use the imperial word for 'die' but I'm not sure (a) how it was pronounced and (b) more importantly how to form a noun from it.

-de after tuúr-boń 'death' indicates that death is a recipient.

isen or esen (spelled <s.en> without i or e) is followed by the suffix -er indicating that it is the object of a verb (that I've left out).

Summing up, the pattern in this phrase is

indirect object-de direct object-er

Here are the breakdowns of the polygrams (multi-part characters) for Mort Todd:

M or

The t character appears again in the polygram for Todd. Khitan has no distinction corresponding to upper or lower case, so the same t character can be used at the beginning or end of a polygram.

T od

The Khitan spelling reflects the pronunciation of Todd with a single [d]. Hence there is no need for an extra d character.

THE EN-IGMA OF THE KHITAN G-EN-ITIVE

In last night's post, I introduced the Khitan genitive endings. A very simple (and of course inaccurate) pair of rules accounts for most cases in Kane (2009: 132-136):

If a word ends in a vowel, it takes a genitive ending with that vowel plus -n.

If a word ends in a consonant, it takes the genitive ending -en.
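
A minimal sketch of these two rules as a Python function, checked against a few forms mentioned in this post. The function names, the five-vowel inventory, and the decision to treat accented letters such as í as their plain vowels are assumptions made for illustration only.

```python
import unicodedata

VOWELS = set("aeiou")  # assumed vowel inventory of the romanization


def strip_marks(text):
    """Drop diacritics so that e.g. 'í' counts as 'i' (an assumption)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def predicted_genitive(stem):
    """Apply the two simple rules: vowel + -n after vowels, -en after consonants."""
    last = strip_marks(stem)[-1].lower()
    if last in VOWELS:
        return "-" + last + "n"
    return "-en"


# Attested endings reported in this post (Kane 2009: 132-136).
ATTESTED = {"poŋca": "-en", "śarí": "-en", "boqo": "-i"}

# Words whose attested ending differs from the prediction.
exceptions = [word for word, ending in ATTESTED.items()
              if predicted_genitive(word) != ending]

print(predicted_genitive("Maker"))  # -en
print(exceptions)                   # ['poŋca', 'śarí', 'boqo']
```

The rules predict -an, -in and -on for the three attested words, so all three come out as exceptions, which is why the rules can only be a first approximation.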

Now let's look at the genitive endings one by one and examine exceptions solely using the very limited data in Kane (2009: 132-136).

1. -Vn endings

1a. -an

All -an follow -a nouns.

However, not all -a nouns take -an:

poŋca 'investigation commissioner' (< Chn 訪察 *foŋcha) takes -en

1b. -en

-en generally follows consonants other than -ŋ after rounded vowels and -w. See 1d-1e below.

-en also follows the vowel-final noun śarí 'court attendant'

I would expect -en to also follow -e nouns, but Kane listed no examples of -e-en.

The two -e nouns that Kane lists have surprising genitives. See 2 below.

See 3-4 below for more exceptions.

1c. -in

All -in follow -i nouns.

However, not all -i nouns take -in. See 1b above and 3 below.

1d. -on

All -on follow -o(ŋ) nouns.

However, not all -o nouns take -on. See 4 below.

1e. -un

All -un follow -u(ŋ) and -w nouns.

2. -n

Unlike 1a-1e, -n apparently replaces the final -e of a noun if one takes the Khitan small script spellings at face value:

<ún.e> 'now' ~ <ún.n> 'of the present'

Did Khitan really have a phonemic geminate /nn/ in final position?

<û.e> 'the tribal title 于越  yuyue' ~ <û.n> 'of a yuyue'

Yuyue is the modern Mandarin reading of the old transcription for this title.

于越 was read as something like *üwiet in Late Middle Chinese and as üüe in Liao Dynasty Chinese.

Wittfogel and Feng (1949: 432) propose Turkic ögüt 'counsel' as a possible source, but neither the Khitan small script spelling nor the Chinese transcription points to a -g-. Did medial *-g- lenite to zero in early Khitan between the period when this word was borrowed and the period when the word was written in Chinese and the Khitan small script? The lenition of medial *-g- would not be surprising given the lenition of medial *-b- in 'five':

*tabu > taw 'five' (cf. Written Mongolian tabun, Daur taaw [Todaeva 1986]; more Daur and other Mongolic forms here)

If the word ever had a final -t, that consonant was lost by the time the word was written in the Khitan small script texts available to us. It's possible there was a <t>-spelling back in the earliest KSS texts, but we have no dated KSS inscriptions between 925, the year of the creation of the KSS, and 1052. Andrew West has a chronological list of KSS inscriptions. This Wikipedia list is sortable.

3. -iń

Follows plurals ending in -r, -iii (sic), -d.
But 'two (people)', probably ending in -r, takes -en, not -iń.

dalo non 'seven generations' also takes -en, perhaps because it lacks a plural ending and is literally 'seven generation'.

But 'seven' is followed by a plural noun in SEVEN po-od 'seven hours'; see Kane (2009: 139-142) for other examples of numeral + noun-PL.

4. -i

Follows Qid(un) 'Khitan' (with an odd second vowel) and boqo 'son, child'

The expected genitives are *Qid(un)-en and *boqo-on.

boqo also has an irregular plural boɣuan (instead of the expected *boqo-d).

Are -i nouns remnants of a declension class that was once larger?

How can irregularities be explained?

I suspect we are seeing the collapse of an earlier, richer system of declension and vowel harmony. So far, it seems that scholars speak of Khitan as if it were a stable, single, unchanging language. However, the two Khitan scripts were in use for about 270 years (if their origin dates are accurate) - including about 65 years under Jurchen rule - over a wide area. Thus variation may have at least three sources other than random errors:

1. Chronological variation: was 10th century Khitan like 12th century Khitan?

2. Geographic (i.e., dialectal) variation

3. Jurchen influence during the Jin Dynasty: Did Jurchen first language speakers make mistakes in Khitan? Did Jurchenisms become acceptable in late standard Khitan even among Khitan first language speakers?
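
To separate these sources of variation, each attested form would need to carry the date and place of its text. Below is a minimal sketch of such a record in Python. The class and field names are hypothetical. The years and places are left empty because the data summarized in this post does not give them, and no real dates are implied.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attestation:
    form: str                    # romanized stem
    gloss: str
    genitive: str                # attested genitive ending
    year: Optional[int] = None   # date of the text, if known
    place: Optional[str] = None  # location of the text, if known


def century_of(year):
    """Return the century of a CE year: 925 -> 10, 1052 -> 11."""
    return (year - 1) // 100 + 1


def group_by_century(records):
    """Group forms by century; undated forms go under the key None."""
    groups = defaultdict(list)
    for record in records:
        key = century_of(record.year) if record.year is not None else None
        groups[key].append(record.form)
    return dict(groups)


# Irregular genitives from this post; dates and places still to be filled in.
RECORDS = [
    Attestation("poŋca", "investigation commissioner", "-en"),
    Attestation("śarí", "court attendant", "-en"),
    Attestation("boqo", "son, child", "-i"),
]

print(group_by_century(RECORDS))  # {None: ['poŋca', 'śarí', 'boqo']}
```

Once years are known, the same grouping would show whether an irregular ending clusters in one century; grouping on place instead would test the dialect explanation.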

A Khitan morphological database with the date and location (if known) of the text for each word may help to resolve apparent irregularities - and introduce new mysteries to be solved.

MAKER-DE TUMU EUÚR!

Today is

Maker-en BORN-er DAY (the Khitan readings of BORN and DAY are unknown)

'Macker's birthday'

so I wish to tell him in Khitan,

Maker-de tumu euúr!

'Macker-to ten-thousand years-of-life'

'To Macker, ten thousand years of life!' = 'Long live Macker!'

Macker is written phonetically in Khitan as a combination of four characters:

M a
k er

But wait. Don't the Khitan phrases above have five-character spellings for 'Macker'?


Those of you who've read other entries in my birthday series will recognize the ending -de 'to':

In English, one says "to X", but in Khitan one said "X-de".

The other ending -en '-'s' is one of the first small Khitan characters I ever noticed. It's debuting on my blog 15 years later:

Although it can be translated as '-'s', not all instances of '-'s' correspond to -en. Khitan possessive endings vary depending on the preceding word (Kane 2009: 132-136):

-an after -a

-en after consonants

-in after -i

-on after -o

-un after -u

and in some cases, -n, -iń, -i

I've only outlined the general pattern. I'd like to examine exceptions later.

MEN WITH HATS

In these three tangraphs from "A Tangut Qua-ndary"

3966 1vɨị 'taste'

3968 2khɪ 'taste'

4243 1vɨị 'pear'

the vertical line (alphacode: bae) of

+bae + hem (or pix if bae is under the 그 of hem)

was placed under the 'roof' of

alphacode: jui

What was jui? It appears on the left or right of 23 tangraphs:

jui on the left

LFW# | Alphacode | Reading | Gloss | Source of jui | Notes
3917 | juidiodom | 1tʃɨəə | to sew, patch up | unknown; perhaps each other? | cf. 0688, 2568 below
3969 | juibaxbelcin | 2dwu | to mend, patch, repair | |
3970 | juidim | 1dʒɨõ | 2nd half of 2dʒɛ 1dʒɨõ 'to do, make' | 3977 | one reduplicative word with two spellings?; not attested outside dictionaries? odic?
3977 | juihom | 1dʒɨõ | 2nd half of 2dʒɛ̃ 1dʒɨõ 'to play' | 3970 |
3998 | juibiobeucin | 1juu | to taste | 3966 | cf. 4046, 5387 below

The analysis of 3998 confirms that the qua of 3966 consists of jui + bae.

jui on the right

Names for phonetics are in my lay Tangut romanization.

LFW# | Alphacode | Reading | Gloss | Source of jui | Type
0662 | henfimjui | 1khiəə | to grind | 2647 | khy-phonetic
0687 | beljui | 2piẹ̃ | shovel | ? | pen-phonetic
0688 | harjui | 1viẹ | to mend | 2568 | semantic: sewing; pen-phonetic (but vowel isn't nasal!)?
0690 | tuajui | 1vəuʳ | (transliteration) | 3909 | pu-phonetic
0705 | herbaejui | 1ziẹ | time | 2647 | ?
0715 | hoajui | 2piẹ̃ | ancient Chinese battle axe | 2653 | pen-phonetic
2568 | baepikjui | 1rəʳ | to stitch, sew | 5391 | semantic?: 'stripe' > 'pattern' > 'to sew'?
2643 | dexjui | 1piẽ | tent | 5898 (3909?) | pen-phonetic?; 3909 is a better phonetic than pu-phonetic 5898
2647 | dexfiujui | 2khiəə | line, ranks | (0662?) | khy-phonetic
2653 | dexdoojui | 1piẹ̃ | horn | 0715 | pen-phonetic; would 'ancient Chinese battle axe' really be created before 'horn'?
3908 | doejui | 2ləu | proper; upright; regular | 2bi 'step, pace' | 2bi 'step' has no semantic or phonetic similarity to 2ləu 'proper'; maybe 2647 'line, ranks' was the real source: 'in line' > 'proper'?
3909 | girjui | 1pəu | the surname Pu | 2653 | semantic?: shares initial with 2653 but not rhyme; was this family associated with horns?
4046 | coepikjui | 1khɪ | bitter | 3966 | semantic: taste; also khy-phonetic, though the vocalic match is loose?
4340 | boxdexfiujui | 2khiəə | a kind of tree | ? | khy-phonetic
5363 | gosjui | 2thew | to play | 2653 | 1piẹ̃ has no semantic or phonetic similarity to 2thew 'to play'; phonetically similar to 5391 2sew, but are there other cases of th- ~ s- phonetic series in tangraphy?
5387 | guajui | 2ʃwɨii | astringent | (3966?) | semantic: taste?; loan from Middle Chinese 澀 *ʂɨp; -w- reflects a pre-Tangut *P-prefix; vowel length may reflect a lost final stop; tone 2 may reflect a pre-Tangut *-H suffix
5391 | pikjui | 2sew | piebald; stripe | 5391 | semantic?: 'to sew' > 'pattern'?
5898 | tarjui | 1phiuu | to cover, shelter | 2653 (3909?) | pu-phonetic?; 3909 is a better phonetic than pen-phonetic 2653

jui apparently has at least eight functions (A1-A3, B1-B5):

A. Semantic

1. sewing/pattern: 3917, 3969, 0688, 2568, 5391

2. taste: 3998, 4046, 5387

3. line: 3908

B. Phonetic

1. jon-phonetic?: 3970, 3977

2. khy-phonetic: 0662, 2647, (4046?), 4340

3. pen-phonetic: 0687, (0688?), 0715, 2643, 2653

4. pu-phonetic: 3909, 5898?

5. Cew-phonetic: 5363?, 5391?

C. Unknown: 0705
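
The grouping above can be cross-checked against the two tables with a short Python sketch. The variable names are hypothetical, the question marks on uncertain assignments are dropped, and the LFW numbers are copied from this post as transcribed here.

```python
from collections import Counter

# Functions A1-A3 and B1-B5 with their LFW numbers, as listed above.
JUI_FUNCTIONS = {
    "A1 sewing/pattern": ["3917", "3969", "0688", "2568", "5391"],
    "A2 taste": ["3998", "4046", "5387"],
    "A3 line": ["3908"],
    "B1 jon-phonetic": ["3970", "3977"],
    "B2 khy-phonetic": ["0662", "2647", "4046", "4340"],
    "B3 pen-phonetic": ["0687", "0688", "0715", "2643", "2653"],
    "B4 pu-phonetic": ["3909", "5898"],
    "B5 Cew-phonetic": ["5363", "5391"],
}
UNKNOWN = ["0705"]  # group C

# All LFW numbers in the 'jui on the left' and 'jui on the right' tables.
TABLE_LFW = [
    "3917", "3969", "3970", "3977", "3998",
    "0662", "0687", "0688", "0690", "0705", "0715", "2568", "2643", "2647",
    "2653", "3908", "3909", "4046", "4340", "5363", "5387", "5391", "5898",
]

# How many function groups each tangraph appears in.
counts = Counter(n for numbers in JUI_FUNCTIONS.values() for n in numbers)
multi = sorted(n for n, c in counts.items() if c > 1)
unlisted = sorted(set(TABLE_LFW) - set(counts) - set(UNKNOWN))

print(len(TABLE_LFW))      # 23
print(len(JUI_FUNCTIONS))  # 8
print(multi)               # ['0688', '4046', '5391']
print(unlisted)            # ['0690']
```

Three tangraphs are listed under both a semantic and a phonetic function, and 0690, which the table labels pu-phonetic, does not appear in any group of the list.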

Did literate Tanguts know which function of jui was relevant for any given tangraph, or did they just blindly memorize jui-combinations?

Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
