I should compile a file containing every proposed Khitan large script character reading I can find to avoid embarrassing myself.

In "Si-cret of the Roofed Four", I had no readings for

but last night I found the reading of the first character in Kane (2009: 181) and tonight I stumbled upon the reading of the second just a few lines down on the same page. I've seen that page many times. Ideally I should memorize all the readings on it (and in Kane 2009 as a whole).


corresponds to its Liao Chinese lookalike 子 *tsɨ 'child' and its homophone 紫 *tsɨ 'purple'. Its Khitan reading is very close to those of

<(t)sin> ~ <(t)siʔ>

corresponding to Liao Chinese 晉 *tsin, 信 *sin, and 習 *siʔ with final consonants. I tentatively conclude that the 广 roof is an arbitrary addition indicating a pronunciation similar but not identical to that of 子.

广 cannot be a symbol specifically indicating the presence of a final consonant because it also distinguishes


whose reading lacks a final consonant from

<kon> or <xon>?* 'ten'

whose reading may have had a final -n.

The reading <(t)so> is based on a correspondence with Liao Chinese 左 *tso 'left, to assist'. It reminds me of Japanese -so 'ten'** in archaic numerals like yaso 'eighty' < 'eight-ten'. Could 广+十 have originally represented a word for 'ten' in a Japonic language spoken in Parhae? Or is 广+十 derived from 左 *tso? 广 is similar to 𠂇 and 十 is similar to 工. But I don't know of any 广+十-like cursive or variant forms of 左. I did, however, just discover a Dunhuang variant of that resembles Jurchen

<jakunju> 'eighty'.

*The readings <kon> and <xon> are guesses based on the Jurchen numerals for twelve through nineteen


<jirhon> ~ <oniyohon>

which might be of Khitan origin. Janhunen (2003: 399) reconstructed Pre-Proto-Mongolic *-kU-n or *-xU-n '-teen'. I am uncertain about the initial consonant of the suffix because I do not know whether Jurchen borrowed '-teen' before or after *k lenited to h [x] in Jurchen.

**The native Old Japanese word for 'ten' was təwo < ?*təwə.

A-NET-ER ENIGMA

In "Four Rhinos", I proposed that the Khitan large script character

<(t)sɨ> ~ <(t)si>

was a cognate of 四 Liao Chinese *sɨ 'four'*. However, it turns out that there is an identical Chinese character (like 𠕀 but with a hook at the bottom right) which is a variant of 网 Liao Chinese *oŋ 'net'.

By coincidence, Korean 넷 net means 'four' ... but I digress.

If the Khitan large script character really is related to 网 'net', I might expect its readings to resemble some non-Chinese word for 'net'. But I know of no such words:

Written Mongolian tour

Manchu asu**

Korean kŭmul

Japanese ami

Perhaps the Khitan word for 'net' was si or began with si-.

There is a similar Khitan large script character that has a roof or hat-like element on top:

It looks like a man sitting on a 又 stool; ㄣ is his arms - the right pointing up and the left pointing down, 丷 is his head (or is he two-headed?), and 冂 is his legs. No, I don't really believe this is a pictograph.

I do think it is a compound. I have not yet seen the top element in any other Khitan large script graph. Was it specially created for this graph? Does it indicate a slightly different reading: e.g., <dzɨ> instead of <(t)sɨ>?

It appears in the second line of the Khentii stone inscription of 1084:

<? ? sɨ ? er ?>

The third character (cognate to 巳 Liao Chinese *sɨ?) represents a Sino-Khitan borrowing whereas the fifth represents native Khitan er. (Liao Chinese did not have the er of modern standard Mandarin). Does the second character appear anywhere else?

*In this scenario, some variant of 四 Liao Chinese *sɨ was devised or borrowed as a phonetic symbol, while the Khitan word for 'four' was written with unrelated characters (carried over from the Parhae script?):


The Khitan large script characters for 'one', 'two', 'three', and 'five', on the other hand, were identical to their Chinese equivalents: 一二三五. As Juha Janhunen (1994) pointed out, it is unlikely that the Khitan deliberately altered Chinese characters (e.g., 'four') at random. It is more likely that the non-Chinese characters are the products of gradual, random 'mutation' over a long period rather than arbitrary innovations in 920.

**Here's an extremely unlikely scenario: In Parhae, there was a Tungusic language with a cognate of Manchu asu for 'net', and that word was written as

which was then borrowed by the Khitan to write the Chinese syllable *sɨ resembling the *-su of *asu 'net'.

FOUR RHINOS AND THE ORIGIN OF THE KHITAN LARGE SCRIPT

In "Si-cret of the Roofed Four", I wrote about


which I derived from Liao Chinese *tsɨ.

Kane (2009: 178) listed two other Khitan large script graphs that had <si>-like readings:


<(t)sɨ> ~ <(t)si>: Liao Chinese 慈刺 *tshɨ, *tshiʔ

(Neither LC reading has initial *s-, so I don't know why Kane listed its reading as [sï]. I assume there is evidence for reading it with <s> elsewhere.)


<(t)si>: Liao Chinese 漆齊 *tshi, 西 *si

The first character looks like a cognate of 四 Liao Chinese *sɨ 'four'. Compare it to the open-bottomed cursive forms of 四 at cidianwang.com and this open-bottomed variant from the Song Dynasty 古文四聲韻. The vowel <ɨ> indicates that the reading <sɨ> cannot be based on pre-Late Middle Chinese since 'four' was *si in Early Middle Chinese and Late Old Chinese.

The second character is clearly a cognate of 犀 Liao Chinese *si 'rhino'. One of its variants has an exact lookalike in the ROC's Dictionary of Chinese Character Variants. The vowel <i> indicates that the reading <si> cannot be based on pre-Liao Chinese since 'rhino' was *siei in Late Middle Chinese, and *sei in Early Middle Chinese and Late Old Chinese.
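The dating argument in the last two paragraphs can be sketched as a lookup: given a Khitan reading, find which stages of Chinese have a matching form. This is only a minimal illustration. The table holds just the forms quoted above, without their asterisks, and the names `STAGES` and `matching_stages` are my own assumptions, not anything from Kane (2009).

```python
# Reconstructed forms quoted in the text, in chronological order.
# Late Middle Chinese 'four' is not quoted above, so it is left out.
STAGES = {
    "four (四)": {
        "Late Old Chinese": "si",
        "Early Middle Chinese": "si",
        "Liao Chinese": "sɨ",
    },
    "rhino (犀)": {
        "Late Old Chinese": "sei",
        "Early Middle Chinese": "sei",
        "Late Middle Chinese": "siei",
        "Liao Chinese": "si",
    },
}


def matching_stages(word: str, khitan_reading: str) -> list[str]:
    """Return the stages whose form is identical to the Khitan reading."""
    return [
        stage
        for stage, form in STAGES[word].items()
        if form == khitan_reading
    ]


if __name__ == "__main__":
    print(matching_stages("four (四)", "sɨ"))
    print(matching_stages("rhino (犀)", "si"))
```

With this table, <sɨ> for 'four' and <si> for 'rhino' each match only the Liao Chinese stage, which is the point of the argument. An exact-string match is of course far cruder than real sound correspondences.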

These characters with <(t)si>-like readings made me wonder how the Khitan large script was created. There is no doubt that it did not appear ex nihilo, though it is not certain whether it was based directly on the Chinese script or "an old local variety of the Chinese script" (e.g., Parhae or Tabghach) as proposed by Juha Janhunen (2003: 395-396). If this prototype - let's call it X - were originally meant to write language X, how would it be modified to fit Khitan?

Janhunen wrote that the large script "may have undergone some normalization in the context of the Liao empire," though I think it actually lacks normalization: e.g., nine different versions of 'six':

Such variation may imply age, though even the Khitan small script exhibits variation, and there is no doubt that the Khitan small script was a true innovation.

My guess is that the Khitanization of an existing script would involve the creation of new characters to represent segments, syllables, and/or words in Khitan that were absent in language X. I would predict that graphs for <q> and <ɣ>-syllables would be innovations. Yet the Khitan large script character 上 <qa> ~ <ɣa> (Kane 2009: 178) looks exactly like Chinese 上 with a Liao reading *ʂaŋ. Conversely, Khitan syllables with exact Chinese equivalents could be written with Khitan large script characters that made no phonetic sense from a Chinese perspective (e.g., 至 for <an> despite its Liao reading *tʂi) or entirely un-Chinese characters: e.g.,

<ai> and <ai>

instead of 哀 *ai. And the vowels of <(t)sɨ> and <(t)si> for

indicate readings of relatively recent origin. Are these 'updated' readings - old shapes with new readings assigned on the basis of Liao Chinese* - or were the characters also newly created? Why create characters for <(t)sɨ> and <(t)si>? Were those syllables absent from language X? Could the newer characters of the Khitan large script imply gaps - accidental and systematic - in the phonology of language X?

could have been devised to represent the syllables *tsɨ, *sɨ, and *si that were not in language X which had *ɕi.

One problem with this scenario is that the first character does not correspond to any Chinese character with a Liao reading *tsɨ even though it resembles 子 with that reading. But maybe it was originally intended to stand for <tsi> with a front vowel.

Another is that the first character corresponds to characters for Liao Chinese closed syllables


even though it has an apparent phonetic 子 implying an open syllable. 子 did represent a closed syllable *tsəʔ in Old Chinese, but I doubt the Khitan knew that, and even if they did, why would they use a derivative of a *-əʔ graph to write *-in syllables with a different vowel and final consonant?

*Cf. the multiple strata of readings in Sino-Japanese: e.g., 行 has

- the Go-on reading gyou < Old Japanese ŋgiyaũ < Early Middle Chinese *ɣæŋ

- the Kan-on reading kou < Old Japanese kaũ < Late Middle Chinese *xaŋ

- the Tou-Sou-on reading an < Middle Japanese < Song Chinese *ɦaŋ

Could variation in Khitan large script readings reflect strata of readings (archaisms inherited from language X vs. innovations based on then-current Liao Chinese pronunciation) as well as degrees of nativization of Sino-Khitan (e.g., nonnativized <ts> vs. nativized <s>)?

SI-CRET OF THE ROOFED FOUR

In "Ru-ted in a Ru-fless House?", I wrote,

Sinograph-like Khitan large script characters with readings that are closer to Middle Chinese than Liao Chinese are probable candidates for borrowing from the Parhae script.

I once thought that


might have such a reading. It resembles Chinese 子 with a 广 'roof'* added on top, and <(t)si(n)> is closer to the Middle Chinese reading *tsi of 子 than to its Liao Chinese reading *tsɨ.

Khitan had no native ts, so Chinese ts could be borrowed into Khitan as s as well as ts. Japanese and Vietnamese are two other languages without native ts that have similar patterns of borrowing: e.g., 子 as Sino-Japanese shi < *si and Sino-Vietnamese tử < *sɨ.

However, Liao Chinese also could be borrowed into Khitan with or without nativization: e.g., the Liao Chinese syllable *ʂɨ was Khitanized as

nativized <ś.i> and nonnativized <ʂ> (without a vowel!) and <ʂ.ɨ> (Kane 2009: 244).

So <(t)si(n)> could be of Liao rather than Parhae origin.

The final <(n)> is unexpected and reflects the fact that <(t)si(n)> corresponds to Liao Chinese 晉 *tsin and 信 *sin as well as Liao Chinese 習 *siʔ (Kane 2009: 178). Could it have been <(t)sin> in one position and <si(ʔ)> in another?
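A notation like <(t)si(n)> packs several candidate readings into one string: nonnativized <ts> or nativized <s>, with or without the final consonant. As a rough sketch (the function name `expand_reading` is my own, and the code treats every parenthesized segment as independently optional, which is an assumption about the notation rather than a claim about Khitan), the candidates can be listed mechanically:

```python
import re
from itertools import product


def expand_reading(reading: str) -> list[str]:
    """Expand parenthesized optional segments, e.g. '(t)si(n)'."""
    # Split into literal and optional pieces; the capture group keeps
    # the parenthesized pieces in the result.
    parts = re.split(r"(\([^()]*\))", reading)
    choices = []
    for part in parts:
        if part.startswith("(") and part.endswith(")"):
            # Optional segment: either present or absent.
            choices.append([part[1:-1], ""])
        elif part:
            # Literal segment: always present.
            choices.append([part])
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(expand_reading("(t)si(n)"))
```

For <(t)si(n)> this gives tsin, tsi, sin, and si. Which of those were actually used, and in which positions, is exactly the open question.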

The function of the 广 'roof' is unknown. This element occurs in at least three other Khitan large script characters. At least two have 'roofless' counterparts:

With 'roof' Without 'roof'



(Unattested? 十 plus dot does occur as a right-hand element of other Khitan large script characters:
<?> <?> <ai> <śiŋ>)



(Unattested? Looks like the Chinese character 矢, which also resembles a Khitan small script character.)

Did 'roofed' characters have readings that differed in a certain way from the readings of 'roofless' characters? Or were the 'roofs' just arbitrary additions?

The last two 'roofed' characters have obscure Chinese lookalikes:

广+子, a variant of 序 Liao Chinese *sy

𢇻, a variant of 知 Liao Chinese *tʂi and *ʂiʔ

序 Liao Chinese *sy vaguely resembles Khitan <(t)si(n)>, but the match is weak and probably coincidental.

The relevance of Chinese 𢇻 to Khitan 𢇻 cannot be determined until the reading of the latter is discovered.

*I use the term 'roof' loosely here. The radical 广 has been called 'shelter' in English to distinguish it from 宀 'roof'.

RU-TED IN A RU-FLESS HOUSE?

Andrew West is blogging again for the first time since May. I have a lot to say, but not enough time to say it, so I'll restrict myself to a single comment. One of the Khitan large script graphs that Andrew mentions is



About a month ago - on the afternoon of September 13 - it occurred to me that this graph might be cognate to Chinese 盧 which was pronounced *lu during the Liao Dynasty.

盧 consists of three parts:

虍 'tiger', usually written 虎

what looks like 田 'field'

皿 'vessel, dish, bowl'

In Old Chinese, 盧 was read as *ra and was used to write a number of (near-)homophones: 'cottage, house; hound; lance shaft, black' (Karlgren 1957: 37). None of those words have anything to do with tigers, so I regard Old Chinese *hra 'tiger'* as a phonetic in 盧 *ra. (Cf. Sagart 1999: 41 who viewed 虍 *hra 'tiger' as a phonetic in 虜 *raʔ 'captive' and 攄 *Cɯ(H)ra 'to extend'. The reconstructions are mine.) Old Chinese *ra became Middle Chinese *lo, and 盧 *lo was a transcription of Indic ro: e.g., in 毗盧遮那 *bi lo tɕiæ na 'Vairocana'.

The function of what appears to be 田 'field' is unknown. Perhaps it represents food, as Shuowen glosses 𧆨 (without 皿 'vessel' on the bottom) as 'food vessel', though as Karlgren noted, this meaning is not attested in a text. 盧 with 皿 'vessel' is also glossed as 'food vessel' in Shuowen, but "there are no pre-Han text examples in support of this". Thesaurus Linguae Sericae does not list 'food vessel' as a meaning of 盧. I do not know whether Xu Shen, author of the Shuowen, guessed that 盧 stood for a defunct word for 'food vessel' on the basis of its bottom component 皿 or if such a word was still current though no longer written. (*ra 'food vessel' might have been a prestigious word when its character was coined, but it could have then been regarded as 'vulgar' - i.e., not worth writing - while its character became a phonetic symbol for various unrelated *ra-words. Premodern Chinese texts only reflect a fraction of the Chinese varieties in use; many words must have come and gone without leaving a trace in written records.)

I think the Khitan large script character



is a simplification of the lower two-thirds of 盧: 田 atop 皿 - a 盧 'house' without its phonetic 虍 'roof' (actually a tiger, but vaguely similar to 广 'roof').

The Khitan reading <ru> probably postdates the raising of Middle Chinese *o to Liao Chinese *u. If the Khitan character were carried over from the Parhae script, I would expect it to be read as <ro> rather than <ru> since 盧 was pronounced *lo, not *lu, during the 大 Tae Dynasty (which surprisingly doesn't have its own Wikipedia article yet).

Sinograph-like Khitan large script characters with readings that are closer to Middle Chinese than Liao Chinese are probable candidates for borrowing from the Parhae script.

*虎 'tiger' with two extra strokes was read *hraʔ. The significance of the glottal stop is unknown. The word is similar to Shorto's (2006) Proto-Mon-Khmer *klaʔ 'tiger' and Matisoff's (2003: 599) Proto-Tibeto-Burman *k-la 'tiger'. The correspondence of Old Chinese *-r- and non-Chinese *-l- may imply irregular borrowing (through an *-l-less intermediary?) from some early Southeast Asian language. The Tangut member of this word family is

2ləi < *Cʌ-liH (*kʌ-liʔ?) 'tiger'

Pre-Tangut *-i may in turn be from an even earlier *-a.

BOR-OUGHT TO LIGHT

I should have checked Aisin Gioro (2012) to see if she had figured out the readings of the sets of similar Khitan small script characters that I mentioned in "Tracing the Line-age".


is still <a> (which is the reading I learned 16 years ago - it was the first small script character I memorized).

The reading of

<a>-plus-stroke is said to be in Aisin Gioro 2011b, which I can't identify since the paper has no bibliography.

Her readings for this trio

are <ur ~ or>, <bur ~ bor>, <?>.

The third character is not in her paper.

The added stroke of the second character indicates an initial <b>.

Is the choice of vowel (<u> or <o>) determined by harmonic rules?

Other Khitan small script characters whose readings contain similar vowel alternations in her reconstruction are

<yu> ~ <yo>, <u> ~ <o>, <u> ~ <ö>

I do not know the difference, if any, between her two <u> and


Kane (2009: 372) wrote that the usage of her <u> ~ <ö> "is similar to <ú> [= her <u> ~ <o>]". All three <u>-type vowel symbols were used to write Liao Chinese *-u: e.g., (Kane 2009: 246-247; the notation is his):

部 <b.u> ~ <b.û> ~ <b.û.ú>

武 <w.u> ~ <w.ú>

蜀 <ś.ú> ~ <ś.û>

I don't know what to make of the double vowel marking of <b.û.ú>. I can't find any examples of <u>-sequences in the Khitan small script corpus that I have on hand.

CON-PHỞ-UNDED

Nguyễn Đình-hoà (1980: 66) listed ten Chinese food items with Vietnamese names borrowed from Cantonese. Unfortunately he did not include their Cantonese sources in either romanization or Chinese characters. I am puzzled by phở, my favorite Vietnamese dish, as there is no Cantonese syllable like it. Even its falling-rising tone is non-Cantonese. The closest Cantonese word I can think of is 粉 fan 'noodles' with a high rising tone.

The English Wikipedia has two etymologies for phở:

The Oxford Dictionary suggests the word "pho" may be derived from French pot-au-feu (beef stew). Other observers suggest that the soup's name originated from Cantonese rice vermicelli hofan (河粉), which is abbreviated as either fan2 (粉, phấn) or Ho2 (河, Hà in [Vietnamese]), the two sounds giving the name "pho".

I think French feu [fø] is a better phonetic match for phở [fəː] than Cantonese fan [fɐn]. (Vietnamese has no [ø].)

Given that phở "first originated in the early 20th century in northern Vietnam" during the last days of the nom script, I had doubts that it had a nom spelling, but Trần 2004 lists one: 頗, read as phả 'quite' in Sino-Vietnamese. The nomna.org database also has a semantophonetic spelling 米+頗 with 米 'rice'.

