Amaravati: Abode of Amritas

19.11.23.23:59: THE ETYMOLOGY OF CANTONESE 'TONGUE'

Wiktionary gives two etymologies for Cantonese 脷 lei⁶ 'tongue':

From 利 (“benefit; profit”), used as a euphemism for 舌 (“tongue”), which is homophonous to 折, 蝕／蚀 (sit6, “to be at a loss”).

Alternatively, it may be from 舐 (OC *ɦljeʔ [= *mI-leʔ in my reconstruction], “to lick”), preserving the Old Chinese initial *l- (Schuessler, 2007).

Both of these proposals have issues.

Let's start with the second one which is closer to my position. 舐 *mI-leʔ should hypothetically become Cantonese ˟sei5, not lei6. I think it might be more accurate to say that lei6 is related to 舐 *mI-leʔ rather than from it. lei6 may be from a derivative of 舐 *mI-leʔ that lost its first syllable and has a nominalizing suffix *-s ('lick-NMLZ' > 'that which licks' = 'tongue'):

*mIleʔ-s > *mIlies > *məlieh > *lie̤ > 19th c. li6 > lei6

The presence of *mə- blocked the Late Old Chinese sound change *l- > *j- from applying to -l-. *mə- must have been dropped at some point after *l- > *j-.
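The blocking effect can be sketched in a few lines of Python. This is only an illustration under my own simplifying assumptions: *l- > *j- is modeled as a rule that touches word-initial l only, and the function names are hypothetical.

```python
def l_to_j(form: str) -> str:
    """Late Old Chinese *l- > *j-, modeled as applying only to word-initial l."""
    return "j" + form[1:] if form.startswith("l") else form


def drop_presyllable(form: str) -> str:
    """Loss of the presyllable *mə- (two symbols: m + ə)."""
    return form[2:] if form.startswith("mə") else form


# Order proposed above: *l- > *j- runs while *mə- still shields the lateral,
# and *mə- is dropped only afterwards.
with_block = drop_presyllable(l_to_j("məlieh"))

# Counterfactual order: *mə- is dropped first, exposing l- to the shift.
without_block = l_to_j(drop_presyllable("məlieh"))

print(with_block, without_block)
```

Only the first ordering leaves a lateral initial for Cantonese to inherit, which is why the loss of *mə- has to be dated after *l- > *j-.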

Now for the homophone avoidance taboo etymology [revised 11.24.20:06]: It makes sense in Cantonese now, but would it have made sense at the Proto-Yue level? The 脷 lei6-type word for 'tongue' is widespread throughout Yue Chinese¹ (see this map), and therefore is likely to have been in Proto-Yue. 舌 'tongue' and 折 (蝕 is a Cantonese respelling) 'to be at a loss' are homophonous in the Middle Chinese phonological tradition and were probably also homophonous in Proto-Yue.

However, Wiktionary also reports 脷 for 'tongue of an animal' in Sichuan (i.e., Sichuan Mandarin) and Hakka, though it does not specify any dialects. There is no Yue in Sichuan, so 脷 there cannot be explained away as a Yue loan. The only Hakka 脷-like word for 'tongue' that I could find in Wiktionary is 脷錢 'tongue' in 陸川 Luchuan Hakka. 脷錢 in that variety and in 柳州 Liuzhou Mandarin is a borrowing from neighboring Yue dialects and is not evidence for reconstructing the word represented by 脷 back to the common ancestor of Hakka and Mandarin as well as Yue.

But if 脷 is in one or more varieties of Sichuan Mandarin, then the word represented by 脷 would be reconstructible in Old Chinese. (The character 脷 seems to be a relatively recent invention and cannot be 'Old Chinese'.) And in the early 2nd century, Late Old Chinese 舌 *ʑɨat 'tongue' and 折 *dʑiet 'to lose' might not yet have become homophones (I don't have any rhyming data for that period), so there would be no motivation to replace 折 with 利 *lis 'profit', assuming homophonic substitution existed 1900 years ago.

Without any source to verify that 脷 is in Sichuan, I wouldn't bet on that scenario.

So I need to go back up to the Yue level and ask:

1. How old is the practice of homophonic taboo substitution? Can it be reconstructed at the Proto-Yue level?

2. How common is the practice of homophonic taboo substitution in Yue varieties?

3. Where did the practice of homophonic taboo substitution originate?

4. How did the practice of homophonic taboo substitution spread: inheritance or diffusion?

¹The earliest attestation of the character 折 representing a word 'to lose' (not quite 'to be at a loss', but close enough) is in 漢書 Hanshu (Book of Han, 111 AD).

APPENDIX: Cantonese readings of 舐

Above I wrote that the expected reading of 舐 should be ˟sei5. 粵語音韻集成電子版 A Chinese Talking Syllabary of the Cantonese Dialect: An Electronic Repository has five readings:

laai2 < *l̥ieʔ < *sI-leʔ, cognate to Old Chinese 舐 *mI-leʔ 'to lick'

At one stage of Old Chinese, *sl- fused to *l̥-.
Old Chinese *-ICe has two reflexes in Cantonese: -ei < 19th c. -i and, in a minority of cases, -aai. The latter seems to be either native (meaning that the bulk of Cantonese vocabulary was borrowed or remodelled after a prestige dialect) or substratal.

lem2 < Old Chinese *l̥emʔ 'to lick'

This word has no known Old Chinese spelling and first appears in Middle Chinese as 舔, so I cannot give an Old Chinese character for it.

lim2, ditto but with a less conservative rhyme, the reflex of Old Chinese *-em in the newer, majority layer of Cantonese vocabulary

saai2 < *sI-leʔ, cognate to Old Chinese 舐 *mI-leʔ 'to lick'

At one stage of Old Chinese, *sl- fused to *s-.

saai5 < *ʑ- < *mʑ- < *mj- < Old Chinese 舐 *mI-leʔ

This is the closest to the ˟sei5 I predicted, but its rhyme is the (older?) minority reflex of Old Chinese *iCe.

APPENDIX 2: How can Old Chinese *sl- fuse in two different ways?

The two fusions occurred at different times. I don't know the relative chronology, so I provide two different scenarios below.

Scenario 1: *sl- > *l̥- occurred first.

stage 1	stage 2	stage 3	stage 4	example
*sIl-	*sl-	*l̥-	*l̥-	laai2
*sIl-	*sIl-	*sl-	*s-	saai2

Scenario 2: *sl- > *s- occurred first.

stage 1	stage 2	stage 3	stage 4	example
*sIl-	*sl-	*s-	*s-	saai2
*sIl-	*sIl-	*sl-	*l̥-	saai2

At an even earlier stage some or all *sl- could have been *sVl-. First vowel loss in both scenarios above is unpredictable; both monosyllabic and disyllabic variants of *s(I)-leʔ existed in stage 2, just as full and abbreviated variants exist in English today (e.g., select [səˈlɛkt] ~ [slɛkt]).
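The two scenarios can be checked mechanically. The Python sketch below is an illustration rather than a claim about the actual chronology: it assumes that the short stage-2 variant is *sl-, that each fusion is active at only one stage, and that the long variant loses its vowel only after the first fusion has run its course. All function names are my own.

```python
LH = "l\u0325"  # voiceless lateral: l + combining ring below


def lose_vowel(form):
    """*sIl- > *sl- (loss of the first vowel)."""
    return form.replace("sIl", "sl", 1)


def fuse_to_lh(form):
    """*sl- > voiceless *l-."""
    return form.replace("sl", LH, 1)


def fuse_to_s(form):
    """*sl- > *s-."""
    return form.replace("sl", "s", 1)


def derive(early_vowel_loss, first_fusion, second_fusion):
    """Return the four stages of one variant of *sIl-."""
    form = "sIl-"
    stages = [form]                        # stage 1
    if early_vowel_loss:                   # unpredictable: only one variant shortens
        form = lose_vowel(form)
    stages.append(form)                    # stage 2
    form = lose_vowel(first_fusion(form))  # first fusion, then late vowel loss
    stages.append(form)                    # stage 3
    form = second_fusion(form)             # second fusion hits the new *sl-
    stages.append(form)                    # stage 4
    return stages


scenario1 = [derive(True, fuse_to_lh, fuse_to_s), derive(False, fuse_to_lh, fuse_to_s)]
scenario2 = [derive(True, fuse_to_s, fuse_to_lh), derive(False, fuse_to_s, fuse_to_lh)]

for name, rows in (("Scenario 1", scenario1), ("Scenario 2", scenario2)):
    print(name)
    for row in rows:
        print("\t".join("*" + stage for stage in row))
```

Swapping the order of the two fusion functions is the only difference between the scenarios; the variant that shortens early always undergoes whichever fusion comes first.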

19.11.22.23:55: SINO-TIBETAN WORDS FOR 'TONGUE'

The native Cantonese word 脷 lei⁶ 'tongue' is reminiscent of l-words for 'tongue' found elsewhere in Sino-Tibetan: e.g.,

Old Chinese 舌 *mIlat

a Late Old Chinese form like ?*mliet was borrowed into Proto-Hmong-Mien as *mblet (Ratliff 2010: 48)

Tangut 𗢯 3190 1lhwa4 < pre-Tangut *PIl̥aC
Written Burmese lhyā < Proto-Lolo-Burmese *sla^1b (Burling 1967: 97; Burling disregarded Burmese spelling, so there was no evidence pointing to a *glide in his data; perhaps the reconstruction can be emended to *slja^1b)
Written Tibetan lce < pre-Tibetan *ɣl̥ʲe (Hill 2013: 195), ljags < pre-Tibetan *ɣlʲaks (hon.; my attempt at a Hill-style reconstruction; I've changed his *ḫ to *ɣ which is his phonetic value for *ḫ)

I wish I knew the Pyu word to complete the set of the 'big five' Sino-Tibetan literary languages, but Pyu basic vocabulary is all but unknown.

To keep things simple, I have not looked at other potentially related *l-words in Chinese, much less other Sino-Tibetan *l-words for 'tongue' or 'lick' available at STEDT.

Before one jumps to the conclusion that all of the above must share an *l-root, one should note Schuessler's (2007: 467) warning:

Initial *l- is a near-universal sound symbolic feature for 'lick / tongue', hence similar words in other languages are not likely to be related, such as MK-PVM [Mon-Khmer-Proto-Viet-Muong] *laːs 'tongue' [Ferlus]; Kam-Tai: S[iamese] lia^A2 < *dl- 'to lick' [cf. ], PKS [Proto-Kam-Sui] *lja² ? [Thurgood].

Proto-Kra *l-ma^A 'tongue' (Ostapirat 2000: 223; cf. Proto-Kam-Sui *ma^A 'id.' [Peiros]), Proto-Hlai *liːnʔ 'id.' (Norquest 2016 appendix: 127), Proto-Tai *liːn^C 'id.' (Pittayaporn 2009: 389), and Proto-Austronesian *lidam (on the basis of only Puyuma and Rukai; Blust and Trussel 2019) also fit the pattern. (A single Proto-Kra-Dai word for 'tongue' doesn't seem to be reconstructible.)

Continental 'Altaic' words for 'tongue' have noninitial l-: Ming Jurchen ilenggu ~ ilenggi, Written Mongolian kelen, and Turkish dil. (But peripheral 'Altaic' words don't: e.g., Korean hyŏ < *he and Japanese shita.)

European examples are English lick and Latin lingua 'tongue'. (The latter, of course, has an irregular l- < *d- which became the t- in tongue. Wiktionary derives the l- of lingua by analogy with lingō 'I lick', the true Latin cognate of lick. If we ignore that inconvenient fact, we could be daring and 'reconstruct' a 'Proto-World' *lV 'tongue/lick'. No.)

Schuessler was of course warning against linking Sino-Tibetan words to non-Sino-Tibetan words which happen to share the same initials, but lookalikes do also occur within families: e.g., lick and lingua. There could, at least in theory, be two unrelated lateral roots for 'tongue' in Sino-Tibetan.

Trying to reconcile the small set of Sino-Tibetan forms that I listed at the beginning runs into all sorts of difficulties:

Prelaterals (i.e., whatever comes before the L: prefixes or first syllables of disyllabic roots?): If Old Chinese *mI- and pre-Tangut *PI- are prefixes, what are their functions? Maybe the unknown pre-Tangut labial *P- was *m-. (The high vowel *-I- in both proto-forms is needed to account for the fronting of *a.) The labials in those prelaterals clash with Burling's Proto-Lolo-Burmese alveolar *s- and Hill's pre-Tibetan velar *ɣ-.

Laterals: Chinese and Proto-Lolo-Burmese have a voiced *l-, pre-Tangut has voiceless *l̥- (pre-Tangut *Sl- would correspond to Tangut l- + vowel tension), and pre-Tibetan has both voiced *-lʲ- and voiceless *-l̥ʲ- with palatalization that might be a trace of a preceding high vowel *-I-:

*ɣIl̥- > *ɣIl̥ʲ > *ɣl̥ʲ-

*ɣIl- > *ɣIlʲ > *ɣlʲ-

Vowels: Three types are in the six words at the top:

Cantonese -ei only recently broke from *i.
One pre-Tibetan word has *e.
The others have *a.

Codas:

Cantonese tone 6 points to a voiced pre-Cantonese initial *l- and a pre-Cantonese final *-(p/t/k-)s.
Old Chinese has final *-t.
Proto-Lolo-Burmese has no coda.
Pre-Tibetan has both codaless and *-ks forms.
Tangut *-C could have been

*-k (as in pre-Tibetan)
*-t (as in Old Chinese)
or even *-p (cf. labial-final 舔 'to lick' < Old Chinese *l̥emʔ [unattested!])

- the same three stops that might have preceded *-s in pre-Cantonese.

If one regards the various codas as suffixes, one should ideally be able to identify the functions of those suffixes. Affixation can be a dangerous pseudoexplanation for mismatching segments in forms under comparison.

This exercise shows how far we are from being able to reconstruct Proto-Sino-Tibetan. Much more work needs to be done on subgroups before the outlines of their common ancestor can emerge.

19.11.21.22:01: BACKLOGS AND RECOMMENDED READING: DMITRIEV, PHAN AND DE SOUSA, FERLUS, KING

I don't like interrupting series because I rarely get back to them - two examples being my Golden Guide posts (which I stopped almost five years ago!) and a series on Mon that I started in September but didn't post until today. (I've posted the Mon series on my front page above yesterday's post even though my other September posts have long since fallen off.)

The Mon series should make up for my dearth of original content today. I don't have time to say much about today's finds:

Sergey Dmitriev's Мы живём в Древнем Китае. Энциклопедия для детей (We Live in Ancient China: An Encyclopedia for Children; 2018) should be translated into English. I love the comics-style pages. It would be nice for American children to see that China is much more than either a factory or, worse yet, an adversary.
John D. Phan and Hilário de Sousa's "A Preliminary Investigation into Southwestern Middle Chinese" (2016) demonstrates why my old belief that Sino-Vietnamese (SV) was based on a Cantonese (Ct)-like language was wrong:

Ct has aspirated reflexes of *voiced obstruents but SV doesn't
Ct has tone 1' with *voiced sonorants but SV has tone 1
Ct 有 jau lacks the h- in SV 有 hữu

Phan's 2013 PhD dissertation set me straight six years ago, but it was good to see a short, clear demonstration of the differences between Ct and SV.

Is a slide about velar softening missing?

Pyu makes a cameo appearance on slide 3 (in which Chenla is further northwest than I'd expect)

Was the Dong Son culture Vietic-speaking? Ferlus (2009) and I would say yes, but Schliesinger (2018) would say no:

the people we call today Vietnamese were even more recent arrivals in the Red River Delta as previous thought, probably arriving from the 10th century AD onward, and that the migration (or movement) of Viet-Muong people generally has been from south to north and not reverse. (p. 4)

But this claim of late arrival clashes with the fact that Vietnamese is full of layers of Chinese loanwords going perhaps as far back as the end of the Dong Son period. Those words were acquired during a millennium of Chinese rule in what is now northern Vietnam - a region that Schliesinger regards as a purely Tai area until the 10th century. Vietnamese could not have gotten all those loans via Tai because Vietnamese has far more Chinese loanwords of various ages than the Tai of northern Vietnam.

I first saw Ross King's "Ditching 'diglossia': Ecologies of the spoken and inscribed in pre-modern Korea" (2015) over a year ago, but it took a second look tonight for me to be sold on the inadequacy of the term diglossia to describe the linguistic situation of premodern Korea. Ross suggests the term Sinographic Cosmopolis, but I am afraid it is too long to catch on.

19.11.20.23:23: WHAT IS THE RELATIONSHIP BETWEEN THE KHITAN SMALL SCRIPT AND THE JURCHEN LARGE SCRIPT? (PART 2: THE LOYALTY PRINCIPLE)

(Back to Part 1)

In the Khitan small script, there is a nearly one-to-one relationship between words and character blocks: e.g., the trisyllabic word taulia 'hare' is written as a single block of three characters:

<tau.li.a>

Exceptions are polysyllabic Chinese loanwords which are written with one block per syllable: e.g., the name

<340.339.303 244.357> <h.i.ing s.ung> Hingsung (Xing 2.2)

from Liao Chinese 興宗 *1hing 1tsung 'flourishing ancestor'. (Khitan had no /ts/ in its native phonological inventory, so Chinese /ts/ was often approximated as /s/.)

In theory the name Hingsung could have been written as one five-character block

<340.339.303.244.357> <h.i.ing.s.ung>

since neither /xiŋ/ nor /suŋ/ is a word in Khitan, but the loyalty principle of imitating the original Chinese spelling with separated syllables overruled the normal lexical principle of one block per word.
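The interaction of the two principles can be modeled as a tiny grouping function. This Python sketch is my own illustration: the function name, the `chinese_loan` flag, and the representation of a word as a list of syllables (each a list of character transliterations) are all assumptions, and real block formation surely had more exceptions.

```python
def to_blocks(syllables, chinese_loan=False):
    """Group character transliterations into blocks.

    syllables: list of syllables, each a list of character transliterations.
    chinese_loan: if True, apply the loyalty principle (one block per
    Chinese syllable); otherwise apply the lexical principle (one block
    for the whole word).
    """
    if chinese_loan:
        blocks = [list(syllable) for syllable in syllables]
    else:
        blocks = [[char for syllable in syllables for char in syllable]]
    # Characters within a block are joined by dots, blocks by spaces.
    return "<" + " ".join(".".join(block) for block in blocks) + ">"


print(to_blocks([["tau"], ["li"], ["a"]]))
print(to_blocks([["h", "i", "ing"], ["s", "ung"]], chinese_loan=True))
print(to_blocks([["h", "i", "ing"], ["s", "ung"]]))
```

The last call gives the hypothetical one-block spelling that the lexical principle alone would predict for Hingsung.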

The loyalty principle has no equivalent in the Khitan large script (KLS). There is no strict one-to-one correlation between Chinese characters and KLS characters:

gloss	Chinese	Liao Chinese pronunciation	KLS	transliteration	Khitan pronunciation
mountain	山	*1shan	山	MOUNTAIN	shan
hundred	百	*4pai	高	bai	bai
emperor	皇帝	*1'hong¹ 3ti	皇帝	EMPEROR₁ EMPEROR₂	hongdi
(a name)	韓	*1'han	何至	ha an	han
commander	帥	*3shoi	夫坐	sho oi	shoi

Strictly speaking, 'mountain' and 'commander' are probably parts of Chinese borrowings in Khitan rather than Khitan words.

I have not seen the Chinese borrowing bai 'hundred' outside the 耶律昌允 Yelü Changyun KLS inscription (1062); the usual word is native jau.

The fact that the name 'Han' and 'commander' are written with two KLS characters may indicate that either the KLS had no phonograms <han> and <shoi> for the monosyllables han and shoi or that the KLS may have had logograms pronounced han and shoi which were inappropriate for 'Han' and 'commander' because they stood for other words. A study of multiple-character KLS spellings for Chinese monosyllables may enable us to guess which monosyllables did not have phonograms in the KLS.

"May", because there is at least one case of a Chinese monosyllable with both one- and two-character KLS spellings: 上 *3shang corresponds to

~~

<shang> ~ <sha.ang> ~ <sha.ang>

in lines 3, 1, and 17 of Yelü Changyun. There is also a KLS 北 <shang> used to write Liao Chinese 尚 *3shang. Perhaps 仲 and 北 are morpheme-specific logograms corresponding to the Liao Chinese homophones 上 and 尚.

The KLS does have a character 上 which looks exactly like Liao Chinese 上 *3shang, but KLS 上 represents the syllable ha instead of shang. KLS can be disorienting from the perspective of someone accustomed to the Chinese script because so many KLS characters do not function like their Chinese lookalikes: e.g.,

高 KLS <bai> vs. Liao Chinese *1kaw
何 KLS <ha> vs. Liao Chinese *1'xo

the KLS sound value may reflect a Sino-Parhae pronunciation of 何 as *ha (cf. Sino-Korean ha and Middle Chinese *ɣa; all these a-readings predate *a-raising/rounding in Chinese)

至 KLS <an> vs. Liao Chinese *3chi
夫 KLS <sho> vs. Liao Chinese *1fu
坐 KLS <oi> vs. Liao Chinese *3tso
仲 KLS <shang> vs. Liao Chinese *3chung
北 KLS <shang> vs. Liao Chinese *4puj
上 KLS <ha> vs. Liao Chinese *3shang

Did the Khitan randomly decide to retain Chinese-like readings for some Chinese characters (山, 皇, 帝) and assign arbitrary non-Chinese readings to others (高 etc. in the list above)? I don't think so. I think the un-Chinese readings of 高 etc. originate from the use of those characters as semantograms for non-Chinese languages. 高 etc. may also be simplifications of more complex Chinese characters: e.g., 至 could be a 'katakana' phonogram reduction of a semantogram like

侄厔𦤵䑒咥姪庢挃洷𤞂𦤷晊桎𦤶𦤹𦤺眰祬秷𦤻

𦤼䘭臷臸蛭𢰙𦤿䑓輊𦥁𧠫𫇎䬹銍𥔊𦥂𦥄𦥅𦥆𦥇

𦥈𦥉𦥊𦥋𦥌𨖹𦥎𦥎𦥏𫇑𨆧𪗻𥒓𦤸𦤽𦤾𦥀𦥃𦥍𪏀

𫇏𫇐𬛱𬛳𬛶𬛷𮍢

致𠊷𡍶㨖㮹㴛𤸓緻𦟔𦥐𧤡𧩼䞃䦯𩋩

室𠋤𢯶𧫡𩋡鰘

窒㗧䏄膣螲

𡌥𡏀𣖭

(Some of those characters may postdate the creation of the KLS and therefore be disqualified as potential cognates of KLS 至 <an>.)

Once again I am out of time, so I didn't get to write about the Jurchen text from part 1, much less come even remotely close to answering the title question. What I originally thought might be a single post just gets longer.

¹Yesterday it occurred to me that I could differentiate between 'yin' and 'yang' tones in my tonal notation by marking yang tones with '. I project the absence of non-1 yang tones (2'-, 3'-, 4'-) in modern Mandarin back into Liao Chinese, but I could be wrong.

19.11.19.23:15: WHAT IS THE RELATIONSHIP BETWEEN THE KHITAN SMALL SCRIPT AND THE JURCHEN LARGE SCRIPT? (PART 1)

The short and oversimplified answer is that there isn't any.

The real answer is more complicated.

The defining characteristic of the Khitan small script is how its characters are combined into blocks. For that reason, Shimunek (2017) calls it the 'assembled script' to avoid committing to the term 'small script'¹.

Kiyose (1977: 27-28) proposed that the elusive Jurchen small script is nothing more than the Jurchen large script characters combined into Khitan small script-like blocks. The known examples of these Jurchen blocks are in 弇州山人四部稿 Yanzhou shanren sibu gao (Draft [Catalog of] the Four Categories of Yanzhou Shanren['s Library]; 16th c.) and 方氏墨譜 Fang shi mopu (Mr. Fang's Ink [Cake] Book, 1588) and on a 牌子 paizi (travel pass).

Here is Kiyose's (1984) decipherment of the eight blocks in Yanzhou and Fang shi which can be seen at Wikipedia:

row	block #	block	transliteration	meaning	Chinese	meaning
1	1		gen.giyen	bright	明	bright > wise
	2		wan	prince (< Chn)	王	prince
	3		tiqo.ci.ghun	heedful-if-?	慎	heedful
	4		de	virtue (< Chn)	德	virtue
2	5		duwin	four	四	four
	6		tuli.le	outside	夷	foreigner
	7		hiyen	all (< Chn)	咸	all
	8		an.da.hai	guest	賓	guest

'When a wise prince is heedful of virtue /

'Foreigners from the four quarters come as guests'

(tr. by Kiyose 1984: 84)

Unlike most Khitan small script blocks, the blocks in that text are purely vertical: e.g., <tiqo.ci.ghun> and <an.da.hai> are vertical stacks of three characters - an arrangement never found in the Khitan small script. (But two-element vertical stacks like <gen.giyen> and <tuli.le> are occasionally found in the Khitan small script.)

Making images of those blocks and their components took so long that I don't have time to write about the words they represent or how they're strung together! Next time ...

In the meantime, I thank Jason Glavy for making the font that is the basis of nearly all my 600+ Jurchen images. (Seven of the eight images in the transcription of the Yanzhou/Fang shi text above are modifications of characters from his font; only <duwin> 'four' is unaltered.) I couldn't have written any of my many posts about the Jurchen script over the last eight years without his font.

¹The terms 'small script' and 'large script' are only known from Chinese sources. 遼史 Liao shi (History of the Liao Dynasty) vol. 64 says that 耶律迭剌 Yelü Diela

能習其言與書，因制契丹小字，數少而該貫。

'was able to learn their [the Uyghur] spoken language and script. Then he created (a script) of smaller Khitan characters which, although few in number, covered everything.' (tr. by Kane 1989: 2)

That passage hints at the possibility of the Khitan small script being somehow influenced by Uyghur and indicates that the small script had 'few' characters. The 'assembled script' has characters combining into words as in the Uyghur script and has fewer characters than the other Khitan script (the 'linear script'), so I am certain that the 'assembled script' is the small script (and that the 'linear script' is the large script).

Contrast the "few" characters of that passage with the description of the creation of a Chinese-like first Khitan script with "several thousand characters" in 新五代史 Xin wu dai shi (New History of the Five Dynasties) vol. 72:

多用漢人，漢人教之以隸書之半增損之，作文字數千，以代刻木之約。

'He [阿保機 Abaoji, the first Khitan emperor] employed many Chinese, who taught them [the Khitan] how to write by altering characters in the clerical script, adding here and cutting there. They created a script of several thousand characters, replacing the contracts made by making notches on wood.' (tr. by Kane 2009: 167)

That earlier script must be the large script which has over a thousand known characters and resembles Chinese more strongly than the small script.

Unfortunately, these passages from Liao shi vol. 2 using the term 'large script' do not give any specifics:

五年春正月乙丑，始制契丹大字。

'Fifth year: spring, first month, yichou day: work began on the creation of the Kitan large script.'

九月 [...] 壬寅，大字成，詔頒行之。

'Ninth month [...] renyin day: The large script was completed. It was implemented by imperial edict.' (tr. by Kane 2009: 167)

There was no Khitan script before the fifth year of Abaoji's reign, so the large script must be the earliest Khitan script - the "script of several thousand characters" mentioned in Xin wu dai shi.

19.11.18.22:06: JURCHEN LARGE SCRIPT CHARACTER DERIVATIONS: <STAR>, <GIYA>, <HOTO>, <LE>

The first of these occurred to me last night; the rest are from today.

Jurchen	Jurchen reading	Jurchen gloss	Jurchen etymology	cognate sinograph	Chinese gloss	source reading
~	osiha	star	< Proto-Tungusic *xōsī (Vovin 1996 class handout)	牛	cow	Para-Japonic cognate of Proto-Japonic osi or usi 'cow'
	giya [kʲa]	street	< post-Early Middle Chinese 街 *kja	家	house	post-Early Middle Chinese *kja
	hoto	-	-	cf. 土 'earth'		a word related to the source of Manchu hoton 'city'
~	le	-	-	礼	ceremony	Middle Chinese *lḛj

A. The Jurchen logogram <STAR> may be a Parhae cognate of standard Chinese 牛 <COW> which was once used to write a para-Japonic cognate of Proto-Japonic *osi or *usi 'cow' and later borrowed to write an unrelated Jurchen soundalike osiha 'star'. That borrowing must postdate the loss of *x- in pre-Jurchen.

(The resemblance between Proto-Tungusic *xōsī 'star' and Japanese hoshi 'id.' is fortuitous. Japanese h- goes back to *p-, and Proto-Tungusic *p- became p- [later f-] rather than zero in Jurchen.)

The second form of <STAR> with ㇓ on the left and a hook on the bottom is from Grube (1896: 1). Without access to the Berlin manuscript that he used, I cannot verify how accurate his handwritten form of <STAR> is.

B. The Jurchen phonogram <giya> may be a Parhae cognate of standard Chinese 家 <HOUSE> (post-Early Middle Chinese *kja) used to write the syllable giya [kʲa]. Jurchen speakers borrowed giya 'street' from post-Early Middle Chinese 街 *kja 'id.' (via Sino-Parhae?; cf. Sino-Korean 街 ka) but wrote it with a version of 家 which was homophonous with 街 in the Chinese known to the Jurchen. (That variety of Chinese had merged the rhymes of 街 and 家, whereas modern standard Mandarin 街 jiē reflects a variety that had not merged those rhymes. 街 and 家 are not homophonous in any modern variety of Mandarin at 小學堂 Xiaoxuetang: compare their readings here and here.)

C. The Jurchen phonogram <hoto> containing 土 <EARTH> may have originated as a logogram <CITY> for an areal word attested in Koguryŏ place names (as 忽 *hot), Manchu (hoton, a loan from Mongolian), and Mongolian qota(n). The logogram could have originated in the Parhae script (to represent the *hot-word from Koguryŏ) or in the completely lost Northern Wei script (to represent a Serbi cognate of Mongolian qota[n]). This logogram may be original to the Parhae or Northern Wei precursor of the Jurchen script and therefore lack a cognate in the standard Chinese script.

D. The Jurchen phonogram <le> [lə] may be a cognate of standard Chinese 礼 <CEREMONY>. In Liao and Jin Chinese, 礼 was pronounced *li which would have been a less than optimal match for Jurchen [lə]. The use of a cognate of 礼 to represent the syllable /lə/ probably predates the shift of *-ej to *-i in northeastern Chinese: i.e., it may go back to the Parhae script or perhaps even the Northern Wei script.

Hiragana れ <re> and katakana ㇾ <re> are respectively derived from a cursive form of 礼 and the right side of 礼, so I regard them as potential 'relatives' of <le>.

The second form of <le> with 天 on the left and a hook on the bottom is from the Berlin copy of the Ming dynasty Bureau of Translators vocabulary and could be a mistake for the correct form with 夫 on the left. The Jurchen script in extant copies of the vocabulary was probably written by Chinese scribes and hence may contain nonnative errors. It might be difficult to differentiate between genuine Ming Jurchen innovations and scribal errors. I assume that the dots of the Ming Jurchen characters

<DAY> inenggi and <MOON> biya

are genuine innovations, but the replacement of 夫 with 天 in <le> may not be.

19.11.17.21:07: SINO-PARHAE INFLUENCE ON THE JURCHEN SCRIPT?

Janhunen (1994: 133) proposed that the Jurchen script was a descendant of an "old local system of writing" rather than a 12th century creation as commonly assumed.

An obvious candidate for a concretely identifiable historical entity that had the potential to create a written language in pre-Liao Manchuria is the Bohai 渤海 [= Parhae] kingdom (698-926).

[...]

The Khitan and Jurchen "large" scripts were likewise not true "inventions" but, rather, natural stages in an evolutionary process that extended backwards through the Bohai script to some early northern variety of the Chinese script. [...] There is also the possibility that the Korean state of Koguryeo 高句麗 (-668), often regarded as the direct precursor of Bohai, was somehow involved. The influence of United Shilla 新羅 (668-918), a contemporary of Bohai, appears somewhat less likely, but cannot be completely ruled out. (pp. 114-115)

If Janhunen's hypothesis is correct, I might expect some peninsular features in the Jurchen script. Unfortunately, little is known about the languages of the peninsula prior to the invention of hangul in the 15th century, and very little is known about languages outside of Shilla. So it is dangerous to project Shilla features onto the rest of the peninsula, and a greater leap still to assume such features might have reached Parhae in the north.

Nonetheless let's suppose that one of those features - the *-r (> modern -l) that characterizes Sino-Korean (i.e., Sino-Shilla) - was present in the Chinese known in Parhae. It was certainly present in the Chinese of the capital in northwestern China, but whether the feature also existed in northeastern China and Parhae next door is open to question. Let's answer the question in the affirmative for now. If Sino-Parhae had *-r readings for Chinese characters - and its local characters - I would expect *-r local characters to appear in the Jurchen script as symbols for CVr(V) syllable (sequence)s.

In Sino-Korean, 失 <LOSE> is pronounced 실 shil < *sir. Jin (1984: 14) derived the Jurchen phonogram

~

<šir> (the earlier form is on the left)

from ... 失 <LOSE>. The -r of the Jurchen reading could reflect a Chinese or Sino-Parhae *-r.

How many other Jurchen CVr(V)-characters resemble Chinese characters with Sino-Korean -l (< *-r) readings?

I am not saying that Jurchen has Sino-Korean features. I use Sino-Korean as the only available proxy for Sino-Parhae: how Chinese characters might have been pronounced in Parhae to the north of Old Korean-speaking Shilla.

I am also not saying that all Jurchen CVr(V)-characters must be derived from Chinese characters with Sino-Korean -l (< *-r) readings.
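A first pass at that count would be to pull the CVr(V)-shaped readings out of a character list. The Python sketch below shows only that filtering step, under assumptions of my own: C is any single non-vowel symbol, V is one of a e i o u, and the sample consists of a handful of readings mentioned in these posts rather than a real inventory. Comparing the matches with Sino-Korean -l readings would still have to be done by hand.

```python
import re

# C = one non-vowel symbol, V = one of a e i o u; the final V is optional.
CVR_V = re.compile(r"^[^aeiou][aeiou]r[aeiou]?$")

# A few Jurchen readings cited in these posts (not a full inventory).
readings = ["\u0161ir", "mori", "uru", "hoto", "giya", "le"]

matches = [reading for reading in readings if CVR_V.match(reading)]
print(matches)
```

Vowel-initial uru is excluded here because it has no initial consonant; loosening the pattern to (C)Vr(V) would be a one-character change.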

In theory Parhae characters for native Koreanic *CVr(V) words could have been recycled for Jurchen CVr(V)-sequences.

Koreanic need not be the only source of CVr(V)-readings in Jurchen. If Vovin (2012) is correct and Jurchen was already written in Parhae two or more centuries before the establishment of the Jurchen Empire, Parhae characters (渤海字? 渤字?) could have functioned as semantograms for Jurchen words:

Chinese character X meaning Y : Parhae character X' for Jurchen word CVr(V) meaning Y'

Semantograms for Khitan CVr(V) words could have been reused for unrelated Jurchen CVr(V) words and syllable (sequence)s.

Going beyond Jurchen and Khitan, I have already proposed that the Jurchen character

<HORSE> mori(n)

is related to Chinese 保 <PROTECT> (Sino-Korean 보 po) which does not have a Sino-Korean reading ending in -r but which could have represented a para-Japonic (i.e., the peninsular sister of pelagic Japonic) morpheme (sequence?) mor(-i) 'protect(-INF)' (cf. the use of 保 for mori 'protecting' in Japanese names).

A very wild possibility is that some Jurchen CVr(V)-characters may be derived from Parhae characters representing Amuric (!) morphemes. Fortescue's 2016 Proto-Amuric ('Proto-Nivkh' at Wiktionary) reconstruction has *-r(V) and *-ʀ(V)-final roots.

To sum up my thoughts, I present a table of possible sources for Jurchen CVr(V)-readings:

source character	representing	recycled for	example
Chinese *-r characters	Sino-Parhae *CVr	Jurchen CVr(V)	<šir>?
Parhae characters	Jurchen CVr(V)?		?
	Khitan CVr(V)??		?
	Koreanic CVr(V)???		?
	para-Japonic CVrV????		<HORSE>?
	Amuric CVr(V)?????		?

My assumption here and elsewhere is that Jurchen character readings are not random in the way that Cherokee character readings appear to be random: e.g., Sequoyah assigned the reading a rather than dV to Ꭰ. (Cherokee <da de di do du dv> are Ꮣ Ꮥ Ꮧ Λ¹ Ꮪ Ꮫ.) I would like all Jurchen character readings to be derived either from Chinese or some non-Chinese language's approximate semantic equivalent of a Chinese morpheme (e.g., a para-Japonic *mor-i 'protect-INF' as a translation of Middle Chinese 保 *pa̰w 'protect').

Incredulity is no argument, I know, but I just can't bring myself to believe that 完顏希尹 Wanyan Xiyin did what Sequoyah did on a mass scale: take character shapes from existing scripts (the Chinese and Khitan large scripts) and assign hundreds of them to Jurchen morphemes and syllable (sequence)s at random. Sequoyah was illiterate until he invented his own script and may have never known English, but Xiyin

was fascinated by Chinese classics, and collected a large library when Jurchens seized and looted the capital of the Northern Song dynasty, Bianjing (present-day Kaifeng), in the Jin–Song Wars.

That happened either during the first siege of Bianjing in 1126 or the second in 1127 - years after c. 1119-1120, when Xiyin was said to have created the script. In theory Xiyin could have been illiterate until (or even after?) the siege and just liked the idea of having books he couldn't read, but I doubt that. The Jurchen had lived under literate rulers familiar with Chinese culture for centuries. Surely 阿骨打 Aguda, the founder of the empire, would have assigned a literate man to 'create' a script for his new state.

I put 'create' in quotes, since I think Xiyin standardized an existing script. Maybe standardized should also be in quotes, since the Jurchen script has a lot of variation. This variation may imply that the script has a lot of history behind it. (The Tangut script is most likely a true invention without a history, and it has far less variation.)

Some of that variation postdates Xiyin's time: e.g., the dots of

<DAY> inenggi and <MOON> biya

are not in the manuscript thought to be the earliest example of the Jurchen script. I am not counting the Parhae tiles that Vovin (2012) regarded as even earlier examples. If one counts those tiles, then that manuscript may be the earliest example of post-Xiyin written Jurchen.

Variation in the Jurchen script - and the Khitan scripts - is an issue deserving of much attention. Jin (1984) has already done some basic work by identifying which texts characters appear in. The next step is to create a visual chronology organizing characters by date: e.g.,

transliteration	1100s	1200s	c. 1500
<šir>
<DAY> inenggi
<MOON> biya
<COOKED> uru		?

Note how the earlier version of <šir> is closer to Jin's (1984) proposed Chinese source character 失 <LOSE>. The newer version of <šir> has a lookalike of the Khitan small script character 051 <qa>

atop a half-height 人, whereas the older version and 失 have a full-height 人 shape.

I included the Jurchen character <COOKED> because its later version looks exactly like 失 <LOSE>. The absence of a dot on the bottom of the late version could simply be a mistake in the Berlin copy of the Ming dynasty Bureau of Translators vocabulary. I have no idea why the shape of 失 <LOSE> - with or without a dot - was read as uru. Korean ilh- < ìrh- 'lose' and Old Japanese usinap- 'lose' only have one segment matching uru, and Proto-Amuric bək(ə)z- 'lose' doesn't match at all. Was there a Khitan root ur(u)- 'lose'? Might Khitan large script character 1511 <?>

have been read ur(u)-?

¹11.17.23:52: In 1834, Samuel Worcester inverted Sequoyah's Λ <do> to Ꮩ to differentiate it from Ꭺ <go>.

Tangut Yinchuan font copyright © Prof. 景永时 Jing Yongshi
Tangut character image fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2019 Amritavision