Janhunen (2012: 109) wrote that the Khitan large and Jurchen (large) scripts had "no functionally relevant 'radical' components".

Here's what he meant. This trio of Jurchen characters has different elements on the left resembling Chinese radicals:

cf. Chinese radical
cf. Chinese phonetic

干 <kan>
~ hūlha
火 <FIRE>

The non-干 elements of the Jurchen characters have no apparent semantic or even phonetic function:

The shared component 干 has no apparent semantic or phonetic function either.

Many Jurchen (large script) characters appear to be random combinations of Chinese elements (often with slight alterations) without any apparent logic linking their graphic structure to their Jurchen pronunciations or the meanings of the Jurchen morphemes that they represent.

Are Jurchen (large script) characters randomly constructed? I find that hard to believe given that the Jurchen elite was literate in Khitan and Chinese. I can imagine an illiterate script inventor coming up with unsystematic recyclings and alterations of existing shapes as a new script. (That is precisely how the Cherokee script was developed.) But whoever created the Jurchen script was the heir of traditions of literacy - traditions that may have gone back to Parhae if not earlier. (Janhunen [2012: 109] suggests a link between the Parhae script and the even earlier Serbi script that was totally lost.)

I propose that the structure of Jurchen characters tells us nothing about Jurchen itself because it reflects another language. Let's call that language X. Here's my xenogenetic scenario for the development of Jurchen characters.

The challenge, then, is to identify X, A, and B on the basis of the shape of the Jurchen character and its reading C. D is of no relevance.

Suppose that English were written in a Jurchen-like script based on the Japanese usage of Chinese characters.

In that scenario, Japanese is language X. The Chinese character 香 representing A = Middle Chinese *xɨaŋ meaning B = 'fragrance' was used to write an unrelated Japanese morpheme C = ka meaning B = 'fragrance'.

English speakers then used 香 to write syllables that sounded like C: e.g., C' = car. Any attempt to see an automobile in 香 would be doomed.¹

In short,

Can a similar formula be written with Jurchen un and hūlha as C'?

11.17.0:31: 香 <FRAGRANCE> looks like 禾 <GRAIN> atop 日 <SUN> but is actually an abbreviation of 𪏽: 黍 <MILLET> atop 甘 <SWEET> - the sweet smell of millet. Jurchen characters have relatively low stroke counts and could be abbreviations of more complex Chinese originals, so a Jurchen character that looks like Chinese <E.F> could be a reduction of <G.H> (just as 香 <GRAIN.SUN> is a reduction of 𪏽 <MILLET.SWEET>).

In Japanese, 香 was reduced to 𛀠 as a now-obsolete hiragana for ka.

THE ORIGIN OF THE JURCHEN CHARACTER FOR 'CLOUD'

The Jurchen character for tugi 'cloud'

looks like Japanese 広 <WIDE>, the postwar simplification of 廣, plus a dot. But last night I realized it may be related to Chinese 云 <CLOUD>:

What I don't know is whether Jurchen <CLOUD> is a consciously altered Chinese character or the product of an alternate line of gradual evolution from an early form of 云 <CLOUD>. The first possibility fits the standard view of the Jurchen large script as a 12th century alteration of the Khitan large script (and perhaps also Chinese characters). The second possibility fits Janhunen's (1994) hypothesis: the Jurchen script is a descendant of the Parhae script, an organic offshoot of the Chinese script.

I continue to favor the second possibility because of the question Janhunen (1994) raised for the Khitan large script: if the Khitan goal was to differentiate their script from Chinese, why did they retain some Chinese characters as is? The question also applies to the Jurchen script to a lesser extent. The Jurchen script shares far fewer characters with the Chinese script, but nonetheless a handful of key obvious lookalikes remain:

Altered forms of all four appear only in late Jurchen.

THE ORIGIN OF THE NAME NARA

1. When I first became interested in Korean in 1987, I instantly fell in love with the idea of what Leon Serafim would later call Koreo-Japonic: the idea that Korean and Japanese were related. (It would be at least a couple of years before I would learn of Altaic from Roy Andrew Miller's books.) Unfortunately at the time I was a high school sophomore and had no idea what linguistics was, much less how historical linguistics worked. So I uncritically accepted claims like the derivation of Nara from Korean 나라 nara 'country'.

Now I know that the earliest attested form of the Korean word is 15th century 나랗 nàráh < *narak 'country'. The hypothetical final *-k matches up nicely with the coda of the phonogram 樂 Middle Chinese *lak in these old spellings for Nara:

乃樂 ~ 那樂 ~  諾樂 ~ 寧樂

(Old Japanese Ca-syllables were not usually written with *-k phonograms.)

But sound matches alone do not an etymology make. The semantics also have to make sense. And naming a place in Japan 'Country' makes no sense to me. I don't know of any parallels anywhere else.

Wikipedia on the Korean hypothesis:

American linguist Christopher I. Beckwith infers the Korean narak derives from the late Middle Old Chinese 壌 (*nrak, earth), from early *narak, and has no connection with Goguryoic and Japanese na.

I have no idea how Beckwith reconstructs Middle Old Chinese *nrak. I know of no evidence for *-r- or *-k in 壌 (the Japanese postwar simplification of 壤). Here's how I reconstruct the word:

I don't know of anyone other than Beckwith who reconstructs the word with *-r- or *-k.

壤 is, incidentally, the yang of 平壤 Pyongyang 'Level Earth'.

More from Wikipedia:

There is the idea that Nara is akin to Tungusic na. In some Tungusic languages such as Orok (and likely Goguryeo language [let's not go into the issue of whether a 'Koguryo language' existed]), na means earth, land or the like. Some have speculated about a connection between these Tungusic words and Old Japanese nawi, an archaic and somewhat obscure word that appears in the verb phrases nawi furu and nawi yoru ('an earthquake occurs, to have an earthquake').

Two problems:

11.15.22:25: Can the Korean etymology be saved? The phonetic match appears to be perfect. So any salvage work has to be done on the semantic end:

A. If *narak meant 'country', maybe the Japanese name is only the second half of a longer name *X narak 'the land of X'. But there is no evidence for any longer name.

B. *narak did not mean 'country'. The Koreanic root attested with the meaning 'country' from Middle Korean onward could have had different meanings in earlier (and extinct) Koreanic languages: e.g., 'flat land' (cf. various Japanese nar- 'flat' words and the proposals to link them to Nara). Or *narak is not the root that became 'country' in Korean proper but some unrelated root - might Japanese 楢 nara 'oak' (one proposed source of the name Nara) be from a Koreanic *narak 'oak'? (But there is no attested Koreanic word for 'oak' like *narak - currently, oaks are 참나무 chham namu 'true trees', and no Korean names for types of oaks contain anything like *narak. Could chham namu be a later replacement for an older word *narak that only survives in Japanese? And might *narak have been a Koreanic translation of Japanese kashi 'oak'? Today there is a nearby city named 橿原 Kashihara 'oak plain'. Perhaps the whole region was once called Kashi. If *narak meant 'flat land', it could even be a translation of Old Japanese para [now hara] 'plain'. Wild idea: there were two unrelated Koreanic words *narak 'oak' and *narak 'plain', so *narak sufficed as a translation of the local name Kasipara 'oak plain'.)

In any case, I think the unusual spellings of Nara pointing to a final *-k make a Koreanic origin likely, though the underlying *narak may not have meant 'country'.

How did Nara get a Koreanic name? Two possibilities:

A. The name may date long before Nara became a capital - back to the period when 古墳 kofun were built there. I suspect kofun were a Japanese innovation triggered by the Koreanic influence that was certainly on the rise in those days, and Koreanic speakers brought it back home:

In recent years, South Korea has begun to allocate more resources toward archaeology, and keyhole tombs [i.e., kofun] have been found around the Yeongsan River basin, during the mid-Baekje [= 百濟 Paekche] Era. The keyhole tombs that have thus far been discovered on the Korean peninsula, were built between the 5th and the 6th centuries AD. [The earliest kofun in Japan date from the 3rd century AD.] There remains question over whether the tombs were made for Japanese aristocrats loyal to Baekje, Japanese merchants who controlled the region [Is there any evidence that any Japanese had such power in Paekche? Has anyone claimed the tombs are evidence for 任那 Mimana?], or a class independent from both Baekje and Yamato Japan.

Or how about Paekche aristocrats who had adopted a Japanese fashion?

B. Nara was named after a Japanese word by someone who may not have even been aware that the word was a borrowing from Koreanic. The trouble with this hypothesis is that it fails to explain the spellings of Nara pointing to an un-Japanese *-k.

I still doubt Japanese nawi 'earthquake' has anything to do with Tungusic na 'earth'. Just to illustrate the dangers of shared-monosyllable pseudoetymologies, I could claim that nation and nature are 'cognate' to Jurchen na 'earth'. But the initial n- of those two words goes back to an earlier Latin gn- without any parallel in Tungusic, which lacks initial consonant clusters.

2. The Jurchen word for 'frost' was transcribed in Ming Chinese as

塞馬吉 *sə ma ki ~ *saj ma ki (Bureau of Translators #9)

塞忙吉 *sə maŋ ki ~ *saj maŋ ki (Bureau of Interpreters #8)

(塞 has two possible readings: *sə and *saj.)

Looking at Kane's (1989: 136) reconstruction of the Jurchen word for 'frost' as semanggi got me thinking about a simple notation for vowel classes. The three vowels in that word could be symbolized as HLN:

Manchu vowel harmony permits H or L vowels to coexist with N but usually doesn't tolerate mixed H/L roots. Jurchen seems to have even more vowel harmony than Manchu, so I agree with Jin (1984: 193) that the Jurchen word for 'frost' was saimanggi (LLN; I treat ai as an L-glide sequence /aj/ and not as an LN vowel sequence).

(11.15.0:35: H/L terminology is also useful for my version of Chinese historical phonology: I claim that 壤 started out as *CInaŋ HL but harmonized to *CInɨaŋ HH. I regard the diphthong ɨa as the H counterpart of the L monophthong a.)

A similar kind of notation for languages with front/back harmony like Turkish would use the letters F, B, and N: e.g., the Arabic nonharmonic loanword kitap 'book' is FB.
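The class notation above can be sketched in a few lines of Python. This is only an illustration, not a full model of Jurchen or Turkish harmony: the only class assignments it uses are the ones implied by the examples in the text (e = H, a = L, i = N for Jurchen; i = F, a = B for Turkish). The function names, the "?" placeholder for vowels whose class the text does not give, and the treatment of ai as a single vowel-plus-glide unit are assumptions made for the sketch.

```python
# Letters treated as vowels in romanized forms (an assumption for this sketch).
VOWELS = set("aeiouū")

# Class assignments read off the examples in the text; other vowels are left unmapped.
JURCHEN_CLASSES = {"e": "H", "a": "L", "i": "N"}
TURKISH_CLASSES = {"i": "F", "a": "B"}


def vowel_pattern(word, classes=JURCHEN_CLASSES, glide_sequences=("ai",)):
    """Return the string of vowel-class letters for a romanized word."""
    pattern = []
    pos = 0
    while pos < len(word):
        if word[pos:pos + 2] in glide_sequences:
            # Vowel + glide (ai = /aj/) counts once, with the class of its vowel.
            pattern.append(classes.get(word[pos], "?"))
            pos += 2
            continue
        if word[pos] in VOWELS:
            # "?" marks a vowel whose class is not given in the text.
            pattern.append(classes.get(word[pos], "?"))
        pos += 1
    return "".join(pattern)


def mixes(pattern, opposed=("H", "L")):
    """True if both opposed classes occur in the same pattern."""
    return all(label in pattern for label in opposed)


print(vowel_pattern("semanggi"))                       # Kane's reconstruction
print(vowel_pattern("saimanggi"))                      # Jin's reconstruction
print(vowel_pattern("kitap", TURKISH_CLASSES, ()))     # Turkish loanword
```

With these assumptions, Kane's semanggi comes out as a mixed H/L pattern and Jin's saimanggi does not, which is the contrast the notation is meant to make visible at a glance.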

3. Tonight I saw tzuris in this movie review which made me realize that tsuris (the only spelling I had ever seen until now) doesn't seem to have a singular in English. But it does have a singular in Yiddish. I should have guessed Yiddish got it from Hebrew since it has no German cognate.

11.15.0:18: What I don't understand is how Yiddish tsores became tsuris in English. (Wiktionary reports o-variants in English.)

THE ORIGIN OF THE JURCHEN CHARACTER FOR 'PERSON'

How did I not notice the similarity between the Khitan large script character <ku> 'person' (left) and the Jurchen (large) script character <niyalma> 'person' until now (right)?

Khitan 仁 <ku> looks exactly like Chinese 仁 <HUMANE>. Why didn't the Khitan simply write ku with a lookalike of Chinese 人 <PERSON>?

人 <PERSON> and 仁 <HUMANE> have been homophonous since at least the early first millennium AD¹. Someone decided to use 仁 <HUMANE> to represent a non-Chinese word for 'person' because 仁 <HUMANE> is homophonous with 人 <PERSON> in Chinese. (And 仁 <HUMANE> also contains the left-hand variant 亻 of 人 <PERSON>.)

(11.14.1:26: Maybe it's remotely relevant that both 仁 and 人 have the kun [native] reading hito in Japanese. The most famous instance of 仁 hito is 裕仁 Hirohito. So famous it needs no explanatory link! Perhaps some non-Chinese language of Manchuria also had the same reading for both 仁 and 人.)

(11.14.1:41: 仁 can even mean 人  'person' in Chinese itself. See noun definitions 2 and 3 in the ROC's 教育部重編國語辭典修訂本.)

I've been deliberately vague about that "someone" who spoke a "non-Chinese" language because I do not know who chose 仁 for 'person' and when. Here are three possibilities:

A. That someone was Serbi, and 仁 originally stood for the Serbi word for 'person' - possibly a cognate of Khitan ku 'person'. But the Serbi script is unattested, so no one even knows if it was what Janhunen (1994: 441) would call 'Sinoform': i.e., Chinese-like in appearance, much less if it was ancestral to the Khitan large script (a possibility raised by Shimunek [2017: 211]).

B. That someone was from Parhae, and 仁 originally stood for the word for 'person' in some language of Parhae. That language could have been

Among the three entities formed by the Xianbei [= Serbi, including the ancestors of the Khitan], Fuyu and Yilou [= the ancestors of the Jurchen?], the Fuyu are the most obscure. If they were not Tungusic [like the Jurchen], they may have been Amuric. If they were not Amuric, they may have been another Palaeo-Asiatic entity, unconnected with the extant ethnic corpus of Manchuria. [See Janhunen 1996: 235 for more speculation about such entities in early Manchuria.]

The descendants of the Fuyu (Korean: 夫餘 Puyŏ) lived in Parhae.

C. That someone was Khitan.

C fits the standard account of the origin of the Khitan large script: that it was a Khitan 'invention' without any precedents beyond the standard Chinese script. B is my expansion of Janhunen's (1994) Parhae hypothesis, and A is built upon Shimunek's (2017) Serbi hypothesis.

Why does the Jurchen character for 'person' have an extra stroke added to 仁? Two possibilities:

A. The stroke might have been added in Parhae times to distinguish a semantogram for a non-Chinese word for 'person' from 仁 <HUMANE> which might have been used to write the Chinese morpheme 'humane' in some non-Chinese language. This type of strategy was productive in the Vietnamese nôm script: the optional extra stroke nháy 'blink' differentiates native Vietnamese 買 mới  'new' from Sino-Vietnamese 買 mãi 'to buy'. See Handel (2018: 151) for more examples of nháy. (I'm surprised nomfoundation.org doesn't have the nháy version of 買 mới.)

B. The stroke was added by a Jurchen in the 12th century to distinguish Jurchen niyalma from Khitan ku. There was no need to distinguish niyalma from 仁 which didn't exist in the 12th century (or later) Jurchen script. (But if the Jurchen were really interested in differentiating their large script from the Khitan large script [still in use in the Jurchen Empire], why do the two scripts still share characters: e.g., 一 <ONE> and 二 <TWO>?)

Lastly, here's a wild idea: Khitan 仁 <ku> may in fact be a distortion of a four-stroke variant of 人 <PERSON>: 人 plus two lines on the right². Grinstead (1972: 56-57) thinks that variant underlies the Tangut element 𘢌 <PERSON>. If so, then Khitan 仁 <ku> and Tangut 𘢌 <PERSON> are cognates. (The issue of the potential relationship between the Khitan large script and the Tangut script remains unexplored.)

¹It is unclear whether the homophony of 人 <PERSON> and 仁 <HUMANE> goes back any further than that, and it is also unclear whether the two words are related. Is their later similarity merely the result of the convergence of two unrelated etyma? See the discussion in Schuessler (2007: 440-441).

²Unfortunately this variant is not only absent from Unicode but also absent from the ROC variants dictionary (though that dictionary does have a similar variant with three lines on the right). It is attested as recently as a 1963 South Korean movie ad that I wrote about in September. I think I've also seen it in a Hong Kong comic book or movie poster from the 70s.

LARGE STONE EGO

神武 <GOD MARTIAL> Jinmu, the legendary first emperor of Japan best known by his disyllabic Sino-Japanese name, has a longer native Old Japanese name

Kamu Yamatə Ipare-m-biko-n-ə sumera-mi-kətə

God Yamato Iware-GEN-prince-be-ATTR august-HON-act

'The August Agent [i.e., Emperor] Prince of Divine Yamato Iware'




in Nihon shoki (Chronicles of Japan, 720). Seeing that name again today for the first time in a long time made me wonder why the -re of Ipare corresponds to 余 <I>. There is no Old Japanese word re 'I', and at no point in Chinese language history up to the 8th century does 余 ever sound like re:

*CIla > *CIlɨa > *lɨa > *jɨa > *jɨə > *jə > *jø

I wonder if the Old Japanese place name 磐余 Ipare has nothing to do with 磐 ipa 'large rock'. The spelling might reflect a folk etymology for a pre-Japanese indigenous toponym.

11.13.1:07: It just occurred to me that if one only knew Iware, the modern Japanese form of Ipare, one might guess that 磐余 is <iwa ware> with both characters simultaneously representing the wa of Iware.  The trouble, however, is that 'large rock' was ipa in Old Japanese, not iwa.

Old Japanese 'I' was ware more or less as in modern Japanese (there could have been subtle phonetic differences that can't be reconstructed). But I can't think of any other examples of a semantogram for Old Japanese XY (here, ware) also serving as a phonogram for Old Japanese Y (here, re).

THE ORIGIN OF THE JURCHEN CHARACTER FOR 'HORSE'

In 1994 I first encountered Eric Grinstead's (1972: 16) explanation for the Jurchen characters for 'horse':

The Ruzhen [= Jurchen] language is like Manchu, which is a Tungus language, but some words could have been borrowed from Mongol. To take a very common word, and one characteristic of Mongol culture, the word for 'horse', we find the Ruzhen word to be something like 'mu-lin' (Grube, no. 138). The Mongol word for 'horse' is 'morin', not greatly different. In Grube's vocabulary we find a binome (of two characters),


which we will operate on according to the rules of deliberate alteration [from Chinese]. Adding a stroke this time, one to each character, we get 保列, pronounced in modern Chinese 'baolie'. This is reasonably close for a guess.

What bothered me was why a derivative of 12th century northern Chinese 保 *pɔw was used to write Jurchen mo- with a nasal initial. There was no shortage of Chinese *mo-characters forcing a scribe to fall back on a *p-character.

Tonight I realized why. Follow the logic here:

11.12.23:40: Jin (1984: 215) proposed that the (first) Jurchen character for 'horse' could be from a Khitan large script character in line 11 of the 蕭孝忠 Xiao Xiaozhong inscription (but not in N4631!):

< <?>

Could the character on the right have been a Khitan phonogram <mor(i)>?

The closest characters I can find in N4631 are

1217 <?> 1220 <sam>

Is 1217 a different interpretation of the character that Jin saw in Xiao Xiaozhong? I haven't seen that inscription myself.

1220 is unrelated (and nobody ever said it was related); it is a phonetic transcription of Liao Chinese 三 *sam 'three'. (No one really knows how the Khitan large script character 三 0113 <THREE> was pronounced.)

Why 1220 is pronounced sam is unknown. Did it originate as a logogram for a non-Chinese word *sam - in Khitan, in some language of Parhae, or even in Serbi (if the Khitan large script is a [partial?] offshoot of the lost Serbi script; see Shimunek 2017: 211)? (The Khitan large script could have three strata: Serbi, Parhae, and Khitan-only innovations.)

KORNICKI'S "WHY ARE THERE SO MANY DIFFERENT SCRIPTS IN EAST ASIA?" (2018)

1. Peter Kornicki (author of a recent book I want to read) asks:

You don’t have to learn a new script when you learn Norwegian, Czech, or Portuguese, let alone French, so why does every East Asian language require you to learn a new script as well?

That is also true of most major South and mainland Southeast Asian languages. It would be fun to see Kornicki at a roundtable with experts on South and mainland Southeast Asian languages on an expanded version of his question.

Asia is the continent of scripts. Contrast with the Americas where the Latin alphabet has a near-total monopoly. (Two exceptions that come to mind are Cherokee and Canadian Aboriginal syllabics. I am unaware of any non-Latin scripts actively being used in Central and South America. Here's a map of the world color-coded by script types.)

2. Also by Kornicki: "How did a Japanese book come to be reprinted in Philadelphia in 1855?"

3. Before I found Kornicki's article, I was going to title this entry "Each Forehead" after this spelling of Nukata that I found in Osterkamp (2008: 222):

各田 <EACH FIELD> (normally 額田 <FOREHEAD FIELD>)

nuka 'forehead' has been abbreviated as 各 (normally read kaku) before 田 ta 'field'.

Wikipedia lists another unusual spelling of Nukata:


農 is normally read nō - rarely nu - but never nuka. So 農多 looks like it should be read Nuta, not Nukata. I can't think of any other 'underwritten' case like this.

The reduction of 額 to 各 makes me wonder if Khitan large script and Jurchen characters - and/or their Parhae prototypes - were similarly reduced from more complex Chinese characters (which would explain why Andrew West observed that Khitan large script and Jurchen characters have "only half the number of strokes as traditional CJK characters on average"). Such reduction would make the logic behind their readings very difficult to recover. If not for the full spelling 額田, I would have a hard time figuring out why 各 is read Nuka in 各田.

Osterkamp's article also deals with silent characters in place names: e.g.,

Speaking of repetition, here no is written twice:

野 is not a silent character; it is read no 'field' and redundantly represents the second syllable of the name. 角野 is reminiscent of Old Korean semantogram-phonogram spellings like

in which 音 represents -m. (夜音 is assumed to represent an Old Korean ancestor of later Korean pam 'night', but it might represent an unrelated, extinct -m word for 'night'.)

4. Looking at this 1605 copy of the 倭玉篇 Wagokuhen (c. 1489), I found a variant 晜 <SUN.YOUNGER BROTHER> of 昆 ani <SUN.COMPARE> 'older brother'. Why is 弟 <YOUNGER BROTHER> on the bottom?

(11.11.0:23: I suppose I could make up a story about an 晜 older brother being like a sun to a 弟 younger brother, but ... no.

Karlgren [1940: 231] cannot explain the structure of the standard graph 昆. 比 <COMPARE> is a drawing of two people. 昆 represents a variety of unrelated words pronounced *CA{q/k}u{n/r}: 'elder brother', 'descendants', 'afterwards', and 'numerous'.

Might the irregular aspiration in the [kʰ] of Mandarin 晜/昆 kun reflect a lost presyllable?)

Also found the erroneous reading seu for 昇 <RISE>, evidence for the merger of -eu and -you by 1605.

5. A CHAM CULTURAL SUBSTRATUM? Wish I could be at Nhung Tuyet Tran's Nov. 15 talk "Articulating Sinic Values at the Interstices of Empire: Literary Sinitic, Vernacular Vietnamese, and Neo-Confucianism in the Cham Heartland" (emphasis mine):

In 1718, in the coastal city of Quy Ninh, in what is now Vietnam’s south central coast, a group of students reprinted the "Guide for Young Learners by Category and Rhyme (指南幼學備品協韻)" in honour of their teacher. [...] More than a simple dictionary, I suggest that the bi-lingual glosses reflect the influence of Cham cultural patterns and habits in its articulation of orthodox Confucian values.

6. Alas, no abstract up yet for Sujung Kim's forthcoming talk "The Old Man and the Sea: Shinra Myōjin and Buddhist Networks of the East Asian ‘Mediterranean' " (2 March 2020). 新羅明神 Shinra Myōjin 'Shilla bright deity' is the guardian of 三井寺 Mii-dera a.k.a. 園城寺 Onjōji. See Shinra Myōjin save 円珍 Enchin (814-891) here.

7. I just realized that the normal reading Shiragi for 新羅 is the opposite of 和泉 Izumi from topic 3. 和泉 is 'overwritten' with a first character that has no phonetic value, whereas Shiragi is 'underwritten' as 新羅 <shin ra> without a character 城 <FORTRESS> for -gi 'fortress'. For some reason, the Japanese called Shilla Sira (now Shiragi) 'Shilla-fortress' rather than simply 'Shilla', though they adopted the spelling of the Shilla autonym 新羅.

8. While Shilla unified most of the Korean peninsula, 渤海 Parhae ruled its northern part and much of Manchuria. I wonder what this was about ...

This term [名神 myōjin 'famous deities'] is first attested in the Shoku Nihongi [Continued Chronicles of Japan], where offerings from the kingdom of Bohai (Balhae [= Parhae]) are stated to have been offered to "the eminent shrines (名神社 myōjin-sha) in each province [of Japan]" in the year 730 (Tenpyō 2).

9. Did 小高句麗 Little Koguryo exist to the southwest of Parhae? 日野開三郎 Hino Kaizaburō first proposed it in his PhD dissertation 小高句麗国の研究 Studies of the Country of Little Koguryo (1958). Oddly Little Koguryo has no Japanese Wikipedia entry.

