<so nggiyan uliya aniya juwa emu biya orin ninggu inenggi>

'yellow pig year, ten one month, twenty six day'

1. orin ninggu 'twenty six' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin jirghughan 'twenty six' containing an unrelated Mongolian word for 'six'.

Grinstead (1972: 16) noted that

ninggu 'six'

is an inverted Chinese 六 <SIX>. It is not like any of the variants of Khitan large script <SIX>:

Is the Jurchen graph a 12th century invention, or is it derived from a version of the Parhae <SIX> that the Khitan did not adopt for their large script?

The reading of Khitan <SIX> is unknown, but it might be something like Proto-Mongolic *jir-gu-xan 'two-three-NUMERAL' as reconstructed by Janhunen (2003: 17). Jishi read <SIX> as ʧirkɔ: i.e., as 'two-three'. But if Janhunen is right about *jir-gu-xan being an innovation, Khitan might retain an older Proto-Serbi-Mongolic root for 'six'.

The Khitan small script block

<085.033.288> <SIX.is.bun> (Epitaph for Empress 仁懿 Renyi, d. 1076)

might indicate that <SIX> ended in -i, given how the initial vowel of one block (here, the i of <is>) is often (but not always) the final vowel of the previous block (here, <SIX>).

2. What is the etymology of Hawaiian luakini 'large heiau [Hawaiian temple; < hei 'sacrifice' + ?] where ruling chiefs prayed and human sacrifices were offered'? It looks like a compound of lua plus kini, but I can't find any lua or kini that would transparently add up to 'sacrificial temple'.

3. Wikipedia on the Dzungar genocide:

[Qing emperor] Qianlong issued his orders multiple times as some of his officers were reluctant to carry them out. Some were punished for sparing Dzungars and allowing them to flee, such as Agui and Hadada, while others who participated in the slaughter were rewarded like Tangkelu and Zhaohui (Jaohui).

If Tangkelu is a Manchu name, it violates vowel harmony. I would expect Tangkalu or Tengkelu.
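That expectation can be sketched as a quick check. This is only a rough model, assuming the textbook simplification that a, o, ū form one harmonic class, e forms the other, and i and u are neutral; the function names are my own, not from any library.

```python
# Simplified Manchu vowel classes (an assumption for illustration):
# a, o, ū = one class; e = the other; i and u are treated as neutral.
A_CLASS = set("aoū")
E_CLASS = set("e")


def harmony_classes(word: str) -> set:
    """Return the set of harmonic classes found among a word's vowels."""
    letters = word.lower()
    classes = set()
    if any(ch in A_CLASS for ch in letters):
        classes.add("a-class")
    if any(ch in E_CLASS for ch in letters):
        classes.add("e-class")
    return classes


def is_harmonic(word: str) -> bool:
    """A word is harmonic here if it does not mix a-class and e-class vowels."""
    return len(harmony_classes(word)) <= 1


for name in ["Tangkelu", "Tangkalu", "Tengkelu"]:
    print(name, "harmonic" if is_harmonic(name) else "violates harmony")
```

Under these assumptions Tangkelu is flagged because it mixes a and e, while Tangkalu and Tengkelu pass.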

4. I wish I could look for Tangkelu in Giovanni Stary's A Dictionary of Manchu Names (2000). The book's National Library of Australia listing says it's in "Mandingo" (sic). No.

5. In actual Mandingo, "/g/ and /p/ are found in French loans." The language has /k c j t d b/, though. Are /h/ and /p/ in part or in whole from earlier *g and *p?

6. The IPA transcription of the Kazakhstani national anthem is so different from what one might think Kazakh sounds like solely on the basis of the Cyrillic or Latin alphabet: e.g.,

[jɪrlɪkˈtɪŋ dɑstɑˈnə]

Ерліктің дастаны

<Erliktiņ dastany>

Erlik-tiń dastan-y

'courage-GEN epic-3.POSS.NOM' = 'epic of courage'

One might expect the pronunciation to be something like [erliktiŋ dastanɨ] on the basis of Cyrillic and Latin alone. And if one guessed that Cyrillic і was [i], what would one guess и is? (It's [ɪj] ~ [əj] according to this chart.)

The use of ы/y for [ə] reminds me of my own choice to use y for the Tangut neutral vowel which may have been [ə] or [ə]-like in one or more grades.

The 3rd person singular possessive suffix -ы/y is missing from this table. See Mukhamedova (2016: 81) on the Kazakh X-GEN Y-POSS 'Y of X' construction.

7. Why does Glosbe align Kazakh дастан 'epic' with Dennis in translations?

8. Until now I assumed that Turkic beg was a loanword from the Middle Chinese title 伯 *pæk. That is the etymology in Clauson (1972: 322). But Wiktionary has a second etymology:

the Middle Persian title bag (also baγ or βaγ, Old Iranian baga; cf. Sanskrit भग / bhaga) meaning "lord" and "master". Peter Golden derives the word via Sogdian bġy from the same Iranian root. All Middle Iranian languages retain forms derived from baga- in the sense "god": Middle Persian bay (plur. bayān, baʾān), Parthian baγ, Bactrian bago, Sogdian βγ-, and were used as honorific titles of kings and other men of high rank in the meaning of "lord".

The problem I have with this etymology is: why was a in some Iranian language borrowed as Turkic e?

If /a/ in the Iranian source language was [æ], how can Slavic bog 'god' be a loan from Iranian? Was the Slavic word borrowed from a different Iranian source language in which /a/ was back and labial: [ɒ] or [ɔ]?

As for the Chinese etymology, the mismatch of initials (Chinese *p- vs. Turkic b-) is not a problem if the borrowing was in an early Turkic variety without p-. (Pre-Proto-Turkic *p- became Proto-Turkic *h- which was preserved in Khaladj and was lost elsewhere.)

The -g of beg might be a Turkic approximation of a Chinese (allophonic?) [ɣ]-like pronunciation of *-k. Although Old Turkic did have gh, gh could not coexist with e, but g could. And at some point, Middle Chinese *æ raised to *ɛ. Late Middle Chinese *pɛɣ was transcribed in the Tibetan version of the 千字文 Thousand Character Classic (c. 9th-10th c.?) as <peg.>, which is close to Turkic beg. (However, the Turkic word is first attested in the 8th century, possibly when 伯 was closer to *pæk than *pɛɣ in western Middle Chinese.)

9. If I understand this correctly, Haddow is a Germanic/Celtic (Scots + Scots Gaelic) hybrid. Are there more common names like it?

10. Aacistak has been called "the Language Capital of the World". What is its more common name?

YELLOW PIG 11/25

<so nggiyan uliya aniya juwa emu biya orin shunja inenggi>

'yellow pig year, ten one month, twenty five day'

1. orin shunja 'twenty five' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin tabun 'twenty five' containing an unrelated Mongolian word for 'five'.

The initial of 'five' in Manchu is s-, not sh-. Neither Jurchen sh- nor Manchu s- matches the t- in the rest of Tungusic.

2. Last night I thought of a Chinese character for the first time in many years: 閼. It has the same phonetic as a character that I first encountered last week: 菸.

That phonetic is a drawing of a crow: 於/烏. 烏 still represents the word for crow, but its variant 於 has come to represent a nearly homophonous locative preposition.

Normally 於/烏-graphs represent open syllables in modern languages: e.g.,

So in Cantonese, I would expect 閼 and 菸 to end either in -u [u] or -yu [y]. But they don't:

The vowels are less of an issue (see the appendix) than the codas:

In other words, 於/烏 should represent *-a(ʔ)(s) syllables but not *-t syllables or *-n syllables. Should. But clearly 於 is a phonetic in

I have not found any evidence for 菸 being read with -n before the last millennium. At some point 菸 came to represent a word 'tobacco' < 煙/烟 Old Chinese *CAʔin 'smoke' normally written with -n phonetics (垔 and 因). The top component of 菸 'tobacco' is <GRASS> which makes sense. But the bottom component 於 is a poor phonetic (and 於 is unlikely to be an abbreviation of the uncommon character 閼 which also has non-n readings). Was 菸 'smelly grass' chosen to write an unrelated and phonetically different but semantically similar word 'tobacco'?

I found 菸 via Wiktionary's entry on yen. I forgot that yen could also refer to having a desire for something.

12.22.19:22: APPENDIX: Some *-a rhymes from Old Chinese to Cantonese:

*Voiceless initials condition Cantonese tone 1 unless there are other conditioning factors:

At some point after tonogenesis, *ʔ- was lost, and zero initials became homorganic glides before high vowels:

Contrast with *ʔa > nonhigh [a] without a glide in Cantonese 閼 aat3 [aːt˧].

YELLOW PIG 11/24

<so nggiyan uliya aniya juwa emu biya orin duin inenggi>

'yellow pig year, ten one month, twenty four day'

1. orin duin 'twenty four' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin dörben 'twenty four' containing an unrelated Mongolian word for 'four'. -ben is the 'feminine'¹ vowel variant of the -ban found in ghurban 'three', and both ghurban and dörben have a shared suffix -r- (Janhunen 2003: 47).

Rozycki (1983: 7, 93) regards Jurchen/Manchu duin and Written Mongolian dörben as a "[p]re-loan correspondence": "words with a phonology consistent with native Tungus stock and for which there is no evidence of loaning". I regard the vague similarity of duin and Proto-Mongolic *dö- 'four' (as reconstructed by Janhunen 2003: 47) as coincidental.

¹I use the term 'feminine' to avoid committing to a front or higher vowel interpretation of e.

2. Yesterday I forgot how to pronounce 6ix9ine which looks like it was written in the Arabic chat alphabet (in which 6 is ط <ṭ> and 9 is ص <ṣ> or ق <q>). But it's actually a stylized spelling of six nine mixing logograms with letters. The Jurchen (large) script, Korean hyangchal, and Japanese script frequently have logogram-phonogram sequences for words. Perhaps the Khitan large script did too, but it's too poorly understood for me to be certain.

How did Tekashi 6ix9ine come up with the stage name Tekashi? Is it based on Japanese Takashi?

3. I knew Ў wasn't unique to Belarusian (in which it represents /w/), but I forgot which other language was written with Ў: Uzbek. Ў has since been replaced with Oʻ. Ў/Oʻ represents mid /o/, whereas О/O represents low /ɒ/ and /o/ in Russian loans. Did Uzbeks perceive Russian /o/ [o] ~ [ɔ]² as being lower than their /o/ and closer to their /ɒ/? Does native /o/ have a high allophone [ʊ]? That would explain why it was written as Ў: i.e., as У <U> with a breve rather than as О <O> plus a diacritic.

²For some reason, Wikipedia IPA has [ɛ] for Russian /e/ and [o] (not [ɔ]) for Russian /o/ even though this diagram shows the two vowels at almost identical heights with [o] lower than [ɛ] rather than the other way around.

4. Cyrillic Ӯ (Ұ after 1957; see here for other uses of Ӯ) for Kazakh /ʊ/ reminds me of Möllendorff's Ū for Manchu /ʊ/.

The 'feminine' counterpart of Manchu /ʊ/ is /u/, but Kazakh has no /u/. It has an interesting three-way categorization of vowels: -RTR, 0RTR (neutral), and +RTR. The [-RTR] and [0RTR] counterparts of [+RTR] /ʊ/ are /ɪ/ and /ʉ/. (Kazakh has no /i/ either. If the IPA symbols are taken at face value, apparently the only high vowel is central /ʉ/; /ɪ/ and /ʊ/ are slightly lower.)

Is Kazakh /œ/ backed if not central? It is a [0RTR] vowel like /ʉ əj ə/ despite being written with a front vowel symbol like the [-RTR] vowels /ɪ jɪ e æ/.

5. I wish I had a key to the 1964-1984 Kazakh Latin alphabet used in China (and in this 1977 edition of Mao's Selected Works).

6. Last night I found Handel (2006) while trying to find where I had first encountered the idea that Korean 바람 param < Middle Korean pʌ̀rʌ̀m 'wind' was a borrowing from Old Chinese. I thought I had read it in Pulleyblank (1962), but I couldn't find it there. This 2013 post reminded me I got it from William Boltz. My apologies to Professor Boltz.

Handel discusses 'wind' on page 1015. In footnote 8, he mentions an internal etymology relating Korean 'wind' to pul- < Middle Korean pǔr- 'to blow'. Although the semantic match is perfect, the phonetic match leaves much to be desired. First, I know of no other cases of a CʌC-noun from a CuC-verb. Second, Middle Korean pǔr- is a class 5 stem in Ramsey's (1986) typology; it is a disyllabic stem /pùúr/, and if I understand Ramsey (1978: 221) correctly, it goes back to *pùrɯ́- with high series vowels and a high-low pitch pattern unlike the low-pitched low series vowels of pʌ̀rʌ̀m.

7. This part of the Wikipedia article on the Common Turkic Alphabet puzzles me:

Some handwritten letters have variant forms. For example: Čč=Jj, Ķķ=, and Ḩḩ=.

But Lithuanian Karaim, the only Turkic Latin alphabet that I know of with Č, distinguishes Č (for [tʃ]) from J (for [j]). And I find it hard to believe that two letters with such different shapes could be variants only in Turkic usage.

Of course in general Latin letter usage there are some surprising variants. Would an alien guess that B and b are the same letter? Uzbek used to have в instead of b in the 1928-40 Yaꞑalif alphabet. (I am not italicizing в since I'm not sure if the old Uzbek italic в looked like Russian italic в.)

Turns out that "[t]he small letter B is ʙ (to prevent confusion with Ь ь)". Although Ь represented palatalization in Russian, in Yaꞑalif, it seems to have stood for Soviet Turkic vowels similar to Turkish ı: e.g., Tatar [ɤ]. Uzbek had no such vowel:

[æ] [ɒ]

Nonetheless I guess ʙ remained the lowercase version of B in Uzbek for consistency with the other variants of Yaꞑalif. You can see Uzbek ʙ here.

8. I've never looked at Karakalpak before today. I confess I forgot it even existed.

It has a nearly symmetrical vowel system with palatal vowel harmony. Only e has no nonpalatal counterpart.


It also has labial harmony. If the first vowel is nonlabial, then the second vowel cannot be labial. However, if the first vowel is labial, then the second vowel may or may not be labial. In any case, vowels must match in palatality.
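The two constraints just described can be stated as a small predicate. This is only a sketch of the rule as worded above, using abstract feature values rather than the actual Karakalpak vowel inventory; the Vowel type and function name are made up for illustration.

```python
from typing import NamedTuple


class Vowel(NamedTuple):
    """Abstract vowel described only by the two harmony features."""
    labial: bool
    palatal: bool


def allowed_sequence(first: Vowel, second: Vowel) -> bool:
    """Check a two-vowel sequence against the harmony rules described above."""
    # Vowels must always match in palatality.
    if first.palatal != second.palatal:
        return False
    # A nonlabial first vowel cannot be followed by a labial vowel.
    if not first.labial and second.labial:
        return False
    # A labial first vowel may be followed by either kind.
    return True


# Print the verdict for every combination of feature values.
for f in (Vowel(l, p) for l in (False, True) for p in (False, True)):
    for s in (Vowel(l, p) for l in (False, True) for p in (False, True)):
        print(f, s, allowed_sequence(f, s))
```

Of the sixteen feature combinations, the predicate accepts six: the labial-first pairs are free in labiality, the nonlabial-first pairs are not.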

How was Karakalpak /h/ written in Cyrillic? I can't find a Cyrillic letter for it.

9. Wikipedia says that

The [irregular] /otoosan/ form [for Japanese 'father'] first appears in the early Meiji period in educational materials mandated by the 文部省 (Monbushō, "Ministry of Education").

Did /otoosan/ replace earlier /otossan/ by analogy with the long vowel of /okaasan/ 'mother'?

/okaasan/ is itself irregular; it is from /okakasan/ with irregular intervocalic /k/-loss.

Wikipedia lists Taiwanese borrowings of both words: 多桑 <MANY MULBERRY> tò-sàng and 卡桑 <kha MULBERRY> khà-sàng. Both reflect shorter Japanese forms without the honorific prefix o-.

19.12.18.xx:xx: YELLOW PIG 11/23

<so nggiyan uliya aniya juwa emu biya orin ilan inenggi>

'yellow pig year, ten one month, twenty three day'

1. orin ilan 'twenty three' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin ghurban 'twenty three' containing an unrelated Mongolian word for 'three'.

2. Yesterday I learned that Eom Ik-sang still believes a number of Korean words conventionally regarded as native are actually borrowings from Old Chinese. Even if I assume the Old Chinese forms he cites are correct, there are still issues.

Perhaps the most convincing of his proposals is

Old Chinese 風 *pljəm (Li), *plums (Zhengzhang) 'wind' : Korean 바람 param 'id.'

I would prefer to cite Middle Korean pʌ̀rʌ̀m 'wind' which is even closer to the Old Chinese reconstructions that he cites.

Although I expressed some doubts about a liquid in the Old Chinese word for 'wind' in 2013, I would favor reconstructing that word as *prəm with *-r- now.

That aside, there is one other potential problem with the comparison: I don't think anyone's Old Chinese reconstruction for 'wind' ever had the vowel *ʌ. If the Old Chinese word for 'wind' had *ə, why was it borrowed into early Korean as something like pʌ̀rʌ̀m when Korean also had the vowel ə? In other words, why isn't the Korean word for 'wind' pərəm with ə?

12.19.22:33: Was Edkins (1890: 95) the first to derive Korean param from Old Chinese 風?

param, wind; from [an unspecified - presumably Chinese -] pam. The old Chinese for wind is bam, which has changed to [Mandarin] feng.

Edkins was writing decades before Karlgren reconstructed Old Chinese. I know almost nothing about pre-Karlgren Chinese reconstructions, so I wonder what the reasoning behind pam and bam is. *pam is not a bad guess, since even in the 19th century, it was known that f- was from *p- and that 'wind' rhymed with 南 'south' (Mandarin nán and Cantonese naam4). However, *b- is a surprise, as 'wind' does not have a tone pointing to an earlier *voiced initial.

3. I've never seen anything like this use of the reflexive in Romagnol:

mè a sò 'I am' (cf. Italian [io] sono 'id.')

The reflexive seems less exotic in this case:

mè a j'ò 'I have' (cf. Italian [io] ho 'id.')

And the English and Italian translations of this last instance also have a reflexive:

mè a'm so lavê 'I washed myself' (cf. Italian [io] mi sono lavato 'id.')

4. Wikipedia:

Romagnol has an inventory of up to 20 contrastive vowels in stressed position, in comparison to Italian's 7.

Unfortunately Wikipedia doesn't list all 20 vowel phonemes. How did the 10 native vowels of Latin become 20 in Romagnol? Are some of the Romagnol vowels from Latin diphthongs?

The most interesting Romagnol vowels are these diphthongs which are unlike anything in Latin:

I assume they are phonemes, though Wikipedia represents them with phonetic brackets. /Və̯/ : /Vɐ̯/ is a fine contrast I've never seen before.

5. How did Neapolitan develop this alternation?

Did an earlier *o break to [wo] before the masculine ending *-o merged with the feminine ending *-a?

6. While I'm in languages of Italy mode, it just occurred to me that the gorgia toscana is a bit like Jurchen/Manchu in which *p > f (albeit in all environments, not just intervocalically) and *-k- > -h- (see Vovin 1997 for details).

7. I saw a commercial for the IUDs Mirena [məɹiːnə] and Kyleena [kʰajliːnə]. Those names sound like 'creative' Anglospheric girls' names. The commercial was aimed at young women. Somebody wanted the audience to think of IUDs as if they were daughters. The children that the IUDs are supposed to prevent. Creepy marketing.

YELLOW PIG 11/22

<so nggiyan uliya aniya juwa emu biya orin juwe inenggi>

'yellow pig year, ten one month, twenty two day'

1. orin juwe 'twenty two' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin qoyar 'twenty two' containing an unrelated Mongolian word for 'two'. Jurchen juwe 'two' is not to be confused with Jurchen juwa 'ten'.
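The date lines in these entries build their numbers additively (juwa emu 'ten one' = 11, orin juwe 'twenty two' = 22). Here is a small sketch that reads the month and day out of such a line, using only the numerals glossed in these entries; the function name and the dictionary are my own scaffolding, and the parser ignores the year words entirely.

```python
# Numerals as glossed in the date lines of these entries.
NUMERALS = {
    "emu": 1, "juwe": 2, "ilan": 3, "duin": 4,
    "shunja": 5, "ninggu": 6, "juwa": 10, "orin": 20,
}


def parse_date(line: str) -> dict:
    """Sum the numerals before biya 'month' and inenggi 'day'."""
    total = 0
    result = {}
    for token in line.strip("<>").split():
        if token in NUMERALS:
            total += NUMERALS[token]      # numbers are built by addition
        elif token == "biya":
            result["month"] = total
            total = 0
        elif token == "inenggi":
            result["day"] = total
            total = 0
        # Any other word (the year name, aniya 'year') is skipped.
    return result


print(parse_date("<so nggiyan uliya aniya juwa emu biya orin juwe inenggi>"))
```

Keeping juwe (2) and juwa (10) as separate dictionary keys is exactly the distinction that is easy to blur by eye.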

2. Last night when trying to figure out the Chinese character spellings for damofo and yumofo, I typed <fo> into Windows 10's Pinyin IME and was surprised to see 仸 <PERSON.夭>. 夭 ǎo/yāo/yǎo is normally not phonetic in b/p/f-graphs:

I would have guessed that 仸 was read as something like yao. Then I learned that 仸 is a variant of 佛 'Buddha'. 仸 seems to be a semantic compound with 天 <HEAVEN> slightly altered to 夭. (天 and 夭 are difficult to distinguish in a sans serif font, but in handwriting, the top stroke of 天 is written from left to right, whereas the top stroke of 夭 is written from right to left.)

3. Two elephantine surprises last night: Wiktionary notes a subtle difference between 象 <ELEPHANT> in the PRC standard and nom on the one hand and elsewhere in the Sinosphere on the other. Both versions of 象 have the same codepoint.

I am not sure that the PRC and nom really have a distinct version of 象:

4. 象 was also formerly a simplification of 像. The Wiktionary entry for 象 says it was a 1964-1986 simplification of 像. Wikipedia mentions two other characters restored in 1986: 覆 and 叠. I am skeptical:

5. When trying to type 复 in Microsoft's Bopomofo IME, I found 䲁 <FISH.wèi> wèi 'a snake-like fish' as the 64th and last choice for fù. How did 䲁 get in the list? Graphic confusion with 鮒 <FISH.fù> 'a kind of fish' which is also in the list?

6. Unidentifiable Khitan small script characters I encountered while copying the 契丹小字研究 Qidan xiaozi yanjiu (Research on the Khitan Small Script) hand copy of the epitaph for Emperor 興宗 Xingzong (1015-1054) of the Khitan Empire:

⿱⺌月 (but with a dot instead of two horizontal lines in 月; 2.21.1)

a lookalike of Chinese 七 <SEVEN> (2.24.1)

I assume they must be in the book's indices under more conventional forms - but what are those forms?

Ah, the first was a variant of 298 <co> with a narrower bottom half and a curved lower stroke:

The very block with 298 from Xingzong was even discussed in Kane (2009: 71). Duh.

The Qidan xiaozi yanjiu hand copy also has some slight variations of characters I do recognize: e.g.,

243 <HEAVEN> and 240 <TEN>

are written with 𠂉 on top instead of ハ. As a result, 243 <HEAVEN> looks like 矢 204 whose phonetic value is unknown. Could 矢 204 be interpreted as 'heaven'?

I still have no idea what 七 is. Not only is it an unusual (for Khitan) shape, but it is also the only top element in a pyramid.

7. The Cantonese-only character 乸 <jaa2.MOTHER> for naa2 'female' has an unusual phonetic 也 jaa5. The rhyme is perfect; the initial is not. 乸 has puzzled me since I first saw it some time ago, but today I just realized that a j-phonetic 也 might have been chosen because there are phonetics representing both j- and n-syllables: e.g., 襄 soeng1 (with s-!) < *sInaŋ in

That j- ~ n- alternation goes back to a single Old Chinese *n- that developed two reflexes: *n- before nonhigh vowels and palatal *ɲ- before high vowels.

也 had Old Chinese *l-, another source of Cantonese j-. *l-characters normally aren't phonetics in Cantonese n-characters.

Cantonese speakers would not know which j- are from *n- and which j- are from *l-, so whoever came up with 乸 might have thought, 'if 襄 can stand for j- and n-syllables, 也 can too', unaware that 也 jaa5 isn't from *n- (and hence 'shouldn't' represent Cantonese n-syllables).

8. I missed Andrew West's tweet on a cursive Tangut tablet from the Baisigou pagoda.

9. Marijn van Putten on the mystery of Mehmet.

YELLOW PIG 11/21

<so nggiyan uliya aniya juwa emu biya orin emu inenggi>

'yellow pig year, ten one month, twenty one day'

1. orin emu 'twenty one' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin nigen 'twenty one' containing an unrelated Mongolian word for 'one'.

2. I wish I could look more into exceptions to 'Altaic' vowel harmony. Two examples that have long stuck in my mind:

More recently I came across Manchu age 'older brother' (not ege or aga!; see Hauer and Corff [2007]: 7). Rozycki (1983: 22) regards age as somehow related to Written Mongolian aq-a¹ 'id.': "The correspondence is ancient and direction of loan impossible to ascertain." Could this be an anne-like case of intimate deformation?

I couldn't find age or other similar Manchu words like ahūn 'older brother' in Doerfer's Mongolo-Tungusica (1985), so I suppose Doerfer does not think there is any connection between the Manchu and Mongolian words.

What finally pushed me to write about Manchu age was seeing Manchu ajige 'small, little, young' (not ejige or ajiga) on Saturday night. Its root is aji-, also found in ajida 'small' and ajigan 'young, small' which are harmonic. majige 'little' is similarly nonharmonic with similar semantics. Are these cases of cute deformation? Imitating the speech of small children who have not yet mastered vowel harmony? I can't quickly find any article on L1 Turkish vowel harmony acquisition (DuckDuckGo results are often unsatisfying), but Leiwo, Kulju, and Aoyama (2006?) cover Finnish vowel harmony:

The data showed that most of Finnish 2;6-year-olds’ productions do not violate FVH [Finnish vowel harmony], suggesting early mastery of FVH. When there were errors in children's productions, they were mostly substitutions of back vowels for the front rounded vowels.

... which is the opposite of the substitution that occurred in Turkish anne! (Or centuries ago in barmis.)

Unlike Finnish or Turkish, Manchu does not have palatal harmony. Manchu age, etc. have a high series vowel e [ə] in place of its low series counterpart a. But if I 'translate' the Finnish error pattern into Manchu, I would expect substitutions of low series vowels for high series vowels. Which is the opposite of what happened in age, etc.

There is, however, a common denominator: Finnish vowel harmony errors occurred "especially in non-initial syllables and in suffixes" (Leiwo, Kulju, and Aoyama 2006: 151), and the Turkish and Manchu violations above are also in noninitial position: -mis, anne, age.

Incidentally, Aoyama Katsura is a former classmate of mine.

¹The hyphen is a device to transliterate the obligatory space in the Written Mongolian spelling <aq a>; it has no morphological or phonological significance.

3. Looking at Tangut


4440 2len4 'pavilion' (#189 in The Golden Guide)

led me to wonder: Why did Middle English pavilloun become modern English pavilion? Was -i- restored by someone who knew its Latin source pāpiliō 'butterfly'?

4. Today I started copying the epitaph for Emperor 興宗 Xingzong (1015-1054) of the Khitan Empire. I haven't gotten to line 4 yet, but I looked ahead and spotted block 24

<096.339.140> <?.i.en>

of line 17.24.

The only other instances of 096 that I know of are in the block

<096.339> <?.i>

in the epitaphs for Mme. 耶律 Yelü (11.20) and 蕭敵魯 Xiao Dilu (1061-1114; 30.19 and 34.14).


is similar in shape to 095, a lookalike of Chinese 女 <WOMAN>. 095 is more common than 096 and can occur in medial and final positions in blocks. These different distributive patterns suggest that 096 represents a more complex phonetic sequence than 095 - one that so far is only known from the beginnings of words. On the other hand, whatever 095 represents may be more complex than, say, 339 which is simply [i]?

Both 095 and 096 probably represent one or more syllables absent from Liao Chinese, as neither appears in Khitan transcriptions of Chinese. They may contain

I doubt that 095 or 096 represent single segments. I suspect that all the single-segment phonograms of the Khitan small script have been found by now.

As far as I know, as of 2016 there were 482 known small script characters including variants. Have any new ones been found lately? The only new small script texts found lately to the best of my knowledge are fragments of jade tablets from a mausoleum. If this photograph is representative, the texts are too short to be likely to contain any character that hasn't surfaced in any previously known, much longer texts.

5. Today I finally got Jun Jiang's Learn Manchu Handwriting on my iPhone. As neat as it is to see a finger trace strokes on a screen, I wish I could double-check the direction and order of strokes with another source. And I'm not yet accustomed to the wheel interface.

6. Today I also got Jun Jiang's Mongolian Words & Writing app, but I haven't tried it out yet. Users hoping to learn Mongolian Cyrillic will be disappointed since the app only covers the traditional script. I'd like to know how to write Ө <Ö> and Ү <Ü> in cursive. (The rest of the alphabet is identical to Russian, and I've been writing Russian in cursive since 1997.)

7. Jun Jiang's store doesn't have any app for Mongolian Cyrillic, but it does have these apps:

I assume those apps have the same interface as the Manchu app.

So much for my original guess that Jun Jiang might be a Manchu and Mongol specialist.

8. Wikipedia's sample of the traditional Mongolian script is (turn 90 degrees clockwise for the proper orientation - alas, that way the first line is on the right instead of the left where it should be):

ᠴᠣᠷᠢ ᠢᠢᠨ ᠭᠠᠭᠴᠠ

cori yin ghaghca

'single GEN single': i.e., 'the one and only'

ᠪᠣᠰᠤᠭ᠎ᠠ ᠪᠢᠴᠢᠭ᠌᠄

bosugh-a bicig:

'vertical script:'

ᠮᠣᠩᠭᠣᠯ ᠪᠢᠴᠢᠭ᠌

mongghol bicig

'Mongol script'

I don't know what is meant by 'one and only' since there are other vertical scripts, and even if one is only thinking of major vertical scripts written from left to right, the Mongolian script is not unique since the Manchu script is written the same way.

ghaghca has a synonym ghanca. How can that word-medial -gh- ~ -n- alternation be explained - assuming they are related words?

9. Today while double-checking the Li Fanwen number for the common Tangut character


4457 2leq3 'great'

I found these interesting characters which appear to be semantic compounds:


4445 2bi1 = 4457 2leq3 'great' + 2547 1chir2 'right'


4454 2ryr1 = 4457 2leq3 'great' + 2920 1zhyq3 'left'

2920 has the Tangraphic Sea analysis


2920 1zhyq3 'left' = all of 3485 1laq 'hand' + right of 4454 2ryr1

which cannot be taken at face value as the origin of the character - why would a character for a common word 'left' be based on a rare character 4454?

4445 and 4454 are only known as members of these compounds:


4445 0661 2bi1 2ngon4  'South Sea'


4454 0661 2ryr1 2ngon4 'North Sea'

4445 and 4454 are not the normal words for 'south' and 'north' which are


4796 1zyr4 'south' and 0942 1laq3 'north'

Although the Tangut script is thought to be full of semantic compounds, it is curious that 4445 and 4454 - glossed by Li Fanwen (2008: 706-707) as 'south' and 'north' - do not contain any components in common with 4796 and 0942, the graphs for the common words 'south' and 'north'.

Nonetheless Li's glosses make sense: 4445 has the notation


4796 0661 1zyr4 2ngon4 'southern sea'

in Homophones D and is a definition for 4796 'south' in Tangraphic Sea 89.251. And if 4454 contains 'left', the opposite of the 'right' in 4445, then 4454 must be 'north', the opposite of 4445 = 4796 'south'. But I am hesitant to gloss 4445 and 4454 simply as 'south' and 'north'. Maybe 'Great South' and 'Great North' or even as 'Great Right' and 'Great Left'?

The association of 'south' with 'right' is reminiscent of Sanskrit dakṣiṇa- 'south/right'. Sanskrit uttara- 'north' can also mean 'left', but the normal word for left is vāma- which does not mean 'south'.

What were the Great South/Right and Great North/Left Seas? Were they mythical? I don't know much about how the landlocked Tangut perceived their world. How many Tangut had ever seen a sea? What is the etymology of 2ngon4 'sea'?

10. Today I saw this passage in Gorelova ( :15; I added the links):

The Mohes [靺鞨] called their tribal leader "damofo mandu" (chin. da [大] "great"), as one can see further, the Southern Shiwei [室韋], who can be identified as people of Tungusic descent, called their tribal chieftains "yumofo mandu".


The language spoken by the Mohe was Tungus-Manchu. What is important to mention is that the language of the Sushen could also be referred to as proto-Tungusic.

During the Tang era, the Mohe, similar to other peoples of northeastern Asia, were subjected to constant political and military pressure from Tang rulers. Soon after the Koguryo state of Korea had been defeated by the Tang empire (668 AD), a large portion of the Koguryo people fled into the lands of the Sumo Mohe [粟末靺鞨]. Soon a lot of towns, surrounded by defensive walls, arose there. Around 700, a new state, "Parhae" (chin. Bohai), raised from the ruins of Koguryo, was established. It was the leader of Sumo Mohe, Cicik Zhungxiang [乞乞仲象] who was considered the creator of Bohai. [...] Later, his grandson, Uazhi Da Tuyu, declared himself the emperor of Bohai, which in the course of time became highly cultured and enlightened, and widely known beyond the borders of the country. The Parhae (Bohai) state—a deserving successor of the culture and power of Koguryo and the tribal league of the Songari Mohe—flourished for 228 years until it was destroyed by the Qitans [Khitans] (926 AD) (Shavkunov, 1968; Crossley, 1997:18; Larichev, 1998:53-4).

What are the characters for damofo mandu and yumofo mandu which sound like modern Mandarin readings of old Chinese transcriptions?

I was surprised to see the Southern Shiwei described as Tungusic since their name - roughly pronounced *shirwi in Late Middle Chinese - is derived from the para-Mongolic autonym Serbi. But of course names are not reliable guides to linguistic affiliation.

Cicik Zhungxiang is a strange, not-quite-Pinyin romanization of 乞乞仲象 Qǐqǐ Zhòngxiàng with a -k whose motivation is obscure. Assuming the Chinese pronunciation favored in Parhae was like early Sino-Korean, 乞乞仲象 was pronounced something like *kər kər tyung syang. 乞乞 <BEG BEG> looks like an insulting ('derographic') transcription of a non-Chinese (i.e., Mohe) name. 乞乞仲象 is also known as 大仲象 with a Chinese-style surname 大 <GREAT> to go along with the Chinese-style disyllabic personal name 仲象 <SECOND.BORN ELEPHANT>.

Uazhi Da Tuyu is presumably 乞乞仲象's son (not grandson) 大祚榮 (Mandarin: Dà Zuòróng, Korean: Tae Cho-yŏng; r. 712-719), the first king (not emperor) of Parhae. I have no idea what Uazhi is.

11. The best for last: I just discovered Andrew West's Tangraphic Sea search tool! More Tangut resources here.

YELLOW PIG 11/20

<so nggiyan uliya aniya juwa emu biya orin inenggi>

'yellow pig year, ten one month, twenty day'

1. Jurchen and Manchu orin 'twenty' sounds like Written Mongolian qorin 'id.' The pronunciation of Khitan

廿 <TWENTY> (large script)

丁 <TWENTY> (small script)

is unknown; it could have been something like qorin.

Normally Written Mongolian q corresponds to h or k in Manchu.

Rozycki (1983: 11-12) proposes four layers of borrowing into (Jurchen/)Manchu to explain the different correspondences:

Layer 1: Mongolic *q- borrowed as *k- > *x- > *Ø- (within Tungusic): e.g., orin 'twenty'

Layer 2: Mongolic *q- borrowed as *k- > *x- (within Tungusic): e.g., hoton 'city wall' (cf. Written Mongolian qoton 'id.')

Layer 3: Mongolic *q- borrowed as k-: e.g., kobkolo- 'to remove (paper stuck to a surface)' (cf. Written Mongolian qubqol- 'to peel')

Layer 4: modern Mongolic *q- > x- borrowed as h-

This model could be refined: e.g., in the early layers, the borrowing was probably from para-Mongolic (specifically Khitan) rather than from Mongolic.

There doesn't seem to be any way to distinguish between layers 2 and 4 on the basis of Manchu evidence. I suppose Rozycki assigns Manchu words to layer 2 if the borrowings are found elsewhere in Tungusic (e.g., see Doerfer [1985: 81] for hoton-type Tungusic words). Layer 2 words were borrowed into early Tungusic, whereas layer 4 words were borrowed only into (Jurchen/)Manchu.
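The four correspondences can be restated as a small lookup. The following Python sketch is only an illustration and not Rozycki's own procedure: the function name candidate_layers is made up, it assumes the word is already known to be a loan from a Mongolic source beginning with *q-, and it looks at nothing but the Manchu initial.

```python
# Reflexes of Mongolic *q- in Manchu loans, following the four layers above.
# Names and data layout are illustrative assumptions, not from Rozycki.
MANCHU_VOWELS = {"a", "e", "i", "o", "u", "ū"}


def candidate_layers(manchu_word: str) -> list[int]:
    """Return the layers compatible with the initial of a Manchu loan
    whose Mongolic source is already known to begin with *q-."""
    initial = manchu_word[:1]
    if initial == "k":
        return [3]        # *q- borrowed as k-
    if initial == "h":
        return [2, 4]     # cannot be told apart on Manchu evidence alone
    if initial in MANCHU_VOWELS:
        return [1]        # *q- > *k- > *x- > zero
    return []             # not one of the reflexes covered by the layers


for word in ("orin", "hoton", "kobkolo-"):
    print(word, candidate_layers(word))
```

An h-initial word comes back with two candidates, which is the ambiguity between layers 2 and 4 noted above. Resolving it would need evidence from outside Manchu, such as cognate borrowings elsewhere in Tungusic.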

2. The Khitan large script character 廿 <TWENTY> is identical to the standard Chinese character 廿 <TWENTY> which was pronounced *ɲip in Middle Chinese, a fusion of 二 *ɲi̤ 'two' and 十 *dʑip 'ten'. Wiktionary says the expected standard Mandarin reflex is rì, but the actual reflex is niàn because

[t]he irregular pronunciation (e.g. /nVm/ [with the nasal counterpart of the original coda /p/] dates from the Song dynasty, to avoid homophony with a vulgar word; see 入.

Let's see 入:

The regular Mandarin pronunciation [for 入 <ENTER>] as predicted from Middle Chinese is rì. The irregular sound change [to rù] is for taboo reasons - to avoid homophony with its derived vulgar meaning "to enter > to have sexual intercourse", nowadays represented by 日 (rì).

I would expect 廿 to be nhập in Vietnamese since 二 'two' is nhị and 十 'ten' is thập. Wiktionary lists five Vietnamese readings of 廿:

The normal Vietnamese word for 'twenty' is native: 𠄩𨑮 hai mươi 'two ten', which has its own contracted form hăm (with short ă instead of long a!).

3. Is it obvious to Koreans that the hangul title of the movie 독전 Tokchŏn (English title: Believer) is 毒戰 <POISON BATTLE> tokchŏn rather than 督戰 <SUPERVISE BATTLE> tokchŏn 'urging to fight harder'?

Only the second tokchŏn is in dictionaries. The first tokchŏn is a straightforward Koreanization of the title of its inspiration, the Chinese movie 毒戰 (Mandarin Dúzhàn, Cantonese Duk6 zin3; English title: Drug War).

The fact that some websites call the Korean movie 독전: 마약전쟁 Tokchŏn: mayak chŏnjaeng 'Poison Battle: Narcotic Wars' implies that Tokchŏn by itself might need clarification. In hanja that longer title looks redundant with two 戰 chŏn: 毒戰: 痲藥戰爭.

4. Naver's Korean-English dictionary gave this sentence as an example of tokchŏn:

암튼 '독전' 화이팅 할까요?

Amthŭn 'Tokchŏn' hwaithing halkkayo?

'Anyway, shall we do "Believer" fighting?'

That made me curious about the etymology of 암튼 amthŭn 'anyway'. Is it of recent origin? I couldn't find it in Martin et al.'s massive 1967 Korean-English dictionary or my old portable favorite, Dong-A's 1981 Korean-English dictionary.

I think 암튼 is an extreme example of contraction:

아무리 하려 하면 하든지

amu-ri ha-ryŏ ha-myŏn ha-dŭ-n-ji


Martin et al. (1967: 1093) derive 암 am 'surely' from


amuryŏmyŏn 'surely'

which according to Martin et al. (1967: 1073) is in turn a contraction of

아무리 하려 하면

amu-ri ha-ryŏ ha-myŏn


thŭn is a reduction of

하든지

ha-dŭ-n-ji

Martin (1992: 834) translates -dŭ-n-ji as 'the uncertain fact that it has been observed that', 'whether it was (observed to be/happen)'. -ji can be dropped. That leaves hadŭn /hatɯn/. th- /tʰ/ looks like the product of syncope, metathesis, and fusion:

/hat/ > /ht/ > /th/ > /tʰ/

Metathesis is a regular process in Korean: /hC/ cannot surface as [hC].
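The three steps can be traced mechanically. This is a minimal Python sketch under my own assumptions: the step names and the derive helper are invented for illustration, each rule is a plain string replacement applied once, and the input is the phonemic form /hatɯn/ given above.

```python
# Ordered rewrite rules for /hat/ > /ht/ > /th/ > /tʰ/.
# The rule list and function name are illustrative, not an established notation.
STEPS = [
    ("syncope", "hat", "ht"),     # loss of the vowel between /h/ and /t/
    ("metathesis", "ht", "th"),   # /hC/ cannot surface as [hC]
    ("fusion", "th", "tʰ"),       # /t/ + /h/ fuse into an aspirate
]


def derive(form: str) -> list[str]:
    """Apply each step once, in order, and return every intermediate stage."""
    stages = [form]
    for _name, old, new in STEPS:
        form = form.replace(old, new, 1)
        stages.append(form)
    return stages


print(" > ".join(derive("hatɯn")))
```

Prefixing the result with /am/ gives /amtʰɯn/, i.e., amthŭn.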

(12.16.0:16: The reduction of /hat/ to /tʰ/ above parallels the reduction of the first syllable of the Korean root 'to ride' between the 12th and 15th centuries:

12th c. *hʌta- > *hta- > 15th c. tha- /tʰa/

The 12th century form is preserved in Chinese transcription as 轄打 *xjaʔta in Jilin leishi. I have followed the conventional view by reconstructing *ʌ in the first syllable, but now it occurs to me that Chinese *-ja- might reflect a 12th century Korean *(y)e or *yə. Perhaps

pre-12th c. *heta- > 12th c. *h(y)eta- or *hyəta- > *hʌta- > *hta- > 15th c. tha- /tʰa/

I reconstruct *e as a front low series vowel in early Korean:




That *e later broke to yə (= yŏ in my modified McCune-Reischauer romanization), the most common yV-sequence in native Korean words.

In my scenario for 'to ride' above, *(y)e or *yə was reduced to *ʌ, the minimal low series vowel, before being lost. By that point Korean had developed vowel harmony, so the vowel in the first syllable had to be a low series vowel like the *a in the second syllable.)

5. More examples of metathesis in Korean:

암클 amkhŭl or 암글 amgŭl < /am(h) kɯr/

'useless knowledge, female writing, hangul'

수클 sukhŭl or 수글 sugŭl < /su(h) kɯr/

'useful knowledge, male writing, Chinese characters'

That pair of words is not only sexist but also reflects a Sinocentric worldview.

The final /h/ of /amh/ 'female' and /suh/ 'male' surfaces as aspiration following a stop which in this case is the /k/ of /kɯr/ 'writing'.

The variants with -gŭl are compounds in which 'female' and 'male' have been reinterpreted as /am/ and /su/ without /h/. /k/ voices after voiced segments: /m/, /u/, and the /n/ of han'gŭl /hankɯr/ 'great/Korean-writing'.
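The difference between the -kh- and -g- variants comes down to whether the first element still ends in /h/. Here is a minimal Python sketch of that choice. The compound function and the segment tables are my own illustrative assumptions, and extending aspiration and voicing from /k/ to /p/ and /t/ goes beyond the examples above.

```python
# Joining two morphemes in phonemic notation.
# Tables and function name are illustrative assumptions.
ASPIRATED = {"p": "pʰ", "t": "tʰ", "k": "kʰ"}
VOICED = {"p": "b", "t": "d", "k": "g"}
VOICED_FINALS = set("mnŋrlaeiouɯɛ")  # nasals, liquids, and vowels


def compound(first: str, second: str) -> str:
    """Join two morphemes, aspirating or voicing the second one's initial stop."""
    onset, rest = second[:1], second[1:]
    if first.endswith("h") and onset in ASPIRATED:
        # Final /h/ surfaces as aspiration on the following stop.
        return first[:-1] + ASPIRATED[onset] + rest
    if first[-1:] in VOICED_FINALS and onset in VOICED:
        # A plain stop voices after a voiced segment.
        return first + VOICED[onset] + rest
    return first + second


for parts in (("amh", "kɯr"), ("am", "kɯr"), ("suh", "kɯr"), ("han", "kɯr")):
    print(parts, compound(*parts))
```

With /amh/ the output has kʰ and with reinterpreted /am/ it has g, matching amkhŭl and amgŭl respectively.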

Naver regards the -g-forms (amgŭl, sugŭl) as correct and states that the -kh-forms are erroneous (see here and here), though Martin et al. (1967: 1011, 1095) only list the -kh-forms. Does that indicate the reanalysis of 'female' and 'male' as being without /h/ has been completed over the past half-century? Not quite - the official standard for Korean still requires aspiration in, for instance, 암캐 amkhae 'female dog' < /amh kɛ/ (not 암개 amgae!) in which am- is still clearly 'female' (한글 맞춤법 Hangul Spelling 4.4.31 and 표준어규정 Standard Language Code 1.1.7). Perhaps the 'writing' words have lost their gendered associations.

I found amkhŭl in Martin et al. (1967: 1095) when looking in vain for amthŭn (ㅋ kh is before ㅌ th in Korean alphabetical order).

6. I was surprised to learn that 怒濤 <ANGER WAVE> (dotō) is a Japanese name for a kind of Faucaria plant.

(12.16.2:22: The same characters are the Chinese name [Mandarin nùtāo] for Faucaria paucidens.)

The Korean name for Faucaria tuberculosa is a combination of that and the kanji for the Japanese name of Faucaria tuberculosa (荒波 aranami 'wild wave') read in Sino-Korean: 怒濤荒波 nodo hwangpha.

Sino-Korean 怒濤 nodo 'angry wave' by coincidence sounds like the unrelated native Japanese word 喉 nodo 'throat' - and by another coincidence, Faucaria is from Latin fauces 'throat'.

7. I thought faucet might be related to fauces 'throat', and Wiktionary agrees, but Merriam-Webster gives a derivation I don't quite follow:

Middle English, bung, faucet, from Middle French fausset bung, perhaps from fausser to damage, from Late Latin falsare to falsify, from Latin falsus false

Falsetto turns out to be from falsus too.

The bottom of Merriam-Webster's entry for faucet led me to their Time Traveler feature showing what words were first attested in English in a given century: e.g., the 15th century (faucet, favored, feasible ...).

