188.8.131.52:59: YELLOW PIG 11/26
<so nggiyan uliya aniya juwa emu biya orin ninggu inenggi>'yellow pig year, ten one month, twenty six day'
1. orin ninggu 'twenty six' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin jirghughan 'twenty six' containing an unrelated Mongolian word for 'six'.
Grinstead (1972: 16) noted that
is an inverted Chinese 六 <SIX>. It is not like any of the variants of Khitan large script <SIX>:
Is the Jurchen graph a 12th century invention, or is it derived from a version of the Parhae <SIX> that the Khitan did not adopt for their large script?
The reading of Khitan <SIX> is unknown, but it might be
something like Proto-Mongolic *jir-gu-xan 'two-three-NUMERAL'
as reconstructed by Janhunen (2003: 17). Jishi read <SIX> as ʧirkɔ:
i.e., as 'two-three'. But if Janhunen is right about *jir-gu-xan
being an innovation, Khitan might retain an older Proto-Serbi-Mongolic
root for 'six'.
The Khitan small script block
<085.033.288> <SIX.is.bun> (Epitaph for Empress 仁懿 Renyi, d. 1076)
might indicate that <SIX> ended in -i, given how the
initial vowel of one block (here, the i of <is>) is often
(but not always) the final vowel of the previous block (here,
2. What is the etymology of Hawaiian luakini 'large heiau [Hawaiian temple; < hei 'sacrifice' + ?] where ruling chiefs prayed and human sacrifices were offered'? It looks like a compound of lua plus kini, but I can't find any lua or kini that would transparently add up to 'sacrificial temple'.
on the Dzungar genocide:
[Qing emperor] Qianlong issued his orders multiple times as some of his officers were reluctant to carry them out. Some were punished for sparing Dzungars and allowing them to flee, such as Agui and Hadada, while others who participated in the slaughter were rewarded like Tangkelu and Zhaohui (Jaohui).
If Tangkelu is a Manchu name, it violates vowel harmony. I would expect Tangkalu
4. I wish I could look for Tangkelu in Giovanni Stary's A Dictionary of Manchu Names (2000). The book's National Library of Australia listing says it's in "Mandingo" (sic). No.
5. In actual
Mandingo, "/g/ and /p/ are found in French loans." The language has
/k c j t d b/, though. Are /h/ and /f/ in part or in whole from earlier
*g and *p?
IPA transcription of the Kazakhstani national anthem is so
different from what one might think Kazakh sounds like solely on the
basis of the Cyrillic or Latin alphabet: e.g.,
'courage-GEN epic-3.POSS.NOM' = 'epic of courage'
One might expect the pronunciation to be something like [erliktiŋ dastanɨ] on the basis of Cyrillic and Latin alone. And if one guessed that Cyrillic і was [i], what would one guess и is? (It's [ɪj] ~ [əj] according to this chart.)
The use of ы/y for [ə] reminds me of my own choice to use y for the Tangut neutral vowel which may have been [ə] or [ə]-like in one or more grades.
The 3rd person singular possessive suffix -ы/y is missing
from this table. See Mukhamedova
(2016: 81) on the Kazakh X-GEN Y-POSS 'Y of X' construction.
7. Why does Glosbe
align Kazakh дастан 'epic' with Dennis in translations?
8. Until now I assumed that Turkic beg was a loanword from the Middle Chinese title 伯 *pæk. That is the etymology in Clauson (1972: 322). But Wiktionary has a second etymology:
the Middle Persian title bag (also baγ or βaγ, Old Iranian baga; cf. Sanskrit भग / bhaga) meaning "lord" and "master". Peter Golden derives the word via Sogdian bġy from the same Iranian root. All Middle Iranian languages retain forms derived from baga- in the sense "god": Middle Persian bay (plur. bayān, baʾān), Parthian baγ, Bactrian bago, Sogdian βγ-, and were used as honorific titles of kings and other men of high rank in the meaning of "lord".
The problem I have with this etymology is: why was a
in some Iranian language borrowed as Turkic e?
If /a/ in the Iranian source language was [æ], how can Slavic bog 'god' be a loan from Iranian? Was the Slavic word borrowed from a different Iranian source language in which /a/ was back and labial: [ɒ] or [ɔ]?
As for the Chinese etymology, the mismatch of initials (Chinese *p- vs. Turkic b-) is not a problem if the borrowing was in an early Turkic variety without p-. (Pre-Proto-Turkic *p- became Proto-Turkic *h- which was preserved in Khaladj and was lost elsewhere.)
The -g of beg might be a Turkic approximation of a Chinese (allophonic?) [ɣ]-like pronunciation of *-k. Although Old Turkic did have gh, gh could not coexist with e, but g could. And at some point, Middle Chinese *æ raised to *ɛ. Late Middle Chinese *pɛɣ was transcribed in the Tibetan version of the 千字文 Thousand Character Classic (c. 9th-10th c.?) as <peg.> which is close to Turkic beg. (However, the Turkic word is first attested in the 8th century, possibly when 伯 was closer to *pæk than *pɛɣ in western Middle Chinese.)
9. If I understand this correctly, Haddow is a Germanic/Celtic (Scots + Scots Gaelic) hybrid. Are there more common names like it?
10. Aacistak has been called "the Language Capital of the World". What is
its more common name?
184.108.40.206:55: YELLOW PIG 11/25
<so nggiyan uliya aniya juwa emu biya orin shunja inenggi>'yellow pig year, ten one month, twenty five day'
1. orin shunja 'twenty five' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin tabun 'twenty five' containing an unrelated Mongolian word for 'five'.
The initial of 'five' in Manchu is s-, not sh-.
Neither Jurchen sh- nor Manchu s- matches the
t- in the rest of Tungusic.
2. Last night I thought of a Chinese character for the first time in
many years: 閼. It has the same phonetic as a character that I first
encountered last week: 菸.
That phonetic is a drawing of a crow: 於/烏. 烏 still represents the
word for crow, but its variant 於 has come to represent a nearly
homophonous locative preposition.
Normally 於/烏-graphs represent open syllables in modern languages: e.g.,
烏嗚 Old Chinese *ʔa > Cantonese wu1
於 Old Chinese *CIʔa > Cantonese jyu1
棜瘀 Old Chinese *CIʔas > Cantonese jyu3
So in Cantonese, I would expect 閼 and 菸 to end either in -u [u] or -yu [y]. But they don't:
閼 Cantonese aat3 ~ jin1
菸 Cantonese jin1
The vowels are less of an issue (see the appendix) than the codas:
Cantonese -t and -n go back to Old Chinese *-t and *-n.
Usually an Old Chinese phonetic can represent Old Chinese *-t syllables or *-n syllables but not both.
And usually an Old Chinese phonetic for *vowel-final syllables can also represent Old Chinese *-ʔ and *-(ʔ)s syllables but not Old Chinese syllables ending in stops other than *-ʔ or ending in *nasals.
In other words, 於/烏 should represent *-a(ʔ)(s) syllables but not *-t syllables or *-n syllables. Should. But clearly 於 is a phonetic in
Old Chinese *ʔat > Cantonese aat3
Old Chinese *CIʔan and *ʔen (syllable in the title of the Xiongnu supreme female leader) > Cantonese jin1
Old Chinese *ʔa(t)s > no Cantonese reflex (which would theoretically be *jyu3)?
Old Chinese *CIʔat > no Cantonese reflex (which would theoretically be *jit3)?
Old Chinese *CIʔa 'to fade' > no Cantonese reflex (which would theoretically be *jyu1)?
Old Chinese *CIʔas 'smelly grass' > no Cantonese reflex (which would theoretically be *jyu3)?
I have not found any evidence for 菸 being read with -n before the last millennium. At some point 菸 came to represent a word 'tobacco' < 煙/烟 Old Chinese *CAʔin 'smoke' normally written with -n phonetics (垔 and 因). The top component of 菸 'tobacco' is <GRASS> which makes sense. But the bottom component 於 is a poor phonetic (and 於 is unlikely to be an abbreviation of the uncommon character 閼 which also has non-n readings). Was 菸 'smelly grass' chosen to write an unrelated and phonetically different but semantically similar word 'tobacco'?
I found 菸 via Wiktionary's entry on yen. I forgot that yen could also refer to having a desire for something.
12.22.19:22: APPENDIX: Some *-a rhymes from Old Chinese to Cantonese:
*Ca > *Co > [Cuː]: e.g., Cantonese 烏嗚 wu1 [wuː˥]
*CICa > *CICɨa > *Cɨa > *Cɨə > *Cɨ > [Cyː]: e.g., Cantonese 於 jyu1 [jyː˥]
*Cat > [Caːt]: e.g., Cantonese 閼 aat3 [aːt˧]
*CICan > *CICɨan > *Cɨan > *Cɨən > *Cɨen > *Cien > [Ciːn]: e.g., Cantonese 閼 jin1 [jiːn˥]
*CACin > *CACein > *Cein > *Cen > *Cien > [Ciːn]: e.g., Cantonese 菸煙烟 jin1 [jiːn˥]
Old Chinese *-s conditions Cantonese tone 3 after *voiceless initials
Old Chinese *-t conditions Cantonese tone 3 after *voiceless initials and *long vowels
At some point after tonogenesis, *ʔ- was lost, and zero initials became homorganic glides before high vowels:
*ʔu > u > [wuː]: e.g., Cantonese 烏嗚 wu1 [wuː˥]
*ʔy > y > [jyː]: e.g., Cantonese 於 jyu1 [jyː˥]
*ʔi > i > [jiː]: e.g., Cantonese 閼 jin1 [jiːn˥]
Contrast with *ʔa > nonhigh [a] without a glide in Cantonese 閼 aat3 [aːt˧].
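The glide rule above can be sketched in Python. This is only an illustration under simplifying assumptions: syllables are plain strings, vowel length and tone are ignored, and the names GLIDES and add_glide are hypothetical labels chosen for this sketch rather than anything from a published source.

```python
# Sketch of the post-tonogenesis rule described above:
# 1. initial *ʔ- is lost;
# 2. a zero initial before a high vowel becomes a homorganic glide.
# Assumption: only u, y, i count as high vowels here; length and tone are ignored.
GLIDES = {"u": "w", "y": "j", "i": "j"}


def add_glide(syllable: str) -> str:
    """Drop an initial glottal stop, then prefix a glide before a high vowel."""
    if syllable.startswith("ʔ"):
        syllable = syllable[1:]
    # syllable[:1] is the first segment, or "" for an empty string
    glide = GLIDES.get(syllable[:1], "")
    return glide + syllable


for form in ["ʔu", "ʔy", "ʔin", "ʔat"]:
    print(form, ">", add_glide(form))
```

The nonhigh vowel in the last form gets no glide, matching the contrast between jin1 and aat3 noted above.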
220.127.116.11:51: YELLOW PIG 11/24
<so nggiyan uliya aniya juwa emu biya orin duin inenggi>'yellow pig year, ten one month, twenty four day'
1. orin duin 'twenty four' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin dörben 'twenty four' containing an unrelated Mongolian word for 'four'. -ben is the 'feminine'¹ vowel variant of the -ban found in ghurban 'three', and both ghurban and dörben have a shared suffix -r- (Janhunen 2003: 47).
Rozycki (1983: 7, 93) regards Jurchen/Manchu duin and
Written Mongolian dörben to be a "[p]re-loan correspondence":
"words with a phonology consistent with native Tungus stock and for
which there is no evidence of loaning". I regard the vague similarity
of duin and Proto-Mongolic *dö- 'four' (as
reconstructed by Janhunen 2003: 47) as coincidental.
¹I use the term 'feminine' to avoid
committing to a front or higher vowel interpretation of e.
2. Yesterday I forgot how to pronounce 6ix9ine which
looks like it was written in the Arabic chat
alphabet (in which 6 is ط <ṭ> and 9 is ص <ṣ> or ق
<q>). But it's actually a
stylized spelling of six nine mixing logograms with letters.
The Jurchen (large) script, Korean hyangchal, and Japanese script
frequently have logogram-phonogram sequences for words. Perhaps the
Khitan large script did too, but it's too poorly understood for me to
How did Tekashi 6ix9ine come up with the stage name Tekashi?
Is it based on Japanese Takashi?
3. I knew Ў wasn't unique to Belarusian (in which it represents /w/), but I forgot which other language was written with Ў: Uzbek. Ў has since been replaced with Oʻ. Ў/Oʻ represents mid /o/, whereas О/O represents low /ɒ/ and /o/ in Russian loans. Did Uzbeks perceive Russian /o/ [o] ~ [ɔ]² as being lower than their /o/ and closer to their /ɒ/? Does native /o/ have a high allophone [ʊ]? That would explain why it was written as Ў: i.e., as У <U> with a breve rather than as О <O> plus a diacritic.
²For some reason, Wikipedia IPA has [ɛ] for Russian
/e/ and [o] (not [ɔ]) for Russian /o/ even though this
diagram shows the two vowels at almost identical heights with [o] lower
than [ɛ] rather than the other way around.
4. Cyrillic Ӯ (Ұ after 1957; see here for other uses of Ӯ) for Kazakh /ʊ/ reminds me of Möllendorff's Ū for Manchu /ʊ/.
The 'feminine' counterpart of Manchu /ʊ/ is /u/, but Kazakh has no /u/. It has an interesting three-way categorization of vowels: -RTR, 0RTR (neutral), and +RTR. The [-RTR] and [0RTR] counterparts of [+RTR] /ʊ/ are /ɪ/ and /ʉ/. (Kazakh has no /i/ either. If the IPA symbols are taken at face value, apparently the only high vowel is central /ʉ/; /ɪ/ and /ʊ/ are slightly lower.)
Is Kazakh /œ/ backed if not central? It is a [0RTR] vowel like /ʉ əj
ə/ despite being written with a front vowel symbol like the [-RTR]
vowels /ɪ jɪ e æ/.
5. I wish I had a key to the 1964-1984 Kazakh Latin alphabet used in
China (and in this
1977 edition of Mao's Selected
6. Last night I found Handel
(2006) while trying to find where I had first encountered the idea
that Korean 바람 param < Middle Korean pʌ̀rʌ̀m
'wind' was a borrowing from Old Chinese. I thought I had read it in
Pulleyblank (1962), but I couldn't find it there. This 2013 post
reminded me I got it from William Boltz. My apologies to Professor
Handel discusses 'wind' on page 1015. In footnote 8, he mentions an
internal etymology relating Korean 'wind' to pul- < Middle
Korean pǔr- 'to blow'. Although the semantic match is perfect,
the phonetic match leaves much to be desired. First, I know of no other
cases of a CʌC-noun from a CuC-verb. Second, Middle
Korean pǔr- is a class 5 stem in Ramsey's (1986) typology; it
is a disyllabic stem /pùúr/, and if I understand Ramsey (1978: 221)
correctly, it goes back to *pùrɯ́- with high series vowels and
a high-low pitch pattern unlike the low-pitched low series vowels of pʌ̀rʌ̀m.
part of the Wikipedia article on the Common Turkic Alphabet puzzles me:
Some handwritten letters have variant forms. For example: Čč=Jj, Ķķ=Ⱪⱪ, and Ḩḩ=Ⱨⱨ.
But Lithuanian Karaim, the only Turkic Latin alphabet that I
know of with Č, distinguishes Č (for [tʃ]) from J
(for [j]). And I find it hard to believe that two letters with such
different shapes could be variants only in Turkic usage.
Of course in general Latin letter usage there are some surprising
variants. Would an alien guess that B and b are the
same letter? Uzbek used to have в instead of b in the 1928-40
Yaꞑalif alphabet. (I am not italicizing в since I'm not sure if the old
Uzbek italic в looked like Russian italic в.)
Turns out that "[t]he small letter B is ʙ (to prevent confusion with Ь ь)". Although Ь represented palatalization in Russian, in Yaꞑalif, it seems to have stood for Soviet Turkic vowels similar to Turkish ı: e.g., Tatar [ɤ]. Uzbek had no such vowel:
Nonetheless I guess ʙ remained the lowercase version of B in Uzbek
for consistency with the other variants of Yaꞑalif. You
can see Uzbek ʙ here.
8. I've never looked at Karakalpak before today. I confess I forgot it even existed.
It has a nearly symmetrical vowel system with palatal vowel harmony. Only e has no nonpalatal counterpart.
It also has labial harmony. If the first vowel is nonlabial, then the second vowel cannot be labial. However, if the first vowel is labial, then the second vowel may or may not be labial. In any case, vowels must match in palatality.
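The two harmony constraints just described can be stated as a small Python check. This is a sketch that works with abstract features only, since I am not committing to a specific Karakalpak vowel inventory here; the names Vowel and allowed_sequence are hypothetical.

```python
from typing import NamedTuple


class Vowel(NamedTuple):
    """Abstract vowel described only by the two features that matter for harmony."""
    palatal: bool
    labial: bool


def allowed_sequence(first: Vowel, second: Vowel) -> bool:
    """Apply the two constraints as stated in the text above."""
    # Palatal harmony: both vowels must match in palatality.
    if first.palatal != second.palatal:
        return False
    # Labial harmony: a nonlabial first vowel cannot be followed by a labial one.
    if not first.labial and second.labial:
        return False
    # A labial first vowel may be followed by a labial or a nonlabial vowel.
    return True


nonlabial_back = Vowel(palatal=False, labial=False)
labial_back = Vowel(palatal=False, labial=True)
labial_front = Vowel(palatal=True, labial=True)

print(allowed_sequence(labial_back, nonlabial_back))   # labial then nonlabial
print(allowed_sequence(nonlabial_back, labial_back))   # nonlabial then labial
print(allowed_sequence(labial_back, labial_front))     # palatality mismatch
```

The asymmetry is the point: labiality can be dropped after the first syllable but not introduced.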
How was Karakalpak /h/ written in Cyrillic? I can't find a Cyrillic letter for it.
9. Wikipedia says that
The [irregular] /otoosan/ form [for Japanese 'father'] first appears in the early Meiji period in educational materials mandated by the 文部省 (Monbushō, "Ministry of Education").
Did /otoosan/ replace earlier /otossan/ by analogy with the long vowel of /okaasan/ 'mother'?
/okaasan/ is itself irregular; it is from /okakasan/
with irregular intervocalic /k/-loss.
Wikipedia lists Taiwanese borrowings of both words: 多桑 <MANY
MULBERRY> tò-sàng and 卡桑 <kha MULBERRY> khà-sàng. Both
reflect shorter Japanese forms without the honorific prefix o-.
19.12.18.xx:xx: YELLOW PIG 11/23
<so nggiyan uliya aniya juwa emu biya orin ilan inenggi>'yellow pig year, ten one month, twenty three day'
1. orin ilan 'twenty three' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin ghurban 'twenty three' containing an unrelated Mongolian word for 'three'.
2. Yesterday I learned that Eom Ik-sang still believes a number of Korean words conventionally regarded as native are actually borrowings from Old Chinese. Even if I assume the Old Chinese forms he cites are correct, there are still issues.
Perhaps the most convincing of his proposals is
Old Chinese 風 *pljəm (Li), *plums (Zhengzhang) 'wind' : Korean 바람 param 'id.'
I would prefer to cite Middle Korean pʌ̀rʌ̀m 'wind' which is even closer to the Old Chinese reconstructions that he cites.
Although I expressed some doubts about a liquid in the Old Chinese word for 'wind'
in 2013, I would favor reconstructing that word as *prəm
with *-r- now.
That aside, there is one other potential problem with the comparison: I don't think anyone's Old Chinese reconstruction for 'wind' ever had the vowel *ʌ. If the Old Chinese word for 'wind' had *ə, why was it borrowed into early Korean as something like pʌ̀rʌ̀m when Korean also had the vowel ə? In other words, why isn't the Korean word for 'wind' pərəm with ə?
12.19.22:33: Was Edkins (1890: 95) the first to derive Korean param from Old Chinese 風?
param, wind; from [an unspecified - presumably Chinese -] pam. The old Chinese for wind is bam, which has changed to [Mandarin] feng.
Edkins was writing decades before Karlgren reconstructed Old
Chinese. I know almost nothing about pre-Karlgren Chinese
reconstructions, so I wonder what the reasoning behind pam and bam
is. *pam is not a bad guess, since even in the 19th century,
it was known that f- was from *p- and that 'wind' rhymed
with 南 'south' (Mandarin nán and Cantonese naam4).
However, *b- is a surprise, as 'wind' does not have a tone
pointing to an earlier *voiced initial.
3. I've never seen anything like this use of the reflexive in Romagnol:
mè a sò 'I am' (cf. Italian [io] sono 'id.')
The reflexive seems less exotic in this case:
mè a j'ò 'I have' (cf. Italian [io] ho 'id.')
And the English and Italian translations of this last instance also have a reflexive:
mè a'm so lavê 'I washed myself' (cf. Italian [io] mi sono lavato 'id.')
Romagnol has an inventory of up to 20 contrastive vowels in stressed position, in comparison to Italian's 7.
Unfortunately Wikipedia doesn't list all 20 vowel phonemes. How did the 10 native vowels of Latin become 20 in Romagnol? Are some of the Romagnol vowels from Latin diphthongs?
The most interesting Romagnol vowels are these diphthongs which are
unlike anything in Latin:
ê [ɛə̯] vs. ë [ɛɐ̯]
ô [ɔə̯] vs. ö [ɔɐ̯]
I assume they are phonemes, though Wikipedia represents them with
phonetic brackets. /Və̯/ : /Vɐ̯/ is a fine contrast I've never seen
5. How did Neapolitan
develop this alternation?
luongo [ˈlwoŋɡə] 'long' (masculine)
longa [ˈloŋɡə] 'long' (feminine)
Did an earlier *o break to [wo] before the masculine ending *-o
merged with the feminine ending *-a?
*ˈloŋɡo > *ˈlwoŋɡo > [ˈlwoŋɡə]
*ˈloŋɡa > *ˈloŋɡa > [ˈloŋɡə]
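The ordering in that hypothesis matters, and a short Python sketch makes the point. This is only an illustration of the two-step scenario asked about above, not an established account of Neapolitan; the function names are hypothetical, and the breaking rule is simplified to target the first o of the form (which is the stressed vowel in this example).

```python
def break_before_final_o(form: str) -> str:
    """Hypothesized step 1: *o > wo when the word ends in masculine *-o."""
    if form.endswith("o"):
        stem, final = form[:-1], form[-1]
        # Simplification: the first o in the stem is taken to be the stressed vowel.
        return stem.replace("o", "wo", 1) + final
    return form


def merge_final_vowels(form: str) -> str:
    """Hypothesized step 2: final *-o and *-a merge as [ə]."""
    if form.endswith(("o", "a")):
        return form[:-1] + "ə"
    return form


def derive(form: str, rules) -> str:
    """Apply the rules in the given order."""
    for rule in rules:
        form = rule(form)
    return form


PROPOSED = [break_before_final_o, merge_final_vowels]
REVERSED = [merge_final_vowels, break_before_final_o]

for start in ["ˈloŋɡo", "ˈloŋɡa"]:
    print(start, ">", derive(start, PROPOSED), "| reversed order:", derive(start, REVERSED))
```

With the proposed order the masculine and feminine forms stay distinct; with the reversed order the trigger is gone before breaking can apply and the two forms fall together.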
6. While I'm in languages of Italy mode, it just occurred to me that the gorgia toscana is a bit like Jurchen/Manchu in which *p > f (albeit in all environments, not just intervocalically) and *-k- > -h- (see Vovin 1997 for details).
7. I saw a commercial for the IUDs Mirena [məɹiːnə] and Kyleena
[kʰajliːnə]. Those names sound like 'creative' Anglospheric girls'
names. The commercial was aimed at young women. Somebody wanted the
audience to think of IUDs as if they were daughters. The children that
the IUDs are supposed to prevent. Creepy marketing.
18.104.22.168:51: YELLOW PIG 11/22
<so nggiyan uliya aniya juwa emu biya orin juwe inenggi>'yellow pig year, ten one month, twenty two day'
1. orin juwe 'twenty two' is a para-Mongolian
(Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin qoyar
'twenty two' containing an unrelated Mongolian word for 'two'. Jurchen juwe
'two' is not to be confused with Jurchen juwa 'ten'.
2. Last night when trying to figure out the Chinese character
spellings for damofo and yumofo,
I typed <fo> into Windows 10's Pinyin IME and was surprised to
see 仸 <PERSON.夭>. 夭 ǎo/yāo/yǎo is normally not phonetic
沃 wò/wù (the first and more common reading is irregular - a loan from another dialect?)
I would have guessed that 仸 was read as something like yao.
Then I learned that 仸 is a variant of 佛 Fó 'Buddha'. 仸 seems to
be a semantic compound with 天 <HEAVEN> slightly altered to 夭. (天
and 夭 are difficult to distinguish in a sans serif font, but in
handwriting, the top stroke of 天 is written from left to right, whereas
the top stroke of 夭 is written from right to left.)
3. Two elephantine surprises last night: Wiktionary notes a subtle difference between 象 <ELEPHANT> in the PRC standard and nom on the one hand and elsewhere in the Sinosphere on the other. Both versions of 象 have the same codepoint.
I am not sure that the PRC and nom really have a distinct version of 象:
my 1971 edition of 新华字典 Xinhua zidian (The New Chinese Character Dictionary) has the 'wrong' form of 象
my modern edition of the nom handbook Ngũ thiên tự (Five Thousand Characters) has the 'wrong' form of 象 in the main text. (The font in the index is too small for me to make out which form is used.)
4. 象 was also formerly a simplification of 像. The Wiktionary entry for 象 says it was a 1964-1986 simplification of 像. Wikipedia mentions two other characters restored in 1986: 覆 and 叠. I am skeptical:
叠 is already simplified. The traditional form is 疊. Sukhanov
(1980: 25) lists an even more simplified form ⿱又冝 which is 俗 'popular'
and not official. (12.18.0:15: My love of variants compels me to
mention other variants of 疊 listed in Wiktionary: 𣆹曡㬪疉.)
My 1971 edition of 新华字典 Xinhua zidian has the characters 像覆叠 as main entries 15 years before 1986. All main entries are in simplified characters which are followed by their traditional forms in parentheses. The entries for 象 and 复 do not list 像 and 覆 in parentheses.
Conversely, DeFrancis' (1996) dictionary substitutes 象 for 像 in
its main body and lists 像 in an appendix of traditional characters.
5. When trying to type 复 fù in Microsoft's Bopomofo IME, I
found 䲁 <FISH.wèi> wèi 'a snake-like fish' as the
64th and last choice for fù. How did 䲁 get in the list? Graphic
confusion with 鮒 <FISH.fù> fù 'a kind of fish'
which is also in the list?
6. Unidentifiable Khitan small script characters I encountered while copying the 契丹小字研究 Qidan xiaozi yanjiu (Research on the Khitan Small Script) hand copy of the epitaph for Emperor 興宗 Xingzong (1015-1054) of the Khitan Empire:
⿱⺌月 (but with a dot instead of two horizontal lines in 月; 2.21.1)
a lookalike of Chinese 七 <SEVEN> (2.24.1)
I assume they must be in the book's indices under more conventional forms - but what are those forms?
Ah, the first was a variant of 298 <co> with a narrower bottom half and a curved lower stroke:
The very block with 298 from Xingzong was even discussed in Kane
(2009: 71). Duh.
The Qidan xiaozi yanjiu hand copy also has some slight variations of characters I do recognize: e.g.,
243 <HEAVEN> and 240 <TEN>
are written with 𠂉 on top instead of ハ. As a result, 243 <HEAVEN> looks like 矢 204 whose phonetic value is unknown. Could 矢 204 be interpreted as 'heaven'?
I still have no idea what 七 is. Not only is it an unusual (for Khitan) shape, but it is also the only top element in a pyramid.
7. The Cantonese-only character 乸 <jaa2.MOTHER> for naa2 'female' has an unusual phonetic 也 jaa5. The rhyme is perfect; the initial is not. 乸 has puzzled me since I first saw it some time ago, but today I just realized that a j-phonetic 也 might have been chosen because there are phonetics representing both j- and n-syllables: e.g., 襄 soeng1 (with s-!) < *sInaŋ in
儴勷攘瀼獽禳穰蘘蠰躟鬤 joeng4 < *ɲɨaŋ < *nɨaŋ
< *CInɨaŋ < *CInaŋ
囊囔瓤饢 nong4 < *naŋ
That j- ~ n- alternation goes back to a single Old Chinese *n- that developed two reflexes: *n- before nonhigh vowels and palatal *ɲ- before high vowels.
也 had Old Chinese *l-, another source of Cantonese j-.
*l-characters normally aren't phonetics in Cantonese n-characters.
Cantonese speakers would not know which j- are from *n- and which j- are from *l-, so whoever came up with 乸 might have thought, 'if 襄 can stand for j- and n-syllables, 也 can too', unaware that 也 jaa5 isn't from *n- (and hence 'shouldn't' represent Cantonese n-syllables).
8. I missed Andrew West's tweet on a cursive Tangut tablet from the Baisigou pagoda.
9. Marijn van Putten on the mystery of Mehmet.
22.214.171.124:46: YELLOW PIG 11/21
<so nggiyan uliya aniya juwa emu biya orin emu inenggi>'yellow pig year, ten one month, twenty one day'
1. orin emu 'twenty one' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin nigen 'twenty one' containing an unrelated Mongolian word for 'one'.
2. I wish I could look more into exceptions to 'Altaic' vowel harmony. Two examples that have long stuck in my mind:
Old Turkic -mish ~ -mis (past participle)
(examples from Tekin 1968: 179):
qazghanmish 'acquired' (not qazghanmïš; Bilgä Qaghan East 22)
barmis 'gone' (not barmïs; Bilgä Qaghan East 22)
but other Old Turkic texts have ï where it is
expected: e.g., tughmïsh 'risen' (Mai tH XV 11v22; found
in Erdal 2004: 268)
Turkish anne 'mother' (not anna or enne!)
Clauson (1972: 169) says anaː 'mother' was "sometimes
subjected to unusual deformations, e.g., anne,
to make it a term of more intimate affection" - a phenomenon that is
the opposite of taboo deformation in terms of motivation (though not
More recently I came across Manchu age 'older
brother' (not ege or aga!; see Hauer and
Corff : 7). Rozycki (1983: 22) regards age as somehow
related to Written Mongolian aq-a¹ 'id.':
"The correspondence is ancient and direction of loan impossible to
ascertain." Could this be an anne-like case of intimate
I couldn't find age or other similar Manchu words like ahūn 'older brother' in Doerfer's Mongolo-Tungusica (1985), so I suppose Doerfer does not think there is any connection between the Manchu and Mongolian words.
What finally pushed me to write about Manchu age was seeing Manchu ajige 'small, little, young' (not ejige or ajiga) on Saturday night. Its root is aji-, also found in ajida 'small' and ajigan 'young, small' which are harmonic. majige 'little' is similarly nonharmonic with similar semantics. Are these cases of cute deformation? Imitating the speech of small children who have not yet mastered vowel harmony? I can't quickly find any article on L1 Turkish vowel harmony acquisition (DuckDuckGo results are often unsatisfying), but Leiwo, Kulju, and Aoyama (2006?) cover Finnish vowel harmony:
The data showed that most of Finnish 2;6-year-olds’ productions do not violate FVH [Finnish vowel harmony], suggesting early mastery of FVH. When there were errors in children's productions, they were mostly substitutions of back vowels for the front rounded vowels.
... which is the opposite of the substitution that occurred in
Turkish anne! (Or centuries ago in barmis.)
Unlike Finnish or Turkish, Manchu does not have palatal harmony. Manchu age, etc. have a high series vowel e [ə] in place of its low series counterpart a. But if I 'translate' the Finnish error pattern into Manchu, I would expect substitutions of low series vowels for high series vowels. Which is the opposite of what happened in age, etc.
There is, however, a common denominator: Finnish vowel harmony errors occurred "especially in non-initial syllables and in suffixes" (Leiwo, Kulju, and Aoyama 2006: 151), and the Turkish and Manchu violations above are also in noninitial position: -mis, anne, age.
Katsura is a former classmate of mine.
¹The hyphen is a device to transliterate the
obligatory space in the Written Mongolian spelling <aq a>; it has
no morphological or phonological significance.
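The Manchu harmony violations discussed in item 2 can be checked mechanically. The following Python sketch rests on assumptions that go slightly beyond the text: it takes a, o, and ū as the low series, e as the high series, and i and u as neutral, and it reads words in Möllendorff romanization. The names series_in and is_harmonic are hypothetical.

```python
import unicodedata

# Assumed vowel classes (not spelled out in full above):
# low series a, o, ū; high series e; i and u neutral.
LOW_SERIES = {"a", "o", "\u016b"}  # \u016b is ū
HIGH_SERIES = {"e"}


def series_in(word: str) -> list:
    """List the harmony series of each non-neutral vowel, in order."""
    word = unicodedata.normalize("NFC", word.lower())
    found = []
    for ch in word:
        if ch in LOW_SERIES:
            found.append("low")
        elif ch in HIGH_SERIES:
            found.append("high")
    return found


def is_harmonic(word: str) -> bool:
    """A word is harmonic if it does not mix low and high series vowels."""
    return len(set(series_in(word))) <= 1


for w in ["age", "ajige", "majige", "ajida", "ajigan", "ah\u016bn"]:
    print(w, series_in(w), "harmonic" if is_harmonic(w) else "NOT harmonic")
```

Under these assumptions age, ajige, and majige are flagged because a noninitial e follows an initial a, while ajida, ajigan, and ahūn pass.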
3. Looking at Tangut
4440 2len4 'pavilion' (#189 in The Golden Guide)
led me to wonder: Why did Middle English pavilloun become modern English pavilion? Was -i- restored by someone who knew its Latin source pāpiliō 'butterfly'?
4. Today I started copying the epitaph for Emperor 興宗 Xingzong
(1015-1054) of the Khitan Empire. I haven't gotten to line 4 yet, but I
looked ahead and spotted block 24
of line 17.24.
The only other instances of 096 that I know of are in the block
in the epitaphs for Mme. 耶律 Yelü (11.20) and 蕭敵魯 Xiao Dilu (1061-1114; 30.19 and 34.14).
is similar in shape to 095, a lookalike of Chinese 女 <WOMAN>.
095 is more common than 096 and can occur in medial and final positions
in blocks. These different distributive patterns suggest that 096
represents a more complex phonetic sequence than 095 - one that so far
is only known from the beginnings of words. On the other hand, whatever
095 represents may be more complex than, say, 339 which is simply [i]?
Both 095 and 096 probably represent one or more syllables absent from Liao Chinese, as neither appears in Khitan transcriptions of Chinese. They may contain
a non-Liao Chinese consonant (e.g., q)
a non-Liao Chinese vowel
segments shared with Liao Chinese but combined into a sequence absent from Liao Chinese
I doubt that 095 or 096 represent single segments. I suspect that all the single-segment phonograms of the Khitan small script have been found by now.
As far as I know, as of 2016 there were 482
known small script characters including variants. Have any new ones
been found lately? The only new small script texts found lately to the
best of my knowledge are fragments
of jade tablets from a mausoleum. If
this photograph is representative, the texts are too short to be
likely to contain any character that hasn't surfaced in any previously
known, much longer texts.
5. Today I finally got Jun Jiang's Learn
on my iPhone. As neat as it is to see a finger trace strokes on a
screen, I wish I could double-check the direction and order of strokes
with another source. And I'm not yet accustomed to the wheel interface.
6. Today I also got Jun Jiang's Mongolian
Words & Writing app, but I haven't tried it out yet. Users
hoping to learn Mongolian
Cyrillic will be disappointed since the app only covers the traditional
script. I'd like to know how to write Ө <Ö> and Ү <Ü>
in cursive. (The rest of the alphabet is identical to Russian, and I've
been writing Russian in cursive since 1997.)
7. Jun Jiang's store doesn't have any app for Mongolian Cyrillic, but it does have these apps:
Learn Chinese Handwriting ! (with a space before !)
Japanese Kanji Writing (but the icon has hiragana あ <a>!)
Learn Uyghur Handwriting ! (with a space before !)
Tibetan Words & Writing
Thai Words & Writing
Persian Words & Writing
Learn Zhuang Language ! (with a space before !)
Lao Words & Writing
Learn Khmer Handwriting ! (with a space before !)
Learn Burmese Handwriting ! (with a space before !)
Tamil Words & Writing
Learn Malay Language ! (with a space before !)
Tagalog Language - Filipino
Vietnamese Alphabet & Words
Learn Hokkien Language ! (with a space before !)
Cantonese Words & Writing!
Hebrew Words & Writing
Wu Language - Chinese Dialect
Hakka - Chinese Dialect (no "Language"!)
Korean Hanja Handwriting ! (with a space before !)
I assume those apps have the same interface as the Manchu app.
So much for my original guess that Jun Jiang might be a Manchu and Mongol specialist.
sample of the traditional Mongolian script
is (turn 90 degrees clockwise for the proper orientation - alas, that
way the first line is on the right instead of the left where it should
ᠴᠣᠷᠢ ᠢᠢᠨ ᠭᠠᠭᠴᠠ
cori yin ghaghca
ᠪᠣᠰᠤᠭᠠ ᠪᠢᠴᠢᠭ᠌᠄33'single GEN single': i.e., 'the one and only'
I don't know what is meant by 'one and only' since there are other vertical scripts, and even if one is only thinking of major vertical scripts written from left to right, the Mongolian script is not unique since the Manchu script is written the same way.
ghaghca has a synonym ghanca. How can
that word-medial -gh- ~ -n- alternation be explained -
assuming they are related words?
9. Today while double-checking the Li Fanwen number for the common Tangut character
4457 2leq3 'great'
I found these interesting characters which appear to be semantic
4445 2bi1 = 4457 2leq3 'great' + 2547 1chir2 'right'
4454 2ryr1 = 4457 2leq3 'great' + 2920 1zhyq3 'left'
2920 has the Tangraphic Sea analysis
2920 1zhyq3 'left' = all of 3485 1laq 'hand' + right of 4454 2ryr1
which cannot be taken at face value as the origin of the character - why would a character for a common word 'left' be based on a rare character 4454?
4445 and 4454 are only known as members of these compounds:
4445 0661 2bi1 2ngon4 'South Sea'
4454 0661 2ryr1 2ngon4 'North Sea'
4445 and 4454 are not the normal words for 'south' and 'north' which are
4796 1zyr4 'south' and 0942 1laq3 'north'
Although the Tangut script is thought to be full of semantic compounds, it is curious that 4445 and 4454 - glossed by Li Fanwen (2008: 706-707) as 'south' and 'north' - do not contain any components in common with 4796 and 0942, the graphs for the common words 'south' and 'north'.
Nonetheless Li's glosses make sense: 4445 has the notation
4796 0661 1zyr4 2ngon4 'southern sea'
in Homophones D and is a definition for 4796 'south' in Tangraphic
And if 4454 contains 'left', the opposite of the 'right' in 4445, then
4454 must be 'north', the opposite of 4445 = 4796 'south'. But I am
hesitant to gloss 4445 and 4454 simply as 'south' and 'north'. Maybe
'Great South' and 'Great North' or even as 'Great Right' and 'Great Left'.
The association of 'south' with 'right' is reminiscent of Sanskrit dakṣiṇa- 'south/right'. Sanskrit uttara- 'north' can also mean 'left', but the normal word for left is vāma- which does not mean 'south'.
What were the Great South/Right and Great North/Left Seas? Were they
mythical? I don't know much about how the landlocked Tangut perceived
their world. How many Tangut had ever seen a sea? What is the etymology
of 2ngon4 'sea'?
10. Today I saw this passage in Gorelova ( :15; I added the links):
The Mohes [靺鞨] called their tribal leader "damofo mandu" (chin. da [大] "great"), as one can see further, the Southern Shiwei [室韋], who can be identified as people of Tungusic descent, called their tribal chieftains "yumofo mandu".
The language spoken by the Mohe was Tungus-Manchu. What is important to mention is that the language of the Sushen could also be referred to as proto-Tungusic.
During the Tang era, the Mohe, similar to other peoples of northeastern Asia, were subjected to constant political and military pressure from Tang rulers. Soon after the Koguryo state of Korea had been defeated by the Tang empire (668 AD), a large portion of the Koguryo people fled into the lands of the Sumo Mohe [粟末靺鞨]. Soon a lot of towns, surrounded by defensive walls, arose there. Around 700, a new state, "Parhae" (chin. Bohai), raised from the ruins of Koguryo, was established. It was the leader of Sumo Mohe, Cicik Zhungxiang [乞乞仲象] who was considered the creator of Bohai. [...] Later, his grandson, Uazhi Da Tuyu, declared himself the emperor of Bohai, which in the course of time became highly cultured and enlightened, and widely known beyond the borders of the country. The Parhae (Bohai) state—a deserving successor of the culture and power of Koguryo and the tribal league of the Songari Mohe—flourished for 228 years until it was destroyed by the Qitans [Khitans] (926 AD) (Shavkunov, 1968; Crossley, 1997:18; Larichev, 1998:53-4).
What are the characters for damofo mandu and yumofo mandu
which sound like modern Mandarin readings of old Chinese transcriptions?
I was surprised to see the Southern Shiwei described as Tungusic
since their name - roughly pronounced *shirwi in Late Middle
Chinese - is derived from the para-Mongolic autonym Serbi. But
of course names are not reliable guides to linguistic affiliation.
Cicik Zhungxiang is a strange, not-quite-Pinyin romanization of 乞乞仲象 Qǐqǐ Zhòngxiàng with a -k whose motivation is obscure. Assuming the Chinese pronunciation favored in Parhae was like early Sino-Korean, 乞乞仲象 was pronounced something like *kər kər tyung syang. 乞乞 <BEG BEG> looks like an insulting ('derographic') transcription of a non-Chinese (i.e., Mohe) name. 乞乞仲象 is also known as 大仲象 with a Chinese-style surname 大 <GREAT> to go along with the Chinese-style disyllabic personal name 仲象 <SECOND.BORN ELEPHANT>.
Uazhi Da Tuyu is presumably 乞乞仲象's son (not grandson) 大祚榮 (Mandarin: Dà Zuòróng, Korean: Tae Cho-yŏng; r.
712-719), the first king (not emperor) of Parhae. I have no idea what Uazhi
11. The best for last: I just discovered Andrew
West's Tangraphic Sea search tool! More Tangut
126.96.36.199:55: YELLOW PIG 11/20
<so nggiyan uliya aniya juwa emu biya orin inenggi>'yellow pig year, ten one month, twenty day'
1. Jurchen and Manchu orin 'twenty' sounds like Written Mongolian qorin 'id.' The pronunciation of Khitan
廿 <TWENTY> (large script)
丁 <TWENTY> (small script)
is unknown; it could have been something like qorin.
Normally Written Mongolian q corresponds to h or k in Manchu.
Rozycki (1983: 11-12) proposes four layers of borrowing into (Jurchen/)Manchu to explain the different correspondences:
Layer 1: Mongolic *q- borrowed as *k- > *x- > *Ø- (within Tungusic): e.g., orin 'twenty'
Layer 2: Mongolic *q- borrowed as *k- > *x- (within Tungusic): e.g., hoton 'city wall' (cf. Written Mongolian qoton 'id.')
Layer 3: Mongolic *q- borrowed as k-: e.g., kobkolo- 'to remove (paper stuck to a surface)' (cf. Written Mongolian qubqol- 'to peel')
Layer 4: modern Mongolic *q- > x- borrowed as h-
This model could be refined: e.g., in the early layers, the
borrowing was probably from para-Mongolic (specifically Khitan) rather
than from Mongolic.
There doesn't seem to be any way to distinguish between layers 2 and
4 on the basis of Manchu evidence. I suppose Rozycki assigns Manchu
words to layer 2 if the borrowings are found elsewhere in Tungusic
(e.g., see Doerfer [1985: 81] for hoton-type
Tungusic words). Layer 2 words were borrowed into early Tungusic,
whereas layer 4 words were borrowed only into (Jurchen/)Manchu.
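The initial correspondences above can be sketched in Python. This is only an illustration of the four layers as summarized here, not Rozycki's own procedure: the function name candidate_layers is hypothetical, the sketch looks at nothing but the initial consonant, and it returns both layers 2 and 4 for h- because Manchu evidence alone cannot separate them.

```python
MANCHU_VOWELS = set("aeiouū")


def candidate_layers(written_mongolian, manchu):
    """Return the borrowing layers compatible with the initial correspondence.

    Hypothetical helper: it only handles Written Mongolian words in q-
    and only inspects the first letter of the Manchu form.
    """
    if not written_mongolian.startswith("q"):
        raise ValueError("this sketch only covers Written Mongolian q- words")
    if manchu.startswith("k"):
        return {3}      # *q- borrowed as k-
    if manchu.startswith("h"):
        return {2, 4}   # indistinguishable without other Tungusic evidence
    if manchu[:1] in MANCHU_VOWELS:
        return {1}      # *q- > *k- > *x- > zero
    return set()        # some other initial: outside the model


for wm, ma in [("qorin", "orin"), ("qoton", "hoton"), ("qubqol-", "kobkolo-")]:
    print(wm, ma, sorted(candidate_layers(wm, ma)))
```

Deciding between layers 2 and 4 would need a second input, namely whether the word has cognate borrowings elsewhere in Tungusic, which this sketch does not model.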
2. The Khitan large script character 廿 <TWENTY> is identical to the standard Chinese character 廿 <TWENTY> which was pronounced *ɲip in Middle Chinese, a fusion of 二 *ɲi̤ 'two' and 十 *dʑip 'ten'. Wiktionary says the expected standard Mandarin reflex is rì, but the actual reflex is niàn because
[t]he irregular pronunciation (e.g. /nVm/ [with the nasal counterpart of the original coda /p/]) dates from the Song dynasty, to avoid homophony with a vulgar word; see 入.
Let's see 入:
The regular Mandarin pronunciation [for 入 <ENTER>] as predicted from Middle Chinese is rì. The irregular sound change [to rù] is for taboo reasons - to avoid homophony with its derived vulgar meaning "to enter > to have sexual intercourse", nowadays represented by 日 (rì).
I would expect 廿 to be nhập in Vietnamese since 二 'two' is nhị and 十 'ten' is thập. Wiktionary lists five Vietnamese readings of 廿:
nhập 'twenty' (?)
trập (first syllable of 廿重 trập trùng)
the initial isn't nh-; tr- is normally from Chinese
*retroflexes and native Vietnamese *Cl-clusters
the sắc tone suggests a *voiceless initial even though Chinese 'twenty' had a *voiced initial
are those oddities the product of Vietnamese-internal taboo deformation?
chấp 'twenty' (?)
not listed at nomfoundation.org; presumably an alternate
spelling of trấp; the Hanoi dialect merges tr- and ch-
(< *c- if followed by a sắc tone).
The normal Vietnamese word for 'twenty' is native: 𠄩𨑮 hai mươi
'two ten', which has its own contracted form hăm (with short ă
instead of long a!).
3. Is it obvious to Koreans that the hangul title of the movie 독전 Tokchŏn (English title: Believer) is 毒戰 <POISON BATTLE> tokchŏn rather than 督戰 <SUPERVISE BATTLE> tokchŏn 'urging to fight harder'?
Only the second tokchŏn is in dictionaries. The first tokchŏn is a straightforward Koreanization of the title of its inspiration, the Chinese movie 毒戰 (Mandarin Dúzhàn, Cantonese Duk6 zin3; English title: Drug War).
The fact that some websites call the Korean movie 독전: 마약전쟁 Tokchŏn:
mayak chŏnjaeng 'Poison Battle: Narcotic Wars' implies that Tokchŏn
by itself might need clarification. In hanja that longer title looks
redundant with two 戰 chŏn: 毒戰: 痲藥戰爭.
A Korean-English dictionary gave this sentence as an example of tokchŏn:
암튼 '독전' 화이팅 할까요?
Amthŭn 'Tokchŏn' hwaithing halkkayo?
'Anyway, shall we do "Believer" fighting?'
That made me curious about the etymology of 암튼 amthŭn 'anyway'. Is it of recent origin? I couldn't find it in Martin et al.'s massive 1967 Korean-English dictionary or my old portable favorite, Dong-A's 1981 Korean-English dictionary.
I think 암튼 is an extreme example of contraction:
아무리 하려 하면 하든지
amu-ri ha-ryŏ ha-myŏn ha-dŭ-n-ji
any-ADVERB do-INTENTIVE do-CONDITIONAL be-RETROSPECTIVE-MODIFIER-uncertain.fact
Martin et al. (1967: 1093) derive 암 am 'surely' from
which according to Martin et al. (1967: 1073) is in turn a contraction of
아무리 하려 하면
amu-ri ha-ryŏ ha-myŏn
'any-ADVERB do-INTENTIVE do-CONDITIONAL'.
튼 thŭn is a reduction of 하든지 ha-dŭ-n-ji.
Martin (1992: 834) translates -dŭ-n-ji as 'the uncertain fact that it has been observed that', 'whether it was (observed to be/happen)'. -ji can be dropped. That leaves hadŭn /hatɯn/. th- /tʰ/ looks like the product of syncope, metathesis, and fusion:
/hat/ > /ht/ > /th/ > /tʰ/
Metathesis is a regular process in Korean: /hC/ cannot surface as [hC].
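Here is a minimal Python sketch of that three-step derivation, applied to /hatɯn/. The function name reduce_initial and the segment inventories are my own assumptions for illustration. The syncope step in particular is not a regular rule of Korean; it is applied unconditionally here only to show how the later steps follow from it.

```python
STOPS = {"p", "t", "k", "c"}
VOWELS = {"a", "ʌ", "ɯ", "e", "ə", "i", "o", "u", "ɛ"}


def reduce_initial(segments):
    """Apply syncope, metathesis, and fusion to an initial /hVC/ sequence.

    Returns every stage of the derivation as a list of segment lists.
    Hypothetical helper for illustration only.
    """
    s = list(segments)
    stages = [list(s)]
    # Syncope: delete the vowel between initial /h/ and a plain stop.
    if len(s) >= 3 and s[0] == "h" and s[1] in VOWELS and s[2] in STOPS:
        s = [s[0]] + s[2:]
        stages.append(list(s))
    # Metathesis: /hC/ cannot surface as [hC], so it becomes /Ch/.
    if len(s) >= 2 and s[0] == "h" and s[1] in STOPS:
        s = [s[1], "h"] + s[2:]
        stages.append(list(s))
    # Fusion: stop + /h/ is realized as a single aspirated stop.
    if len(s) >= 2 and s[0] in STOPS and s[1] == "h":
        s = [s[0] + "ʰ"] + s[2:]
        stages.append(list(s))
    return stages


for stage in reduce_initial(["h", "a", "t", "ɯ", "n"]):
    print("/" + "".join(stage) + "/")
```

The printed stages are /hatɯn/, /htɯn/, /thɯn/, and /tʰɯn/, matching the chain above with the rest of the word carried along.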
(12.16.0:16: The reduction of /hat/ to /tʰ/ above parallels the reduction of the first syllable of the Korean root 'to ride' between the 12th and 15th centuries:
12th c. *hʌta- > *hta- > 15th c. tha- /tʰa/
The 12th century form is preserved in Chinese transcription as 轄打 *xjaʔta in Jilin leishi. I have followed the conventional view by reconstructing *ʌ in the first syllable, but now it occurs to me that Chinese *-ja- might reflect a 12th century Korean *(y)e or *yə. Perhaps
pre-12th c. *heta- > 12th c. *h(y)eta- or *hyəta- > *hʌta- > *hta- > 15th c. tha- /tʰa/
I reconstruct *e as a front low series vowel in early Korean:
That *e later broke to yə (= yŏ in my modified McCune-Reischauer romanization), the most common yV-sequence in native Korean words.
In my scenario for 'to ride' above, *(y)e or *yə was reduced to *ʌ, the minimal low series vowel, before being lost. By that point Korean had developed vowel harmony, so the vowel in the first syllable had to be a low series vowel like the *a in the second syllable.)
5. More examples of metathesis in Korean:
암클 amkhŭl or 암글 amgŭl < /am(h) kɯr/
'useless knowledge, female writing, hangul'
수클 sukhŭl or 수글 sugŭl < /su(h) kɯr/
'useful knowledge, male writing, Chinese characters'
That pair of words is not only sexist but also reflects a
The final /h/ of /amh/ 'female' and /suh/ 'male' surfaces as
aspiration on a following stop, which in this case is the /k/ of /kɯr/ 'writing'.
The variants with -gŭl are compounds in which 'female' and 'male' have been reinterpreted as /am/ and /su/ without /h/. /k/ voices after voiced segments: /m/, /u/, and the /n/ of han'gŭl /hankɯr/ 'great/Korean-writing'.
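The two outcomes can be contrasted in a short Python sketch. The function name join_compound and the segment tables are assumptions of mine, and the sketch only models what happens to the stop at the boundary: a final /h/ fuses with a following plain stop as aspiration, while a plain stop after a voiced segment is voiced.

```python
ASPIRATED = {"p": "pʰ", "t": "tʰ", "k": "kʰ", "c": "cʰ"}
VOICED_STOP = {"p": "b", "t": "d", "k": "g", "c": "j"}
VOICED_SEGMENTS = set("mnŋrl") | set("aeiouɯɛʌə")


def join_compound(first, second):
    """Join two underlying forms, resolving only the boundary stop.

    Hypothetical helper: inputs are strings with one character per segment.
    """
    a, b = list(first), list(second)
    if a and b and a[-1] == "h" and b[0] in ASPIRATED:
        # Final /h/ surfaces as aspiration on the following stop.
        return "".join(a[:-1] + [ASPIRATED[b[0]]] + b[1:])
    if a and b and a[-1] in VOICED_SEGMENTS and b[0] in VOICED_STOP:
        # A plain stop voices after a voiced segment.
        return "".join(a + [VOICED_STOP[b[0]]] + b[1:])
    return "".join(a + b)


print(join_compound("amh", "kɯr"))  # older analysis with /h/
print(join_compound("am", "kɯr"))   # reanalyzed form without /h/
print(join_compound("han", "kɯr"))
```

With /amh/ the result has kʰ (amkhŭl); with reanalyzed /am/ it has g (amgŭl), just as /hankɯr/ gives han'gŭl.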
Naver regards the -g-forms (amgŭl, sugŭl) as correct
and states that the -kh-forms are erroneous,
though Martin et al. (1967: 1011, 1095) only list the -kh-forms.
Does that indicate the reanalysis of 'female' and 'male' as being
without /h/ has been completed over the past half-century? Not quite -
the official standard for Korean still requires aspiration in, for
instance, 암캐 amkhae 'female dog' < /amh kɛ/
(not 암개 amgae!) in which am-
is still clearly 'female' (한글 맞춤법 Hangul Spelling 4.4.31 and 표준어규정
Standard Language Code 1.1.7). Perhaps the 'writing' words have lost
their gendered associations.
I found amkhŭl in Martin et al. (1967: 1095) when looking in vain for amthŭn (ㅋ kh is before ㅌ th in Korean alphabetical order).
6. I was surprised to learn that 怒濤 <ANGER WAVE> (dotō) is a Japanese name for a kind of Faucaria plant.
(12.16.2:22: The same characters are the Chinese name [Mandarin nùtāo]
The Korean name for Faucaria tuberculosa is a combination of that and the kanji for the Japanese name of Faucaria tuberculosa (荒波 aranami 'wild wave') read in Sino-Korean: 怒濤荒波 nodo hwangpha.
Sino-Korean 怒濤 nodo 'angry wave' by coincidence sounds like
the unrelated native Japanese word 喉 nodo 'throat' - and by
another coincidence, Faucaria is from Latin fauces 'throat'.
7. I thought faucet might be related to fauces 'throat', and Wiktionary agrees, but Merriam-Webster gives a derivation I don't quite follow:
Middle English, bung, faucet, from Middle French fausset bung, perhaps from fausser to damage, from Late Latin falsare to falsify, from Latin falsus false
turns out to be from falsus too.
The bottom of Merriam-Webster's entry for faucet led me to their Time Traveler feature showing what words were first attested in English in a given century: e.g., the 15th century (faucet, favored, feasible ...).