HAPPY NEW YEAR 2020
It's still the year of the pig in traditional East Asian calendars, but it's the year of the rat (2020) if one coordinates the Chinese animal cycle with the Gregorian calendar:
Center (red): Khitan large script <RAT>
Horizontal (green): Khitan small script <216> <?> and <151> <ghu> split by <RAT>
Vertical (blue): Jurchen <sin> and <ge> for singge 'rat' split by <RAT>
Circle (black): Jurchen <TWENTY> x 20
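The 'coordination' of the animal cycle with the Gregorian calendar mentioned above can be sketched in a few lines. This is only a rough illustration that anchors on 2020 = rat as stated in this post and ignores the lunar new year offset; the names ANIMALS, ANCHOR_YEAR, and animal_for_gregorian_year are my own labels.

```python
# Twelve-animal cycle in its conventional order, starting from the rat.
ANIMALS = ["rat", "ox", "tiger", "rabbit", "dragon", "snake",
           "horse", "goat", "monkey", "rooster", "dog", "pig"]

# Anchor taken from the post: Gregorian 2020 is treated as a rat year.
ANCHOR_YEAR = 2020


def animal_for_gregorian_year(year: int) -> str:
    """Return the cycle animal for a whole Gregorian year.

    Assumption: the year is mapped as a block, so the weeks before
    the traditional new year are deliberately not treated separately.
    """
    return ANIMALS[(year - ANCHOR_YEAR) % 12]


print(animal_for_gregorian_year(2019))
print(animal_for_gregorian_year(2020))
```

A traditional-calendar version would need the actual date of the lunar new year, which is why it is still the year of the pig by that reckoning.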
Last night I realized that Khitan small script character 216 might be a derivative of 118 <qu>:
Let's assume 216 was <qu*> with <*> indicating 'different from <qu> in some way'. Then
would be read <qu*ghu> which is close to Written Mongol qulughana 'rat'.
What if <qu*> were <qul>? <qul.ghu>
is close to qulughana, but I wouldn't expect Khitan u
to correspond to Written Mongol a.
That's where I left off last night. Today I realized that
<ghu> might be read <ugh> after a consonant. So maybe
<216.151> was read <qul.ugh> which is even closer to
Written Mongol qulughana and requires no vocalic gymnastics.
The low frequency of 216 (7 times in the 契丹小字研究 Qidan xiaozi yanjiu corpus and 0 times in initial position in Wu and Janhunen 2011 [whose index is organized by initial graphs]) suggests that it probably did not represent a simple CV syllable. If it didn't represent the CVC syllable qul, it may have represented a CVCV sequence qulu, and <qulu.ugh> was read qulugh.
The <qul(u)> hypothesis could be confirmed if 216 alternated with <qu.l>, <qu.ul> (= <qu.lu>?), etc.
As far as I know, 216 appears only in initial position with one exception: this block
from line 3 of the second inscription in the 萬部華嚴經塔 Wanbu
Avataṁsakasūtra Pagoda in Hohhot.
2. I still practice writing Tangut, Khitan, and Jurchen (TJK) every day. Recently I added Manchu to my regimen and today I started writing Mongolian (in the traditional script - I still don't know how to handwrite Ө and Ү in Cyrillic).
All my TJK exercises begin with the date. I'm still going to date
these blog entries in Jurchen since it's the thousandth anniversary of
the Jurchen large script or close to it (see Kiyose [1977: 22] for
three possible dates: 1119, 1121, and 1123; Kane [1989: 3] gives the
date 1120, though Kane [2009: 3] gives the date 1119). Today's date in
songgiyan uliya aniya juwa juwe biya ice nadan inenggi
'yellow pig year, ten two month, new seven day'.
3. Last night I learned about prothesis in Bashkir:
арыш 'rye' < Russian рожь
өҫтәл 'table' < Russian стол
эскәмйә 'bench' < Russian скамья
but why ө- [ø] instead of э [e] in өҫтәл?
The prothesis is mostly unsurprising, but these correspondences are:
B ы [ɤ] ~ [ʌ] : R о [o]
B ә [a] : R о [o]
1.2.11:00: I forgot to mention these cases of prothesis in native
ыласын 'falcon' < *laːčïn
ысыҡ 'dew' < *čïq
Without more Bashkir data, I can't test my guesses for motivations: e.g., avoiding initial l- and making monosyllables disyllabic.
The letter ҡ <q> surprised me since I'm accustomed to қ <q>
from Kazakh, etc. Why do Bashkir and Siberian
Tatar have their own special ҡ <q>? Siberian Tatars
were educated in (Volga)
Tatar which has к <k> for /k/ (including a [q] allophone)
and къ <k"> for /q/.
4. Today I learned about the Caucasian Albanian script used to write a (near?-)ancestor of the Udi language.
I've thought Old Chinese might have had pharyngealized vowels, so
I'm interested in the phonetics of Udi's
5. What is the etymology of Persian شمشیر <šmšyr> shamshir,
first (?) attested in Middle Persian as <šmšyl>? It doesn't look
Indo-European. Is it an areal word?
6. Why does the Persian word/name فرشته <frsth> fereshte
< firishta sometimes appear as Farishta(h),
e.g., in this
1958 Bollywood film title (फरिश्ता Phariśtā; cf. Urdu فرشته
list of Pashto (not Persian, I know) names?
188.8.131.52:56: YELLOW PIG 12/6
songgiyan uliya aniya juwa juwe biya ice ninggu inenggi
'yellow pig year, ten two month, new six day'
1. Last night I looked up 䯗 'hip bone' and discovered it could also be called the innominate bone. Why 'nameless'?
2. Are 清樂 Shingaku
'Qing music' lyrics an overlooked source of data for premodern Mandarin
reconstruction? In this sample from 月琴樂譜 Gekkin gakufu (Moon
Guitar Sheet Music, 1877), 兒 (now ér
[aɚ˧˥] in modern standard Mandarin) has the furigana ルウ <ruu>.
That seems to indicate that the kana transcription is based on a
dialect in which 兒 was pronounced like [ɻ̩]. (Other evidence rules out
the most obvious interpretation [ruː]: e.g., no Mandarin dialect has
[u] in 兒.)
The date of the text does not necessarily indicate that the [ɻ̩]
pronunciation still existed in the source dialect as of 1877. The kana
spelling ルウ <ruu> could have been copied from some earlier source.
ルウ <ruu> bears no resemblance to ジ <zi> [dʑi], the usual
Japanese reading of 兒. Strictly speaking, the two Japanese borrowings
are not from the same
dialect in two different periods: <zi> is from a 7th century
northwestern Chinese dialect, whereas <ruu> is from a Qing
(perhaps 18th century?) Mandarin dialect. Nonetheless the latter
probably underwent more or less the same changes as the former, so as a
convenient fiction, here's how the sources of <zi> and
<ruu> could be bridged:
Stage 1: *ɲʑi > borrowed into Old Japanese as /Nzi/ (> modern [dʑi])
Stage 2: *ʑi
Stage 3: *ʐi
Stage 4: *ʐɻ̩
Stage 5: *ɻ̩ > borrowed into Edo Japanese as /ruː/
Modern standard [aɚ] is from a stage 5-type form that developed a prothetic vowel:
*ɻ̩ > *əɻ > *ɚ > [aɚ]
In some Mandarin varieties, only the prothetic vowel has survived without any trace of retroflexion: e.g., 壽縣 Shouxian [ə] and 鳳陽 Fengyang [a] for 兒.
It is tempting to derive Sino-Korean 아 a for 兒 from a Fengyang-like form, but that would be anachronistic. Fengyang [a] is probably a very recent development from *ar, whereas the earliest attested ancestor of 아 a is ᅀᆞ zʌ borrowed from a form like stage 4 *ʐɻ̩. zʌ became ʌ in the 16th century, and ʌ then became a in the 18th century.
3. I don't understand how Korean z vanished without a trace. Lee and Ramsey (2011: 142) state that "early examples of the elision of z are all restricted to the environment _i, y, which suggests that the process of change started there." They give these examples:
/sʌzi/ > /sʌi/ 'interval'
/nʌyzir/ > /nʌyir/ 'tomorrow'
In those particular cases, I can imagine /z/ being phonetically something like [ʑ] that lenited to [j] and then disappeared before /i/. But what were the intermediate stages between /z/ and zero in initial position before /ʌ/ as in 15th century /zʌ/ > 16th century /ʌ/?
I thought [ɦ] might be a possible intermediate stage by analogy with Sanskrit:
Proto-Indo-Iranian *ĵʱ > Sanskrit h [ɦ] but Avestan z
I assume there was a stage like *ʑʱ underlying both
Sanskrit and Avestan reflexes. (No, see topic 4 below.) That stage
would be like Middle Korean /z/. In some modern Indic languages,
Sanskrit initial h- has disappeared in reflexes
of hima- 'winter'. I don't know if that's a regular change.
4. I've been trying to work out the phonetics of Proto-Indo-Iranic¹ (PII) reflexes of Proto-Indo-European (PIE)
4.1. The PIE starting point:
4.2. The first palatalization in PII
4.3. Affrication in PII (cf. the alveolar affricate reflexes of Sanskrit palatals in some modern Indic languages)
4.4. The merger of plain velars and labiovelars
4.5. The second palatalization in PII
Velars palatalized in certain environments. Compare:
*kʷe > *ke > *ce
(palatalization before *e) 'and'
4.6. The merger of *e and *o into *a made the second palatalization phonemic:
*ce > *ca 'and'
It was no longer possible to regard *c as an allophone of
/k/ before /e/, since /e/ no longer existed. (The e of later
Indo-Iranic languages is not from the earlier *e that merged
with *a: e.g., Sanskrit e is from PII *ai which
could be from PIE *ei or *oi but not PIE *e.)
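The ordering in 4.4-4.6 can be checked mechanically. The following is a minimal sketch that treats each change as a plain string replacement; the function names are my own, and *kʷo is a hypothetical input added only to show the contrast (it is not an attested form from the discussion above).

```python
def merge_labiovelars(form: str) -> str:
    # 4.4: labiovelars merge with plain velars (only *kʷ is modelled here).
    return form.replace("kʷ", "k")


def palatalize_before_e(form: str) -> str:
    # 4.5: second palatalization, *k > *c before *e.
    return form.replace("ke", "ce")


def merge_e_o_into_a(form: str) -> str:
    # 4.6: *e and *o merge into *a.
    return form.replace("e", "a").replace("o", "a")


# The order of the list is the relative chronology assumed in 4.4-4.6.
RULES = [merge_labiovelars, palatalize_before_e, merge_e_o_into_a]


def derive(form: str) -> list:
    """Return every stage of the derivation, starting with the input."""
    stages = [form]
    for rule in RULES:
        form = rule(form)
        stages.append(form)
    return stages


print(" > ".join("*" + s for s in derive("kʷe")))
print(" > ".join("*" + s for s in derive("kʷo")))  # hypothetical input
```

After the last rule, the two outputs differ only in *c versus *k before the same *a, which is the sense in which the palatalization became phonemic.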
1.1.0:59: The following sections deal with post-PII developments.
4.7. Pre-Sanskrit (Proto-Indic²) stage 1
The affricate series palatalized. I thought the absence of *ts-type affricates in Proto-Dravidian might have pressured a shift away from alveolar affricates, but the traces of Indic in the Near East - far from Dravidian - underwent stage 2 (4.8 below): e.g., the name Paršasatar from praśāstar- 'director' with ś < PII *ts-.
4.8. Pre-Sanskrit (Proto-Indic) stage 2
Voiceless *tɕ simplified to *ɕ.
The voiced affricates merged with the voiced palatals.
I don't know the order of those two changes, so I show the results of both changes in the same table instead of arbitrarily showing one change at a time in two tables.
4.9. Sanskrit (Proto-Indic)
*ɟʱ weakened to h [ɦ].
4.10. Proto-Iranic (continuing from 4.6)
The voiced aspirate series merged with the plain voiced series.
The affricates deaffricated. The change of *ts to s is roughly parallel to the change of *tɕ to ś in Sanskrit. But note that Proto-Iranic *dz became Avestan z, whereas pre-Sanskrit *dz did not become Sanskrit ź [ʑ], a sound that does not exist in Sanskrit.
The exact phonetics of c and j are unknown. They
were palatal unlike s and z, so I have projected
palatal stops forward into Avestan. But maybe Avestan c and j
were actually affricates.
4.12. Summing up
¹1.1.0:40: I favor the term Iranic by analogy
with Turkic, Mongolic, etc. to avoid confusion with the country of Iran.
²184.108.40.206: I prefer the term Indic to Indo-Aryan, as the word Aryan is shared by both Indic and Iranic. Ironically, the name Indic is actually Iranic, as it is an Hellenization of Old Persian 𐏃𐎡𐎯𐎢𐏁 <ha i du u sha> [hi(n)duš] 'India', cognate to Sanskrit Sindhus 'Sindhu'. The Old Persian form has two Iranic innovations:
*s > h
*dʱ > d (cf. *gʱ > g in 4.10 above)
It occurs to me tonight that an Indic name for Indic would be Sindhic,
but that's not going to catch on. No one is going to rename the country
Sindhia either. And Hindutva advocates
are probably not going to change the name of their ideology to Sindhutva.
220.127.116.11:45: YELLOW PIG 12/5
songgiyan uliya aniya juwa juwe biya ice shunja inenggi
'yellow pig year, ten two month, new five day'
1. I checked Jan van Steenbergen's Interslavic page for updates and noticed a new item in the menu:
The Painted Bird (in Czech: Nabarvené Ptáče) a Czech-Slovak-Ukrainian film written, directed and produced by Václav Marhoul. It is based on Jerzy Kosiński’s novel The Painted Bird from 1965.
The action takes place in some unspecified East-European, Slavic-speaking country. A place that cannot directly be linked to a specific Slavic population requires a language that can instantly be recognised as Slavic but not be linked directly to any specific Slavic population either. That's why Marhoul decided to use Interslavic:
2. I just bought e-access to Vojtěch Merunka's Interslavic zonal constructed language: an introduction for English-speakers. Google says I can check a box to "Make [the book] available offline", but I can't find it.
On page 5, Merunka writes (12.31.14:03: links added),
Interslavic is also an interesting experiment of alternative history: If there was not such strong pressure from the Frankish Latin-oriented church (e.g. Wiching of Nitra and his band) against the Moravian Church in the 9th century, the invasion of the Hungarians into Central Europe and the subsquent collapse of contacts between Moravia (now a territory of both the Czech and Slovak Republics) and Bulgarian, Serbian and Kiev (later Russian) states, it is possible to imagine a hypothetic different evolution of the Slavic early Middle Age language - we have seen a similar phenomenon in the Arabic World: After the end of natural linguistic unity during the Middle Ages, the modernized universal Arabic language based on the religious language of the Qur'an still prevails. It is an artificial language which is close enough to the various contemporary spoken national dialects of Arabic that it is recognized as the standard for communication between Arabic nations and for contact with foreigners and used as an auxiliary language by both state apparatus and the media.
It would be fun to see historical fiction depicting a world where Interslavic - probably simply 'Slavic' - has the same position that modern standard Arabic has.
Page 143 presents a modified Arebica alphabet to
3. 𗡠 0271 2mer4, representing the second syllable of 𗡢𗡠 0702 0271 1to'4 2mer4 'to seek, find', has a right side (Boxenhorn code: baedar) found nowhere else. I found it in Li (2008: 47) when looking up 𘅊 0273 1le1 for my last entry.
2mer4 sounds like Old and Middle Chinese 覓 *mek 'to
seek'. If I were to force a relationship between the two, I could trace
2mer4 back to pre-Tangut *RImek-H with labial
*Pek > *Pew > *Pej > Pe
*RImek-H could be related to
𗑉 4684 1me1 < *CAmik or *mek 'eye'
cf. Tibetan mig (archaic dmyig) 'eye' (but Old Chinese has 目 *Cmuk - is *Cmikʷ possible?)
which is the word that made me discover labial dissimilation. Two
*CAmik > *CAmiw > *CAmij > *CAmi > *CAmai > *mai > 1me1
The relative chronology of *P-w dissimilation and *A-triggered diphthongization is uncertain.
Japhug tɯ-mɲaʁ has a high vowel presyllable,
not a low vowel presyllable needed to condition Grade I (the -1
at the end of 1me1).
*mek > *mew > *mej > 1me1
But there are other possible pre-Tangut sources of 2mer4 that would rule out a connection with the Chinese word:
𗡢 0702 1to'4 'to seek' can appear by itself. That suggests that 𗡠 0271 2mer4 might be a formerly independent verb that only survives as the second half of a synonym compound 'seek-seek'.
4. Li (2008: 120) gives this example of 0702 as an independent verb from The Timely Pearl 292:
5098 0702 0760 1715
2ngon4 1to'4 2dzen4 1rar4
'case seek judge ?'
It corresponds to Chinese 案檢判憑 'case examine judge ?'
Nishida (1964: 215) has the translation 'to examine the case and hand down a judgment'. Nishida (1964: xii) says Burton Watson and a ヤンポルスキー (Yampolsky? - I don't know who this is, or what his preferred Anglicization of Ямпольский is) helped him with the English translations. Later, Nishida (1964: 216) has the translation 'deliver a judgment' for 判憑 in Timely Pearl 302.
I would think then that 𘅤 1715 1rar4/憑 means 'to hand down' or 'to deliver'. But the basic meaning of 𘅤 1715 1rar4 is 'to write' (Li 2008: 285). So might the Tangut phrase in The Timely Pearl mean 'write a judgment'?
憑 can be translated many ways in Chinese, but none of those translations mean 'write' or 'hand down' or 'deliver'. Might it be 'proof': i.e., 'evidence'? If so, then there is only a vague parallel between the Tangut object-verb sequence 𗍷𘅤 'write a judgment' and the Chinese verb-object sequence 判憑 'judge evidence (?)', and mechanically equating 𘅤 with 憑 may be a mistake.
Then again, to say Burton Watson's knowledge of Chinese dwarfs mine would be an understatement, and maybe 判憑 is an idiom 'deliver/hand down a judgment' that I just failed to confirm in other sources.
I always assumed Watson had learned Japanese in the American
military in WWII, but in fact he didn't know any Japanese when he
arrived in Japan in 1945, and he was actually
a Chinese major.
5. My DuckDuckGo search for Yampolsky led me to a video of minerva scientia pronouncing Tangut in Gong's (more or less) and Arakawa's reconstructions.
6. ElitekidMu0 comments on that video:
Fun fact: Thunder Force VI [Wikipedia], a shooting game released in 2008 by SEGA for the PS2, included the Tangut Language as the main language for the protagonist of the series, Galaxy Federation (Vastian). Another language included in the game is the Mongolian Script, used by the antagonist of the series, ORN Empire.
7. Last night I learned that Kara Ben Nemsi was meant to mean 'Carl son German' (though nemsi is really closer to نمساوي namsāwiyy/nimsāwiyy 'Austrian'; 'German' is ألماني 'almāniyy).
Karl May has a way with foreign names. I couldn't have come up with
something equivalent to Old Shatterhand
or Old Surehand
8. I just noticed that the Old English Wikipedia (Ƿikipǣdia) is
Sēo Frēo Ƿīsdōmbōc
'the free wisdombook' (Ƿ <W> wynn is a rune borrowed into the Old English alphabet)
forms like Irish seo
'this' the only living reflexes of Proto-Indo-European
*só retaining s-? Greek [o] has lost h- <
*s-, and English the has a th- that spread from
the th-reflexes of the *t-initial oblique forms of *só.
9. I finally got around to rewriting my lost entry for 12.26 from memory. I finished right after I ordered a used hardcover copy of William C. Hannas' The Writing on the Wall: How Asian Orthography Curbs Creativity (2013).
10. Tonight I discovered the variant 槑 for 梅 <PLUM>.
11. Baxter and Sagart (2014) reconstruct 梅 <PLUM> in Old Chinese as *C.mˤə. I suspect that *C was a voiceless consonant because Vietnamese mơ 'apricot' has a ngang tone pointing to an earlier *m̥- which may be from an even earlier *C̥m- with a voiceless *C̥- that conditioned the devoicing of *m-. I would reconstruct the word in Early Old Chinese as *C̥Amə with a low first vowel that triggered the warping of *ə to *ʌə:
*C̥Amə > *C̥Amʌə > *C̥mʌə > *m̥ʌə > *mʌe > *mʌj > *mɑj > *mwɑj > *muj > *mwəj > *məj > standard Mandarin [mej]
It is possible that *C̥A- was simply completely lost after warping in (many? most? all?) dialects other than the one underlying Vietnamese *C̥m-. I have not yet found any Chinese varieties with a yinping tone pointing to *m̥-.
The *m̥- in the scenario above is of late origin. An earlier
*m̥- in Old Chinese became *x- in stage 2 below, whereas
newer *m̥- merged with *m-:
The tones above are conditioned by final glottals: final glottal
stops conditioned the falling-rising tone [˧˩˧] and stage 3 voiced *m-
and the absence of a final glottal conditioned the high rising tone
songgiyan uliya aniya juwa juwe biya ice duin inenggi
'yellow pig year, ten two month, new four day'
1. Tonight it occurred to me that the Jurchen and Khitan large script characters for 'four' might be graphic cognates:
One might be rotated - but which one? And did the Parhae script have both rotated and nonrotated variants of <FOUR>?
12.30.0:17: Both <FOUR>s have four strokes, so they may simply be two types of tally marks formalized as characters.
In any case, the Khitan large script character is not to be confused with Chinese 卅 <THIRTY> which is a fusion of three 十 <TEN>s.
12.30.12:50: Chinese 卅 <THIRTY> in turn should not be confused
with the Jurchen phonogram <sui>:
Jin (1984: 25, 26, 180) reports the first pair of forms in the 大金得勝陀頌碑 Great Jin Victory Hill stele (1185) and the second 卅-like pair of forms in the Berlin and Tōyō bunko copies of the Ming dynasty Bureau of Translators vocabulary from c. 1500. Without examining the original texts, I cannot be certain about minor variations such as the presence or absence of a hook in the 1185 stele.
I fear that the Bureau of Translators' forms might be
unintentionally 'sinified' in the sense that unfamiliar Jurchen
characters were accidentally modified by scribes more familiar with
sinography. Perhaps the resemblance of <sui> to Chinese 卅
<THIRTY> in the Bureau of Translators vocabulary might be an
example of sinification.
12.30.15:33: Jin (1984: 58, 76) derives Jurchen <FOUR> from
the phonogram <da> which in turn he derives from Chinese 屠:
In the Jin dynasty, 屠 was pronounced *tʰu. Why base a
phonogram <da> on a Chinese character pronounced *tʰu?
I don't think <da> was a Jin dynasty invention. I think its roots go back further to a period when 屠 was pronounced as *da in Late Old Chinese. (屠 was once a transcription character for -ddha in 浮屠 *bu da = Buddha.) In other words, I think <da> is potential evidence for the Jurchen large script being an heir to an old tradition of phonetic writing rather than a 12th century invention.
I don't think there is any relationship between <FOUR> and
<da> beyond graphic convergence - the bottom of <da> (known
only from two inscriptions) may have been remodelled after the far more
common character <FOUR>.
2. Tonight while copying character 236 of the Golden Guide, I miswrote the Tangut character element 𘡛 by placing the dot too low so it intersected the stroke below it.
Nishida (1966: 242) interpreted 𘡛 as a radical for things having to do with 愛惜 aiseki 'cherish'. It just occurred to me that 𘡛 might be derived from the top of 愛 <LOVE> or the top right of 惜 <CHERISH>.
But ... what is 𘡛 doing on the top
of 𘓉 0993 1lhew1 'to herd',
of all things? Is 𘓉 0993 a semantic
compound like <CHERISH.LIVESTOCK>?
But ... the bottom of 𘓉 0993 (Boxenhorn
code: baecie) is neither 'livestock' nor short for a character for any
animal. The only other character with baecie is 𘅊 0273 1le1, a character for writing
3. I was surprised by this passage (emphasis mine):
Martin Kümmel similarly proposes, based on observations from diachronic typology, that the consonants traditionally reconstructed as voiced stops were really implosive consonants, and the consonants traditionally reconstructed as aspirated stops were originally plain voiced stops, agreeing with a proposal by Michael Weiss that typologically compares the development of the stop system of the Tày language (Cao Bằng Province, Vietnam).
But then I checked Pittayaporn (2009: 110) who explains that in Cao Bằng,
Proto-Tai *implosives > [plain voiced stops]
Proto-Tai *plain voiced stops > [voiced aspirate stops]
The voiceless aspirate stop reflexes of Thai, Lao, etc. are from Cao Bằng-like *voiced aspirate stops (e.g., the name Thai [tʰaj] itself < *dʱ- < *d-; the name Tai for the language family has an unaspirated [t] reflex of *d-).
Was there a push or pull chain in Tai? I imagine a pull chain:
*plain voiced stops became *voiced aspirate stops, leaving a gap to be
filled by *implosives becoming *plain voiced stops. But that's just an
offhand scenario with zero research, much less testing.
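To see why the order of the two Cao Bằng changes matters in a pull-chain scenario, here is a small sketch. It only models labials and dentals, the segment lists and dictionary names are my own, and it makes no claim about what actually happened in Tai.

```python
# The two changes reported for Cao Bằng, restricted to labials and dentals.
VOICED_TO_ASPIRATE = {"b": "bʱ", "d": "dʱ"}
IMPLOSIVE_TO_VOICED = {"ɓ": "b", "ɗ": "d"}


def apply_changes(inventory, changes):
    """Apply each change in order to every segment of the inventory."""
    for mapping in changes:
        inventory = [mapping.get(segment, segment) for segment in inventory]
    return inventory


proto = ["ɓ", "b", "ɗ", "d"]

# Pull chain: plain voiced stops move away first, implosives fill the gap.
pull = apply_changes(proto, [VOICED_TO_ASPIRATE, IMPLOSIVE_TO_VOICED])

# Reverse order applied sequentially: implosives merge with plain voiced
# stops before the latter move, so the contrast is lost.
reverse = apply_changes(proto, [IMPLOSIVE_TO_VOICED, VOICED_TO_ASPIRATE])

print(pull)
print(reverse)
```

The pull ordering keeps four distinct segments, while the reverse sequential ordering collapses them into two. A push chain would therefore have to be simultaneous rather than two discrete steps, which this sketch does not model.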
I can see something similar happening in Proto-Indo-European ... except for this problem:
in Proto-Tai (and languages with implosives in general), *ɓ- is common and *ɠ- does not exist
in Kümmel-style Proto-Indo-European as I understand it, *ɓ-
would be rare, and *ɠ- and *ɠʷ- would be common
The ejective hypothesis, on the other hand, correctly predicts that
Proto-Indo-European labial *pʼ (corresponding to *ɓ- in
the implosive hypothesis) would be rare or absent.
4. I wish there were animated GIFs like the Georgian ones at georgian-language.com for Manchu and traditional Mongolian letters. I've been using Jun Jiang's Manchu app which has animated images for Manchu syllables and words, but it doesn't seem to match the verbal (nonvisual) instructions in Roth Li's Manchu textbook, so I'd like to see a second opinion.
5. I discovered that the Old English Wikipedia has a runic viewing option. Select ᚱᚢᚾ <run> under the article title.
12.30.0:16: Try the ȝƿ and ᵹƿ viewing options too.
6. Why is Gdańsk
Gduńsk in Kashubian? Is Polish a : Kashubian u
a regular correspondence in some environment(s)? I don't see anything
like *a > u in Stone's (1993: 765) sketch of
Kashubian vowel history.
7. Another Kashubian surprise: kùńszt [kwuɲʃt] (I think) 'art' < German Kunst. Why [wu]? How did Kashubian develop [wu] in native words? Is [ɲ] instead of [n] due to assimilation with [ʃ]? Was the word borrowed from a German dialect in which 'art' was [kunʃt] instead of [kʊnst]? 'Hyperlabial' [wu] for [ʊ] seems odd to me.
Aha, I see now that Kashubian /u/ becomes [wu] "[i]nitially or after
a labial or a velar" (Stone 1993: 762). So [wu] has nothing to do with
8. How did Proto-Slavic *sŭnŭ 'sleep' become Lower Sorbian soń with a palatal ń instead of the expected n as in the rest of Slavic: e.g., Upper Sorbian son?
18.104.22.168:58: YELLOW PIG 12/3
<so nggiyan uliya aniya juwa juwe biya ice ilan inenggi>
'yellow pig year, ten two month, new three day'
(0. 12.29.0:15: I keep thinking the version of <ilan> above
looks like Chinese 斗 <DIPPER>, but it is of course in fact
cognate to Chinese 三 <THREE>.)
1. Via Andrew West: Abraham Gross' proposal
to encode the missing kana <YI> and <WU> in Unicode. That
reminds me to upload my August post about <YI> and <WU>.
2. I first heard the song "Year of the Cat" as a child in 1976, and only years later¹ did I learn that it was a reference to the Vietnamese zodiac which is close to the Chinese one with two exceptions:
Sino-Vietnamese 丑 sửu for the water buffalo rather than the ox
Sino-Vietnamese 卯 mão (mẹo in an earlier stratum of borrowing) for the cat rather than the rabbit
The terms for the Vietnamese zodiac are not the normal terms for animals: e.g., in Vietnamese, 'water buffalo' is 𤛠 trâu and 'ox' is 𤙭 ~ 𤞨 bò.
I've long assumed that the reinterpretation of 丑 sửu as water buffalo incorporated a local animal, but water buffalo exist in China too. Duh. In fact, China has seven times more water buffalo than Vietnam. Shows you what I know about farming: nothing. So I can't explain how sửu came to refer to water buffalo.
As for 卯 mão/mẹo, was its reinterpretation as 'cat' due to a folk etymological association with 貓 ~ 猫 mèo 'cat'?
¹In an interview with Al Stewart that I heard on the radio in 1989?
3. I never heard of screeves until today. The word sounds like it could be a native English word, but in this context it's actually a loan from Georgian მწკრივი mcʼkʼrivi 'row, series'. I wonder why it's so Anglicized. It's not as if Japanologists speak of 行 gyō 'rows (of kana sharing the same vowel: e.g., a, ka, sa)' as gheow or however an English speaker might spell it. (It would be fun to ask English speakers unfamiliar with Japanese to write gyō phonetically.)
There turns out to be another screeve which isn't native or from Georgian.
22.214.171.124:59: YELLOW PIG 12/2
<so nggiyan uliya aniya juwa juwe biya ice juwe inenggi>
'yellow pig year, ten two month, new two day'
1. Dept. of Ideas I Wish I Had: Alexander Zapryagaev's proposal for writing Old Japanese in hentaigana, a logical extension of the common practice of writing the extinct Japanese syllable ye (now [e]) in hiragana as the hentaigana 𛀁 to differentiate it from え e and ゑ we (also now [e]). (More in this thread by Sven Osterkamp.)
2. The reading ritsu for 立 <STAND> is in that stratum of Japanese that I feel as if I've 'always' known. I suspect I learned the reading in the early 80s when I started to read Japanese books with furigana.
When I started learning Korean in 1987, I immediately picked up on the correspondences between Sino-Korean and Sino-Japanese¹. For instance, I noticed that Sino-Korean -l regularly corresponded to Sino-Japanese -tsu or -chi and vice versa. So I should have expected ritsu to correspond to Sino-Korean 릴 ril. But of course, the actual Sino-Korean reading of 立 is 립 rip. I learned that reading so early in my studies that I didn't even know the correspondence patterns yet. Hence the mismatch of -p and -tsu didn't bother me at all.
Not long afterward I learned Sino-Korean 잡 chap corresponding to Sino-Japanese zatsu for 雜 <MIXED>.
And then I learned the Cantonese readings of those characters: lap6 and zaap6.
The next step was learning about Chinese reconstruction. Of course all agree that 立 and 雜 originally ended in *-p in Chinese, and that Cantonese preserves that *-p.
So how did the Sino-Japanese readings of 立 and 雜 come to end in -tsu? Alexander Zapryagaev has a thread on the mystery of 立 ritsu.
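The way 立 and 雜 stand out against the regular pattern can be sketched as a simple check. This is only an illustration using romanized readings: the function names are mine, and 一 ichi : 일 il is a regular pair I have added for contrast, since it is not one of the characters discussed above.

```python
def expected_sk_coda(sj_reading: str):
    """Guess a Sino-Korean coda from a romanized Sino-Japanese reading.

    Only the -tsu/-chi : -l pattern is modelled; anything else gives None.
    """
    if sj_reading.endswith(("tsu", "chi")):
        return "l"
    return None


# character: (Sino-Japanese, Sino-Korean), romanized as in this post.
READINGS = {
    "一": ("ichi", "il"),    # added regular example (not from the post)
    "立": ("ritsu", "rip"),
    "雜": ("zatsu", "chap"),
}


def mismatches(readings):
    """Return the characters whose Sino-Korean coda defies the guess."""
    odd = []
    for char, (sj, sk) in readings.items():
        guess = expected_sk_coda(sj)
        if guess is not None and not sk.endswith(guess):
            odd.append(char)
    return odd


print(mismatches(READINGS))
```

The two characters flagged are exactly the ones whose -tsu needs a special explanation.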
¹And Mandarin, but that's not relevant here, since Mandarin lacks final stops. Without knowledge of Mandarin, I would have had a much harder time remembering which Sino-Korean words ended in -ng.
12.29.20:35: How I guessed final consonants in Sino-Korean in 1987 (before I knew anything about Cantonese or Vietnamese):
||vowel (usually; unpredictably occasionally in -p)
||-n or -m (unpredictable)
At the time I just memorized which Sino-Korean readings ended in -p, since there was no way to guess Sino-Korean -p on the basis of Sino-Japanese or Mandarin even in regular cases such as
十 <TEN> SJ jū : Md shi : SK 십 ship
In that particular case, *-ip was borrowed into Japanese as *-ipu which became *-iu and then -ū.
Once I learned which Sino-Korean readings ended in -p and -m, I could use that knowledge to guess which Cantonese and Vietnamese readings ended in -p and -m.
126.96.36.199:59: YELLOW PIG 12/1
(I completed this post but lost it before I could upload it, so I reconstructed it on 12.30.16:13.)
<so nggiyan uliya aniya juwa juwe biya ice inenggi>
'yellow pig year, ten two month, new day'
1. The first ten days of the month are ice 'new' in the Ming Jurchen calendar. (In Jin Jurchen, the first day was 一日 emu inenggi 'one day'. Note how the early graphs are identical to Chinese 一日 <ONE DAY>.) Jin (1984: 105) derives the graph for ice
from the left side 亲 of Chinese 新 <NEW>. But I think the Jurchen
graph may be more directly connected to Chinese 𢀝 <NEW>, a variant
attested in the Jin dynasty dictionary 四聲篇海 Sisheng pianhai (The Four-Tone Text Sea).
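The date lines heading these posts follow a fixed pattern, so they can be generated. Here is a rough sketch that uses only the numerals and formulas attested in these posts (months 11 and 12, days 1-9 with ice 'new', days 21-29, and day 30). It refuses anything else rather than guess, and the function names are my own.

```python
# Numerals as romanized in these posts.
UNITS = {1: "emu", 2: "juwe", 3: "ilan", 4: "duin", 5: "shunja",
         6: "ninggu", 7: "nadan", 8: "jakūn", 9: "uyewun"}

YEAR = "songgiyan uliya aniya"  # 'yellow pig year'


def month_phrase(month: int) -> str:
    # Only 'ten one' and 'ten two' are attested in these posts.
    if month in (11, 12):
        return "juwa " + UNITS[month - 10]
    raise ValueError("month not attested in these posts")


def day_phrase(day: int) -> str:
    if day == 1:
        return "ice"                      # 'new (day)'
    if 2 <= day <= 9:
        return "ice " + UNITS[day]        # 'new two' ... 'new nine'
    if 21 <= day <= 29:
        return "orin " + UNITS[day - 20]  # 'twenty one' ... 'twenty nine'
    if day == 30:
        return "gūsin"                    # 'thirty'
    raise ValueError("day not attested in these posts")


def jurchen_date(month: int, day: int) -> str:
    return f"{YEAR} {month_phrase(month)} biya {day_phrase(day)} inenggi"


print(jurchen_date(12, 1))
print(jurchen_date(11, 30))
```

Days 10-20 are left out on purpose, since none of them appears in these posts.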
2. In 1998 I reviewed William C. Hannas' Asia's Orthographic Dilemma for Korean Studies. I finally got around to reading a Kindle sample of the 2013 sequel The Writing on the Wall: How Asian Orthography Curbs Creativity. Here's my attempt to sum up Hannas' argument:
A. East Asia has a "creativity deficit" (Kindle location 146)
B. Writing "affects thought" (Kindle location 245)
C. B causes A - in other words, East Asian writing systems cause a "creativity deficit"
A and/or B could be true. But I am skeptical of C.
188.8.131.52:59: YELLOW PIG 11/30
<so nggiyan uliya aniya juwa emu biya gūsin inenggi>
'yellow pig year, ten one month, thirty day'
gūsin 'thirty' looks like Janhunen's (2003: 397) Proto-Tungusic *gutïn from para-Mongolic or pre-Proto-Mongolic *gutïn. (The Proto-Tungusic form cannot be from Proto-Mongolic *gucin which underwent two changes: *ï > *i and *ti > *ci.) However, Proto-Tungusic *gutïn should become Jurchen gutin, not gusin.
I propose that Jurchen gūsin may be a borrowing that
replaced an earlier *gūtïn inherited from
Proto-Tungusic. (The macron in Jurchen does not symbolize length; it
indicates that u is [ʊ].) The source of Jurchen gūsin
may be a para-Mongolic (Khitan?) dialect that shifted *c to sh
(unlike the prestigious Khitan dialect preserved in the small script
that retains c).
is a graphic cognate of Jurchen
(12.26.13:19: Left to right: the earliest form from Nüzhen zishu [Book of Jurchen Characters, c. early 12th c.?], variant in 慶源 Kyŏngwon inscription, 1138-1153, 進士 jinshi candidate list, 1224, Berlin copy of the Bureau of Translators vocabulary, 15th c. It is interesting that the early and late forms are more similar to each other than to the forms between them.)
and sounded something like Jurchen gūsin, though there is no
evidence for its pronunciation.
2. When I was studying Russian in the late 90s, I was surprised that
'Kremlin' was Кремль <Kreml'> without an n. I asked my
professor why and ... I can't remember his answer. Today I learned from
that there is an Old East Slavic кремлинъ <kremlinŭ>
with -n-. But how did that n-form enter English? Not
directly, I assume.
1660s, Cremelena, from Old Russian kremlinu, later kremlin (1796), from kreml' "citadel, fortress," a word perhaps of Tartar origin. Originally the citadel of any Russian town or city, now especially the one in Moscow (which enclosed the imperial palace, churches, etc.). Used metonymically for "government of the U.S.S.R." from 1933. The modern form of the word in English might be via French.
The un-Turkic initial cluster kr- makes a Tatar (not 'Tartar') origin improbable. The Russian Wiktionary derives kreml' from Proto-Indo-European *kʷrom 'fence'.
12.26.10:09: Merriam-Webster says:
1662 [...] obsolete German Kremelien the citadel of Moscow, ultimately from Old Russian kremlĭ
That gives the impression that German added the -n (but why?).
184.108.40.206:23: YELLOW PIG 11/29
<so nggiyan uliya aniya juwa emu biya orin uyewun inenggi>
'yellow pig year, ten one month, twenty nine day'
orin uyewun 'twenty nine' is a para-Mongolian
(Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin yisün
'twenty nine' containing an unrelated Mongolian word for 'nine'.
Jurchen uyewun is trisyllabic unlike any other Tungusic word
for 'nine' at starling
other than Negidal ijeɣin with different first and third
vowels. Negidal i can correspond to Jurchen/Manchu u:
e.g., N edin : J/M edun 'wind'. I have long assumed
that Manchu uyun is a contraction of uyewun.
That contraction already existed before Manchu got that name since the
Ming dynasty Bureau of Interpreters vocabulary has disyllabic uyun (transcribed 兀容). The roughly contemporaneous trisyllabic uyewun
(transcribed 兀也溫) in the Ming dynasty Bureau of Translators vocabulary
may be more carefully pronounced and/or from a different dialect.
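The date line above is additive: 'ten one month' is 10 + 1 = 11 and 'twenty nine day' is 20 + 9 = 29, matching the 11/29 in the heading. Here is a minimal Python sketch of that arithmetic. The function names are hypothetical, the numeral readings are simply the romanizations used in these posts' date lines, and I am assuming that aniya, biya, and inenggi line up with 'year', 'month', and 'day' in the gloss.

```python
# Jurchen numeral readings as romanized in these posts' date lines.
UNITS = {
    "emu": 1, "juwe": 2, "ilan": 3, "duin": 4, "shunja": 5,
    "ninggu": 6, "nadan": 7, "jakūn": 8, "uyewun": 9,
}
TENS = {"juwa": 10, "orin": 20}


def numeral_value(phrase):
    """Add up the tens and units words in a phrase such as 'orin uyewun'."""
    total = 0
    for word in phrase.split():
        if word in TENS:
            total += TENS[word]
        elif word in UNITS:
            total += UNITS[word]
        else:
            raise ValueError(f"unknown numeral word: {word}")
    return total


def lunar_date(date_line):
    """Return (month, day) from '... aniya <month> biya <day> inenggi'.

    Assumes aniya/biya/inenggi mark year/month/day as in the gloss.
    """
    words = date_line.split()
    year_end = words.index("aniya")
    month_end = words.index("biya")
    day_end = words.index("inenggi")
    month = numeral_value(" ".join(words[year_end + 1:month_end]))
    day = numeral_value(" ".join(words[month_end + 1:day_end]))
    return month, day
```

Feeding it the date line of this post, `lunar_date("so nggiyan uliya aniya juwa emu biya orin uyewun inenggi")`, gives the month and day as a pair of integers.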
It's already Christmas in most of the world as I write this, so as a
'gift' to my readers, I'm uploading all the posts I wrote over the last
month but had kept on my computer until now:
I've been too tired and busy to upload posts late at night.
19.12.23.xx:57: YELLOW PIG 11/28
<so nggiyan uliya aniya juwa emu biya orin jakūn inenggi>
'yellow pig year, ten one month, twenty eight day'
1. orin jakūn 'twenty eight' is a para-Mongolian
(Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin naiman
'twenty eight' containing an unrelated Mongolian word for 'eight'.
Jurchen jakūn 'eight' has not changed much from Proto-Tungusic
*japkun whose first syllable *ja looks like
Proto-Japonic *ya 'eight'. Coincidence? How many other instances
of Proto-Tungusic initial *j- correspond to Proto-Japonic *y-?
If one wants to link the Tungusic and Japonic words for 'eight' via borrowing, one must deal with the complication of working out a scenario of Tungusic-Japonic contact (see yesterday's post) and with the question of why Tungusic has *-pkun and Japonic doesn't. Proposing a genetic relationship eliminates the contact problem but still doesn't resolve the *-pkun problem.
It may be tempting to link early Korean *yʌtʌrp (Lee and Ramsey 2011: 160) to the Tungusic and Japonic words, but that raises even more problems: e.g., what is *tʌrp?
2. The current state of Korea-Japan relations in a slogan:
안가 an ka '[I] don't/won't go [to Japan]'
안사 an sa '[I] don't/won't buy [from Japan]!'
(1.2.15:51: Corrections by Kongduino.)
The verbs appear to be bare stems but are actually a-stems
that have absorbed an -a
ending that Martin (1992: 466) calls the 'infinitive'. But I would
rather not use the term 'infinitive' for the ending of a finite verb.
The -a ending is more obvious in forms like 봐! pwa!
'look!' (< po-a) and 팔아! phar-a 'sell!'
3. I was surprised to learn from Martin et al. (1967: 870) that sa-
'buy' is also an "old-fashioned" term for 'sell (grain)', so ssar-ŭl
sa-da 'rice-ACC X-STATEMENT' can be either 'buy rice' or 'sell
19.12.22.xx:50: YELLOW PIG 11/27
<so nggiyan uliya aniya juwa emu biya orin nadan inenggi>
'yellow pig year, ten one month, twenty seven day'
1. orin nadan 'twenty seven' is a para-Mongolian
(Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin dolughan
'twenty seven' containing an unrelated Mongolian word for 'seven' with
the numeral suffix last seen in jirghughan 'six'.
Jurchen nadan 'seven' can be projected intact all the way
back to Proto-Tungusic.
Proto-Tungusic *nadan looks like Proto-Japonic *nana
'seven'. Coincidence? How many other instances of Proto-Tungusic
intervocalic *-d- correspond to Proto-Japonic *-n-?
What complicates a loan scenario is uncertainty over whether the two
proto-languages were in contact. I think Tungusic and para-Japonic
languages might have been in contact in Parhae, but that's centuries
after the ancestor of Japonic spread from the Korean peninsula to the
2. I just heard Muir pronounced as [mjʊɚ] which is what I'd expect for a theoretical Miur. Wiktionary lists a General American /mɪɚ/. I have never heard the name pronounced before. I thought it was homophonous with Moore in English. Wiktionary lists five (!) pronunciations for Scots muir 'moor': [møːr], [myːr], [meːr], [miːr], [mjuːr].
3. I also heard Buttigieg
pronounced for the first time as [ˈbuːtɪdʒɪdʒ]. I had been
mispronouncing it as [ˈbuːtɪdʒɛg], thinking gi was like Italian
[dʒ]. Turns out both g's are Maltese ġ [dʒ] and ie
is [ɨː] (according to Wikipedia's IPA
for Maltese page) or [ɪː], [iɛ], or [iː] (according
to Wikipedia's Maltese language page). In any case, ie is
from ā, and so I'm not surprised to learn that Wiktionary says Buttiġieġ
is from Arabic أبو الدجاج <ʔˀbw ʔldjʔj> ʔabū ad-dajāj,
lit. 'father [of] the-poultry' with ā.
The bending of ā to ie in Maltese reminded me of the raising of Old Chinese *a to *ie and various high vowels and convinced me that Norman's pharyngeal hypothesis for Chinese was right. In my take on his hypothesis, pharyngealization pushed vowels down, whereas vowels raised in its absence. But David Boxenhorn made me think pharyngealization might not be a factor; vowel harmony alone might trigger vowel lowering and raising. And vowel harmony is a well-attested phenomenon in north Asian languages.
19.12.21.xx:59: YELLOW PIG 11/26
<so nggiyan uliya aniya juwa emu biya orin ninggu inenggi>
'yellow pig year, ten one month, twenty six day'
1. orin ninggu 'twenty six' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin jirghughan 'twenty six' containing an unrelated Mongolian word for 'six'.
Grinstead (1972: 16) noted that
is an inverted Chinese 六 <SIX>. It is not like any of the variants of Khitan large script <SIX>:
Is the Jurchen graph a 12th century invention, or is it derived from a version of the Parhae <SIX> that the Khitan did not adopt for their large script?
The reading of Khitan <SIX> is unknown, but it might be
something like Proto-Mongolic *jir-gu-xan 'two-three-NUMERAL'
as reconstructed by Janhunen (2003: 17). Jishi read <SIX> as ʧirkɔ:
i.e., as 'two-three'. But if Janhunen is right about *jir-gu-xan
being an innovation, Khitan might retain an older Proto-Serbi-Mongolic
root for 'six'.
The Khitan small script block
<085.033.288> <SIX.is.bun> (Epitaph for Empress 仁懿 Renyi, d. 1076)
might indicate that <SIX> ended in -i, given how the
initial vowel of one block (here, the i of <is>) is often
(but not always) the final vowel of the previous block (here,
2. What is the etymology of Hawaiian luakini 'large heiau [Hawaiian temple; < hei 'sacrifice' + ?] where ruling chiefs prayed and human sacrifices were offered'? It looks like a compound of lua plus kini, but I can't find any lua or kini that would transparently add up to 'sacrificial temple'.
on the Dzungar genocide:
[Qing emperor] Qianlong issued his orders multiple times as some of his officers were reluctant to carry them out. Some were punished for sparing Dzungars and allowing them to flee, such as Agui and Hadada, while others who participated in the slaughter were rewarded like Tangkelu and Zhaohui (Jaohui).
is a Manchu name, it violates vowel harmony. I would expect Tangkalu
4. I wish I could look for Tangkelu in Giovanni Stary's A Dictionary of Manchu Names (2000). The book's National Library of Australia listing says it's in "Mandingo" (sic). No.
5. In actual
Mandingo, "/g/ and /p/ are found in French loans." The language has
/k c j t d b/, though. Are /h/ and /f/ in part or in whole from earlier
*g and *p?
IPA transcription of the Kazakhstani national anthem is so
different from what one might think Kazakh sounds like solely on the
basis of the Cyrillic or Latin alphabet: e.g.,
'courage-GEN epic-3.POSS.NOM' = 'epic of courage'
One might expect the pronunciation to be something like [erliktiŋ dastanɨ] on the basis of Cyrillic and Latin alone. And if one guessed that Cyrillic і was [i], what would one guess и is? (It's [ɪj] ~ [əj] according to this chart.)
The use of ы/y for [ə] reminds me of my own choice to use y for the Tangut neutral vowel which may have been [ə] or [ə]-like in one or more grades.
The 3rd person singular possessive suffix -ы/y is missing
from this table. See Mukhamedova
(2016: 81) on the Kazakh X-GEN Y-POSS 'Y of X' construction.
7. Why does Glosbe
align Kazakh дастан 'epic' with Dennis in translations?
8. Until now I assumed that Turkic beg was a loanword from the Middle Chinese title 伯 *pæk. That is the etymology in Clauson (1972: 322). But Wiktionary has a second etymology:
the Middle Persian title bag (also baγ or βaγ, Old Iranian baga; cf. Sanskrit भग / bhaga) meaning "lord" and "master". Peter Golden derives the word via Sogdian bġy from the same Iranian root. All Middle Iranian languages retain forms derived from baga- in the sense "god": Middle Persian bay (plur. bayān, baʾān), Parthian baγ, Bactrian bago, Sogdian βγ-, and were used as honorific titles of kings and other men of high rank in the meaning of "lord".
The problem I have with this etymology is: why was a
in some Iranian language borrowed as Turkic e?
If /a/ in the Iranian source language was [æ], how can Slavic bog 'god' be a loan from Iranian? Was the Slavic word borrowed from a different Iranian source language in which /a/ was back and labial: [ɒ] or [ɔ]?
As for the Chinese etymology, the mismatch of initials (Chinese *p- vs. Turkic b-) is not a problem if the borrowing was in an early Turkic variety without p-. (Pre-Proto-Turkic *p- became Proto-Turkic *h- which was preserved in Khaladj and was lost elsewhere.)
The -g of beg might be a Turkic approximation of a Chinese (allophonic?) [ɣ]-like pronunciation of *-k. Although Old Turkic did have gh, gh could not coexist with e, but g could. And at some point, Middle Chinese *æ raised to *ɛ. Late Middle Chinese *pɛɣ was transcribed in the Tibetan version of the 千字文 Thousand Character Classic (c. 9th-10th c.?) as <peg.> which is close to Turkic beg. (However, the Turkic word is first attested in the 8th century, possibly when 伯 was closer to *pæk than *pɛɣ in western Middle Chinese.)
9. If I understand this correctly, Haddow is a Germanic/Celtic (Scots + Scots Gaelic) hybrid. Are there more common names like it?
10. Aacistak has been called "the Language Capital of the World". What is
its more common name?
19.12.20.xx:55: YELLOW PIG 11/25
<so nggiyan uliya aniya juwa emu biya orin shunja inenggi>
'yellow pig year, ten one month, twenty five day'
1. orin shunja 'twenty five' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin tabun 'twenty five' containing an unrelated Mongolian word for 'five'.
The initial of 'five' in Manchu is s-, not sh-.
Neither Jurchen sh- nor Manchu s- matches the
t- in the rest of Tungusic.
2. Last night I thought of a Chinese character for the first time in
many years: 閼. It has the same phonetic as a character that I first
encountered last week: 菸.
That phonetic is a drawing of a crow: 於/烏. 烏 still represents the
word for crow, but its variant 於 has come to represent a nearly
homophonous locative preposition.
Normally 於/烏-graphs represent open syllables in modern languages: e.g.,
烏嗚 Old Chinese *ʔa > Cantonese wu1
於 Old Chinese *CIʔa > Cantonese jyu1
棜瘀 Old Chinese *CIʔas > Cantonese jyu3
So in Cantonese, I would expect 閼 and 菸 to end either in -u [u] or -yu [y]. But they don't:
閼 Cantonese aat3 ~ jin1
菸 Cantonese jin1
The vowels are less of an issue (see the appendix) than the codas:
Cantonese -t and -n go back to Old Chinese *-t and *-n.
Usually an Old Chinese phonetic can represent Old Chinese *-t syllables or *-n syllables but not both.
And usually an Old Chinese phonetic for *vowel-final syllables can also represent Old Chinese *-ʔ and *-(ʔ)s syllables but not Old Chinese syllables ending in stops other than *-ʔ or ending in *nasals.
In other words, 於/烏 should represent *-a(ʔ)(s) syllables but not *-t syllables or *-n syllables. Should. But clearly 於 is a phonetic in
Old Chinese *ʔat > Cantonese aat3
Old Chinese *CIʔan and *ʔen (syllable in the title of the Xiongnu supreme female leader) > Cantonese jin1
Old Chinese *ʔa(t)s > no Cantonese reflex (which would theoretically be *jyu3)?
Old Chinese *CIʔat > no Cantonese reflex (which would theoretically be *jit3)?
Old Chinese *CIʔa 'to fade' > no Cantonese reflex (which would theoretically be *jyu1)?
Old Chinese *CIʔas 'smelly grass' > no Cantonese reflex (which would theoretically be *jyu3)?
I have not found any evidence for 菸 being read with -n before the last millennium. At some point 菸 came to represent a word 'tobacco' < 煙/烟 Old Chinese *CAʔin 'smoke' normally written with -n phonetics (垔 and 因). The top component of 菸 'tobacco' is <GRASS> which makes sense. But the bottom component 於 is a poor phonetic (and 於 is unlikely to be an abbreviation of the uncommon character 閼 which also has non-n readings). Was 菸 'smelly grass' chosen to write an unrelated and phonetically different but semantically similar word 'tobacco'?
I found 菸 via Wiktionary's entry on yen. I forgot that yen could also refer to having a desire for something.
12.22.19:22: APPENDIX: Some *-a rhymes from Old Chinese to Cantonese:
*Ca > *Co > [Cuː]: e.g., Cantonese 烏嗚 wu1 [wuː˥]
*CICa > *CICɨa > *Cɨa > *Cɨə > *Cɨ > [Cyː]: e.g., Cantonese 於 jyu1 [jyː˥]
*Cat > [Caːt]: e.g., Cantonese 閼 aat3 [aːt˧]
*CICan > *CICɨan > *Cɨan > *Cɨən > *Cɨen > *Cien > [Ciːn]: e.g., Cantonese 閼 jin1 [jiːn˥]
*CACin > *CACein > *Cein > *Cen > *Cien > [Ciːn]: e.g., Cantonese 菸煙烟 jin1 [jiːn˥]
Old Chinese *-s conditions Cantonese tone 3 after *voiceless initials
Old Chinese *-t conditions Cantonese tone 3 after *voiceless initials and *long vowels
At some point after tonogenesis, *ʔ- was lost, and zero initials became homorganic glides before high vowels:
*ʔu > u > [wuː]: e.g., Cantonese 烏嗚 wu1 [wuː˥]
*ʔy > y > [jyː]: e.g., Cantonese 於 jyu1 [jyː˥]
*ʔi > i > [jiː]: e.g., Cantonese 閼 jin1 [jiːn˥]
Contrast with *ʔa > nonhigh [a] without a glide in Cantonese 閼 aat3 [aːt˧].
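The last step of the appendix (loss of *ʔ-, then a homorganic glide before a high vowel) is simple enough to sketch as two ordered rules. This is only a toy in Python: the function names are hypothetical, vowel length and tone are ignored, and the glide table contains nothing beyond the three pairings listed above.

```python
# Glide chosen for each high vowel, following the three examples in the appendix.
GLIDE_FOR_HIGH_VOWEL = {"u": "w", "y": "j", "i": "j"}


def lose_glottal_stop(syllable):
    """Drop an initial glottal stop, leaving a zero initial."""
    return syllable[1:] if syllable.startswith("ʔ") else syllable


def add_homorganic_glide(syllable):
    """Prefix w or j if the syllable now begins with a high vowel."""
    glide = GLIDE_FOR_HIGH_VOWEL.get(syllable[:1])
    return glide + syllable if glide else syllable


def derive(syllable):
    """Return every stage of the derivation, starting with the input."""
    stages = [syllable]
    for rule in (lose_glottal_stop, add_homorganic_glide):
        stages.append(rule(stages[-1]))
    return stages
```

`derive("ʔin")` walks through the stages behind jin1, while `derive("ʔat")` ends without a glide because a is not in the table, which is the contrast noted for aat3.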
19.12.19.xx:51: YELLOW PIG 11/24
<so nggiyan uliya aniya juwa emu biya orin duin inenggi>
'yellow pig year, ten one month, twenty four day'
1. orin duin 'twenty four' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin dörben 'twenty four' containing an unrelated Mongolian word for 'four'. -ben is the 'feminine'¹ vowel variant of the -ban found in ghurban 'three', and both ghurban and dörben have a shared suffix -r- (Janhunen 2003: 47).
Rozycki (1983: 7, 93) regards Jurchen/Manchu duin and
Written Mongolian dörben to be a "[p]re-loan correspondence":
"words with a phonology consistent with native Tungus stock and for
which there is no evidence of loaning". I regard the vague similarity
of duin and Proto-Mongolic *dö- 'four' (as
reconstructed by Janhunen 2003: 47) as coincidental.
¹I use the term 'feminine' to avoid
committing to a front or higher vowel interpretation of e.
2. Yesterday I forgot how to pronounce 6ix9ine which
looks like it was written in the Arabic chat
alphabet (in which 6 is ط <ṭ> and 9 is ص <ṣ> or ق
<q>). But it's actually a
stylized spelling of six nine mixing logograms with letters.
The Jurchen (large) script, Korean hyangchal, and Japanese script
frequently have logogram-phonogram sequences for words. Perhaps the
Khitan large script did too, but it's too poorly understood for me to
How did Tekashi 6ix9ine come up with the stage name Tekashi?
Is it based on Japanese Takashi?
3. I knew Ў wasn't unique to Belarusian (in which it represents /w/), but I forgot which other language was written with Ў: Uzbek. Ў has since been replaced with Oʻ. Ў/Oʻ represents mid /o/, whereas О/O represents low /ɒ/ and /o/ in Russian loans. Did Uzbeks perceive Russian /o/ [o] ~ [ɔ]² as being lower than their /o/ and closer to their /ɒ/? Does native /o/ have a high allophone [ʊ]? That would explain why it was written as Ў: i.e., as У <U> with a breve rather than as О <O> plus a diacritic.
²For some reason, Wikipedia IPA has [ɛ] for Russian
/e/ and [o] (not [ɔ]) for Russian /o/ even though this
diagram shows the two vowels at almost identical heights with [o] lower
than [ɛ] rather than the other way around.
4. Cyrillic Ӯ (Ұ after 1957; see here for other uses of Ӯ) for Kazakh /ʊ/ reminds me of Möllendorff's Ū for Manchu /ʊ/.
The 'feminine' counterpart of Manchu /ʊ/ is /u/, but Kazakh has no /u/. It has an interesting three-way categorization of vowels: -RTR, 0RTR (neutral), and +RTR. The [-RTR] and [0RTR] counterparts of [+RTR] /ʊ/ are /ɪ/ and /ʉ/. (Kazakh has no /i/ either. If the IPA symbols are taken at face value, apparently the only high vowel is central /ʉ/; /ɪ/ and /ʊ/ are slightly lower.)
Is Kazakh /œ/ backed if not central? It is a [0RTR] vowel like /ʉ əj
ə/ despite being written with a front vowel symbol like the [-RTR]
vowels /ɪ jɪ e æ/.
5. I wish I had a key to the 1964-1984 Kazakh Latin alphabet used in
China (and in this
1977 edition of Mao's Selected
6. Last night I found Handel
(2006) while trying to find where I had first encountered the idea
that Korean 바람 param < Middle Korean pʌ̀rʌ̀m
'wind' was a borrowing from Old Chinese. I thought I had read it in
Pulleyblank (1962), but I couldn't find it there. This 2013 post
reminded me I got it from William Boltz. My apologies to Professor
Handel discusses 'wind' on page 1015. In footnote 8, he mentions an
internal etymology relating Korean 'wind' to pul- < Middle
Korean pǔr- 'to blow'. Although the semantic match is perfect,
the phonetic match leaves much to be desired. First, I know of no other
cases of a CʌC-noun from a CuC-verb. Second, Middle
Korean pǔr- is a class 5 stem in Ramsey's (1986) typology; it
is a disyllabic stem /pùúr/, and if I understand Ramsey (1978: 221)
correctly, it goes back to *pùrɯ́- with high series vowels and
a high-low pitch pattern unlike the low-pitched low series vowels of pʌ̀rʌ̀m.
part of the Wikipedia article on the Common Turkic Alphabet puzzles me:
Some handwritten letters have variant forms. For example: Čč=Jj, Ķķ=Ⱪⱪ, and Ḩḩ=Ⱨⱨ.
But Lithuanian Karaim, the only Turkic Latin alphabet that I
know of with Č, distinguishes Č (for [tʃ]) from J
(for [j]). And I find it hard to believe that two letters with such
different shapes could be variants only in Turkic usage.
Of course in general Latin letter usage there are some surprising
variants. Would an alien guess that B and b are the
same letter? Uzbek used to have в instead of b in the 1928-40
Yaꞑalif alphabet. (I am not italicizing в since I'm not sure if the old
Uzbek italic в looked like Russian italic в.)
Turns out that "[t]he small letter B is ʙ (to prevent confusion with Ь ь)". Although Ь represented palatalization in Russian, in Yaꞑalif, it seems to have stood for Soviet Turkic vowels similar to Turkish ı: e.g., Tatar [ɤ]. Uzbek had no such vowel:
Nonetheless I guess ʙ remained the lowercase version of B in Uzbek
for consistency with the other variants of Yaꞑalif. You
can see Uzbek ʙ here.
8. I've never looked at Karakalpak before today. I confess I forgot it even existed.
It has a nearly symmetrical vowel system with palatal vowel harmony. Only e has no nonpalatal counterpart.
It also has labial harmony. If the first vowel is nonlabial, then the second vowel cannot be labial. However, if the first vowel is labial, then the second vowel may or may not be labial. In any case, vowels must match in palatality.
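The two harmony constraints as I have stated them can be written as a small check on a pair of vowels. This is a sketch in Python based only on my paraphrase above, not on any description of Karakalpak. The class and function names are hypothetical, and vowels are represented as feature pairs since I have not listed the actual inventory here.

```python
from typing import NamedTuple


class Vowel(NamedTuple):
    """A vowel reduced to the two features that the harmony rules mention."""
    palatal: bool
    labial: bool


def sequence_allowed(first, second):
    """Check a two-vowel sequence against the two constraints stated above."""
    # Palatal harmony: the vowels must match in palatality.
    if first.palatal != second.palatal:
        return False
    # Labial harmony: a nonlabial first vowel rules out a labial second vowel.
    if not first.labial and second.labial:
        return False
    # After a labial first vowel, the second vowel may or may not be labial.
    return True
```

For example, `sequence_allowed(Vowel(True, True), Vowel(True, False))` passes, whereas swapping the two vowels fails the labial condition.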
How was Karakalpak /h/ written in Cyrillic? I can't find a Cyrillic letter for it.
9. Wikipedia says that
The [irregular] /otoosan/ form [for Japanese 'father'] first appears in the early Meiji period in educational materials mandated by the 文部省 (Monbushō, "Ministry of Education").
Did /otoosan/ replace earlier /otossan/ by analogy with the long vowel of /okaasan/ 'mother'?
/okaasan/ is itself irregular; it is from /okakasan/
with irregular intervocalic /k/-loss.
Wikipedia lists Taiwanese borrowings of both words: 多桑 <MANY
MULBERRY> tò-sàng and 卡桑 <kha MULBERRY> khà-sàng. Both
reflect shorter Japanese forms without the honorific prefix o-.
19.12.18.xx:xx: YELLOW PIG 11/23
<so nggiyan uliya aniya juwa emu biya orin ilan inenggi>
'yellow pig year, ten one month, twenty three day'
1. orin ilan 'twenty three' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin ghurban 'twenty three' containing an unrelated Mongolian word for 'three'.
2. Yesterday I learned that Eom Ik-sang still believes a number of Korean words conventionally regarded as native are actually borrowings from Old Chinese. Even if I assume the Old Chinese forms he cites are correct, there are still issues.
Perhaps the most convincing of his proposals is
Old Chinese 風 *pljəm (Li), *plums (Zhengzhang) 'wind' : Korean 바람 param 'id.'
I would prefer to cite Middle Korean pʌ̀rʌ̀m 'wind' which is even closer to the Old Chinese reconstructions that he cites.
Although I expressed some doubts about a liquid in the Old Chinese word for 'wind'
in 2013, I would favor reconstructing that word as *prəm
with *-r- now.
That aside, there is one other potential problem with the comparison: I don't think anyone's Old Chinese reconstruction for 'wind' ever had the vowel *ʌ. If the Old Chinese word for 'wind' had *ə, why was it borrowed into early Korean as something like pʌ̀rʌ̀m when Korean also had the vowel ə? In other words, why isn't the Korean word for 'wind' pərəm with ə?
12.19.22:33: Was Edkins (1890: 95) the first to derive Korean param from Old Chinese 風?
param, wind; from [an unspecified - presumably Chinese -] pam. The old Chinese for wind is bam, which has changed to [Mandarin] feng.
Edkins was writing decades before Karlgren reconstructed Old
Chinese. I know almost nothing about pre-Karlgren Chinese
reconstructions, so I wonder what the reasoning behind pam and bam
is. *pam is not a bad guess, since even in the 19th century,
it was known that f- was from *p- and that 'wind' rhymed
with 南 'south' (Mandarin nán and Cantonese naam4).
However, *b- is a surprise, as 'wind' does not have a tone
pointing to an earlier *voiced initial.
3. I've never seen anything like this use of the reflexive in Romagnol:
mè a sò 'I am' (cf. Italian [io] sono 'id.')
The reflexive seems less exotic in this case:
mè a j'ò 'I have' (cf. Italian [io] ho 'id.')
And the English and Italian translations of this last instance also have a reflexive:
mè a'm so lavê 'I washed myself' (cf. Italian [io] mi sono lavato 'id.')
Romagnol has an inventory of up to 20 contrastive vowels in stressed position, in comparison to Italian's 7.
Unfortunately Wikipedia doesn't list all 20 vowel phonemes. How did the 10 native vowels of Latin become 20 in Romagnol? Are some of the Romagnol vowels from Latin diphthongs?
The most interesting Romagnol vowels are these diphthongs which are
unlike anything in Latin:
ê [ɛə̯] vs. ë [ɛɐ̯]
ô [ɔə̯] vs. ö [ɔɐ̯]
I assume they are phonemes, though Wikipedia represents them with
phonetic brackets. /Və̯/ : /Vɐ̯/ is a fine contrast I've never seen
5. How did Neapolitan
develop this alternation?
luongo [ˈlwoŋɡə] 'long' (masculine)
longa [ˈloŋɡə] 'long' (feminine)
Did an earlier *o break to [wo] before the masculine ending *-o
merged with the feminine ending *-a?
*ˈloŋɡo > *ˈlwoŋɡo > [ˈlwoŋɡə]
*ˈloŋɡa > *ˈloŋɡa > [ˈloŋɡə]
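The ordering matters in that scenario, and a toy derivation makes that easy to see. Here is a Python sketch of my two hypothetical changes (breaking before final *-o, then merger of final vowels as [ə]). The function names are made up for illustration, and the rules are deliberately crude string operations that only need to work on these two forms.

```python
def break_before_final_o(form):
    """Break the first (stressed) o to wo, but only in words ending in -o."""
    if form.endswith("o"):
        return form.replace("o", "wo", 1)
    return form


def merge_final_vowel(form):
    """Merge final -o and -a as schwa."""
    if form[-1] in "oa":
        return form[:-1] + "ə"
    return form


def derive(form, rules):
    """Apply the rules in order and return every stage, input included."""
    stages = [form]
    for rule in rules:
        stages.append(rule(stages[-1]))
    return stages


BREAK_FIRST = (break_before_final_o, merge_final_vowel)
MERGE_FIRST = (merge_final_vowel, break_before_final_o)
```

With `BREAK_FIRST`, `derive("ˈloŋɡo", BREAK_FIRST)` and `derive("ˈloŋɡa", BREAK_FIRST)` reproduce the two chains above. With `MERGE_FIRST`, both inputs end up identical because the conditioning *-o is gone before breaking can apply, so the alternation would have nothing to come from.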
6. While I'm in languages of Italy mode, it just occurred to me that the gorgia toscana is a bit like Jurchen/Manchu in which *p > f (albeit in all environments, not just intervocalically) and *-k- > -h- (see Vovin 1997 for details).
7. I saw a commercial for the IUDs Mirena [məɹiːnə] and Kyleena
[kʰajliːnə]. Those names sound like 'creative' Anglospheric girls'
names. The commercial was aimed at young women. Somebody wanted the
audience to think of IUDs as if they were daughters. The children that
the IUDs are supposed to prevent. Creepy marketing.
19.12.17.xx:51: YELLOW PIG 11/22
<so nggiyan uliya aniya juwa emu biya orin juwe inenggi>
'yellow pig year, ten one month, twenty two day'
1. orin juwe 'twenty two' is a para-Mongolian
(Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin qoyar
'twenty two' containing an unrelated Mongolian word for 'two'. Jurchen juwe
'two' is not to be confused with Jurchen juwa 'ten'.
2. Last night when trying to figure out the Chinese character
spellings for damofo and yumofo,
I typed <fo> into Windows 10's Pinyin IME and was surprised to
see 仸 <PERSON.夭>. 夭 ǎo/yāo/yǎo is normally not phonetic
沃 wò/wù (the first and more common reading is irregular - a loan from another dialect?)
I would have guessed that 仸 was read as something like yao.
Then I learned that 仸 is a variant of 佛 Fó 'Buddha'. 仸 seems to
be a semantic compound with 天 <HEAVEN> slightly altered to 夭. (天
and 夭 are difficult to distinguish in a sans serif font, but in
handwriting, the top stroke of 天 is written from left to right, whereas
the top stroke of 夭 is written from right to left.)
3. Two elephantine surprises last night: Wiktionary notes a subtle difference between 象 <ELEPHANT> in the PRC standard and nom on the one hand and elsewhere in the Sinosphere on the other. Both versions of 象 have the same codepoint.
I am not sure that the PRC and nom really have a distinct version of 象:
my 1971 edition of 新华字典 Xinhua zidian (The New Chinese Character Dictionary) has the 'wrong' form of 象
my modern edition of the nom handbook Ngũ thiên tự (Five Thousand Characters) has the 'wrong' form of 象 in the main text. (The font in the index is too small for me to make out which form is used.)
4. 象 was also formerly a simplification of 像. The Wiktionary entry for 象 says it was a 1964-1986 simplification of 像. Wikipedia mentions two other characters restored in 1986: 覆 and 叠. I am skeptical:
叠 is already simplified. The traditional form is 疊. Sukhanov
(1980: 25) lists an even more simplified form ⿱又冝 which is 俗 'popular'
and not official. (12.18.0:15: My love of variants compels me to
mention other variants of 疊 listed in Wiktionary: 𣆹曡㬪疉.)
My 1971 edition of 新华字典 Xinhua zidian has the characters 像覆叠 as main entries 15 years before 1986. All main entries are in simplified characters which are followed by their traditional forms in parentheses. The entries for 象 and 复 do not list 像 and 覆 in parentheses.
Conversely, DeFrancis' (1996) dictionary substitutes 象 for 像 in
its main body and lists 像 in an appendix of traditional characters.
5. When trying to type 复 fù in Microsoft's Bopomofo IME, I
found 䲁 <FISH.wèi> wèi 'a snake-like fish' as the
64th and last choice for fù. How did 䲁 get in the list? Graphic
confusion with 鮒 <FISH.fù> fù 'a kind of fish'
that is also in the list?
6. Unidentifiable Khitan small script characters I encountered while copying the 契丹小字研究 Qidan xiaozi yanjiu (Research on the Khitan Small Script) hand copy of the epitaph for Emperor 興宗 Xingzong (1015-1054) of the Khitan Empire:
⿱⺌月 (but with a dot instead of two horizontal lines in 月; 2.21.1)
a lookalike of Chinese 七 <SEVEN> (2.24.1)
I assume they must be in the book's indices under more conventional forms - but what are those forms?
Ah, the first was a variant of 298 <co> with a narrower bottom half and a curved lower stroke:
The very block with 298 from Xingzong was even discussed in Kane
(2009: 71). Duh.
The Qidan xiaozi yanjiu hand copy also has some slight variations of characters I do recognize: e.g.,
243 <HEAVEN> and 240 <TEN>
are written with 𠂉 on top instead of ハ. As a result, 243 <HEAVEN> looks like 矢 204 whose phonetic value is unknown. Could 矢 204 be interpreted as 'heaven'?
I still have no idea what 七 is. Not only is it an unusual (for Khitan) shape, but it is also the only top element in a pyramid.
7. The Cantonese-only character 乸 <jaa2.MOTHER> for naa2 'female' has an unusual phonetic 也 jaa5. The rhyme is perfect; the initial is not. 乸 has puzzled me since I first saw it some time ago, but today I just realized that a j-phonetic 也 might have been chosen because there are phonetics representing both j- and n-syllables: e.g., 襄 soeng1 (with s-!) < *sInaŋ in
儴勷攘瀼獽禳穰蘘蠰躟鬤 joeng4 < *ɲɨaŋ < *nɨaŋ
< *CInɨaŋ < *CInaŋ
囊囔瓤饢 nong4 < *naŋ
That j- ~ n- alternation goes back to a single Old Chinese *n- that developed two reflexes: *n- before nonhigh vowels and palatal *ɲ- before high vowels.
也 had Old Chinese *l-, another source of Cantonese j-.
*l-characters normally aren't phonetics in Cantonese n-characters.
Cantonese speakers would not know which j- are from *n- and which j- are from *l-, so whoever came up with 乸 might have thought, 'if 襄 can stand for j- and n-syllables, 也 can too', unaware that 也 jaa5 isn't from *n- (and hence 'shouldn't' represent Cantonese n-syllables).
8. I missed Andrew West's tweet on a cursive Tangut tablet from the Baisigou pagoda.
9. Marijn van Putten on the mystery of Mehmet.
19.12.16.xx:46: YELLOW PIG 11/21
<so nggiyan uliya aniya juwa emu biya orin emu inenggi>
'yellow pig year, ten one month, twenty one day'
1. orin emu 'twenty one' is a para-Mongolian (Khitan?)-Jurchen hybrid. Compare with Written Mongolian qorin nigen 'twenty one' containing an unrelated Mongolian word for 'one'.
2. I wish I could look more into exceptions to 'Altaic' vowel harmony. Two examples that have long stuck in my mind:
Old Turkic -mish ~ -mis (past participle)
(examples from Tekin 1968: 179):
qazghanmish 'acquired' (not qazghanmïš; Bilgä Qaghan East 22)
barmis 'gone' (not barmïs; Bilgä Qaghan East 22)
but other Old Turkic texts have ï where it is
expected: e.g., tughmïsh 'risen' (Mai tH XV 11v22; found
in Erdal 2004: 268)
Turkish anne 'mother' (not anna or enne!)
Clauson (1972: 169) says anaː 'mother' was "sometimes
subjected to unusual deformations, e.g., anne,
to make it a term of more intimate affection" - a phenomenon that is
the opposite of taboo deformation in terms of motivation (though not
More recently I came across Manchu age 'older
brother' (not ege or aga!; see Hauer and
Corff : 7). Rozycki (1983: 22) regards age as somehow
related to Written Mongolian aq-a¹ 'id.':
"The correspondence is ancient and direction of loan impossible to
ascertain." Could this be an anne-like case of intimate
I couldn't find age or other similar Manchu words like ahūn 'older brother' in Doerfer's Mongolo-Tungusica (1985), so I suppose Doerfer does not think there is any connection between the Manchu and Mongolian words.
What finally pushed me to write about Manchu age was seeing Manchu ajige 'small, little, young' (not ejige or ajiga) on Saturday night. Its root is aji-, also found in ajida 'small' and ajigan 'young, small' which are harmonic. majige 'little' is similarly nonharmonic with similar semantics. Are these cases of cute deformation? Imitating the speech of small children who have not yet mastered vowel harmony? I can't quickly find any article on L1 Turkish vowel harmony acquisition (DuckDuckGo results are often unsatisfying), but Leiwo, Kulju, and Aoyama (2006?) cover Finnish vowel harmony:
The data showed that most of Finnish 2;6-year-olds’ productions do not violate FVH [Finnish vowel harmony], suggesting early mastery of FVH. When there were errors in children's productions, they were mostly substitutions of back vowels for the front rounded vowels.
... which is the opposite of the substitution that occurred in
Turkish anne! (Or centuries ago in barmis.)
Unlike Finnish or Turkish, Manchu does not have palatal harmony. Manchu age, etc. have a high series vowel e [ə] in place of its low series counterpart a. But if I 'translate' the Finnish error pattern into Manchu, I would expect substitutions of low series vowels for high series vowels. Which is the opposite of what happened in age, etc.
There is, however, a common denominator: Finnish vowel harmony errors occurred "especially in non-initial syllables and in suffixes" (Leiwo, Kulju, and Aoyama 2006: 151), and the Turkish and Manchu violations above are also in noninitial position: -mis, anne, age.
Katsura is a former classmate of mine.
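The a/e mismatch in the Manchu words above, and the fact that the offending vowel is never the first one, can be checked mechanically. This Python sketch is a toy under a loud assumption: it looks only at a (low series) and e (high series), the two vowels contrasted in this post, and ignores every other vowel. The function names are hypothetical.

```python
# Only the two vowels contrasted in the post are classified;
# every other letter is ignored (an assumption of this sketch).
SERIES = {"a": "low", "e": "high"}


def series_sequence(word):
    """List (string index, series) for each a or e in the word."""
    return [(i, SERIES[ch]) for i, ch in enumerate(word.lower()) if ch in SERIES]


def is_harmonic(word):
    """True if the word does not mix a and e."""
    return len({series for _, series in series_sequence(word)}) <= 1


def first_clash(word):
    """Index of the first a/e that disagrees with the word's first a/e, else None."""
    seq = series_sequence(word)
    for index, series in seq[1:]:
        if series != seq[0][1]:
            return index
    return None
```

By this crude test age, ajige, and majige come out nonharmonic with the clash on a later vowel, while ajida and ajigan pass.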
¹The hyphen is a device to transliterate the
obligatory space in the Written Mongolian spelling <aq a>; it has
no morphological or phonological significance.
3. Looking at Tangut
4440 2len4 'pavilion' (#189 in The Golden Guide)
led me to wonder: Why did Middle English pavilloun become modern English pavilion? Was -i- restored by someone who knew its Latin source pāpiliō 'butterfly'?
4. Today I started copying the epitaph for Emperor 興宗 Xingzong
(1015-1054) of the Khitan Empire. I haven't gotten to line 4 yet, but I
looked ahead and spotted block 24
of line 17.24.
The only other instances of 096 that I know of are in the block
in the epitaphs for Mme. 耶律 Yelü (11.20) and 蕭敵魯 Xiao Dilu (1061-1114; 30.19 and 34.14).
is similar in shape to 095, a lookalike of Chinese 女 <WOMAN>.
095 is more common than 096 and can occur in medial and final positions
in blocks. These different distributive patterns suggest that 096
represents a more complex phonetic sequence than 095 - one that so far
is only known from the beginnings of words. On the other hand, whatever
095 represents may be more complex than, say, 339 which is simply [i]?
Both 095 and 096 probably represent one or more syllables absent from Liao Chinese, as neither appears in Khitan transcriptions of Chinese. They may contain
a non-Liao Chinese consonant (e.g., q)
a non-Liao Chinese vowel
segments shared with Liao Chinese but combined into a sequence absent from Liao Chinese
I doubt that 095 or 096 represent single segments. I suspect that all the single-segment phonograms of the Khitan small script have been found by now.
As far as I know, as of 2016 there were 482
known small script characters including variants. Have any new ones
been found lately? The only new small script texts found lately to the
best of my knowledge are fragments
of jade tablets from a mausoleum. If
this photograph is representative, the texts are too short to be
likely to contain any character that hasn't surfaced in any previously
known, much longer texts.
5. Today I finally got Jun Jiang's Learn
on my iPhone. As neat as it is to see a finger trace strokes on a
screen, I wish I could double-check the direction and order of strokes
with another source. And I'm not yet accustomed to the wheel interface.
6. Today I also got Jun Jiang's Mongolian
Words & Writing app, but I haven't tried it out yet. Users
hoping to learn Mongolian
Cyrillic will be disappointed since the app only covers the traditional
script. I'd like to know how to write Ө <Ö> and Ү <Ü>
in cursive. (The rest of the alphabet is identical to Russian, and I've
been writing Russian in cursive since 1997.)
7. Jun Jiang's store doesn't have any app for Mongolian Cyrillic, but it does have these apps:
Learn Chinese Handwriting ! (with a space before !)
Japanese Kanji Writing (but the icon has hiraganaあ <a>!)
Learn Uyghur Handwriting ! (with a space before !)
Tibetan Words & Writing
Thai Words & Writing
Persian Words & Writing
Learn Zhuang Language ! (with a space before !)
Lao Words & Writing
Learn Khmer Handwriting ! (with a space before !)
Learn Burmese Handwriting ! (with a space before !)
Tamil Words & Writing
Learn Malay Language ! (with a space before !)
Tagalog Language - Filipino
Vietnamese Alphabet & Words
Learn Hokkien Language ! (with a space before !)
Cantonese Words & Writing!
Hebrew Words & Writing
Wu Language - Chinese Dialect
Hakka - Chinese Dialect (no "Language"!)
Korean Hanja Handwriting ! (with a space before !)
I assume those apps have the same interface as the Manchu app.
So much for my original guess that Jun Jiang might be a Manchu and Mongol specialist.
sample of the traditional Mongolian script
is (turn 90 degrees clockwise for the proper orientation - alas, that
way the first line is on the right instead of the left where it should
ᠴᠣᠷᠢ ᠢᠢᠨ ᠭᠠᠭᠴᠠ
cori yin ghaghca
ᠪᠣᠰᠤᠭᠠ ᠪᠢᠴᠢᠭ᠌᠄33'single GEN single': i.e., 'the one and only'
I don't know what is meant by 'one and only' since there are other vertical scripts, and even if one is only thinking of major vertical scripts written from left to right, the Mongolian script is not unique since the Manchu script is written the same way.
ghaghca has a synonym ghanca. How can
that word-medial -gh- ~ -n- alternation be explained -
assuming they are related words?
9. Today while double-checking the Li Fanwen number for the common Tangut character
4457 2leq3 'great'
I found these interesting characters which appear to be semantic
4445 2bi1 = 4457 2leq3 'great' + 2547 1chir2 'right'
4454 2ryr1 = 4457 2leq3 'great' + 2920 1zhyq3 'left'
2920 has the Tangraphic Sea analysis
2920 1zhyq3 'left' = all of 3485 1laq 'hand' + right of 4454 2ryr1
which cannot be taken at face value as the origin of the character - why would a character for a common word 'left' be based on a rare character 4454?
4445 and 4454 are only known as members of these compounds:
4445 0661 2bi1 2ngon4 'South Sea'
4454 0661 2ryr1 2ngon4 'North Sea'
4445 and 4454 are not the normal words for 'south' and 'north' which are
4796 1zyr4 'south' and 0942 1laq3 'north'
Although the Tangut script is thought to be full of semantic compounds, it is curious that 4445 and 4454 - glossed by Li Fanwen (2008: 706-707) as 'south' and 'north' - do not contain any components in common with 4796 and 0942, the graphs for the common words 'south' and 'north'.
Nonetheless Li's glosses make sense: 4445 has the notation
4796 0661 1zyr4 2ngon4 'southern sea'
in Homophones D and is a definition for 4796 'south' in Tangraphic
And if 4454 contains 'left', the opposite of the 'right' in 4445, then
4454 must be 'north', the opposite of 4445 = 4796 'south'. But I am
hesitant to gloss 4445 and 4454 simply as 'south' and 'north'. Maybe
'Great South' and 'Great North' or even as 'Great Right' and 'Great
The association of 'south' with 'right' is reminiscent of Sanskrit dakṣiṇa- 'south/right'. Sanskrit uttara- 'north' can also mean 'left', but the normal word for left is vāma- which does not mean 'south'.
What were the Great South/Right and Great North/Left Seas? Were they
mythical? I don't know much about how the landlocked Tangut perceived
their world. How many Tangut had ever seen a sea? What is the etymology
of 2ngon4 'sea'?
10. Today I saw this passage in Gorelova ( :15; I added the links):
The Mohes [靺鞨] called their tribal leader "damofo mandu" (chin. da [大] "great"), as one can see further, the Southern Shiwei [室韋], who can be identified as people of Tungusic descent, called their tribal chieftains "yumofo mandu".
The language spoken by the Mohe was Tungus-Manchu. What is important to mention is that the language of the Sushen could also be referred to as proto-Tungusic.
During the Tang era, the Mohe, similar to other peoples of northeastern Asia, were subjected to constant political and military pressure from Tang rulers. Soon after the Koguryo state of Korea had been defeated by the Tang empire (668 AD), a large portion of the Koguryo people fled into the lands of the Sumo Mohe [粟末靺鞨]. Soon a lot of towns, surrounded by defensive walls, arose there. Around 700, a new state, "Parhae" (chin. Bohai), raised from the ruins of Koguryo, was established. It was the leader of Sumo Mohe, Cicik Zhungxiang [乞乞仲象] who was considered the creator of Bohai. [...] Later, his grandson, Uazhi Da Tuyu, declared himself the emperor of Bohai, which in the course of time became highly cultured and enlightened, and widely known beyond the borders of the country. The Parhae (Bohai) state—a deserving successor of the culture and power of Koguryo and the tribal league of the Songari Mohe—flourished for 228 years until it was destroyed by the Qitans [Khitans] (926 AD) (Shavkunov, 1968; Crossley, 1997:18; Larichev, 1998:53-4).
What are the characters for damofo mandu and yumofo mandu
which sound like modern Mandarin readings of old Chinese transcriptions?
I was surprised to see the Southern Shiwei described as Tungusic
since their name - roughly pronounced *shirwi in Late Middle
Chinese - is derived from the para-Mongolic autonym Serbi. But
of course names are not reliable guides to linguistic affiliation.
Cicik Zhungxiang is a strange, not-quite-Pinyin romanization of 乞乞仲象 Qǐqǐ Zhòngxiàng with a -k whose motivation is obscure. Assuming the Chinese pronunciation favored in Parhae was like early Sino-Korean, 乞乞仲象 was pronounced something like *kər kər tyung syang. 乞乞 <BEG BEG> looks like an insulting ('derographic') transcription of a non-Chinese (i.e., Mohe) name. 乞乞仲象 is also known as 大仲象 with a Chinese-style surname 大 <GREAT> to go along with the Chinese-style disyllabic personal name 仲象 <SECOND.BORN ELEPHANT>.
Uazhi Da Tuyu is presumably 乞乞仲象's
son (not grandson) 大祚榮 (Mandarin: Dà Zuòróng, Korean: Tae Cho-yŏng; r.
712-719), the first king (not emperor) of Parhae. I have no idea what Uazhi
11. The best for last: I just discovered Andrew
West's Tangraphic Sea search tool! More Tangut
18.104.22.168:55: YELLOW PIG 11/20
<so nggiyan uliya aniya juwa emu biya orin inenggi>
'yellow pig year, ten one month, twenty day'
1. Jurchen and Manchu orin 'twenty' sounds like Written Mongolian qorin 'id.' The pronunciation of Khitan
廿 <TWENTY> (large script)
丁 <TWENTY> (small script)
is unknown; it could have been something like qorin.
Normally Written Mongolian q corresponds to h or k in Manchu.
Rozycki (1983: 11-12) proposes four layers of borrowing into (Jurchen/)Manchu to explain the different correspondences:
Layer 1: Mongolic *q- borrowed as *k- > *x- > *Ø- (within Tungusic): e.g., orin 'twenty'
Layer 2: Mongolic *q- borrowed as *k- > *x- (within Tungusic): e.g., hoton 'city wall' (cf. Written Mongolian qoton 'id.')
Layer 3: Mongolic *q- borrowed as k-: e.g., kobkolo- 'to remove (paper stuck to a surface)' (cf. Written Mongolian qubqol- 'to peel')
Layer 4: modern Mongolic *q- > x- borrowed as h-
This model could be refined: e.g., in the early layers, the
borrowing was probably from para-Mongolic (specifically Khitan) rather
than from Mongolic.
There doesn't seem to be any way to distinguish between layers 2 and
4 on the basis of Manchu evidence. I suppose Rozycki assigns Manchu
words to layer 2 if the borrowings are found elsewhere in Tungusic
(e.g., see Doerfer [1985: 81] for hoton-type
Tungusic words). Layer 2 words were borrowed into early Tungusic,
whereas layer 4 words were borrowed only into (Jurchen/)Manchu.
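The four layers can be sketched in Python as a lookup from layer to the successive stages of Mongolic *q-. This is only my restatement of the summary above, not Rozycki's own formalization. The names `LAYER_STAGES`, `manchu_initial`, and `possible_layers` are made up for illustration, and the empty string stands for zero.

```python
# Stages of Mongolic *q- in each layer, as summarized above.
# "" stands for zero; a final *x stage is spelled <h> in Manchu.
LAYER_STAGES = {
    1: ["k", "x", ""],
    2: ["k", "x"],
    3: ["k"],
    4: ["x"],
}

def manchu_initial(layer):
    """Return the Manchu outcome of *q- for a given layer."""
    final = LAYER_STAGES[layer][-1]
    return "h" if final == "x" else final

def possible_layers(initial):
    """Return every layer whose outcome matches the Manchu initial."""
    return [layer for layer in sorted(LAYER_STAGES)
            if manchu_initial(layer) == initial]

for word in ["orin", "hoton", "kobkolo-"]:
    # Words known to be loans: a vowel-initial word counts as zero initial.
    initial = word[0] if word[0] in "hk" else ""
    print(word, possible_layers(initial))
```

The h-initial case returns both layers 2 and 4, which is the ambiguity noted above: Manchu evidence alone cannot separate them.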
2. The Khitan large script character 廿 <TWENTY> is identical to the standard Chinese character 廿 <TWENTY> which was pronounced *ɲip in Middle Chinese, a fusion of 二 *ɲi̤ 'two' and 十 *dʑip 'ten'. Wiktionary says the expected standard Mandarin reflex is rì, but the actual reflex is niàn because
[t]he irregular pronunciation (e.g. /nVm/ [with the nasal counterpart of the original coda /p/]) dates from the Song dynasty, to avoid homophony with a vulgar word; see 入.
Let's see 入:
The regular Mandarin pronunciation [for 入 <ENTER>] as predicted from Middle Chinese is rì. The irregular sound change [to rù] is for taboo reasons - to avoid homophony with its derived vulgar meaning "to enter > to have sexual intercourse", nowadays represented by 日 (rì).
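Going back to 廿 as a fusion of 二 and 十: a minimal Python sketch, assuming syllables are stored as (onset, rime) pairs. That representation and the function name `fuse` are my own choices for illustration. The Middle Chinese forms are the ones cited above.

```python
def fuse(first, second):
    """Fuse two (onset, rime) syllables: onset of the first + rime of the second."""
    onset, _ = first
    _, rime = second
    return (onset, rime)

two = ("ɲ", "i̤")   # 二 *ɲi̤ 'two'
ten = ("dʑ", "ip")  # 十 *dʑip 'ten'

print("*" + "".join(fuse(two, ten)))  # *ɲip
```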
I would expect 廿 to be nhập in Vietnamese since 二 'two' is nhị and 十 'ten' is thập. Wiktionary lists five Vietnamese readings of 廿:
nhập 'twenty' (?)
trập (first syllable of 廿重 trập trùng
the initial isn't nh-; tr- is normally from Chinese
*retroflexes and native Vietnamese *Cl-clusters
the sắc tone suggests a *voiceless initial even though Chinese 'twenty' had a *voiced initial
are those oddities the product of Vietnamese-internal taboo deformation?
chấp 'twenty' (?)
not listed at nomfoundation.org; presumably an alternate
spelling of trấp; the Hanoi dialect merges tr- and ch-
(< *c- if followed by a sắc tone).
The normal Vietnamese word for 'twenty' is native: 𠄩𨑮 hai mươi
'two ten', which has its own contracted form hăm (with short ă
instead of long a!).
3. Is it obvious to Koreans that the hangul title of the movie 독전 Tokchŏn (English title: Believer) is 毒戰 <POISON BATTLE> tokchŏn rather than 督戰 <SUPERVISE BATTLE> tokchŏn 'urging to fight harder'?
Only the second tokchŏn is in dictionaries. The first tokchŏn is a straightforward Koreanization of the title of its inspiration, the Chinese movie 毒戰 (Mandarin Dúzhàn, Cantonese Duk6 zin3; English title: Drug War).
The fact that some websites call the Korean movie 독전: 마약전쟁 Tokchŏn:
mayak chŏnjaeng 'Poison Battle: Narcotic Wars' implies that Tokchŏn
by itself might need clarification. In hanja that longer title looks
redundant with two 戰 chŏn: 毒戰: 痲藥戰爭.
Korean-English dictionary gave this sentence as an example of tokchŏn:
암튼 '독전' 화이팅 할까요?
Amthŭn 'Tokchŏn' hwaithing halkkayo?
'Anyway, shall we do "Believer" fighting?'
That made me curious about the etymology of 암튼 amthŭn 'anyway'. Is it of recent origin? I couldn't find it in Martin et al.'s massive 1967 Korean-English dictionary or my old portable favorite, Dong-A's 1981 Korean-English dictionary.
I think 암튼 is an extreme example of contraction:
아무리 하려 하면 하든지
amu-ri ha-ryŏ ha-myŏn ha-dŭ-n-ji
any-ADVERB do-INTENTIVE do-CONDITIONAL be-RETROSPECTIVE-MODIFIER-uncertain.fact
Martin et al. (1967: 1093) derive 암 am 'surely' from
which according to Martin et al. (1967: 1073) is in turn a contraction of
아무리 하려 하면
amu-ri ha-ryŏ ha-myŏn
'any-ADVERB do-INTENTIVE do-CONDITIONAL'.
튼 thŭn is a reduction of
Martin (1992: 834) translates -dŭ-n-ji as 'the uncertain fact that it has been observed that', 'whether it was (observed to be/happen)'. -ji can be dropped. That leaves hadŭn /hatɯn/. th- /tʰ/ looks like the product of syncope, metathesis, and fusion:
/hat/ > /ht/ > /th/ > /tʰ/
Metathesis is a regular process in Korean: /hC/ cannot surface as [hC].
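The three steps can be sketched in Python as plain string rewrites. This is a toy illustration of the derivation above, not a general model of Korean phonology. The function names are mine, and each rule is written only for the /hat/ sequence discussed here.

```python
def syncope(form):
    """/hat/ > /ht/: the vowel between h and t is lost."""
    return form.replace("hat", "ht")

def metathesis(form):
    """/ht/ > /th/: h and the following stop swap places."""
    return form.replace("ht", "th")

def fusion(form):
    """/th/ > /tʰ/: the stop and h fuse into an aspirate."""
    return form.replace("th", "tʰ")

def derive(form):
    """Apply the three steps in order and return every stage."""
    stages = [form]
    for step in (syncope, metathesis, fusion):
        form = step(form)
        stages.append(form)
    return stages

print(" > ".join(derive("hatɯn")))
```

Run on /hatɯn/, the chain ends in an aspirated-initial form corresponding to thŭn.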
(12.16.0:16: The reduction of /hat/ to /tʰ/ above parallels the reduction of the first syllable of the Korean root 'to ride' between the 12th and 15th centuries:
12th c. *hʌta- > *hta- > 15th c. tha- /tʰa/
The 12th century form is preserved in Chinese transcription as 轄打 *xjaʔta in Jilin leishi. I have followed the conventional view by reconstructing *ʌ in the first syllable, but now it occurs to me that Chinese *-ja- might reflect a 12th century Korean *(y)e or *yə. Perhaps
pre-12th c. *heta- > 12th c. *h(y)eta- or *hyəta- > *hʌta- > *hta- > 15th c. tha- /tʰa/
I reconstruct *e as a front low series vowel in early Korean:
That *e later broke to yə (= yŏ in my modified McCune-Reischauer romanization), the most common yV-sequence in native Korean words.
In my scenario for 'to ride' above, *(y)e or *yə was reduced to *ʌ, the minimal low series vowel, before being lost. By that point Korean had developed vowel harmony, so the vowel in the first syllable had to be a low series vowel like the *a in the second syllable.)
5. More examples of metathesis in Korean:
암클 amkhŭl or 암글 amgŭl < /am(h) kɯr/
'useless knowledge, female writing, hangul'
수클 sukhŭl or 수글 sugŭl < /su(h) kɯr/
'useful knowledge, male writing, Chinese characters'
That pair of words is not only sexist but also reflects a
The final /h/ of /amh/ 'female' and /suh/ 'male' surfaces as
aspiration following a stop which in this case is the /k/ of /kɯr/
The variants with -gŭl are compounds in which 'female' and 'male' have been reinterpreted as /am/ and /su/ without /h/. /k/ voices after voiced segments: /m/, /u/, and the /n/ of han'gŭl /hankɯr/ 'great/Korean-writing'.
Naver regards the -g-forms (amgŭl, sugŭl) as correct
and states that the -kh-forms are erroneous,
though Martin et al. (1967: 1011, 1095) only list the -kh-forms.
Does that indicate the reanalysis of 'female' and 'male' as being
without /h/ has been completed over the past half-century? Not quite -
the official standard for Korean still requires aspiration in, for
instance, 암캐 amkhae 'female dog' < /amh kɛ/
(not 암개 amgae!) in which am-
is still clearly 'female' (한글 맞춤법 Hangul Spelling 4.4.31 and 표준어규정
Standard Language Code 1.1.7). Perhaps the 'writing' words have lost
their gendered associations.
I found amkhŭl in Martin et al. (1967: 1095) when looking in vain for amthŭn (ㅋ kh is before ㅌ th in Korean alphabetical order).
6. I was surprised to learn that 怒濤 <ANGER WAVE> (dotō)
is a Japanese name for a
kind of Faucaria plant.
(12.16.2:22: The same characters are the Chinese name [Mandarin nùtāo]
The Korean name for Faucaria tuberculosa is a combination of that and the kanji for the Japanese name of Faucaria tuberculosa (荒波 aranami 'wild wave') read in Sino-Korean: 怒濤荒波 nodo hwangpha.
Sino-Korean 怒濤 nodo 'angry wave' by coincidence sounds like
the unrelated native Japanese word 喉 nodo 'throat' - and by
another coincidence, Faucaria
is from Latin fauces 'throat'.
7. I thought faucet might be related to fauces 'throat', and Wiktionary agrees, but Merriam-Webster gives a derivation I don't quite follow:
Middle English, bung, faucet, from Middle French fausset bung, perhaps from fausser to damage, from Late Latin falsare to falsify, from Latin falsus false
turns out to be from falsus too.
The bottom of Merriam-Webster's entry for faucet led me to their Time Traveler feature showing what words were first attested in English in a given century: e.g., the 15th century (faucet, favored, feasible ...).
22.214.171.124:51: YELLOW PIG 11/19
<so nggiyan uliya aniya juwa emu biya oniohon inenggi>
'yellow pig year, ten one month, nineteen day'
1. Jurchen oniohon 'nineteen' is unlike either Manchu juwan uyun 'ten nine' or Written Mongolian arban yisün 'ten nine'. It is a loan from some para-Mongolic language (presumably a nonstandard variety of Khitan) whose morpheme for '-teen' was something like *-hon (or *-kon if the Jurchen word was borrowed before the weakening of *-k- to -h- in Jurchen); yesterday's day number niuhun 'eighteen' has a high vowel harmonic variant of '-teen'. Janhunen (2003: 399) believes the Jurchen words for 'eighteen' and 'nineteen' have the same root before '-teen':
*o + nya(y)i.ku/n '? + eight-teen'
Could *o be related to Proto-Mongolic *onca 'unique'? If the root of unique is 'one', perhaps *o is 'one'.
What's not clear to me is why *a(y)i corresponds to u and o in Jurchen. Did *a(y)i reduce to a single vowel that assimilated to surrounding vowels (the *u of '-teen' and *o- 'one')?
2. I got interested in Southern American English vowels
a year before I fell in love with Tangut in 1996. It's taken me 24
years to wonder if complex 'drawled' diphthongs like [æ̠ɛæ̠] in
Southern American English might have parallels in Tangut. If they do,
there would probably be no way to reconstruct them since no fine
phonetic notation for Tangut has survived. A simple-looking Tibetan
transcription of a Tangut rhyme like <e> might conceal something
like [æ̠ɛæ̠] or even ဇိုင်ဂူ <zuiṅ gū> Mon [ʌei̯a] (Diffloth 1984: 53¹, 226).
Southern American English [æ̠ɛæ̠] goes back to *æ (and before that, *a) and <zuiṅ gū> Mon [ʌei̯a] goes back to Proto-Monic *-iəw (Diffloth 1984: 226). So I presume that similarly complex Tangut vowels also had simpler origins. I still reconstruct only six vowels in pre-Tangut: *u *i *a *ə *e *o.
¹Diffloth (1984: 53) uses an underscore to
indicate "that portion of the vowel which is loudest", whereas I
presume the underscore in Southern American English [æ̠ɛæ̠] is the IPA retraction symbol.
3. What kind of name is Onreitt? Onreitt Murtagh's name was so unlike those of her sisters Jean and Kate (the latter on the cover of Supertramp's Breakfast in America).
4. I just realized that methinks
only has third person singular forms (methink is a mistake)
the past form methought is presumably also third person singular even though any other person/number combination would be methought
never takes a subject (though me- is a built-in object no longer written separately)
doesn't seem to take any modals, so it has no future - no equivalent of 'it will seem to me'
Also found two other similar defective verbs: meseems and the pseudoarchaic (and obsolete) mehopes.
126.96.36.199:59: YELLOW PIG 11/18
<so nggiyan uliya aniya juwa emu biya niuhun inenggi>
'yellow pig year, ten one month, eighteen day'
1. Tonight Stephen Colbert made a joke about the new Finnish prime minister Sanna Marin using the pseudo-Finnish phrase Okey Bøömer.
That phrase is so un-Finnish - even un-Scandinavian:
No language in Scandinavia uses <y> for [j]; <y> = [y]
but the joke is for English speakers, and most don't know that <j> = [j], so <y> was inevitable
No language in Scandinavia has both <ø> and <ö>, much less both together in the same word: e.g., Finnish only has <ö>
I doubt a more pseudo-Finnish Okej Buumer would have amused as many English speakers, though.
2. How is a violin like a prison?
Votre Nicolas est au violon de la ville
'Your Nicolas is at the violin [i.e., in the prison] of the town'
- Erckmann-Chatrian, Histoire d'un paysan, 1789-1815
3. Some interesting Cantonese characters:
3a. Cantonese me1 'to carry on the back'
has several spellings:
孭 = 子 <CHILD> + 貝 bui3
b-phonetics normally don't represent m-syllables
I would have made up something like 子 <CHILD> or 扌 <HAND> (cf. 揹 below) + 乜 me1
Tonight I realized this might be a simplification of 𡥼 (see below) facilitated by the vague phonetic similarity between me1 and 貝 bui3 (labial followed by rhyme with a palatal element)
𧴯 = 貝 bui3 + 子 <CHILD>
order of elements of 孭 reversed with 子 in the
𡥼 = 子 <CHILD> + 負 <CARRY>
graphically similar to 孭 which may be a simplification of 𡥼; is there philological evidence for 𡥼 being the oldest spelling?
揹 = 扌 <HAND> + 背 <BACK>
recycling of existing character for bui3, the Cantonese pronunciation of a character for Mandarin bēi 'to carry on the back' - in other words, native me1 is being written with a character for bēi, a rough translation equivalent in another language.
踎 = 足 <FOOT> + 否 <NOT> fau2
f-phonetics normally don't represent m-syllables
12.14.0:19: 否 is also read pei2; p- is a better phonetic match for m- than f-, but the rhyme -ei2 is a poor match for -au1
I would have made up something like 足 <FOOT> + 牟 mau4 or 某 mau5 (there is no phonetic mau1, though there is a 哞 mau1 with <MOUTH>)
188.8.131.52:59: YELLOW PIG 11/17
<so nggiyan uliya aniya juwa emu biya darhon inenggi>
'yellow pig year, ten one month, seventeen day'
1. Why isn't ambush embush?
2. Today I realized that Sanskrit and Okinawan represent two opposing approaches to mid vowel elimination:
Sanskrit neutralization: *e, *o > *a
Okinawan polarization: *e > i, *o > u
Sanskrit has neutralization in two senses:
a palatal and labial vowel became a neutral vowel.
a distinction between two vowel phonemes became neutralized
Okinawan vowels were polarized in the sense that they moved toward the points of the vowel triangle and away from the neutral center.
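The two approaches can be sketched in Python as vowel mappings. The starting five-vowel inventory is an assumption for illustration; only the changes to *e and *o come from the summary above. The names `SANSKRIT`, `OKINAWAN`, and `shift` are mine.

```python
# Mid vowel elimination as two different mappings.
SANSKRIT = {"e": "a", "o": "a"}  # neutralization: both mid vowels merge with a
OKINAWAN = {"e": "i", "o": "u"}  # polarization: mid vowels rise to the corners

def shift(vowels, change):
    """Apply a vowel change; vowels not mentioned in it stay as they are."""
    return [change.get(v, v) for v in vowels]

# Assumed starting inventory, for illustration only.
inventory = ["i", "e", "a", "o", "u"]

print(shift(inventory, SANSKRIT))  # ['i', 'a', 'a', 'a', 'u']
print(shift(inventory, OKINAWAN))  # ['i', 'i', 'a', 'u', 'u']
```

Both mappings leave the same three vowels i, a, u. They differ in which older distinctions are lost: Sanskrit merges *e and *o with each other (and with *a), whereas Okinawan keeps them apart by merging them with *i and *u.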
Both Sanskrit and Okinawan then developed new long vowels from vowel sequences:
*ai > eː
Sanskrit *aika- > eːka- 'one'
Okinawan *aite > eːti 'partner'
*au > oː
Okinawan *augi > oːji 'fan'
(Normally, length in Sanskrit e and o is left unmarked because those vowels are always long, but I have marked their length here and below for clarity.)
Even later, Pali and modern Okinawan developed short e and o:
Sanskrit jeːṣṭha- > Pali jeṭṭha- 'elder'
Sanskrit oːṣṭha- > Pali oṭṭha- 'lip'
Okinawan (y)eigo 'English' (borrowed from Japanese eigo 'id.')
Pali shortened long eː and oː in closed syllables to avoid overlong syllables (long vowels followed by codas).
Okinawan borrowed Japanese short e and o without modification in 'English'.
Some native Okinawan words seem to have Pali-style shortening of overlong syllables:
fensa < *feːnsa < *payambusa? 'peregrine falcon'
yonnaː ~ yoːnnaː 'slowly'
However, unlike Pali, Okinawan does permit overlong syllables: e.g., the yoːn- of 'slowly' above and yoːn 'lightly, gently, weakly'.
184.108.40.206:51: YELLOW PIG 11/16
I can't decide on a title, so I'm bringing back a generic Jurchen date title since 2019 is the 1000th anniversary of the Jurchen large script:
<so nggiyan uliya aniya juwa emu biya nilhun inenggi>
'yellow pig year, ten one month, sixteen day'
1. Yesterday I ran out of time to write about the ᠣᠯᠬᠣᠨᠣᠳ <ulqunut> Olqonud, the tribe of Genghis Khan's mother Höelün. The Mongolian Wikipedia article about that tribe is titled Олхонууд <Olxonuud> with a long vowel уу <uu>. ууд <uud> looks like a plural ending, so I suppose Olqonud is 'the Olqons'.
How far back does that long vowel go? Janhunen (2003: 5) writes,
In spite of claims made to the contrary, it has been impossible to establish any quantitative correlation for the Proto-Mongolic vowels. While virtually all the Modern Mongolic idioms have distinctive long (double) vowels, these are of a secondary contractive origin. Occasional instances of irregular lengthening are observed in most of the modern languages, and in a small number of cases there would seem to be a correspondence between two peripheral languages, notably Dagur and (Huzhu) Mongghul, as in Dagur mood ‘tree, wood’ = Mongghul moodi id. < *modu/n. In spite of the seemingly perfect match, such cases are too few and involve too many counterexamples to justify any diachronic conclusion other than that of accidental irregular convergence.
Having said that, Janhunen (2003: 45) goes on to reconstruct a long vowel in *-UUd from an even earlier *-U-d. *-U- (later *-UU-) is a linker vowel of unspecified 'phonological gender' inserted between a final consonant and the plural ending *-d. *-U- is *-u- after masculine vowel stems and *-ü- after feminine vowel stems: e.g. (examples added 12.12.2:03),
*nom-ud 'books' (Janhunen 2003: 12); now Khalkha номууд <nomuud>
*cerix-üd 'soldiers' (Janhunen 2003: 64); now Khalkha цэргүүд <cergüüd>
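The choice of linker vowel can be sketched in Python as follows. I assume that a, o, u count as masculine, that e, ö, ü count as feminine, and that i is neutral. That vowel classification and the function name `plural` are my own shorthand and not Janhunen's notation. The two test stems are the ones cited above.

```python
U_UML = "\u00fc"  # ü
O_UML = "\u00f6"  # ö

# Assumed classification for illustration; i is treated as neutral.
MASCULINE = {"a", "o", "u"}
FEMININE = {"e", O_UML, U_UML}

def plural(stem):
    """Attach *-U-d, resolving *-U- by the first harmonic vowel of the stem."""
    for ch in stem:
        if ch in FEMININE:
            return f"{stem}-{U_UML}d"
        if ch in MASCULINE:
            return f"{stem}-ud"
    raise ValueError("stem has no harmonic vowel")

print(plural("nom"))    # nom-ud
print(plural("cerix"))  # cerix-üd
```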
Why would a linker vowel become long?
There is another Written Mongolian plural suffix ᠨᠤᠭᠤᠳ /ᠨᠤᠭᠦᠳ <nughut>/<nugut> which Janhunen reconstructs as *-nUUd (not *-nUgUd!). I guess <gh>/<g> is an orthographic pseudoarchaism: the logic being '-UU- is a long vowel, and long vowels in speech often correspond to <VghV>/<VgV> in writing, so -UU- should be written as <VghV>/<VgV> too': e.g. (examples added 12.12.2:21),
<yaghan nughut> jaghan nugud 'elephants', now Khalkha заанууд <zaanuud>
<cacag nugut> ceceg nügüd 'flowers', now Khalkha цэцэгнүүд <cecegnüüd>
¹Manchu moo 'tree, wood' also has a long vowel. Loanword or cognate? But I digress.
2. On Monday it took me a moment to realize that 홋카이도 <h.o.s kh.a Ø.i t.o> Hotkhaido on a sign in Honolulu stood for 'Hokkaido'. That got me thinking about the many ways kana have been transcribed in hangul. Although Japanese and Korean are typologically similar in many ways and also share a large amount of vocabulary of Chinese origin, they have very different phonological systems: e.g.,
Japanese has a two-way distinction between voiceless and voiced obstruents: /k/ vs. /g/.
Korean has a three-way distinction between plain, aspirated, and reinforced obstruents: /k/ vs. /kʰ/ vs. /k͈/. Plain obstruents have voiced allophones after voiced segments: /k/ can be [g] (but there is no phoneme /g/).
One challenge for Korean transcribers of Japanese is distinguishing
between Japanese voiceless and voiced obstruents. Here are several
solutions to the problem from Wikipedia. I use /k/ and /g/ as examples:
|1986 South Korean standard
|2001 North Korean standard
|Japanese colonial standard
|Korean Language Society
|1948 South Korean standard||/k/
|1963 South Korean standard
|Chhoe Yŏng-ae and Kim Yong-ok
Japanese noninitial /k/ cannot be precisely replicated in Korean. The majority solution is to Koreanize it as /k/ even though Korean /k/ is voiced [g] in that position. The current South and North Korean standards Koreanize Japanese /k/ as voiceless /kʰ/ and /k͈/, respectively. Compare:
| | /naka/ [naka] 'middle' | /naga/ [naga] ~ [naŋa] 'long' |
| South Korea (1986) | /nakʰa/ [nakʰa] | /naka/ [naga] |
| North Korea (2001) | /nak͈a/ [nak͈a] | /naka/ [naga] |
Japanese initial /g/ also cannot be precisely replicated in Korean.
The majority solution is to Koreanize it as /k/ [k].
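The noninitial comparison of 'middle' and 'long' above can be sketched in Python. This covers only medial /k/ and /g/ under the two current standards; initial stops are deliberately left untouched. The input is assumed to be a simple phonemic string with a vowel before the medial stop. The dictionary keys and the function name `koreanize` are made up for illustration.

```python
ASPIRATED = "k\u02b0"   # kʰ
REINFORCED = "k\u0348"  # k͈

# Korean phoneme chosen for a Japanese medial velar stop.
MEDIAL = {
    "south_1986": {"k": ASPIRATED, "g": "k"},
    "north_2001": {"k": REINFORCED, "g": "k"},
}

def koreanize(word, standard):
    """Return (phonemic, phonetic) Korean forms of a Japanese phonemic string.

    Only noninitial k and g are changed; every other segment is copied.
    """
    table = MEDIAL[standard]
    phonemic, phonetic = [], []
    for i, seg in enumerate(word):
        if i > 0 and seg in table:
            korean = table[seg]
            phonemic.append(korean)
            # Plain /k/ surfaces as [g] after a voiced segment.
            phonetic.append("g" if korean == "k" else korean)
        else:
            phonemic.append(seg)
            phonetic.append(seg)
    return "".join(phonemic), "".join(phonetic)

for standard in MEDIAL:
    print(standard, koreanize("naka", standard), koreanize("naga", standard))
```

In both standards the Korean rendering of /naga/ is phonemically /naka/, so the Japanese voicing contrast survives only because Japanese /k/ has been moved to a different Korean series.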
The most interesting solution is the colonial one: Japanese /g/ is
transcribed as <k> with a circular diacritic. I presume
<°k> was to be read as [g] even in initial position. There are
two interesting things about that diacritic. First, <°> in
Japanese indicates a voiceless stop [p], not voiced obstruents. Second,
it is placed to the top right of kana in Japanese, not the top left. I
suspect a circle was chosen because it was a shape that already existed
in hangul unlike the Japanese voicing diacritic ゛.
Japanese noninitial /g/ can also be pronounced as [ŋ], but that nasal variant is not reflected in any of the above Koreanizations, even though Korean does have /ŋ/ [ŋ] in noninitial position: e.g., Japanese [naŋa] 'long' sounds like Korean 낭아 <n.a.Ø Ø.a> /naŋa/ [naŋa].
220.127.116.11:35: ANOTHER EMPRESS XUANYI (PART 2)
The Japanese Wikipedia has yet other renderings of the name of Genghis Khan's mother Höelün Üjin 'Lady Hoelun', a.k.a. Empress 宣懿 Xuanyi:
ホエルン <hoerun> (the title of the article itself)
Old Mandarin 月也倫 *ɥe je lun (from 元史 Yuan shi [History of the Yuan Dynasty, 1370])
اولون فوجین <ʔwlʔwn fwjyn> (source unspecified)
The katakana spelling looks like a transliteration of Höelün sans diacritics.
The Old Mandarin spelling 月也倫 *ɥe je lun has front vowels unlike the Secret History spelling 訶額侖 *xo o lun. The first spelling seems to represent [øelyn], whereas the second spelling might represent [hoəlun]. Do the spellings represent two different Mongolian dialects: one with Turkic-style palatal harmony and another with height harmony?
|palatal harmony dialect
|height harmony dialect
The first character of the Yuan shi transcription is crucial: Old Mandarin 月*ɥe cannot stand for a simple [o] which would have been transcribed as Old Mandarin *o. And Mongolian vowel harmony dictates that vowels within a word must match in terms of 'gender': feminine [ø] must be followed by feminine [e] and [y]. Old Mandarin had no syllable *e, so 也 *je was the best available approximation of [e]. Old Mandarin had no syllable *lyn, so 倫 *lun was the best available approximation of [lyn].
The second character of the Secret History spelling 訶額侖 is also crucial: Old Mandarin 額 *o cannot stand for a simple [e] which would have been transcribed as Old Mandarin *je. Old Mandarin had no syllable *ə, so 額 *o was the best available approximation of [ə]. In theory 額 *o could even represent [o] or [ø], but the Written Mongolian spelling <a> for this vowel rules out rounded vowels which would have been spelled as <ui>. The other characters are ambiguous out of context:
Old Mandarin 訶 *xo could represent [ho], [hø], or [hə].
But the Written Mongolian spelling <ui> for this syllable rules out a nonlabial vowel which would have been spelled as <a>.
Old Mandarin 侖 *lun could represent [lun], [lyn], or [lʊn].
But if the preceding two vowels are feminine, then the vowel of this syllable also has to be feminine, so [lʊn] with a masculine vowel can be ruled out, and the choice of [lun] or [lyn] depends on whether the name had height or palatal harmony.
The Arabic script transcription is ambiguous: <ʔwlʔwn> could represent either [øelyn] or [oəlun] - or even other possibilities that the Chinese and Mongolian spellings rule out: e.g., [ulun].
The Arabic script transcription <fwjyn> looks like a straightforward transcription of Old Mandarin 夫人 *fu žin 'lady' rather than the Mongolian borrowing of that Chinese word as üjin 'id.'
2. The English Wikipedia article on Höelün says
also had a nephew named Palchuk who married a sister of Genghis Khan (Temülün, whose name is misspelled as "Temulin")
The name Palchuk has an un-Mongolian initial p-. Earlier *p- became h- or zero in Mongolian. If Palchuk isn't Mongolian, what is it? It sounds Ukrainian. But seriously ...
The Japanese Wikipedia, on the other hand, says Genghis Khan's sister 帖木倫 Temülün married 不禿 Butu of the Ikires. Höelün was of the Olqonud, not the Ikires, so a nephew of Höelün would be likely to be of the Olqonud too.
18.104.22.168:46: ANOTHER EMPRESS XUANYI (PART 1)
When I refer to "Empress 宣懿 Xuanyi" on this blog, I refer to 蕭觀音 Xiao Guanyin (r. 1055-1075) of the First Khitan Empire.
But it turns out there are two other Empress Xuanyis:
Today the spelling of Höelün Üjin 'Lady Hoelun' in a 1908 edition of the Secret History of the Mongols caught my eye:
Old Mandarin *o o lun u tʂin
Where's the Old Mandarin *x- that should correspond to Middle Mongolian (MM) h-? 阿 *o looks like an error for 訶 *xo.
If the Secret History were all that remained of Mongolian, we might have to assume Höelün was Xoolun. How do we know Xoolun stood for Höelün? Even if we didn't have modern Mongolian Өэлүн <Öelün>, we could still get closer to the original via the Written Mongolian (WM) spelling ᠥᠭᠡᠯᠦᠨ <uikalun>:
WM has zero corresponding to MM h-
WM <ui> could stand for either ü or ö, but the Chinese spelling rules out a high vowel
WM <k> could stand for either k or g, but a g is more likely to lenite intervocalically to the zero of MM (but isn't WM g : MM Ø irregular?)
WM <a> could stand for either a or e out of context, but if preceded by <ui>, it must stand for e
WM <u> could stand for u, ü, o, or ö out of context, but if preceded by <ui>, it must stand for ü or ö, and the Chinese spelling rules out a mid vowel
Putting the MM and WM evidence together, I could reconstruct a Proto-Mongolic name *Högelün. (There is no 'Old Mongol'.) *h- goes back to an even earlier *p-.
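The narrowing-down above can be sketched as a small constraint filter. This is only an illustration of the reasoning, not a tool I actually used: the letter values and the three constraints come from the list above, while the names (`LETTER_VALUES`, `candidate_readings`) are my own invention. The preference for g over k is a matter of likelihood rather than a hard rule, so the sketch leaves both in.

```python
from itertools import product

U_FRONT = "\u00fc"  # ü
O_FRONT = "\u00f6"  # ö

# Possible sound values of each Written Mongolian letter in <uikalun>.
LETTER_VALUES = [
    ("ui", [U_FRONT, O_FRONT]),
    ("k", ["k", "g"]),
    ("a", ["a", "e"]),
    ("l", ["l"]),
    ("u", ["u", U_FRONT, "o", O_FRONT]),
    ("n", ["n"]),
]

FRONT = {U_FRONT, O_FRONT, "e"}
HIGH = {"u", U_FRONT}
MID = {"o", O_FRONT}


def candidate_readings():
    """Enumerate every reading of <uikalun> and keep those that survive."""
    survivors = []
    for combo in product(*(values for _, values in LETTER_VALUES)):
        first, _, second, _, third, _ = combo
        if first in HIGH:
            continue  # the Chinese spelling rules out a high first vowel
        if second not in FRONT or third not in FRONT:
            continue  # vowels after <ui> must be front (feminine)
        if third in MID:
            continue  # the Chinese spelling rules out a mid final vowel
        survivors.append("".join(combo))
    return survivors


READINGS = candidate_readings()
```

Only two of the sixteen combinations survive: the k-reading and the g-reading of ö-e-l-ü-n. Adding the MM h- to the g-reading gives *Högelün.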
The word üjin (WM <uijin>) 'lady' is a borrowing from Late Middle Chinese or Liao Chinese 夫人 *fuʐin 'id.' That word was also borrowed into Khitan as
I'm surprised Chinese *f- wasn't similarly borrowed into pre-MM as *p- which would have become MM h-, not zero. Was *fuʐin borrowed into pre-MM as üjin without any initial consonant?
22.214.171.124:56: THÁNH GIÓNG (PART 2)
1. How was the name 揀 Gióng pronounced in earlier Vietnamese? The vowel *ɔ is certain. The rest is not:
gi- could be from *kj-, *CVc-, or *pl-
-ng could be from *-ŋ or *-n
Nom spelling variants of Gióng might be able to narrow down the possibilities:
𢫝 with phonetic 冬, presumably here an abbreviation of 終 (Sino-Vietnamese chung < *c-) rather than 冬 (Sino-Vietnamese đông)
𢫝 also represents đong 'to measure' and in that case, the phonetic 冬 must be 冬 đông and not short for 終 chung
𢶢 with phonetic 衆 (and graphic variants; Sino-Vietnamese chúng < *c-)
All three types of variants share the semantic element 扌 <HAND> since gióng means 'to beat (a drum)'. So does the name Gióng mean 'The Drumbeater', or is it an unrelated homophone written with characters originally devised for gióng 'to beat (a drum)'? (No, see below.)
1a. The initial of Gióng
The Sino-Vietnamese reading of 揀 is giản < *kj-
But 𢫝 and 𢶢 have ch- < *c-phonetics which would seem to point to *CVc-
The absence of spellings with labial phonetics might rule out *pl-. Compare the spellings of Gióng with those of giai ~ trai < *pl- 'male' which have both labial and velar phonetics:
𪩭 < 巴 ba + 來 lai, the most conservative and hence the oldest spelling in this list. Wish I had manuscript dates to back up that claim, though.
this spelling must postdate *kj- > gi- and *pl- > gi-: i.e., it presumably must be newer than 𪩭
佳 giai 'good' (recycled for giai 'male')
another spelling presumably newer than 𪩭
𪟦 < 男 <MAN> + 佳 giai
I assume 隹 is derived from 佳, as 隹 chuy doesn't sound like giai
I wonder if 扶董 <HELP SUPERVISOR> Phù Đổng, another name
of Gióng, might be a phonetic transcription of an older sesquisyllabic
or disyllabic labial-initial form. The transcription might only loosely
resemble the original if the transcription characters were chosen to be
meaningful at the expense of phonetic fidelity.
Trần Quốc Vương's "The Legend of Ông Dóng from the Text to the Field" (1995) has made me rethink everything I just wrote above. Here's what I think happened now:
The original name was something like *pVtɔ́ŋ (with *-t-, even though I hadn't even considered that as a possibility when I started this post!)
That name was approximated in Sino-Vietnamese as 扶董 Phù Đổng, a spelling
whose first known attestation is as a village name in 安南志略 An Nam chí lược (Abbreviated Records of Annam; 1333). It's unclear whether the village is named after the man or vice versa.
I suspect the spelling 扶董 may go back centuries to a period when it was pronounced with an initial unaspirated stop and a *sắc tone on the second syllable: *buətɔ́ŋ (as an approximation of an early Vietnamese *pətɔ́ŋ? - there was no early Sino-Vietnamese syllable *bə or *pə)
*-t- lenited intervocalically to *-d-. Later the preceding syllable was lost, and *d- weakened to *ð- and shifted to [z] in the north where the Gióng legend originated.
In northern Vietnamese, d-, gi-, and r- all merged into [z] (though the distinction remains in spelling which is largely etymological).
The name [záwŋ͡m] was spelled as Gióng because the young Gióng was "put [...] in a hammock tied to a gióng tre, the trunk of a bamboo [tre]." (p. 18)
But a more etymological spelling would be Dóng with a d- < *pVt- (in this case; other instances of etymological d- are from *CVt-, *CVd-, and *j-).
Trần and Cao Huy Đỉnh (1967) think Dóng is related to dông 'storm', but the vowels and tones do not match, so I think the words are unrelated. I can't find any Vietnamese word dóng other than the name, but I wonder if the name might have cognates in other Vietic languages.
1b. The coda of Gióng
The oldest spelling 扶董 points to *-ŋ. So do 𢫝𢶢. 揀 has an -n-phonetic, but is overruled by 扶董; it must be a later spelling created by someone speaking a nonnorthern dialect in which *-n shifted to [ŋ].
(Note that -on and -ong have not become homophonous in any dialect as far as I know: the distinction between the two in nonnorthern dialects is [ɔŋ] vs. [awŋ͡m] corresponding to [ɔn] vs. [awŋ͡m] in the north.)
1c. A chronology of spellings of Gióng:
𢫝𢶢: *c-phonetic spellings postdating the merger of gi- < *CVc- and d- < *pVt-
揀: *kj- and -n-phonetic spelling postdating the merger of *gi- < *kj- and d- < *pVt- and reflecting a nonnorthern dialect in which *-n and *-ŋ merged as [ŋ]; the spelling most divergent from the original pronunciation and hence perhaps the newest
Trần (1995: 27) "would like to conclude that the impact of Indra [of the Cham] on the portrayal of Phù Đổng is undeniable. In other words, Phù Đổng Thiên Vương [Heaven King] is, in fact, the Vietnamese metamorphosis of Indra."
I'd like to read an article on the Cham element in Vietnamese culture. Unfortunately recovering similar substratal elements in the Korean and Japanese cultures would seem to be more difficult given the extinction of other cultures on the peninsula and in the islands; we can't say belief or practice X is from Y if we don't even know what Y is like.
Sino-Vietnamese 天 Thiên 'heaven' in 扶董天王 Phù Đổng Thiên Vương 'Heaven King Phù Đổng' sounds like pʰatʰɛ̂ːn 'sky' in the Vietic language Thavung which unlike Vietnamese doesn't have an enormous number of Chinese borrowings. It took me almost three hours to realize that pʰatʰɛ̂ːn is a borrowing from Lao ຟ້າແຖນ [fȃː tʰɛ̆ːn] 'sky' (poetic), a synonym compound of native Lao [fȃː] 'sky' and [tʰɛ̆ːn], a Lao borrowing from Chinese. Were all Thavung words of Chinese origin borrowed recently through Lao?
126.96.36.199:57: THÁNH GIÓNG (PART 1)
1. Most of the characters of Vietnam's mythical past have anachronistic Sino-Vietnamese names. One exception is 聖揀 Thánh Gióng 'Sage Giong'. 聖 is of course a Sino-Vietnamese title for 'sage' and not a name. But Gióng is an indigenous name with an unusual nom spelling 揀.
The Sino-Vietnamese pronunciation of 揀 is giản with -n, not -ng. Normally Chinese characters with Sino-Vietnamese readings ending in -n are not used to write native Vietnamese words ending in -ng. But in this case and others like it, I wonder if whoever chose -n characters for native -ng words spoke a dialect with an [n] > [ŋ] shift: i.e., a dialect in which 揀 giản sounded like giảng which is closer to Gióng. Codas in central and southern dialects have undergone a chain shift:
[ɲ] > [n] > [ŋ]
[c] > [t] > [k]
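A minimal sketch of that chain shift, assuming syllables are given as IPA strings whose last symbol is the coda. The function name and the toy syllables are mine. The point it illustrates is that a chain shift applies every step at once: old [ɲ] becomes [n] and stops there instead of going on to [ŋ].

```python
# Each coda moves exactly one step along its chain.
CODA_SHIFT = {
    "ɲ": "n",
    "n": "ŋ",
    "c": "t",
    "t": "k",
}


def shift_coda(syllable: str) -> str:
    """Apply the central/southern coda chain shift to one syllable."""
    if syllable and syllable[-1] in CODA_SHIFT:
        return syllable[:-1] + CODA_SHIFT[syllable[-1]]
    return syllable  # other codas (and open syllables) are unaffected


# A toy -n syllable comes out with -ŋ, which is how a -n character
# like 揀 giản could have sounded like giảng to such a speaker.
EXAMPLES = {s: shift_coda(s) for s in ["an", "aɲ", "at", "ac", "am"]}
```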
If 揀 originated as a nonnorthern spelling, how was Gióng spelled in the north?
Nom is usually treated as a single body of characters even though it
was in use for centuries throughout Vietnam. I'd like to see that body
analyzed into geographical, dialectal, and chronological strata. Nom
could tell us about when and where sound changes occurred: e.g., when
and where -n and -ng characters were first confused (implying [n] > [ŋ]).
2. Can you guess what a kamprag is?
188.8.131.52:57: HOW AI COULD ELUCIDATE THE ORIGINS OF THE KHITAN AND JURCHEN SCRIPTS
Last night I somehow got the idea that the Jurchen phonogram
might be related to Chinese 恭 <REVERENT> (old and calligraphic forms) rather than 拳 <FIST> (as I wrote about two days ago). I don't know how 恭 <REVERENT> got into my head. I thought I might have seen it when I was scrolling through Wells' "" (2011), but it's not actually there.
In Late Middle Chinese, 恭 <REVERENT> was pronounced *koŋ (cf. Sino-Korean 공 kong) - a good match for Jurchen [kɔɴ]. No need to invoke Old Chinese as I did with 拳 <FIST>. The Jurchen character could simply be based on a Parhae variant of Late Middle Chinese 恭 *koŋ.
Tonight I was wondering if a machine could be 'taught' to find potential Chinese graphic cognates for Khitan and Jurchen characters. One could start training it with Khitan and Jurchen characters identical in shape with Chinese characters: e.g.,
Then one could introduce Khitan and Jurchen characters nearly identical in shape with Chinese characters: e.g.,
Ultimately, one would then 'feed' the machine Khitan and Jurchen characters that have no obvious Chinese graphic cognates and 'ask' the machine if there are any near-matches: e.g., for Jurchen <gon>.
One could even add a phonetic dimension to the search process and get the machine to favor potential Chinese graphic cognates with readings close to the Khitan or Jurchen readings (whenever known).
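Here is a toy sketch of how the shape and sound dimensions might be combined into one ranking. Everything in it is an assumption made for illustration: the function names are mine, the shape scores are placeholders set to the same value (I have no real measurements of graphic similarity), and the phonetic score is a crude string comparison from Python's `difflib`, not a real model of phonetic distance. The readings are the ones cited on this blog: Jurchen [kɔɴ], Late Middle Chinese 恭 *koŋ, and Liao/Jin Chinese 拳 [kʰɥen].

```python
from difflib import SequenceMatcher


def phonetic_similarity(a: str, b: str) -> float:
    """Crude stand-in for phonetic closeness: shared-symbol ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def rank_candidates(target_reading, candidates, shape_weight=0.5):
    """Rank (graph, reading, shape_score) triples by a weighted total score."""
    scored = []
    for graph, reading, shape_score in candidates:
        sound_score = phonetic_similarity(target_reading, reading)
        total = shape_weight * shape_score + (1 - shape_weight) * sound_score
        scored.append((graph, round(total, 3)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Shape scores of 0.5 are PLACEHOLDERS, not measurements.
CANDIDATES = [
    ("拳", "kʰɥen", 0.5),
    ("恭", "koŋ", 0.5),
]

RANKED = rank_candidates("kɔɴ", CANDIDATES)
```

With equal placeholder shape scores, the ranking is decided by sound alone, and 恭 comes out ahead of 拳. A real system would replace the placeholders with scores from an image model trained on the identical and near-identical pairs.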
Thirty-five years ago today, I got my first exposure to 連環畫 lianhuanhua - a copy of The People's Comic Book translated by Endymion Wilkinson - at a school fundraising carnival, unaware that just a short walk away, the University of Hawaii would one day have a lianhuanhua collection.
Here's a Sinification I had to DuckDuckGo because I couldn't guess what the original was: 白求恩 Báiqiúēn. Select the blank area below to see what the original is:
白求恩 Báiqiúēn is (Norman) Bethune.
Bái is a Chinese surname that sounds like Be-.
I was surprised by qiú [tɕʰjow] for -thu- [θuː]. I would have expected t [tʰ] or s [s] instead of q
[tɕʰ] for [θ]. But eventually I realized that 求恩 <SEEK FAVOR> is
a meaningful verb-object sequence as well as a loose approximation of -thune.
I had a vague memory of Wikipedia having an article on conventions for Sinifying foreign names. This wasn't it, but it was interesting nonetheless. It reminded me of the 佛菻 Fulin problem that I wrote about almost ten years ago. It also taught me the term graphic pejoratives for what I've called derography (derogatory spellings).
184.108.40.206:59: JURCHEN FIST - BARRIER BREAKER
I've been slowly copying Kiyose's (1977) edition of the Sino-Jurchen vocabulary of the Bureau of Interpreters. This forces me to take a good look at characters and think about how they're pronounced.
Entry 43 is gonkeu 'mountain pass', a borrowing from
northeastern Chinese 關口 *gonkeu 'id.' (lit. 'barrier mouth'):
Note how the style of the two characters doesn't match. I can't find the second character in Jason Glavy's Jurchen font, in Jerry You's fonts, or in N3696. It seems to have been overlooked because it doesn't have an entry in Jin's (1984) Jurchen dictionary. It is identical in shape to N4631 1734 which is in Jerry You's Khitan large script font, so I've made an image of N4631 1734 in lieu of crafting a Glavy-style image.
a transcription character for northeastern Chinese *gon-syllables (觀冠館 as well as 關), resembles Chinese 拳 <FIST> (see old forms here) and even vaguely sounds like its Early Old Chinese reading *NI-kron. Later readings of 拳 do not have o-like vowels:
Late Old Chinese *gwɨan
Middle Chinese *gwɨen (> Sino-Korean 권 kwŏn [kwən])
Liao/Jin Chinese *küen [kʰɥen]
If the character were 'invented' c. 1119 according to the conventionally accepted scenario, why modify Jin Chinese 拳 *küen [kʰɥen] to represent the syllable gon [kɔɴ]? There was no shortage of Chinese characters with *gon-like readings (e.g., the aforementioned 關觀冠館) that could have been phonetically more appropriate models for a Jurchen character <gon>. I think Jurchen <gon> was inherited from an older tradition going back to a time when 拳 had *o in Chinese:
拳 Early Old Chinese *NI-kron
> Serbi script (graph shape unknown): *<gon>-like reading (and other non-Chinese-based readings?)
> Parhae script (graph shape unknown): *<gon>-like reading (and other non-Chinese-based readings?)
> Khitan large script | Jurchen large script
Following Janhunen (1994: 114), I regard "the Khitan and Jurchen
'large' scripts [...] as parallel, rather than successive,
developments" of the Parhae script, so I do not think the Khitan large
script <gon>-like character from the epitaph for the 太師 Grand
Preceptor (1056) as written in Jin (1984: 17) is ancestral to Jurchen
<gon>. The two, however, should share a Parhae ancestor.
The problem with the above scenario - besides the fact that the hypothetical Serbi and Parhae ancestors of <gon> are not attested - is the huge gap of over a millennium and a half between Early Old Chinese c. 1000 BC and the (unattested!) Serbi script of c. 400 AD.
It's not entirely implausible, though, that some archaic nonprestige dialect of northern Chinese preserved *o as late as 400 AD. The 8th century Old Japanese phonogram 支 ki (earlier read *ke) reflects an Old Chinese *kie whose initial had palatalized to *tɕ- in prestige dialects long ago. The practice of writing *ke as 支 originated from the Korean peninsula centuries earlier and must have started at a time when some northern Chinese dialect still preserved *k-. (支 ki in Taiwanese and other Min varieties in the south preserves the original initial to this day.)
Jurchen <gon> has variants that do not look much like
Chinese 拳 <FIST>:
in line 2 of the monument commemorating the victory of Emperor 太祖 Taizu of Jin over the Khitans in 1114 (1185; the earliest attested form)
~ on the bottom of line 11 of the monument recording the names of successful candidates for the degree of 進士 jinshi in 1224 (Jin [1984: 17, 199] writes this character two different ways, and without seeing a photo or rubbing of the monument, I don't know which is correct)
(12.5.1:13: Jin and Jin [1980: 301] have the form with ㄴ in their hand copy of the text of that monument. I can't even find the character in this rubbing.)
The most 拳 <FIST>-like form
is first attested as a transcription of the first syllable of 觀音 *gonin
'Guanyin' in line 11 of the monument commemorating the foundation of
永寧寺 Yongning Temple (1413).
12.6.21:06: APPENDIX 1: Modern Chinese o-reflexes of 拳 Early Old Chinese *NI-kron
'fist' from Xiaoxuetang:
Their k(ʰ)- does not reflect the original root-initial *k-; it is from a later fused *g- < *ŋg- < *ŋk- < *NIk-.
I do not know which of those forms is native and which is borrowed.
Moreover, I do not know the details of the phonological histories of those varieties, so I cannot be certain that their -(u)o- directly preserves Old Chinese *-o-. Late Old Chinese *-wɨa- or Middle Chinese *-wɨe- could have fused into -(u)o- later.
I have excluded reflexes like 弋陽 Yiyang Gan ɕʰyon 13 with fricative and affricate initials because they are less conservative-sounding: i.e., because their initials are no longer velar. But who knows, maybe their rhymes are more conservative than their initials.
I initially thought Yiyang Gan ɕʰ- was a typo for Yiyang Gan tɕʰ-, but Xiaoxuetang lists three other characters read ɕʰyon: 穿權棬. All have [tɕʰ-] in standard Mandarin. If ɕʰ- is a typo, it's not an isolated one. On the other hand, Xiaoxuetang has 124 characters read with tɕʰ- in Yiyang Gan and 0 characters read with ɕʰ- and rhymes other than -yon. So either ɕʰ- has an extremely restricted distribution or it is a typo for the readings of one homophone set (穿權棬拳).
APPENDIX 2: 12.9.22:18: A history of 拳 'fist' from Old Chinese to modern standard Mandarin:
1. The root of 'fist' is *kron 'roll', which does not seem to occur by itself and therefore has no characters. It is also in 卷 *CI-kron-ʔ 'to roll'.
2. A prefix *NI- was added to this root.
The prefix has to have a nasal initial *N- to account for the later voiced initial (see steps 6-8 below).
The prefix has to have a high vowel *-I- to account for the later vocalism (see step 4 below).
The prefix seems to be a nominalizer: 'roll' > 'rolled thing' > 'fist'.
But Baxter and Sagart's (2014: 54) *N(ə)- was not a nominalizer; it converted transitive verbs into intransitive verbs.
Perhaps *NI- was *mI-. Baxter and Sagart (2014: 55) reconstruct an *m- that converts verbs into agentive/instrumental nouns and an *m- for body parts. But neither of these *m- (= *mI- in my system?) is a good fit: an agentive/instrumental noun from 'roll' should mean 'roller', not 'fist', and the body part prefix is added to nouns, not verbs.
3. *o broke to *wa: *NI-kron > *NI-krwan
I follow Starostin (1989) who posits *o-breaking at the 'Classical Old Chinese' stage immediately after the 'Preclassical Old Chinese' stage which is the earliest stage in his reconstruction.
4. A prefix *NI- with a high vowel triggered vowel bending in the following syllable: *NI-krwan > *NIkrwɨan with *a partly bent up to match the height of the unknown high vowel *I
5. The high vowel was lost: *NIkrwɨan > *Nkrwɨan
6. *N assimilated to *k- (if it wasn't already velar): *Nkrwɨan > *ŋkrwɨan
7. *k- assimilated to *ŋ-: *ŋkrwɨan > *ŋgrwɨan
8. *ŋ- was lost: *ŋgrwɨan > *grwɨan
9. *-r- was lost: *grwɨan > *gwɨan
10. *-a- fronted: *gwɨan > *gwɨen
11. *-ɨ- fronted: *gwɨen > *gwien
12. *-wi- fused into *-ɥ-: *gwien > *gɥen
13. The level tone developed two allophones: one in syllables with *voiced initials like *g- and another in syllables with *voiceless initials like *k(ʰ)-.
14. *g- aspirated and devoiced: *gɥen > *gʱɥen > *kʱɥen > *kʰɥen; the allophones of the level tone became phonemic after *g- and *kʰ- merged into *kʰ-
15. *kʰɥ palatalized: *kʰɥen > quán [tɕʰɥɛn]
What I wrote as *e might have been [ɛ] all along, but I have chosen a simpler symbol since there was no contrast between */e/ and */ɛ/ in diphthongs.
Some of the relative chronology is unclear: e.g., 14 and 15 must
have followed 13, but I'm not sure whether 13 followed 12 which must
have followed 11.
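The segmental steps above can be replayed as an ordered list of string rewrites. This is just a sketch to check that the steps chain together as written: the rule list mirrors steps 3-15 (step 13 is tonal and changes no segments, and step 14 is collapsed into a single *g- > *kʰ- rewrite), while the names `RULES` and `derive` are mine. The final vowel is left as *e, following the note above about *e possibly being [ɛ] all along.

```python
# (step number, pattern, replacement), applied in order to the running form.
RULES = [
    (3, "o", "wa"),             # *o broke to *wa
    (4, "NI-krwa", "NIkrwɨa"),  # high-vowel prefix bent *a partly upward
    (5, "NI", "N"),             # the prefix vowel was lost
    (6, "Nk", "ŋk"),            # *N assimilated to *k-
    (7, "ŋk", "ŋg"),            # *k- voiced after *ŋ-
    (8, "ŋg", "g"),             # *ŋ- was lost
    (9, "gr", "g"),             # *-r- was lost
    (10, "an", "en"),           # *-a- fronted
    (11, "ɨ", "i"),             # *-ɨ- fronted
    (12, "wi", "ɥ"),            # *-wi- fused into *-ɥ-
    (14, "g", "kʰ"),            # *g- aspirated and devoiced
    (15, "kʰɥ", "tɕʰɥ"),        # *kʰɥ palatalized
]


def derive(form: str):
    """Return every stage of the derivation, starting with the input form."""
    stages = [form]
    for _step, pattern, replacement in RULES:
        form = form.replace(pattern, replacement)
        stages.append(form)
    return stages


STAGES = derive("NI-kron")
```

Printing `STAGES` walks from NI-kron through gwɨan and gɥen to the modern initial and medial. Reordering the rules is a quick way to test the relative chronology: e.g., moving step 12 before step 11 leaves *wɨ unfused.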
The Wiktionary entry for ear says its
Persian cognate is هوش hush
'intellect' which surprised me because I expect Persian h- to
correspond to English s-: e.g., Persian هفت haft
: English seven. Wiktionary reconstructs 'intellect' at
the Proto-Indo-Iranic level as *Hā́wšiH 'ears; understanding'
and at the Proto-Indo-European level as *h₂ṓws
'ear'. Following Beekes, I interpret Proto-Indo-European *h₂ as
*ʕ, and I interpret Proto-Indo-Iranic *H as a glottal
stop *ʔ (cf. Beekes' /ʔ/ < *H in Avestan). None of
those sounds should correspond to Persian h which is from
Proto-Indo-Iranic and ultimately Proto-Indo-European *s-
(e.g., 'seven' which was *septḿ̥ in Proto-Indo-European).
h- in Persian hush is irregular like the h- in
Greek ἵππος híppos
'horse' < Proto-Indo-European *ʔéḱwos (cf. Persian
'horse' which has no h-). (The i of híppos is
Could the h- of hush and híppos be by
analogy with h-words with similar semantics? But what would the
models be? And how did h- appear in both the Kurdish and
Persian forms of the word? h- seems to be an innovation in
Middle Persian hōš (Old Persian ušiy has no h-).
Did Kurdish acquire h- through contact with Persian?
What got me thinking about ears was a stand-up comic on the radio joking about boxen as a plural of box. That made me check what the Old English plural of box was (boxas) and look into Old English declension in general. Here's a colorful summary. Via Wikibooks I found earan 'ears' as an example of a real Old English n-plural.
Then I started thinking about French plurals and via Wikipedia found Mickael Korvin's nouvofrancet proposal to spell all plurals with -s, among other things (like respelling the -ais of the proposal's name as -et). Is there a book like Robbins Burling's Spellbound (now only $14.67 US on Amazon!) on French spelling reform?
ADDENDUM: 12.3.23:39: Today I realized that English ear and hear are near-homophones. I'm afraid to look for a folk etymology 'deriving' one from the other. To my surprise, Wiktionary derives hear from a Proto-Indo-European compound *h₂ḱh₂owsyéti < *h₂eḱ- 'sharp' + *h₂ows- 'ear' + *-yé- (denominative suffix) + *-ti (3rd person singular suffix). The h- is all that is left of *h₂eḱ- (and it is a remnant of *ḱ-, not *h₂- which I interpret as *ʕ-).
Tonight I discovered the TVK2 channel which airs content from Arirang TV whose onscreen logo alternates between English and Korean. I can't find the Korean logo online. It looks something like this:
"Something", because the letter ㅇ has the same shape and size at
both ends of the logo which is almost symmetrical. So all the hangul
letters are on the same line, whereas in normal hangul, <ng>
would be under <r.a>: 랑, not 라ㅇ. Are linear hangul logos 'in'
now, or is this logo an outlier?
If the Khitan small script had survived into modern times, how would
it have been computerized? Hangul blocks represent syllables, but
Khitan small script blocks represent words (including inflected forms)
which are far more plentiful than syllables in any language. In
pre-Unicode days, the KS X 1001 encoding of Korean only allowed for
2,350 out of 11,172 possible modern Korean hangul syllables. There must
be more than 2,350 or 11,172 possible Khitan small script blocks.
Unicode and sophisticated character-combining fonts can handle the
Khitan small script now, but how would computers thirty years ago have
handled them? Would pre-Unicode computerization have popularized
linearization of the Khitan small script?
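For comparison, here is why hangul blocks were tractable: the 11,172 modern syllables are just every combination of 19 initials, 21 vowels, and 28 finals (counting 'no final'), and Unicode lays them out in that order starting at U+AC00. The sketch below uses that standard arithmetic; the function name and the index constants for the letters of 랑 are mine.

```python
LEAD_COUNT, VOWEL_COUNT, TAIL_COUNT = 19, 21, 28
SYLLABLE_BASE = 0xAC00  # code point of the first precomposed syllable

# Total number of precomposed modern hangul syllables.
TOTAL_SYLLABLES = LEAD_COUNT * VOWEL_COUNT * TAIL_COUNT


def compose(lead: int, vowel: int, tail: int = 0) -> str:
    """Build a precomposed syllable block from jamo indices (tail 0 = none)."""
    offset = (lead * VOWEL_COUNT + vowel) * TAIL_COUNT + tail
    return chr(SYLLABLE_BASE + offset)


# Indices in Unicode order: initial r/l is 5, vowel a is 0, final ng is 21.
LEAD_R, VOWEL_A, TAIL_NG = 5, 0, 21

RANG = compose(LEAD_R, VOWEL_A, TAIL_NG)  # the block form, not linear <r.a> + <ng>
RA = compose(LEAD_R, VOWEL_A)
```

No such closed formula is available for Khitan small script blocks, since a block spells a word rather than a syllable with fixed slots.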
Back to Korean: I saw this episode of Gangnam Insider's Picks
on TVK2 which mentioned Guardian:
The Lonely and Great God at 17:00. The Korean title is
쓸쓸하고 찬란하神 – 도깨비
ssŭlssŭl-ha-go chhallan-ha-shi-n - tokkaebi
'lonely-be-and resplendent-be-HON-ADN - goblin'
= 'goblin that is lonely and resplendent'
The title is written entirely in hangul except for the
honorific-adnominal suffix sequence -shi-n written with the
homophonous hanja 神
<GOD>. I've never seen this kind of hanja wordplay in modern
Korean before. (The use of hanja to write homophonous Korean words is,
of course, a core practice of the extinct hyangchhal and idu writing
that show title and much more in calligraphy.
Today UPtv's GilMORE the Merrier 153-episode marathon of Gilmore Girls ended.
One of the show's stars is Matt Czuchry who "is of Ukrainian descent on his father's side." I was surprised to learn that his name is pronounced [ˈzuːkri] in English rather than [ˈtʃuːkri] which is closer to the Ukrainian pronunciation of Чухрій <Čuxrij> as [tʃuxrʲij] (where's the stress?). I suppose [z] is a spelling pronunciation of Czuchry which looks like a Polish-style romanization. Did Czuchry's paternal ancestors come from western Ukraine?
I got the Ukrainian spelling of Czuchry from the
Ukrainian Wikipedia (which unfortunately does not specify the stress).
Russian Wikipedia simply Russifies the English pronunciation of his
name as Зукри <Zukri> [ˈzukrʲi] instead of Russifying his
Ukrainian name as Чухрий <Čuxrij> [tɕuxrʲij].
While I'm on the subject of Ukrainian names, Wikipedia has a
list of "somewhat comical" Cossack surnames. My favorite is
Добрийвечір <Dobryjvečir> 'good evening'. Google
shows that surname is alive and well in Ukraine today.
220.127.116.11:10: GHO GUO: THE COUNTRY OF 309
After spending almost all week on spellings of Khitan qudugh (if that is what
<FORTUNE FORTUNE₂ FORTUNE₃ FORTUNE₄ FORTUNE₅>
represent - see part 4 of "The Qudugh Question" for a different interpretation), I'd like to prove that I can think about other Khitan small script characters.
When looking for instances of isolated
in 契丹小字研究 Qidan xiaozi yanjiu (Research on the Khitan Small Script, 1985) while writing part 5 of "The Qudugh Question", I stumbled upon a hand copy of the text on a coffin containing ... 国 in line 4. Without seeing the actual coffin (photos would do), I think 国 - an exact lookalike of the Chinese character <COUNTRY> pronounced guó in modern standard Mandarin - might be a mistake for
Kane (2009: 72) transliterated 309 as <hó>. <h> is his symbol for [ɣ]. My uvular gh [ʁ] corresponds to his velar <h>. The acute accent indicates that a vowel may have "the same, or perhaps a similar pronunciation" as its unaccented counterpart. I am more agnostic about vowels, so I don't add any accents to avoid implying that all characters that Kane transliterates with accented <ó> share a vowel that distinguishes them from the characters that he transliterates with plain <o>. In this particular case, he needs an accent to distinguish between
076 <ho> and 309 <hó> (both <gho> in my system)
in his transliteration.
I don't know yet that 309 rhymes with
021 090 125 169
which Kane transliterates as <mó ó ió qó> to distinguish them from
021 090 168 <m(o) o qo>
in his transliteration. I can't find any <io> in his system, so I do not know why he transliterates 125 as <ió> on p. 302. He also transliterates 125 as <iáu> on p. 49 to distinguish it from
which may be "an allograph" (and hence would have the same vowels). The problem of allography in the Khitan small script remains to be fully solved.
Let's focus on the problem of how to read 309. The key is how 309
seems to correspond to the transcription 訛 in the name 訛里本 (= Khitan Gholbun?)
in 遼史 Liao shi (History of Liao, 1344; see Kane 2009: 72 for
details). In 1344, 訛 was read as something like *o in Old
Mandarin. But was 14th century Old Mandarin the language underlying the
choice of 訛? In Liao Chinese 訛 was read as something like *ng(w)o.
ng- is unlikely for a native Khitan word, and ngw- even
less likely. (Initial ŋ- is uncommon in the 'Altaic' world,
and ŋw- may be unknown except in loanwords.) So that seems to
rule out interpreting 訛 in terms of Liao Chinese (though one must
wonder how accurately Gholbun's name was preserved by the 14th century,
two centuries after the fall of the [first] Khitan Empire in 1125 - the
only datable Gholbun I can find is also known as 侯古 Hougu, sixth son of
Emperor 聖宗 Shengzong [b. 972; r. 982-1031]).
Perhaps Khitan 309 gho [ʁɔ] was approximated in Old Mandarin as something like *o without any initial consonant since Old Mandarin had nothing like [ʁ]. 309 could not simply be o since a character for o has already been identified:
That character represents Liao Chinese *o in loanwords. 309, on the other hand, is apparently never in Khitan small script spellings of Chinese loanwords. That implies 309 represented a sound or a syllable absent from Liao Chinese. gho fits the bill: it has an un-Liao Chinese initial gh- disqualifying it from Khitan spellings of Chinese loans, and its vowel o matches the vowel of its apparent transcription 訛 *o.
11.30.21:57: APPENDIX: Other readings of 309
Liu (2009, 2014) reads 309 as u which doesn't match 訛 *o. It is, however, homophonous with Liu's readings of
076 e ~ u ~ ulu, 172 u, 245 u, 372 u, 131 u
which I read as gho, ugh, u, o, u, u, more or less following Kane (2009). (I can't explain the differences between the various u-graphs either. 131 is the usual graph for transcribing Liao Chinese *u, but 172 and 372 also transcribe that vowel. See Kane [2009: 246-247].)
Jishi (2012) reads 309 as k'ua ([kʰwa]?) which is even further from 訛 *o. 訛 did end in Middle Chinese *-wa (cf. the Middle Chinese-derived Sino-Korean reading 와 wa of 訛), but *-wa had become *-(w)o by the Liao dynasty, and 訛 never began with a voiceless stop, aspirated or otherwise.
18.104.22.168:21: THE QUDUGH QUESTION (PART 5)
Here are the contexts of the two Khitan small script blocks
<FORTUNE₂> and <FORTUNE₄>
from Part 4:
1a. 蕭仲恭 Xiao Zhonggong 35.28:
<qatun.i 343.p.en FORTUNE₂.a.an m.gha.379 c.er>
queen-GEN wine?-GEN fortune-GEN N write-PFV
'... wrote the queen's wine's? fortune's ?'
<343.p> may be a noun possessed by the queen and possessing 'fortune' in turn. It may be a variant spelling of <342.b> 'wine' (?; Kane 2009: 76) and <342.p> (if <342.p.en> is a genitive). (<p>/<b> alternation is common in Khitan.) 342 and 343 are similar:
But 'wine's fortune' seems like an unlikely combination of words.
<m.gha.379> is in the slot for a noun possessed by the
preceding genitive of 'fortune' and an object of the verb cer
'wrote', so I expect it to be a noun.
1b. 蕭仲恭 Xiao Zhonggong 47.26:
<343.p.en FORTUNE₂.a.an t.ugh.ii c.iu.ur.094.c>
'?-GEN fortune-GEN N V-after'
'After ?'s fortune's N V-ed,' or
'After [?] V-ed ?'s fortune's N,'
There's <343.p.en> again and in the same position before 'fortune'.
<t.ugh.ii> could be a verb ending in a converb <ii>, but 'fortune-GEN' needs something to possess, so I regard it as a noun which is either the subject or object of the following verb.
<c.iu.ur.094.c> ends in a converb <c> that I translate
as 'after' (following Kane 2009: 153-154).
2. 仁懿 Renyi 5.29:
<326.041 c.l.ugh s.tumu FORTUNE₄.ń hong.ghu p.ud.z.iu TWO en b.qo ○ HEAVEN as.ar hong di>
'? ? fortune-GEN.PL Hongghu Pudziu two ? son heaven clear emperor'
'... Pudziu Hongghu of ... blessings' two ? son / the Heaven Clear Emperor'
<326.041> is a hapax legomenon.
<c.l.ugh> is also a hapax legomenon; it might be the singular of
<c.l.ghu.ad> (for cVlughad?; Xingzong 28.7) which might
have a plural ending in -ad (but I'd expect a plural ending in -ud
with vowel harmony!).
It's tempting to assume <tumu> is 'ten thousand' which it can
be elsewhere but not here with a preceding <s>. There do not seem
to be any prefixes in Khitan, and I don't know of any numeral beginning
with s-, so <s.tumu> can't be interpreted as 'X ten
thousands' with <s> representing a reduced form of a numeral X.
<s.tumu> doesn't have a verb ending, so it is unlikely to be the
end of a clause. It may be a noun or adjective modifying 'fortune'. It
could also be a variant spelling of <s.313> which occurs four
times in Zhonggong. Characters 312 <tumu> and 313 <?> are
identical in shape except for the location of their right-hand dots:
<p.ud.z.iu> is a unique spelling of a title for Khitan noblewomen that appears elsewhere as <p.ü.z.iu>, <b.ü.z.iu>, and <p.ü.089.iu> (more here). The name of this pudziu is Hongghu, a hapax legomenon.
<TWO en> is strange. First, I would expect <TWO.en> as a
single block. Second, reading <TWO en> as 'two-GEN' = 'of two'
doesn't make sense in this context: why would 'son of two' be after the
name and title of a woman? Third, 'second' might make more sense, but
'second' for masculine nouns like the following <b.qo>
'son' is <c.ur.er> ~ <dz.ur.er>, not <TWO>. Fourth,
<TWO> without a dot is grammatically feminine, not masculine.
Fifth, <en> is almost never by itself; the only other
instance of isolated <en> that I know of is in a wall inscription
of the 萬部華嚴經塔 Wanbu Avataṁsaka Sutra pagoda.
'Heaven Clear' following a space of respect transliterated as ○ (and
converted to '/' in the translation) is the Khitan era name
corresponding to Liao Chinese 清寧 *cingning (now Qingning;
1055-1064) 'Clear and Tranquil' in Chinese. The emperor of that era, Daozong, was the
first (not second!) son of Empress Renyi (birth name 撻里 Dali, not
Hongghu!), the subject of this epitaph. In other words, Daozong
isn't the <b.qo> 'son' before the respectful space.
Someone else is the son of two - Pudziu Hongghu and some man. Might
the mystery words before 'Pudziu Hongghu' be the man's name? I could
parse the mystery phrase as
X Hongghu pudziu two-GEN son
'son of the couple - X and Pudziu Hongghu'
X might be somewhere in <326.041 c.l.ugh s.tumu>. The hapax
legomena <326.041 c.l.ugh> might be a name.
22.214.171.124:59: THE QUDUGH QUESTION (PART 4)
In Part 3 I mentioned that these proposed
characters for qudugh 'good fortune' previously regarded as
<FORTUNE₂> and <FORTUNE₄>
were attested as parts of larger blocks in 契丹小字研究 Qidan xiaozi yanjiu (Research on the Khitan Small Script, 1985):
<FORTUNE₂.a.an> and <FORTUNE₄.ń>
-ń might be a genitive plural ending which can follow final consonants (Kane 2009: 135), but <a.an> -an looks like a genitive ending for an -a-final noun (Kane 2009: 132) ... which qudugh is not. Could <FORTUNE₂> represent a Khitan cognate of Written Mongolian aja 'fortune'? Could all of the five <FORTUNE> graphs
<FORTUNE FORTUNE₂ FORTUNE₃ FORTUNE₄ FORTUNE₅>
represent that a-final word? qudugh is attested phonetically in Chinese transcription as 胡覩古 *xutuku and 胡都 *xutu¹. Could its Khitan small script spelling be something other than a form of <FORTUNE> - a block of multiple phonetic characters?
(On to Part 5)
¹11.29.11:33: The reasoning for the interpretation of *xutu(ku) as qudugh:
The transcriptions resemble Written Mongolian qutugh [qʊtʰʊʁ] 'bliss', Manchu hūturi [χʊtʰʊri] 'good fortune' (with a noun suffix -ri; Gorelova 2002: 114), and Jurchen
hutur 'good fortune' (transcribed as 忽禿兒 *xutʰuər; Bureau of Translators vocabulary #343)
huturi 'good fortune' (transcribed as 忽禿力 *xutʰuli; Bureau of Interpreters vocabulary #740)
Jurchen/Manchu h [χ] < *q corresponds to Written Mongolian q which corresponds to Khitan q
2nd millennium AD northern Chinese had no uvulars, so uvulars had to be transcribed as velars:
Chinese *x could correspond to Khitan q-
Chinese *k could correspond to Khitan gh- [ʁ]
In my non-IPA transcription system for Written Mongolian, Jurchen/Manchu, and Khitan, /nonaspirates/ are written as voiced, and /aspirates/ are written as voiceless. The Chinese transcriptions of Khitan 'good fortune' have unaspirated *-t- which points to Khitan unaspirated -d- /t/ (with an intervocalic [d]?) rather than t /tʰ/ like Mongolian or Jurchen/Manchu.
'Altaic' harmonic rules forbid uvular q coexisting with velar g, so Chinese *k cannot represent Khitan velar g /k/; it must stand for a uvular. Chinese unaspirated *k cannot represent Khitan uvular q /qʰ/ which is aspirated. It must represent the other Khitan uvular obstruent gh /ʁ/, the voiced counterpart of q.
2nd millennium AD northern Chinese had no final *-gh, so the Chinese transcriptions of the Khitan word exemplify two coping strategies:
approximate -gh with a voiceless stop and add an echo filler vowel: -gh as 古 *-ku
simply ignore -gh (and a final sonorant is easier to
ignore than a final stop)
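The correspondences above can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `to_khitan`, the ASCII syllable spellings, and the one-letter-onset segmentation are hypothetical conveniences, not part of any established transcription tool.

```python
# Hypothetical sketch: map a Chinese transcription (as CV syllables) to the
# Khitan reading proposed above. Assumes every syllable is one onset letter
# followed by a vowel.
ONSET = {
    "x": "q",   # Chinese velar *x for Khitan uvular q
    "t": "d",   # Chinese unaspirated *t for Khitan unaspirated d
    "k": "gh",  # Chinese unaspirated *k for Khitan uvular gh
}

def to_khitan(syllables):
    out = []
    for i, syl in enumerate(syllables):
        onset, vowel = syl[0], syl[1:]
        is_last = i == len(syllables) - 1
        echoes_previous = i > 0 and vowel == syllables[i - 1][1:]
        if is_last and onset == "k" and echoes_previous:
            # Coping strategy 1: stop + echo vowel stands for final -gh.
            out.append("gh")
        else:
            out.append(ONSET[onset] + vowel)
    return "".join(out)

print(to_khitan(["xu", "tu", "ku"]))  # 胡覩古 *xutuku
print(to_khitan(["xu", "tu"]))        # 胡都 *xutu (final -gh ignored)
```

The first call yields qudugh; the second yields qudu, which reflects the second coping strategy, where the transcriber left -gh unwritten.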
126.96.36.199:59: THE QUDUGH QUESTION (PART 3)
(Back to Part 2)
I put off my original plans for Part 3 just minutes ago when I spotted an unusual Khitan small script block configuration in 契丹小字研究 Qidan xiaozi yanjiu (Research on the Khitan Small Script, 1985: 457):
The normal three-character block configuration has two characters on top and one on the bottom:
<013.224.327> <?.mu.ie> (Gu 12.1)
A less common configuration has a wide horizontal character atop two characters:
<001.251.257> <?.n.em> (Ren 13.5)
This new (to me) configuration has a tall vertical character to the left of a stack of two characters:
<335.327.054> <ia.ie.?> (Xing 27.12)
That combination looks like the single character
<380> qudugh 'good fortune'
and is even glossed on p. 500 of Qidan xiaozi yanjiu as 福 'good fortune'.
So I now think there are at least five different versions of the single character qudugh 'good fortune':
<FORTUNE FORTUNE₂ FORTUNE₃ FORTUNE₄ FORTUNE₅>
(11.28.23:41: Transliterations added.)
1. 380, previously regarded as <335.277>
2. serial number needed, previously regarded as <335.275>¹
3. serial number needed, previously regarded as <335.276>
4. serial number needed, previously regarded as <335.278>²
5. serial number needed, previously regarded as <335.327.054> (note the proportions of the components)
(On to Part 4)
¹11.28.15:30: I've extracted this character from a
larger block in Qidan xiaozi yanjiu. See part 4.
²11.28.15:30: I've extracted this character from a larger block in Qidan xiaozi yanjiu. See part 4.
188.8.131.52:56: THE QUDUGH QUESTION (PART 2)
Yesterday in Part 1, I mentioned that one
reason I didn't think
<380> qudugh 'good fortune'
in the Khitan small script was a sequence of two characters
was that qudugh does not begin with ia- (i.e., the reading of 335). But what is the reading of 277?
Maybe 277 has no reading. In 契丹小字研究 Qidan xiaozi yanjiu (Research on the Khitan Small Script, 1985), a quartet of characters
<277 275 276 278>
transliterated by Wu and Janhunen 2010 as <LUCK LUCK₂ LUCK₃
LUCK₄> always occur in second position after 335. I suspect that the
three two-character blocks
<335.275> <335.276> <335.278>
are really three characters
(note the proportions of the components)
that are variants of
<380> qudugh 'good fortune' (Wu and Janhunen's <LUCK₅>).
So I think three new serial numbers are needed for those variants.
And the serial numbers 275-278 should be retired except for historical
purposes ("we used to think 275-278 were characters, but we don't
anymore"). I don't object to 275-278 being in Unicode, as there has to
be a way to discuss elements that were regarded as characters for
I initially thought that 275-278 and
<448> (function unknown)
formed a graphic family sharing 厶 on top, but now I don't think
275-278 are independent characters. 448 seems to be isolated within the
Khitan small script character inventory unless it turns out to be a
variant of a character without 厶 on top.
THE QUDUGH QUESTION (PART 1)
Today I was copying line 11 of the Khitan small script epitaph for
Liao dynasty Empress 宣懿 Xuanyi (1040-1075) which contains the word qudugh-er
'good.fortune-ACC'. In 契丹小字研究 Qidan xiaozi yanjiu (Research on the
Khitan Small Script, 1985), the word appears as a block of three
However, looking at the
actual inscription at Wikipedia, I see that the character looks
Can you spot the difference?
It's the proportions of the two elements
in the top half of the block. In Qidan xiaozi yanjiu, they are roughly the same width, whereas in the actual inscription, the left element is about 60% the width of the right element. A narrow left-hand element is characteristic of many two-element Chinese characters: e.g., 福 'good fortune' in which the left side ネ is narrower than the right side 畐. To put it another way, both ネ and 畐 are narrowed in side-by-side combination, but ネ is more compressed than its neighbor. And the proportions of
are closer to those of the single character 福 than to a
two-character sequence ネ畐.
So I see for myself now that Kane (2009: 81, 99) is right:
is a single logogram 380 (or 379, as he numbered it on p. 305):
I've always thought that was a single character because of a reason
I'm surprised Kane doesn't bring up. 335 in other contexts is read ia,
and obviously qudugh doesn't begin with ia-.
Kane (2009: 99) suggests that 380 may be "derived from the cursive form of the Chinese character 福 fu 'good fortune, happiness'." The site cidianwang.com (词典网 Dictionary Net) has 23 samples of 福 in cursive. If Kane is also correct about his derivation of 380 (and I think he is), 380 is a rare Khitan small script character that not only mimics the shape of a Chinese character but also represents a Khitan word (qudugh) with a meaning similar to that of the word represented by that Chinese character (Liao Chinese fuʔ; modern standard Mandarin has lost the glottal stop). Usually Khitan small script characters with Chinese lookalikes have functions completely different from those of their apparent graphic models: e.g.,
|shape||Khitan small script
||fourth Heavenly Stem|
||third Heavenly Stem
335 nor 277
looks exactly like any Chinese characters. And even if they did, their functions could not be guessed on the basis of their hypothetical Chinese lookalikes.
I have no idea why 335 has the reading ia, and I have no idea what the reading of 277 is. I have no time either, so I'll have to look into 277 ... another time.
(On to Part 2)
184.108.40.206:27: WHAT IS THE RELATIONSHIP BETWEEN THE KHITAN SMALL SCRIPT AND THE JURCHEN LARGE SCRIPT? (PART 3: THE LOYALTY PRINCIPLE IN JURCHEN)
(Back to Part 2)
The blocks of Jurchen characters resembling Khitan small script blocks in 弇州山人四部稿 Yanzhou shanren sibu gao (Draft [Catalog of] the Four Categories of Yanzhou Shanren['s Library]; 16th c.) and 方氏墨譜 Fang shi mopu (Mr. Fang's Ink [Cake] Book, 1588) exemplify an extreme version of the loyalty principle from part 2: each block is a translation or borrowing of a Chinese monosyllabic word in Chinese word order:
||bright > wise
|Ming Chinese||明 *1'miŋ
|English gloss||bright > wise||prince
(I have slightly altered Kiyose's reading of the Jurchen.)
The first line would have object-verb order in regular Jurchen:
'wise prince virtue heedful-if'
This special kind of Jurchen appears to be related to the highly
sinified Jurchen of Ming dynasty petitions. I hypothesize that unlike
the Japanese who read Chinese in a highly stylized Japanese that still
maintained Japanese syntax and morphology, the Jurchen read Chinese in
a highly stylized Jurchen with Chinese syntax and little morphology.
Notes on the words (the blocks will have to wait until part 4):
1. genggiyen 'bright, wise': cf. Manchu genggiyen
2. wang 'king, prince': a borrowing from Chinese; cf. Manchu wang. Kiyose wrote wan, probably since Jurchen only had -n (possibly [ɴ] as in Japanese) in native words, but the Bureau of Translators vocabulary transcribes this word as 王 *1'waŋ whose *-ŋ could either point to -n [ɴ] or even -ng as in Manchu. I would be more certain about wan if this word were transcribed in Chinese with a *wan character.
3. tiko-ci-ghun 'if heedful': cf. Manchu -ci 'if'. tiko-
is not cognate to Manchu yohi- 'to pay heed to', and -ghun
is a verbal suffix of unknown function without any Manchu cognate.
4. de 'virtue': a Chinese borrowing. Could this have
coexisted with or even replaced a Jurchen cognate of Manchu erdemu
1. duin 'four': cf. Manchu duin. (I dropped
Kiyose's -w- since there is no distinction between ui
and uwi in Jurchen.)
2. tulile 'outside': cf. Manchu tule 'id.' (with
haplology: i.e., loss of -li- before a similar -le?).
3. hiyen 'all': a Chinese borrowing. This word is literary
in Chinese and is probably not the normal Jurchen word for 'all'.
4. andahai 'guest': cf. Manchu antaha 'id.'
Did Ming Jurchen shift *nt to *nd? If it did, are words
like fanti 'south' and fonto 'chestnut'
(the only known cases of -nt- in the Bureau of Translators
220.127.116.11:59: THE ETYMOLOGY OF CANTONESE 'TONGUE'
Wiktionary gives two etymologies for Cantonese 脷 lei⁶ 'tongue':
Both of these proposals have issues.
From 利 (“benefit; profit”), used as a euphemism for 舌 (“tongue”), which is homophonous to 折, 蝕／蚀 (sit6, “to be at a loss”).
Alternatively, it may be from 舐 (OC *ɦljeʔ [= *mI-leʔ in my reconstruction], “to lick”), preserving the Old Chinese initial *l- (Schuessler, 2007).
Let's start with the second one which is closer to my position. 舐 *mI-leʔ
should hypothetically become Cantonese ˟sei5, not lei6.
I think it might be more accurate to say that lei6 is related
to 舐 *mI-leʔ rather than from it. lei6 may be from a
derivative of 舐 *mI-leʔ that lost its first syllable and has a
nominalizing suffix *-s ('lick-NMLZ' > 'that which licks' =
*mIleʔ-s > *mIlies > *məlieh > *lie̤ > 19th c. li6 > lei6
The presence of *mə- blocked the Late Old Chinese sound
change *l- > *j- from applying to -l-. *mə-
must have been dropped at some point after *l- > *j-.
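The ordering argument can be sketched as two toy rewrite rules applied in either order. This is a minimal illustration, assuming simplified spellings without asterisks; the rule functions and `apply_in_order` are hypothetical names, and the regular expressions only model "word-initial l" and "prefix mə-", not full Late Old Chinese phonology.

```python
import re

def l_to_j(form):
    # *l- > *j- applies only to a word-initial lateral.
    return re.sub(r"^l", "j", form)

def drop_prefix(form):
    # Loss of the first syllable mə-.
    return re.sub(r"^mə", "", form)

def apply_in_order(form, rules):
    for rule in rules:
        form = rule(form)
    return form

# Order proposed above: the lateral shift runs while mə- still shields -l-.
proposed = apply_in_order("məlieh", [l_to_j, drop_prefix])
# Counterfactual order: mə- is lost first, exposing l- to the shift.
counterfactual = apply_in_order("məlieh", [drop_prefix, l_to_j])

print(proposed)        # lieh
print(counterfactual)  # jieh
```

Only the first ordering keeps the lateral that survives in lei6, which is why the prefix must have been lost after the shift.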
Now for the homophone avoidance taboo etymology [revised 11.24.20:06]: It makes sense in
Cantonese now, but would it have made sense at the Proto-Yue level? The
脷 lei6-type word for 'tongue' is widespread throughout Yue
Chinese¹ (see this
map), and therefore is likely to have been in Proto-Yue. 舌 'tongue'
and 折 (蝕 is a Cantonese respelling) 'to be at a loss' are homophonous
in the Middle Chinese phonological tradition and were probably also
homophonous in Proto-Yue.
However, Wiktionary also reports 脷 for 'tongue of an animal' in Sichuan (i.e., Sichuan Mandarin) and Hakka, though it does not specify any dialects. There is no Yue in Sichuan, so 脷 there cannot be explained away as a Yue loan. The only Hakka 脷-like word for 'tongue' that I could find in Wiktionary is 脷錢 'tongue' in 陸川 Luchuan Hakka. 脷錢 in that variety and in 柳州 Liuzhou Mandarin is a borrowing from neighboring Yue dialects and is not evidence for reconstructing the word represented by 脷 back to the common ancestor of Hakka and Mandarin as well as Yue.
But if 脷 is in one or more varieties of Sichuan Mandarin, then the
word represented by 脷 would be reconstructible in Old Chinese. (The
character 脷 seems to be a relatively recent invention and cannot be 'Old Chinese'.) And in the early 2nd century, Late Old Chinese 舌 *ʑɨat 'tongue' and 折 *dʑiet
'to lose' might not yet have become homophones (I don't have any
rhyming data for that period), so there would be no motivation to
replace 折 with 利 *lis 'profit', assuming homophonic substitution existed 1900 years ago.
Without any source to verify that 脷 is in Sichuan, I wouldn't bet on that scenario.
So I need to go back up to the Yue level and ask:
1. How old is the practice of homophonic taboo substitution? Can it be reconstructed at the Proto-Yue level?
2. How common is the practice of homophonic taboo substitution in Yue varieties?
3. Where did the practice of homophonic taboo substitution originate?
4. How did the practice of homophonic taboo substitution spread: inheritance or diffusion?
¹The earliest attestation of the character 折 representing a word 'to lose' (not quite 'to be at a loss', but close enough) is in 漢書 Hanshu (Book of Han, 111 AD).
APPENDIX: Cantonese readings of 舐
Above I wrote that the expected reading of 舐 should be ˟sei5. 粵語音韻集成電子版 A Chinese Talking Syllabary of the Cantonese Dialect: An Electronic Repository has five readings:
laai2 < *l̥ieʔ < *sI-leʔ, cognate to Old Chinese 舐 *mI-leʔ 'to lick'
Old Chinese *-ICe has two reflexes in Cantonese: -ei
< 19th c. -i and, in a minority of cases, -aai.
The latter seems to be either native (meaning that the bulk of
Cantonese vocabulary was borrowed or remodelled after a prestige
dialect) or substratal.
lem2 < Old
Chinese *l̥emʔ 'to lick'
This word has no known Old Chinese spelling and first appears
in Middle Chinese as 舔, so I cannot give an Old Chinese character for
lim2, ditto but with a less conservative rhyme, the
reflex of Old Chinese *-em in the newer, majority layer of
saai2 < *sI-leʔ, cognate to Old Chinese 舐 *mI-leʔ 'to lick'
saai5 < *ʑ- < *mʑ- < *mj- < Old Chinese 舐 *mI-leʔ
This is the closest to the ˟sei5 I predicted, but its rhyme is the (older?) minority reflex of Old Chinese *-ICe.
Now for the first. As far as I know,
APPENDIX 2: How can Old Chinese *sl- fuse in two different ways?
The two fusions occurred at different times. I don't know the
relative chronology, so I provide two different scenarios below.
Scenario 1: *sl- > *l̥- occurred first.
Scenario 2: *sl- > *s- occurred first.
At an even earlier stage some or all *sl- could have been *sVl-.
First vowel loss in both scenarios above is unpredictable; both
monosyllabic and disyllabic variants of *s(I)-leʔ existed in
stage 2, just as full and abbreviated variants exist in English today
(e.g., select [səˈlɛkt] ~ [slɛkt]).
18.104.22.168:55: SINO-TIBETAN WORDS FOR 'TONGUE'
The native Cantonese word 脷 lei⁶ 'tongue' is reminiscent of l-words for 'tongue' found elsewhere in Sino-Tibetan: e.g.,
Old Chinese *mIlat
a Late Old Chinese form like ?mliet was borrowed into Proto-Hmong-Mien as *mblet (Ratliff 2010: 48)
Tangut 𗢯 3190 1lhwa4
< pre-Tangut *PIl̥aC
Written Burmese lhyā < Proto-Lolo-Burmese *sla1b
(Burling 1967: 97; Burling disregarded Burmese spelling, so there was
no evidence pointing to a *glide in his data; perhaps the
reconstruction can be emended to *slja1b)
Written Tibetan lce < pre-Tibetan *ɣl̥ʲe (Hill 2013: 195), ljags < pre-Tibetan *ɣlʲaks (hon.; my attempt at a Hill-style reconstruction; I've changed his *ḫ to *ɣ which is his phonetic value for *ḫ)
I wish I knew the Pyu word to complete the set of the 'big five' Sino-Tibetan literary languages, but Pyu basic vocabulary is all but unknown.
To keep things simple, I have not looked at other potentially related *l-words in Chinese, much less other Sino-Tibetan *l-words for 'tongue' or 'lick' available at STEDT.
Before one jumps to the conclusion that all of the above must share an *l-root, one should note Schuessler's (2007: 467) warning:
Initial *l- is a near-universal sound symbolic feature for 'lick / tongue', hence similar words in other languages are not likely to be related, such as MK-PVM [Mon-Khmer-Proto-Viet-Muong] *laːs 'tongue' [Ferlus]; Kam-Tai: S[iamese] liaA2 < *dl- 'to lick' [cf. ], PKS [Proto-Kam-Sui] *lja² ? [Thurgood].
Proto-Kra *l-maA 'tongue' (Ostapirat 2000:
223; cf. Proto-Kam-Sui *maA 'id.' [Peiros]),
Proto-Hlai *liːnʔ 'id.' (Norquest 2016 appendix: 127),
Proto-Tai *liːnC 'id.' (Pittayaporn 2009:
389), and Proto-Austronesian *lidam (on the basis of
and Rukai; Blust and Trussel
2019) also fit the pattern. (A single Proto-Kra-Dai word for
'tongue' doesn't seem to be reconstructible.)
Continental 'Altaic' words for 'tongue' have noninitial l-:
Ming Jurchen ilenggu ~ ilenggi, Written
Mongolian kelen, and Turkish dil. (But
peripheral 'Altaic' words don't: e.g., Korean hyŏ < *he
and Japanese shita.)
European examples are English lick and Latin lingua
'tongue'. (The latter, of course, has an irregular l- < *d-
which became the t- in tongue. Wiktionary derives
the l- of lingua by analogy with lingō 'I lick',
the true Latin cognate of lick. If we ignore that inconvenient
fact, we could be daring and 'reconstruct' a 'Proto-World' *lV
Schuessler was of course warning against linking Sino-Tibetan words to non-Sino-Tibetan words which happen to share the same initials, but lookalikes do also occur within families: e.g., lick and lingua. There could, at least in theory, be two unrelated lateral roots for 'tongue' in Sino-Tibetan.
Trying to reconcile the small set of Sino-Tibetan forms that I listed at the beginning runs into all sorts of difficulties:
Prelaterals (i.e., whatever comes before the L: prefixes or first syllables of disyllabic roots?): If Old Chinese *mI- and pre-Tangut *PI- are prefixes, what are their functions? Maybe the unknown pre-Tangut labial *P- was *m-. (The high vowel *-I- in both proto-forms is needed to account for the fronting of *a.) The labials in those prelaterals clash with Burling's Proto-Lolo-Burmese alveolar *s- and Hill's pre-Tibetan velar *ɣ-.
Laterals: Chinese and Proto-Lolo-Burmese have a voiced *l-, pre-Tangut has voiceless *l̥- (pre-Tangut *Sl- would correspond to Tangut l- + vowel tension), and pre-Tibetan has both voiced *-lʲ- and voiceless *-l̥ʲ- with palatalization that might be a trace of a preceding high vowel *-I-:
*ɣIl̥- > *ɣIl̥ʲ > *ɣl̥ʲ-
*ɣIl- > *ɣIlʲ > *ɣlʲ-
Vowels: Three types are in the six words at the top:
Cantonese -ei only recently broke from *i.
One pre-Tibetan word has *e.
The others have *a.
Codas: Cantonese tone 6 points to a voiced pre-Cantonese initial *l- and a pre-Cantonese final *-(p/t/k-)s.
Old Chinese has final *-t.
Proto-Lolo-Burmese has no coda.
Pre-Tibetan has both codaless and *-ks forms.
Pre-Tangut *-C could have been
*-k (as in pre-Tibetan)
*-t (as in Old Chinese)
or even *-p (cf. labial-final 舔 'to lick' < Old Chinese *l̥emʔ [unattested!])
- the same three stops that might have preceded *-s in pre-Cantonese.
If one regards the various codas as suffixes, one should ideally be able to identify the functions of those suffixes. Affixation can be a dangerous pseudoexplanation for mismatching segments in forms under comparison.
This exercise shows how far we are from being able to reconstruct Proto-Sino-Tibetan. Much more work needs to be done on subgroups before the outlines of their common ancestor can emerge.