HAPPY NEW YEAR 2020
It's still the year of the pig in traditional East Asian calendars, but it's the year of the rat (2020) if one coordinates the Chinese animal cycle with the Gregorian calendar:
Center (red): Khitan large script <RAT>
Horizontal (green): Khitan small script <216> <?> and <151> <ghu> split by <RAT>
Vertical (blue): Jurchen <sin> and <ge> for singge 'rat' split by <RAT>
Circle (black): Jurchen <TWENTY> x 20
Last night I realized that Khitan small script character 216 might be a derivative of 118 <qu>:
Let's assume 216 was <qu*> with <*> indicating 'different from <qu> in some way'. Then <216.151>
would be read <qu*ghu> which is close to Written Mongol qulughana 'rat'.
What if <qu*> were <qul>? <qul.ghu>
is close to qulughana, but I wouldn't expect Khitan u
to correspond to Written Mongol a.
That's where I left off last night. Today I realized that
<ghu> might be read <ugh> after a consonant. So maybe
<216.151> was read <qul.ugh> which is even closer to
Written Mongol qulughana and requires no vocalic gymnastics.
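One rough way to see why <qul.ugh> counts as 'even closer' is to count how many leading letters each candidate reading shares with qulughana. This is only a minimal sketch: the joined strings `qulghu` and `qulugh` are a flattening of the hypothetical readings <qul.ghu> and <qul.ugh>, and counting romanized letters (where gh is a digraph for a single consonant) is an assumption made for illustration, not a real measure of phonetic distance.

```python
from os.path import commonprefix

TARGET = "qulughana"  # Written Mongol 'rat'

# Hypothetical readings of <216.151>, flattened to plain strings.
CANDIDATES = {
    "<qul.ghu>": "qulghu",
    "<qul.ugh>": "qulugh",
}


def shared_prefix(reading, target=TARGET):
    """Return the leading letters that reading and target have in common."""
    return commonprefix([reading, target])


for label, reading in CANDIDATES.items():
    prefix = shared_prefix(reading)
    print(f"{label}: shares {prefix!r} ({len(prefix)} letters)")
```

Under these assumptions <qul.ghu> matches only through qul-, while <qul.ugh> matches all the way through qulugh-, which is the sense in which the second reading needs no vocalic gymnastics.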
The low frequency of 216 (7 times in the 契丹小字研究 Qidan xiaozi yanjiu corpus and 0 times in initial position in Wu and Janhunen 2011 [whose index is organized by initial graphs]) suggests that it probably did not represent a simple CV syllable. If it didn't represent the CVC syllable qul, it may have represented a CVCV sequence qulu, and <qulu.ugh> was read qulugh.
The <qul(u)> hypothesis could be confirmed if 216 alternated with <qu.l>, <qu.ul> (= <qu.lu>?), etc.
As far as I know, 216 appears only in initial position with one exception: this block
from line 3 of the second inscription in the 萬部華嚴經塔 Wanbu
Avataṁsakasūtra Pagoda in Hohhot.
2. I still practice writing Tangut, Khitan, and Jurchen (TJK) every day. Recently I added Manchu to my regimen and today I started writing Mongolian (in the traditional script - I still don't know how to handwrite Ө and Ү in Cyrillic).
All my TJK exercises begin with the date. I'm still going to date
these blog entries in Jurchen since it's the thousandth anniversary of
the Jurchen large script or close to it (see Kiyose [1977: 22] for
three possible dates: 1119, 1121, and 1123; Kane [1989: 3] gives the
date 1120, though Kane [2009: 3] gives the date 1119). Today's date in
songgiyan uliya aniya juwa emu biya ice nadan inenggi
'yellow pig year, ten one month, new seven day'.
3. Last night I learned about prothesis in Bashkir:
арыш 'rye' < Russian рожь
өҫтәл 'table' < Russian стол
эскәмйә 'bench' < Russian скамья
but why ө- [ø] instead of э [e] in өҫтәл?
The prothesis is mostly unsurprising, but these correspondences are:
B ы [ɤ] ~ [ʌ] : R о [o]
B ә [a] : R о [o]
1.2.11:00: I forgot to mention these cases of prothesis in native words:
ыласын 'falcon' < *laːčïn
ысыҡ 'dew' < *čïq
Without more Bashkir data, I can't test my guesses for motivations: e.g., avoiding initial l- and making monosyllables disyllabic.
The letter ҡ <q> surprised me since I'm accustomed to қ <q>
from Kazakh, etc. Why do Bashkir and Siberian
Tatar have their own special ҡ <q>? Siberian Tatars
were educated in (Volga)
Tatar which has к <k> for /k/ (including a [q] allophone)
and къ <k"> for /q/.
4. Today I learned about the Caucasian Albanian script used to write a (near?-)ancestor of the Udi language.
I've thought Old Chinese might have had pharyngealized vowels, so
I'm interested in the phonetics of Udi's
5. What is the etymology of Persian شمشیر <šmšyr> shamshir,
first (?) attested in Middle Persian as <šmšyl>? It doesn't look
Indo-European. Is it an areal word?
6. Why does the Persian word/name فرشته <frsth> fereshte
< firishta sometimes appear as Farishta(h),
e.g., in this
1958 Bollywood film title (फरिश्ता Phariśtā; cf. Urdu فرشته
list of Pashto (not Persian, I know) names?
220.127.116.11:56: YELLOW PIG 12/6
songgiyan uliya aniya juwa emu biya ice ninggu inenggi
'yellow pig year, ten one month, new six day'
1. Last night I looked up 䯗 'hip bone' and discovered it could also be called the innominate bone. Why 'nameless'?
2. Are 清樂 Shingaku
'Qing music' lyrics an overlooked source of data for premodern Mandarin
reconstruction? In this sample from 月琴樂譜 Gekkin gakufu (Moon
Guitar Sheet Music, 1877), 兒 (now ér
[aɚ˧˥] in modern standard Mandarin) has the furigana ルウ <ruu>.
That seems to indicate that the kana transcription is based on a
dialect in which 兒 was pronounced like [ɻ̩]. (Other evidence rules out
the most obvious interpretation [ruː]: e.g., no Mandarin dialect has
[u] in 兒.)
The date of the text does not necessarily indicate that the [ɻ̩]
pronunciation still existed in the source dialect as of 1877. The kana
spelling ルウ <ruu> could have been copied from some earlier source.
ルウ <ruu> bears no resemblance to ジ <zi> [dʑi], the usual
Japanese reading of 兒. Strictly speaking, the two Japanese borrowings
are not from the same
dialect in two different periods: <zi> is from a 7th century
northwestern Chinese dialect, whereas <ruu> is from a Qing
(perhaps 18th century?) Mandarin dialect. Nonetheless the latter
probably underwent more or less the same changes as the former, so as a
convenient fiction, here's how the sources of <zi> and
<ruu> could be bridged:
Stage 1: *ɲʑi > borrowed into Old Japanese as /Nzi/ (> modern [dʑi])
Stage 2: *ʑi
Stage 3: *ʐi
Stage 4: *ʐɻ̩
Stage 5: *ɻ̩ > borrowed into Edo Japanese as /ruː/
Modern standard [aɚ] is from a stage 5-type form that developed a prothetic vowel:
*ɻ̩ > *əɻ > *ɚ > [aɚ]
In some Mandarin varieties, only the prothetic vowel has survived without any trace of retroflexion: e.g., 壽縣 Shouxian [ə] and 鳳陽 Fengyang [a] for 兒.
It is tempting to derive Sino-Korean 아 a for 兒 from a Fengyang-like form, but that would be anachronistic. Fengyang [a] is probably a very recent development from *ar, whereas the earliest attested ancestor of 아 a is ᅀᆞ zʌ borrowed from a form like stage 4 *ʐɻ̩. zʌ became ʌ in the 16th century, and ʌ then became a in the 18th century.
3. I don't understand how Korean z vanished without a trace. Lee and Ramsey (2011: 142) state that "early examples of the elision of z are all restricted to the environment _i, y, which suggests that the process of change started there." They give these examples:
/sʌzi/ > /sʌi/ 'interval'
/nʌyzir/ > /nʌyir/ 'tomorrow'
In those particular cases, I can imagine /z/ being phonetically something like [ʑ] that lenited to [j] and then disappeared before /i/. But what were the intermediate stages between /z/ and zero in initial position before /ʌ/ as in 15th century /zʌ/ > 16th century /ʌ/?
I thought [ɦ] might be a possible intermediate stage by analogy with Sanskrit:
Proto-Indo-Iranian *ĵʱ > Sanskrit h [ɦ] but Avestan z
I assume there was a stage like *ʑʱ underlying both
Sanskrit and Avestan reflexes. (No, see topic 4 below.) That stage
would be like Middle Korean /z/. In some modern Indic languages,
Sanskrit initial h- has disappeared in reflexes
of hima- 'winter'. I don't know if that's a regular change.
4. I've been trying to work out the phonetics of Proto-Indo-Iranic¹ (PII) reflexes of Proto-Indo-European (PIE)
4.1. The PIE starting point:
4.2. The first palatalization in PII
4.3. Affrication in PII (cf. the alveolar affricate reflexes of Sanskrit palatals in some modern Indic languages)
4.4. The merger of plain velars and labiovelars
4.5. The second palatalization in PII
Velars palatalized in certain environments. Compare:
*kʷe > *ke > *ce
(palatalization before *e) 'and'
4.6. The merger of *e and *o into *a made the second palatalization phonemic:
*ce > *ca 'and'
It was no longer possible to regard *c as an allophone of
/k/ before /e/, since /e/ no longer existed. (The e of later
Indo-Iranic languages is not from the earlier *e that merged
with *a: e.g., Sanskrit e is from PII *ai which
could be from PIE *ei or *oi but not PIE *e.)
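The ordering in 4.4 through 4.6 can be sketched as a small cascade of rewrite rules. This is a hedged illustration, not a full model of PII: the regular expressions only cover voiceless *kʷ and *k, the notation is simplified to plain strings, and the form *kʷo is a hypothetical input added purely to show the contrast (only *kʷe 'and' comes from the discussion above).

```python
import re

# Ordered sound changes, as (label, pattern, replacement).
RULES = [
    ("4.4 labiovelars merge with plain velars", r"kʷ", "k"),
    ("4.5 second palatalization before *e", r"k(?=e)", "c"),
    ("4.6 *e and *o merge into *a", r"[eo]", "a"),
]


def derive(form):
    """Apply each rule in order and return every intermediate stage."""
    stages = [form]
    for _label, pattern, replacement in RULES:
        stages.append(re.sub(pattern, replacement, stages[-1]))
    return stages


print(" > ".join(derive("kʷe")))  # the 'and' example
print(" > ".join(derive("kʷo")))  # hypothetical form for contrast
```

After the last rule the two outputs differ only in c versus k before the same vowel a, so the palatal can no longer be predicted from its environment.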
1.1.0:59: The following sections deal with post-PII developments.
4.7. Pre-Sanskrit (Proto-Indic²) stage 1
The affricate series palatalized. I thought the absence of *ts-type affricates in Proto-Dravidian might have pressured a shift away from alveolar affricates, but the traces of Indic in the Near East - far from Dravidian - underwent stage 2 (4.8 below): e.g., the name Paršasatar from praśāstar- 'director' with ś < PII *ts-.
4.8. Pre-Sanskrit (Proto-Indic) stage 2
Voiceless *tɕ simplified to *ɕ.
The voiced affricates merged with the voiced palatals.
I don't know the order of those two changes, so I show the results of both changes in the same table instead of arbitrarily showing one change at a time in two tables.
4.9. Sanskrit (Proto-Indic)
*ɟʱ weakened to h [ɦ].
4.10. Proto-Iranic (continuing from 4.6)
The voiced aspirate series merged with the plain voiced series.
The affricates deaffricated. The change of *ts to s is roughly parallel to the change of *tɕ to ś in Sanskrit. But note that Proto-Iranic *dz became Avestan z, whereas pre-Sanskrit *dz did not become Sanskrit ź [ʑ], a sound that does not exist in Sanskrit.
The exact phonetics of c and j are unknown. They
were palatal unlike s and z, so I have projected
palatal stops forward into Avestan. But maybe Avestan c and j
were actually affricates.
4.12. Summing up
¹1.1.0:40: I favor the term Iranic by analogy
with Turkic, Mongolic, etc. to avoid confusion with the country of Iran.
²18.104.22.168: I prefer the term Indic to Indo-Aryan, as the word Aryan is shared by both Indic and Iranic. Ironically, the name Indic is actually Iranic, as it is a Hellenization of Old Persian 𐏃𐎡𐎯𐎢𐏁 <ha i du u sha> [hi(n)duš] 'India', cognate to Sanskrit Sindhus 'Sindhu'. The Old Persian form has two Iranic innovations:
*s > h
*dʱ > d (cf. *gʱ > g in 4.10 above)
It occurs to me tonight that an Indic name for Indic would be Sindhic,
but that's not going to catch on. No one is going to rename the country
Sindhia either. And Hindutva advocates
are probably not going to change the name of their ideology to Sindhutva.
22.214.171.124:45: YELLOW PIG 12/5
songgiyan uliya aniya juwa emu biya ice shunja inenggi
'yellow pig year, ten one month, new five day'
1. I checked Jan van Steenbergen's Interslavic page for updates and noticed a new item in the menu:
The Painted Bird (in Czech: Nabarvené Ptáče) a Czech-Slovak-Ukrainian film written, directed and produced by Václav Marhoul. It is based on Jerzy Kosiński’s novel The Painted Bird from 1965.
The action takes place in some unspecified East-European, Slavic-speaking country. A place that cannot directly be linked to a specific Slavic population requires a language that can instantly be recognised as Slavic but not be linked directly to any specific Slavic population either. That's why Marhoul decided to use Interslavic:
2. I just bought e-access to Vojtěch Merunka's Interslavic zonal constructed language: an introduction for English-speakers. Google says I can check a box to "Make [the book] available offline", but I can't find it.
On page 5, Merunka writes (12.31.14:03: links added),
Interslavic is also an interesting experiment of alternative history: If there was not such strong pressure from the Frankish Latin-oriented church (e.g. Wiching of Nitra and his band) against the Moravian Church in the 9th century, the invasion of the Hungarians into Central Europe and the subsquent collapse of contacts between Moravia (now a territory of both the Czech and Slovak Republics) and Bulgarian, Serbian and Kiev (later Russian) states, it is possible to imagine a hypothetic different evolution of the Slavic early Middle Age language - we have seen a similar phenomenon in the Arabic World: After the end of natural linguistic unity during the Middle Ages, the modernized universal Arabic language based on the religious language of the Qur'an still prevails. It is an artificial language which is close enough to the various contemporary spoken national dialects of Arabic that it is recognized as the standard for communication between Arabic nations and for contact with foreigners and used as an auxiliary language by both state apparatus and the media.
It would be fun to see historical fiction depicting a world where Interslavic - probably simply 'Slavic' - has the same position that modern standard Arabic has.
Page 143 presents a modified Arebica alphabet to
3. 𗡠 0271 2mer4, representing the second syllable of 𗡢𗡠 0702 0271 1to'4 2mer4 'to seek, find', has a right side (Boxenhorn code: baedar) found nowhere else. I found it in Li (2008: 47) when looking up 𘅊 0273 1le1 for my last entry.
2mer4 sounds like Old and Middle Chinese 覓 *mek 'to
seek'. If I were to force a relationship between the two, I could trace
2mer4 back to pre-Tangut *RImek-H with labial dissimilation:
*Pek > *Pew > *Pej > Pe
*RImek-H could be related to
𗑉 4684 1me1 < *CAmik or *mek 'eye'
cf. Tibetan mig (archaic dmyig) 'eye' (but Old Chinese has 目 *Cmuk - is *Cmikʷ possible?)
which is the word that made me discover labial dissimilation. Two
*CAmik > *CAmiw > *CAmij > *CAmi > *CAmai > *mai > 1me1
The relative chronology of *P-w dissimilation and *A-triggered diphthongization is uncertain.
Japhug tɯ-mɲaʁ has a high vowel presyllable,
not a low vowel presyllable needed to condition Grade I (the -1
at the end of 1me1).
*mek > *mew > *mej > 1me1
But there are other possible pre-Tangut sources of 2mer4 that would rule out a connection with the Chinese word:
𗡢 0702 1to'4 'to seek' can appear by itself. That suggests that 𗡠 0271 2mer4 might be a formerly independent verb that only survives as the second half of a synonym compound 'seek-seek'.
4. Li (2008: 120) gives this example of 0702 as an independent verb from The Timely Pearl 292:
5098 0702 0760 1715
2ngon4 1to'1 2dzen4 1rar4
'case seek judge ?'
It corresponds to Chinese 案檢判憑 'case examine judge ?'
Nishida (1964: 215) has the translation 'to examine the case and hand down a judgment'. Nishida (1964: xii) says Burton Watson and a ヤンポルスキー (Yampolsky? - I don't know who this is, or what his preferred Anglicization of Ямпольский is) helped him with the English translations. Later, Nishida (1964: 216) has the translation 'deliver a judgment' for 判憑 in Timely Pearl 302.
I would think then that 𘅤 1715 1rar4 /憑 means 'to hand down' or 'to deliver'. But the basic meaning of 𘅤 1715 1rar4 is 'to write' (Li 2008: 285). So might the Tangut phrase in The Timely Pearl mean 'write a judgment'?
憑 can be translated many ways in Chinese, but none of those translations mean 'write' or 'hand down' or 'deliver'. Might it be 'proof': i.e., 'evidence'? If so, then there is only a vague parallel between the Tangut object-verb sequence 𗍷𘅤 'write a judgment' and the Chinese verb-object sequence 判憑 'judge evidence (?)', and mechanically equating 𘅤 with 憑 may be a mistake.
Then again, to say Burton Watson's knowledge of Chinese dwarfs mine would be an understatement, and maybe 判憑 is an idiom 'deliver/hand down a judgment' that I just failed to confirm in other sources.
I always assumed Watson had learned Japanese in the American
military in WWII, but in fact he didn't know any Japanese when he
arrived in Japan in 1945, and he was actually
a Chinese major.
5. My DuckDuckGo search for Yampolsky led me to a video of minerva scientia pronouncing Tangut in Gong's (more or less) and Arakawa's reconstructions.
6. ElitekidMu0 comments on that video:
Fun fact: Thunder Force VI [Wikipedia], a shooting game released in 2008 by SEGA for the PS2, included the Tangut Language as the main language for the protagonist of the series, Galaxy Federation (Vastian). Another language included in the game is the Mongolian Script, used by the antagonist of the series, ORN Empire.
7. Last night I learned that Kara Ben Nemsi was meant to mean 'Carl son German' (though nemsi is really closer to نمساوي namsāwiyy/nimsāwiyy 'Austrian'; 'German' is ألماني 'almāniyy).
Karl May has a way with foreign names. I couldn't have come up with
something equivalent to Old Shatterhand
or Old Surehand.
8. I just noticed that the Old English Wikipedia (Ƿikipǣdia) is
Sēo Frēo Ƿīsdōmbōc
'the free wisdombook' (Ƿ <W> wynn is a rune borrowed into the Old English alphabet)
Are forms like Irish seo
'this' the only living reflexes of Proto-Indo-European
*só retaining s-? Greek [o] has lost h- <
*s-, and English the has a th- that spread from
the th-reflexes of the *t-initial oblique forms of *só.
9. I finally got around to rewriting my lost entry for 12.26 from memory. I finished right after I ordered a used hardcover copy of William C. Hannas' The Writing on the Wall: How Asian Orthography Curbs Creativity (2013).
10. Tonight I discovered the variant 槑 for 梅 <PLUM>.
11. Baxter and Sagart (2014) reconstruct 梅 <PLUM> in Old Chinese as *C.mˤə. I suspect that *C was a voiceless consonant because Vietnamese mơ 'apricot' has a ngang tone pointing to an earlier *m̥- which may be from an even earlier *C̥m- with a voiceless *C̥- that conditioned the devoicing of *m-. I would reconstruct the word in Early Old Chinese as *C̥Amə with a low first vowel that triggered the warping of *ə to *ʌə:
*C̥Amə > *C̥Amʌə > *C̥mʌə > *m̥ʌə > *mʌe > *mʌj > *mɑj > *mwɑj > *muj > *mwəj > *məj > standard Mandarin [mej]
It is possible that *C̥A- was simply completely lost after warping in (many? most? all?) dialects other than the one underlying Vietnamese *C̥m-. I have not yet found any Chinese varieties with a yinping tone pointing to *m̥-.
The *m̥- in the scenario above is of late origin. An earlier
*m̥- in Old Chinese became *x- in stage 2 below, whereas
newer *m̥- merged with *m-:
The tones above are conditioned by final glottals: final glottal
stops conditioned the falling-rising tone [˧˩˧] and stage 3 voiced *m-
and the absence of a final glottal conditioned the high rising tone
songgiyan uliya aniya juwa emu biya ice duin inenggi
'yellow pig year, ten one month, new four day'
1. Tonight it occurred to me that the Jurchen and Khitan large script characters for 'four' might be graphic cognates:
One might be rotated - but which one? And did the Parhae script have both rotated and nonrotated variants of <FOUR>?
12.30.0:17: Both <FOUR>s have four strokes, so they may simply be two types of tally marks formalized as characters.
In any case, the Khitan large script character is not to be confused with Chinese 卅 <THIRTY> which is a fusion of three 十 <TEN>s.
12.30.12:50: Chinese 卅 <THIRTY> in turn should not be confused
with the Jurchen phonogram <sui>:
Jin (1984: 25, 26, 180) reports the first pair of forms in the 大金得勝陀頌碑 Great Jin Victory Hill stele (1185) and the second 卅-like pair of forms in the Berlin and Tōyō bunko copies of the Ming dynasty Bureau of Translators vocabulary from c. 1500. Without examining the original texts, I cannot be certain about minor variations such as the presence or absence of a hook in the 1185 stele.
I fear that the Bureau of Translators' forms might be
unintentionally 'sinified' in the sense that unfamiliar Jurchen
characters were accidentally modified by scribes more familiar with
sinography. Perhaps the resemblance of <sui> to Chinese 卅
<THIRTY> in the Bureau of Translators vocabulary might be an
example of sinification.
12.30.15:33: Jin (1984: 58, 76) derives Jurchen <FOUR> from
the phonogram <da> which in turn he derives from Chinese 屠:
In the Jin dynasty, 屠 was pronounced *tʰu. Why base a
phonogram <da> on a Chinese character pronounced *tʰu?
I don't think <da> was a Jin dynasty invention. I think its roots go back further to a period when 屠 was pronounced as *da in Late Old Chinese. (屠 was once a transcription character for -ddha in 浮屠 *bu da = Buddha.) In other words, I think <da> is potential evidence for the Jurchen large script being an heir to an old tradition of phonetic writing rather than a 12th century invention.
I don't think there is any relationship between <FOUR> and
<da> beyond graphic convergence - the bottom of <da> (known
only from two inscriptions) may have been remodelled after the far more
common character <FOUR>.
2. Tonight while copying character 236 of the Golden Guide, I miswrote the Tangut character element 𘡛 by placing the dot too low so it intersected the stroke below it.
Nishida (1966: 242) interpreted 𘡛 as a radical for things having to do with 愛惜 aiseki 'cherish'. It just occurred to me that 𘡛 might be derived from the top of 愛 <LOVE> or the top right of 惜 <CHERISH>.
But ... what is 𘡛 doing on the top
of 𘓉 0993 1lhew1 'to herd',
of all things? Is 𘓉 0993 a semantic
compound like <CHERISH.LIVESTOCK>?
But ... the bottom of 𘓉 0993 (Boxenhorn
code: baecie) is neither 'livestock' nor short for a character for any
animal. The only other character with baecie is 𘅊 0273 1le1, a character for writing
3. I was surprised by this passage (emphasis mine):
Martin Kümmel similarly proposes, based on observations from diachronic typology, that the consonants traditionally reconstructed as voiced stops were really implosive consonants, and the consonants traditionally reconstructed as aspirated stops were originally plain voiced stops, agreeing with a proposal by Michael Weiss that typologically compares the development of the stop system of the Tày language (Cao Bằng Province, Vietnam).
But then I checked Pittayaporn (2009: 110) who explains that in Cao Bằng,
Proto-Tai *implosives > [plain voiced stops]
Proto-Tai *plain voiced stops > [voiced aspirate stops]
The voiceless aspirate stop reflexes of Thai, Lao, etc. are from Cao Bằng-like *voiced aspirate stops (e.g., the name Thai [tʰaj] itself < *dʱ- < *d-; the name Tai for the language family has an unaspirated [t] reflex of *d-).
Was there a push or pull chain in Tai? I imagine a pull chain:
*plain voiced stops became *voiced aspirate stops, leaving a gap to be
filled by *implosives becoming *plain voiced stops. But that's just an
offhand scenario with zero research, much less testing.
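To make the ordering in that pull-chain guess concrete, here is a minimal sketch that treats the two Cao Bằng changes as discrete steps over a toy inventory. The labial segments ɓ, b, bʱ stand in for the whole series, and modelling each change as a single instantaneous replacement is an assumption for illustration; real chain shifts may overlap in time.

```python
# Each rule is (source segment, result segment).
ASPIRATE = ("b", "bʱ")   # *plain voiced stop > voiced aspirate stop
DEIMPLODE = ("ɓ", "b")   # *implosive > plain voiced stop


def apply(segments, rule):
    """Replace every occurrence of the rule's source with its result."""
    source, result = rule
    return [result if seg == source else seg for seg in segments]


proto = ["ɓ", "b"]  # toy Proto-Tai labial inventory

# Pull chain: *b vacates its slot first, then *ɓ moves into the gap.
pull = apply(apply(proto, ASPIRATE), DEIMPLODE)

# Opposite strict ordering: *ɓ > b first, then every b > bʱ.
reverse = apply(apply(proto, DEIMPLODE), ASPIRATE)

print("pull chain:   ", pull)
print("reverse order:", reverse)
```

In this toy model the pull order keeps the two series distinct, while the strictly reversed order collapses them into one, so a push chain would have to be pictured as overlapping shifts rather than two clean steps.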
I can see something similar happening in Proto-Indo-European ... except for this problem:
in Proto-Tai (and languages with implosives in general), *ɓ- is common and *ɠ- does not exist
in Kümmel-style Proto-Indo-European as I understand it, *ɓ-
would be rare, and *ɠ- and *ɠʷ- would be common
The ejective hypothesis, on the other hand, correctly predicts that
Proto-Indo-European labial *pʼ (corresponding to *ɓ- in
the implosive hypothesis) would be rare or absent.
4. I wish there were animated GIFs like the Georgian ones at georgian-language.com for Manchu and traditional Mongolian letters. I've been using Jun Jiang's Manchu app which has animated images for Manchu syllables and words, but it doesn't seem to match the verbal (nonvisual) instructions in Roth Li's Manchu textbook, so I'd like to see a second opinion.
5. I discovered that the Old English Wikipedia has a runic viewing option. Select ᚱᚢᚾ <run> under the article title.
12.30.0:16: Try the ȝƿ and ᵹƿ viewing options too.
6. Why is Gdańsk
Gduńsk in Kashubian? Is Polish a : Kashubian u
a regular correspondence in some environment(s)? I don't see anything
like *a > u in Stone's (1993: 765) sketch of
Kashubian vowel history.
7. Another Kashubian surprise: kùńszt [kwuɲʃt] (I think) 'art' < German Kunst. Why [wu]? How did Kashubian develop [wu] in native words? Is [ɲ] instead of [n] due to assimilation with [ʃ]? Was the word borrowed from a German dialect in which 'art' was [kunʃt] instead of [kʊnst]? 'Hyperlabial' [wu] for [ʊ] seems odd to me.
Aha, I see now that Kashubian /u/ becomes [wu] "[i]nitially or after
a labial or a velar" (Stone 1993: 762). So [wu] has nothing to do with
8. How did Proto-Slavic
*sŭnŭ 'sleep' become Lower Sorbian soń with a
palatal ń instead of the expected n as in the rest of
Slavic: e.g., Upper Sorbian son?