Or, in Jurchen,

<nion.giyan mo.nion DAY> niongiyan monion/bonion inenggi

1. I originally wrote 'green' and 'monkey' as nongiyan and monon more or less following Jin Qizong (1984), but then I realized that Ming Chinese 嫩 *nun in their transcriptions was the only possible way to write Jurchen [ɲɔn] in sinography since there are no characters for *ɲon, *ɲun, etc. The Manchu cognates niowanggiyan 'green' and monio/bonio 'monkey' with nio [ɲɔ] confirm a palatal nasal [ɲ]. It would be unlikely for n to become [ɲ] before a nonpalatal vowel [ɔ].

2.17.19:43: The vocabularies of the Bureau of Translators and Interpreters have different transcriptions of the first syllable of 'monkey': 卜 *pu (BoT #152) and 莫 *mo (BoI #332, #424). This parallels the b [p] ~ m variation in Manchu. Anna Dybo's Tungusic dictionary regards the m- as secondary. The m- may be due to assimilation with the following -n-: cf. the b- ~ m- alternation in the paradigm of Manchu 'I':

b- when no nasal follows: bi (nominative)

m- when a nasal follows: mini (genitive), minci (ablative), minde (dative), mimbe (accusative)

be 'we (exclusive)' has the same alternation: e.g., meni (genitive).
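The conditioning above can be written as a one-line rule. This is only a sketch for this pronoun paradigm: the b-initial inputs such as `bini` are hypothetical pre-assimilation forms, the vowel and nasal sets are assumptions about the romanization, and the rule is not meant as a general statement about Manchu (it would overapply to other b-initial words).

```python
VOWELS = set("aeiouū")   # assumption: vowel letters of the romanization used here
NASALS = set("mn")       # assumption: m and n (ng begins with n)

def surface_form(underlying: str) -> str:
    """Change initial b- to m- when the next consonant is a nasal."""
    if not underlying.startswith("b"):
        return underlying
    # Find the first consonant after the initial b- ("" if there is none).
    next_consonant = next((ch for ch in underlying[1:] if ch not in VOWELS), "")
    if next_consonant in NASALS:
        return "m" + underlying[1:]
    return underlying

# Hypothetical b-initial inputs for the paradigm of 'I' and 'we (exclusive)'.
for form in ["bi", "bini", "binci", "binde", "bimbe", "be", "beni"]:
    print(form, "->", surface_form(form))
```

The same rule would turn a hypothetical `bonio` into `monio`, which is the assimilation scenario suggested for 'monkey'.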

2. In "The Day of the Black Horse", I proposed that pre-Tangut *-aw became *-a. I just found a potential example:

*kraw > *kraɰ > *kra > *kri > 𗠭 4533 1ki2 'to call out, to shout'

cf. Written Burmese ကြော် <krau> < *graw? (following Pulleyblank's 1963 analysis of <au>) 'to shout loudly'.

This example entails *-w loss before *a-brightening (i.e., raising to i).

2.17.21:21: But I don't know when *rV > V2 (i.e., Grade II V). Above I've placed that change after *a shifted to *i, but it could have predated that.
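The ordering question can be made concrete by treating each change as a rewrite rule and applying the rules in sequence. The sketch below is a toy: the regular expressions are my own simplifications of the four changes, and tone and grade (the 1 and 2 of 1ki2) are not tracked.

```python
import re

# (label, pattern, replacement), in the order of the derivation above.
W_TO_GLIDE = ("*-w > *-ɰ", r"w$", "ɰ")
GLIDE_LOSS = ("*-ɰ > zero", r"ɰ$", "")
BRIGHTENING = ("*a > *i", r"a$", "i")
R_LOSS = ("*rV > Grade II V", r"r(?=[aeiou])", "")

def derive(form, rules):
    """Apply the rules in order and return every distinct stage."""
    stages = [form]
    for _label, pattern, replacement in rules:
        new = re.sub(pattern, replacement, stages[-1])
        if new != stages[-1]:
            stages.append(new)
    return stages

# Order as in the derivation above.
print(" > ".join(derive("kraw", [W_TO_GLIDE, GLIDE_LOSS, BRIGHTENING, R_LOSS])))
# *r lost before brightening: different intermediate stage, same endpoint.
print(" > ".join(derive("kraw", [W_TO_GLIDE, GLIDE_LOSS, R_LOSS, BRIGHTENING])))
# Brightening ordered before *-w loss: *a is never word-final when the rule applies.
print(" > ".join(derive("kraw", [BRIGHTENING, W_TO_GLIDE, GLIDE_LOSS, R_LOSS])))
```

In this toy, moving *r loss before brightening still ends in ki, whereas ordering brightening before *-w loss leaves the vowel as a, which is why the example entails *-w loss first.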

3. "Talking tactics: Rihanna and the pop stars who change accent" (via Lisa Jansen) mentions an application of phonostatistics I never imagined:

Take the Beatles for example; a band who were masters in vocal shape-shifting, and picked up traits from their fans across the Atlantic during the height of Beatlemania in the US. In You Say Potato: A Book About Accents, authors David and Ben Crystal note the impact of the Beatles’ fluctuating tones. Citing a report by Peter Trudgill in 1980, which examined the way in which the Beatles sounded out the r after a vowel, something most American singers would do, they wrote:

"In 1963/64, in such songs as Please Please Me, almost 50% of the words containing this feature had the r sounded. By the time of the Sergeant Pepper album in 1967, this had fallen to less than 5%. Note that the use of the feature was never totally consistent. That’s normal. When singers copy Americans, they get the accent sometimes right, sometimes wrong. But over the years, the Beatles' singing voices show that they are leaving the mid-Atlantic way behind and starting to sound more consistently British."

That made me wonder if exceptions to sound changes are cases of incomplete imitation.

4. Andreas Hölzl's "Udi, Udihe, and the language(s) of the Kyakala" (2018: 136) mentioned an Alchuka form that looks like the missing link between Jurchen

<GOLD.un> ancun (or alcun?) 'gold' (originally spelled with a single character <GOLD>?)

and Manchu aisin 'id.': anʃïn!

5. Looking up 𗠭 4533 1ki2 'to call out, to shout' in Li Fanwen's 2008 Tangut dictionary, I stumbled on a nearby entry

𗄤 4536 2ror4 'wizard, witch, sorcerer'

Li only mentions attestations in dictionaries. So 2ror4 may be a so-called 'ritual language' word or, in my view, a non-Sino-Tibetan substratum word. The Mixed Categories volume of the Tangraphic Sea mentions several possible (near-)synonyms. I'll look at them tomorrow.

6. Looking at Shimunek's (2017: 218) reconstruction of un-'Altaic'-looking Khitan initial clusters (e.g., kʰtʃʰ- and tʰg-) made me think he could have cited Middle Korean initial clusters like pst- for areal/typological support.

Surprising even from a Middle Korean perspective is his initial [r̩l]. Middle Korean had no r-initial words. More on this tomorrow.

7. Looking at Shimunek's "Post-publication Addendum to Languages of Ancient Southern Mongolia and North China: A Revised Transcription of Middle Mongol in ’Phagspa Script", I wonder how he would reconstruct the initial consonant of ꡖꡞꡘ ꡂꡦ ꡋꡦ <ɦir gė nė> *ɦirgen-e 'person-DAT/LOC' at an earlier stage.

2.17.20:37: Two topics I forgot to mention:

8. I finally got to see text in the Mongolian Latin alphabet. Or to be more precise, two versions of it. I'm confused: Wikipedia says one system

was officially adopted in Mongolia in 1931. In 1939, the second version of the Latin alphabet was introduced but not used widely until it was replaced by the Cyrillic script in 1941.

citing Lenore A. Grenoble's Language Policy in the Soviet Union (2003: 49). But the 1931 date is for Kalmyk, not Khalkha Mongolian in Mongolia, and I don't see any mention of the other points.

On the other hand, the Mongolist György Kara (2005: 187) only mentions an "ephemeral attempt" at a Latin alphabet for Mongolia "launched by Choibalsan in 1940".

(2.18.13:40: No, wait, his timeline [p.197] says there was an experimental alphabet for Khalkha in the "early 1930s". No mention of the specific date 1931 or of a new alphabet in 1939. He gives 1945 as the date of the introduction of Cyrillic for Khalkha.)

9. More confusion: The Wikipedia article on hanja (Chinese characters in Korean) says,

South Korean primary schools abandoned the teaching of Hanja in 1971, although they are still taught as part of the mandatory curriculum in 6th grade. They are taught in separate courses in South Korean high schools, separately from the normal Korean-language curriculum. Formal Hanja education begins in grade 7 (junior high school) and continues until graduation from senior high school in grade 12.

So are hanja taught in sixth grade or not? The first sentence tells me 'yes'; the last sentence tells me 'no'.

I'd still love to see a list of hanja taught in North Korean schools.

THE DAY OF THE BLACK SHEEP

Or, in Jurchen,

<saha.liyan SHEEP DAY> sahaliyan honi inenggi

1a. The Jurchen character <saha> is only attested in the vocabulary of the Bureau of Interpreters (#481, #620), but its shape goes back centuries.

Jin Qizong (1984: 93) observed that there is an identical character in the Khitan large script from a remnant of a memorial from the mausoleum of Emperor Taizu of Liao (r. 916-926). Could that memorial date from the mid-to-late 920s: i.e., only a few years after the 'creation' (whatever that really meant) of the Khitan large script?

As the Khitan large script character for 'black'

is somewhat (though not entirely) different, my guess is that the Jurchen character may be a recycling of a Khitan large script character pronounced saqa (Shimunek [2017: 213] did not reconstruct x or h for Khitan). That character in turn might be derived from a Parhae prototype that was either pronounced similarly or represented an unrelated Parhae (North Koreanic?) morpheme with a meaning similar to whatever Khitan saqa might have meant.

Another possibility is that

were variants of <BLACK> in the Khitan large script. But they might be too different to be variants.

I am hesitant to transliterate

as a logogram <BLACK> because it is also attested in the verb stem

sahada- 'to hunt' (#481); cf. Manchu sahada- 'id.'.

Could that spelling be <HUNT.da> which at some earlier point ? Did the Jurchen originally write 'to hunt' as a single logogram <HUNT>? Was sahaliya then spelled <HUNT.liya> with <HUNT> used as a phonogram for saha-? Perhaps

represented a Khitan root 'to hunt' in the Khitan large script. If so, I cannot think of any plausible cognate Chinese character, though with pareidolia, one can see a 'covered cross' on the right side of 狩 'to hunt'.

1b. Jin Qizong (1984: 296) observed that <liyan> has a near-lookalike in the epitaph for Xiao Xiaozhong 蕭孝忠 (1089):

(shown here in Jerry You's font)

Was that character also read something like liyan? Might the character be from a Parhae graphic cognate of Chinese 亮 or the right side of Chinese 涼? Both 亮 and 涼 would have been pronounced something like *ljaŋ in the northeastern Chinese known to the Parhae (cf. their Sino-Korean reading 량 ryang).

1c. Jin Qizong (1984: 11, 12) found a different form of <SHEEP> in the Jurchen Character Book thought to date from the early Jin dynasty. I presume he identified its meaning on the basis of context (e.g., being surrounded by other animal date terms in sequence?) since the Book is monolingual. He writes this Jin form of <SHEEP> in three different ways in his dictionary:

As I do not have a clear copy of the Book, I do not know which form is attested in it. (Maybe two or more are if the character appears more than once.)

The last form is the closest to Khitan <SHEEP>, though the top elements (ヒ and ユ) are oriented in opposite directions:

Could some or all of these have originated as pictographs of sheep?

2. I was hoping to write a report on Larry Hyman's talk "Functions of Vowel Length in Language: Phonological, Grammatical, & Pragmatic Consequences", but no one was there. There wasn't even a sign indicating a new location or cancellation.

2a. In his abstract, Hyman mentions Bantu languages which

- "have added restrictions which shorten long vowels in pre-(ante-)penultimate word position and/or on head nouns and verbs that are not final in their XP"

- "have lost the [vowel length] contrast but have added phrase-level penultimate lengthening"

Why would vowels shorten in pre-(ante-)penultimate position? Or lengthen in penultimate position?

Those which have "new long vowels (e.g. from the loss of an intervocalic consonant flanked by identical vowels)" are like Mongolian: e.g., the city name Улаанбаатар Ulaanbaatar < *hulagan 'red' + *bagatur 'hero' (the 1924 collocation is obviously of Communist origin and hence cannot be reconstructed at the proto-level).
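The source of the new long vowels can be sketched as a single rewrite: delete g between two identical vowels. This is a minimal illustration that assumes plain romanized input and a five-vowel set; it deliberately does not model the other changes needed to reach the modern forms (loss of *h- and the change of the second vowel of *bagatur).

```python
import re

def contract(form: str) -> str:
    """Delete g between two identical vowels, leaving a long (double) vowel."""
    return re.sub(r"([aeiou])g\1", r"\1\1", form)

for proto in ["hulagan", "bagatur"]:
    print("*" + proto, ">", contract(proto))
```

The outputs hulaan and baatur are intermediate stages only, not the attested Ulaan and baatar.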

2b. I wonder what Hyman would say about Pulleyblank's (1962: 99) and Starostin's (1989) theories of vowel length and Chinese vocalic development in what Sagart (1999) called 'type A' and 'type B' syllables. Four proposals on type B syllables:

Pulleyblank: Old Chinese *Vː > Middle Chinese *jV.

Starostin, OTOH, had the reverse idea: Old Chinese short *V > Middle Chinese *jV. (This is a simplification.)

In the Baxter-Sagart system, Old Chinese *V before nonpharyngeal consonants > Middle Chinese jV (their j is a notational device).

In my system, (1) *high vowels not preceded by high vowels and (2) *low vowels preceded by high vowels > Middle Chinese high vowel-initial diphthongs.

The traditional (i.e., Karlgrenian) view is that Old Chinese *jV > Middle Chinese *jV.

2.16.22:45: A comparison of different views:

Type A syllables (all agree the Middle Chinese reflexes had no *-j- before *-e)

Source | Old Chinese | Middle Chinese
Pulleyblank (1962) | *Ce | *Cej
Starostin (1989) | *Ceː | *Ciej
Baxter and Sagart (2014) | |
This site (my view since 2002) | |

Type B syllables (all agree the Middle Chinese reflexes had *-j- or *-i- before *-e)

Old Chinese
Middle Chinese
Pulleyblank (1962)
Starostin (1989)
Baxter and Sagart (2014)
This site (my view since 2002)

Baxter and Sagart's Middle Chinese notation is not starred since it is not phonetic. Their -ji- is a spelling device to indicate Grade IV chongniu status. I don't know how they think -jie was pronounced.

If I wrote Middle Chinese the way I write Tangut and Tangut period northwestern Chinese, I would write *Cie as Ce4 with 4 for Grade IV. I have considered writing such a notation for Middle Chinese to avoid getting bogged down in phonetic trivia.

3. Two things struck me as I was looking at Shimunek's (2017: 215-217) reconstruction of Middle Khitan vowels.

3a. His Middle Khitan vowel inventory is front-heavy unlike the Mongolic, Jurchen/Manchu, or early Korean systems:

Shimunek's Middle Khitan (3 front vowels)






He respectively places *ɛ and *ʊ higher and lower than I would expect. *ʊ is similarly high in the next table.

Shimunek's Common Serbi-Mongolic (2 front vowels)





Proto-Mongolic (1 front vowel)





Ming Jurchen in the Sino-Jurchen vocabularies (1 front vowel; note the similarity to the Middle Khitan inventory except for the front vowels)






Manchu (1 front vowel; descended from a Jurchen dialect retaining ʊ unlike the vocabularies dialects)







Early Korean (1-2 front vowels; in a more phonetic notation than usual to facilitate comparison with Shimunek's systems)





So far nobody else believes in my *ɛ. I'll live.

(Tables added 2.16.0:16.)

3b. Another surprise from a Mongolic/Jurchen/Manchu/Korean perspective is that his Middle Khitan a and ə belong to the same vowel harmony category, whereas they are typically in opposing categories. Contrast:

his Middle Khitan nar-ən 'tomb-GEN' (instead of †nar-an)


Written Mongolian aqa-aca 'older.brother-ABL' vs. eke-ece 'mother-ABL' (e = [ə])

Jurchen ala-ha 'lose-PERF' vs. ete-he 'win-PERF' (e = [ə]; both from the Bureau of Translators vocabulary, #689, #794)

(2.16.0:24: I wonder if 阿剌 *a la- in Chinese transcription is an error for ana-; the Manchu cognate is ana-bu- 'to lose' with -n-, not -l-. See below for the Manchu verb ala- with -l-.)

Manchu ala-ha 'tell-PERF' vs. gene-he 'go-PERF' (e = [ə])

Korean 받아 pad-a 'receive-INF' vs. 벋어 pŏd-ŏ 'stretch-INF' (ŏ [ɔ] is from earlier ə.)

Vowel harmony is breaking down in the spoken Korean 'infinitive': pad-a may be pronounced (but never spelled!) pad-ŏ (which is heard "increasingly in Seoul today" [Lee and Ramsey 2011: 296]).
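The usual pattern in the contrast above amounts to picking a suffix allomorph by the class of the stem's last vowel. The sketch below is a deliberately crude two-way version: it assumes that only a selects the a-series allomorph, which is enough for the cited forms but ignores the other vowels of each language.

```python
def attach(stem: str, a_series: str, other_series: str) -> str:
    """Pick the suffix allomorph by the last vowel of the stem (toy rule)."""
    vowels = [ch for ch in stem if ch in "aeiouŏ"]
    last = vowels[-1] if vowels else ""
    suffix = a_series if last == "a" else other_series
    return f"{stem}-{suffix}"

print(attach("aqa", "aca", "ece"), attach("eke", "aca", "ece"))  # Written Mongolian
print(attach("ala", "ha", "he"), attach("ete", "ha", "he"))      # Jurchen
print(attach("ala", "ha", "he"), attach("gene", "ha", "he"))     # Manchu
print(attach("pad", "a", "ŏ"), attach("pŏd", "a", "ŏ"))          # Korean
print(attach("nar", "an", "ən"))  # harmonic prediction, not Shimunek's form
```

Under this rule the Khitan stem nar would take the a-series suffix, giving the unattested nar-an rather than Shimunek's nar-ən.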

I think nar-ən is also a case of vowel harmony breakdown possibly facilitated by a lack of stress on suffixes. Kane (2009: 132) gives examples of a-nouns followed by a genitive written <an>. However, Kane does not give examples of the type ... aC-an; all the stems in his examples end in -a, so, for instance,

<qa.gha.an> 'of the qaghan'

might have simply been [qaʁan] rather than [qaʁaːn]. Perhaps a-final nouns took -n and aC-final nouns took -ən.

THE DAY OF THE BLACK HORSE

Or, in Jurchen,

<saha.liyan HORSE.in DAY> sahaliyan morin inenggi

I can't believe I started the day thinking I'd never have enough to fill this entry.

1. I recall that Grinstead (1972) derived the Jurchen character <HORSE> from Chinese 保 'to protect', which would have been pronounced *paw (would Pulleyblank have reconstructed *pɔw?) in Jin Chinese. But why would the Jurchen write an m-word with a p-character?

Today I realized that <HORSE> might be derived from a Parhae script graphic cognate of 保 with a para-Japonic (!) reading cognate to Japanese mor- 'to protect'.

2. I discovered Lisa Jansen's blog Lisa Loves Linguistics. Excerpts from two posts:

2a. " 'He said me haffi work, work, work…' – Rihanna's multivocal identity":

the insertion of a palatal glide between [k] and [a] as in cyar instead of care which is also a more or less Pan-Caribbean feature

At first I thought of how English [kæ] is borrowed into Japanese as kya (e.g., cat as kyatto), but care doesn't have [æ]. Is care [kja] in the Caribbean?

2b. "The Sociolinguistics of 'Indie' Music: Kate Nash" (by Anika Gerfer)

Trudgill (1983) and Simpson (1999) discovered that a range of British artists of the mid-20th century switched to an ‘American accent’ in singing (Simpson labels this set of features associated with ‘American accents’ the “USA-5 model”).

That reminds me of the story behind the Kinks' "Come Dancing":

While recording "Come Dancing," Ray was asked to sing in an "American accent," a request he turned down.

Even the content was thought to be too English for the American market:

Although Arista Records founder Clive Davis had reservations about releasing the single in the United States due to the English subject matter of dance halls, the track saw an American single release in April 1983.

But the lyrics didn't bother me in Hawaii.

3. I finally realized that Sino-Korean 天動 chhŏndong 'thunder' became 'nativized' as 천둥 chhŏndung to harmonize the lower series vowel o with the preceding higher series vowel ŏ.

Korean vowel classes (added 2.16.0:41; ă is obsolete)


4. A Haiman Tetralogy

Quoting from a grammar that's actually fun to read!

4a. In the Khmer dialect described by Haiman (2011: 1), what he transcribes as av (ៅ <au> in Khmer script) is pronounced as [aɯ]. I suspect a similar shift of *-aw > *-aɰ occurred in Tangut. Eventually this *-aɰ simply became -a.

4b. Haiman (2011: 10):

Leaving this small number of words aside, it is still remarkable that in a language where almost every two-consonant cluster is attested word-initially, there are (virtually) no such (glottal stop + C) clusters.

I think "every" is too strong for Khmer which has many constraints on initial clusters: e.g., no clusters starting with implosives.

I'm reminded of how I thought anything could be in a Pyu consonant cluster (after seeing sequences like kṭl- from inscription 12 and tdl- from inscription 16) until I actually collected all the clusters in the corpus and put aside marginal oddities. Then patterns emerged: e.g., what appeared to be three-consonant clusters were really sequences of preinitials followed by initials spelled with two consonants:

kṭl- /k.L̥/

tdl- /t.L/

/L̥ L/ may have been lateral affricates [tɬ dɮ].
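The reanalysis is essentially a parsing rule: peel a two-letter spelling of a single initial off the end of the written cluster and treat whatever precedes it as a preinitial. A minimal sketch, assuming only the two digraphs from the examples above (the table and function names are illustrative, not an inventory of Pyu):

```python
import unicodedata

# Digraph spellings of single initials, taken from the two examples above.
DIGRAPH_INITIALS = {"ṭl": "L̥", "dl": "L"}

def parse_cluster(spelling: str) -> str:
    """Split a spelled cluster into preinitial(s) + digraph initial."""
    s = unicodedata.normalize("NFC", spelling.rstrip("-"))
    for digraph, phoneme in DIGRAPH_INITIALS.items():
        d = unicodedata.normalize("NFC", digraph)
        if s.endswith(d):
            preinitial = s[: -len(d)]
            return f"/{preinitial}.{phoneme}/" if preinitial else f"/{phoneme}/"
    return f"/{s}/"  # no known digraph: leave the spelling unanalyzed

for written in ["kṭl-", "tdl-"]:
    print(written, parse_cluster(written))
```

Normalizing to NFC keeps the lookup stable whether the underdotted ṭ arrives precomposed or as t plus a combining dot.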

2.16.20:11: Whether these mysterious laterals have anything to do with the laterals sometimes reconstructed for Tangut (e.g., Sofronov 1968 and Tai 2008's ld-) remains to be seen. I have not yet been able to identify any cognates of Pyu words with /L̥ L/ (or the similarly enigmatic /R̥ R/ written as ṭr and dr).

4c. Haiman (2011: 19):

Smith (2007: ii) declares the native orthography to be "the best [transcription of Khmer phonetics] on the planet" and heroically dispenses with any romanizations in even the initial chapters of his introductory textbook. No other scholar has followed him in either this bold assessment or in practice

I haven't seen Smith (2007), but it does seem "bold" to do so, given that I had to work through 148 pages of Huffman's Cambodian System of Writing (1970) to learn the script.

4d. Haiman (2011: 22):

Final <s> may be pronounced [s], in a hypercorrect reading style: thus nah, written as <nas> can be pronounced [nas] or [nah]. Otherwise, it is pronounced as [h]

This makes the Khmer borrowing of juif 'Jew' as ជ្វីស <jvīs> [cʋih] (hypercorrect [cʋis]) with <s> instead of <ḥ> even stranger; a nonsibilant [h] seems more like [f] to me than a sibilant [s].

5. Looking at Ronald Emmerick's 2009 sketch of Khotanese, I wondered where balysa- /balza-/? 'Buddha' came from. (ys in Khotanese Brahmi stands for non-Indic /z/, a common sound in Iranian languages.)

6. Today's color is black, and yesterday I proposed that the Jurchen phonogram <he> was from a Parhae script counterpart of Chinese 黑 'black'. In Middle Chinese, 黑 was pronounced *xək (probably more like *xʌk), yet its Sino-Korean reading is hŭk [hɯk] with a high vowel. That oddity is not isolated; it is true of Sino-Korean readings corresponding to Middle Chinese *-ək/*-əŋ in general. What's going on? The borrowing of Middle Chinese *-ək/*-əŋ (*-ʌk/-ʌŋ?) as Sino-Korean [ɯk]/[ɯŋ] is even more puzzling considering that Korean once had [ʌk]/[ʌŋ]. The early ('Go-on') layer of Sino-Japanese presumably borrowed via a Koreanic language (Paekche) has -oku/-ou < -ək/-əũ for those Middle Chinese rhymes. (That tells us a bit about how Sino-Paekche differed from Sino-Shilla which became Sino-Korean.)

7. I was reluctant to propose that Ming Jurchen gulmahun 'hare' and Manchu gūlmahūn 'id.' had acquired their final syllables by analogy with Ming Jurchen indahun and Manchu indahūn 'dog', but now here I am mentioning it after seeing Shimunek (2007: 353)'s similar proposal for Middle Mongol 'snake':

The ai /Ay/ element in the Middle Mongol form [moqai ~ moqoi] is probably the result of analogical change: cf. MMgl noqai 'dog', qaqai 'pig', taulai 'hare', etc. (Emphasis mine.)

Note that all four of those animals are part of the twelve-animal cycle.

8. Shimunek's 2018 article on Jurchen numerals is a good companion to Andrew West's article on the same topic.

9. I agree with Juha Janhunen (2012: 13) about

the assimilation model of linguistic expansion. According to this model, it is not populations that migrate but languages. When a speech community expands its territory to comprise areas where other languages are originally spoken, the principal process is that of linguistic replacement, or language shift, due to which the new language is, in most cases voluntarily, adopted by speakers of the former local languages. Empirical experience from different parts of the world tells us that language shift is by far the most important mechanism of linguistic expansion. This conclusion has only been confirmed by recent progress in human genetics.

That is why I like to speak of the coming of Burmese speakers into the Pyu lands rather than just 'the Burmese'; the latter could imply that the Pyu were completely replaced by 'the Burmese', whereas it is more likely that Pyu speakers switched to Burmese. The descendants of the Pyu are still here, though they don't speak Pyu or identify as Pyu anymore.

10. I disagree with Pevnov (2012: 17) about the term 'Tungusic':

which in my opinion is incorrect for the following reasons: first, it would at the very least be strange to consider Jurchen or Manchu to be Tungusic, and second, following such a logic of terminological simplification, it would analogically be possible to replace the term "Indo-European" with "European," "Finno-Ugric" with "Finnic" or "Ugric" and so forth, although it is unlikely that anyone would agree with such innovations.

The term Manchu-Tungusic could imply there are only two branches, Jurchen/Manchu and an 'everything else' branch (which is in fact Pevnov's view, one he shares with Sunik and Vasilevich). But that may not be the case: e.g., on the previous page, Janhunen (2012: 16) posits a different model in which Jurchenic (Jurchen and Manchu) is a subbranch of Southern Tungusic:


See Wikipedia for a model with the same basic structure (but different details below the second-level branches: e.g., Janhunen regards Kili as Nanaic, whereas Wikipedia lists Kili as Ewenic).

The term 'Sino-Tibetan' has similar problems - it could imply there are only two branches, Sinitic and Tibeto-Burman, which I do not think is the case. But at least Chinese and Tibetan are both well-known languages that could serve as representatives of the family. The layman has heard of Manchu but not of 'Tungusic'. Moreover, there is no language called 'Tungusic'.

Shimunek's term 'Serbi-Mongolic' also implies there are two (known) branches, Serbi and Mongolic, and that does seem to be the case. Serbi is not a well-known language, but at least it was a language (see Shimunek 2017: 121-168 for details on Middle Serbi).

2.16.21:30: For further reading on naming language families, I recommend Ostapirat (2000: 18):

We propose to call the whole language stock, to which Kra and other sister languages belong, Kra-Dai. The term follows the popular tradition of juxtaposing two big language members of the family, which sometimes are also linguistically distant enough from each other to give the feel of the whole family (cf. Sino-Tibetan, Tibeto-Burman, Mon-Khmer, etc). Such "dual" names appear to have proved practical; the longer names have seemed to be less successful in competition. For instance the term "Kam-Tai" which represents the Tai and Kam-Sui branches have quickly taken over the older names such as "Tai-Kam-Sui-Mak" (the last three members belong to the Kam-Sui branch).

Rereading that, I see the first line might give the impression that Kra is a language, though it is actually a group of languages.

Dai in Kra-Dai also refers to a group of languages; it "is the reconstructed form of autonyms of various Tai groups" such as the Thai. I like Dai as it avoids the homophonous confusion of Tai and Thai in English. Dai does have homophony problems of its own, but as a proto-word it is the shared heritage of all Tai peoples.

'Tungusic', on the other hand, is not based on a proto-autonym shared by most Tungusic languages (or even most non-Jurchenic Tungusic languages); it is a Turkic word for 'pig' that was an exonym of the Evenks. It has stuck in English, and I doubt it has any potential serious competitor other than Manchu-Tungusic: e.g., Eweno-Jurchenic.

THE DAY OF THE WHITE SNAKE

Or, in Jurchen,

<šang.giyan SNAKE.he DAY> šanggiyan meihe inenggi

1. Related or abbreviated? For years I thought of <SNAKE> as resembling 厄 'adversity', but today I finally realized it's related to the right side of Chinese 蛇 'snake'. The left side 虫 'bug' is a later addition; the right side 它 was originally a standalone drawing of a snake. <SNAKE> may be from a northeastern version of 它 that became part of the Parhae script. I then saw that Jin Qizong (1984: 35) thought <SNAKE> is an abbreviation of 蛇.

2. Today I also wondered if <he> could somehow be related to the graph for Jin Chinese 黑 *xə 'black'. If <he> goes back to the Parhae script or even earlier, then its original phonetic value may have been *xək like the Middle Chinese reading of 黑. Native Jurchen words can only end in -n, so it would be understandable if the Jurchen took a Parhae graph for *xək and used it to write their he [xə].

3. Two days ago I was reading Jonathan Evans' Introduction to Qiang Phonology and Lexicon (2001: 182) on the "weak role of tone in [Qiang] tonal dialects". He got different tones for the morpheme 'finger' in the names of the five fingers in two different recording sessions:

session 1: low (4×), high (1×)

session 2: high (5×)

Was Tangut like its modern living Qiang relatives? Were its tones as unstable? Or as unstable at some earlier point in its history before they 'settled down' to the point where a rhyme dictionary organized by tone (the Tangraphic Sea) made sense? Is my assumption that the 'rising tone' originated from a final glottal *-H misguided? I fear the history of Tangut tones is complex.

I should have written all that in "The Day of the Yellow Hare", but I forgot until I stumbled across that page again today.

4. Sergey Dmitriev's 2018 article on Tangut tree names shows how much can be extracted from just a few entries of the Sino-Tangut glossary Pearl in the Palm. I hope other semantic categories in that booklet are subjected to similarly intense analyses.

The dedication is to Elena N. Nevskaja, the late daughter of NA Nevsky, the greatest Tangutologist of all. I am saddened to learn she is gone.

5. Going back to Evans, I was looking at his reconstructions of Proto-Southern Qiang (PSQ) initial clusters (2001: 165-166). Looking at *KC-clusters, it is tempting to phonologize them all with a preinitial /k/ whose aspiration and voicing are conditioned by the following initial:

PSQ (phonological)
tsh- before e, tɕh- elsewhere
ɕ- s-
*khɕ- */kɕ-/ ɕ-, tsh-
s-, ɕ- khɕ-
dʑ- ʑ- gʑ-
gɹ-, dz-
g-, dʐ-
gʑ- before y, gʐ- elsewhere

Such assimilation has a modern parallel in Taoping in which preinitial /χ/ is [ʁ] before voiced initials.
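The proposed conditioning for *KC-clusters can be sketched as a function from the initial to the surface shape of the preinitial. The membership of the two sets below is my assumption for illustration, not Evans' inventory, and the Taoping function encodes only the single statement made above.

```python
# Assumed classes of initials (illustrative, not an exhaustive PSQ inventory).
VOICELESS_CONTINUANTS = {"ɕ", "s", "ʂ", "r̥"}
VOICED_INITIALS = {"ʑ", "z", "ʐ", "ɹ"}

def realize_preinitial_k(initial: str) -> str:
    """Surface form of phonological /k/ + initial under the proposed rule."""
    if initial in VOICED_INITIALS:
        return "g" + initial   # voicing assimilation
    if initial in VOICELESS_CONTINUANTS:
        return "kh" + initial  # aspiration before a voiceless continuant
    return "k" + initial

def realize_taoping_preinitial(initial_is_voiced: bool) -> str:
    """Taoping preinitial /χ/: [ʁ] before voiced initials, [χ] otherwise."""
    return "ʁ" if initial_is_voiced else "χ"

print(realize_preinitial_k("ɕ"), realize_preinitial_k("ʑ"), realize_preinitial_k("ɹ"))
```

So */kɕ-/ surfaces as khɕ- and */kʑ-/ as gʑ-, with a single phonological preinitial behind both.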

But that analysis requires a voiceless /r̥/, a consonant not reconstructed elsewhere in PSQ. Moreover, it doesn't work for *PC-clusters:

PSQ (phonological)

Or does it? What if Evans' *pz-, *pr-, and *phr- are */ps- pr̥- pʂ-/?

2.14.17:14: I could also reinterpret *khr- as /kʂ-/ to parallel *phr- /pʂ-/. All voiceless sibilants would then condition aspiration of the preinitial: */CS̥/ = *ChS-. Nonsibilant */r̥/ would not: */pr̥/ = *pr- (not *phr-). No, not all - */ps/ isn't *phs-, it's ... *pz-! /voiceless/ + /voiceless/ = /voiced/? I think not, though maybe I could just rewrite *pz- and *bz- as *ps- and *pz- (i.e., regard the phonological and phonetic forms as identical) and have Taoping undergo a chain shift:

*/ps-/ > */pz-/ > bz-.

Still, there seems to be strong if not perfect complementary distribution - there is a tendency against voicing mismatches: e.g., no *kz- or *bs-. Perhaps a neater earlier system was complicated by

- borrowings from languages with different phonotactics

- and/or by new preinitials from earlier syllables that lost their vowels after the voicing assimilation rule ceased to operate: e.g.,

*pz- > Taoping bz-

*pVz- > Taoping pz-
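The second scenario depends on rule ordering: voicing assimilation has to apply before vowel loss, or the two sources merge. A minimal sketch, using the literal placeholder V for the lost vowel and covering only the p + z case:

```python
import re

def voicing_assimilation(form: str) -> str:
    """p > b immediately before z."""
    return re.sub(r"p(?=z)", "b", form)

def vowel_loss(form: str) -> str:
    """Drop the placeholder vowel V between p and z."""
    return re.sub(r"^pVz", "pz", form)

def to_taoping(form: str) -> str:
    """Assimilation first, then vowel loss (the order proposed above)."""
    return vowel_loss(voicing_assimilation(form))

for source in ["pz", "pVz"]:
    print("*" + source + "-", ">", to_taoping(source) + "-")

# With the opposite order, both sources would end up as bz-.
print(voicing_assimilation(vowel_loss("pVz")))
```

Because assimilation has stopped operating by the time the vowel is lost, the new pz- is not voiced, which is how a voicing mismatch could enter an otherwise regular system.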

The reanalysis above is motivated by a hypothesis that Proto-Sino-Tibetan had fewer preinitials than initials: e.g., one preinitial velar stop *k- but three initial velar stops *k- *kʰ- *g-. But in theory Qiang could have preserved preinitials lost in Old Chinese, Old Tibetan, pre-Tangut, Pyu, etc.

6. Today I learned that 'Jewish' in Khmer is ជ្វីស <jvīs> [cʋih], a borrowing from French juif [ʒɥif]. Why is it spelled with s and not <ḥ>?

7. Looking at Vovin (2017) again while writing footnote 2 of "The Day of the White Dragon", I noticed he reconstructed Old Korean 日尸 <SUN.l> 'sun' (普皆廻向歌 Pogaehoehyangga, line 5, mid-960s) as *nal. That would seem to rule out a connection with Serbi-Mongolic forms like Khitan ñayr 'day' (as reconstructed by Shimunek 2017: 358) and Middle Mongolian naran 'sun'.

2.14.15:51: The only way around this would be to reconstruct a third liquid or a liquid cluster in the source language of 'day/sun' that became *r in Serbi-Mongolic but *l in Koreanic.

8. 2.14.17:57: I forgot to mention this passage I saw yesterday:

The Shakya clan of India, to which Gautama Buddha, called Śākyamuni "Sage of the Shakyas", belonged, were also likely Sakas as Michael Witzel and Christopher I. Beckwith have demonstrated.

I hope there is more to the argument than the similarity between Śākya and Saka. As Attwood (2012: 58) wrote,

The similarity in names is not enough to identify the Śākyas with the Iranian Sakas.

Attwood evaluates and expands upon Witzel's 2010 proposal. I am unable to evaluate it or Beckwith's 2015 book.

THE DAY OF THE WHITE DRAGON

Or, in Jurchen,

<šang.giyan DRAGON.r DAY> šanggiyan mudur inenggi

I thought I had lost my list of topics for yesterday's entry, but I found the former as I was about to post the latter. The list was in index.htm before I was about to paste "The Day of the Yellow Hare" onto the top.

1. In my discussion of the Jurchen word for 'red', I forgot to mention modern Sanjiazi Manchu fulxajn 'red' (Kim 2008: 144) corresponding to standard written Manchu fulgiyan (which I presume to have been [fʊlɢʲaʜ]). x seems to be from an earlier fricative *[ʁ] rather than a stop *[ɢ]. But why is it devoiced between voiced segments [l] and [a]? Is it from an earlier unaspirated stop *[q]? (Voiced stop symbols in my Jurchen/Manchu notation may have been either voiceless unaspirated or voiced in medial position.)

I'd like to find more instances of the g : x correspondence.

I'd also like to find more examples of palatality moving to the end: C₁iyVC₂ > C₁VyC₂. Having just mentioned Jurchen šanggiyan 'white' (the standard written Manchu word is the same), I would expect a Sanjiazi form ending in -ajn, but the actual form is ɕaŋŋən without -j- (Kim 2008: 94). Could -ən be a reduction of *-ajn?

2. Middle Korean 븕- pɯrk- 'red' is somehow related to Sanjiazi fulxajn. When looking for what Alexander Vovin (2009: 73) had to say about 븕- pɯrk-, I found his proposal that the attributive suffix of Old Korean

明期 <BRIGHT.kɯj> *pʌlk-kɯj 'bright¹-ATTR²' (處容歌 Chŏyong-ga, line 1, mid-700s)

is the source of the Proto-Japanese³ attributive suffix *-ke. I suspect the Koreanic source of loans in Proto-Japanese was not a direct ancestor of Old Korean. So maybe the source language had an attributive suffix *-ke, possibly from a Proto-Koreanic *-kɯj. Otherwise I would expect Old Korean *-kɯj to be borrowed into Proto-Japanese as *-kəj or even *-kɨj if Frellesvig and Whitman's proposal of a seventh Proto-Japonic (and by extension, Proto-Japanese) vowel is correct.

An apparent paradox just occurred to me: Old Korean has -Vj where the Koreanic source of Proto-Japanese loanwords has *-e and vice versa:

'ATTR': OK -kɯj : PJN *-ke

'Buddha': OK *putke (cf. pre-Jurchen *putiki; pre-Jurchen had no front vowel *[e]⁴) : PJN *pətəkaj

Adding yet another layer of complication:

'temple': OK *tjara (cf. Jurchen taira(n); ty- [tj] is not possible in Jurchen⁵) : PJN *tera (< *tjara?)

Or was Jurchen ai an attempt to approximate a Koreanic *[e]? a is nonhigh like *[e] and i is palatal like *[e].

Maybe this can be resolved at the Proto-Koreanic level. And/or maybe there was more than one Koreanic source of Proto-Japanese loanwords: e.g., one language at two different periods or two languages/dialects at once.

3. A sequel to my proposal of *rjaC > rar4 in Tangut: I looked up all three rar4 words with etymologies in Jacques (2014), and none have cognates with *-j-:

𘗶 0803 2rar4 'horse' < *-k-H?, suffixed stop-final variant of 𘆝 0764 1rer4 < *-ŋ 'id.' : Japhug mbro < *-ŋ 'id.', Written Burmese mraṅḥ 'id.'

𘅤 1715 1rar4 'to write' : Japhug rɤt 'id.'

𘃜 5523 1rar4 'must' : Japhug ra 'id.', Written Burmese 'id.'

If Gong Xun is right, all three had a simple initial *r- in pre-Tangut like the cognates for the last two words:

2rar4 < *rak-H? 'horse'

1rar4 < *rat 'write'

1rar4 < *raC 'must' (but why does Tangut have a final consonant corresponding to zero in Japhug and Written Burmese?)

That's simpler than my scenario in which Grade IV lower vowels are 'bent up' by preceding high vowels in presyllables that were lost:

rar4 < *.raC

On the basis of Japhug and Written Burmese, I could propose *mɯ.rak-H as the source of 2rar4. But there is no external evidence for presyllables for 'to write' and 'must'; at this point they are merely constructs necessitated by my theory.

The relative simplicity of Gong's theory and mine is reversed with Grade I (there are no Grade II or III syllables with r-):

rar1 < *raʶ (Gong) but *(Cʌ.)ra (this site)

The advantage of my theory is that it requires no exotic segments like uvularized *aʶ. (But nonexotic segments not supported by external evidence are not to be embraced.)

The ratio of rar1 to rar4 in my database of Tangut character readings (≠ morphemes or words!)  At a glance that may suggest Gong's *aʶ (> a1) was almost as common as his *a (> a4), which seems implausible. However, a count of types is not a count of tokens. Phonemic frequency analysis of Tangut texts remains to be done. 
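The type/token distinction can be illustrated with a small Python sketch. Everything in it is hypothetical: the character IDs, the readings assigned to them, and the running text are invented toy data, not drawn from any Tangut database or text. It only shows how a rhyme that looks well represented in a list of character readings can still be rare in actual usage.

```python
from collections import Counter

# Hypothetical mini-lexicon: character ID -> reading. Not real Tangut data.
readings = {
    "char_a": "rar1",
    "char_b": "rar1",
    "char_c": "rar4",
    "char_d": "rar4",
    "char_e": "rar4",
}

# Hypothetical running text, as a sequence of character IDs.
text = ["char_c", "char_c", "char_a", "char_c",
        "char_d", "char_c", "char_e", "char_c"]

# Types: how many distinct characters carry each reading.
type_counts = Counter(readings.values())

# Tokens: how often each reading occurs in the running text.
token_counts = Counter(readings[ch] for ch in text)

print("types: ", dict(type_counts))
print("tokens:", dict(token_counts))
```

In this toy data rar1 accounts for two of five types but only one of eight tokens, which is the kind of gap a frequency analysis of real texts would have to measure.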

4. Yesterday I learned who founded Taitō and why it has its name - the kanji are 太東, and 太 is short for 猶太 'Jewish', a reference to Michael Kogan's background. 猶太, pronounced [jowtʰaj] in standard Mandarin, is a Chinese phonetic transcription of a form like Judaea.

After so many years, I finally wondered - why does the d of Judaea correspond to an aspirated [tʰ] in most Chinese readings of 太⁶: e.g., standard Mandarin [tʰaj]? Was 太 'great' chosen more for its meaning than its phonetic value? But then why not transcribe dae as 大 'great' without a dot and with either an unaspirated [t] or a voiced [d] depending on Chinese variety?

(2.13.22:48: Answering my own question, I learned that 猶大 without a dot already exists as the Chinese borrowing of Judah [son of Jacob] and, in Protestantism, Judas and Jude. But I would imagine 猶太 predates 猶大, so it wouldn't be as if 猶大 were already taken. I could be wrong, though. I don't have time to track down those words.)

I don't know what the earliest Chinese term for 'Jewish' was. The English and Chinese Wikipedia mention Yuan dynasty terms 竹忽 *tʂu xu and 朱乎得 *tʂu xu tə as terms for Jews, but I can't find any attestations at Scripta Sinica. The initial *tʂ- is odd since I would expect a glide *j-. What language with an affricate-initial word for 'Jew' would be a plausible source for those borrowings?

5. Last month I proposed that the Jurchen word for 'sword'


might be halmar corresponding to Manchu halmari 'a sword used by shamans'. I then realized that Jurchen


mudur 'dragon' : Manchu muduri 'id.'

is another example of that correspondence, but forgot to blog about it until today, a dragon day. Is Manchu -ri in part from earlier -r, or is this another case where Manchu is more conservative?

6. Back in 2011 I proposed that the Jurchen phonogram


as in

<OX.an> wihan 'ox'

had what looked like Jin Chinese 不 *pu 'not' on the bottom because it was originally intended to write a Koreanic word *an 'not'. That was an extremely stupid idea, even 'wronger' than usual for this site, because an isn't attested until the late 1800s (Martin 1992: 419); the earlier Korean form was disyllabic ani.

But maybe that idea can be salvaged minus the anachronistic reference to an. Today I saw Alexander Vovin's "Two Tungusic Etymologies" (2018) in which he reads Late Old Korean 不知 <NOT.ti> as anti 'not'. So 不 was read an-, though that was still centuries before there was a standalone word an 'not'. He then proposes that Proto-Korean *an-negatives are the sources of Tungusic negatives. That borrowing must have occurred very long ago - long before the rise (and fall) of Parhae in the second half of the first millennium CE. Still, the idea of Jurchen speakers knowing of Koreanic an-negatives now seems a bit more plausible.

7. Some new terms for convenience:

North Koreanic: hypothetical prestige language of Parhae underlying the Parhae script. Inexplicable sound-symbol matches in the Khitan and Jurchen large scripts (e.g., why write Jurchen an with a character 不 read *pu in Jin Chinese?) might involve North Koreanic readings.

Late Koreanic loans in Jurchen (e.g., taira(n) 'temple') are from North Koreanic.

Earlier Koreanic loans in Tungusic (or vice versa - e.g., 'red'?) could predate the North/South Koreanic split.

South Koreanic: source language(s) of Koreanic loans in Proto-Japanese and Old Japanese. The inconsistent correspondences between Old Japanese and Old Korean *e may reflect borrowings from two varieties of South Koreanic, 'A' (from Paekche?) and 'B' (from Kaya?). The Old Korean of Shilla may be a third variety, 'C'.

Japanese tera 'temple' is a loan from South Koreanic, so there is no need to come up with a single early Koreanic form underlying both Jurchen taira(n) and Japanese tera; the vowel of the first syllable could have developed differently in North and South Koreanic.

I could just use terms like 'Koguryo' for North Koreanic and 'Paekche' for South Koreanic, but I want to avoid conflating languages with states, particularly given the presence of a Japonic and perhaps even Tungusic substratum on the peninsula.

¹2.13.23:02: Korean 'red' and 'bright' are thought to be related via ablaut (Vovin 2009: 7). In Middle Korean, they both have -r-, but I reconstruct -l- for Old Korean for both words, given that

1. Old Korean had an r/l contrast lost in Middle Korean (Vovin 2017)

2. Tungusic and Mongolic have an r/l contrast

3. Tungusic and Mongolic have l in 'red'

4. So Old Korean had *l in 'red'

5. And if 'red' and 'bright' have the same root

6. Then 'bright' had *l too

Old Korean | Middle Korean | Modern Korean
*pɯlk- | 븕- pŭrk- | 붉- pulk-
*pʌlk- | ᄇᆞᆰ părk- | 밝- palk-

²2.13.23:24: The sequence *ʌ ... ɯ looks bizarre from a Middle Korean perspective because it combines a lower vowel stem with a higher vowel suffix, but Old Korean did not have vowel harmony (Vovin 2009: 11). I suspect that vowel harmony was introduced into Korean by Tungusic speakers in the northern half of the peninsula. But is there any evidence for more vowel harmony in northern Korean than in southern Korean?

The sequence *pʌlk-kɯj is bizarre in another way: the normal Korean attributive suffix is -ɯn: cf. 去隱 <LEAVE.ɯn> for *?-ɯn 'left' in 慕竹旨郞歌 Mojukchirangga (c. 700).

Stranger still, it is also possible to interpret 明期 <BRIGHT.kɯj> as *pʌlk-ɯj 'bright-GEN'. Strange because pʌlk- is a verbal root that should not be followed by a genitive suffix. Could *pʌlk-ɯj be a remnant of a time when the verb/noun distinction was not as strict?

³2.13.13:33: Proto-Japanese is distinct from Proto-Japonic:

Proto-Japonic > Proto-Japanese > Japanese dialects
Proto-Japonic > Ryukyuan languages

Proto-Japonic is the ancestor of the entire family. Proto-Japanese is the ancestor of the dialects of mainland Japan.

⁴2.13.13:01: E in my Möllendorff-style notation for (pre-)Jurchen represents [ə], not [e]. [i] was the only front vowel in (pre-)Jurchen.

⁵2.13.13:43: Alexander Vovin (2007: 77) proposed Old Korean *tiara 'temple' and metathesis in Jurchen (*ia > ai) to work around the impossibility of the initial cluster tj- in Jurchen.

⁶2.14.0:31: The major exception is Toisanese in which [tʰ] became [h], so 猶太 theoretically would be read [ziwhaj]. (*j- became [z] in Toisanese - a sound change shared by Vietnamese.) But I have no idea if [ziwhaj] is the actual Toisanese word for 'Jewish'. I don't know how far 'syllabic conversion' goes in nonstandard Chinese varieties. Have Toisanese speakers simply borrowed Cantonese 猶太 [jɐwtʰaːj]?

I suspect nonstandard Chinese varieties have a lot of borrowings from prestige languages: e.g., why read 妮妲莉寶雯 'Natalie Portman' in Toisanese instead of borrowing Cantonese [nejtaːtlej powmɐn]?

I hope I read that correctly. 妲 can also be read [tʰaːn]. That might be a recent reading by analogy with 袒 and 坦, both [tʰaːn]. Jiyun (1037) lists two fanqie for 妲:

- 當割切 for *tat, corresponding to Cantonese [taːt]

- 得案切 for *tan which should correspond to Cantonese †[taːn] with unaspirated [t]

The only Mandarin reading I know of is da [ta] from *tat, so I guessed that 妲 was [taːt] in 妮妲莉 'Natalie' (even though I'd expect an aspirated [tʰ] corresponding to written -t-). Is the Cantonese name based on an American pronunciation [næɾəli] with a voiced alveolar flap [ɾ]? If so, then unaspirated [t] would be a better match for [ɾ] than aspirated [tʰ].

THE DAY OF THE YELLOW HARE

Or, in Jurchen,

<YELLOW.giyan HARE DAY> sogiyan gulma? inenggi

I don't have time to even make a list like last night¹. And I don't want to wait another twelve days to say this, so ...

It's not clear how the Ming Jurchen would have written 'hare' in their script. The Bureau of Translators vocabulary (early 1400s?) has the Ming Mandarin transcription

古魯麻孩 *ku lu ma xaj (#150)

for a two-character spelling ending in a phonogram

<HARE.hai> gulmahai

whereas the Bureau of Interpreters vocabulary (c. 1500?) without Jurchen script has the Ming Mandarin transcription

姑麻洪 *ku ma xuŋ (#1100)

for gu[l]mahun (Kane 1989: 218) which reflects a different suffix found in Manchu gūlmahūn².

Kiyose (1977: 105) suggested that the Bureau of Translators form gulmahai is genitive, implying that the word for 'hare' without the genitive case marker -i was *gulmaha³. But if that was the case, how would -ha have been written? N3696 lists eight different Jurchen characters read xa (= my ha [χa]).

Here I've followed Andrew West who regards 'hare' as simply gulma sans suffixes, but at present I cannot confirm that shorter reading because the only phonetic evidence for the word I have on hand are the two transcriptions above. I do not know of any Jin dynasty attestations of the word. I suspect that the original spelling was a single logogram *<HARE>. However, I cannot say whether *<HARE> would have been read as *gūlma, *gūlmaha, *gūlmahūn, or something else.

¹2.12.0:49: I did make notes for a list to appear in this entry, but I lost it due to computer problems. I should reconstruct it later today before I forget.

²2.12.10:39: -hūn is probably the same suffix found in Manchu indahūn 'dog' and Ming Jurchen

<DOG.hun> indahun 'dog'

(from the Bureau of Translators vocabulary, transcribed 引荅洪 *in ta xuŋ [#147]; the Bureau of Interpreters vocabulary has indahu, transcribed 因荅忽 *in ta xu [#413]).

Other Tungusic languages have a bare stem (e.g., Orok ŋinda) or a different suffix (e.g., Oroch inaki).

It's not possible to tell whether the one-character spelling


from the Jurchen Character Book manuscript thought to be an early catalog of characters represented indahūn⁴, the bare root inda, or even inda with a different suffix. It's even possible that Proto-Tungusic *ŋ- (cf. the Orok form above) was still present in the Jin Jurchen word for 'dog'.

The function of -hūn is unclear to me. It does not seem to be the -hūn that Gorelova (2002: 148-150) regards as a suffix for Manchu quality nouns: e.g., aibishūn 'swollen, swelling (n.)' (cf. aibimbi 'to swell').

³2.12.9:39: See Gorelova (2002: 114) for examples of the Manchu noun suffix -ha. It is unclear to me how she distinguishes between nouns with -ha suffixes and nouns with unsuffixed roots ending in -ha (assuming the latter type of noun exists in her view).

⁴2.12.10:28: Jin Jurchen probably had a Manchu-like u/ū [u/ʊ] distinction lost in the dialect recorded by the Bureau of Translators. See Kiyose (1977: 45-46) on how Ming Jurchen spellings indicate the loss of that distinction.

It is unclear whether the Bureau of Interpreters dialect retained the distinction because there would be no clear way to indicate it in Ming Mandarin transcriptions: e.g., *ku ma xuŋ could represent either gu[l]mahun as Kane thought or gulmahūn.

THE DAY OF THE YELLOW TIGER

Or, in Jurchen,

<YELLOW.giyan TIGER DAY> sogiyan tasha inenggi

I'm going to try something new. I have too many topics on my mind and not enough time to cover any of them properly. Yet I don't want them to slip away forgotten or remain as unfinished stub entries, never to be completed. So I'll just make a quick list of topics I might return to later. Might.

1. In "The Day of the Red Ox", I didn't mention Middle Korean 븕 pŭrk- 'red' which is somehow related to the Mongolic/Tungusic word for 'red'. Was -ŭ- [ɯ] an attempt to imitate a foreign [ʊ]? Here's a modern Korean book in which English took [tʰʊk] is phonetically rendered in hangul as 특 thŭk [tʰɯk].

2. Looking at the cover of Jacques (2014) with examples of Tangut ar4 words, I realized that maybe I was wrong about pre-Tangut *rjaC becoming Tangut ar4. Maybe *rjaC became rar4, whereas *CV-rjaC became ar4: i.e., *-rj- lenited to zero between a presyllable and the main vowel. (Actually, I think ar4 was phonetically something like [jæʳ], so maybe *-rj- was reduced to *-j-.)

3. I wish this page on Tungusic from 1998 were rewritten in Unicode. Maybe it'd be legible if I dug up an old pre-Unicode SIL phonetic font.

4. I was looking at Nedjalkov's (1997: 311, 314-315) description of Evenki vowels and vowel harmony. Two eye-catching things off the top of my head:

1. no true high vowels [i u]

2. long [ɛː] patterning with [a] rather than [ɛ] in vowel harmony.

Could [ɛː] be from *aj (cf. Korean 애 [ɛ] < Middle Korean [aj], [ʌj])?

2.11.1:11: Wikipedia doesn't even try to describe Evenki vowel harmony rules:

Knowledge of the rules of vowel harmony is fading, as vowel harmony is a complex topic for elementary speakers to grasp, the language is severely endangered (Janhunen), and many speakers are multilingual.

5. For three years I've agreed with Beckwith (2002), who was the first to propose that Pyu aṁ was [ɛ]. I've been assuming that aṁ [ɛ] < *e. Today I realized that it could in part come directly from *ja: e.g.,

*ja > *jæ > *jɛ > [ɛ]

in hrat·ṁ [r̥ɛt] 'eight' (cf. Old Tibetan brgyad 'eight').
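The chain above is a sequence of ordered changes, each feeding the next. A minimal Python sketch, assuming plain string replacement is an adequate stand-in for each stage, makes the ordering explicit. The names RULES and derive are mine, and the input "jat" in the usage note is a made-up rhyme for illustration, not a reconstruction of the pre-Pyu word for 'eight'.

```python
# Ordered rewrite rules, earliest first; each pair is (before, after).
RULES = [
    ("ja", "jæ"),  # fronting of *a after *j
    ("jæ", "jɛ"),  # raising of the fronted vowel
    ("jɛ", "ɛ"),   # loss of the glide
]

def derive(form, rules=RULES):
    """Return every stage of the derivation, starting with the input form."""
    stages = [form]
    for before, after in rules:
        form = form.replace(before, after)
        stages.append(form)
    return stages

print(" > ".join(derive("ja")))
```

Calling derive("jat") gives the same stages with a final -t carried along unchanged. Reordering the rules breaks the chain, which is the point of writing them as an ordered list.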

6. For years I've wanted to convert transcriptions of Rouran names into Middle Chinese and see if anything interesting emerges. Here's an example: 郁久閭社崙, read 'Yujiulü Shelun' in modern standard Mandarin, was ʔuk kuʔ lɨə dʑiæʔ lon in Middle Chinese.

