Amaravati: Abode of Amritas

09.2.28.4:27: FLY LIKE AN EGO (why?*)

The word

bã ŋææ

in my Tangut puzzle was glossed as Tangut period NW Chinese 鵝 ?*ŋgɔ 'domestic goose'. Although the etymology of the first syllable of bã ŋææ is unknown, the second syllable may be cognate to its Chinese translation which in turn may be related to 'wild goose':

Tangut ŋææ R23 2.20 < ?*ŋraC < ??*r-ŋaC
TPNWC ?*ŋgɔ < Old Chinese *ŋaj < ?*ŋal

also cf. Old Chinese 雁 *ŋrans < ?*r-ŋan-s < ?*r-ŋar-s 'wild goose'

Neither OC word can be reconstructed with a presyllable resembling bã, so one cannot reconstruct a Proto-Sino-Tibetan *ba-ŋal, etc.

ŋææ is a Grade II syllable that may have once had medial *-r- (Gong 1993, cited in Guillaume Jacques' 2008 paper "The origin of vowel alternations in the Tangut verb"). This medial may be a metathesized prefix (note how OC *ŋaj has no *r at all). Tangut and Chinese may have undergone the same changes: *rCa > *Cra > æ.

(I have long been bothered by the fronting of *a after *r because I have never seen anything like it elsewhere in the world. In Arakawa's reconstruction, the rhyme would be -ja', and the shift of *-r- > -j- has a direct parallel in Burmese. Perhaps there was a Burmese-like stage in OC and pre-Tangut: *ra > *ja > *jæ > *æ?)

The vowel of ŋææ may have been lengthened to compensate for the loss of a final consonant. Arakawa's reconstruction has what appears to be a final glottal stop (which he writes as -') and Sofronov reconstructed an early Tangut *-C that later weakened to -ɯ (implying that *-C was grave - i.e., either velar or labial). I wonder if the lost consonant was *-l, which would not be compatible with Arakawa's reconstruction since *-l > *-ʔ is unusual (though *-l > Sofronov's -ɯ is reminiscent of my English which has final [ɰ] for /l/). There are no -ææ R23 words with known cognates or Chinese sources which could verify the presence or absence of an old coda (other than the *-H which I posit as the source of the rising tone that invariably accompanies R23 - is this just chance or meaningful?).

OC 雁 *ŋrans 'wild goose' has been compared to Proto-Indo-European *ghans- < ?*gH₂ens- 'goose' (> Sanskrit haṃsa 'goose, swan', German Gans, Eng goose, etc.), but the initials are difficult to reconcile, even if OC *-r- is an affix. If Chinese or Proto-Sino-Tibetan borrowed the word from PIE, I would expect OC *g-, not *ŋ-, and I would have to reject a link with OC 鵝 *ŋaj < ?*ŋal 'domestic goose'. Conversely, if PIE borrowed the word from Chinese or Proto-Sino-Tibetan, I would expect *g-, not *gh- (unless *g- was glottalized, and nonglottalized *gh- was perceived as a better match for foreign *ŋ- [the 'emphasis' of OC *ŋ- is probably secondary]). Hence I regard the OC and PIE words for 'goose' as lookalikes.

Written Tibetan ŋaŋ 'goose, swan' has a -ŋ which does not match the *-j < *-l or *-n < ?*-r of the OC words or the -n of Written Burmese ŋanh < *-ns 'goose'. Should the WT form also be regarded as a lookalike (albeit within the same language family?), or should its coda be regarded as a simplification of root-final *-l/r plus a suffix *-ŋ? I reject the latter as an ad hoc solution. In any case, the pre-Tangut form was probably not *ŋaŋ since pT *-aŋ became Tangut -o (Gong 1995: 56, 73).

*The graph 鵝 'domestic goose' consists of 我 'I' (phonetic) + 鳥 'bird' (semantic).

In Old Chinese, 鵝 *ŋaj and 我 *ŋajʔ were nearly homophonous, but in Mandarin, 鵝 regularly developed to e:

*ŋaj > *ŋa [ŋɑ] > *ŋɔ > *ŋo > *o > e [ɤ]

whereas 我 has the irregular, slightly archaic reading wo retaining a labial vowel (instead of e).
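The regular chain for 鵝 can be replayed as a list of ordered rewrite rules. This is only a minimal sketch: the rule format and the function name `derive` are illustrative assumptions, and plain string replacement stands in for real conditioned sound change.

```python
# Ordered sound changes for 鵝, copied from the chain in the text.
# Each rule is (old, new); this rule format is an illustrative assumption.
RULES = [
    ("aj", "a"),  # *ŋaj > *ŋa
    ("a", "ɔ"),   # *ŋa  > *ŋɔ
    ("ɔ", "o"),   # *ŋɔ  > *ŋo
    ("ŋ", ""),    # *ŋo  > *o
    ("o", "e"),   # *o   > e (pinyin spelling; phonetically [ɤ])
]


def derive(form, rules=RULES):
    """Apply each rule once, in order, and return every intermediate stage."""
    stages = [form]
    for old, new in rules:
        form = form.replace(old, new)
        stages.append(form)
    return stages


print(" > ".join(derive("ŋaj")))
```

Feeding 我 *ŋajʔ through the same rules would not produce wo, which is one way of seeing that its modern reading is irregular.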

09.2.27.7:39: UNDERCOVER COGNATE

Can you spot it among these sinographs transcribed by

TT5722 ŋææ R23 2.20

The cognate is also a clue to this mystery.

Sinograph	Gloss	Middle Chinese grade	Middle Chinese	Tangut period NW Chinese	Mandarin
顏	face	II	*ŋæn	*ŋgæ̃	yan
岩	rock		*ŋæm
眼	eye		*ŋænʔ
雁	wild goose		*ŋænh
崖	high bank		*ŋæj	*ŋgæj
牙	tooth		*ŋæ	*ŋgæ	ya
芽	bud		*ŋæ
雅	elegant		*ŋæʔ
鴨	duck		*ʔæp	?ŋæ*
晏	peaceful		*ʔænh	?ŋæ̃*	yan
宴	feast	IV	*ʔenh	?ŋiẽ*	yan
英	heroic	III	*ʔɨæŋ	?ŋɨæ̃*	ying
邪	evil		*jæ	??ŋiæ*	ye
琊	second half of the place name 瑯琊		*jæ	??ŋiæ*	ye

Most of the sinographs have TPNWC readings with Grade II corresponding to Tangut Grade II.

All but one reading might have had the vowel *æ matching -ææ R23 2.20. The readings contain both oral and nasal vowels. I would have expected Chinese -æ̃ to be transcribed with -æ̃ R26, not -ææ R23. Perhaps TPNWC had denasalized its vowels. It's also possible that a Tangut dialect denasalized its vowels so -ææ R23 and -æ < -æ̃ R26 were both oral.

The use of a ŋ-initial tangraph for syllables that once had MC *ʔ- and *j- suggests that TPNWC had a nonetymological *ŋ- added to a 'zero initial' (from earlier glottal stop) and *j-:

MC	pre-TPNWC	TPNWC
*ʔ-	*Ø-	*ŋ-
*j-	*j-	*ŋj-

This is reminiscent of modern NW Mandarin dialects with ŋ- < *ʔ- (though no such modern dialect has a nasal initial in 英). However, I don't know of any modern Chinese languages with ŋ(j)- < *j-. Furthermore, I know of only three other velar-initial tangraphs transcribing TPNWC syllables that once had MC *ʔ-:

LFW1009 gee R38 2.34 for 哀 MC *ʔəj (in Leilin; ŋææ R23 2.20 also appears in Leilin)
LFW4848 ɣã R25 1.24 for 安 MC *ʔan (in Cixiaozhuan)

LFW5253 ɣew ?[ɣeɰ] R44 1.43 for 哀 MC *ʔəj (in Cixiaozhuan)

The TPNWC dialect known to the translator of Cixiaozhuan seems to have shifted MC *ʔ- to *ɣ-.

Perhaps the TPNWC zero-to-nasal or *ɣ-shift was only beginning at the time of the transcriptions, and extended to syllables whose modern equivalents would lack nasal initials: e.g., 英.

09.2.27.1:03: REGRESSIVE NASALIZATION IN TANGUT DISYLLABIC WORDS

The previous post contains the disyllabic word

bã ŋææ

Until recently, I have assumed that Tangut nasal vowels come from two sources:

1. Original *-VN sequences (either indigenous to Tangut or borrowed from other languages, particularly NW Chinese before it developed nasal vowels)

2. Borrowed -Ṽ (entirely from NW Chinese after it developed nasal vowels?)

Note that not all pre-Tangut *-VN sequences became Tangut nasal vowels: e.g., sọ 'three' does not have a nasal vowel even though it is related to Written Tibetan gsum. I do not yet understand why some *-VN became -V while others became -Ṽ. Was the outcome dependent on the vowel and/or the point of articulation of the nasal? The fact that there are no nasalized ũ-type syllables of native origin implies that all pre-Tangut *-uN were denasalized, but the situation is less clear for other *-VN syllables.

Lately I wonder if there is a third source of Tangut nasal vowels: nasal onsets of following syllables:

*CVNV > *CṼNV

I can't reveal a possible etymology of ŋææ without revealing the answer to yesterday's question, but I can say that its nasal is original. However, I don't know whether the nasalization of the vowel of the preceding syllable is the result of regressive assimilation

*ba ŋ... > bã ŋ...

or a trace of a lost nasal coda:

*baN ŋ... > bã ŋ...

I could pick a solution if I knew a likely external cognate for bã, but I don't. Since ŋææ by itself has an external cognate, I assume it is the root and that bã ŋææ could have once been a compound whose parts later became inseparable.
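The two scenarios can be put side by side in a short sketch. The function names, the use of a capital N for an unspecified nasal coda, and the assumption that the first syllable ends in its vowel are all illustrative choices, not claims about Tangut.

```python
import unicodedata

NASAL_ONSETS = ("m", "n", "ŋ")
TILDE = "\u0303"  # combining tilde


def nasalize(syllable):
    """Put a tilde on the final vowel of an open syllable."""
    return unicodedata.normalize("NFC", syllable + TILDE)


def regressive(first, second):
    """*ba ŋ... > bã ŋ...: the nasal onset colours the preceding vowel."""
    if second.startswith(NASAL_ONSETS):
        first = nasalize(first)
    return first, second


def lost_coda(first, second):
    """*baN ŋ... > bã ŋ...: the coda N drops, leaving nasalization behind."""
    if first.endswith("N"):
        first = nasalize(first[:-1])
    return first, second


print(regressive("ba", "ŋææ"))
print(lost_coda("baN", "ŋææ"))
```

Both functions return the same pair, which is the problem in miniature: the attested form alone cannot show which input it came from.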

09.2.26.5:24: LFW 3513 3672 3323 0176

məɨ bã ŋææ nɨaa = ?

Hints: bã ŋææ is a disyllabic word, and məɨ bã ŋææ is a calque of a Chinese word.

09.2.26.4:36: COLUMN AT THE CRIMSON CAPE

I wonder how many people reading this article will notice that Kızılburun has no dots on its i-s. Turkish distinguishes between dotted i [i] and dotless ı [ɯ] - like 'oo' but without lip rounding.

Kızılburun is from kızıl 'red' plus burun 'nose' (and by extension, 'tip' and 'cape'). "Kızılburun cape" on page 2 of the article is redundant.

As exciting as this may sound to some -

The ship carrying the column sank in 150 feet (46 meters) of water. That's deep for scuba gear. Each dive requires a 15- to 20-minute decompression stop on the way back to the surface. Carlson's team has just 20 minutes of actual work time on each dive—the risk of the bends goes up the longer they're down—and they can only dive twice a day.

- I'd rather stay on dry land where I can examine data for hours without worrying about the bends.

09.2.26.4:10: INCREASINGLY BEAUTIFUL SLEEP

I'd like that, but I have to stay up.

In my last post, I mentioned the transcription 阿彌陀 for Amitaabha Buddha. Amitaabha is from

a-mi-ta + aa-bha

'un-measure-d' + 'towards-shine'

'infinite' + 'splendor'

阿彌陀 corresponds to amita 'infinite'. One might think that 阿彌陀 was pronounced *ʔa mi ta in Chinese at the time the name was borrowed. But in fact the Late Old Chinese pronunciation of 阿彌陀 was *ʔa mie da. Although the *d could reflect a Middle Indo-Aryan source dialect with medial lenition, I don't think there's any Indic-internal explanation for *ie instead of *i.

Centuries later, the Old Japanese syllable *mi was generally transcribed as

彌 Middle Chinese *mie 'increase'

瀰 Middle Chinese *mie(ʔ) 'rich and full flow (of a river)'

美 Middle Chinese *mɨiʔ 'beautiful'

I used to explain the high frequency of 彌 MC *mie as an artifact of a period before pre-OJ *me and *mi merged into OJ *mi. I assumed that 彌 MC *mie was originally intended to transcribe pre-OJ *me but came to represent OJ *mi < pre-OJ *mi as well as pre-OJ *me.

However, that doesn't explain why 美 MC *mɨiʔ was favored over 寐 MC *mih 'sleep' to transcribe OJ *mi. In Nihon shoki poetry, 美 MC *mɨiʔ is used 12 times to write OJ *mi whereas 寐 MC *mih appears only once. And in Kojiki poetry, 美 MC *mɨiʔ outnumbers 寐 MC *mih for OJ *mi by a ratio of 179 to 0.

Here's what I think happened:

寐 MC *mih should come from Early Old Chinese *mi(t)s, yet its phonetic is 未 EOC *məts 'not yet', and an *ə-phonetic for a *i-syllable is unusual and possibly even nonexistent. Therefore it is likely that 寐 was also EOC *məts or perhaps *rməts. If I am correct, then OC had no *mi(t)s (or even *mi(ʔ)(s)!).

未 EOC *məts developed regularly to LOC *mɨəs, but 寐 EOC *(r)məts underwent subsequent irregular changes:

EOC *(r)məts > *mɨəs > *mɨs > LOC *mis

寐 LOC *mis with a final fricative was clearly not an acceptable transcription of Indic mi or pre-OJ *mi, so 彌 LOC *mie was used for both (and perhaps for pre-OJ *me as well).

In dictionary MC, 寐 became *mih and became marginally acceptable as a transcription of OJ *mi.

Note though that the Sino-Vietnamese reading mị for 寐 seems to reflect MC *mɨih < LOC *rməts. MC *mih would correspond to SV dị. Perhaps 寐 was also *mɨih in the MC dialect known to the Japanese and therefore an unappealing choice for OJ *mi because of its nonpalatal *-ɨ-.

But what about its near-homophone 美 MC *mɨiʔ for OJ *mi? Maybe the MC reading is only indirectly relevant. If the Paekche borrowed 美 MC *mɨiʔ as *mi, then the Paekche-ized reading would be a perfect match for OJ *mi. Then again, if that were correct, the Paekche-ized reading of 寐 MC ?*m(ɨ)ih would also be *mi and be equally accurate as a transcription of OJ *mi from a purely phonetic perspective. 美 could have been favored because 'beautiful' is positive whereas 'sleep' is neutral.

09.2.26.00:12: "I SET OUT TO LEARN DUTCH IN NAGASAKI"

Ik wil dit lezen. Too bad only a snippet view is available.

Fukuzawa Yukichi learned Dutch only to discover

that virtually all of the European merchants there [in Kanagawa, where I lived in 1991-1992] were speaking English rather than Dutch. He then began to study English, but at that time, English-Japanese interpreters were rare and dictionaries nonexistent, so his studies were slow.

He later published an English-Japanese dictionary and founded Keio Gijuku (where I studied 21 years ago).

Why did Fukuzawa study Dutch? What was the appeal of 蘭學 Rangaku, the 'study of orchids'? (蘭 Ran 'orchid' is an abbreviation of 阿蘭陀 Oranda 'Holland'*.)

The Dutch traders at Dejima in Nagasaki were the only European foreigners tolerated in Japan after 1640, and their movements were carefully watched and strictly controlled, being limited initially to one yearly trip to give their homage to the Shogun in Edo. They became instrumental, however, in transmitting to Japan some knowledge of the industrial and scientific revolution that was occurring in the West ...

*阿蘭陀 Oranda is a strange hybrid based on early Mandarin 阿 o and Japanese 陀 da. The choice of 蘭 could be based on early Mandarin lan or Japanese ran. The Mandarin name for 'Holland' is 荷蘭 Helan 'Lotus Orchid' which would have been pronounced Holan centuries ago.

Changing the middle character results in 阿彌陀 Amida, the name of Amitaabha Buddha. I plan to post about the Tangut version of his name soon.

09.2.26.00:02: "GOD IS NON-INDO-EUROPEAN"

according to Robert SP Beekes. He was referring to the Germanic word, not God. He thinks god is a borrowing from a pre-Indo-European *gut or *gud.

Here are some other Germanic words which may be of non-Indo-European origin. Some of them may turn out to be Indo-European after all.

2.26.00:18: Germanic does preserve the native Indo-European root for 'god': e.g., as Tue- in Tuesday.

Other English words with that root are listed here.

09.2.25.23:55: Ś Ź З / Ć З́ S

Are these 'extra' letters (from a Serbian or Croatian perspective) really necessary for Montenegrin?

Nikčević advocates amending of the Latin alphabet with three letters Ś, Ź, and З and corresponding Cyrillic letters Ć, З́ and S (representing IPA: [ç], [ʝ] and [ʣ] respectively).

Opponents acknowledge that these sounds can be heard by many Montenegrin speakers, however, they do not form a language system [= are not phonemic?] and so are allophones rather than phonemes. In addition, there are speakers in Montenegro who don't utter them and speakers of Serbian and Croatian outside of Montenegro (notably in Herzegovina and Bosanska Krajina) who do. In addition, introduction of those letters could pose significant technical difficulties (Eastern European code page ISO/IEC 8859-2 does not contain letter З, for example, and the corresponding letters were not proposed for Cyrillic).

ś is an interesting choice for [ç] which sounds more like [h] to me. Japanese [çi] is romanized as hi. One might assume wrongly that Montenegrin ś ź [ç ʝ] are homophonous with Polish ś ź [ɕ ʑ].

з for [dz] in the Montenegrin Latin alphabet is just bizarre. That letter is normally Cyrillic and represents [z], not [dz]. Confusingly, its Cyrillic counterpart ѕ [dz] looks like Latin s, though it is actually based on Greek ϛ [st] stigma. Cyrillic ѕ has a long history and is currently used to write Macedonian.

09.2.25.23:50: EXCESSI-FFU-E LETTERS

A couple of weeks ago, I found the 19th century textbook How to Learn Russian (1878) at Google Books. Today I found How to Learn Danish (Dano-Norwegian) (1879) from the same publisher using the same "Ollendorffian System of Teaching Languages", whatever that was - fads come and go in language learning. What's with the title in parentheses?

The term "Dano-Norwegian" has been used throughout the present work to avoid the constant repetition of the words Danish and Norwegian, both being, in point of fact, one and the same language.

Decades later, Bokmål almost ended up being called Dano-Norwegian:

The name Bokmål was officially adopted in 1929 after a proposition to call the written language Dano-Norwegian lost by a single vote in the Lagting (a chamber in the Norwegian parliament).

The hyphenated term has baggage.

The author of How to Learn Danish (Dano-Norwegian) clearly takes one side of the Norwegian language struggle:

Of late years a desire has been shown by certain patriotic Norwegians to secure for their native land a special mother-tongue, distinct from that which has for ages been common to the natives of Denmark and Norway. But the attempt to revive the language spoken by Norwegians before the union of their country with Denmark, at the close of the fourteenth century, would seem as impracticable and undesirable in our times, as if Englishmen were to insist upon incorporating in their written language the various remnants of Old English, which still survive in the local dialects of Cumberland, Dorset, and Somerset.

HoLD is not going to teach me the differences between standard Danish and Dano-Norwegian, I mean, Bokmål. It doesn't even start by teaching the alphabet which is covered in Appendix II* starting on p. 287! I can't imagine many language students being interested in the introduction which mentions this bit of trivia:

... neither precept nor ridicule could cure his countrymen of the taste for indulging in such verbal superfluity as that, for instance, of using ffu to represent the sound of v, as in haffue (have, to have).

You may recall that in Welsh, f is [v] and ff is [f]!

Speaking of v, Wikipedia says it can correspond to Bokmål g, which reminds me of the [v]-pronunciation of Russian г in его, etc.:

Sometimes Danish has /v/ ([ʊ̯], spelled v) after originally long stressed vowels, where Norwegian has restored/preserved /g/ from Old Norse. Example: Danish skov (forest), mave (belly) - Norwegian skog, mage - Old Norse skógr, magi. However, in many cases Norwegian has kept the Danish form (lyve "tell a lie" - Old Norse ljúga [cognate to English lie and German lügen]), and variation is permitted (mave, lyge, and even ljuge).

Presumably Danish v [ʊ̯] < *g had been *ɣ at some point.

*HoLD is so old that its appendix says Danish "[t]ill recently" had both ø and ö and it gives the spellings Kjöbenhavn and Köbenhavn for 'Copenhagen'**. (Danish now only has ø: København.) And it says å "in some dictionaries ... is made to precede the single ordinary a" (now it only follows y). It even gives examples of Danish in blackletter with very old spelling, including an aae/aa distinction now lost (both are equivalent to modern å).

(The Danish alphabet is presented in blackletter, including both ø and ö, on p. 2 of Rask's Danish Grammar. A Norwegian-Danish Grammar and Reader has ö and aa in nonblackletter but ø and aa ~ å in blackletter on p. 2! Both books have the alphabet where it belongs - in the beginning!)

Danish nouns throughout HoLD are capitalized German-style since decapitalization of nouns did not occur until after WWII. (Was that decision influenced by a desire to de-Germanize? Cf. the postwar abandonment of the artificial German-like case system of written Dutch.)

**Although it's tempting to explain the -g- in Copenhagen by assuming the English name was borrowed before Danish shifted *-g- to -v-, that consonant is ultimately from *-f- (cf. German Hafen and Old English hæfen > modern haven; the root has become modern have, and a haven is a place that has ships). The -f- is preserved in Hafnia, the Latin name for the city. Copenhagen is based on the Low German name Kopenhagen. Did Low German speakers hear Danish -v- as their -g- [ɣ]? That seems unlikely. Moreover, the modern Low German words for 'harbor' are Haven (identical to Dutch) and Hoben or Haben. Is Hagen an obsolete word?

The -m- of Swedish hamn 'harbor' is presumably an older -f- that assimilated to the following nasal:

-fn > -vn > -mn

09.2.25.4:05: BHAARATANAAMAANI

is Sanskrit for 'Indian names' (and yes, naamaani is cognate to English names).

This article is too much for me to digest in a single sitting. India is its own universe. A hypothetical article on European names would be much simpler.

Since I am interested in Avestan, I wonder what Zarathushtra would think about this:

Some families in India rename themselves on the basis of their profession. This is common among the Parsis, who often have surnames ending with "wala" (also spelled "walla" or "wallah"), meaning someone who engages in a particular activity. Names like Screwala when the person might have sold screws, or Cyclewala (cycle seller) are quite common; one Bollywood actress is named Shenaz Treasurywala.

The Irani Zoroastrians tend to have Irani as a surname.

What does Zarathushtra's name mean? The second part looks like Sanskrit uṣṭra 'camel', but the first part puzzles me too. And why is the Greek form Zoroaster instead of, say, Zarathustra? (There was no [ʃ] in Classical Greek.)

09.2.25.3:45: 'VULGAR SANSKRIT'?

I wish the Wikipedia article on Middle Indo-Aryan languages were comparable to the article on Vulgar Latin. With the exception of Pali which is still very much like Sanskrit, I know nothing about MIA languages and almost nothing about how those languages developed into New Indo-Aryan languages like Hindi, Bengali, etc. I can't find an article on NIA languages, though the overall Indo-Aryan language article has a brief overview of NIA phonology with charts of the consonant systems of 11 NIA languages. (Why does the chart for Hindi/Urdu treat "[w]" - presumably really v - as having a "very low functional load"!?)

09.2.25.3:15: BENGALI CONSONANT CLUSTERS

have their own Wikipedia article. Recycling the title of my previous post, "why not?" My notes:

- The Wiki romanization has some unusual characteristics:

ţ [ʈ] is not to be confused with Romanian ţ [ts]

đ [ɖ] is not to be confused with Vietnamese đ [ɗ]

f instead of the expected ph for [pʰ] ~ [ɸ] ~ [f] within standard Bengali (see note 1 below this table); eastern dialects have weakened [kʰ tʃʰ pʰ] to [x ʃ f] (see here) and some have even backed [ʃ] to [x] or even [h] (see here).

- খ্রিস্টান khrishţan is obviously not a "native coinage"

- What happened to native final clusters other than *-NC which became nasalization plus -C?

- Bengali has no w- or native v:

- Hence English [v] of nerve was borrowed as Bengali bh as well as v. I presume v is only in loans. (Pre-Bengali *v merged with *b.)

- Yet Persian/Urdu v- (in borrowings; < Arabic w-) was borrowed as Bengali o!

09.2.25.2:20: WHY NO-TT?

Looking at Wikipedia's article on the Welsh alphabet, I wonder why the voiceless counterpart [θ] of Welsh dd [ð] isn't written tt. If the spelling th for [θ] was influenced by English, why wasn't dh used for [ð]?

ff [f] : f [v] seems strange at first, but it is parallel to ll [ɬ] : l [l]. (But voiceless [r] is rh, not rr, and as we have already seen, dd [ð] is voiced.)

(2.25.2:35: This chart organizes Welsh consonants in terms of 'mutations'. In terms of spelling, ll and rh are the odd men out among the 'radical' set and dd is the sole 'soft' digraph.)

It sounds as if Welsh orthography once had a Vietnamese-like distribution* of c and k for [k]:

The grapheme k was also used more commonly than in the modern alphabet, particularly before front vowels. The disuse of this letter [k] is at least partly due to the publication of William Morgan's Welsh Bible, whose English printers, with type letter frequencies set for English and Latin, did not have enough k letters in their type cases to spell every /k/ sound as k, so the order went "C for K, because the printers have not so many as the Welsh requireth"; this was not liked at the time, but has become standard usage.

The cluster gwr- in Welsh gwraidd /gwraið/ (see here) reminds me of the Old Chinese cluster gʷr-: e.g., 倦 OC *gʷrens 'tired'. Note that OC *gʷ is a unit phoneme and not a cluster.

*In Vietnamese, [k] is written as k before front vowels and as c elsewhere.
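The footnoted rule is simple enough to sketch in a few lines. The helper name `spell_k` is hypothetical, the rime is assumed to be non-empty and to begin with its vowel, and the separate spelling qu- is ignored.

```python
import unicodedata

# Base letters of the front vowels; NFD decomposition strips tone marks
# and the circumflex, so i, y, e, ê and their toned forms all match.
FRONT_BASES = {"i", "y", "e"}


def spell_k(rime):
    """Spell [k] + rime: k before a front vowel, c elsewhere."""
    first = unicodedata.normalize("NFD", rime)[0].lower()
    return ("k" if first in FRONT_BASES else "c") + rime


for rime in ("ính", "ê", "á", "ơm", "ũ"):
    print(spell_k(rime))
```

The loop should print kính, kê, cá, cơm, and cũ, with k appearing only before i and ê.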

09.2.25.2:02: SPQ_

It's not just for Romans anymore.

Q stands for -que 'and'. Note that the acronym is not SQPR 'Senate and People Roman', but SPQR 'Senate People-and Roman'. -que is an enclitic that attaches to the second word of a sequence:

'X and Y' = X Y-que

Its Sanskrit cognate ca is similar:

'X and Y' = X Y-ca

Although que and ca (pronounced 'cha') sound quite different, they both come from Proto-Indo-European *kʷe. que preserves the PIE form whereas the Sanskrit form has undergone two changes:

*kʷe > *ce > ca

09.2.24.0:51: WHERE IS RZYM?

A hint: Polish rz [ʒ] is from earlier palatalized *r.

Another hint within a mystery: Why do Slavic languages have a nonlabial high vowel instead of o in

Old Church Slavonic Римъ

Bulgarian Рим

Macedonian Рим

Serbo-Croatian Рим Rim

Slovenian Rim

Czech Řím

Slovak Rím

Polish Rzym

Ukrainian Рим [rɪm]

Belarusian Рым [rɨm]

Russian Рим [rʲim]

These are reminiscent of Arabic ريم riim 'Rome' from روم ruum via an intermediate stage with [yy] (Kaye 1987: 671).

(I can't find riim in any Arabic dictionary. Can someone confirm its existence? Is it obsolete? In any case, I doubt it's the source of the Slavic -i-forms. Do any other European languages have -i- in this name?)

And why is the name masculine instead of an -a-final feminine?

The cognate Slavic male name has the expected o and a!

If you haven't guessed the English equivalent of Rzym by now, here's the answer.

09.2.23.23:57: 104 LANGUAGE LEARNING SITES

If 105 Tangut rhymes aren't enough for you, try 104 language learning sites (via my blogfather James Hudnall). I've visited or at least heard of some of these before, but not most of them.

The one that caught my eye was gulfarabic.com. I am always interested in phonetic descriptions of colloquial Arabic which might give me more ideas about emphasis in Chinese. On the two "Arabic Sounds" pages (1 2), I see that

- nonemphatic k has sometimes shifted to ch (are the exceptions conditioned, or loanwords from standard Arabic?)

- uvular voiceless q has generally become velar voiced g (there is no Chinese parallel; emphasis has no effect on voicing in Chinese)

- the voiced counterpart of emphatic t is ð (again, no Chinese parallel); there is no d, so did ض d and ظ ð merge into ð? I would have predicted that the two would merge into d which is relatively less unusual (and more common in standard Arabic) than ð. Old Chinese *l and *d merged into Middle Chinese *d, whereas Gulf Arabic keeps l (is it just in 'allaah?) distinct from d.

Another possible standard pronunciation of ظ is z which I assume is absent from Gulf Arabic. Old Chinese had no emphatic z, but that's to be expected since unlike Arabic, it also lacked a nonemphatic z. (Middle Chinese *z- is at least in part from Old Chinese nonemphatic *sl-.)

09.2.23.23:55: MCREFUGEES

Who are they? Can you guess from the Korean term for them?

햄버거 난민

hEmbOgO nanmin ('... difficulty-people')

If you know Chinese or Japanese, you probably guessed that nanmin = 難民. Answer here.

09.2.23.23:54: SHINING CLARITY OF KOREA

Who is 韓熙淑 Han Hi-sook? Hint: She's a famous American. The first letter is a clue. Answer here.

09.2.23.1:09: HOW MIXE-D UP IS TANGUT?

Although Tangut was spoken in Central Asia, typological parallels may turn up in the most unexpected places like Mexico:

Syllable nuclei are notoriously complex in Mixe and apart from the three lengths they can consist of one of two kinds of glottalization or aspiration; these vowel qualities are sometimes described as checked vowels, creaky voice vowels and breathy voice vowels.

I first took note of Mixe when I was looking for languages other than Estonian with three degrees of vowel length. Although I don't think Tangut had overlong vowels (and even long vowels are in doubt), its syllabic nuclei were probably complex. The 105 rhymes of Tangut did not end in consonants other than glides and therefore must have had complex vocalic nuclei. I've wondered if Tangut 'tense' vowels were creaky, and if the Tangut 'tones' actually referred to clear vs. breathy phonation.

The ancestor of Mixe had a simple phonology with only six vowels (disregarding length) - the same six I would reconstruct for pre-Tangut and Old Chinese (though I prefer the notation *ə to *ɨ for OC*):

*i	*ɨ	*u
*e	*a	*o

I presume that the creaky and breathy voice of Mixe reflect Proto-Mixe-Zoquean combinations of vowels with *ʔ and *h. Such combinations existed in Early Old Chinese, led to a Mixe-like phonation distinction in Late Old Chinese, and ultimately to tones in Middle Chinese and beyond. Hence Mixe may develop tones in the future. I suspect that the Tangut rising tone (breathy phonation or a pitch [originally?] associated with it?) could reflect pre-Tangut *-H, a merger of even earlier *-ʔ and *-h (assuming that pre-Tangut phonology was similar to that of [its sister language?] Old Chinese).

This sample of Mixe has some odd features that I have never seen anywhere, like prepalatalized stops including glottal stop ʲʔ (!?) and palatalized jʲ (!?). Perhaps those oddities are artifacts of an arcane phonological (as opposed to phonetic) notation.

*Vietnamese and Japanese evidence points toward a diphthong like *ɨə in Late Old Chinese and Early Middle Chinese. Such a diphthong may be from a nonemphatic Early Old Chinese *ə that partly bent upwards. *ə patterns more like a nonhigh vowel than a high vowel (which remained intact without bending):

Nonhigh nonemphatic vowels: first half bends up

EOC *ə > LOC *ɨə

EOC *a > LOC *ɨa

EOC *e > LOC *ie

EOC *o > LOC *uo

High nonemphatic vowels: no bending because they're already high

EOC *i > LOC *i

EOC *u > LOC *u

If EOC had a high nonemphatic *ɨ, it would have remained unchanged in LOC.
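The bending pattern in this footnote amounts to a lookup table plus one condition. A minimal sketch, in which the function name `loc_vowel` is an illustrative assumption and only nonemphatic vowels are handled, since the footnote says nothing about the emphatic ones:

```python
# Nonhigh nonemphatic vowels: the first half bends up.
BENT = {"ə": "ɨə", "a": "ɨa", "e": "ie", "o": "uo"}

# High vowels stay put; *ɨ is the hypothetical case from the last line above.
HIGH = {"i", "u", "ɨ"}


def loc_vowel(eoc_vowel):
    """Map a nonemphatic Early Old Chinese vowel to its Late Old Chinese reflex."""
    if eoc_vowel in HIGH:
        return eoc_vowel
    return BENT[eoc_vowel]


for v in ("ə", "a", "e", "o", "i", "u", "ɨ"):
    print(f"EOC *{v} > LOC *{loc_vowel(v)}")
```

Keeping *ɨ in the high set rather than the bent set is what separates it from *ə here: a high *ɨ would come out unchanged, whereas *ə comes out as *ɨə.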

09.2.22.18:24: RE-RE-C-Õ-NSTRUCTING TANGUT RHYME 29

I was not seriously considering the possibility of three degrees of vowel length in Tangut, though I wanted to see how that hypothesis would fare if tested against Sanskrit transcriptive evidence. It bombed, just as I expected.

I realized another hypothesis bombed as I wrote this line of that post:

R29 -ʌ is not unlike Sanskrit short a

But R29 tangraphs were never used to transcribe any Sanskrit syllables. All Tibetan transcriptions in Nishida (1964: 50) and Tai (2008: 212) end in -i(H). Chinese transcriptions of R29 syllables include sinographs probably pronounced with *i or *ɨ in Tangut period NW Chinese: e.g.,

匙示 cf. Sino-Korean si

之 cf. Sino-Korean ci

but note also

事獅 cf. Sino-Korean sa < earlier SK sʌ

率 cf. Sino-Korean sol 솔

but these pronunciations may not have been found in the NW

Most of the evidence indicates that my reconstruction -ʌ for R29 must be incorrect. So how should I re-reconstruct R29?

Before I answer that question, I want to state some premises:

In the Tangraphic Sea, Tangut rhymes are organized into groups by vowel type.

Within each group, rhymes are arranged according to 'grade'.

There are four grades:

I. Nonhigh vowel

II. Lowered vowel

III. Nonpalatal high vowel

IV. Palatal high vowel

Until now, I've assumed that the base vowel of group VI was ə:

Grade I: -ə R28 (already nonhigh, so remains unchanged)

Grade II: -ʌ R29 (lowered schwa; could also be -ɐ)

Grade III: -ɨə R30 (schwa partly bent up)

Grade IV: -iə R31 (schwa partly bent up)

However, these reconstructions conflict with the evidence:

-ə R28 never transcribes Skt -a (or any Skt vowel), even though Skt -a is schwa-like.

-iə R31 never transcribes Skt -ya and in fact transcribes Skt -i

If group VI is reinterpreted with a high base vowel ɨ, its rhymes fit the evidence better:

Grade I: -əɨ R28 (barred i partly bent down)

This diphthong is unlike anything in Sanskrit, so it is not surprising that R28 is not used in Sanskrit transcriptions.

This diphthong also has no Tibetan counterpart, so Tibetan transcriptions may reflect either its first or second parts:

-a, -e, -o (based on the nonhigh first part?)

-i, -u (based on the high second part?)

Grade II: -ɤ R29 (lowered barred i)

This vowel is unlike anything in Sanskrit, so it is not surprising that R29 is not used in Sanskrit transcriptions.

-ɤ R29 was transcribed in Chinese as -ɨ and in Tibetan as -i(H)

just as Estonian õ [ɤ] (not nasal [õ]!) was transcribed in Russian as ы

(Is the tilde used in other alphabets to indicate nonlabiality?)

Grade III: -ɨ R30 (already nonpalatal and high, so remains unchanged)

ʃɨ R30 1.29 transcribed Sanskrit ṣi (there is no ʃi R31 1.30)

Generally transcribed in Tibetan as -i(H) (rarely -a)

Grade IV: -iɨ R31 (first part of barred i became palatal)

R31 transcribed Sanskrit -i

Generally transcribed in Tibetan as -i(H) (rarely -a, -e, -ii, -yi)

Other group VI rhymes can be similarly reconstructed with added length, tenseness, and/or retroflexion: e.g., long retroflex R100 should be -ɨɨʳ instead of -ɨəəʳ. I have updated my Excel file of Tangut rhymes to include this new interpretation of group VI: .xls / .htm.
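For quick reference, the old and new readings of the four basic group VI rhymes can be lined up in a small table. This sketch only restates the values given above; the dictionary layout and the function name `compare` are illustrative assumptions and have nothing to do with the Excel file.

```python
# Grade -> (rhyme number, reconstruction), copied from the lists above.
OLD = {"I": (28, "ə"), "II": (29, "ʌ"), "III": (30, "ɨə"), "IV": (31, "iə")}  # base vowel ə
NEW = {"I": (28, "əɨ"), "II": (29, "ɤ"), "III": (30, "ɨ"), "IV": (31, "iɨ")}  # base vowel ɨ


def compare():
    """Return one line per grade showing the old and new reconstruction."""
    rows = []
    for grade in ("I", "II", "III", "IV"):
        rhyme, old = OLD[grade]
        _, new = NEW[grade]
        rows.append(f"Grade {grade}: R{rhyme} was -{old}, now -{new}")
    return rows


print("\n".join(compare()))
```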