Tangut has 105 rhymes that all end in vowels or semivowels (-j, -w). There is no agreement on what these 105 rhymes were, though I have my own ideas. Some Tangutologists believe that Tangut distinguished between short and long vowels: e.g., Arakawa reconstructed his rhyme group III as

Rhyme  Arakawa's reconstruction  Arakawa's grade  Arakawa's vowel length  Arakawa's medial
17     -a                        I                short                   none
18     -ja                       II                                       -j-
19     -aa                       III              long                    none

Rhymes 19 and 20 generally have different initials. If they were identical, why would the Tangut split -aa syllables into two different categories? Since rhymes 18 and 19 share initials, could R19 be -jaa, the long counterpart of R18 -ja?

Estonian has three degrees of vowel length: short, long, and overlong. (This is not clear from Estonian spelling, which indicates only a short-long distinction.)

Since the evidence for -j- in Tangut Grade II is weak and the three grades are transcribed very similarly, what if the three grades referred to three degrees of vowel length as in Estonian: e.g.,


Rhyme  Revised reconstruction  Grade  Vowel length
17     -a                      I      short
18     -aa                     II     long
19     -aaa                    III    overlong

If this reconstruction or Arakawa's were correct, I would expect Sanskrit syllables with long vowels to be transcribed by tangraphs pronounced with (over)long vowels. But there are many counterexamples in Nishida 1964:

Skt short vowel: Arakawa Grade III long vowel, three-lengths Grade III overlong vowel

Skt mu > A -uu, 3L -uuu R4

Skt tsi, di, ni, mi > A -ii, 3L -iii R11

Skt va, ṣa > A -aa, 3L -aaa R19

Skt ga, ca, tha, da, pa, ma, sa > A -aa, 3L -aaa R20

Skt dha > A -aa', 3L -aaa' R24

Skt si > A -II, 3L -III R31

Skt ti > A -ee, 3L -eee R37

Skt pha > A -aaʳ, 3L -aaaʳ R87

Skt short vowel: three-lengths Grade II long vowel

I can't find any examples of Grade II in Tangut transcriptions of Sanskrit. This implies Grade II had some phonetic characteristic absent from Sanskrit. In my own reconstruction of Tangut, Grade II vowels

ʊ ɩ æ ʌ ɛ ɔ

are absent from Sanskrit (though R29 is not unlike Sanskrit short a).

Skt long vowel: Arakawa / three-lengths Grade I short vowel

Skt puu > A/3L -u R1

Skt dhaa, lo [loo], śo [ɕoo] > A/3L -o R50

(Skt -o is always long [oo])

The lack of correlation between Sanskrit and Tangut vowel length suggests that Tangut did not have a short-long vowel opposition. Although my reconstruction has such an opposition carried over from Gong's reconstruction*, I would like to replace it with something else.

*Long vowels in my reconstruction and Gong's often correspond to Arakawa's V'.

RUSSTONIAN EESTI* WRONG!

This problem may not be limited to Estonian and Russian:

It should be noted that Estonian words and names quoted in international publications from Soviet sources are often incorrect back-transliterations from the Russian transliteration. Examples are the use of "ya" for "ä"** (e.g. Pyarnu instead of the correct Pärnu) and "y" instead of "õ" [ɤ] (e.g., Pylva instead of the correct Põlva). Even in the Encyclopædia Britannica one can find "ostrov Khiuma", where "ostrov" means "island" in Russian and "Khiuma" is back-transliteration from Russian instead of correct "Hiiumaa" (Hiiumaa > Хийума(а) > Khiuma).
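
The round trip described in the quotation can be sketched in a few lines of Python. The two mapping tables below are toy assumptions that cover only the letters of Pärnu and Põlva; they are not a real transliteration standard, and the names `convert` and `round_trip` are purely illustrative.

```python
import unicodedata

# Toy tables (assumed for illustration): only the letters needed for two names.
ESTONIAN_TO_RUSSIAN = {
    "p": "п", "ä": "я", "r": "р", "n": "н", "u": "у",
    "õ": "ы", "l": "л", "v": "в", "a": "а",
}
RUSSIAN_TO_LATIN = {
    "п": "p", "я": "ya", "р": "r", "н": "n", "у": "u",
    "ы": "y", "л": "l", "в": "v", "а": "a",
}

def convert(word, table):
    """Map each character through the table, leaving unknown characters alone."""
    return "".join(table.get(ch, ch) for ch in word)

def round_trip(estonian_name):
    """Estonian -> Russian Cyrillic -> Latin, then restore the initial capital."""
    normalized = unicodedata.normalize("NFC", estonian_name).lower()
    cyrillic = convert(normalized, ESTONIAN_TO_RUSSIAN)
    return convert(cyrillic, RUSSIAN_TO_LATIN).capitalize()

for name in ("Pärnu", "Põlva"):
    print(name, "->", round_trip(name))
```

Because Russian has no letter for ä or õ, the second step cannot recover the original vowels, which is where "Pyarnu" and "Pylva" come from.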

*Eesti is Estonian for 'Estonian' and is meant to sound like Russian есть, Latin est, Proto-Indo-European *Hesti 'is', etc.

**Cf. the Japanese use of ya after velars (but not labials!) to approximate English [æ]: e.g.,

cat > kyatto

gal > gyaru

*æ can also shift to *ja within a language: e.g.,

家 'house': Early Middle Chinese *kæ > Late Middle Chinese *kja > Mandarin jia [tɕja]

"EVEN OUR DICTIONARIES HAVE BECOME BATTLEFIELDS"

I'm not going to take sides in the conflict over Serbo-Croatian, so I'll just quote some articles I've been reading:

A Serbian perspective:

In the nineteenth century it was the Croats who insisted on the idea of unity, in language and otherwise, and when unification actually occurred it turned out that they needed it only to solve some of the problems they were going through at the time. They did not intend to construct and develop a mutual state, their goal was to separate themselves from Yugoslavia. Toward that end, Croatian linguists began to attend to differences and not to similarities. The Serbs, who took the idea of unity seriously, were surprised by this attitude. Those struggling for secession found a stronghold among Croat language experts, who had gained in strength especially after the Declaration on the Name and Position of Croatian Literary Language (1967); they presented the desire for unity to the Croatian public as Serbian unitarian pressure. The Croatian literary language was proclaimed to be a separate entity, and discussions about that served as psychological preparation for secession. Massive numbers of new words were coined so that the Croatian literary language would differ as much as possible from Serbian ...

Political events and the war in 1991 and 1992 caused the break up of state unity in the area where Serbo-Croatian is spoken. Croatia is being flooded with a new wave of artificially created differences in language in relation to the Serbs, the greatest wave since the so-called Independent State of Croatia which existed under the protection of the Nazis from 1941 to 1945. However, on the Serbian side there were no similar changes, and the existing conditions among the Serbian public indicate that no such changes will occur.

A Croatian perspective:

In Communist Yugoslavia, Serbian language and terminology were prevailing in a few areas: the military, diplomacy, Federal Yugoslav institutions (various institutes and research centres), state media and jurisprudence at Yugoslav level ...

The methods used for this "unification" were manifold and chronologically multifarious; even still in the eighties, a common "argument" was to claim that the opponents of the official Yugoslav language policy were sympathising with the Ustaša regime of World War 2, and that the incriminated words were thus "ustašoid" as well. Another method was to punish authors who fought against censorship. Linguists and philologists, the authors of dictionaries, grammars etc., were not allowed to write their works freely and according to the best of their professional knowledge and competence. Hence, for example, the whole edition of the Croatian Orthography ('Hrvatski pravopis') edited by Babić-Finka-Moguš (1971) was destroyed in a paper factory just because it had been titled "Croatian" Orthography instead of "Serbocroatian" or "Croatoserbian" Orthography. No Croatian dictionaries (apart from the historical "Croatian or Serbian" dictionaries, conceived in the 19th century) appeared until 1985, when Communist centralism was well in the process of decay ...

Prior to 1991, the passive Croatian vocabulary contained many banished words equivalent to the actively used words of the politically approved vocabulary. For example, the officers of the JNA could be publicly called only oficir, and not časnik. For the usage of word časnik ('officer'), coined by a father of Croatian scientific terminology Bogoslav Šulek, the physician Ivan Šreter was sentenced to 50 days in jail in 1987.

There is a Croatian linguistic award named after Šreter. Is his name of German origin? It looks like Schröter.

Many Croats reacted by "expelling" all words in the Croatian language that had, in their minds, even distant Serbian origin.

Croatian linguists fought this wave of "populist purism", led by various patriotic non-linguists. Ironically: the same people who were, for decades, stigmatised as ultra-Croatian "linguistic nationalists" (Stjepan Babić, Dalibor Brozović, Radoslav Katičić, Miro Kačić) have been accused as pro-Serbian "political linguists" simply because they opposed these "language purges" that wanted to kick out numerous words of Church Slavonic origin (which are common not only to Croatian and Serbian, but are also present in Polish, Russian, Czech and other Slavic languages).

Daniel Solanović, a Croatian:

"The war in the former Yugoslavia is being fought on several fronts, some of which are more salient than others. Even our dictionaries have become battlefields, where mortar shells have been replaced by the coinage of nouns in the effort of separating us from them."

Sean McLennan, a Canadian (?) third party:

In fact, the [Croatian linguistic purist] movement has required the coinage of new words (using "Croatian" roots and affixation and compounding) for many meanings that had no word other than the "Serbian" ... Ironically, it appears that Croatians with language variants similar to Serbs are looked down upon as much as Serbs themselves ...

Croatians that are emphatically nationalistic have used some of these [so-called 'Serbian'] words all their lives ...

Unsurprisingly, the [supposedly or economically?] poorest speakers of Croatian are those with a variant closest to Serbian.

I always wonder how effective language reform movements really are. Prescription does not entail adoption.

Estonian language reformer Johannes Aavik and others proposed ex nihilo neologisms. According to Wikipedia, there are only 50-60 such roots, and many are not really ex nihilo but arbitrarily altered foreign words. How many are in actual use?

Written Dutch once had an artificial, German-like declension system that is now extinct:

It was never spoken by Dutch people, but was required as a formality in most forms of writing. It was generally unpopular, not only for being an arbitrary, enforced system of grammar but moreover, especially during the Nazi era, for deriving its grammatical case system from High German [I doubt this; see below]. The whole system was disavowed and annulled by the Taalunie (Dutch Language Union) in the early 1950s as a bad and regrettable mistake in prescriptive linguistics.

This system looks like a slightly modernized version of the Middle Dutch declension system: e.g., MD die 'the' has been changed to de, and the MD strong feminine singular adjective endings -e and -er were merged into -e, as in the spoken language. It is not a copy of the High German declension system, though the prestige of HG might have been a factor motivating the artificial retention of extinct inflexions.

SERBIAN CYRILLIC - THE NEW GLAGOLITIC?

Is the Serbian Cyrillic alphabet doomed?

Globalism is not absent here, particularly in Belgrade. On nearly every street one can find the Latin script along with a plethora of billboard advertisements and slogans written in Latin. Along the main pedestrian-only street, Knez Mihaila ['Prince Michael'; knez is cognate to Eng king], English words are in abundance. English is widely spoken throughout Serbia, especially among the younger generations, going hand in hand with an extensive use of the Latin alphabet. It’s even quite difficult to find a T-shirt sold anywhere in Belgrade that is actually written in Cyrillic. The ubiquity of Latin is undeniable and the cultural purists may well use this as a tool in the near future to separate themselves from their adversarial, pro-western, liberal democratic types who the purists claim cave in to the US and the EU. After all, it is easy to associate the Latin script with the West, with America, with globalization, and ultimately claim it is un-Serbian ...

A look into the past gives a glimpse of what might be the inevitable doom of Serbian Cyrillic. Nikica Strizak, a student of Serbian language and literature, said, “If any script is going to die here it will be Cyrillic, sad but true. The reasons are purely practical. Cyrillic is going to become like the Glagolitic script used in Croatia until the XIXth century and now obsolete.” The new schooling system is based on learning two foreign languages from the first grade which will mean that Latin will be widely present in the future.

But Wikipedia says only English is compulsory in first grade; a second language doesn't start until fifth grade. And foreign languages with the Latin alphabet don't threaten Cyrillic in Russia. Early instruction in English doesn't threaten the scripts of China, Korea, or Japan.

Latin enables easier contact and communication in the sphere of technology and international matters. Indeed, in the technological age that is our time, Latin has a much higher chance of winning out over Cyrillic. Romania scrapped Cyrillic use altogether in favor of Latin as its only script more than a century ago.

But the Romanian Cyrillic letter џ lives on in Serbian Cyrillic!

Although I used to keep forgetting three Serbian letters, I would assume that learning two alphabets is a trivial task for a native speaker of Serbian. Serbian Cyrillic, with its one letter per phoneme principle, does not pose the same difficulties as, say, the Mongolian alphabet which is full of ambiguity or, worse yet, the dead Chinese-based Nôm script of Vietnam with thousands of vietographs*. However, the success or failure of a script cannot be predicted entirely on the basis of its complexity. Japanese has the most complex script in use today and it isn't going away any day soon. And Tangut was successful in its day, though I still suspect it had more principles than we realize.
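
The one-letter-per-phoneme principle is what makes converting Serbian Cyrillic to the Latin alphabet a plain table lookup. The Python sketch below is only an illustration under that assumption; the names `CYR`, `LAT` and `to_latin` are made up here, and the function handles nothing beyond the thirty letters.

```python
# The thirty lowercase Serbian Cyrillic letters in alphabetical order,
# paired position by position with their Latin equivalents.
CYR = "абвгдђежзијклљмнњопрстћуфхцчџш"
LAT = ["a", "b", "v", "g", "d", "đ", "e", "ž", "z", "i", "j", "k", "l", "lj",
       "m", "n", "nj", "o", "p", "r", "s", "t", "ć", "u", "f", "h", "c", "č",
       "dž", "š"]

TABLE = dict(zip(CYR, LAT))
# Capitals map to title-cased Latin, so Љ becomes "Lj" rather than "LJ".
TABLE.update({c.upper(): l.capitalize() for c, l in zip(CYR, LAT)})

def to_latin(text):
    """Transliterate letter by letter; anything outside the table is kept as is."""
    return "".join(TABLE.get(ch, ch) for ch in text)

for word in ("лук", "џез", "Ђорђе", "Београд"):
    print(word, "->", to_latin(word))
```

The reverse direction is less trivial, since Latin lj, nj and dž are digraphs that have to be recognized before single letters are converted.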

*A Nôm Wikipedia has been proposed! But the first question that came to mind was, "Who would read it?" And who would convert existing Vietnamese-language articles into Nôm? The process could not be automated, as one Vietnamese syllable can correspond to many different Nôm characters.

Here's a complete list of requests for new Wikimedia languages. Tangut is missing. Jurchen is also theoretically possible, but I don't think we'll see Khitan any day soon until that language is far better understood.

SO WHERE'S THE LIST OF NON-TANGUT TANGUT LEARNERS?

Is nationalism the motivation for this list? In any case, I don't think a language's importance is dependent on the fame of its foreign learners.

I wonder how accurate it is. Five out of nineteen names are followed by "citation needed".

It would be funny if some celebrity took up Tangut and started a fad. Nearly anyone would give up after trying to learn a few tangraphs. And how many could replicate the crucial distinction between plain, tense, and retroflex vowels?

Since Tangut syllables have a very simple structure

C(w)V(G) + tone (level or rising)

with only two possible codas (the glides -j and -w), one has to get the vowels right to be understood. Most beginning Tangut learners would probably pronounce the following eight rhymes as -a:

-a R17 1.17

-a R17 2.14

-ɨa R19 1.19

-ɨa R19 2.16

-ạ R66 1.63

-ạ R66 2.56

-ɨạ R67 1.64

-ɨạ R67 2.57

(1.X = level tone; 2.X = rising tone)
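
The merger can be made concrete with a small Python sketch. It assumes that the underdot marks a tense vowel and that a beginner ignores the underdot, the medial ɨ and the tone alike; the name `beginner_approximation` is hypothetical.

```python
import unicodedata

# The eight rhymes listed above: (reconstruction, rhyme number, tone.rhyme label).
# \u0268 is ɨ and \u0323 is the combining dot below.
RHYMES = [
    ("a", 17, "1.17"), ("a", 17, "2.14"),
    ("\u0268a", 19, "1.19"), ("\u0268a", 19, "2.16"),
    ("a\u0323", 66, "1.63"), ("a\u0323", 66, "2.56"),
    ("\u0268a\u0323", 67, "1.64"), ("\u0268a\u0323", 67, "2.57"),
]

def beginner_approximation(rhyme):
    """Drop combining marks and the medial ɨ, as a naive learner might."""
    decomposed = unicodedata.normalize("NFD", rhyme)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_marks.replace("\u0268", "")

merged = {beginner_approximation(rhyme) for rhyme, _, _ in RHYMES}
print(merged)
```

All eight entries collapse into a single value, so four rhymes in two tones would sound the same.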

No major world language has the diphthong ɨa. Retroflex vowels might be mispronounced as -Vr vowel-liquid sequences. Some learners would not be able to distinguish between short and long vowels: e.g., -a R17 vs. -aa R22.

Ћ Ђ Џ

I keep forgetting those three Serbian letters. (Not that I ever really tried to memorize them.) Unlike the other 'extra' letters (from my Russocentric perspective)

ј j [j] is like Latin j

љ lj [ʎ] looks like л l + ь (Serbian has no ь)

њ nj [ɲ] looks like н n + ь

they don't look like anything familiar, or their resemblance is misleading. ћ and ђ have nothing to do with Latin h, and џ (from the extinct Romanian Cyrillic alphabet!) is not voiceless or alveolar like ц. Maybe the following might help me remember them:

- ћ ć [tɕ] is like an inverted ч č [tʃ] with an added bar to distinguish it from h

- ћ ć [tɕ] sounds vaguely like English ch [tʃ]

- ђ đ [dʑ] has a tail indicating that it is the voiced version of ћ ć [tɕ]

(Serbo-Croatian đ [dʑ] is nothing like Vietnamese đ [ɗ]!)

- џ [dʒ] looks like ц c [ts] but is voiced and alveopalatal

- in Serbian alphabetical order, these three letters all follow letters for similar sounds:

- ђ đ [dʑ] after д d

- ћ ć [tɕ] after т t

- џ [dʒ] after ч č [tʃ] (not the ц c [ts] that it graphically resembles!)
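
The ordering can be checked with a short Python sketch. It assumes the standard thirty-letter Serbian Cyrillic order; `azbuka_key` and `predecessor` are illustrative names only.

```python
AZBUKA = "абвгдђежзијклљмнњопрстћуфхцчџш"
RANK = {letter: index for index, letter in enumerate(AZBUKA)}

def azbuka_key(word):
    """Sort key in Serbian alphabetical order; unknown characters sort last."""
    return [RANK.get(ch, len(AZBUKA)) for ch in word.lower()]

def predecessor(letter):
    """Return the letter that comes right before `letter` in the alphabet."""
    return AZBUKA[RANK[letter] - 1]

letters = ["џ", "ц", "ч", "ћ", "т", "ђ", "д"]
print(sorted(letters, key=azbuka_key))   # Serbian alphabetical order
print(sorted(letters))                   # raw code point order, for contrast
for letter in "ђћџ":
    print(letter, "follows", predecessor(letter))
```

Sorting by raw code point pushes ђ, ћ and џ to the end, because Unicode places them after the basic Russian range, so a custom key is needed to get the order a Serbian dictionary would use.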

Learning loanwords helps to reinforce a foreign alphabet: e.g., what is џез? (Russian has no letter џ, so its equivalent is джаз with a different vowel.)

I doubt there are Western loanwords with the letters ђ and ћ.

Quickly looking at the Serbian Wiktionary, I found Ђорђе Đorđe 'George'. (The tail of capital Ђ is so hard to see!)

I can't find any entries beginning with ћ in the Serbian Wiktionary, so I guess I'll have to regard the name ending -ић -ić as the archetypal example of ћ.

EASY, SOFT COFFEE

In my previous entry, I tried to guess variants of ijekavian words. Wikipedia provides actual examples under the misleading heading "Morphology" (oops): e.g., 'child':

ekavian dete

ijekavian dijete

ikavian dite
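
The three-way correspondence can be written as a lookup table. The Python sketch below uses a made-up notation, with ě for a short reflex of *ѣ and ē for a long one, and it assumes that ijekavian has -je- for the short vowel and -ije- for the long one; none of the names come from the sources quoted here.

```python
SHORT_JAT = "\u011b"  # ě, placeholder for a short reflex of *ѣ (notation assumed)
LONG_JAT = "\u0113"   # ē, placeholder for a long reflex of *ѣ (notation assumed)

REFLEXES = {
    "ekavian":   {SHORT_JAT: "e",  LONG_JAT: "e"},
    "ijekavian": {SHORT_JAT: "je", LONG_JAT: "ije"},
    "ikavian":   {SHORT_JAT: "i",  LONG_JAT: "i"},
}

def reflex(proto_form, dialect):
    """Replace each jat placeholder with the reflex of the given dialect."""
    table = REFLEXES[dialect]
    return "".join(table.get(ch, ch) for ch in proto_form)

CHILD = "d" + LONG_JAT + "te"
SUMMER = "l" + SHORT_JAT + "to"

for dialect in REFLEXES:
    print(dialect, reflex(CHILD, dialect), reflex(SUMMER, dialect))
```

This covers only the vowel itself; other dialect differences, such as the treatment of final -l in 'white', would need rules of their own.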

That entry also lists other kinds of phonetic variation involving h:

Bosnian and Croatian h : Serbian v

Bosnian h : Serbian and Croatian zero

The first reminds me of the correspondence between Ukrainian г [ɦ] and Russian г [v] in adjectival and pronominal declension:

Uk нового : Rus нового 'new' (masc. gen. sg.)

Uk його : Rus его 'him/his'

(But note that Ukrainian г is always [ɦ] whereas Russian г is normally [g].) I suspect pre-Serbian *h [x] voiced and labialized after the labial vowel *u:

*ux > *uxʷ > *uɣʷ > uv

'dry': Serbian суво < ?*suxo; cf. Russian сухой, Polish suchy, Czech suchý, Old Church Slavonic соухъ

'deaf': Serbian глуво < ?*gluxo; cf. Russian глухой, Polish głuchy, Czech hluchý (what's the OCS form?)

(Polish and Czech ch = [x])
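
As a quick check, the hypothesized change can be run over the two reconstructed forms. This Python sketch collapses the intermediate stages into a single step and writes [x] as x; the function name is hypothetical, and `laxko` is just a made-up control form without a preceding u.

```python
import re

def h_to_v_after_u(form):
    """Apply the guessed change *ux > uv in one step (x stands for [x])."""
    return re.sub(r"(?<=u)x", "v", form)

for proto in ("suxo", "gluxo", "laxko"):
    print("*" + proto, ">", h_to_v_after_u(proto))
```

The first two forms come out as suvo and gluvo, matching суво and глуво, while the control form is left alone because its x does not follow u.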

The second involves h before k and v (and any other consonants?). Comparison reveals that hC-clusters are conservative:

'easy': Serbian лако but Bosnian lahko; cf. Rus лёгкий, Pol lekki, Cz lehký, OCS льгъкъ

'soft': Serbian меко but Bosnian mehko; cf. Rus мягкий, Pol miękki, Cz měkký, OCS мѩкъкъ

Note that Rus гк is [xk] (just like Bosnian hk [xk]), not [gk]. I don't know why Russian doesn't have мяккий for 'soft'.

Even without such evidence, the change *C > hC is unusual, so I would guess that hC is older than C.

Bosnian kahva 'coffee' is not a native word unlike 'easy' or 'soft'. It's probably borrowed from Turkish kahve 'id.' (< Arabic قهوة‎ qahwah). Croatian kava could be simplified from earlier ?*kahva, but Serbian kafa might have an -f- due to the influence of foreign -f-forms (e.g., Italian caffè). I'm surprised that the Croatian form is closer to Bosnian and Turkish than the Serbian form, since the Croats were not ruled by Turks, unlike the Serbs.

WHITE ONIONS AMONG THE SEATS OF A SUMMER FLIGHT

Slavic languages are full of palatalization and/or its artifacts. Let's call nonpalatalized consonants 'type A' and palatalized consonants 'type B': e.g.,

type A l in Russian: лук 'onion'

type B lʲ in Russian: лёт 'flight', лето 'summer'

Old Chinese had a two-way distinction between 'type A' emphatic (= pharyngealized) and 'type B' nonemphatic consonants: e.g.,

Type of initial  Sinograph  Old Chinese gloss  Old Chinese  Middle Chinese  Mandarin
A                跌         stumble            *lit         *det            die
B                佚         escape             *lit         *jit            yi

The type A and B initials developed quite differently, obscuring the original reasoning behind the phonetic 失 OC *hlit (now Md shi!) 'lose' for both graphs. (All three words could be cognate: 'escape' involves captors losing someone and stumbling involves losing one's footing.)

Such split development has parallels in Slavic: e.g., Polish type A r and ex-palatal original type B rz [ʒ] < *rʲ no longer sound alike, though their spellings reveal their historical connection (cf. the shared phonetic 失 implying earlier phonetic similarity between Md 跌 die and 佚 yi).

Serbo-Croatian does not have a Russian-type consonant system full of type A/B pairs. It only has four sets of type A/B (unit) phoneme oppositions:

c [ts], č [tʃ] : ć [tɕ]

(no dz!), dž [dʒ] : đ [dʑ]

n : nj [ɲ]

l : lj [ʎ]

As one would expect, type A l- corresponds to Russian type A l-: 'onion' is лук / luk in both languages.

However, according to Harrap's Serbo-Croatian Phrase Book, type A l- and type B lj- both correspond to Russian type B lʲ:

'flight': SC let : Rus лёт

'summer': SC ljeto : Rus лето

SC type A m- and type B mj- (a nasal-glide cluster, not a unit phoneme like nj-) both correspond to Russian type B mʲ:

'among': SC među : Rus между

(note too how SC type B palatal đ corresponds to a Rus synchronic type A nonpalatal cluster жд!)

'seat': SC mjesto : Rus место

To confuse matters further, Browne (1993: 321) wrote the SC word for 'seat' as m(j)esto with an optional glide -(j)-! What's going on here? Do the SC C- : Cj- distinctions reflect different Proto-Slavic vowels?

I suspect so, since the Old Church Slavonic cognates of the SC lj- and mj- words have the vowel ѣ: лѣто and мѣсто. The authors of Harrap's seem to be using a '(i)jekavian' SC variety with -(i)je- as a reflex of *ѣ. (An example of -ije- from *ѣ is SC bijel 'white'; cf. OCS бѣлъ.) There are also 'ekavian' and 'ikavian' SC dialects which presumably have forms like

'summer': ?leto, ?lito

'seat': ?mesto, ?misto

'white': beo (in Browne 1993: 379), bil (in Browne 1993: 386)

Beograd 'Belgrade' is the 'White City'

Since ѣ may have been low front [æ] (Schenker 1993: 79), I am surprised that ikavian dialects raised it all the way to high -i-. Could ikavian -i- from *ѣ be a compression of an earlier *-ije-?

RUSSERBIAN OR CRUSSIAN?

I have no car, so I walk everywhere (except to the airport and to other people's houses). I love walking because I can exercise mentally as well as physically. I always bring a book to read as I walk. Much if not most of my reading is done on the go.

On Saturday, I needed a little book that would be appropriate for a short trip, so I took Harrap's Serbo-Croat Phrase Book. Tourist books are not reliable linguistic sources*, but they are a cheap way to get a feel for a language.

I have never heard Serbo-Croatian spoken, so I read the SC forms to myself as if they were Russian. I must have sounded silly. One of the first words in the book is пут put 'path' (cognate to the English word and Sanskrit pathin). I couldn't help but pronounce it as if it were Russian путь put' (ь does not exist in Serbian).

Polysyllables were worse. SC has nothing like Russian vowel reduction. So SC могу mogu 'I can' (cognate to English may and German mögen 'want'?) really is [mogu] with initial (?) stress, whereas Russian могу mogú with final (!) stress sounds like magu to SC speakers since the o of unstressed mo- approaches a.

I imagine I would stop speaking 'Russerbian' or 'Crussian' if I took a course in SC, but since that'll never happen, the temptation to mispronounce it (and other Slavic languages) will persist. Similarity can be a stumbling block.

*For example, Harrap's leads one to believe that Serbo-Croatian has a simple stressed/unstressed syllable distinction, but the reality is far more complex.

ЕСТЬ RUSSIAN EASY?

When writing "J-silent j", I wanted to compare Czech js-verb forms with the obsolete forms of the Russian verb быть 'to be' which weren't in Starostin's online verb tables or in my modern Russian grammar. I found the old Russian words for 'am', 'art', etc. in Riola's (1878) How to Learn Russian:

Gloss         Russian (obsolete in red)  Polish     Czech   Sanskrit  Latin
I am          есмь                       jestem     jsem    asmi      sum
thou art      еси                        jesteś     jsi     asi       es
he/she/it is  есть                       jest       je(st)  asti      est
we are        есмы                       jesteśmy   jsme    smas      sumus
you are       есте                       jesteście  jste    stha      estis
they are      суть                       są         jsou    santi     sunt

Systematic correspondences in such paradigms indicate that those languages are related. There are no such paradigms in the languages that I work with: e.g., Chinese does not have verb inflection and Korean and Japanese have very different verb systems.

The title refers not only to есть but also to a surprising assertion in the introduction:

... Russian is easy of acquirement by dint of average diligence and perseverance.

Is he kidding!? The Defense Language Institute considered Russian to be harder than Western European languages for English speakers in 1973 (and I have no reason to believe they've changed their mind since then). Russian in 1878 was even harder than it is now, due to a redundant alphabet (extinct letters in red):

i was и, і, and ѵ (cf. Greek ι and υ)

e was both е and ѣ

f was both ф and ѳ (cf. Greek θ)

and silent ъ was all over the place: e.g., былъ 'was' (now был)

Russian once had even more letters: e.g., ѡ (cf. Greek ω). Of course, spelling does not equal an entire language, but it is part of a language, and a particularly important part for learners who rely heavily on written materials.
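
The redundancy listed above can be undone mechanically, which is easier than going the other way. The Python sketch below is a simplified assumption that covers only the four extinct letters and word-final silent ъ; it ignores every other difference between 1878 spelling and modern spelling, and `modernize` is an illustrative name.

```python
import re

# Extinct letters and their modern replacements (lowercase and capital):
# і and ѵ become и, ѣ becomes е, ѳ becomes ф.
OLD_TO_NEW = str.maketrans({
    "\u0456": "\u0438", "\u0475": "\u0438", "\u0463": "\u0435", "\u0473": "\u0444",
    "\u0406": "\u0418", "\u0474": "\u0418", "\u0462": "\u0415", "\u0472": "\u0424",
})

def modernize(text):
    """Replace the extinct letters, then drop ъ at the end of each word."""
    text = text.translate(OLD_TO_NEW)
    # A ъ inside a word is followed by another letter, so \b does not match there.
    return re.sub("\u044a\\b", "", text)

for old in ("былъ", "лѣто", "мѣсто"):
    print(old, "->", modernize(old))
```

A learner in 1878 faced the harder direction: knowing when to write ѣ rather than е cannot be derived from the modern spelling and had to be memorized word by word.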

Tangut fonts by Mojikyo.org
Tangut radical font by Andrew West
All other content copyright © 2002-2009 Amritavision