Amaravati: Abode of Amritas

12.12.29.23:59: HOW CAN NASAL(IZATION) RAISE AND LOWER VOWELS?

Why would [low] a raise to schwa before n, m, and even v?

In French, nasalized nonlow* vowels from earlier vowel-nasal sequences lowered:

*-en > [ɑ̃]

*-in > [ɛ̃]

*-on > [ɔ̃]

*-un > [œ̃]

How can nasal consonants (as in Avestan) and nasalization (as in French) condition opposite outcomes?

As for v, which at first appears to be the odd man out in Avestan, could it have been a glide like [ʋ] or [w] that was a sonorant like n and m?

I am reminded of how Early Middle Chinese *m became Late Middle Chinese *ʋ before *u: e.g.,

萬 EMC *muanh > LMC *ʋàn

12.30.2.10: But wait - according to Wikipedia, Proto-French *o before *n raised in Late Old French and then lowered:

PF *on > LOF ũ > Fr [ɔ̃]

I don't think nasality had anything to do with the raising. Proto-French had no *-un because Gallo-Romance *-un had become Proto-French *-yn. My guess is that *on rose to fill the gap left by *un. If *u had not fronted, *o might not have risen.

I wonder what the evidence for the height of Late Old French ũ is.

Tangut also lacked *ũ except in three Chinese loanwords with rhyme 104 (1.96):

1xəũ 'red' < Chn 紅

1təũ 'winter' < Chn 冬

1tsəũ 'transcription character for 宗, 總, 駿'

Would *õ have eventually raised to *ũ if Chinese loanwords had not filled that gap?

*12.30.1:30: Obviously low vowels couldn't be lowered because they were already low and had nowhere to go but up.

12.12.28.23:59: PARƏ̄-BLEMS (PART 3)

In part 1, I suggested that the schwa in Old Avestan parə̄ 'before' was from unstressed *-as or *-ō (from *-as). If this was correct, I would expect its Sanskrit cognate paras to have an accented first syllable. But the accent is actually on the second syllable (parás), not the first (*páras).

Sanskrit had pitch accent, not stress accent, so one might reconstruct a Proto-Indo-Iranian *ˈparás with a high pitch on the second unstressed syllable and low pitch on the first stressed syllable. But does any language have such combinations of pitch and stress?

Could the original form have been *páras with a high pitch on the first syllable (like its Sanskrit cousins pára 'far' and pári 'from'), with the accent later shifted to the end in Sanskrit by analogy with other adpositions like tirás 'through' and the semantically and phonetically similar purás 'before'? On the other hand, Avestan could have shifted to a stress accent system:

*páras > *ˈparas > ˈparə̄

Other accentual shifts in either Avestan or Sanskrit would be needed to account for other mismatches like

'who': OAv yə̄ : Skt yás
'whom': OAv yə̄m : Skt yám

'with': OAv hə̄m : Skt sám

'strong': OAv ə̄mavaṇtəm : Skt ámavantam

In fact Jackson (1892: 10) listed only one example of Old Avestan ə̄ corresponding to an unaccented Sanskrit vowel:

'us' (enclitic): OAv nə̄ : Skt nas

This scenario does not predict that Old Avestan schwas correspond to Sanskrit and Greek accented vowels: e.g.,

'I': OAv azə̄m : Skt ahám : Gr ἐγώ

Would a correspondence like this imply that either Proto-Indo-Iranian or Greek had shifted its accent?

'us' (nonenclitic): OAv ə̄hma- : Skt asmá- (both from Proto-Indo-Iranian *asmá-) : Aeolic Gr ἄμμ-

And what is one to make of words like

'being': Av həṇtəm : Skt sántam

in which both vowels are schwa (and short)?

I don't think accent has anything to do with schwas in Avestan. However, the adjacent consonants are key. Jackson (1892: 9) pointed out that Avestan schwa

often corresponds to Skt. a before n or m - regularly so before the latter when final; occasionally also before v.

Why would a raise to schwa before n, m, and even v? Is there some common factor shared by that raising with the shift of *-as or *-ō (from *-as) to long -ə̄?

Why did schwa lengthen in nonfinal position: e.g., in 'I'?

Proto-Indo-Iranian *ajham > *azəm > OAv azə̄m

(12.29.0:05: Young Avestan, though newer, still has short schwa: azəm! Older and more innovative are not mutually exclusive categories. Khitan has innovations absent from Mongolian which is attested later: e.g., Mongolian tabun 'five' is more conservative than Khitan tau.)

To merge with more frequent (?) final long -ə̄? Has anyone calculated the frequency of Avestan segments? (I hesitate to write "phonemes" since Avestan phonemics are still not fully understood.)

12.29.00:59: AFTERWORD: I write entries like these about topics I don't specialize in with a degree of embarrassment. I have no pretensions of originality here. I could be reinventing the wheel, but it's more likely that I'm just forging a square that will never roll anywhere. I do enjoy the mental exercise, though.

I don't keep up with the literature and in many ways my knowledge is stuck in the 19th century, which is partly why I still write -y- and -v- for -ii- and -uu-. (I also favor the older transcriptions of medial glides because nonspecialists reading this blog might mistake the newer transcriptions for long vowels.) Jackson's 1892 grammar was my first introduction to Avestan 20 years ago. Maybe tremendous progress has been made in the 120 (!) years since then. I finally got around to ordering Beekes' 1988 A Grammar of Gatha-Avestan. I assume I could also learn a lot from the section on schwa (partially viewable online) in De Vaan's The Avestan Vowels (2003) which I forgot about until now. Alas, that book is not available for purchase.

12.12.27.23:59: PARƏ̄-BLEMS (PART 2)

When I wrote "Mysteries of the Magi", I glossed Old Avestan magāunō in

parə̄ magāunō 'before the master of the gift exchange*' (Yasna 33.7; tr. by Skjærvø 2006: 89)

as a genitive singular since

- parə̄ governs the ablative, genitive, or locative (Jackson 1892: 204)

- based on my understanding of Jackson (1892: 91), those case forms in the singular should be

abl. magāunat̚

gen. magāunō

loc. magāunī

and obviously only the genitive matches.

But Skjærvø (2006a: 22, 27) identified magaonō (note the different spelling for the same word in Yasna 33.7!**) as an ablative singular and listed parə̄ as solely requiring the ablative, so I changed my entry.

How can magāunō (and/or magaonō) be an ablative singular if the ablative singular is supposed to be magāunat̚? I had overlooked the fact that the ablative singular of non-a-stem words like magavan- is identical to the genitive singular in Old Avestan (Jackson 1892: 65, Skjærvø 2006b: 3). In Young Avestan, the a-stem ablative ending -t̚ spread to the other declensions, so the Young Avestan ablative would be magaonat̚ with -ao- instead of Old Avestan -āu- (Jackson 1892: 21).

That raises another issue. I can only be sure that parə̄ only governs the ablative in Old Avestan if there are Old Avestan phrases in which it precedes unambiguous ablatives:

parə̄ a-stem + -t̚ ablative

parə̄ any stem + ablative plural (which is never the same as the genitive plural)

But none of the parə̄ that I can find in the Gathas (the core source of Old Avestan) are in such a context:

parə̄ + non-a-stem ablative-genitive:

parə̄ magāunō ~ magaonō (Yasna 33.7)

parə̄ + non-a-stem instrumental (! - with a difference of meaning):

parə̄ vå vīspāiš vaoxəmā daēvāiš-ca xrafstrāiš maṣ̌yāiš-ca (Yasna 34.5)

'over-and-above you-ACC PL all-INST PL declare-PF 1 PL gods-INST PL-and creepy-INST PL men-INST PL-and'

'We have (always) declared you (to be) over and above all the creepy old gods as well as (their?) men' (tr. by Skjærvø 2006: 45; no 'old' in the original Old Avestan)

Skjærvø (2006: 89) grouped parā 'before' together with parə̄ 'before' as examples of prepositions governing the ablative, but parā is also followed by ambiguous non-a-stem ablative-genitives:

parā mazə̄ yåŋhō (Yasna 30.2)

before great-ABL/GEN SG audition-ABL/GEN SG

'before the great audition' (tr. by Skjærvø 2006: 89; see Jackson 1892: 83 on the abl.-gen. of consonant stems like maz-*** and yāh-****)

So how did Skjærvø determine that parə̄ governs the ablative? And how did Jackson determine that parə̄ also governs the locative?

12.28.3:06: As usual, I wrote in haste, and a possible answer came to me right after I was done: Sanskrit paras 'beyond' (not 'before'!) governs the ablative (and the accusative, instrumental, and locative, but not the genitive), so it's likely that its Old Avestan cognate parə̄ also governed the ablative (and in Yasna 34.5 definitely governed the instrumental which neither Skjærvø nor Jackson mentioned).

Similarly, Sanskrit para 'more than' which governs the ablative (and only occasionally the genitive) is cognate to Old Avestan parā 'before' (with secondary lengthening), so it's likely that the latter also governed the ablative, though the genitive cannot be completely ruled out. According to Jackson (1892: 204), para (= Old Avestan parā) governed four cases: accusative and instrumental (unlike Sanskrit para) as well as ablative and genitive (like Sanskrit para).

As tempting as it is to guess about Avestan on the basis of Sanskrit, I think guesses should be explicitly marked as such in reference grammars (e.g., Jackson 1892 has hypothetical forms in parentheses) but perhaps not in pedagogical materials. "Perhaps" because it might be a good idea to let students know how uncertain the state of our knowledge is.

*12.28.3:26: Skjærvø's translation of magāunō (his magaonō) is different from Wikipedia's 'the religious caste of the Medes*' and 'possessing maga-' and Kanga's (1900: 388) 'one who undertakes difficult works (orig.); a righteous or devout man; a virtuous person who has full faith in religion [...] Latterly, it meant a celibate'.

**12.28.3:26: Is Skjærvø's spelling from a different manuscript? His -ao- is expected from a Sanskritocentric perspective as it corresponds perfectly to the -o- in maghon-, the Sanskrit cognate of magāun- ~ magaon-.

***12.28.3:45: The genitive singular mazə̄ is from Proto-Indo-Iranian *majhas which in turn is from Proto-Indo-European *megʕ-es. See part 1 of this series on the shift of *-as to -ə̄ in Old Avestan. I will revisit this topic in Part 3.

****12.28.3:39: The stem yāh- is from *yās-:

- When *yās- was followed by high vowels, it became yāh-.

- When *yās- was followed by nonhigh vowels, it became yåŋh-.

I don't understand how *-s- became -ŋh- with a nasal. I think rhinoglottophilia was involved.

12.12.26.23:59: PARƏ̄-BLEMS (PART 1)

The Old Avestan word magāunō from my last post on the word magi is from the phrase

parə̄ magāunō (Yasna 33.7)

Old Avestan parə̄ 'before' corresponds to Young Avestan parō. One might initially conclude that Old Avestan -ə̄ merged with -ō in Young Avestan, just as Old Japanese ə merged with o in Middle Japanese. But both Old Avestan final vowels go back to a Proto-Indo-Iranian *-as still intact in Sanskrit paras 'beyond' (whose sandhi variant is paro [pɐroː]).

Burrow (1973: 101) regards Old Avestan ə̄ as a "dialectal variant corresponding to the -e of Māgadhī [i.e., eastern Middle Indo-Aryan]" and the even earlier -e "found even in the Ṛgveda (sūre duhitā 'daughter of the sun' [instead of *sūras duhitā])".

What's going on here? My guess:

- In the dialects ancestral to Young Avestan and mainstream Sanskrit:

*-as > *-aš > *-až > *-av > *-aw > -ō

The change of *ž to *v can be compared to these changes in Xi'an (though they are conditioned by an *u absent in *-as):

書 *šu > fu 'book'

入 *žu > vu 'enter'

- In the dialect ancestral to Old Avestan (but not Young Avestan!), *-as might have become unstressed *-əs, lost its *-s, and lengthened in final position like all other Old Avestan vowels:

*-as > *-əs > *-ə > -ə̄

A simpler scenario is that unstressed -ō from *-as (see above) was reduced to -ə̄ in Old Avestan. Unfortunately, nothing seems to be known about Avestan pitch or stress accent, though I suspect it can be partly reconstructed on the basis of shifts like these.

- In the sūre-dialect of the Ṛgveda and the ancestor of Māgadhī:

*-as > *-aš > *-až > *-ay > *-ey > e [eː]

- Another possibility for those varieties of Indo-Aryan (cf. the fronting of a before coronals in Tibetan: -as > -ɛ):

*-as > *-es > *-eš > *-ež > *-ey > e [eː]

12.27.3:18: This is unlikely since other coronals did not condition fronting in Old Indo-Aryan: e.g., *-at did not become *-et.
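
To make the comparison between these scenarios concrete, here is a minimal Python sketch of my own (it is not from Burrow, Bloomfield, or any other source cited here). It simply stores the three main chains proposed above as lists of stages and reports where any two of them part ways. The names CHAINS and shared_stages are hypothetical, and the stages are only my guesses restated as data, not established facts.

```python
# Hypothetical illustration: the proposed developments of final *-as,
# copied from the chains in this post (asterisks and hyphens omitted).
CHAINS = {
    "Young Avestan / mainstream Sanskrit": ["as", "aš", "až", "av", "aw", "ō"],
    "Old Avestan": ["as", "əs", "ə", "ə̄"],
    "sūre-dialect / pre-Māgadhī": ["as", "aš", "až", "ay", "ey", "e"],
}


def shared_stages(a, b):
    """Return the leading stages that two chains have in common."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


if __name__ == "__main__":
    names = list(CHAINS)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            common = shared_stages(CHAINS[first], CHAINS[second])
            # Every chain starts from *-as, so common is never empty.
            print(f"{first} vs. {second}: shared up to *-{common[-1]}")
```

Under these assumptions the Young Avestan/Sanskrit chain and the sūre-dialect chain share everything up to *-až and differ only in which glide *ž turned into, whereas the Old Avestan chain diverges at the very first step.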

Next: What Comes After 'Before'?

12.27.2:24: I wish I could see what W. Sidney Allen wrote about this in full in Sandhi (1962: 101). All I can see in Google Books is a snippet that led me to Bloomfield (1882) who proposed that the vowels of Proto-Indo-European *-es and *-os were preserved as -e in Māgadhī, -o [oː] in classical Sanskrit, and -ō in Young Avestan.

Here's just one problem with Bloomfield's proposal: on pp. 33-34 he notes that most Sanskrit *-as come from *-os, not *-es, and calculates that the former outnumber the latter by an 18 : 7 ratio in texts. Why would pre-Māgadhī speakers change the majority form *-os to the minority form *-es that according to him later became -e? I think it's more likely that different Old Indo-Aryan dialects changed *-as (< Proto-Indo-Iranian *-as < Proto-Indo-European *-es and *-os) in different ways.

I can't find anything on this topic in Kobayashi's (2004) Historical Phonology of Old Indo-Aryan Consonants.

12.12.25.23:39: MYSTERIES OF THE MAGI

The Wikipedia section on the Iranian origins of magi is confusing.

First, it is titled "In Persian sources" even though it largely deals with Avestan which is not ancestral to Persian.

Second, it mentions three Avestan words -

Old Avestan magāunō 'the religious caste of the Medes* which Zoroaster was born into'

Young Avestan moɣu-t̚ biš 'hostile to the moghu'

Old Avestan magavan- 'possessing maga-'

- and denies that the third is related to the first two even though magāunō is the ablative singular of magavan-! Would it be more accurate to say that magavan- and moɣu- are not related?

The differing first vowels and second consonants are not a barrier to relationship since Avestan short o is from short a before u** and Young Avestan -ɣ- is from Old Avestan intervocalic -g-, so moɣu- is from an earlier *magu-.

The Sanskrit cognate of Old Avestan magavan- is maghavan 'generous' from magha- 'gift' and the suffix -van. Magha- in turn is derived from the root √maṃh 'give' plus a suffix -a-; I assume its first -a- is from a syllabic nasal in zero-grade *mn̩gh-.

Old Avestan magāunō corresponds to Sanskrit maghonas. Normally Avestan āu corresponds to Sanskrit au from *āu, whereas Avestan ao corresponds to Sanskrit o from *au. Why isn't the Avestan form *magaonō with ao? The genitive singular*** of aṣ̌avan- 'righteous' in the same declension is attested in Old Avestan as both aṣ̌aonō and aṣ̌āunō. Could this variation have arisen during the oral transmission process?

*Although the author of that gloss in Wikipedia implies that Zoroaster (or at least his caste) was originally from Media, Encyclopædia Iranica is agnostic (emphasis mine):

“When and where Zaraθuštra lived, one does not know.” Those words of H. Lommel (1930, p. 3) ring as true today as they did when he wrote them. Despite many attempts to situate Zaraθuštra in historical time and geographic place, all we have are possibilities that may strike one as more or less reasonable.

[...]

There is really nothing in the Gathas which might give a clue where Zoroaster lived or the areas in which he was active. In the Avesta, the geography of the Vendīdād and of the Yašts make it clear that these texts locate themselves in eastern Iran. Even though there are later traditions which place him in Azerbaijan and Media [i.e., to the west], it is more reasonable to locate Zoroaster somewhere in eastern Iran along with the rest of the Avesta.

It is odd that Zoroaster's name is not regularized within the article. I wonder if it is rendered both ways throughout Encyclopædia Iranica.

**Proto-Indo-European short *o became *a in Proto-Indo-Iranian. All Avestan short o are secondary.

Avestan short o in the diphthong ao is from the diphthong *au.

Not all short a before a consonant followed by u became o: e.g., Old Avestan pasu- 'domestic animal' remained unchanged in Young Avestan instead of becoming *posu-. (Kanga's 1900 Avestan dictionary has no entries beginning with po-.)

***12.27.20:07: In Old Avestan, the ablative and genitive singular are identical for all non-a-stem nouns (Skjærvø 2006: 3), so the ablative singular of Old Avestan aṣ̌avan- should also be aṣ̌aonō ~ aṣ̌āunō.

12.12.24.23:51: OSSETIC EJECTIVES

The only Iranian languages I'm acquainted with are Avestan and Persian.

Avestan is similar to its sister Sanskrit. It lacks retroflexes and aspirates, but is rich in fricatives.

Nonoverlapping consonants in Avestan (including allophones; blue) and Sanskrit (red)

	Labial	Dental	Alveolar	Retroflex	Alveopalatal	Palatal	Palatovelar	Velar	Labiovelar	Glottal
Voiceless stops		t̚ /t/*		ʈ
Voiceless aspirates	pʰ	tʰ		ʈʰ		cʰ		kʰ
Voiceless fricatives	f	θ			ʃ		xʲ	x	xᵛ	h
Voiceless nasals	m̥ /hm/**
Voiced stops				ɖ
Voiced aspirate stops	bʱ	dʱ		ɖʱ		ɟʱ		gʱ
Voiced fricatives	β	ð	z		ʒ			ɣ		ɦ
Voiced nasals				ɳ			ŋʲ		ŋᵛ

Avestan's 'niece' Persian also has more fricatives than a typical Indo-Aryan language prior to Perso-Arabic contact: e.g., there are no native Hindi words with f, z, ʒ, x, ɣ which are in Persian.

None of the consonants of Avestan or Persian strike me as exotic.

Ossetic, on the other hand, is an Iranian language with ejectives which are relatively rare in the world's languages. They are in only 15% of the languages in UPSID, though they are common in the Caucasus where Ossetic is spoken.

The ejectives of Ossetic are not inherited from Proto-Indo-European even if the glottalic theory is correct: e.g., PIE *tʼéḱm̩t 'ten' became Ossetic dɐs, not *tʼɐs. Are they like Indo-Aryan retroflexes which have both native and borrowed sources: e.g., Sanskrit retroflex ṇ [ɳ] is from

- native dental *n assimilating to a preceding *r: e.g., karaṇa- 'action' < *kara-na- (cognate to the words karma and Sanskrit)

- Dravidian retroflex ṇ: e.g., koṇa- 'corner' (cf. Tamil kōṇam 'id.'; the resemblance to English corner is coincidental, as the true Sanskrit cognate of that word is śṛṅga- 'horn')

Or are Ossetic ejectives only in loanwords?

12.25.5:30: JTL Cheung answered my question in his 2000 PhD dissertation Studies in the historical development of the Ossetic vocalism which also describes the history of Ossetic consonants (p. 13):

The ejective consonants [of Ossetic] are frequently found not only in Caucasian loanwords, but also in borrowings from Russian, where they represent voiceless stops and affricates, e.g., pʼalet 'epaulette' (cf. Russ. èpolet), parti [pʼarti] 'party' (cf. Russ. partija). Sometimes even inherited [i.e., native] forms contain an ejective consonant, see § 0.7.5.11.

I went there (p. 46) and found this passage:

The ejective pronunciation [in native words] is possibly connected with the accent: it can hardly be accidental that the ejective is mostly found in the second syllable, to which the accent shifts, if the vowel in the first syllable is "weak" (æ [ɐ], [I.] y [ɨ], [D.] i and u) [i.e., ɐ; also ɨ in the Iron dialect but i and u in the Digoron dialect?].

This explanation cannot account for word-initial ejectives in the Russian loanwords above. I presume that Russian voiceless stops and affricates sound like Ossetic ejectives to Ossetic ears.

*12.25.4:25: The unreleased Avestan stop t̚ is usually a word-final allophone of /t/. I presume other final stops in Avestan were also unreleased, but they lacked special letters in the Avestan alphabet because final /t/ was more common. (The ablative singular and secondary third person singular suffixes end in /t/. I don't know of any suffixes ending in other stops.)

**12.25.5:10: I presume /hm/ had a special letter because it was more frequent than other /h/ + nasal sequences. I cannot find any instances of /hn/ in Skjærvø's Old Avestan Glossary and Index. Oddly I can't find any reference to the letter <m̥> in Jackson's Avesta grammar, though Jackson does mention it on p. 33 of his book on the Avestan alphabet.

12.12.23.23:59: AN INTRODUCTION TO TANGUT PRONUNCIATION

2mi 1ŋwəəu 1ɣɪ̣ 0lho 1ɣa 2ʔo (LR: Mi ngwu ghi lho gha o)

(lit. 'Tangut language sound go out gate enter'*)

0. Introduction

The most important thing to know about Tangut pronunciation is that no one knows how Tangut was actually pronounced. Thinking one knows is not the same as actually knowing.

Nonetheless, it is not as if anyone's guess is as good as anyone else's. Tangut pronunciation can be reconstructed on the basis of evidence such as:

- The native Tangut linguistic tradition:

- tonal terminology (see section 5 below)

- classes of initial consonants (see section 2 below)

- rhyme lists, tables, and dictionaries

- fanqie phonetic formulae describing a syllable in terms of one syllable with its initial consonant and another syllable with its rhyme

- Phonetic components in the Tangut script (tangraphy)

- Tangut transcriptions of foreign languages (Chinese and Sanskrit)

- Foreign (Chinese and Tibetan) transcriptions of Tangut

- Comparison of Tangut with languages thought to be related to Tangut (which invites the danger of circularity, as probable relationships are determined by ... reconstructions!)

Despite this wealth of evidence, the clues are ambiguous as well as numerous, and are open to interpretation. Hence there is not much of a consensus about the reconstruction of Tangut, and it seems as if almost every linguist working on Tangut has his own reconstruction.

There are four reconstructions that those who read about Tangut are likely to encounter:

- Nishida's 1964-66 reconstruction

- Sofronov's 1968 reconstruction

- Gong's 1997 reconstruction

- Arakawa's reconstruction

The reconstruction on this blog has been an extensive modification of Gong's for nearly five years.

It is not possible to write a comprehensive yet nontechnical guide to all of the reconstructions of Tangut or even the big four. Instead, I will focus on what they have in common: i.e., what is likely to be true, as opposed to some detail that might be discarded tomorrow.

My lay romanization (LR) for use in nontechnical English-language writing about Tangut largely reflects that consensus, though it still has some 'proprietary' features that others might reject. An understanding of the LR is a first step toward understanding Tangut phonetics, so I will concentrate on describing Tangut through the prism of the LR while also mentioning key elements absent from the LR.

Technical phonetic notation in the International Phonetic Alphabet is in brackets: e.g., IPA [j] is equivalent to LR y.

1. Tangut syllable structure

Most Tangut roots consist of one syllable.

Tangut syllables in turn consist of

- an obligatory initial consonant (see section 2 below)

- an optional medial consonant (see section 3 below)

- an obligatory vowel or sequence of vowels (i.e., a diphthong; see section 4 below)

- an obligatory tone (see section 5 below)

Tangut syllables had few if any final consonants. The LR has only one final consonant -w. Another possible final consonant is -y, usually written -j (pronounced as in German ja, not as in English jaw). Still another possible consonant is a glottal stop -ʔ, the sound between the two syllables of English uh-oh.

At an earlier stage Tangut must have had more final consonants like -k, -m, -n, -ŋ (= -ng), -p, and -t, but it is not clear when these were lost in Tangut. However, foreign transcriptions indicate that at least some varieties of Tangut had lost most if not all final consonants by the 12th century if not earlier.

Lost final consonants left traces in later Tangut: e.g., earlier -k sometimes became -w, and vowel-nasal (-m, -n, -ŋ) sequences may have become diphthongs and/or nasal vowels.

2. Initial consonants

There are nine classes of initial consonants in the native Tangut linguistic tradition:

I. 'Heavy lip sounds' (bilabials): LR p-, ph-, b-, m-

II. 'Light lip sound(s)' (labiodental[s]): LR v- (perhaps also f-?)

Most reconstruct the consonant represented by LR v- as bilabial w-, but I reconstruct it as v- (perhaps [ʋ] if not [v]) since

- it is transcribed in Tibetan as ww- and b(h)- as well as w- (Tai 2008: 177-178)

- w- is bilabial, yet Tangut linguists put the consonant represented by LR v- into a separate category used by traditional Chinese linguists for labiodental f-, implying the Tangut consonant was more like f- than bilabials like p-

III. 'Tongue tip sounds' (dentals): LR t-, th-, d-, n-

IV. 'Tongue top sounds' (?): These consonants appear only in a small number of words and are controversial.

V. 'Tooth sounds' (velars despite the Tangut term**): LR k-, kh-, g-, ŋ-

VI. 'Tooth tip sounds' (alveolars): LR ts-, tsh-, dz-, s-

VII. 'True tooth sounds' (?): LR ch-, chh-, j-, sh-

The exact pronunciation of these consonants is uncertain: they could have been

- retroflex [tʂ tʂʰ dʐ ʂ]

- alveopalatal [tʃ tʃʰ dʒ ʃ]

- palatal [tɕ tɕʰ dʑ ɕ]

VIII. 'Throat sounds' (laryngeals): LR zero, h-, gh-

Initial glottal stop ʔ- is unwritten in LR: e.g., the LR equivalent of

2ʔo 'enter'

is o.

h- and gh- could have been velar [x ɣ] or glottal [h ɦ].

IX. 'Flowing wind sounds'*** (liquids [and ...?]): LR l-, lh-, r-, z-, zh-

These seem to be a mix of liquids with z-type fricatives. Perhaps the latter were more liquidlike (e.g., [ɮ ʐ]) than English z and zh.

These consonants can more or less be pronounced as in English with some exceptions below.

H in Tangut initial consonants

LR ch and sh are pronounced roughly as in English chair and share, though LR ch is unaspirated unlike English ch.

LR zh is pronounced roughly as in English genre.

LR gh is [ɣ] (like Belarusian Г) or [ɦ] (like Ukrainian Г). English has neither sound, so they could be approximated as h. (12.24.2:52: Or one could ignore the h and pronounce gh as [g] since this consonant was often transcribed in Tibetan as <g>.)

LR lh might have been a voiceless l (which sounds like "hl" to English speakers) or a voiceless alveolar lateral fricative [ɬ] like Welsh ll.

Other LR sequences of consonant + h are aspirates: LR chh, kh, ph, th. These are pronounced roughly as in

beachhead

backhand

uphold

boathouse

They are not pronounced as kh in Russian romanization, ph in English physics, or th in English thin or this.

3. Medial consonants

The only certain medial consonant is -w-, which is the only permissible medial consonant in LR.

Another possible medial consonant is -y- or -j- depending on one's style of notation.

There may have even been a third medial like -ɰ-.

The high vowels that I reconstruct as the first elements of diphthongs could be reinterpreted as medial consonants: e.g.,

1lɨə 'wind'

as 1lɰə, etc.

4. Vowels

It is widely believed that Tangut had a rich vowel system because the native Precious Rhymes of the Tangraphic Sea dictionary lists 105 rhymes which ended in few if any final consonants (see section 1 above). It is unlikely that Tangut had 105 distinct simple vowels differentiated only by height, frontness, and rounding. Its vowels must have had other characteristics such as

- length (not indicated in LR; different reconstructions have length in different rhymes)
- nasalization (LR -n)

- 'tenseness' (possibly sphinctered voice as in Yi and Bai [link not working as of 12.24.4:30]; not indicated in LR since it is difficult for English speakers to perceive and pronounce; most wrote it as a subscript dot, but Arakawa wrote it as -q)

- retroflexion (LR -r)

My reconstruction has many diphthongs, but there is no agreement on which rhymes had diphthongs, so I avoid them in LR which has only six vowel symbols followed by the aforementioned letters for nasalization and retroflexion:

Basic vowel	Nasalized	Retroflex	Nasalized retroflex
a	an	ar	(none)
e	en	er	(none)
i	in	ir	(none)
o	on	or	orn
u	un	ur	(none)

A, e, i, o, u are pronounced as in Spanish, Italian, etc.: i.e., as "ah", "eh", "ee", "oh", and "oo".

The choice of y for the sixth vowel in LR was influenced by the y for ы in Russian romanization. LR y represents nonpalatal, nonlabial vowels like [ɨ ə].

5. Tones

Tangut had two basic tones, 'level' and 'rising'. These names were carried over from the names for the first two tones of traditional Chinese phonology and may or may not have described the contours of Tangut tones.

These tones are not indicated in LR but are often written as numerals (1 for level and 2 for rising) after or even before a syllable: e.g.,

ŋwəəu¹ ~ ŋwəəu1 ~ 1ŋwəəu 'language' is ŋwəəu with the level tone

ŋwəəu² ~ ŋwəəu2 ~ 2ŋwəəu 'fragrant' is ŋwəəu with the rising tone

Syllables with unknown tones may be marked with zero: e.g.,

0lho 'go out'

might have had tone 1 or tone 2.
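
Since the tone numeral can appear before or after the syllable, and as a plain or superscript digit, here is a small Python sketch of how one might normalize the three notations. This is only my own illustration of the convention just described; the function name split_tone and the label 'unknown' for tone 0 are hypothetical choices, not part of any published standard.

```python
# Hypothetical helper: split a tone-marked Tangut syllable into
# (tone name, segments). Tone digits may precede or follow the syllable;
# superscript digits are treated like plain ones.
SUPERSCRIPTS = {"⁰": "0", "¹": "1", "²": "2"}
TONE_NAMES = {"0": "unknown", "1": "level", "2": "rising"}


def split_tone(syllable):
    # Normalize superscript digits to plain digits first.
    s = "".join(SUPERSCRIPTS.get(ch, ch) for ch in syllable)
    if s and s[0] in TONE_NAMES:
        return TONE_NAMES[s[0]], s[1:]
    if s and s[-1] in TONE_NAMES:
        return TONE_NAMES[s[-1]], s[:-1]
    raise ValueError(f"no tone digit found in {syllable!r}")
```

With this sketch, split_tone("1ŋwəəu"), split_tone("ŋwəəu1"), and split_tone("ŋwəəu¹") all describe the same level-tone syllable, and split_tone("0lho") is reported as having an unknown tone.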

There is evidence for more tone categories in Tangut, but most research assumes a two-tone paradigm.

12.24.4:12: ADDENDUM: Beyond Pronunciation

Having mentioned tones and rhymes, I should note that Tangut syllables are often accompanied by numbers in the format X.Y. X is the tone (1 or 2) and Y is the rhyme number (1-97 for tone 1 and 1-86 for tone 2): e.g.,

mi 2.10 'Tangut'

has the tenth rhyme of the second tone.
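
For readers who think better in code, the X.Y convention can be sketched as follows in Python. This is just my illustration; parse_tone_rhyme and MAX_RHYME are hypothetical names, and the limits are the rhyme counts given above (97 for tone 1, 86 for tone 2).

```python
# Hypothetical helper: parse an "X.Y" label into (tone, rhyme number).
# The limits 97 and 86 are the rhyme counts stated in this post.
MAX_RHYME = {1: 97, 2: 86}


def parse_tone_rhyme(label):
    tone_text, rhyme_text = label.split(".")
    tone, rhyme = int(tone_text), int(rhyme_text)
    if tone not in MAX_RHYME:
        raise ValueError(f"tone must be 1 or 2, not {tone}")
    if not 1 <= rhyme <= MAX_RHYME[tone]:
        raise ValueError(
            f"tone {tone} has rhymes 1-{MAX_RHYME[tone]}, not {rhyme}"
        )
    return tone, rhyme
```

So parse_tone_rhyme("2.10") gives tone 2 and rhyme 10, while a label like "2.87" is rejected because the rising tone has only 86 rhymes.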

There is another system for naming rhymes with the format RZ. R stands for rhyme and Z is a number from 1 to 105. How can Tangut have 105 rhymes if it only has 97 rhymes with the level tone and 86 rhymes with the rising tone? These 105 rhymes ignore tone. Some rhymes occur only in one tone but not the other. For example, R1-R7 consists of 1.1-1.7 and 2.1-2.6 (not 2.7!):

Rhyme number (ignoring tone)	Level tone	Rising tone
R1	-əu 1.1	-əu 2.1
R2	-ɨu 1.2	-ɨu 2.2
R3	-iu 1.3	-iu 2.3
R4	-ʊ 1.4	-ʊ 2.4
R5	-əəu 1.5	-əəu 2.5
R6	-ʊʊ 1.6	(no -ʊʊ with the rising tone)
R7	-ɨuu ~ -iuu 1.7	-ɨuu ~ -iuu 2.6 (not 2.7!)
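
The mismatch between the two numbering systems is easier to see as a lookup table. Here is a minimal Python sketch covering only R1-R7, with the data copied from the table above; the names R_TO_TONE_RHYMES and r_number are hypothetical, and I have not tried to encode the other 98 rhymes.

```python
# Data copied from the R1-R7 table in this post.
# Each entry is (level-tone label, rising-tone label); None marks a gap.
R_TO_TONE_RHYMES = {
    1: ("1.1", "2.1"),
    2: ("1.2", "2.2"),
    3: ("1.3", "2.3"),
    4: ("1.4", "2.4"),
    5: ("1.5", "2.5"),
    6: ("1.6", None),   # no rising-tone counterpart
    7: ("1.7", "2.6"),  # rising-tone numbering is now one behind
}


def r_number(label):
    """Return the tone-neutral rhyme number for a label like '2.6'."""
    for r, labels in R_TO_TONE_RHYMES.items():
        if label in labels:
            return r
    raise KeyError(label)
```

For instance, r_number("1.7") and r_number("2.6") both return 7, which is why the rising-tone numbers fall behind the level-tone numbers after R6.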

Traditional Tangut phonologists combined rhymes with -ɨ- and -i- if there were no minimal pairs distinguished only by -ɨ- and -i- (e.g., in R7).

Conversely, rhymes which had minimal pairs distinguished only by those vowels were separated: e.g.,

bɨu 2.2 'border' and biu 2.3 'elephant'

Rhymes can be placed into groups (攝) and into very large groups called 'cycles' which have similar patterns of vowel sequences.

The membership of rhyme groups and cycles varies by reconstruction.

I generally follow Gong's reconstruction and posit 11 rhyme groups (in LR) without regard for tension or retroflexion:

I. u-rhymes

II. i-rhymes

III. in-rhymes

IV. a-rhymes

V. an-rhymes

VI. y-rhymes

VII. e-rhymes

VIII. en-rhymes

IX. iw-rhymes

X. o-rhymes

XI. on-rhymes

and three cycles with vowels more or less in the sequence above (u first and on last):

I. rhymes with plain vowels: R1-R60

II. rhymes with tense vowels: R61-R76

III. rhymes with retroflex vowels: R77-R103

R104-R105 are late additions to the system; they are in Chinese loanwords with plain vowels.
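
The cycle boundaries above amount to a simple range check, which can be sketched in Python like this. The function name cycle_of and the wording of the returned labels are my own hypothetical choices; the ranges are the ones I follow here and would differ in other reconstructions.

```python
# Hypothetical helper: assign a tone-neutral rhyme number (R1-R105)
# to one of the three cycles listed in this post.
def cycle_of(r):
    if 1 <= r <= 60:
        return "I (plain vowels)"
    if 61 <= r <= 76:
        return "II (tense vowels)"
    if 77 <= r <= 103:
        return "III (retroflex vowels)"
    if 104 <= r <= 105:
        return "late addition (Chinese loanwords, plain vowels)"
    raise ValueError(f"rhyme number must be 1-105, not {r}")
```

For example, cycle_of(61) falls in the tense cycle, and cycle_of(104) is flagged as one of the late additions rather than as part of cycle I.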

*12.24.2:39: 1ɣɪ̣ 0lho 'sound go out' and 1ɣa 2ʔo 'gate enter' are my calques of Chinese 發音 'pronunciation' (lit. 'go out sound') and 入門 'introduction' (lit. 'enter gate') with Tangut object-verb order. Burmese အသံထွက <asaṃ thwak> 'pronunciation' has the same structure as my Tangut calque.

**12.24.1:02: Velar consonants are not pronounced with the teeth. Although traditional Chinese phonology is influenced by Indian phonology, velars are kaṇṭhya- 'throaty' in the latter, not 'toothy'. Chinese 牙 had an initial velar *ŋ-, so the term 牙音 'tooth sounds' may have been intended to mean 'sounds like ŋ-'. The Tangut translation of 牙音

2kõʳ 1ɣɪ̣ (LR: korn ghi) 'tooth sounds'

also begins with a velar k- and could be interpreted as 'sounds like k-'.

***12.24.1:33: The Tangut term

1lɨə 2ʐɨẹ 1ɣɪ̣ (LR: ly zhe ghi) 'wind ? sounds'

has been translated as 流風音 'flowing wind sounds', so one might expect the second tangraph to be an adjective 'flowing', but as far as I know, it is a surname character 'Zhe' (Nishida 1966: 489; Li Fanwen 2008: 855) or means 'alone' (Li Fanwen 2008: 855) or 'decoration' (Kychanov and Arakawa 2006: 599).

I think 1lɨə 2ʐɨẹ 1ɣɪ̣ means 'l- and zh-sounds' (corresponding to 來 and 日 in traditional Chinese phonology).