The main Tangut word for shooting arrows


goes back to a pre-Tangut *khra or *q(r)a:

- *-ra became (as in Chinese - perhaps due to Chinese influence?)

- *-a after q became (no Chinese parallel)

- the first tone (1-) developed if there was no final consonant -H

I can't find anything like *k(h)ra for 'shoot' at STEDT (Sino-Tibetan Etymological Dictionary and Thesaurus), but I do see potential cognates of the Tangut word without *r in at least three branches of Sino-Tibetan:


Muya qho⁵⁵ lə̱⁵³ 

Namuyi qha³¹

Lanping Pumi khə¹³ tʂhɑ⁵⁵ 

Mawo Qiang qhuʴ (with vowel retroflexion!)

Taoping Qiang qhɑ³³ 

Shixing khi⁵³ 


Daofu fqe

(I presume the root of forms like Caodeng kɐ-lɐt is the second syllable; lɐt vaguely looks like the Early Old Chinese root √*l-k 'to shoot' as in 弋 *lək 'to shoot' and 射 *mi-lak-s 'to shoot', but the consonants -t and *-k don't match.)


Yongning Na qʰæ˧˥ 

Naxi (locality unspecified) khæ⁵⁵ 


Pa-O kháʔ 

STEDT lists a protoform *gaːp 'shoot' with various initial reflexes:

zero ~ k- ~ g- ~ ŋ- ~ hw-

Could this protoform be reconstructed with *q-? q- might have

- shifted to glottal stop (as proposed for Old Chinese by Baxter and Sagart) and then zero

- fronted to k

- fronted and voiced to g (cf. q > ɢ ~ ɣ in modern Persian)

- the voicing might also be from a voiced prefix

- lenited to x and backed to h (but what about the -w-?)

- fused with a nasal prefix to become ŋ-

depending on the language.

Could the aspiration of some forms

- be from a medial *-r- (as in some possible pre-Tangut reconstructions)

(cf. *-r- > -h- in Pittayaporn's Proto-Tai reconstruction)

- that is the source of the retroflex vowel in Mawo Qiang

- or is the Mawo Qiang retroflexion the trace of a rhotic prefix or suffix?

- be from a voiceless prefix: e.g., *sq- > qh- > kh-?; cf. the voiceless f- of Daofu fqe

The mechanics of vowel raising are a mystery to me:

| | Fronting | No fronting or backing | Backing |
| Raising to high | Shixing khi⁵³ | (no examples with ɨ) | Mawo Qiang qhuʴ |
| Raising to mid | Daofu fqe | Lanping Pumi khə¹³ tʂhɑ⁵⁵ | Muya qho⁵⁵ lə̱⁵³ |
| No raising | Tangut 1khæ | Namuyi qha³¹ | Taoping Qiang qhɑ³³ |

I would expect final consonants like *-p to be lost in Qiangic and Naxi. The final glottal stop of Pa-O kháʔ could be a remnant of *-p.

9.29.11:59: I can't explain the variety of tones: e.g.,

| | Rising | Level | Falling |
| High | Yongning Na qʰæ˧˥ | Muya qho⁵⁵ lə̱⁵³ | Shixing khi⁵³ |
| | | Taoping Qiang qhɑ³³ | |
| Low | Lanping Pumi khə¹³ tʂhɑ⁵⁵ | | Namuyi qha³¹ |

I used to assume that the Chinese model of tonogenesis was universal:

| Phase | Segments | Phonation | Tones |
| 1: Toneless | + | - | - |
| 2: Phonation + nonphonemic tones | - | + | + (predictable) |
| 3: Phonemic tones | - | - | + (unpredictable) |

The -k-s of Early Old Chinese 射 *mi-lak-s 'to shoot' became the 'departing tone' of Middle Chinese (symbolized by a grave accent):

Phase 1: Early Old Chinese *mi-lak-s

Phase 2: Early Middle Chinese *ʑià̤ (*mi-l- > *ʑi, -k-s > *-h > breathy phonation [symbolized by a subscript diaeresis] with predictable 'departing tone')

Phase 3: Late Middle Chinese *ʑià (breathy phonation conditioning 'departing tone' lost; 'departing tone' now unpredictable and hence phonemic)

For convenience, I have been writing Middle Chinese phonation with segmental symbols: e.g., *-h for breathy voice.
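The logic of the three-phase model can be restated as a small Python sketch. The class and function names are hypothetical, and the true/false values simply copy the phase table above; this illustrates the model's reasoning and is not a claim about any particular language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    segments: bool   # are the conditioning final segments still present?
    phonation: bool  # is there contrastive phonation?
    tones: bool      # are there pitch differences?


# Values copied from the three-phase table (hypothetical encoding).
PHASES = [
    Phase("1: Toneless", True, False, False),
    Phase("2: Phonation + nonphonemic tones", False, True, True),
    Phase("3: Phonemic tones", False, False, True),
]


def tone_status(phase):
    """Tones stay predictable only while phonation still conditions them."""
    if not phase.tones:
        return "none"
    return "predictable" if phase.phonation else "unpredictable"


statuses = [tone_status(p) for p in PHASES]
```

In this encoding, the Southern Qiang scenario described below would be a language that reaches phase 3 without ever passing through phase 2.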

However, Evans (2001ab) had a different explanation for Qiang tones. In Kirby's (2011: 1) words,

Evans (2001a, 2001b) argues that modern Southern Qiang (SQ) developed tones through a somewhat typologically unusual pathway: after developing pitch accent from earlier lexical stress, the languages became increasingly 'tone-prone' following phonological reduction of syllables and the segmental inventory (Matisoff, 1998), developing tonal systems after heavy borrowing from Mandarin.

Northern Qiang, on the other hand, never developed tones: e.g., Mawo Qiang qhuʴ.

According to Evans (2001b as quoted by Kirby 2011: 3), Southern Qiang is

the first documented case of which I am aware in which tonogenesis has occurred without any concomitant loss of segmental information.

This is not to say that Southern Qiang never lost any segments. Segmental loss has occurred after tonogenesis, but is not the cause of tonogenesis and apparently never led to any phonation stage. Evans (2001a: 79) wrote that the Southern Qiang language Taoping (emphasis mine)

is more phonologically complex than either Longxi or Mianchi [both Southern Qiang languages with only two major tones each*], but has been presented as having a clear-cut six tone system [with three major tones**]. When Liu and H. Sun gathered data in the 1950’s, elderly speakers of Taoping still maintained complex initial clusters, which, according to QYJZ [Qiangyu jianzhi = Brief Description of the Qiang Language], they pronounced with the same tones as did the younger speakers [who no longer pronounced initial clusters].

For example, elderly Taoping khsi⁵⁵ 'new' has the same tone as younger Taoping tshi⁵⁵ combined with the same initial cluster as atonal Mawo khsə.

It seems that Taoping underwent two waves of segmental loss:

Wave 1: shared with Northern Qiang: loss of final consonants: e.g., *khsik > khsi

cf. Tangut

1siw 'new' < *sik

for other -k cognates see STEDT

Between the waves: tonogenesis: e.g., khsi > khsi⁵⁵

Wave 2: Initial reduction during the 20th century: e.g., khs- > tsh-
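The ordering of the two waves around tonogenesis can be sketched as a chain of string rewrites in Python. This is only an illustration for the single word 'new': the function names are hypothetical, the only final consonant handled is the -k in the example, and tone 55 is simply assigned rather than derived, since nothing segmental is claimed to condition it.

```python
FINAL_CONSONANTS = ("k",)  # only the final illustrated in the text


def lose_final_consonant(form):
    """Wave 1: drop a final consonant (shared with Northern Qiang)."""
    for consonant in FINAL_CONSONANTS:
        if form.endswith(consonant):
            return form[: -len(consonant)]
    return form


def add_tone(form, tone="⁵⁵"):
    """Tonogenesis: the tone is assigned, not conditioned by a lost segment."""
    return form + tone


def reduce_initial(form):
    """Wave 2: 20th-century reduction of the initial cluster khs- to tsh-."""
    return "tsh" + form[3:] if form.startswith("khs") else form


def derive(proto):
    """Apply the three changes in order, keeping every intermediate stage."""
    stages = [proto]
    for step in (lose_final_consonant, add_tone, reduce_initial):
        stages.append(step(stages[-1]))
    return stages


stages = derive("khsik")
```

The third stage in `stages` matches the elderly Taoping form and the fourth matches the younger form; the tone is already in place before the cluster is reduced.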

Until now, I've assumed that Tangut underwent Chinese-style tonogenesis, and since the mid-90s I've suspected that Tangut might have been at the phonation stage; its 'level tone' might have been clear phonation and its 'rising tone' might have been some marked type of phonation (creaky? breathy?).

Phase 1: Pre-Tangut: *-Ø vs. *-H (absence or presence of final glottal segment *-H)

Phase 2: Tangut clear vs. marked phonation after loss of *-H conditioning the latter

However, this fails to account for the mysterious 'entering tone' in the Precious Rhymes of the Tangraphic Sea and the even more enigmatic 'departing tone' (see Clauson 1964: 69-70). Could Tangut have been more like Southern Qiang with major and minor tones (largely) not conditioned by segments?

"Largely" is in parentheses because Kirby (2011) reconstructed segmentally conditioned tone splits in Taoping: e.g.,

*low tone + high tone syllable sequence >

tone 31 + tone 33 if second syllable has a voiced initial

tone 31 + tone 55 if second syllable has a voiceless initial
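The split just described amounts to a two-way conditional. Here is a minimal Python sketch of it, with a hypothetical function name; it takes the voicing of the second syllable's initial as a given instead of guessing at the Taoping consonant inventory.

```python
def split_low_high(second_initial_is_voiced):
    """Return the (first, second) syllable tones for an earlier low + high sequence."""
    first = "31"
    second = "33" if second_initial_is_voiced else "55"
    return first, second


voiced_outcome = split_low_high(True)
voiceless_outcome = split_low_high(False)
```

The same shape of test (a tone pair conditioned by a property of the second syllable) is what one would look for in Tangut disyllables.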

Someday I'd like to look at tones in Tangut disyllabic words like the Tangut autonym

2miə 2nɨaa (cf. Written Tibetan mi-nyag 'Tangut')

and see if there are any distributional patterns.

*Longxi has three minor tones in 1% of its vocabulary. Two of the three are partly associated with borrowings from Mandarin.

Mianchi has three minor tones in 5% of its vocabulary. One is partly associated with borrowings from Mandarin.

**Taoping has three minor tones in 9% of its vocabulary. One is partly associated with borrowings and one is only in borrowings from Mandarin.

ARROWS STRIKING AN E-NIGMA

In honor of Jim Shooter's birthday, here's the character for 'to shoot' in the script devised c. 1036 AD for the extinct Tangut language in Central Asia:

The late 11th century AD Tangut dictionary Tangraphic Sea analyzed 'to shoot' as being a composite of three other Tangut characters:


'to shoot' = left of 'to strike a target' + right side of 'arrow' + right side of 'earth': i.e.,


It makes sense to write 'to shoot' with 'to strike a target' and 'arrow', but what is the function of 'earth'?

*9.28.8:24: ADDENDUM: The vertical line on the left side of

'to strike a target'

could represent an arrow, and the right side of


is the element


so the two parts together may stand for an arrow held by a hand:


'arrow' + 'hand'?

Another possibility is that the left and center parts of 'to shoot' are simply the character 'hand' consisting of a vertical line plus the element 'hand' which cannot stand alone:


'hand' (independent) = 丨 + 'hand' (dependent)?

Could 'to shoot' be a compound of 'hand' with a drawing of a bow (匚) and arrow (丿)?


'to shoot' = 'hand' + 'bow and arrow'?

If so, then the Tangraphic Sea explanation is wrong.

But if the right side of 'to shoot' is a bow and arrow, what is the same shape doing on the right of 'earth' and other Tangut characters like

1ɣạ. 'sword' or 2khæ 'plowshare'

which are not related to archery?

'Plowshare' (2khæ) is almost homophonous with 'to shoot' (1khæ) except for the tones indicated by numbers (2 and 1). Could 'to shoot' consist of 'hand' as a semantic element plus an abbreviation of 'plowshare' as a phonetic element?


1khæ 'to shoot' = 'hand' + 2khæ 'plowshare'?

One might expect 'plowshare' or an abbreviation to appear in other characters pronounced khæ, yet 'to shoot' and 'plowshare' are the only two khæ-characters sharing an E-shaped right-hand element.

There are five more characters with that element pronounced with the rhyme -æ:

æ 'sand' (borrowed from Chinese 沙 'id.'; 'water' next to the E-nigmatic element)

æ (first half of 2ʂæ 2ʂɨu 'small fruit growing on grass'; 'grass' next to the phonetic 'sand' above)

æ 'Chinese fir' ('tree/wood' over phonetic 'sand')

1phæ 'rake' ('tree/wood' atop 'mouth' next to the E-nigmatic element)

1kæ 'pottery' ('hand', 'water', 'earth', and the E-nigmatic element)

Could the E shape be a partial phonetic: i.e., a symbol for a rhyme rather than an entire syllable? Would a Tangut reader see 'to shoot' and think, 'it stands for a word involving a hand and ending in -æ'?


1khæ 'to shoot' = 'hand' + -æ

The trouble is that the E shape also appears in 216 other characters (roughly one out of 30 Tangut characters) whose readings do not end in -æ: e.g.,

1tsəʳ 'earth' and 'sword'.
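A quick tally makes the problem concrete. This is just arithmetic on the counts given in this post, assuming that 'to shoot', 'plowshare', and the five further characters listed above are the only E-shaped characters with the rhyme -æ; the variable names are hypothetical.

```python
with_ae = 2 + 5    # 'to shoot' and 'plowshare' plus the five further -æ characters
without_ae = 216   # E-shaped characters whose readings do not end in -æ

# Share of E-shaped characters that actually have the rhyme -æ.
share = with_ae / (with_ae + without_ae)
percent = round(share * 100, 1)
```

On these counts only a small fraction of the E-shaped characters rhyme in -æ, which is why a rhyme-phonetic reading of the element is hard to defend.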

The E-nigma resists all attempts to crack it.

FINAL VOICED GLOTTAL FRICATIVES IN TANGUT RECONSIDERED

Final voiced glottal fricatives (-ɦ) are a feature of the late Nishida Tatsuo's reconstruction that is not in any other Tangut reconstruction I have seen. His -ɦ sometimes corresponds to -འ -H in Tibetan transcriptions of Tangut.

Since Sofronov (1968), there has been a trend toward regarding rhyme groups in the Tangraphic Sea as consisting of pairs of rhyme types which I will call A and B. The B rhymes are less common than their A counterparts. Here are the A and B rhymes for the first rhyme group in Tangraphic Sea in reconstructions that have a clear A/B distinction (with Nishida's reconstruction for comparison). In the last column, I experiment with reconstructing -ɦ as the defining characteristic of B rhymes.

Nishida's / Gong's  / Arakawa's / my rhyme group Rhyme type Rhyme Nishida 1964 Sofronov 1968: B has -n Sofronov 2012: B has Arakawa 1999: B has -' Gong 1997: B has long vowels This site until recently: B has long vowels This site for the moment: B has
I A 1 -u -u -u -u -u -əu -əu
2 -ǐu -i̭u -û, -ju -yu -ju -ɨu -ɨu
3 -ǐuɦ -iu -iu
4 -uɦ -uC -u: -u
B 5 -un -uˁ -u' -uu uu -əuɦ
6 -ʊɦ n ˁ -yu' -juu -ʊʊ ɦ
7 -ǐʊɦ -i̭un -juˁ, -ʏˁ -u:' uu, -iuu -ɨuɦ, -iuɦ

I will explore the implications of reviving and revising Nishida's -ɦ and inverting my long-held assumptions about vowel length in a sequel post.

NISHIDA TATSUO (1928-2012)

西田龍雄 Nishida Tatsuo, Japan's greatest Tangutologist, passed away today.

2le 1rar 1vɪ 2ȵi

(A Tangutization of 西田龍雄 Nishida Tatsuo in his reconstruction: 'west field dragon male'.)

It would not be an exaggeration to say that his 1964-66 西夏語の研究 A Study of the Hsi-hsia [= Tangut] Language (2 vols.) has had an enormous impact on my life. That work is a substantial revision of his 1962 PhD dissertation 西夏文字の分析並びに西夏語文法の研究 An Analysis of the Tangut Script and a Study of Tangut Grammar which as far as I know might be the first outline of Tangut grammar. Nishida published the first full-scale phonetic reconstruction of Tangut with two features that are still in some reconstructions, including mine:

はり母音 tense vowels (indicated with subscript dots: Ṿ)

-Vr rhymes (which Gong and I reconstructed as retroflex vowels rather than vowel-r sequences: Vʳ)

That reconstruction was brought to life when he was a consultant for the movie 敦煌 Tun-huang (1988) which was my very first exposure to Tangut.

His article 西夏語「月月樂詩」の研究 "A Study of the Tangut 'Ode on Monthly Pleasures'" (1986) opened the door to the mysterious Tangut 'ritual language' whose nature remains controversial.

Nishida wrote for the general public as well as for scholars. People outside academia could learn about the Tangut from

西夏文字―その解読のプロセス Tangut Characters: The Process of Their Decipherment (1967, reprinted 1994, translated 1979 by James A. Matisoff as The Structure of the Hsi-hsia [Tangut] Characters)

西夏文字の話 シルクロードの謎 The Story of Tangut Characters: The Mystery of the Silk Road (1989)

西夏王国の言語と文化 The Language and Culture of the Tangut Kingdom (1997)

He edited 言語学を学ぶ人のために For Students of Linguistics (1986) which was the textbook for my linguistics class at Doshisha. I think it contained my first serious (albeit brief) introduction to Tangut, Jurchen, and Khitan.

Nishida was far more than a Tangutologist; his interests encompassed the Tibeto-Burman language family and writing systems throughout East Asia. His books for a lay audience included

生きている象形文字 モソ族の文化 Living Pictographs: The Culture of the Moso (1966)

アジアの未解読文字 その解読のはなし Undeciphered Scripts of Asia: The Story of Their Decipherment (1982)

漢字文明圏の思考地図 東アジア諸国は漢字をいかに採り入れ、変容させたか A Mental Map of the Chinese Character Civilizational Sphere: How Did the Nations of East Asia Adopt and Modify Chinese Characters? (1984)

生きている象形文字 Living Pictographs (2001)

アジア古代文字の解読 The Decipherment of Ancient Asian Scripts (2002)

Needless to say, his interests overlapped with mine. My debt to him is immeasurable. There is a great void in the small world of Tangutology. To quote Daniel Kane's (2009) translation of the refrain of 張琳 Zhang Lin's Chinese-language memorial for 宣懿皇后 Empress Xuanyi of the Liao Dynasty, 嗚呼哀哉 "Alas, alas, how sad it is."

PONDERING OVER WATER

Zhuang is a Tai language, so it's not surprising that Zh naengh [naŋ] 'to sit' is very similar to its Thai cognate นั่ง naŋ 'to sit'. Only the tones are different: Zh naengh has a mid level tone and Thai naŋ has a falling tone. But both tones are descendants of the same proto-tone.

I would expect the Zhuang cognate of Thai น้ำ naam 'water' to be naemx [nam]. The long vowel of Thai naam is irregular, and the Zhuang high falling tone written with -x corresponds to the high tone of Thai naam. However, the actual Zhuang cognate is raemx [ɣam]* with r-. I don't know what the Yue varieties of Guangxi are like - I've been using standard Cantonese from Guangdong as a stand-in - but I assume that like standard Cantonese, they don't have initial r-, so the Zhuang should write r-syllables with graphs for Chinese l-syllables: e.g., raeh 'sharp' (a borrowing from 利 Old Chinese *rits 'id.'?) was traditionally written as

利 Ct lei 'sharp' atop 刀 'knife'

金 'metal' next to 利 Ct lei 'sharp'

The native Zhuang word for 'bird' (cognate to Thai นก nok 'id.') is roeg [ɣok] which used to be spelled with l-phonetics:

六 Ct luk

六 Ct luk next to 鳥 'bird'

口 'mouth' (indicating 'colloquial': i.e., 'non-Chinese'?) next to 六 Ct luk

虫 'bug' next to 六 Ct luk

犭 'dog' next to 六 Ct luk

鳥 'bird' next to 彔 Ct luk

However, raemx was once usually written with the n-phonetic 念 Ct nim 'think':

淰: 氵 'water' + 念 Ct nim

氵+𢗨: 氵 'water' + 水 'water' + 心 < 念 Ct nim

冰/心: 冰 'ice' (< 冫 'ice' + 水 'water') + 心 < 念 Ct nim

𢗨: 水 'water' + 心 < 念 Ct nim

𣲙: 氵 'water' + 水 'water' (no phonetic)

Maybe 念 is pronounced with l- in Guangxi. But if it isn't, why would r- be written as n-? Li Fang-Kuei (1977: 131) reconstructed the Proto-Tai word for 'water' with *nl/r-. Could the Zhuang word have been something like *nlam or *nram at the time the 念 nim-spellings were devised? Perhaps 念 was pronounced more like *niam back then. Zhuang spellings could illuminate both Zhuang and local Chinese historical phonology.

The Tai word for 'water' has been associated with Proto-Austronesian *daNum 'water'. Taking the PAN form as a starting point, the vowel of the second syllable could have assimilated to the first:

*daNum > *danom > *danam

(I don't know when *N became *n, but I'll place that shift here.)

(Thongkum 1992: 67 reconstructed Proto-Lakkja *num 'water'. Did Lakkja preserve an *u that became *a in most of Kra-Dai?)

Then the first vowel was lost, and *d- assimilated to the sonorant *n-, becoming a sonorant *l- or *r- that then metathesized with *n- (i.e., swapped places with it):

*dnam > *rnam or *lnam (cf. Peiros' Proto-Kra-Dai *R-nam) > *nram or *nlam (which may underlie the 念-spellings of 'water')

The *dn- to *nr-metathesis is inspired by my proposal of Early Old Chinese *TV-presyllables as one source of Middle Old Chinese *Cr-clusters.
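The whole chain from the PAN form to the cluster that may underlie the 念-spellings can be written out as ordered string rewrites. This Python sketch is only a bookkeeping aid for the one word 'water': the function names are hypothetical, asterisks are omitted, and the choice between *r and *l is left as a parameter because the account above does not decide between them.

```python
def denasalize_and_lower(form):
    """*daNum > *danom: *N > *n (placed here for convenience) and *u > *o."""
    return form.replace("N", "n").replace("u", "o")


def assimilate_vowel(form):
    """*danom > *danam: the second vowel assimilates to the first."""
    return form.replace("o", "a")


def drop_first_vowel(form):
    """*danam > *dnam: loss of the first vowel (the second character here)."""
    return form[0] + form[2:]


def sonorize(form, liquid):
    """*dnam > *rnam or *lnam: *d- becomes a sonorant before *n-."""
    return liquid + form[1:] if form.startswith("dn") else form


def metathesize(form):
    """*rnam/*lnam > *nram/*nlam: the liquid and *n swap places."""
    return form[1] + form[0] + form[2:] if form[:2] in ("rn", "ln") else form


def derive(liquid="r"):
    stages = ["daNum"]
    stages.append(denasalize_and_lower(stages[-1]))
    stages.append(assimilate_vowel(stages[-1]))
    stages.append(drop_first_vowel(stages[-1]))
    stages.append(sonorize(stages[-1], liquid))
    stages.append(metathesize(stages[-1]))
    return stages


r_path = derive("r")
l_path = derive("l")
```

Neither path produces anything corresponding to a final glottal stop.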

This account fails to explain why forms for 'water' in the Tai, Be, Kam-Sui, Hlai, and Lakkja (but not Kra!**) branches of Kra-Dai have a tone that arose from a final glottal stop absent from PAN *daNum. Was this glottal stop a Kra-Dai innovation, or has Kra-Dai preserved a consonant lost in PAN?

*The spelling of standard Zhuang [ɣ] as r may have a historical basis. Zhuang sawndip (traditional script) spellings as well as comparative evidence point toward *r or clusters with *r (see above) as one source of modern standard Zhuang r.

9.26.19:40: There are still Zhuang varieties with r-like sounds: e.g., 岜皓 Bahao [ɾ] and 大果腊 Daguola [ɹ̝] (Johnson 2011: 67-68).

**Ostapirat (2000: 231) reconstructed an unrelated word *ʔuŋ for 'water' in Proto-Kra. Could that be the original Proto-Kra-Dai word for 'water'? Did Kra split off before the ancestor of the rest of the Kra-Dai family borrowed an Austronesian form like *daNum? What other languages have borrowed 'water'? Or was a *daNum-like Proto-Kra-Dai word replaced by *ʔuŋ in Proto-Kra?

3ITTING ON FIRE

(3 is supposed to resemble a cursive Z which almost sounds like S. It reminds me of the obsolete Zhuang tone letter З that has been replaced by J.)

Why do some spellings of Zhuang naengh [naŋ] 'to sit' contain a non-Chinese shape resembling the numeral 3? Moreover, why is this 3-shape atop the 灬 'fire' radical (Cantonese fo; sometimes simplified to what looks like 一 Ct yat 'one' and possibly also simplified as 凵)? There are no Chinese characters pronounced like naengh in Cantonese (or probably any other Chinese language) that resemble 3, 灬 'fire', or 一 'one'. Then again, could 3 be from the left side of 能 Ct nang in cursive? (At first I thought 3 might be a pictograph of someone sitting down, but I doubt that.)

Other puzzling spellings of naengh are:

丸 Ct yun 'ball' atop 一 < 灬 'fire'?

几 Ct gei 'small table' atop 一 < 灬 'fire'?

are 丸 and 几 derived from 能 in cursive?

辶 Ct cheuk 'to walk' under 等 Ct dang [taŋ] 'to wait', Zh daengj [taŋ] 'refuse to accept'

was this devised for a dialect with d- instead of n- in 'to sit'?

Straightforward spellings all contain phonetics:

能 Ct nang

能 Ct nang atop 土 'earth', an abbreviation of 坐 'to sit'?

足 'foot' next to 能 Ct nang

足 'foot' next to 南 Ct naam, Zh nam 'thorn', namh 'earth'

坐 'to sit' next to 宁 Ct ning, Zh ningq [niŋ] 'young'

坐 'to sit' next to 能 Ct nang

The choice of an -m phonetic (南) for naengh is inexplicable.

A-NOMALY ON THE SCENE

I discovered the old spelling scæne for scene which presumably reflects Latin scaena. Latin ae normally corresponds to Greek ai, yet the Greek word is σκηνή <skēnḗ>. Is the unexpected a in scaena due to Latin borrowing from a nonstandard Greek dialect (cf. the borrowing of Ulixes) or some intermediary language (cf. the borrowing of Ajax from Greek via Oscan)?

SPEAKING OF HEAVEN IN ZHUANG

Why is Zhuang (Zh) mbwn [bɯn]* 'heaven' (cognates) written as

云 'say' Cantonese wan, Zhuang voenz [von]** next to

天 'heaven' Cantonese tin

in the first article of the Declaration of Human Rights? 天 'heaven' is obviously semantic, but 云 Zh voenz isn't a close phonetic match for mbwn. It took me thirteen hours to remember that 云 is the old character for 雲 'cloud'***, so it's probably semantic. D'oh!

Other variant spellings of mbwn are

天 over 門 ~ 门 Ct mun 'gate'

雨 Ct yu 'rain' over 門 ~ 门

雨 is semantic and may be an abbreviation of 雲 'cloud'

門 ~ 门 must have been chosen as a phonetic after Cantonese lost *b, leaving m as the closest voiced match to Zhuang [b]. Or the spellings with 門 ~ 门 were devised for a Zhuang dialect with m- instead of [b].

尾 Ct mei 'tail' + 奔 Ct ban [pan] 'run'

奔 is phonetic, but 尾 doesn't seem to be either semantic or phonetic

9.25.8:07: Could this be a Vietnamese nom-style double phonetic character representing a Zhuang dialect word with initial mb-: 尾 Ct m- + 奔 Ct ban for something like [mbɯn]? Cf. nom spellings like 巴 ba atop 賴 lại for Middle Vietnamese blái 'fruit'.

The preceding word is laj [la] 'under' (-j is a tone marker). laj mbwn 'under heaven' must be a calque of Chinese 天下 'heaven under' = '(all) under heaven' = 'the world'.

In the passage as written in Wikipedia, the character for laj is 𨑜 with 下 Ct ha 'under' atop the radical 辶 Ct cheuk 'walk', but other spellings are

semantic (something above) + semantic ('under')

天 'heaven' over 下 'under'

日 Ct yat 'sun' over 下 'under'

雨 'rain' (< an abbreviation of 雲 'cloud'?) over 下 'under'

卡 < 上 Ct seung 'above' + 下 'under'

presumably unrelated to the modern character 卡 Ct kaat

semantic + phonetic

天 'heaven' over 拉 Ct la 'to drag'

拉 over 下 'under'

semantic + ?

忑 with 下 'under' over 心 Ct sam 'heart' (why?)

口 Ct hau 'mouth' (why?) next to or over 下 'under'

semantic with alteration

丅 < 下 'under' minus a stroke


拉 Ct la 'to drag' (phonetic loan)

喇 Ct la (phonetic symbol) (phonetic loan)

Although Zhuang characters have a much more transparent structure than Tangut characters, their diversity provides a different dimension of complexity.

*Zhuang mb represents standard Zhuang [b] because Zhuang b represents standard Zhuang [p], just as Pinyin b represents standard Mandarin [p].

I belatedly realized I should have included Zhuang as an example of an orthography with w as a vowel in section 4 of "Xenomastics".

**9.25.8:15: 云 Ct wan is a phonetic loan for a Zhuang word voenz [von] 'smoke from a fire'. I presume voenz is native and unrelated to Chn 云/雲 'cloud' (in the sky); the vowels don't match.

***The old graph 云 for 'cloud' is most often used to write an unrelated homophone 'to say'. See "Sino-Tangut Retroflex Vowels" for cognates of 'to say'.

XENOMASTICS II: A MINIMAL CONSONANT SYSTEM

In "Xenomastics", I wrote that

An Arabic-like a-i-u system may be the best bare minimum [for fictional languages] since two-vowel systems are controversial.

But I didn't suggest a bare minimum consonant system, so I'll do so below. Consonants are arranged according to places and manners of articulation rather than alphabetical order.

| Voiceless stops | p (83%) | t (9X%?) | k (89%) |
| Nasals | m (94%) | n (9X%?) | |
| | | s (88%) | |
| Approximants | w (74%) or v (21%) | l (78%) or r (8X%?) | |

The percentages of languages containing each sound in the UCLA Phonological Segment Inventory Database (UPSID) are in parentheses*.

The eight consonants above overlap considerably with the small consonant inventory of Hawaiian (in phonetic notation below rather than standard Hawaiian orthography):

| Voiceless stops | p | t ~ k | ʔ (< k) |
| Nasals | m | n | |
| | | h (partly < s) | |
| Approximants | w ~ v | l ~ r | |

I do not intend to imply that your language should have only those consonants. You can and probably should add more: e.g.,

Voiced stops: b, d, g

Fricatives: f, sh, z, zh

Affricates: ch, j

Approximant: y

But you almost certainly shouldn't have fewer.
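To get a feel for how much room the bare minimum gives you, here is a short Python sketch that enumerates syllables and two-syllable names. It assumes a strict consonant + vowel syllable shape, which is my simplification and not something proposed above, and it picks w and l to stand in for "w or v" and "l or r"; the variable names are hypothetical.

```python
from itertools import product

# The eight-consonant minimum plus the a-i-u vowel system.
CONSONANTS = ["p", "t", "k", "m", "n", "s", "w", "l"]  # w, l chosen over v, r
VOWELS = ["a", "i", "u"]

# Assumption: every syllable is exactly one consonant followed by one vowel.
syllables = [c + v for c, v in product(CONSONANTS, VOWELS)]

# Every possible two-syllable name, capitalized like a proper noun.
two_syllable_names = [
    (first + second).capitalize()
    for first, second in product(syllables, repeat=2)
]

syllable_count = len(syllables)
name_count = len(two_syllable_names)
```

Even this tiny inventory yields several hundred distinct two-syllable names; shrinking it to three or four consonants cuts that number drastically.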

Suppose you decide that your language is spoken by aliens only capable of producing nasals. Readers will have trouble distinguishing Mnmn from his sister Nnmm. (It doesn't help that the Roman letters for M and N are similar.)

Or suppose you decide that your language will have only three consonants, q, x, and z. It's unlikely that any human language would have only those three consonants. And even if it were an alien language, readers will soon tire of names like Qax, Xix, Zux, etc.

The easy way to expand your inventory is to simply more or less replicate the inventory of English. That might be fine for one language, but perhaps not for two or more, as you run the risk of making your languages sound alike.

To break out of the English box, you could draw upon the inventory of a foreign language you've studied. However, you may not want to imitate its orthography, even if you're comfortable with it, as readers will probably think 'German' if you use sch in your alien language. (That letter sequence is also Dutch, but far fewer people are familiar with Dutch.) For the convenience of your readers, you could Anglicize the spellings: e.g., sh instead of sch, etc.

If you're adventurous, you could look up languages on Wikipedia, find a consonant system that appeals to you, and modify it (e.g., by replacing phonetic symbols with regular letters and removing diacritics). How many science fiction writers have been inspired by, say, Hmong with sixty consonants (combining the inventories of two dialects) or Ubykh with eighty-four consonants? You could also find real-life inspirations for your languages' vowels. Languages here on Earth are more alien than anything most writers can imagine.

The advantage of using a real language as a phonetic baseline is that you avoid the pitfalls of an improbable consonant (or vowel) system without having to study linguistics.

*9.24.8:20: UPSID lists multiple types of t, n, s, l, and r, so I have conflated them and provided approximate figures:

- 26% have a voiceless dental plosive (one type of t), 35% have a voiceless dental/alveolar plosive (another type of t), and 43% have a voiceless alveolar plosive (yet another type of t); these figures add up to 104%, meaning that some languages must have more than one t-like sound. My guess is that 9X% have at least one t-like sound.

- 18% have a voiced dental nasal (one type of n), 36% have a voiced dental/alveolar nasal (another type of n), and 45% have a voiced alveolar nasal (yet another type of n); these figures add up to 99% and may include some languages with more than one n-like sound. My guess is that 9X% have at least one n-like sound.

- 88% have a voiceless sibilant fricative (s-like sound)

- 78% have a voiced lateral approximant (l-like sound)

- 31% have a voiced tap/flap (one type of r-like sound), 35% have voiced trills (another type of r-like sound), 7% have voiced dental, dental/alveolar, and alveolar approximants (yet more types of r-like sounds), and 11% have an "r-sound" (how does this differ from the other types? is it a category for r-sounds not fully described in the sources for the database?); these figures add up to 84% and may include some languages with more than one r-like sound. My guess is that up to 8X% have at least one r-like sound.
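The sums and the limits they place on my guesses can be checked with a few lines of Python. This sketch takes the rounded UPSID percentages quoted above at face value; the function and variable names are hypothetical. The share of languages with at least one sound of a given type can be no lower than the most common subtype's share and no higher than the sum of the subtypes (capped at 100%).

```python
# Subtype percentages as quoted in the footnote above.
UPSID_PERCENTAGES = {
    "t": [26, 35, 43],
    "n": [18, 36, 45],
    "r": [31, 35, 7, 11],
}


def summarize(parts):
    total = sum(parts)
    lower = max(parts)         # at least the most common subtype's share
    upper = min(100, total)    # at most the sum, capped at 100%
    must_overlap = total > 100  # some language must have two or more subtypes
    return total, lower, upper, must_overlap


summary = {sound: summarize(parts) for sound, parts in UPSID_PERCENTAGES.items()}
```

The bounds are wide, so the 9X% and 8X% figures remain guesses; all the arithmetic shows is that they are not ruled out.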

Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2012 Amritavision