The function of

ʔjwɨr 2.77 ŋjow 2.48 dza 1.17 djịj 1.61

Mixed Categories of the Tangraphic Sea (more commonly known by its Mandarin title Wenhai zalei)

is still unknown. Clauson (1964: 67) wrote,

"... no [Tangut] character occurs both in the 'Sea of Characters' [Tangraphic Sea] and the 'Mixed Categories' [of the Tangraphic Sea]. Only a close study of these two authorities could disclose the reason, but it is possible to hazard the conjecture that the first authority lists characters with simple initials and the second characters with initial consonant clusters."

If Mixed contained cluster-initial tangraphs, I would predict the following:

1. Mixed tangraphs would have been transcribed with sinograph combinations rather than single sinographs; two sinographs would have been needed to represent their complex initials

2. Mixed tangraphs would not represent Chinese loanwords with simple initials

3. Mixed tangraphs would have been transcribed with Tibetan clusters

4. Mixed tangraphs would not have been used to transcribe Sanskrit syllables with simple initials

I can't really test prediction 1 without my copy of Sofronov (1968). However, Shi et al. (2000) does contain pronunciations for Tangraphic Sea and Mixed tangraphs in sinographic spellings based on the transcription system in the Pearl. Disinographic spellings were used to represent both TS and Mixed tangraphs.

Nishida's (1964) edition of the Pearl contains disinographic transcriptions of TS tangraphs - e.g.,

魚骨 ?*ŋgy kwə

for TT0460 HEAVEN ŋwər 1.84 (TS 1.87B61; Nishida reconstructed a cluster: ŋɣʉr)

and monosinographic transcriptions of Mixed tangraphs - e.g.,


for TT5479 ONE ʔa (MCTS 09A13; Nishida's reconstruction is identical)

cognate to a 'one' in Qiang languages

領o ?ljej (I don't know the function of the subscript circle)

for TT1594 COUNTRY lhjịj (tone unknown; only in the Precious Rhymes edition of Mixed Tangraphs: 3.07.2401)

cognate to Written Tibetan gling 'land mass surrounded by water' (not the best semantics)?

but has nothing g-like in its Chinese transcription

Thus it appears that prediction 1 is false.

The third tangraph in the title of Mixed Categories disproves prediction 2, since

TT1250 MIXED dza 1.17 (MCTS 05A31)

is a loanword from Late Middle Chinese 雜 *dzap. As far as I know, this Chinese word has never had a cluster initial. It was later reborrowed as

TT1248 MIXED tsha 1.17 < Tangut period NW Chn ?*tsha < LMC *dzap

which was listed in Tangraphic Sea (1.23B12), not Mixed Categories.

As for prediction 3, I know of one Mixed tangraph transcribed with a simple initial:

TT5479 ONE: ʔa (MCTS 09A13; identical to Gong's ʔa)

The other two whose transcriptions I can find in my limited resources are:

TT1594 COUNTRY: lde (MCPRTS 3.07.2401; cf. Gong's lhjịj)

TT4105 PERSON 2.44: bdzo (MCTS 15B22; cf. Gong's dzjwo)

Tangraphs from the level and rising tone volumes of (Precious Rhymes of the) Tangraphic Sea were also transcribed with clusters: e.g.,

TT3619 HEAD/TOP 1.4: dguH, bgu (TS 1.10B32; cf. Gong's ɣu)

TT1075 ONE 1.43: gliH, gli, kli (TS 1.53A71; cf. Gong's lew)

TT4282 WATER 2.85: gzi (the rising tone volume of TS has not survived, but this is in PRTS 2.17.2701; cf. Gong's zjɨɨr)

One could claim that the clusters of transcriptions of non-Mixed tangraphs represented Tangut tones, whereas the clusters of transcriptions of Mixed tangraphs represented real clusters. One could also explain away the simple initials of transcriptions of Mixed tangraphs as representing more innovative forms/dialect(s) or rapid speech phenomena.

These explanations seem improbable to me. How would the creator(s) of the transcriptions be able to distinguish between

- the segmental b of bdzo 'person' and the tonal b of bgu 'head/top'

- the level tone g of gli(H) 'one' and the rising tone g of gzi 'water'

I continue to suspect that the clusters in the Tibetan transcriptions may be real, though I cannot deny the possibility that the apparent lack of correlation between preinitials and tones may be a byproduct of complex tonal sandhi rules which have yet to be discovered. I would like to explore this idea when I get around to writing about Arakawa's 1999 article "A Study on the Tone of Tangut from Tibetan Transcriptions" which a reader kindly sent to me. Arakawa proposed that preinitials generally indicate level tones, though he did not deny the existence of rising tone words transcribed with preinitials.

Prediction 4 is incorrect if Grinstead correctly identified the following Mixed tangraphs transcribing Sanskrit C(V)(C):

TT0357 (Skt hi) ʔjɨr 2.77 (PRTS 20A22)

TT1187 (Skt je) dzjii 2.12 (PRTS 16A23)

TT1346 (Skt ju) dzjuu 1.7 (PRTS 05B41)

TT3757 (Skt huṃ) xo (tone unknown; PRTS 09A52)

TT4882 SUPPLY/(Skt -j-) dzjɨ 1.30 (PRTS 03A72)

Conversely, the only tangraphs that Grinstead identified as representing Skt CCV* were in the level tone volume of Tangraphic Sea rather than Mixed Categories:

TT0584 (Skt tva) twa 1.17 (TS 1.24B12)

TT2551 (Skt jñi) gji 1.11 (TS 1.17B11)

Although my investigation has been severely impeded by my lack of resources, the preliminary results above do not indicate that Mixed contained cluster-initial words as opposed to single-initial words in the rest of Tangraphic Sea.

There is, however, one clear difference between Mixed and the rest of TS: tangraphs pronounced with dz- occur exclusively in Mixed.

Here are the exceptions according to Gong's reconstruction in Li Fanwen (1997):

Absent from any part of (Precious Rhymes of the) Tangraphic Sea:

TT0382, TT1849, TT3054, TT3160, TT3742, TT3988, TT4449

(I don't know how Gong reconstructed readings for TT0382, TT3054, or TT4449, which do not appear in any native Tangut dictionary.)

TT3621 and TT3947 are in the TS version of Mixed but not the PRTS version.

In the level tone volume of TS and PRTS:

TT5062 dzji 1.10

In the rising tone volume of PRTS (the rising tone volume of TS has not survived):

TT2920 dzjiw 2.40

(TT0764 dziã 2.23 and TT4445 dziã 2.23 have been excluded because they seem to be typos for dʑiã 2.23. They appear in the palatal chapter of Homophones [36A52 and 36A55] and were grouped with TT2888 dʑiã 2.23 [36A53] and TT1769 dʑiã 2.23 [36A54].)

Both of these apparent exceptions occur in the enigmatic 'retroflex' chapter of Homophones (which doesn't actually contain any retroflex initials!).

TT5062 dzji 1.10 (20A36) belongs to a homophone group (20A33-20A37) with four other members:

TT1808 dʑji 2.9 (20A33)

TT5634 dʑji 2.9 (20A34)

TT5224 dʑji 1.10 (20A35)

TT5137 dʑji 1.10 (20A37)

TT5062 dzji 1.10 must be a typo for dʑji 1.10.
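The reasoning above (a lone dz- in a group otherwise read with dʑ-) amounts to a majority check within a homophone group. Here is a minimal sketch in Python using only the five readings quoted above. The tuple layout and the function name initial_outliers are assumptions made for illustration and do not come from any published database.

```python
from collections import Counter

# Homophones group 20A33-20A37, readings as quoted above.
# Layout (id, initial, rest of syllable, tone.rhyme) is an assumption of this sketch.
GROUP_20A33_37 = [
    ("TT1808", "dʑ", "ji", "2.9"),
    ("TT5634", "dʑ", "ji", "2.9"),
    ("TT5224", "dʑ", "ji", "1.10"),
    ("TT5062", "dz", "ji", "1.10"),
    ("TT5137", "dʑ", "ji", "1.10"),
]


def initial_outliers(group):
    """Return the IDs whose initial differs from the group's most common initial."""
    counts = Counter(initial for _, initial, _, _ in group)
    majority, _ = counts.most_common(1)[0]
    return [tt for tt, initial, _, _ in group if initial != majority]


print(initial_outliers(GROUP_20A33_37))
```

A check like this only flags candidates for a second look. It cannot say whether the odd reading is a typo or a real exception.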

TT2920 dzjiw 2.40 (20A54) has no homophones in Homophones. (I do not know why Li Fanwen [1986: 295] treated [20A52-20A54] as a homophone group.) I suspect that its dz- is also a typo for dʑ-.

I do not understand how dʑ-initial words ended up in the 'retroflexes' chapter, or why they were not grouped as homophones with dʑ-initial words in the palatals chapter.

I also do not understand why dz-initial words could not be listed in the level or rising tone volumes. I once assumed that the Mixed volume organized by initial class was reserved for words with variable rhymes (due to ablaut?) that could not be indexed in the tonal volumes organized by rhyme. However, it is unlikely that all words with initial dz- would have variable rhymes. Moreover, the variable rhyme hypothesis cannot account for Mixed tangraphs intended to transcribe a specific foreign syllable (e.g., Skt huṃ), unless they were intended to transcribe ranges of foreign syllables (e.g., Skt hu, huu, huṃ).

This quotation by Gong ("Phonological Alternations in Tangut") concerning a member of the 'hot' word family (see here)

TT3089 HOT tsja 1.19

drew my attention to the special treatment of dz-initial words:

"The character is erroneously placed in the WHTL [Wenhai zalei = Mixed Categories of the Tangraphic Sea] W.2691), where the words with initial dz- are placed."

Why wasn't this tangraph placed in the level tone volume along with tangraphs for other 1.19 words? Did it have an alternate reading dzja 1.19 which would qualify it for inclusion in Mixed? I have two reasons to doubt that. First, its fanqie in Mixed has the initial speller TT4254 tsji 2.10. Second, the alternation of ts- and dz- is unknown, though Gong has found alternations of ts-/tsh- (PAiT #33) and tsh-/dz- (PAiT #21). This second objection may not be valid if dzja 1.19 were derived through a two-stage process:

tsja > tshja > dzja
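The two-stage idea can be pictured as a path through the attested initial alternations. The sketch below treats Gong's ts-/tsh- and tsh-/dz- alternations as undirected links (an assumption, since alternations need not be reversible) and searches for a chain from ts- to dz-. The names ATTESTED and alternation_path are hypothetical.

```python
from collections import deque

# Initial alternations cited above from Gong (PAiT #33 and #21).
ATTESTED = [("ts", "tsh"), ("tsh", "dz")]


def alternation_path(start, goal, pairs):
    """Breadth-first search for the shortest chain of alternations from start to goal."""
    graph = {}
    for a, b in pairs:
        # Assumption: each alternation can be followed in either direction.
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of attested alternations links the two initials


print(alternation_path("ts", "dz", ATTESTED))
```

Finding a chain shows only that the derivation is not ruled out by the attested alternations. It says nothing about whether TT3089 ever had such a reading.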

The 'hot' family contains tsh-initial members:

TT5499 HOT tshja 1.20

TT5498 MAKE-HOT/TO-HEAT tshjwa 1.20

did the infix -w- arise via metathesis?: *tshja-w > tshjwa?

I wonder if any other initials are exclusive to Mixed Categories.

*I have excluded one tangraph from this list:

TT3126 (Skt brahma) xiwã 1.25 (TS 1.32A61)

is not really a Skt transcription tangraph but a borrowing from Tangut period NW Chn 梵 ?*fã 'Brahma'.

THE SJI-PLIT

Nishida's grapheme 116 appears in tangraphs for words dealing with knowledge: e.g.,

TT4495 WISE tjɨj 1.42

TT4496 WISDOM sjịj 2.54

TT2058 KNOW sjɨ 2.28

TT4498 WISDOM khu 1.1

TT4501 WISDOM tsha 2.14

Yet it doesn't appear in

TT2605 WISDOM ʑjɨr 1.86 which is semantic in

TT1272 KNOWLEDGE sjij 2.33

the meaning of Nishida grapheme 75 on the left is unknown

obviously cognate to WISDOM sjịj 2.54 and KNOW sjɨ 2.28 even though it is not written with grapheme 116 - why has this word family been split in tangraphy?

The bottom of TT2605 is phonetic and is taken from the bottom of a near-homophone:

TT1934 (a surname) ʑjɨr 1.86

The top of the Mojikyo glyph for TT2605 looks strange to me. Both Nishida (1966: 344) and Li Fanwen (1986: 464) have METAL on top instead (as in the right side of TT1272):

Why write TT2605 WISDOM with METAL instead of grapheme 116, the obvious choice for a semantic element? Three possibilities:

1. METAL is arbitrary and meaningless

2. METAL is semantic (how?)

3. METAL is phonetic (how?)

but why would a tangraph need two phonetics?

because it's a Tangut B phonetic?

if 'metal' was X in Tangut B, did TT2605 represent 'the word that sounds like 'metal' in Tangut B but is ʑjɨr in Tangut A'?

and if it's a Tangut A phonetic, is it short for

TT2581 ARMOR zjir 2.72 or

TT2591 CAULDRON rjɨr 1.86

which are the best matches I could find with METAL on top?

(The initials reconstructed by Gong as z and ʑ were grouped together with l and r in the ninth chapter of Homophones. I suspect that they were phonetically similar to l and r: e.g., [ɮ] and [ʐ].)

Tangut B could explain why the tangraphs for the Tangut A s-words for knowledge are not all written with the same element:

X, Y = two nonhomophonous Tangut B roots for 'know'

TT4496 WISDOM sjịj 2.54 : Tangut B Y-A

I assume TT2369 ONE-OF-TWO dzjij 2.33 is a Tangut B phonetic

another possibility is that grapheme 116 is purely semantic, and only ONE-OF-TWO is phonetic, so the Tangut B words for 'wisdom' and 'one of two' could have been homophonous; alternately, dzjij 2.33 is supposed to evoke sjịj - are there any clear-cut cases of dz-phonetics for s-words, and could this imply that dz- in part derived from s- plus prefixes? (I will examine the unusual status of dz- in Tangraphic Sea in a future post.)

TT2058 KNOW sjɨ 2.28 : Tangut B B-Y-A

B-Y-A (or just B-A-?) has a prefix B-

TT1272 KNOWLEDGE sjij 2.33 : Tangut B C-X

C-X is cognate to X, written with TT2605

could C have been Tangut B *m-? - cf. its use as a phonetic for Tangut A in TT1268, 1271, 1273, and 1282.

THE ILLUSION OF OMNISCIENCE

In "I'm Boar-ed", I wrote that the meaning of

TT2191 (something to do with 'three'?) gju 2.3

was "uncertain". Could the components of the tangraph indicate which, if any, of the proposed meanings is correct? Nishida (1966: 553) wrote,

...it is possible to infer the meaning of a given character element or a given combination of character elements. By a comparison with already known characters, the meaning of a given character element can be determined or inferred, and in this way the meaning of an as yet unreadable character (made up of these various character elements) can gradually be deciphered.

I have always questioned this claim because of the large number of tangraphs whose structures do not correlate with their known meanings: e.g.,

TT0870 WOOD + PERSON + vertical line + PERSON = rerj 2.66 THREE (why!?)

Nonetheless, it is true that some tangraphs are semantically transparent, though TT2191 doesn't look like one of them. Even though its meaning has something to do with 'three', it shares nothing in common with TT0870 or the totally different normal tangraph for 'three'

TT1718 sọ 1.70

or any numeral tangraph that I can think of. Janhunen (1994: 119) pointed out that

... there is no observable semantic indicator (numeral radical) that would unite the [Tangut] numeral 'characters' into a coherent system.

No analysis for TT2191 has survived, so the sources of its parts are unknown.

The meaning of its left side (Nishida radical 242) is unknown*. NR242 appears on the left side of only six tangraphs:

1. TT2188 (the Chinese surname 趙 ?*tʃhjew) tɕhjiw 1.45

analyzed as left of

2. TT2193 (a Tangut surname) gju 2.3

gju on left + SURNAME on right

plus TT5802 基 FOUNDATION/源 SOURCE ɕju 2.2 (why? - were the Gyus the source of the Chhyews, or vice versa? Did some Tangut named Gyu found a new clan with the Chinese name Chhyew, or did some Chhyews assume the Tangut name Gyu?)

3. TT2189 CANAL gju 1.3

a compound of the phonetic gju + the right side of TT4197 DITCH

4. TT2190 ALL tjạ 1.64 < *C-tja

cf. Md 都 dou 'all' < Old Chinese ?*ta; Old Chinese 諸 *tja 'all'

5. TT2191 ? gju 2.3

6. TT2192 WORLD gju 2.3

The last three form a set.

The left side of

TT2190 ALL is derived from


the center (ソ+丨) of TT4475 TO-CROSS dzjịj 1.61 +

the PERSON on the left of TT3866 MANY ʔji 1.11

rather than any of the other graphs with NR242!

ヒ in TT2190 was derived from

TT3539 有 EXIST

so ALL = CROSS (why?) + MANY + EXIST.

The function of ヒ in TT2192 WORLD is unknown. It may be a diacritic to distinguish it from TT2191 ?THREE: 'not three itself, but something to do with three' - i.e., the 三界 three worlds of Buddhism.

The right of TT2191 ?THREE and the center of TT2192 WORLD is Nishida radical 116 'knowledge'. I don't know why.**

NR242 appears in the middle of

TT4218 PASS-OVER gju 1.3

with WATER on the left; cf. Chn 渡 'to cross' which has 氵 'water' on the left

TT4623 SUMMON gju 1.3

with SPEECH on the left

TT2191 ?THREE is obviously phonetic in both tangraphs.

I don't have Grinstead's right-hand element index with me, so I could only find two tangraphs with NR242 on the right:

TT2338 SUFFERING gju 2.3

with Nishida radical 145 on the left (meaning unknown)

TT2609 SEAL/MUDRA tjɨ̣j 2.55

looks like 3/4 of

TT2608 SEAL nwə (one of the 11 entering tone words in Precious Rhymes of the Tangraphic Sea!)

looks like METAL atop PERSON

plus the left of TT2190 ALL tjạ 1.64 as a phonetic

Using Nishida's methods, I might guess that

TT2191 ?THREE gju 2.3

means 'omniscient' because it looks like ALL + WISDOM (which nicely fits Tangut object-verb word order):


left of TT2190 ALL tjạ 1.64 +

left of TT4496 智 WISDOM sjịj 2.54 < *C-sjej-H

cognate to TT2058 KNOW sjɨ 2.28 and TT1272 KNOWLEDGE sjij 2.33; also cf. Written Tibetan shes-pa 'know'; the Proto-Tibeto-Burman *syey-s 'know' in Matisoff (2003: 206) is almost identical to my *C-sjej-H (particularly since I consider *-s to be one probable source of *-H).

But TT2191 ?THREE has no such meaning, and Nishida's gloss 三途 'three roads' (the Chinese Buddhist counterpart of the river Styx) has nothing to do with ALL or WISDOM. NR242 is simply phonetic. (Nishida did not mention the phonetic function of NR242 in TT2191 or TT2189, though he did state it in his entries for TT2192 and TT2193.)

My inability to derive the meaning of TT2191 ?THREE from the sum of its parts should not lead one to deny the existence of transparent semantic compounds in Tangut such as


TT4106 RIDE dzeej 2.34 = PERSON atop HORSE

12.29.1:06: cognate to TT5208 HORSEMAN dzeej 1.37

But most accounts of tangraphy imply that these compounds are more plentiful than they actually are, and overlook tangraphs whose structure is impervious to semantic analysis.

*Nishida (1966: 479) implied that NR242 means みぞ 'ditch', but did not give NR242 a definition on the previous page or include it in his list of tangraphic elements with known meanings on p. 245.

**The only connection that I can make between NR116 KNOWLEDGE and THREE is highly unlikely:

- the Tangut period NW Chn word for 'three' was 三 ?*sã

- Tangut words for 'knowledge' like sjij 2.33 (written without NR116!) also had initial s-

But why choose KNOWLEDGE as a cryptophonetic for ?*sã if (parts of) other tangraphs pronounced sa or sã could have been used instead? It is difficult to believe that the Tangut could have seen an element associated with high vowels -

TT4496 智 WISDOM sjịj 2.54

TT2058 KNOW sjɨ 2.28

- and associate it with a low-vowelled Chinese ?*sã.

Tangut -ji is partly from *-a, and in a few cases (see Matisoff, "'Brightening' and the place of Xixia [Tangut]"), Tangut -jij could also come from *-a (via *-j suffixation: *-a > *-je + *-j?). However, the creator(s) of tangraphy would not have known about these sound changes (unless they knew of a nonstandard dialect which still had a instead of a high vowel), so they would not have used a sji(j)-phonetic to represent ?*sã.

I'M BOAR-ED

I was going to write about this year's animal, but it turns out that I already did that on New Year's Day. Looking at that post, I noticed that Gong's reconstruction of

TT2191 meaning uncertain* giu 2.3, the phonetic of

TT5355 BOAR gju 1.3

ends in -iu, a rhyme that does not exist in his reconstruction system. Andrew West changed this -iu to -ju. This allows the homophone group (23A17-23A37) of TT2191 to be reconstructed as gju 1.3/2.3 (level and rising tone words are mixed) rather than as a mix of giu 2.3 and gju 1.3/2.3. Li Fanwen (1986: 309-310) reconstructed the whole group as gju 1.3/2.3.

Why is there no -iu? Could the medial element that Gong reconstructed as -i- be something that would be hard to perceive next to u: e.g., -w-? It cannot simply be -w-, since there clearly was a -w- in other rhymes that Gong did not reconstruct with -w-, and -iV rhymes were not transcribed with -w-.

Moreover, there was a contrast that Gong reconstructed as -i- vs. -iw-, and it is highly unlikely that this was actually -w- vs. -ww-. (Does any language have CwwV syllables?) I have wondered if -i- was really a velar glide -ɰ-, but that would imply that -iw- was really -ɰw-, a sequence I have never seen in any language.

In Gong's reconstruction, medial -i- only occurs before nonhigh vowels:

mid: i(w)e, iẽ, i(w)ə, io (there is no -iwo)

low: i(w)a, i(w)ã

Since -i- does not correspond to Tibetan or Sanskrit -y-, perhaps it should be reinterpreted as a vowel quality: e.g., ia was a fronter or higher version of a. The high vowels had no higher variants because they were already high.
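The distribution described above (medial -i- only before nonhigh vowels) can be stated as a small check, which also shows why a rhyme -iu stands out. This is a sketch under stated assumptions: the set of nonhigh vowels is limited to the ones in the list above, nasalization is ignored, and the function name medial_i_allowed is hypothetical.

```python
import unicodedata

# Assumption: only the nonhigh vowels that appear in the list above.
NONHIGH = set("eəoa")


def medial_i_allowed(rhyme):
    """True if a rhyme written i(w)V has a nonhigh main vowel V."""
    # NFD splits nasal vowels into base letter + combining tilde,
    # so that the base letter can be tested on its own.
    text = unicodedata.normalize("NFD", rhyme)
    if not text.startswith("i"):
        raise ValueError("expected a rhyme beginning with medial i")
    rest = text[1:]
    if rest.startswith("w"):
        rest = rest[1:]
    return bool(rest) and rest[0] in NONHIGH


listed = ["ie", "iwe", "iẽ", "iə", "iwə", "io", "ia", "iwa", "iã", "iwã"]
print(all(medial_i_allowed(r) for r in listed))
print(medial_i_allowed("iu"))
```

Every rhyme in the list passes, while iu fails. This agrees with the correction of giu 2.3 to gju 2.3.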

*The glosses I have on hand indicate something numerical:

Nishida (1966: 479): 三途 'three roads' (the Chinese Buddhist counterpart of the river Styx)

Li Fanwen (1986: 309): 第 '-th' (ordinal suffix)

Shi et al. (2000: 207): 三 'three'

I suspect that this is the second half of a disyllabic ritual language word for 'three':

TT0870 2191 rerj 2.66** gju 2.3

In Homophones, each of these two tangraphs has the other as its clarifier.

I have no idea what the etymology of this word is. Nor do I understand the structure of the first tangraph, which consists of WOOD and two PERSONS around a vertical bar. There is a very similar-looking homophone

TT0994 道 ROAD/典 CLASSIC rerj 2.66

cognate to TT3467 ?行 GO rerj 2.66?

Could WOOD + vertical line be a phonetic for rerj 2.66? Other tangraphs containing those elements have very different pronunciations: e.g.,

TT0844 (a place name) ɕjĩ 1.16

TT0857 (a kind of tree) tu 1.1 (its phonetic is not WOOD + vertical line, but TT3392 蠢 STUPID tu 1.1 beneath WOOD)

TT0871 (a kind of tree) kjiw 1.45 (like TT0857 above, its phonetic is not WOOD + vertical line, but TT3548 YEAR kjiw 1.45 beneath WOOD)

TT0916 JACKAL khu 2.1 (why are WOOD and WATER in this graph in addition to BEAST on the bottom right?)

**Gong's reconstruction is lherj 2.66. I have adopted Andrew West's correction of the reconstruction.

Gong split a homophone group (49B28-49B41) in two, reconstructing some with initial r- and others with initial lh-.

Li Fanwen (1986: 449) kept this group together, reconstructing them all with r-.

Shi et al. (2000: 301-302) also kept this group together, symbolizing its pronunciation in sinography as 冷 Md leng rather than 冷 with the rhotic diacritic.

The absence of the diacritic may be a typo. Nishida (1964: 137) found that this group was transcribed in Chinese with the rhotic diacritic (冷) and transcribed in Tibetan as re. Since no fanqie have survived for this homophone group, the transcriptions are our only source of information for its initial.

HAPPY NIỌ YEAR

2008 is the year of the

TT3016 RAT xjwi 1.10

cf. TT3018 xjwaa 1.23 below

Both sides of that tangraph might mean 'rodent'. As far as I know, neither component appears in any other position.

TT3017 RAT/MOUSE niọ 2.63 < ?*niọw < ?*C-niak-H (?*s-niak-s)

cf. Old Chinese 鼠 ?*hnaʔ < ?*sn- 'rat/mouse' (initial uncertain*)

has the same left side as TT3016 with an obscure right side.

The only other tangraph with a similar (but not identical) left side is

TT3018 xjwaa 1.23

this looks like a cognate of RAT xjwi 1.10 which hadn't undergone the a > i shift found throughout Tangut

how could Gong reconstruct its unique reading if it didn't appear in any dictionary?

whose meaning is unknown. Presumably it has something to do with water, since WATER is on its right side. Was there an amphibious mammal in the Tangut Empire? I don't know of any Asian analogue to the Australian rakali or the European or American water vole.

The right-hand component of TT3016 appears in

TT2873 RAT tsej (tone unknown)

the function of the left-hand element is unknown

TT5793 MOUSE pia 1.18

with INSECT on the left

This final rodent tangraph has neither side of TT3016:

TT3295 RAT ɕiwə 1.28

borrowed from Middle Chinese 鼠 *ɕɨəʔ 'rat/mouse'?

but the -w- has no Chinese source and there are no other instances of rhyme 1.28 corresponding to the rhyme category of 鼠

15:23: looks like Proto-Tibeto-Burman *syow 'rat' (Matisoff 2003: 228) - what TB forms underlie this reconstruction?

In the Homophones dictionary, TT3016 has TT3295 as its subscript clarifier:

i.e., RAT with a subscript RAT

Does this represent a disyllabic word xjwi ɕiwə or does this simply indicate that xjwi 1.10 and ɕiwə 1.28 were (near-)synonyms?

*The Old Chinese initial of 鼠 'rat/mouse' is uncertain. Its mainstream Middle Chinese initial was *ɕ-, but southern languages have aspirated affricate initials. MC *ɕ- has multiple possible sources including *hn- < *sn- which matches the C-n- (?*sn-) of the Tangut word. I know of no phonetic derivatives of 鼠 with n- which would confirm a nasal initial. The aspirated affricates of southern Chinese languages might reflect Old Chinese *C-hn- with a stop prefix (*k- for small animals?).

FOUR DEGREES OF HEAT

Yesterday, I mentioned this word family:

TT3089 tsja 1.19 'hot'

TT5505 tsja 1.20 'hot'

TT5499 tshja 1.20 'hot'

TT5498 tshjwa 1.20 'make hot, to heat'

Some word families were written with totally unrelated tangraphs: e.g.,

TT1052 ɕjɨ 1.29 'go, reach, enter'

has the MOTION element on top; derived from ɕjɨ 2.27 'extremely few' (phonetic) + 'expel' (semantic)

TT1957 ɕjwɨ 1.29 'go, reach, enter'

looks like LITERATURE + the mysterious right-hand element ヒ, but analyzed as 'region' + 'to cross'.

But all four members of the 'hot' family share the FIRE element:

FIRE on the left:

TT5505 tsja 1.20 'hot'

TT5499 tshja 1.20 'hot'

TT5498 tshjwa 1.20 'make hot, to heat'

FIRE on the right:

TT3089 tsja 1.19 'hot'

The significance of the position of FIRE is unknown. FIRE was generally written on the left, implying that there was something unusual about the exceptions with FIRE on the right. Did anyone ever miswrite TT3089 with the left and right elements reversed?

Gong's glosses for three of the four words are identical. Can their tangraphs help us to distinguish their meanings?


TT5505 tsja 1.20 'hot' =

FIRE = bottom [left] of TT0991 məə 1.31 'fire' +

SUN = all of TT5728 be 2.7 'sun'

no analysis available; looks like WAIST/BIRD + SAGE

implying 'hot like the sun' or 'heat from the sun'?


TT3089 tsja 1.19 'hot' =

FEAR = left of TT3080 kjạ 1.64 'fear' +

cf. the shape of Chn 布 'cloth', the phonetic of 怖 'fear'

FIRE = left of TT5498 tshjwa 1.20 'make hot, to heat'

What does FEAR have to do with 'hot'? 'Frighteningly hot'?

And why was this word placed in the Mixed Categories volume of Tangraphic Sea instead of the Level Tone volume?


TT5499 tshja 1.20 'hot' =

FIRE = left of TT5505 tsja 1.20 'hot' +

CROSS = right of TT4475 dzjịj 1.61 'to cross'

distortion of Chn 走, left of 越 'to cross'?

越 can also mean 'exceed', so could this mean 'exceedingly hot'?

The remaining word was written with HAND on the right:


TT5498 tshjwa 1.20 'make hot, to heat' =

FIRE = bottom [left] of TT0991 məə 1.31 'fire' +

HAND = left of TT2779 tjị 1.67 'to place'

Is HAND supposed to imply human activity: e.g., someone using his hand to place something on a fire?

ALTERNATIONS, ARAKAWA-STYLE

Gong's 1989 article "The Phonological Reconstruction of Tangut" illustrated how previous reconstructions could not elegantly account for Tangut word families. This assumes that WF relationships should be transparent. There is no guarantee that this was the case: e.g., in Chinese,

二 'two': Md er < Middle Chinese *ɲi < Old Chinese *nis

次 'next': Md ci < Middle Chinese *tshih < Old Chinese ?*s-hnis

Granted, that WF is disputable, so it's not the best example. Here's a better example of opaque morphology (but within paradigms rather than word families):

buy - bought

bring - brought

catch - caught

fight - fought

seek - sought

teach - taught

think - thought

work - wrought (archaic)

Nonetheless, I cannot object to a reconstruction that matches the extant transcriptive data and is morphologically transparent. So far, Gong's reconstruction comes the closest to doing both.

But what about Arakawa's reconstruction which postdates Gong's? I do not have time to examine all 105 rhymes in his reconstruction, but I can look at his reconstructions for the alternations that I mentioned in the previous post:


| | vowel alternation, or lowering of vowel due to influence of lost affix? | vowel lowering | vowel alternation | *C-prefix, lowering, and *-N suffix | *C-prefix |
| --- | --- | --- | --- | --- | --- |
| R10 -ii | R30 -II | (would expect R36 -ee, but unattested) | R53 -oo | R64 -ẹ̃ | (would expect R70 -ịi, but unattested) |
| R11 -ii | R31 -II | R37 -ee | R53 -oo | R64 -ẹ̃ | R70 -ịi |

I still don't know what the prime symbol and 2 mean in Arakawa's notation:


| | vowel lowering | | | *C-prefixation, *-e | |
| --- | --- | --- | --- | --- | --- |
| R14 -ii' | R40 -ee' | R55 -jo' | (why is there no R70 -ịi?) | R79 -jẹ'2 | R84 -iir |

In Gong's reconstruction, the above alternations can be derived with a small set of affixes (see here and here). Arakawa's reconstruction, on the other hand, requires more methods of derivation.

Arakawa can more neatly account for the R14-84 alternation than Gong:

Gong: R14 -jii : R84 -jir (with unexpected vowel shortening*)

Arakawa: R14 -ii' : R84 -iir (the long vowel remains intact in both rhymes)

However, I cannot see any other advantages in Arakawa's reconstruction on the basis of this limited sample. Hence I will continue to use Gong's reconstruction, even though I think Arakawa's reconstruction fits the transcriptive data better in some cases**.

*Gong reconstructed a long retroflex vowel in R101 -jiir, corresponding to Arakawa's -jer2. This rhyme alternated with R70 and R84 (Gong, "Phonological Reconstruction", #94-96):


In Gong's reconstruction, the three rhymes share a common core (-ji). In Arakawa's reconstruction, R70 and R84 clearly share -ii and R101 might be derived from a suffix *-e: *r-Cjii-e > Cjer2. Arakawa's -e matches the only Tibetan transcription listed in Nishida (1964: 67): rtse. R101 tangraphs were not used to transcribe Sanskrit, since Sanskrit had no retroflex i or e. The Tangut period northwestern Chinese transcriptions of R101 are difficult to reconcile:

?*tsjẽ or ?*tsĩ




I am not certain if the first graph had *-j-, and the others did not. *-aw is nothing like

Gong's -jiir

Arakawa's -jer2

Sofronov's -ɪ̣

Huang Zhenhua's -iæn

Li Fanwen's -jəi

though it does fit Hashimoto's (1965: 125) -jawN.

Even the initial tsh- of 賊 and 草 is unexpected, since no other evidence points toward an initial tsh- in the 10 words that belong to R101:

rhyme \ initial | l- | ts- | k- | ʔ-

The Chinese transcriptions with *tsh- may not be mistakes. They could reflect otherwise unattested aspirated variants of one or more of the four R101 ts-words. Gong ("Phonological Alternations") has an example of a ts-/tsh- alternation (#33, 55, 67):

TT5505 tsja R20 1.20 'hot'

TT3089 tsja R19 1.19 'hot'

TT5499 tshja R20 1.20 'hot'

TT5498 tshjwa R20 1.20 'make hot, to heat'

Although Gong reconstructed both of the first two words as tsja, they were obviously not homophonous to the compiler of the Tangraphic Sea, who assigned them to different rhymes (1.19 and 1.20). The difference between R19 and R20 is obscure. The Tibetan and Sanskrit data point toward -a for both rhymes, whereas the Chinese data point toward -a, -ja, and even -aw. (Were there dialects of Tangut with o-suffixed forms of a-words: *-a + *-o = -aw?)

**R36-37 were transcribed as Tibetan -e and R36 was used to transcribe Sanskrit -e. This data points toward the mid vowel of Arakawa's -ee rather than the high vowel of Gong's -jij. I compromise by interpreting -jij as [jej].

Conversely, R9 was transcribed as Tibetan -i and R8-9 were never used to transcribe Sanskrit -e. This data points toward the high vowel of Arakawa's -(j)i rather than the mid vowel of Gong's -(i)e. I suspect that the vowel of R8-9 was something like Nishida's ɪ for R8 - i.e., somewhere between high i and mid e.

R8-9 alternate with R34-35 (Gong, "The Phonological Reconstruction of Tangut", #73-77):

Gong: -j suffixation | Arakawa: vowel alternation

Tibetan transcriptions of R34-35 contain -e as well as -i, unlike the transcriptions of R9 which end in -i. Nishida (1964: 44) listed no transcriptions for R8. This seems to favor Arakawa's reconstruction. If I am correct in interpreting Gong's e as [ɪ], then perhaps there was a lowering rule before -j:

| | My guess | Tibetan transcription |
| --- | --- | --- |
| R8 | | none listed in Nishida |
| R34 | -ɪ̞j | -i, -e |
| R35 | -jɪ̞j | -i, -e (I assume the one case of -oH reflects *-o suffixation) |


I have been assuming that the distinction between the pairs of rhymes reconstructed by Gong with -j- was neutralized before tense vowels.

*Cj1V > CjV (first j-rhyme)

*Cj2V > CjV (second j-rhyme)


*C-Cj1V, *C-Cj2V > CjṾ (only one tense j-rhyme)

I would predict that R70 -jị should be the tense counterpart of the lax pair R10 -ji and R11 -ji. Gong (1988) found alternations confirming a relationship between R11 and R70: e.g.,

TT3592 bji R11 1.11 'low, below, down, inferior'

TT3591 bji R11 2.10 'low, below, down, inferior'

TT5019 bjị R70 1.67 'to lower, to bend, to hang down'

That word family (#117) also contains

TT0501 bjɨ R31 1.30 'low, below, bottom' (more examples: #160-176)

which is homophonous with one member of #133/178

TT0494 bjɨ R31 1.30 'high' (!)

TT1309 bjij R37 2.33 'high'

TT1232 bjị R64 1.61 'to heighten, to elevate, to promote'

Gong did not find any alternations between R10 and R70. R10 alternated with

R30 -jɨ (#103-104; cf. the R11-R31 alternation above)

R53 -jo (#76-78; R11 -ji and R36 -jij also alternated with R53: #79-85 and #86-87)

obviously due to *-o suffixation

R64 -jịj (#113-114; also one case of R11-R64 alternation [#115])

due to *-j suffixation?

The chart below sums up the alternations involving R10-R11:


Base | Vowel alternation, or depalatalization of vowel due to influence of lost affix? | *-j suffix | *-o suffixation | *C-prefix and *-j suffix | *C-prefix
R10 -ji | R30 -jɨ | (would expect R36 -jij, but unattested) | R53 -jo | R64 -jịj | (would expect R70 -jị, but unattested)
R11 -ji | R31 -jɨ | R37 -jij | R53 -jo | R64 -jịj | R70 -jị

I don't know whether the absence of R10-R36 and R10-R70 alternations is due to chance.
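For anyone who wants to check the gaps mechanically, the R10-R11 alternations can be held in a small lookup table. This is just a sketch: the names `ALTERNATIONS`, `COLUMNS`, and `gaps` are mine, and the entries simply restate the alternations from Gong's word families cited above, with `None` marking an expected but unattested alternation.

```python
# Column labels follow the five processes in the summary of R10-R11 alternations.
COLUMNS = [
    "vowel alternation",
    "*-j suffix",
    "*-o suffixation",
    "*C-prefix and *-j suffix",
    "*C-prefix",
]

# None = alternation expected on morphological grounds but not found by Gong.
ALTERNATIONS = {
    "R10": {
        "vowel alternation": "R30",
        "*-j suffix": None,           # would expect R36 -jij
        "*-o suffixation": "R53",
        "*C-prefix and *-j suffix": "R64",
        "*C-prefix": None,            # would expect R70
    },
    "R11": {
        "vowel alternation": "R31",
        "*-j suffix": "R37",
        "*-o suffixation": "R53",
        "*C-prefix and *-j suffix": "R64",
        "*C-prefix": "R70",
    },
}


def gaps(base: str) -> list[str]:
    """Return the processes for which no alternation with `base` is attested."""
    return [col for col in COLUMNS if ALTERNATIONS[base][col] is None]


for base in ALTERNATIONS:
    print(base, "gaps:", gaps(base))
```

Laid out this way, the only difference between the two bases is that R10 lacks exactly the two alternations that R11 has with R37 and R70.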

Although I have preferred to reconstruct R36-R37 as -jej, Gong's reconstruction -jij is more morphologically transparent: *-ji + *-j = -jij. I suspect that a dissimilation rule applied when -j was added: *-ji + *-j = -jej. I still think it would be very difficult to distinguish between R36-37 -jij and R14 -jii. Transcriptive evidence also supports a nonhigh vowel for R36-37:

R36: transcribed in Tibetan as -e(H) (rarely -i)

R36: used to transcribe Sanskrit -e (rarely -i)

R37: generally transcribed in Tibetan as -e

the exceptions have -uH and -o - could these reflect -o-suffixed readings?

R37: not used to transcribe anything in Sanskrit (so was this rhyme unlike anything in Sanskrit, or is this an accidental gap?)

The distinction between R10-R11 -ji did not seem to exist before long -ii, as there is only one -jii (R14). I would expect the tense counterpart of R14 -jii to be R70 -jị, since tense vowels lacked a length distinction in Gong's reconstruction. (Arakawa, on the other hand, reconstructed long as well as short tense vowels, ị among them.) But in fact R14 apparently did not alternate with R70:

Base | *-j suffixation | *-o suffixation; length of root is transferred into suffix | *C-prefixation | *r-prefixation and *-j suffixation | *r-prefixation
R14 -jii | R40 -jiij (phonetically [jeej]?) | R55 -ioo (after g-), -joo (after w-, d-) | (why is there no R70 -jị?) | R79 -jirj (there was no -jiirj with a long vowel) | R84 -jir (with a short vowel! - why not R101 -jiir?)

(For examples of R14-R40 alternations, see #79-82 in Gong's "The Phonological Reconstruction of Tangut". That article also presents examples of alternations between R70-R101, R79-84, and R84-101 [#90-96].)

Does this mean that pre-Tangut had no words like *C-Cjii which would have become Cjị R70?

Y TWO Y RHYMES?

I left out one detail from yesterday's cycles-within-cycles paradigm: Gong's reconstruction of pairs of j-rhymes in the lax cycle:

R2-3 -ju, R6-7 -juu

R10-11 -ji

R19-20 -ja, R21, 24 -jaa

R30-31 -jɨ

R36-37 -jij [jej]?

R46-47 -jiw

These pairs only contain four short lax vowels (three high [i ɨ u] and one low [a]) and two long lax vowels (uu, aa). No pairs contain mid vowels: there is only one kind of -jo, etc. In all but one instance (R21 and R24), these pairs were grouped together in Tangraphic Sea, implying phonetic similarity.

Here is a generic paradigm for a Tangut rhyme group incorporating the j-pairs:

Vowel length | Short | Long | (no vowel length distinction) | Short | Long
-i- | -iV | -iVV | -iṾ | -iVr | (no -iVVr)
(no -j1-/-j2- distinction before any long vowels except -uu, -aa)

a has a nearly complete paradigm:

Vowel length | Short | Long | (no vowel length distinction) | Short | Long
-i- | -ia | -iaa | (no -iạ?*) | -iar | (no -iaar)

I have not been able to figure out the differences within the pairs R2-3 -ju and R6-7 -juu. I will look at the other pairs in future posts.

Gong described the zero/-i-/-j- medial distinction in terms of 'grades':

Grade I: zero

Grade II: -i-

Grade III: -j-

I would divide Grade III into IIIa and IIIb.

According to Gong, Tangut grades were based on the 'grades' of Middle Chinese phonology. Unfortunately, there is no consensus on what the MC 'grades' represented. Here is my interpretation:

Grade IV: palatal glides / higher front vowels | Grade III: velar glides / higher central/back vowels
Grade II: lower front vowels | Grade I: lower back vowels

The Tangut may have adapted this system to their own language. Vowels were grouped into low-high pairs. Relatively low vowels were assigned to grades I and II and relatively higher vowels (or glides + lower vowels) were assigned to grades III and IV: e.g.,

Grade I: R17 (low back vowel)

Grade II: R18 (low front vowel)

Grade III: R19 -ɰɑ (velar glide resembling the high back vowel ɯ + low back vowel)

Grade IV: R20 -jæ (palatal glide resembling the high front vowel i + low front vowel)

Compare the above reconstruction with Gong's:

Grade I: R17 -a

Grade II: R18 -ia

Grade IIIa: R19 -ja

Grade IIIb: R20 -ja

Neither reconstruction can explain why R20 (my -jæ and Gong's -ja)

was transcribed in Tibetan as -a (not -ya!)

was used to represent Sanskrit -a (not -ya!)

Perhaps R20 had no -j- at all. If so, then there was no pair of -ja rhymes, and perhaps other pairs of j-rhymes may also turn out to be illusory.

*There is no -iạ in Li Fanwen's (1997: 2) table of Gong's reconstructed rhymes.

I only know of two -iạ words, and both are dubious.

In Tangraphic Sea, TT4763 biạ 1.64 'thigh' has the bizarre fanqie

TT2321 bjị 1.67 + TT5084 tsjwar 1.82

which should indicate bjar 1.82. (One might expect bjw-, but no such cluster exists in Gong's reconstruction.) Yet biạ 1.64 was listed in Tangraphic Sea under rhyme 1.64, not rhyme 1.82. Maybe there was a root ?*bja 'thigh' with two derivatives that were conflated in the TS entry:

*C-bja > bjạ 1.64 (I assume biạ 1.64 is a typo)

*r-bja > bjar 1.82

The word may be cognate to Old Chinese 髀 *peʔ 'thigh'. The voicing of the Tangut form might reflect a nasal prefix: *N-p- > b-.

The other word, TT3571 ŋiạ 1.64 'broad', presumably should be ŋjạ 1.64. It was listed in Tangraphic Sea as a homophone of TT5580 ŋjạ 1.64 'stutter' with the unexpected fanqie

TT2027 xiwa 2.15 + TT2171 ljạ 1.64

The use of x- to spell ŋ- reminds me of how Middle Chinese *x- partly derived from Old Chinese *hŋ-. Perhaps TT3571 and TT5580 had two readings with different initials:

ŋjạ 1.64 < *C-ŋja

xjạ 1.64 < *hŋja < *s-ŋja
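The two fanqie above can be worked through mechanically: the first speller supplies the initial, and the second supplies everything after the initial plus the tone and rhyme number. The sketch below is only an illustration. The `Reading` type, the way I have split each syllable into initial and final, and the `drop_w_after_bj` switch (which encodes the single observation that bjw- does not exist in Gong's reconstruction) are my own assumptions.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    initial: str   # initial consonant(s)
    final: str     # medial + vowel + coda, as segmented by hand (assumption)
    rhyme: str     # tone.rhyme number, e.g. "1.64"

    def spell(self) -> str:
        return self.initial + self.final


def fanqie(upper: Reading, lower: Reading, drop_w_after_bj: bool = True) -> Reading:
    """Combine the initial of the upper speller with the final of the lower one."""
    final = lower.final
    # Assumption: since Gong has no bjw-, a -w- after b- + -j- is simply dropped.
    if drop_w_after_bj and upper.initial == "b" and final.startswith("jw"):
        final = "j" + final[2:]
    return Reading(upper.initial, final, lower.rhyme)


# TT2321 bjị 1.67 + TT5084 tsjwar 1.82 (the fanqie of TT4763 'thigh')
thigh = fanqie(Reading("b", "jị", "1.67"), Reading("ts", "jwar", "1.82"))

# TT2027 xiwa 2.15 + TT2171 ljạ 1.64 (the fanqie of TT5580 'stutter')
broad = fanqie(Reading("x", "iwa", "2.15"), Reading("l", "jạ", "1.64"))

print(thigh.spell(), thigh.rhyme)
print(broad.spell(), broad.rhyme)
```

Run on the two spellings, the first comes out under rhyme 1.82 rather than 1.64, and the second comes out with x- rather than ŋ-, which are exactly the two mismatches discussed above.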

In Homophones, TT3571 ŋiạ 1.64 and TT5580 ŋjạ 1.64 are in the velars chapter, whereas TT2027 xiwa 2.15 is in the glottals chapter. (x- is a 'glottal' initial and I suspect it was actually glottal [h] rather than velar [x].)

xiwa is a single syllable, even though it looks like xi + wa. Gong's reconstruction distinguishes between -iw- and -jw-, even though I know of no language that makes such a distinction.

