I have uploaded an Excel file of all 17 anomalously transcribed sinographs that Gong (2002a: 456-457) found in the 1181 AD Tangut translation of the 類林 Forest of Categories. I will refer to the transcriptions below by their Li Fanwen (LFW) numbers.

The Early Middle Chinese (EMC) reconstructions represent a composite of early 7th century dialects including those which are not ancestral to the Tangut period northwestern Chinese (TPNWC) dialect being transcribed.

The Tibetan transcriptions are of the 9th to 10th century ancestor (or at least close relatives) of TPNWC.

The TPNWC reconstructions are forms I assembled from my revisions of Gong's (2002b) reconstructed initials and finals. I am not very confident about this column.

Not all mismatches are unexpected:

Consonantal mergers in TPNWC

In TPNWC, voiced obstruents devoiced and aspirated or deaffricated, and retroflex stops affricated:

Pre-TPNWC | TPNWC | Tangut transcription
*ph- | *ph- | ph-
*d- | *th- | th-
*ʈh- | *tʃh- | tʃh-
*dʑ- | *ʃ- | ʃ-
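As a quick cross-check, the table can be held as data. This is only a minimal sketch in Python: the names INITIAL_SHIFTS, strip_star, changed, and follows_tpnwc are hypothetical labels for this illustration, and the four rows are the only data assumed.

```python
# Rows of the table above: (pre-TPNWC, TPNWC, Tangut transcription).
# All names here are illustrative; nothing comes from an existing library.
INITIAL_SHIFTS = [
    ("*ph-", "*ph-", "ph-"),
    ("*d-", "*th-", "th-"),
    ("*ʈh-", "*tʃh-", "tʃh-"),
    ("*dʑ-", "*ʃ-", "ʃ-"),
]


def strip_star(form):
    """Drop the asterisk that marks a reconstructed form."""
    return form.lstrip("*")


# Initials whose TPNWC value differs from the pre-TPNWC value.
changed = [pre for pre, tpnwc, _ in INITIAL_SHIFTS if pre != tpnwc]

# True if every Tangut transcription matches the TPNWC column
# (rather than the pre-TPNWC column).
follows_tpnwc = all(
    strip_star(tpnwc) == tangut for _, tpnwc, tangut in INITIAL_SHIFTS
)
```

In this small sample the Tangut transcriptions side with the TPNWC column in every row, which is why these mismatches with EMC are expected rather than anomalous.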

Gong (2002b: 260) proposed that some *dʐ- shifted to *ś- (= my *ʃ-), but that shift did not occur in the reading of 鉏 *dʐɨə, so I have excluded it from the table.

I wonder if the TPNWC alveopalatals were actually retroflex. Could the retroflexion of LFW2025 ʃɨəʳ be an attempt to imitate the retroflexion of a TPNWC *ʂi?

The retroflexion of LFW4633 thwəʳ and LFW5412 ləʳ corresponds to pre-TPNWC *-n and this reminds me of how early transcriptions of Japanese used *-n sinographs to write Japanese *-r-.

The retroflexion of LFW4798 mɔʳ happens to correspond to an Old Chinese *-r- in 邈 and 茆 (not in Leilin; details here), but that *-r- was absent in the Tibetan transcriptions and must have been absent from TPNWC. Moreover, LFW4798 also transcribed 妙 (also not in Leilin; details here) which never had an *-r- in Chinese.

Rhyme shifts in TPNWC

In TPNWC, a *-ə or *-ø-like rhyme corresponding to EMC *-ɨə raised and backed to *-u (Gong 2002b: 374) and was transcribed in Tangut as -ɨu. (The Grade III vowel -ɨ- is obligatory in Tangut between an alveopalatal initial and -u.) The Tibetan transcriptions -u, -e, and -i for this rhyme category are attempts to represent a nonlow vowel absent from Tibetan.

I do not know why 澤 (Gong's TPNWC *tśhiej, my TPNWC *tʃhɛ) was transcribed as LFW1796 tʃhɨụ.

LFW3324, 4042, 1426, and 4867 also do not match TPNWC vowels very well, though the gap is narrower than that between Tangut -ɨụ and TPNWC *-ɛ. This is surprising, since I presume Chinese-to-Tangut translators were fluent in Chinese and much better transcriptions were possible in Tangut.

The grade mismatches in red are also puzzling.

I did not mark the grade of LFW4042 in red since I assume 伋 had shifted from Grade III *kɨi to Grade IV *ki by the 12th century. If 伋 were still Grade III, it could have been transcribed with

LFW3006 1kɨi 'to sing'

which according to Tangraphic Sea was also for use (as a transcriptive tangraph?) in (Chinese) classics and (Sanskrit) mantras. That definition is odd since Sanskrit does not have a syllable kɨi. Could it have stood for Skt kṛ? (Cf. the Thai borrowing of Skt ṛ as ฤ rɨ(ɨ) [Gedney 1947: 89, which uses the symbol ʌ for ɨ!]. But wouldn't kəʳ with a retroflex vowel have been a better match, assuming that the Sanskrit the Tangut heard even remotely sounded like the real thing?) Are there any examples of LFW3006 in transcriptions?

Did Tangut still have 105 rhymes by 1181?

Could at least some of the above anomalies reflect mergers in late twelfth century Tangut? Was the intricate 105-rhyme system recorded in Tangraphic Sea over a century earlier collapsing: e.g.,

Tangut | | 11th c. Tangraphic Sea Tangut | | Late 12th c. Forest of Categories Tangut |
Grade | Rhyme | Tangut | Should transcribe TPNWC | Tangut | Transcribes TPNWC
I | 51 | -o | *-o (Grade I) | -o | *-o (Grade I)
II | 52 | | *-ɔ (Grade II) | | *-ɔ (Grade II)
I | 56 | -õ | *-õ (Grade I) | -õ ~ -o | *-õ, *-o (Grade I)
II | 57 | -ɔ̃ | *-ɔ̃ (Grade II) | -ɔ̃ ~ ~ -õ ~ -o | *-ɔ̃, *-ɔ (Grade II), *-õ, *-o (Grade I)

Maybe TPNWC rhymes were also merging (e.g., losing nasalization). A Tangut -ọ/-ɔ̣ merger would account for the vowel height mismatch between

LFW1426 2phọ

and TPNWC 鮑 and 樸, both *phɔ. But it would not account for the Tangut tense : Chinese lax correspondence (unless Tangut was beginning to lose tenseness - and retroflexion?).

WASN'T THERE A MƆʳ APPROPRIATE TRANSCRIPTION?

In the Tangraphic Sea, nearly all Tangut rhymes were organized into three groups:

1. Lax (plain): V (rhymes 1-60)

2. Tense (glottalized?): Ṿ (rhymes 61-76)

3. Retroflex: Vʳ (rhymes 77-103)

The two exceptions were lax rhymes that were listed at the end of the level tone volume of Tangraphic Sea:

R104 -əũ

R105 -ya

These rhymes were in readings for tangraphs transcribing Chinese syllables. Thus they are not part of the core native rhyme inventory (R1-103) which lacks nasalized u and may lack the high front rounded vowel y. (R3 -iu and R7 -iuu may really be -y and -yy.)
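The grouping of rhyme numbers into classes can be written as a small lookup. This is a minimal sketch in Python, assuming only the ranges given above; the function name rhyme_class and the label "lax (transcription-only)" for R104-105 are my own hypothetical choices.

```python
from collections import Counter


def rhyme_class(rhyme):
    """Return the Tangraphic Sea class for a rhyme number (1-105)."""
    if 1 <= rhyme <= 60:
        return "lax"
    if 61 <= rhyme <= 76:
        return "tense"
    if 77 <= rhyme <= 103:
        return "retroflex"
    if rhyme in (104, 105):
        # Lax rhymes listed at the end of the level tone volume,
        # used in readings for tangraphs transcribing Chinese.
        return "lax (transcription-only)"
    raise ValueError(f"no such rhyme: {rhyme}")


# Size of each class over the full inventory of 105 rhymes.
class_sizes = Counter(rhyme_class(n) for n in range(1, 106))
```

Counting over 1-105 gives 60 lax, 16 tense, and 27 retroflex rhymes in the core inventory, plus the two appended lax rhymes.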

The three-way lax/tense/retroflex distinction is absent from all modern Chinese languages and was presumably also absent from the Chinese dialect known to the Tangut. (Pulleyblank 1984 has reconstructed retroflex vowels in Early Middle Chinese, but they were lost in his Late Middle Chinese which predated Tangut period Chinese.) Therefore one would expect Tangut period northwestern Chinese (TPNWC) syllables to be transcribed with lax rhyme tangraphs. However, Gong (2002a: 456-457) found 17 sinographs in the Tangut translation of the 類林 Forest of Categories that were transcribed with tense or retroflex rhyme tangraphs:

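Tangut rhyme class | Tangut rhyme | Reconstruction | Number of transcribed sinographs | Transcribed sinographs
Tense | 62 | -ɨụ | 7 | 鉏初楚杼處褚澤
Tense | 64 | -ɨẹ | 1 |
Tense | 72 | -iə̣ | 1 |
Tense | 73 | -ọ | 2 | 鮑樸
Retroflex | 90 | -əʳ | 2 | 盾論
Retroflex | 92 | -ɨəʳ | 3 | 寔涉什
Retroflex | 96 | -ɔʳ | 1 |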
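The counts in the table can be tallied to confirm that they add up to Gong's 17 sinographs. This is a minimal sketch in Python using only the figures in the table; the names ANOMALOUS, totals, and grand_total are hypothetical, and I assume that rows without an explicit class label continue the class of the row above them.

```python
# (class, rhyme number, reconstruction, number of sinographs) per table row.
ANOMALOUS = [
    ("tense", 62, "-ɨụ", 7),
    ("tense", 64, "-ɨẹ", 1),
    ("tense", 72, "-iə̣", 1),
    ("tense", 73, "-ọ", 2),
    ("retroflex", 90, "-əʳ", 2),
    ("retroflex", 92, "-ɨəʳ", 3),
    ("retroflex", 96, "-ɔʳ", 1),
]

# Sum the sinograph counts per rhyme class.
totals = {}
for rhyme_cls, _number, _reconstruction, count in ANOMALOUS:
    totals[rhyme_cls] = totals.get(rhyme_cls, 0) + count

grand_total = sum(totals.values())
```

The tally comes to 11 sinographs with tense rhymes and 6 with retroflex rhymes.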

Here are four possible interpretations:

1. TPNWC did have tense and retroflex rhymes like Tangut.

2. Tangut did not have tense and retroflex rhymes.

3. Tangut tenseness and retroflexion correspond to other phonetic qualities in TPNWC.

4. These are random transcription errors.

I think 4 is correct, though I am puzzled by the transcription of the last sinograph 邈:


1mɔʳ (transcription character)

alphacode: biocirges

top of 1mi 'to listen; to hear' (cf. 1mio below)

alphacode: biopokdex +

left 2/3 of 1tʃɔʳ 'mud' (< 'water' + 'earth' + E-shape)

alphacode: cirgescok (there is no cirges, so the E-shape seems to be a filler)

I found the tangraph biocirges yesterday when I checked to see if the rhyme -ɔʳ could follow labial initials.

biocirges is a fanqie character made from two other characters representing its initial (bio) and final (cirges).

Tangraphic Sea defined biocirges as being for use in mantras and rhymes. However,

- Sanskrit has no retroflex vowels

- The closest Sanskrit syllable (mor with a consonant r, not retroflexion of o) is of low frequency

- Nishida (1964: 67) lists no instances of its use to transcribe Sanskrit

- What does "rhymes" refer to?

- There is no independent evidence for TPNWC *mɔʳ (or for retroflexion in TPNWC)

Why was a transcription tangraph designed for a syllable that occurred in neither of the two languages that were commonly transcribed?

Li Fanwen (2008: 759) lists biocirges as a transcription of three sinographs (including the one from Leilin noted by Gong):

妙 Middle Chinese *miewh (Grade IV) < Old Chinese *mews

Tibetan transcriptions HbyeHu, HbyoHu (for *mbjew, *mbjow?)

茆 Middle Chinese *mæwʔ (Grade II) < Old Chinese *mruʔ

No Tibetan transcriptions, but its near-homophone 貌 was transcribed as Hbeg (sic; for *mbæk?; the *-k ending is not entirely unexpected since 貌 is phonetic in 邈 with *-k - was there an alternate *-k reading of 貌?)

邈 Middle Chinese *mæwk (Grade II) < Old Chinese *mrawk

Tibetan transcription Hbyag (for *mbæk?)

The TPNWC readings are uncertain, so I have listed the MC readings from 4+ centuries earlier and the Tibetan transcriptions from 1-2 centuries earlier.

妙 has a Chinese grade (IV) that does not match the grade (II) of 1mɔʳ. Why wasn't 妙 transcribed as Grade IV

1mioʳ 'true; close'

Tibetan transcription: rmo (twice), rmu (once)

or as any of the Grade IV tangraphs read mio without retroflexion: e.g.,

1mio 'to listen; to hear' (extended stem of 1mi- [see above] + -o- before first or second person subject suffixes)

And why wasn't a tangraph for Grade II (sans retroflexion) created for the transcription of Grade II 茆 and 邈? Although Tangut had no mɔ, this is probably a chance gap since Tangut did have other labials before -ɔ: pɔ, phɔ, bɔ < pre-Tangut *pro, *phro, *bro. *mro must have been missing.

-ɔʳ is a rare rhyme in Tangut. It occurs in only five other tangraphs:

Li Fanwen number | Reconstruction | Gloss
4064 | 1vɔʳ | second half of 1vɛʳ-1vɔʳ 'to cherish; to treasure'
2005 | 1tʃɔʳ | mud
5697 | | dirty; filthy
2113 | 2tʃɔʳ | filthy (has 'person' on the left)
3061 | | putty; to spread on (has 'skin' on the right)

Their readings may go back to *vor (or *Cʌ-Por) and *tʃor(-H) or *r-tʃo(-H) with lowering of *o before *v- and *tʃ-.

The *tʃor(-H) words may all be related (or at least the first three are).

THE R-ELATIVE CHRONOLOGY OF RETROFLEXION IN TANGUT

Tangut rhymes are all vowel (V)-final and can be grouped into three classes:

1. Default: V (rhymes 1-60, 104-105)

2. Tense (with subscript dot): Ṿ (rhymes 61-76)

3. Retroflex (with small r): Vʳ (rhymes 77-103)

I initially hypothesized that an *r anywhere in a pre-Tangut syllable would condition retroflex rhymes:

a. preinitial/presyllable *r-: *rCV > CVʳ

b. initial *r-: *rV > rVʳ

c. medial *-r-: *CrV > CVʳ

d. final *-r: *CVr > CVʳ

However, scenario c predicts that Tangut words cognate to non-Tangut *Cr-words should have retroflex rhymes. This is not the case.

For example, Guillaume Jacques (2006) linked Japhug rGyalrong ku ɣrum < *pr- 'white' to

1phɔ̃ 'white'

with a default class rhyme, not a retroflex class rhyme. But perhaps this is not the best counterevidence since there is no Tangut rhyme -ɔ̃ʳ. Nasalized retroflex vowels are very rare, and retroflex *-ɔ̃ʳ could have merged with default -ɔ̃.

One could raise a similar objection to

1bɪ̣ 'willow'

which may be cognate to Japhug rGyalrong qa-ʑmbri 'willow'. Retroflexion and tension cannot coexist, so perhaps *-ɪ̣ʳ simplified to *-ɪ̣.

However, no such objection applies to

2phɔ 'snake' (Golden Guide graph #38)

a likely Tangut cognate of Japhug rGyalrong qa-pri 'snake'. The rhyme -ɔʳ exists and is attested after the labial initial m- (more on this in my next post), so why isn't 'snake' 2phɔʳ with a retroflex vowel?

Hence I now think that original medial *-r- did not condition retroflexion:

a. preinitial/presyllable *r-: *r(V-)CV > secondary *CrV > CVʳ

b. initial *r-: *rV > rVʳ

c. medial *-r-: original *CrV > CV (not CVʳ!)

d. final *-r: *CVr > CVʳ

Instead, original medial *-r- led to Grade II in low vowel syllables and Grade III or IV (depending on the initial class) in high vowel syllables, whereas secondary medial *-r- from earlier preinitial or presyllabic *r- led to retroflexion.

I hypothesize that the loss of original medial *-r- predates retroflexion. Note that 'medial' really means 'postinitial' and does not refer to initial *r following a presyllable:

The basic pattern

*r class | Preinitial/presyllable *r- > secondary medial *-r- | Initial *r- | Original medial *-r- | Final *-r
Pre-Tangut | *r(V)-CV | *rV | *CrV | *CVr
Original *-r- loss | | | *CV |
Secondary *-r- | *CrV | | |
Retroflexion | CVʳ | rVʳ | CV | CVʳ
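The ordering in the basic pattern can be checked by applying the three stages in sequence to schematic syllable shapes. This is only a minimal sketch in Python, not a model of real Tangut forms: the shapes are the abstract strings from the table (with the presyllable *r(V)- simplified to a bare preinitial r), and all function and variable names are hypothetical.

```python
# Schematic shapes only: C = any non-r initial, V = any vowel,
# r = pre-Tangut *r, and a trailing "ʳ" marks a retroflex vowel.


def lose_original_medial_r(shape):
    """Stage 1: original medial *-r- drops (*CrV > *CV)."""
    return "CV" if shape == "CrV" else shape


def preinitial_to_medial(shape):
    """Stage 2: preinitial *r- becomes a secondary medial (*rCV > *CrV)."""
    return "CrV" if shape == "rCV" else shape


def retroflex(shape):
    """Stage 3: any remaining *r retroflexes the vowel; only initial r- survives."""
    if "r" not in shape:
        return shape
    initial = "r" if shape.startswith("rV") else "C"
    return initial + "Vʳ"


def derive(shape, stages):
    """Apply the stages in the given order."""
    for stage in stages:
        shape = stage(shape)
    return shape


PROPOSED_ORDER = [lose_original_medial_r, preinitial_to_medial, retroflex]
outcomes = {s: derive(s, PROPOSED_ORDER) for s in ["rCV", "rV", "CrV", "CVr"]}

# Counterfactual: with no prior loss of original medial *-r-,
# *CrV would wrongly come out retroflex.
without_loss = derive("CrV", [preinitial_to_medial, retroflex])
```

With the proposed order, only original *CrV escapes retroflexion, matching 2phɔ 'snake'. If the first stage is dropped, *CrV becomes CVʳ, which is the outcome that the 'snake' cognate argues against.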

The complete set of patterns

*r class Preinitial/presyllable *r- > secondary medial *-r- Initial *r- Original medial *-r- Final *-r
Vowel class Low High Low High Low High Low High
Pre-Tangut *(r(ʌ)-)Ca *rʌ-Ci *rɯ-Ca *(r(ɯ)-)Ci *(Cʌ-)ra *Cʌ-ri *Cɯ-ra *(Cɯ)-ri *(Cʌ-)Cra *Cʌ-Cri *Cɯ-Cra *(Cɯ-)Cri *(Cʌ-)Car *Cʌ-Cir *Cɯ-Car *(Cɯ-)Cir
*-r-loss *(Cʌ-)Cæ *Cʌ-Cɪ *Cɯ-Cɨa *(Cɯ-)Cɨi
Vowel bending *rʌ-Cəi *rɯ-Cɨa *Cʌ-rəi *Cɯ-rɨa *Cʌ-Cəir *Cɯ-Cɨar
Presyllable loss *Cra *Crəi *Crɨa *Cri *ra *rəi *rɨa *ri *Cæ * *Cɨa *Cɨi *Car *Cəir *Cɨar *Cir
Retroflexion *Caʳ *Cəiʳ *Cɨaʳ *Ciʳ *raʳ *rəiʳ *rɨaʳ *riʳ *Caʳ *Cəiʳ *Cɨaʳ *Ciʳ
*ɨ-fronting after most initials Caʳ Cəiʳ Ciaʳ Ciʳ raʳ rəiʳ riaʳ riʳ Cia Ci Caʳ Cəiʳ Ciaʳ Ciʳ

Grade II retroflex rhymes reflect an original medial *-r- plus another *r in pre-Tangut: e.g.,

CVʳ < *r(V-)CV, *rCrV, *CrVr

Grade II rhymes never occur with initial *r-. This is what we would expect if Grade II were conditioned by medial *-r-.

*rV-rV(r) and *rVr would merge with *rV as rVʳ with a non-Grade II rhyme. Tangut has 4 1raʳ tangraphs and 12 2raʳ tangraphs. Those 16 raʳ may go back to *ra(r) with various lost *Cʌ- presyllables: e.g., *pʌ-, *tʌ-, *kʌ-, *sʌ-, *rʌ-, etc.

Some preinitial *r- may be from other coronal stops: e.g.,

*tV-m- > *tm- > *dm- > *rm-

Although *Cr- is a source of Grade II, not all Grade II syllables originally contained *-r-. Grade II syllables with labiodental (v-) or alveopalatal initials (tʃ- tʃh- dʒ- ʃ- ʒ-) may not have had *-r-: e.g.,

1tʃæ < *tʃ(r)a 'to collapse' (cf. gDong-brgyad rGyalrong kɤ-tʂaβ 'to roll')

2vɛ̃ < *Cʌ-peN 'to fart' (cf. gSar-dzong rGyalrong tɯ-phe 'to fart')

1tʃhʌ < *Cʌ-tʃhə 'to take' (cf. gDong-brgyad rGyalrong kɤ-tɕɤt 'to take')

The association of alveopalatals and Grade II in Tangut is like the association of retroflexes and Grade II in Middle Chinese. Perhaps Tangut alveopalatals were really retroflexes: tʂ-, tʂh-, dʐ-, ʂ-, ʐ-.

Although there is no association between labiodentals and Grade II in Chinese, Tangut labiodental v- is associated with Grade II like the alveopalatals (retroflexes?). Perhaps v- was once *w- which is Elmer Fudd's substitute for r: wascally wabbit for rascally rabbit, etc. In the Xi'an dialect of northwestern Mandarin, *ʂ- and *ɕ- have become f- before -u.

PST *P(R)-L 'TO SEPARATE'?

Years ago on this blog, I proposed that Old Chinese zero ~ *-ŋ alternations originated from zero ~ *a grade alternations:

*m....ŋ (zero grade) > *ma 'not'

*maŋ (*a grade) > *maŋ 'not'

Cf. Sanskrit alternations such as

*m...n-ti- (zero grade) > mati- 'mind' (Eng mind is cognate)

*men- (*e grade) > mantra- 'mantra'

I abandoned this idea, but it came to mind again last night when I noticed the mismatch between pre-Tangut *Cʌ-briH and Matisoff's (2003: 423) Proto-Tibeto-Burman *bral 'to separate'. Could they be reconciled by proposing a root *br-l?

*brl (zero grade) > *bri > root of pre-Tangut *Cʌ-bri-H > Tangut 2bɪ 'to open'

*bral (*a grade) > root of Written Tibetan H-bral- 'to be separated', H-phral- 'to separate'

The shift of *-l to the vowel -i has parallels elsewhere in Sino-Tibetan:

- the shift of initial and final *l to *j in Old Chinese

- medial *-l- to -y- in Written Tibetan after ph- and between b- and -i (Jacques 2004: 9)

- medial *-l- to *-y- in Burmese

- preinitial *l- to j- in Japhug rGyalrong (Jacques 2004: 271)

Could Old Chinese have both zero and *a-grade forms of 'separate'?

*brl (zero grade) > root of OC 仳 *briʔ 'to be separated'

*bral (*a grade) > root of OC 披 *ph(r)aj < ?*s-br- 'to separate'

Gong (2002: 199) would add OC 離 *raj 'to leave, depart; be separated from' which he derived from *br-.

Could OC 別 *prat 'to divide, separate' be related via *pral-t?

This scenario has several problems:

1. Are there any other examples of OC *-i ~ *-Vj alternations?

2. 仳 has another *-r-less reading *phiʔ.

3. Is 仳 'to be separated' attested outside the two-syllable words 仳離 and 仳別, both 'be separated'? (Does the latter have any attestations in OC texts, or is it a later coinage?) Could 仳 have originated as a reduplication of the following root (離, 別) or a part of an earlier disyllabic root? Did its high vowel *i block the development of emphasis in the low-vowelled second syllable?

4. The reconstruction of *-r- in 披 is uncertain. If it was *phaj, could it be cognate to 破 *phajs 'to break' (< 'cause to separate') as suggested by Schuessler (2007: 416)? Was there a Proto-Sino-Tibetan root *P-l with an infixed variant *Pr-l (from *r-P-l?)?

(5.27.7:59: Could OC 分 *pən 'to separate' be a *ə grade derivative of this root via *pəl-n?)

5. There is no Chinese-internal evidence for the reconstruction of *b- in 離. The only stop initial associated with the phonetic 离 is Middle Chinese *ʈh- which could be from *hr- (Sagart 1999, Schuessler 2009) or *t-hr-. The transcription of Alexandria as OC 烏弋山離 *ʔa-lək-ksan-(d)raj could imply *dr- but not *br-.

At the moment, I see Tangut 2bɪ 'to open' as a potential cognate of OC 仳 *briʔ 'to be separated' if and only if the latter were ever an independent word. This relationship need not entail the reconstruction of a zero ~ *a grade alternation.

THE GOLDEN GUIDE: LINES 7-8: TANGRAPHS 31-40

(Fell asleep before I could post this last night. Added a lot tonight.)

Since I ran out of time last night to do a new line of the Golden Guide, I'm doing two tonight:

Lines 7-10 have the format

1 | 2 | 3 | 4 | 5
1 of 4 seasons | (varies) | animal/Earthly Branch name | animal/Earthly Branch name | animal/Earthly Branch name
 | | (3-5 together: the months of the season in 1)

The Tangut borrowed the Earthly Branches system from Chinese, though they used native names for the branches. The Tangut also borrowed the Heavenly Stems system, but those are absent from the Golden Guide with the exception of 1lɨẹ which also means 'seedling'.

Tangraph # | 31 | 32 | 33 | 34 | 35
Li Fanwen number | 0185 | 2162 | 1477 | 0281 | 0083
My reconstructed pronunciation | 2nwiə | 2bɪ | 2ləi | 1tsəiʳ | 1vəi
Tangraph gloss | spring | to untie; to open | tiger (month 1) | rabbit (month 2) | dragon (month 3)
Translation | Spring opens: tiger, rabbit, dragon,

31 'spring' looks like 36 'summer' (see below) plus a radical

that also appears in

1ʃɨõ 'season'

32 2bɪ 'to open' has no phonetic though it is a phonetic in a homophone

2bɪ 'grape'

Nishida (1966: 243) identified their middle radical

as 'knowledge'. The left radical is 'person' and the right radical could be an abbreviation of any of 222 other tangraphs.

21:15: 2bɪ 'to open' may be cognate to Written Tibetan Hbye-ba 'to open; to separate', though the vowel of the Tangut form implies pre-Tangut *Cʌ-briH with a medial *-r- absent from WT. Did Tangut have an infix *-r-? Matisoff's (2003: 423) Proto-Tibeto-Burman *bral 'to separate' has *br- but with a nonfront vowel.

33 'tiger' contains a radical

that may be from Chinese 乕 'tiger'.

Only two of these tangraphs have analyses in the surviving level tone volume of Tangraphic Sea:

34 'rabbit' consists of all of 'wild animal' plus the right side of a homophonous phonetic:


35 'dragon' consists of the top of 'to crawl' plus the bottom of a homophonous phonetic:


Tangraph # | 36 | 37 | 38 | 39 | 40
Li Fanwen number | 0071 | 1876 | 0080 | 1115 | 5504
My reconstructed pronunciation | 2dʒwɨe | 2phəəu | 2phɔ | 1gie | 2mio
Tangraph gloss | summer | luxuriant; flourishing | snake (month 4) | horse (month 5) | sheep (month 6)
Translation | Summer flourishes: snake, horse, sheep.
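The five-slot format of these lines can be sketched as a small parser over the glosses in the two tables above. This is a minimal sketch in Python; the names LINES, parse_line, and parsed are hypothetical, and the month formula is an assumption that is only checked here against lines 7 and 8.

```python
# Glosses for Golden Guide lines 7-8, in tangraph order (tangraphs 31-40).
LINES = {
    7: ["spring", "to untie; to open", "tiger", "rabbit", "dragon"],
    8: ["summer", "luxuriant; flourishing", "snake", "horse", "sheep"],
}


def parse_line(line_number, glosses):
    """Split a five-tangraph line into season, variable slot, and month animals."""
    season, varies, *animals = glosses
    # Assumption: line 7 starts at month 1 and each line covers three months.
    first_month = 3 * (line_number - 7) + 1
    months = {first_month + i: animal for i, animal in enumerate(animals)}
    return {"season": season, "varies": varies, "months": months}


parsed = {number: parse_line(number, glosses) for number, glosses in LINES.items()}
```

For line 8 this yields months 4-6 for snake, horse, and sheep, as in the gloss row of the table.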

36 may be from Chinese 夏 'summer' plus a ヒ of unknown function. I wonder if the common final radical ヒ represented a final consonant in a nonstandard Tangut dialect that was not the basis of the rhyme system of Tangraphic Sea.

37 is listed as an adjective in Li Fanwen (2008: 312). Hence 36-37 could be translated as 'flourishing summer'. But I've chosen to translate it as an intransitive verb to parallel 32 'opens'.

37 'flourishing' shares its left side


2ziəəʳ 'water'

The right side of 'flourishing' is

2phəu 'tree'

which is phonetic and possibly even cognate via vowel lengthening (2phəu 'tree' > 'full of trees' > 2phəəu 'flourishing'?).

38 'snake' contains

1do 'poison'

The Tangraphic Sea derives 'poison' from 'snake', but I suspect the derivation was actually the other way around:


'snake' < 'poison'

39 'horse' contains the radical


derived from Chinese 馬 'horse'. The regular word for 'horse'


has the *r-root also in 馬 Old Chinese *mraʔ 'horse' (and the ヒ also in 'summer').

The Earthly Branch 'horse' is the only tangraph in this line with a known Tangraphic Sea analysis:


1gie 'horse (Earthly Branch)' = left of 1rieʳ 'horse' + left of 2riaʳ 'horse'

The last two words may be cognate.

2riaʳ looks like the right of 1gie combined with all of 1rieʳ.

In Tangraphic Sea, 1rieʳ is defined as 1gie-1rieʳ (a redundant compound?) and 2riaʳ.

1gie is also defined as 1gie-1rieʳ and 2riaʳ.

The TS definition for 2riaʳ is lost, but in the Homophones dictionary (53B27), it is clarified by a preceding 1rieʳ.

40 'sheep' has the unexpected radical

'pass' < 戉 < Chinese 越

alphacode: pax

instead of something like 'beast'. The right side (alphacode: caigus) is unique to 'sheep' and cannot be taken verbatim from some other tangraph. The combination paxcai is also unique to 'sheep'. 'Sheep' may be a combination of three tangraphs: one with pax, one with cai (ユ), and another with gus (ソ+几+ノ).

TANGRAPHIC RADICALS 1: PRIMARY, SECONDARY, PSEUDO

Although there are over 6,000 Tangut characters (tangraphs), they are not 6,000 totally different line configurations. They contain recurring elements which are usually called 'radicals'. I've found that term unsatisfying because some 'radicals' are derivatives rather than true roots: e.g., I recently proposed that the 'steal' radical

Li Fanwen radical 369

alphacode: yam (on left), yal (on right)

might be derived from

1tʃhɔ̃ɔ̃ 'to steal'

alphacode: yamdil

which in turn might've been derived from Chinese

'to rob'

yam/yal corresponds to the 完 part of 寇 which by itself without 元 means 'complete'.

I doubt that 完 'complete' was directly distorted into 'steal':

完 >

Instead, I think 'rob' as a whole was distorted into yamdil 'steal', which was then abbreviated as yam/yal,

寇 > >

which then became a radical in tangraphs for other words meaning 'to steal':


Calling the component yam/yal a 'radical' implies to me that the creator of tangraphy designed yam/yal and then used it to build other characters like yamdil:

(? >) >

Nonetheless, the term 'radical' is so commonly used that I'll use it with Andrew West's caveat: "The Chinese Radical model does not seem to apply to Tangut." Perhaps I could make a distinction between

1. primary radicals: elements that seem to have originated as wholes

2. secondary radicals: elements derived from primary radicals


primary <> secondary?

3. pseudoradicals: parts of tangraphs used as indexing items in modern times which probably do not correspond to units in tangraphy

I could divide 'steal' into two pseudoradicals (its top and bottom halves) and view

alphacodes: qun and foa

as combinations of those two halves with two other elements

But is there really any relationship between

yam/yal and qun


yam/yal and foa

I doubt it. Nishida (1966) was unable to gloss qun and foa.

Elements with a common origin may share lines, but elements that share lines may not have a common origin. The top half of yam/yal may be from part of 寇, but the top of qun may have an entirely different origin.

I wonder how many learners of Chinese perceive pseudoradicals. If we knew little about sinography, we might think that 鹿 'deer' is derived from the radicals 广 'roof' and 比 'compare'. But in fact 鹿 'deer' is itself a radical. It is a modernized drawing of a deer that cannot be broken into further meaningful parts. What currently appear to be 广 'roof' and 比 'compare' were once depictions of antlers and legs.

David Boxenhorn sent me a list of the frequencies of radicals in his Tangut Search tool. I have already discussed the most common radical

'person' (as an independent tangraph, but not necessarily as a component)

alphacode: dex

in "How Many People Are in the Tangut Script?" David found that the second most common radical is

alphacode: bae

which appears in 585* tangraphs - roughly 1 out of 10 of the ~6,000 tangraphs in common use.

Here are three cases in which I think bae is a pseudoradical.


1lew 'one'

alphacode: baadexbae

seems to function as a single indivisible unit and in fact is treated as a single radical 'jas' in 8 other tangraphs: e.g.,

1ʃʌ (a measure word; a surname)

alphacode: cirjas

I long suspected a connection between baadexbae 'one' and Chinese 一 'one' (resembling the Tangut radical baa), but couldn't explain why dex 'person' and bae were added until tonight. Could the tangraph be very loosely based on Chinese 弌 'one' with bae beneath a horizontal line instead of a curved stroke intersecting a horizontal line?

Second, I don't think

1lɨəəʳ 'four'

alphacode: dexbaabaedexbae

is really a five-radical tangraph:

1. 'person' (dex)

2. horizontal line (baa)

3. vertical line (bae)

4. 'person' again (dex)

5. vertical line again (bae)
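The five-way division above is what one gets by cutting the alphacode into three-letter pieces. This is a minimal sketch in Python, assuming that every radical code in an alphacode is exactly three letters long; that is my inference from the examples in this post, not a documented rule of the search tool, and the function name split_alphacode is hypothetical.

```python
def split_alphacode(code, width=3):
    """Cut an alphacode into fixed-width radical codes (assumed 3 letters each)."""
    if len(code) % width:
        raise ValueError(f"{code!r} is not a multiple of {width} letters")
    return [code[i:i + width] for i in range(0, len(code), width)]


# 'four' (dexbaabaedexbae) and 'one' (baadexbae) from the text above.
four = split_alphacode("dexbaabaedexbae")
one = split_alphacode("baadexbae")

# How many times bae is counted in the alphacode for 'four'.
bae_in_four = four.count("bae")
```

A purely mechanical split like this counts bae twice in 'four', which shows how an index can register a component even where it may not be a real unit of the tangraph.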

Ever since I first started learning tangraphs in 1996, I thought that the right half was a distortion of Chinese 四 but with an open 'box' since tangraphy avoids closed boxes (Grinstead 1972: 57). The left half 'person' may have been added to

'four', 'six', 'nine'

by analogy with fraud-proof numeral sinographs with 'person' on the left:

伍 'five', 佰 'hundred', 仟 'thousand'

But note that the fraud-proof sinographs for 'four', 'six', and 'nine' are

肆 陸 玖

rather than

亻+四 亻+六 亻+九

And the tangraphs for 'five', 'hundred', and 'thousand' don't contain 'person' on the left or resemble the sinographs (though 'five' has a 亻-like element: alphacode bii):

Third, I suspect

2thie (transcription character for Tangut period northwestern Chinese *thje, *thjẽ)

alphacode: basdeobae

is a distortion of Tangut period northwestern Chinese 定 *thjẽ with the bottom left-hand 丿 turned into a 丨 (bae) on the right side.

Pseudoradicals have pedagogical value: e.g., if I were to teach someone the sinograph 鹿 'deer', and he already knew 广 'broad' (广 doubles as a simplified form of 廣 from 广 'roof' + phonetic 黃) and 比 'compare', I would tell him to think of it as 广 and 比 plus an unusual middle part. It would be too confusing to say that 鹿 'deer' looks like it contains 广 and 比, but it actually doesn't. Users don't have to understand a character's true structure to learn it.

The Tangraphic Sea 'analyses' may be pedagogical devices which may or may not match the actual structures or origins of tangraphs. Nowhere does the Tangraphic Sea state that tangraph X is derived from sinograph Y; instead, one may see a derivation of X from a tangraph XZ that must have been created after X: e.g.,


'four' = bottom of 'fourth son' + right of 1lɨəəʳ, second syllable of the surname məi-lɨəəʳ

In that derivation, X = 'four' and XZ = 'fourth son' (with Z being the 'horned hat' on top). It's highly unlikely that the tangraph for 'fourth son' was devised before the tangraph for 'four'. Moreover, the surname syllable is homophonous with 'four' and Tangraphic Sea states its right side is from the right side of 'four', though it resembles 1lew 'one':


1lɨəəʳ = left of the surname riuʳ (a clan related to the məi-lɨəəʳ?) + right of 1lɨəəʳ 'four'

So both 'sources' of 'four' in Tangraphic Sea are derivatives of 'four' and must postdate it.

How many instances of bae are 'true' as opposed to 'pseudo'? I have no idea. Here's one probable case of a 'true' bae:

1lạ 'hand; arm'

alphacode: baepik

has bae as a 'filler' since pik 'hand/arm', like many Tangut radicals, cannot stand alone. (The reason for the distinction between independent and dependent radicals is unknown.) pik is probably a distortion of Chinese 手 'hand' which has no vertical stroke on the left, so I presume bae was added to pik:

手 > >

bae is an abbreviation of baepik in other tangraphs: e.g.,


I regard such instances of bae as secondary radicals which do not indicate what bae originally meant, if anything.

I just realized that one Chinese source of bae might be 扌, the abbreviated left-hand form of 手 'hand'. Hence

may be a redundant compound of 扌 'hand' + 手 'hand'. Cf.

1məə 'fire'

alphacode: bosbeaqex

which is also a redundant compound of 'fire' (bea) + 'fire' (qex) beneath 'wood' (bos; why?). Oddly, the double 'fire' sans 'wood' is

1ɣɛ 'to cook'

which I doubt was created before 'fire'. The Tangraphic Sea derives it from the bottom left of 'fire' plus the left of 1ɣii, the second half of the reduplicated word 1ɣɛ-1ɣii 'to cook':


1ɣii has the 'fire' elements of 1ɣɛ reversed:


The C-shaped element of the first 'fire' element becomes a 3-shape on the right side of tangraphs and the 'tail' of the second 'fire' element is cut off when it would be in the middle of a tangraph.

*Li Fanwen 5997 doesn't contain bae in the Mojikyo font based on Li Fanwen 1997:

1lhə (meaning unknown; only found in Sofronov's index)

However, 5997 in Li Fanwen 2008 is a different tangraph read 1tʃwɨe that does contain bae. Unfortunately, it's not in the Mojikyo font.

