The actress Persis Khambatta (1948-1998) was from Bombay, so I assume her name is a romanization of Marathi. Yet her Marathi Wikipedia article is titled

पर्सिस खंभाता

<parsis khaṃbhātā>

with <bh> and only one <t>. Are those errors? The longer (!) Bengali Wikipedia article title is closer to the English spelling:

পার্সিস খামবাট্টা

<pārsis khāmbāṭṭā>. Ah, I think I know what's going on now. Googling in Hindi, I find her name spelled as

पर्सिस खंबाटा

<parsis khaṃbāṭā>.

The tt of the English spelling could be like the TT of the Unicode names for Indic retroflex characters: e.g., ट DEVANAGARI LETTER TTA for ṭa. The doubling represents retroflexion, not gemination.
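The Unicode naming convention can be checked directly. The following is a minimal sketch using Python's standard unicodedata module; the helper name indic_name is my own, and the choice of sample letters is only for illustration.

```python
import unicodedata


def indic_name(ch):
    """Return the official Unicode character name for a single character."""
    return unicodedata.name(ch)


# Devanagari retroflex stops and nasal, two dental stops for contrast,
# and the Bengali retroflex stop that appears in the Bengali spelling.
for ch in "टठडढण" + "तद" + "ট":
    print(ch, f"U+{ord(ch):04X}", indic_name(ch))
```

In the names printed, the retroflex letters have doubled consonants (TTA, TTHA, DDA, DDHA, NNA) while the dentals have single ones (TA, DA), which is the pattern the English spelling Khambatta may be following.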

The Bengali spelling may be based on English. It is not a letter-for-letter transliteration of the Hindi spelling. The long <ā> [a] is needed to transcribe English e and a since short <a> [ɔ] is not a good phonetic match. The Bengali <ṭṭ> reflects

- how English /t/ sounds like retroflex to Indic language speakers

- the assumption that English -tt- represented an original geminate rather than a single retroflex stop

HEAVENLY WOMAN, EARTHLY MAN?

The Tangraphic Sea defined the first graph of

1ɣʊ 2lõ 'Ghulon' (son of the father of the Black-Headed Tangut)

as 'among the sons of

2lɨə̣ 1bə 'Lyby'

which Li Fanwen (2008: 452) glossed phonetically as 勒胡 (Mandarin Lehu) even though 2lɨə̣ means 'earth'. Why was 1bə glossed as 胡 hu? It has no phonetic or semantic resemblance to 胡 hu < *ɣo 'dewlap, why, foreign' which might have been borrowed into Tangut as

1ɣʊ 'northern barbarians' (gloss from Grinstead 1972: 116).

I suspect that 胡 hu was originally meant to be a phonetic gloss for 1ɣʊ.

Li Fanwen (2008) phonetically glossed 'Lyby' elsewhere as 勒博 Md Lebo (p. 53), 勒卜 Md Lebu (p. 365), and 勒泊 Md Lebo (p. 795).

Shi et al. (1983: 407) translated 2lɨə̣ 1bə as 先人 'ancestor' in the Tangraphic Sea definition of 1ɣʊ.

Kychanov and Arakawa (2006: 188) similarly defined 2lɨə̣ 1bə as 'name of an ancestor' and as 'name of a son of Father of Blackheaded' in their definition for 1ɣʊ 2lõ on p. 546 (hence my gloss above).

A note in the D text of the Homophones equates 2lɨə̣ 1bə with the father of the black-headed:

1ɣʊ 1nɨaa 1ʔie 1vɨa 'head black -'s father'

This phrase also appears in the Tangraphic Sea definition for 1bə.

Does 'earth' imply 'black'? The other type of Tangut was 'red-faced'; was their ancestor the

1mə 1miẹ 'heaven woman'

paired with Lyby in a Tangut proverb (Li Fanwen 2008: 795)? Was the 'heaven woman' the wife of Lyby, whose second syllable might mean something like 'man' and be from *N-pə, a cognate to

1və < *Cɯ-Pə 'husband'

1vɨa < *Cɯ-Pa 'father'

(the unknown *C may be the unknown nasal in *N-pə)

and 夫 Old Chinese *Cɯ-pa 'man'?

However, according to Kepping (2003: 126), in Tangut odes

the black-headed were connected with the Heaven, while the red-faced were associated with the Earth

which is precisely the opposite of what is implied by the association of 'earth father' with the black-headed. Could the earth father be the ancestor of the red-faced rather than the black-headed? Kychanov and Arakawa (2006: 554) list a surname combining 1bə with 2nie, a near-homophone of 1nie 'red':

1bə 2nie 'Byne' ('red man'?)

but 2nie also appears as the first syllable of the surname

2nie 2dʊ 'Nedu'

so it may not necessarily always be an adjective which should normally appear as a second syllable following the noun it modifies.

Kychanov and Arakawa (2006: 188) also list what looks like a longer name for the 'earth father':

2lɨə̣ 1bə 2lhie 2lõ  'Lyby (earth father) Lhelon'

The tangraph for 2lhie apparently exists solely to transcribe the first half of the name 'Lhelon' (which can also be a surname according to Kychanov and Arakawa 2006: 744), whereas 2lõ can also appear in the Tangut surname

2ʔiə 2lõ 'Ylon'

with a common first element (prefix?)* and in what Li Fanwen (2008: 365) considered to be a transcription of Chinese 陵** in a Tangut equivalent of 丘陵 'hills and mounds'

2lõ 2lõ (the first 2lõ means 'hill').

There is a phonetic problem: 陵 was pronounced *lĩ in Tangut period northwestern Chinese, not *lõ. I suspect that 2lõ 2lõ may be a native reduplication or a compound 'hill-hill', though there is one problem: the two 2lõ belong to different homophone groups in Homophones: IX 47 and IX 61. Sofronov (1968: 361) reconstructed them with different initials as 2lwon and 2ldwon. 2ldwon is even further from Chinese 陵 *lĩ than 2lõ. Perhaps 2ldwon is from a prefixed derivative of 2lwon.

2lhie < ?k-leH is close to

2liẹ < *s-leH 'great'

which can precede nouns unlike normal Tangut adjectives (but like its Chinese equivalent - and cognate? - 大***). Could 'Lhelon' have once been 'great hill'?

10.13.22:22: Could 2lhie be cognate with

1lha 'sage, god'

which could be a loan from Tibetan lha 'god'? 2lhie could even be a compression of Tibetan lha-Hi 'god's'. Then 'Lhelon' would have once been 'divine hill'.

*10.13.2:31: Li Fanwen (2008: 365) actually has the spelling

1tiọ 2lõ 'Tolon' with 1tiọ 'to ferment'

but his phonetic gloss 夷龍 Md Yilong implies the spelling

2ʔiə 2lõ 'Ylon' with the surname first element or prefix 2ʔiə

which appears on p. 782.

**10.13.20:48: Li Fanwen (2008: 365) defined 2lõ as 'a transliteration' phonetically equivalent to Chinese 龍 and 陵.

***10.13.20:49: 大 Mandarin da is from Old Chinese *lats. The pre-Tangut *le-words could be from an even earlier *lats with a presyllable that conditioned the raising of *a to *e.

ANCESTRAL HILL

The name

1ɣʊ 2lõ 'Ghulon, son of the father of the Black-Headed Tangut'

from the last entry could be a noun-adjective construction: 'the 2lõ head' with a special spelling of

1ɣʊ 'head'.

But is there an adjective 2lõ that also had a special spelling as part of the name Ghulon? Possible candidates (with their locations and Sofronov homophone groups in Homophones) are

49A31 IX 47 2lõ 'wide'

49B45 IX 61 2lõ 'straight, correct'

49B46 IX 61 2lõ 'wide'

I don't know why Homophones has two homophone groups for 2lõ: IX 47 and IX 61. Should these groups be reconstructed differently: e.g., with different initials as 2ldõ and 2lõ (cf. Sofronov's 2ldwon and 2lwon)? I'm still reluctant to reconstruct ld- for reasons I'll discuss later. The second half of 'Ghulon' belongs to the second homophone group (IX 61), so I doubt it is another spelling of 49A31 in the first group (IX 47).

I also thought 'Ghulon' could be a noun-noun construction with 1ɣʊ 'head' modifying

49A25 IX 47 2lõ 'relatives'

49A26 IX 47 2lõ 'origin'

but both of these words belong to the first 2lõ homophone group, not the second like the '-lon' of 'Ghulon'. A noun of the second group that might work is

49B51 IX 61 2lõ 'slope, hillock'

if 'Ghulon' were originally a toponym: '(person from) Head Hillock'.

Any of the above interpretations casts Li Fanwen's (2008: 259) definition of '-lon' as 'ancestor, forefather' into doubt. Are there any attestations of '-lon' as an independent word without 'Ghu-' preceding it?

I wish there were a searchable database of Tangut texts - a Scripta Tangutica like Scripta Sinica - which could help determine whether tangraphs represented monosyllabic words or halves of disyllabic words. I have yet to see a Tangut dictionary that has definitions like 'first/second half of ...'

No Tangraphic Sea analysis of '-lon' is known. I imagine that the analysis in the lost second volume of Tangraphic Sea was something like


2lõ '-lon' = top of 2dziụ 'forefathers, ancestry' (semantic) + left of 2lõ 'slope, hillock' (phonetic)

The shared top element of '-lon' and 2dziụ (Boxenhorn code: caibox) could mean 'ancestor' and be derived from Chinese 宗 'ancestor'. However, only one out of the three other caibox tangraphs


1dã 'Dan'* = left of 2lɨə̣ 'land' + bottom of 2lõ '-lon'

might be relevant to ancestry.


1dã 'to kick' = left and top right of 1dã 'Dan' + right of 2dzəuʳ 'to kick'

contains 'Dan' as a phonetic. And I have no idea what caibox is doing in


1zõʳ 'sexual desire, lust' (caiboxfixhax) =

top of 1dʐɛ 'lascivious' (baxboxdexbaltoa)

left of 2mie 'not yet' (fixtun)

bottom left of 2bəiʳ 'to meet' (bamhaxtuncin)

whose analysis has bax rather than cai.

One might expect the bottom element of '-lon'

(Boxenhorn code: fei; Nishida: 'steep')

to be phonetic in other lõ-tangraphs besides 'slope', but none of the twenty other fei-tangraphs were pronounced lõ.

Lastly, one might expect 'slope' to be from 'steep' plus 'earth', but its right side is the extremely common and hence uninformative

(Boxenhorn code: dex; Nishida: 'person')

which appears in one out of five tangraphs.

*10.12.19:54: 'Dan' is an onomagraph for writing names or parts of names. By itself it represents a place name 'Dan'. (I wonder where that was.) It also forms the first halves of the surnames

1dã-2lhə̣i 'Danlhi'

1dã-1ka 'Danka'

(Kychanov and Arakawa 2006: 162). Can 'Dan' also be the second syllable of a surname? A reverse index of Tangut would be handy.

TRACING THE LINE-AGE OF THE HEAD OF THE FAMILY

Both Khitan scripts have sets of characters that are identical except for the presence or absence of one or two strokes: e.g.,

Large script (none of the readings are known to me)

Small script

<a> <?>

<ɣo> <?> <?>

In some cases, the significance of the strokes is known; in others, the relationship between the members of a pair may be doubtful: e.g.,

Large script

<is> 'nine' and <ɣo>

Small script

<pu> and <fu> (f was a Chinese consonant not native to Khitan)

'one' (feminine) and 'one' (masculine)

<w> and <ong>

This method of derivation must have been known to the inventor(s?) of the Tangut script who might have belonged to the Khitan Yelü clan. I have recently mentioned two pairs of related Tangut characters with and without extra strokes:

1gwi and 2gwi, both 'word, sentence, language'

2tʂhɨi and 1tʂhɨi, both 'base, origin'

The second character for 'head' from my last two posts also has a lookalike (and homophone) with an additional stroke:

1ɣʊ 'head' and 1ɣʊ 'the surname Ghu'

Are these strokes truly arbitrary additions, or are Tangraphic Sea analyses like


1ɣʊ 'the surname Ghu' = the left of 1sị 'pure' + the center and right of 1ɣʊ 'head'

correct? Why was the horizontal stroke of 'pure' atop the vertical bar omitted in 'Ghu'? And why is 'pure head' in an un-Tangut adjective-noun order?

1ɣʊ 2lõ

is the name of an ancestor of the Tangut people, the son of the father of the Black-Headed (Kychanov 2006: 546). 2lõ means 'ancestor, forefather', so the name could mean 'head ancestor'* and the first character could have been retroactively derived from 'pure' as an honorific component distinguishing it from 'head'.

Is the absence of derivations in the Tangraphic Sea of the type 'all of X plus/minus stroke Y' due to a decision to derive all tangraphs from other tangraphs regardless of their true derivations?

*10.11.1:54: The word order 'head ancestor' is acceptable because 'head' is a noun, and nouns precede the nouns they modify, whereas adjectives follow them.

DOUBLE HEADER 2: ANOTHER *KL-USTER?

It occurred to me this morning that it might be possible to derive both Tangut words for 'head'

2lɨụ < *SluH  and 1ɣʊ

from a common root √lu. I've proposed that *kl- was the source of Tangut lh-*, but perhaps Tangut has different reflexes for early and late *kl-, and the latter merged with *kr-: e.g.,

Stage 1 2 3 4 Tangut
Early *kl- *kVl- *kl- *lh-
Late *kl- *kVl- *kl- *kr- k- + Grade II vowel
Late *kr-** *kVr- *kr-

1ɣʊ could be from an earlier *Cʌ-kʌ-lu:

*Cʌ-kʌ-lu > *Cʌ-klu > *Cʌ-kru > *Cʌ-kʊ (with Grade II vowel) > *Cʌ-gʊ > *Cʌ-ɣʊ > 1ɣʊ
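The chain above can be replayed mechanically as an ordered list of string rewrites. This is only a sketch: the step labels are my own glosses of each arrow rather than established rule names, the asterisks are omitted, and writing the tone as a prefixed 1 simply follows the notation used here.

```python
# Each step is (label, old substring, new substring).
# The labels are informal glosses, not claims about the actual mechanisms.
STEPS = [
    ("presyllable vowel loss", "kʌ-l", "kl"),
    ("late kl- merges with kr-", "kl", "kr"),
    ("medial -r- lost, leaving a Grade II vowel", "kru", "kʊ"),
    ("voicing after the presyllable", "-k", "-g"),
    ("lenition of g to ɣ", "-g", "-ɣ"),
    ("presyllable loss; first tone written as a prefix", "Cʌ-", "1"),
]


def derive(form, steps=STEPS):
    """Apply each rewrite once, in order, and return every intermediate stage."""
    stages = [form]
    for label, old, new in steps:
        if old not in form:
            raise ValueError(f"step {label!r} does not apply to {form!r}")
        form = form.replace(old, new, 1)
        stages.append(form)
    return stages


print(" > ".join(derive("Cʌ-kʌ-lu")))
```

Because the steps are ordered, moving the kl- to kr- merger before the vowel loss would make the chain fail, which is one way to see that the proposal depends on relative chronology.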

In any case, I have yet to work out the development of *Cl-clusters in Tangut. I would not be surprised if medial *-l- sometimes conditioned Grade II, as the large number of Grade II syllables would otherwise imply that pre-Tangut had a lot of *Cr-clusters and initial uvulars.

10.10.1:11: If Pre-Tangut *SluH 'head' is cognate to Old Chinese *hluʔ 'head', then *-H (most likely *-ʔ) may be part of the root, and the lack of a final glottal in *Cʌ-kʌ-lu would be irregular.

*10.10.1:54: I think Tangut lh- could be from *kl- because it corresponds to the following gDong-brgyad rGyalrong (gDrG) presyllable and initial sequences (Jacques 2006):



kɤ-ɣɤ-j- < *-lj-


tɯ-ɣl- < *-kl-

Note, however, that Tangut lh- can also correspond to gDrG


tɯ-d- < *-tl-


tɯ-pj- (not < *-plj- which became -βɟ-)

zɣ- (not < *zl-?)

which do not contain any Kl-type sequences. I am skeptical about the last two correspondences since I don't think they would have *-l- in Guillaume Jacques' 2004 Proto-rGyalrong reconstruction.

**10.10.2:00: Could Pre-Tangut early *kr- have fused into an aspirate kh-? Pittayaporn (2009) proposed Proto-Tai *-r- as a source of aspiration in Tai.

*DO-UBLE HEADER

While looking at Gong's (1983: 168-169) list of tangraphs (Tangut characters) with the top element

that he glossed as 'above', it occurred to me that

2lɨụ < *SluH  'head'

could have two possible etymologies.

The first is straightforward:

*SluH (cf. Old Chinese 首 *hluʔ 'head' which may be from *sluʔ; more cognates at STEDT)

*S- conditioned the tenseness of the vowel symbolized by a subscript dot:

*SCV > *C̣V > *C̣Ṿ > *CṾ

*u broke to *ɨu after *l.

*-H conditioned the second 'rising' tone.
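The three conditioning factors of this first etymology can be put together in a short sketch. The function name derive_head is hypothetical, the asterisk is omitted from the input, and two simplifications are my own assumptions: a form without *-H is given the first tone, and the input is assumed to end in its vowel once *-H is removed.

```python
import unicodedata

DOT_BELOW = "\u0323"  # combining dot below, marking a tense vowel


def derive_head(pre_form):
    """Derive a Tangut form from a pre-Tangut form such as 'SluH'."""
    tense = pre_form.startswith("S")      # *S- conditions tenseness
    if tense:
        pre_form = pre_form[1:]
    rising = pre_form.endswith("H")       # *-H conditions the second tone
    if rising:
        pre_form = pre_form[:-1]
    form = pre_form.replace("lu", "lɨu")  # *u breaks to *ɨu after *l
    if tense:
        form += DOT_BELOW                 # dot goes under the final vowel
    tone = "2" if rising else "1"         # assumption: no *-H means first tone
    return unicodedata.normalize("NFC", tone + form)


print(derive_head("SluH"))  # 2lɨụ
```

The NFC normalization step only ensures that the vowel plus combining dot is stored as the single precomposed character ụ, so the output compares equal to the form as it is normally typed.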

The second entails the lenition of a dental stop *-T- to -l- after a presyllable *Sɯ- that later reduced to *S-:

*Sɯ-TUH (cf. Late Old Chinese 頭 *do 'head' < 豆 *dos 'ritual vase' [Sagart 1999: 156] and these two sets of soundalike forms at STEDT)

*U stands for an *u or an *o. Tangut -u rhymes may partly come from *-o rhymes, judging from this correspondence in Gong (1995: 82; the Old Chinese reconstruction is mine):

Tangut 2niu < *nuH < ?*noH 'to suck the breast' :

Old Chinese 乳 *noʔ 'nipple, milk, suckle'

The other common Tangut word for 'head'* could also have had a *o in pre-Tangut:

1ɣʊ < *Cʌ-QU or *Cʌ-KrU

It may be cognate to Written Tibetan mgo 'head' and Old Chinese 后 *goʔ < *ɢoʔ 'ruler' (< 'head'?).

*The two words for 'head' are paired in Homophones: 2lɨụ is the clarifier for 1ɣʊ and vice versa (47A66 and 41B21):

1ɣʊ 2lɨụ

1ɣʊ 2lɨụ

Yet the tangraphs for the two words do not share a single component even though the Tangut script is thought to be primarily semantic.

STUMPED AT SECOND BASE

Last night I proposed the graphic etymology


2ʔiẽ 'sock' = 2tʂhɨi 'base' + 2lə 'to cover'

But perhaps the actual etymology in the lost second volume of Tangraphic Sea was


2ʔiẽ 'sock' = 1tʂhɨi 'base' + 2lə 'to cover'

with a slightly different, nearly homophonous first source character.

What is the difference between the two 'bases'?

Tangut is full of what have been considered synonyms. This is not merely speculation by modern scholars. The Tangraphic Sea dictionary has many chains of definitions like

'X means Y' and 'Y means X'

In some cases, the synonyms are clearly unrelated words, and may even belong to different registers or even be from different languages: e.g.,

2lhị < *Si-kla-H 'moon' (cognates; I doubt Proto-Northern Naga *liːt 'star, moon' is related - is the inclusion of ɸui⁵⁵ zɑ⁵⁵ 'rain' from the non-Naga Qiangic [like Tangut!] language Shixing an error?)

1ka 1ʔo 'moon' (in the 'ritual language' - a substrate language or a myth as Andrew West has argued?; no known cognates in any case)

But in this case, the two words for 'base' are surely related. If the second 'tone' is from a pre-Tangut *-H, then perhaps 2tʂhɨi is from 1tʂhɨi plus a suffix. However, analogical tonal derivation cannot be ruled out: given a different pair of words with different tones

2Z : 1Z

1tʂhɨi could have been created by analogy with 1Z.

Gong's "Phonological Alternations in Tangut" (1988: 825-830) lists many pairs of Tangut nouns that are synonyms differing only in tone including this pair (which he defined as 'root, origin'): e.g.,


1gwi 'word, sentence, language' = 2gwi 'id.' + 2dạ 'speech, word' (see below for another derivation*)

2gwi 'word, sentence, language' (graphic analysis unknown)

Perhaps one or more of these pairs contained first tone words that were the models for 1tʂhɨi.

That's all I can say about the phonetic difference between the two tʂhɨi for now. But what is the semantic difference? Here are their definitions from Kychanov 2006 and Li Fanwen 2008:

Kychanov 2006, 1tʂhɨi:
счет, считать, документы, записи, Канон
count, canon, documents, list
計算, 經典, 根

Kychanov 2006, 2tʂhɨi:
корень, основа, книга, капитал, ссуда, транскрипционный знак
root, basic, book, capital, loan, transcription character
本,根典, 經, 書, 音譯字

Li Fanwen 2008, 1tʂhɨi:
base, origin

Li Fanwen 2008, 2tʂhɨi:
base, origin
根, 本, 典

There is overlap between the two, as one would expect from the Tangraphic Sea definition of 1tʂhɨi which contains 2tʂhɨi:

2məʳ-2tʂhɨi 1nəị-1tshiee 1se-2bii

lit. 'origin-base speak-speak count count'

'to speak and calculate reason' (which makes no sense to me) or a list of three meanings: 'reason, to give a speech, to calculate'?

Still, they are not interchangeable. 2tʂhɨi occurs in polysyllabic words such as

2tʂhɨi 2ʂɨu 'foundation, base, bottom'

but Li lists only one polysyllabic word with 1tʂhɨi

1tʂhɨi 1kiẹ 'foundation' (lit. 'root stalk')

and no examples of 1tʂhɨi outside a dictionary, and Kychanov lists no polysyllabic compounds beginning with 1tʂhɨi. (Unfortunately I have no way of searching Kychanov 2006 for polysyllabic compounds with 1tʂhɨi in noninitial position.) Nevsky (1960 II: 358) only gives examples of 1tʂhɨi as a transcription character:

for Tangut period northwestern Chinese 擲 *tʂhɨi in 七佛八菩薩所說陀羅尼神咒經

for Tangut period northwestern Chinese 絺 *tʂhɨi in 妙法蓮華經

(10.8.1:34: Could these two instances have been errors for the similar-looking 2tʂhɨi?)

I conclude that 2tʂhɨi is the more basic word, and that 1tʂhɨi was derived from it by analogy (either through *-H deletion before tonogenesis or through tone alternation after tonogenesis). Two strokes could have been added to the tangraph for 2tʂhɨi to create the tangraph for 1tʂhɨi:


That seems more likely to me than the Tangraphic Sea analysis of 1tʂhɨi:


1tʂhɨi 'base' = 2tʂhɨi 'id.' + 1ʔiəʳ 'to ask'

There is nothing in the definitions of 1tʂhɨi 'base' that has anything to do with asking.

10.8.1:36: Conversely, could the tangraph for 1gwi 'word, sentence, language' be the tangraph for 2gwi minus a vertical stroke?


2gwi is more basic since it appears outside dictionaries (like 2tʂhɨi) whereas 1gwi does not (cf. the rarity of 1tʂhɨi outside dictionaries). How many other cases are there like these in which the second tone word is common and the first tone word is not - and in which their tangraphs are identical except for the presence or absence of a stroke or two?

