David Boxenhorn pointed out an obvious problem with my glides-to-glottals proposal: if it were correct, Classical Arabic would not have roots with initial w- and y-*, unless those glides came from other sources and the following chain shifts occurred:

(preclassical source of w) > w > classical ʔ

(preclassical source of y) > y > classical ʔ

But that is not the case, since Arabic w and y are retentions from Proto-Afroasiatic. (See Ehret 1995: 453-478 for a list of PAA roots with initial *w- and *y- which have Arabic descendants with initial w- and y-.)

David suggested that the solution might involve inflectional patterns. Now I wonder if w- and y- in Hassaniya reflect earlier prefixes

*w(V)-ʔ- > w-

*y(V)-ʔ- > y-

and/or vowels which were reanalyzed as root initials:

*ʔu- > w-

*ʔi- > y-

In any case, I still don't think glottal stops could become glides.

*E.g., w-z-r 'carry' (as in vizier [via Turkish - hence the initial v-]) and y-m-n 'right/south' (as in Benjamin ['southern son'?] and Yemen).

Until tonight, I had assumed that vizier was of Romance origin and had something to do with seeing (cf. advisor). Even though the term referred to officials in premodern Muslim countries, I didn't assume that it was of Middle Eastern origin. After all, the word mandarin isn't of Chinese origin, even though it refers to premodern Chinese officials and to a Chinese language.

Sanskrit uses one word for 'right' and 'south' (dakṣiṇa). This word is the source of the first name of former Thai prime minister ทักษิณ ชินวัตร Thaksin Shinawatra.

The Indic component of the Thai lexicon was borrowed before the devoicing and aspiration of original voiced obstruents: *d, *j > th, ch (romanized as Sh). These secondary voiceless aspirates are still distinguished from their primary counterparts in spelling:

Letter | Earlier Thai | Modern Thai

The original voicing status of a consonant has also left traces in tones: e.g., the ทัก Thak- < *dak- of ทักษิณ Thaksin has a high tone, but the native word ถัก thak < *thak 'to braid' has a low tone.

The name Thai itself contains a devoiced and aspirated obstruent; it was once *day. (But it's spelled as if it were from an Indic Daiya: ไทย. Other native Thai words ending in -ay are spelled with ไ- or ใ- [if the -ay is from *-aɰ] without a following -ย. Were other words with ay also written with ไ-ย, or was the name Thai singled out for Indicization?)

ชินวัตร Shinawatra (which could be transliterated in an Indic fashion as jinavatra) looks like

Pali and Skt ชิน jina 'victorious' +

Thai วัตร wat 'practice' spelled as if it were from an Indic vatra; this spelling appears to be a compromise between Pali วตฺต vatta and Skt วฺฤตฺต vṛtta 'practice' (lit. 'turned' > '[what] is done'; cf. Eng turn out for 'occur').

The Skt root vṛt 'turn' is cognate to English -ward, worth, weird and various Latinate words with vers and vert. More English members of this word family are listed here.

(23:59: Why does Proto-Indo-European *t correspond to English -d in -ward and weird instead of the expected -th which does appear in worth?)

FROM GLOTTALS TO GLIDES?

According to the Wikipedia article on Hassaniya Arabic,

classical [Arabic] /ʔ/ has in most contexts disappeared or turned into /w/ or /j/ (/ahl/ 'family' instead of /ʔahl/, /wak.kad/ 'insist' instead of /ʔak.kad/ and /jaː.məs/ 'yesterday' instead of /ʔams/).

A shift of ʔ to j and w would be very strange. ʔu to wu and ʔi to ji or even initial glides arising from long-distance assimilation to some other later segment (see my Romani hypothesis below) would be understandable, but there's nothing in ʔakkad that is labial like w, and there is nothing in ʔams that is palatal like j. I wonder if initial glides in Hassaniya are retentions rather than innovations:

preclassical Arabic *w-k-d > classical ʔ-k-d but Hassaniya w-k-d

preclassical Arabic *j-m-s > classical ʔ-m-s but Hassaniya j-m-s


preclassical Arabic *ʔ-h-l > classical and Hassaniya ʔ-h-l

Such an explanation probably cannot account for 'Romani' (variety unspecified; source) jag 'fire'* corresponding to Sanskrit agnis 'id.' rather than a hypothetical Sanskrit *yagni. Pokorny reconstructed 'fire' in Proto-Indo-European as *egnis ~ *ognis (for *o- cf. Russian огонь). One could try to propose that

PIE *egnis > *jegnis > Skt agnis but Romani jag

but PIE *je- became Skt ya-, not a-: e.g., 'barley': PIE *jewo- (Pokorny again) > Skt yavas. So I wonder if Romani j- reflects the i that had once been in 'fire': *agni > *jagni > *jag? I would have to find other examples of Skt a...i : Romani ja... to confirm this.

This is just one of several aspects of Romani that struck me as strange from a Sanskrit perspective. Romani has phonemes that don't look very Indo-Aryan: i.e.,

/ts/, /z/, /ʒ/, /f/ (distinct from the expected /tʃ/, /dʒ/, /ph/, which also exist in Romani)

/ts/ looks Slavic

could the rest be from Slavic, Greek, and/or Iranian?

Are such phonemes from loanwords (as in Hindi) or have they also developed in native words? /x/ is in the presumably native word oxto 'eight', so I assume ʂʈ > x (cf. Skt aṣṭa 'id.'). (I assume the Middle Indo-Aryan ancestor of Romani did not simplify ʂʈ to ʈʈh like Pali [aṭṭha 'eight']. A shift of ʈʈh to x is improbable.)

Back to the problem of mysterious initial glides: it seems that Russian (and Polish and Czech) have a lot of je and very little, if any, native e-. e- is mostly in loanwords. I suppose all the je- originated from an earlier *e-, but where did glideless e- in Russian

etot 'this' (cf. tot 'that')

etakij 'this' (cf. takoj 'such')

ekij 'what (a) ...'

(are there any other native e-words of note?)

come from? Is this e- reduced from an earlier je-, even though it is a stressed vowel in all of the above words?

In Russian, e- and je- are written as э- and е-. Why are the values of э and е 'reversed' in Ukrainian?

DHAMPIR

looked like an invented word to me*, but it's real. Wikipedia doesn't mention what language it is from, so I don't know how the dh- is pronounced. I presume dhampir is from a language spoken in the Balkans. dh- is used in Albanian to represent [ð]**, but I doubt the word is Albanian, since the word must be cognate to vampire which is of Slavic origin. The only other possibility I could think of was Romani, but nothing in Wikipedia's Romani writing systems article indicates that any Romani language has retained Indic dh (or gh, jh, ḍh, bh). So I can only guess that dh- could be a remnant of some earlier orthographic convention: cf. the th that appears in Hungarian names such as Németh (pronounced like német 'German').

There is an American TV reporter named Laurie Dhue. The 1990 US Census found several other dh-surnames, some of which look European: e.g., Dhillon. What sort of name is Dhue, and why was [d] (I presume) written as dh? (Similarly, what is the origin of th in German names and obsolete German spellings like thun and Thür for tun and Tür?)

*I first encountered the term in Japanese (danpiiru) in Vampire Hunter D. I had assumed that Kikuchi Hideyuki had made it up.

**Albanian dh [ð] is from Proto-Indo-European *d(h) in intervocalic position and *g(h) (Beekes 1995: 261). Perhaps PIE *g(h) weakened to pre-Albanian which merged with pre-Albanian *ð.

HOSTILE OBSTACLE

In "Spiritual Love", I sided with Sofronov by interpreting the first tangraph of the fanqie for TT0524 EAT dzji R10 1.10 as

TT2010 愛 LOVE dzju R2 1.2

though it was not an exact match. Looking at the tangraph from Mixed Categories of the Tangraphic Sea again, I now think it is an accidental blend of TT2010 with the bottom left component (SMALL/FINE) of its homophone

TT1773 ('father's maternal uncle'?*) dzju R2 1.2

The homophone group containing LOVE and TT1773 has a fanqie with a nearly illegible first tangraph. To Sofronov (1968 II: 55) it looked like

TT4882 SUPPLY/(Sanskrit -j-) dzjɨ R31 1.30

Shi et al. (1983: 315) identified it as a tangraph similar to

TT4878 OBSTRUCT njwij R37 1.36

but with two strokes 二 atop the bottom left element 刂.

The ILCAA Tangut database identified it as

TT4856 ENEMY za

which bears very little resemblance to it.

The ILCAA database also oddly identified the nearly illegible fanqie initial speller of

TT4394 LOVE dzu R1 1.1

probably the root of TT2010 愛 LOVE dzju R2 1.2 above


TT4878 OBSTRUCT njwij R37 1.36

with n- though Sofronov (1968 II: 55) and Shi et al. (1983: 319) identified it as

TT4882 SUPPLY/(Sanskrit -j-) dzjɨ R31 1.30

the variety of Sanskrit known to Tibetans and Chinese (and presumably also the Tangut) had [dz] for j

with dz-. The bottom of the tangraph matches TT4882.

I am now fairly sure that TT4882 was the initial speller for both because TT4882 is used in other fanqie spellings whereas TT4856 and TT4878 are not.

The proper identification of fanqie tangraphs is important because an incorrect initial can 'contaminate' entire fanqie 'chains'. If a fanqie tangraph A with initial x- is misread as B with initial y-, then all tangraphs with fanqie starting with A (e.g., C, D, E ...) will be erroneously reconstructed with y-. Moreover, if tangraphs (C, D, E ...) are fanqie initial spellers for other tangraphs (e.g., F, G, H ...), then (F, G, H ...) will be erroneously reconstructed with y-.
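The spread of such an error can be sketched as a simple graph traversal. This is only an illustration: the table below is hypothetical, and the letters A-H (plus an unrelated X) are the placeholders from the paragraph above, not real tangraphs.

```python
from collections import defaultdict, deque

# Hypothetical fanqie table: tangraph -> its initial speller.
# The letters are placeholders, not real tangraphs.
initial_speller = {
    "C": "A", "D": "A", "E": "A",   # C, D, E are spelled with A
    "F": "C", "G": "D", "H": "E",   # F, G, H are spelled with C, D, E
    "X": "B",                       # an unrelated chain
}

def contaminated(misread, table):
    """Return every tangraph whose reconstructed initial depends on `misread`."""
    # Invert the table: speller -> tangraphs that use it as initial speller.
    spelled_by = defaultdict(list)
    for tangraph, speller in table.items():
        spelled_by[speller].append(tangraph)

    # Breadth-first walk down the fanqie chains starting from the misread speller.
    seen, queue = set(), deque([misread])
    while queue:
        current = queue.popleft()
        for dependent in spelled_by[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

print(sorted(contaminated("A", initial_speller)))
```

With this toy table, misreading A affects C through H, while the chain headed by B is untouched.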

I am fairly certain that the

*The Tangraphic Sea definition of TT1773 might be translated as:

TT1773 is what the mjuu of the father of the daughter's dwelling place is called.

Guillaume Jacques interpreted mjuu as a term used by female speakers to refer to

1. their maternal uncles

2. sons of their maternal uncles

3. their own sons

Does 'daughter's dwelling place' refer to a son-in-law's family? Was TT1773 a female speaker's term for 'maternal uncle of the son-in-law's father'?

SPIRITUAL LOVE IS THE SOLUTION

In "Still Eating My Words", I was puzzled by this fanqie


TT0524 EAT dzji R10 1.10 =

TT1997 姻 MARRIAGE njạ R67 1.64 (with n-!) +

TT4879 MANY ʔji R10 1.10

until I realized that changing one stroke would reconcile the conflicting initials:


TT0524 EAT dzji R10 1.10 =

TT2010 愛 LOVE dzju R2 1.2 +

TT4879 MANY ʔji R10 1.10

The first fanqie was from the ILCAA Tangut database and Shi et al.'s (1983: 319) modern handwritten copy of Mixed Categories of the Tangraphic Sea. The second was from Sofronov (1968 II: 55). The actual fanqie in Mixed Categories of the Tangraphic Sea has a partly illegible tangraph which looks like LOVE but has a bottom left element resembling 女. I do not know if this is an error for LOVE or an entirely different tangraph.
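The two competing fanqie can be compared mechanically. The following is a minimal sketch, relying on the usual fanqie convention that the first speller supplies the initial and the second supplies everything else. The readings are the ones quoted above; the split of each reading into initial and final (with -j- counted as part of the final) is an assumption made for illustration.

```python
from typing import NamedTuple

class Reading(NamedTuple):
    initial: str  # e.g. "dz"
    final: str    # everything after the initial, e.g. "ji"
    rhyme: str    # rhyme number, e.g. "R10"
    tone: str     # tone.rhyme label, e.g. "1.10"

def fanqie(initial_speller: Reading, final_speller: Reading) -> Reading:
    """Take the initial from the first speller and the rest from the second."""
    return Reading(initial_speller.initial, final_speller.final,
                   final_speller.rhyme, final_speller.tone)

# Readings quoted in the text; the initial/final split is an assumption.
MARRIAGE = Reading("n", "jạ", "R67", "1.64")  # TT1997
LOVE = Reading("dz", "ju", "R2", "1.2")       # TT2010
MANY = Reading("ʔ", "ji", "R10", "1.10")      # TT4879

for name, speller in (("MARRIAGE", MARRIAGE), ("LOVE", LOVE)):
    result = fanqie(speller, MANY)
    print(name, "+ MANY =", result.initial + result.final, result.rhyme, result.tone)
```

With MARRIAGE as initial speller the result is nji R10 1.10; with LOVE it is dzji R10 1.10, the reading that matches the placement of EAT among dz-tangraphs.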

LOVE and MARRIAGE are almost identical twins, even though they don't sound alike:

TT2010 愛 LOVE dzju R2 1.2

TT1997 姻 MARRIAGE njạ R67 1.64

However, their graphic analyses in Tangraphic Sea are completely different:


TT2010 愛 LOVE dzju R2 1.2 =

top of TT2097 COVER gjii R1R 2.12

bottom of TT4394 LOVE dzu R1 1.1

obviously cognate to TT2010 愛 LOVE dzju R2 1.2


TT1997 姻 MARRIAGE njạ R67 1.64 =

right of TT5691 MARRIAGE ʔjɨ R31 1.30 +

right of TT3306 神 SPIRIT njạ R67 1.64

phonetic in TT1997 and probably cognate to Written Burmese nat 'spirit'

Matisoff (2003: 333) reconstructed a Proto-Tibeto-Burman word family

*na ~ *na-n ~ *na-t

without any medial corresponding to Gong's -j- (which Gong believed was a retention from Proto-Sino-Tibetan)

I still do not know the function of HORNED-HAT in

The elements below HORNED-HAT are

TT3306 神 SPIRIT njạ R67 1.64

TT3544 TO-MILK tswər R90 1.84

TT3301 HOLY ɕjɨj R43 2.37

SPIRIT appears to be a compound of

TT3301 HOLY ɕjɨj R43 2.37 +

TT3344 PERSON dzjwo R53 2.44

but it was analyzed as


TT3306 神 SPIRIT njạ R67 1.64 =

all of TT3301 HOLY ɕjɨj R43 2.37 +

PERSON < bottom [right] of TT0211 PROTECT werj R77 2.66

which looks like UP + SPIRIT

TO-MILK looks like PERSON + vertical line + PERSON, but it was analyzed as


TT3544 TO-MILK tswər R90 1.84 =

PERSON < left of TT3649 BREAST new R44 1.43 +

cognate to Old Chinese 乳 noʔ 'breast; milk'

left of TT1545 HAND/ARM lạ R66 1.63

cognate to Written Tibetan lag-pa 'hand/arm', Written Burmese lak 'hand/arm', Old Chinese 翼 lək 'wing'

PERSON < right of TT3405 BIG thọ R73 2.62

Nishida (1966: 440) regarded this as a loan from Chinese. This is possible if 大 'big' was *tho as well as/instead of the expected *tha in Tangut period NW Chinese - but why is the Tangut vowel tense?

1.17.0:09: Does the tense vowel reflect a pre-Tangut prefix added to a Chinese loanword: *C-tho > thọ?

STILL EATING MY WORDS: A MYSTERY OF MANY MARRIAGES

TT0524 EAT dzji R10 1.10 is strange for two reasons.

First, it and its two homophones are the only alveolar-initial tangraphs belonging to R10. Other alveolar initials (ts- tsh- s- z-) occur before R11.

Second, its fanqie in the Mixed Categories volume of the Tangraphic Sea has a strange initial speller in addition to an unexpected R10 speller:

TT0524 EAT dzji R10 1.10 =

TT1997 姻 MARRIAGE njạ R67 1.64 +

TT4879 MANY ʔji R10 1.10

Although the fanqie would seem to indicate nji R10 1.10 with n-, both Gong and Li Fanwen (1986: 362) reconstructed dz-. This is not a mistake. The homophone group containing EAT seems to be surrounded by dz-tangraphs in the alveolar section of the Mixed Categories volumes of Tangraphic Sea and Precious Rhymes of the Tangraphic Sea. (I haven't checked out every single tangraph in that section, but I would like to closely study PRTS in the future.) This implies that EAT etc. also had initial dz-. Moreover, EAT etc. are in chapter VI of Homophones for alveolar-initial tangraphs. If EAT etc. had initial n-, they would have been placed in chapter III of Homophones for dental-initial tangraphs. Unfortunately, I know of no transcription evidence for the EAT homophone group which could confirm its initial. The fanqie points in one direction and the placement of the tangraphs points in another. Which is correct?

Next: I just discovered a very simple solution for this problem. I'll reveal it in "Divine Love".

(And no, it doesn't involve reconstructing the initial of EAT as ndz-, which was Sofronov's [1968: 290] reconstruction.)

EATING MY WORDS ON R10 AND R11

Last night, I posted a chart showing that there were no alveolar-initial R10 tangraphs. Nishida (1964: 44), Gong ("The Phonological Reconstruction of Tangut", p. 21), and Li Fanwen (1986: 195) agreed that no such tangraphs existed. However, three such tangraphs do exist:

TT0524 EAT dzji R10 1.10 (acknowledged by Gong in "Phonological Alternations in Tangut", p. 801)

TT2536 SEVER dzji R10 1.10

1.15.00:09: a loan from Middle Chinese 截 *dzet 'cut off' (the 'Jeet' of Bruce Lee's 截拳道 Jeet Kune Do), despite the mismatched vowel (also cf. the following word)?

TT4871 齊 ?EQUAL dzji R10 1.10

a loan from Middle Chinese 齊 *dzej 'equal', despite the mismatched vowel?

looks like a borrowing from an unattested 'reversed type' MC *dzi from a nonemphatic Old Chinese *dzəj, unlike MC *dzej from emphatic OC *dzəj

They contrast with R11 dzji 1.11/2.10 tangraphs: e.g.,

TT4577 ERROR dzji R11 1.11

TT0503 FOOD dzji R11 2.10

The latter is presumably cognate to TT0524 EAT dzji R10 1.10 via rhyme and tonal alternation (or suffixation, if my hypothesis of Tangut tonogenesis is correct: *dzji-H > dzji 2.10).

The only other example of R10-R11 alternation that I know of is 'many':

TT4879 ʔji R10 1.10 ~ TT3866 ʔji ?[ʔɰi] R11 1.11

I missed these dz-tangraphs because I was looking at the level and rising tone volumes of Precious Rhymes of the Tangraphic Sea which contain no alveolar-initial R10 tangraphs. I had forgotten about the Mixed Categories volume of PRTS which contains dz-initial tangraphs for some as yet unknown reason.

TT0524 EAT dzji R10 1.10 is cognate to

gDong-brgyad rGyalrong kɤ-ndza 'eat'

Pumi Dayang dzɨ 'eat' (Matisoff 2003: 169; with vowel raising like Tangut but without fronting)

and may be from a Proto-Tibeto-Burman *dzja (Matisoff 2003: 648).

For other Qiangic cognates, see Matisoff, "'Brightening' and the place of Xixia", p. 6.

Nevsky (1960 I: 572) linked TT0524 to Written Tibetan za-ba 'eat'. I don't know why Tibetan has a fricative instead of an affricate dz- which also exists in WT. See Matisoff (2003: 34) for more examples of seemingly deaffricated WT words.

R10 AND R11

Although Gong has already written about this topic on pp. 21-24 of "The Phonological Reconstruction of Tangut", I find it useful to replicate existing work because doing so deepens my understanding and offers the possibility of discovering something that had been overlooked by others.

Gong reconstructed both R10 and R11 as -ji. Similarly, Arakawa reconstructed both rhymes as -ii. However, others reconstructed them as distinct rhymes:

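| Rhyme number | Tibetan transcriptions | Used to transcribe Sanskrit | Nishida 1964 | Hashimoto 1965 | Sofronov 1968 | Huang 1983 | Li Fanwen 1986 | Gong 1997 | Arakawa 1999 |
|---|---|---|---|---|---|---|---|---|---|
| R10 | -i, -ï | (not used) | -i | -jeej | -je | -(i)i | -ie | -ji | -ii |
| R11 | -e, -i, -iH | -i | -iɦ | -ii | -i | -(i)ĩ | -i | -ji | -ii |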

It is strange that no one has reconstructed R11 with a vowel like ɪ or e which would correspond to the e in Tibetan transcriptions. Conversely, some have reconstructed R10 with e, even though it wasn't transcribed with Tibetan e. The Tibetan and Sanskrit evidence does not support medial glides or vowel length. The fact that TT2413 and TT3016 (both R10) have fanqie with R11 final spellers indicates that the two rhymes were very similar: e.g., they may have had the same final vowel with different medials (see below).

The initials for R10 and R11 tangraphs are in nearly complementary distribution:

Rhyme number | Homophones chapter (initial class) and sample initial

Gong has already dealt with most of the overlaps in Homophones chapters V, VIII, and IX, so I will not repeat his arguments here. One exception is TT1686, the only l- initial R11 tangraph. He reconstructed this tangraph with initial lw-, which occurs only with R11. Another exception is TT4598, a chapter VIII transcription tangraph. The readings of transcriptive tangraphs may violate the general phonological pattern of Tangut, so

The chart below incorporates Gong's findings and my own:

Rhyme number by Homophones chapter (initial class):

| Rhyme number | I | II | III | IV | V | VI | VII | VIII | IX |
|---|---|---|---|---|---|---|---|---|---|
| R10 | no | yes | no | yes | only one instance** | no | yes | only two instances: one purely transcriptive (TT4598) and another (TT4879) which did double duty as a transcriptive character and as a synonym for an R11 near-homonym (TT3866) | l-, ʑ- |
| R11 | yes | no | yes | no | the majority | yes | no | the majority | lw-, lh(w)-, z- |

The pattern above is similar to the pattern for R2-R3:

III, V, VI: R3 and R11 (with one exception in R10)

VII and ʑ- (in IX but palatal like VII): R2 and R10
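The near-complementary distribution of R10 and R11 can be checked mechanically. This is a rough sketch only: it assumes that the nine cells in each row of the R10/R11 chart correspond to Homophones chapters I-IX in order, and it collapses the chart's wording into coarse labels ("rare" for one or two instances, "some" for chapter IX, where the two rhymes take different initials).

```python
# Chart cells for R10 and R11, assuming they correspond to
# Homophones chapters I-IX in order (an assumption, see the lead-in).
CHAPTERS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
R10 = ["no", "yes", "no", "yes", "rare", "no", "yes", "rare", "some"]
R11 = ["yes", "no", "yes", "no", "majority", "yes", "no", "majority", "some"]

def classify(r10: str, r11: str) -> str:
    """Label a chapter as exclusive to one rhyme or as an overlap."""
    if r10 == "no" and r11 != "no":
        return "R11 only"
    if r11 == "no" and r10 != "no":
        return "R10 only"
    return "overlap"

summary = {ch: classify(a, b) for ch, a, b in zip(CHAPTERS, R10, R11)}
overlaps = [ch for ch, label in summary.items() if label == "overlap"]

for chapter, label in summary.items():
    print(chapter, label)
print("overlapping chapters:", overlaps)
```

Under these assumptions the only overlapping chapters are V, VIII, and IX, the same three that Gong dealt with.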

It seems that R3/R11 are less palatal than R2/R10. Perhaps they had different medials: e.g.,

R3 -ɰu, R11 -ɰi

R2: -ju, R10 -ji

R2/R3 and R10/R11 word-family alternations*** would be interpreted as medial alternations.

In pre-Tangut, dentals, alveolars, and velars may have palatalized before *-ju and *-ji: e.g.,

*tjV, *tsjV, *kjV merging with original *tɕV? > tɕV

If this hypothesis is correct, there should be dental, alveolar, and velar-initial external cognates for palatal-initial R2/R10 words. Some palatal-initial R2/R10 words could also be borrowings from Chinese which had undergone palatalization. (However, I suspect that such palatalization would have predated contact with Chinese, for all known Chinese borrowings retain their original dental, alveolar, and velar initials.)
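The proposed palatalization can be written as a single rewrite rule. The sketch below is illustrative only: the input forms are schematic syllables rather than attested reconstructions, only the voiceless unaspirated initials named above are covered, and keeping the -j- after the new palatal (so that the output matches an R10-style -ji) is an assumption.

```python
import re

# Hypothetical pre-Tangut rule: *t-, *ts-, *k- palatalize before medial -j-.
# "ts" is listed before "t" so that it is matched as one unit.
PALATALIZE = re.compile(r"^(ts|t|k)(?=j)")

def palatalize(form: str) -> str:
    """Apply *tj-, *tsj-, *kj- > tɕj-; leave every other form unchanged."""
    return PALATALIZE.sub("tɕ", form)

# Schematic inputs: three that should shift and three that should not.
for proto in ["tji", "tsju", "kji", "tɕji", "ti", "kɰi"]:
    print(f"*{proto} > {palatalize(proto)}")
```

The first three forms merge with original *tɕji/*tɕju, while forms without -j- (such as the R11-style *ti and *kɰi) keep their initials, which is the distribution the hypothesis is meant to explain.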

I have great doubts about much of this, since

1. -ɰ- is an unusual medial

2. -ɰ- presumably would come from some other medial consonant (e.g., a velar or an *-l-), but no evidence so far points toward cluster initials in R11 words

The Tibetan transcription of R11 as -e may indicate that the vowel of R11 was slightly lower than that of R10. One could propose a lowering or depalatalizing rule that changed R10 to R11 after certain initials

*ti, *tsi, *ki > *tɰi, *tsɰi, *kɰi

instead of assuming that R11 absorbed a lot of earlier R10 words:

*ti, *tsi, *ki merging with original *tɕi? > *tɕi

But I have never seen the change of *-V to -ɰV in any language, so a third, simpler possibility is -j-insertion after palatals: e.g.,

*tɕi > tɕji R10 (the difference between these two would be awfully subtle, though!)

cf. ti, tsi, ki R11

There are several problems with the above ideas, including one major error which underscores the need for independent research. I am out of time tonight, so I will deal with these issues tomorrow.

*Although Nishida reconstructed a single labial-initial R10 tangraph (TT5518), the Tangraphic Sea and the Precious Rhymes of the Tangraphic Sea listed this graph under R11.

**Gong considered TT4134, the only velar-initial R10 tangraph, to be a transcriptive character for sutras and dharanis, but Shi et al. (2000: 42) defined it as 高歌 'sing lustily'.

***1.14.00:40: I listed examples of R2/R3 alternations here. The only example of an R10/R11 alternation that I know of is

'many': TT4879 ʔji R10 1.10 ~ TT3866 ʔɰi R11 1.11

(or should they be reconstructed as ʔji R10 1.10 and ʔi R11 1.11?)

