Amaravati: Abode of Amritas

10.11.5.23:59: THE LIMITS OF RECONSTRUCTION: STUMBLING IN THE DH-ARK

Here's another example of what happens when we reconstruct proto-languages without knowing all the facts - which is what we do all the time!

Only the Indo-Aryan branch of Indo-European has voiced aspirates like dh. Suppose we didn't even know they existed. We would see the following basic pattern among dental/alveolar stops:

Proto-Indo-European	Old Irish	Gothic	Latin	Greek	Old Church Slavonic	Hittite	Avestan
*C₁	t	θ	t	t	t	t	t
*C₂	d	t	d	d	d		d
*C₃	d	d	f	th	d		d

The traditional Proto-Indo-European reconstructed sources of these stops are *t, *d, *dh which are identical to their Sanskrit reflexes. Would anyone have come up with *dh for *C₃ if Sanskrit were unknown? Would *C₃ have been reconstructed as, say, *ð, which would have

- devoiced, hardened, and merged with t in Hittite

- devoiced to θ and hardened to Greek th (which later lenited to θ)

- shifted to f in Latin (cf. my Hawaii pronunciation of bathe as [bejf]).

- hardened and merged with d elsewhere - causing a chain shift in Germanic (ð > d > t > θ)?
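
The developments listed above can be checked mechanically against the table. The following is a minimal sketch, not a claim about how anyone actually reconstructs: `PATHWAYS`, `outcome` and `germanic` are hypothetical names, the intermediate stages are the ones proposed in the list, and the Hittite value is taken from the list ("merged with t") because the table cell is blank.

```python
# Attested reflexes of *C3, copied from the table above.
# Assumption: Hittite "t" comes from the list above, not from the (blank) table cell.
ATTESTED_C3 = {
    "Old Irish": "d",
    "Gothic": "d",
    "Latin": "f",
    "Greek": "th",
    "Old Church Slavonic": "d",
    "Hittite": "t",
    "Avestan": "d",
}

# Hypothetical pathways from *ð, one stage per list element.
PATHWAYS = {
    "Hittite": ["ð", "θ", "t"],   # devoiced, then hardened
    "Greek": ["ð", "θ", "th"],    # devoiced, then hardened to an aspirate
    "Latin": ["ð", "f"],          # shifted to f
}
DEFAULT_PATHWAY = ["ð", "d"]      # hardened and merged with d elsewhere


def outcome(language):
    """Return the last stage of the pathway assumed for this language."""
    return PATHWAYS.get(language, DEFAULT_PATHWAY)[-1]


# The Germanic chain shift applies to all three stops at once,
# so each proto-stop is looked up exactly one time.
GERMANIC_CHAIN = {"ð": "d", "d": "t", "t": "θ"}


def germanic(proto_stops):
    """Apply the chain shift simultaneously to a list of proto-stops."""
    return [GERMANIC_CHAIN[stop] for stop in proto_stops]


mismatches = {
    lang: (outcome(lang), seen)
    for lang, seen in ATTESTED_C3.items()
    if outcome(lang) != seen
}
print(mismatches)                  # empty if every pathway ends at the attested reflex
print(germanic(["t", "d", "ð"]))   # compare with the Gothic column: θ, t, d
```

An empty `mismatches` only shows that the *ð scenario is internally consistent with the table; it says nothing about whether it is more plausible than *dh.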

I recall that David Stampe suggested *ð for *C₃ in a class I took from him. Are there any languages whose dh < *ð?

11.6.1:33: Some UPSID statistics of interest:

1. A t-d-ð system without θ is unlikely, but not impossible. UPSID lists 12 languages with ð but without θ. 5 (Cubeo, Dahalo, Koiari, Nganasan, Tacana) have t-d-ð and the remaining 7 have t-ð.

2. All but one of the UPSID languages with dh are on the Indian subcontinent, and none contain ð. (The outlier is Igbo. The UPSID and Wikipedia descriptions of Igbo consonants do not match. Wikipedia doesn't list any voiced aspirates.) dh is in only 9 languages in UPSID, whereas ð is in 22 languages. Is it likely that ð would have hardened to a rarer dh? 106 languages have th, so PIE *ð > Greek th would be a shift from marked to less marked.
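
The frequency argument above can be spelled out with a little arithmetic. This is only a sketch: the three counts are the ones quoted above, the total of 451 languages is an assumption (the size of the expanded UPSID sample, which the figures above do not state), and `toward_less_marked` is a hypothetical helper that treats raw frequency as a stand-in for markedness.

```python
UPSID_SIZE = 451  # assumption: size of the expanded UPSID sample, not stated above

# Number of UPSID languages containing each segment, as quoted above.
COUNTS = {"dh": 9, "ð": 22, "th": 106}


def share(segment):
    """Fraction of the sample that has the segment."""
    return COUNTS[segment] / UPSID_SIZE


def toward_less_marked(source, target):
    """True if the change goes from a rarer segment to a more common one."""
    return COUNTS[target] > COUNTS[source]


by_rarity = sorted(COUNTS, key=COUNTS.get)  # rarest first
for segment in by_rarity:
    print(f"{segment}: {COUNTS[segment]} languages, {share(segment):.1%} of the sample")

print("ð > dh toward less marked:", toward_less_marked("ð", "dh"))
print("ð > th toward less marked:", toward_less_marked("ð", "th"))
```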

10.11.4.23:59: THE LIMITS OF RECONSTRUCTION: A WORST CASE SCENARIO

There is an unspoken assumption that we have sufficient extant material to reconstruct lost proto-languages. This is clearly false. Suppose Latin were completely lost. It would not be possible to reconstruct the Latin case system from modern Romance languages, not even Romanian, which still has a case system: e.g., 'wolf':

Latin case	Latin	Portuguese	Spanish	French	Italian	Romanian
vocative	lupe	lobo		loup	lupo	lupe
nominative	lupus					lup
accusative	lupum
genitive	lupi
locative	lupi
dative	lupo
ablative	lupo(d)

(Romanian only distinguishes nom./acc. from gen./dat. if the definite article suffix is attached: lup-ul [nom./acc.], lup-ului [gen./dat.; -ului < Lat illui], lup-ule [voc.])

Similarly, it would not be possible to reconstruct the Sanskrit case system from modern Indo-Aryan languages: e.g., 'village':

Sanskrit case	Sanskrit	Hindi	Bengali
nominative	graamas	gããv	graam (nom.) graamke (acc./dat.) graamer (gen.) graame (loc.)
accusative	graamam
instrumental	graamena
dative	graamaaya
ablative	graamaat
genitive	graamasya
locative	graame
vocative	graama

I presume Bengali graam- is a borrowing from Sanskrit. Does Bengali have a descendant of graam- corresponding to Hindi gããv < graam-?

Hindi has lost all case distinctions in the singular of this particular masculine noun (though others have a two-way distinction).

The Bengali endings -ke and -er are not derivable from Sanskrit. Bengali locative -e looks too good to be a true preservation.

Proto-languages are approximations at best, and their accuracy depends on what happened to survive as well as on the skill of historical linguists. No one can reconstruct what is completely gone.

The above came to mind as I realized I had left out one other (highly unlikely) scenario from last night's list: what if Tangut vowel length reflected final voiced stops that were lost in all other Sino-Tibetan languages? E.g.,

Proto-Sino-Tibetan	Tangut	gDong-brgyad rGyalrong	Written Tibetan	Written Burmese	Old Chinese
*-ag, *-ad, *-ab?	-aa	-a	-a	-a	*-a
*-a	-a		-a	-a	*-a
*-ak			-ag	-ak	*-ak
*-at		-at	-ad	-at	*-at
*-ap		-aβ	-ab	-ap	*-ap

But is it really likely that Tangut is the only one out of hundreds of ST languages that preserved any remnant of this feature? Moreover, even if it did, it would still be impossible to reconstruct which voiced stop was the source of length in any given morpheme since no other ST language has a trace of them.

11.5.3:40: Reconstructing PST *-aa as a source of Tangut -aa and non-Tangut -a would be reasonable based solely on the five languages in the table. However, if each and every correspondence pattern between the hundreds of ST languages were the basis of a proto-rhyme, PST would have an impossibly huge number of rhymes. Proto-languages are still languages and are subject to the same constraints as attested languages. I am hesitant to reconstruct proto-traits on the basis of a single language (Tangut) whose phonetics remain uncertain: e.g., Gong's and my -aa corresponds to Arakawa's -a' ([aʔ]?), Sofronov's -aɯ, Nishida's -ɑw, etc.

10.11.3.23:36: DO GDONG-BRGYAD RGYALRONG CODAS CORRESPOND TO TANGUT LONG VOWELS?

I mentioned final consonants as a source of Tangut vowel length at the end of my last post. Here's a quick test of that hypothesis. Guillaume Jacques has compiled a list of gDong-brgyad rGyalrong (GBR)-Tangut cognates. If the simplest version of my hypothesis is correct, Tangut long vowel rhymes should only correspond to GBR rhymes ending in consonants. But that is clearly not the case because there are counterexamples: e.g.,

Correspondence type	GBR gloss	GBR	Tangut	Tangut gloss if different	Notes (WT = Written Tibetan; OC = Old Chinese)
1	to cook	kɤ sqa	1ɣii		ɣ- < *Cɯ-q-; Cognate to 1ɣɪ̣ 'to cook' with a short vowel
2a	girl	tɯ me	1miəə	woman	Cognate to 1miẹ 'woman' with a short vowel
2b	tail	tɤ jme	1miee		cf. OC 尾 *məjʔ
2c	four	kɯ βde	1lɨəəʳ		cf. WT bzhi < *blyi, OC 四 *shli(t)s
3a	clear (water)	kɯ mgri	1gii	clear
3b	heavy	kɯ rʑi	1lɨəə
3c	name	tɤ rmi	2miee		cf. WT ming, OC 名 *meŋ
4a	to sit	kɤ mdzɯ	2dzəəu		OC 坐 *dzojʔ has the same initial but its rhyme is too different
4b	soft	kɯ mpɯ	1vəə		v- < *Cɯ-p-
4c	to steal	kɤ mɯ rkɯ < *-u	1kwiəəʳ		cf. WT rku-, OC 寇 *khos
5	dream	tɯ jmŋo < Proto-rGyalrong *-aŋ	1miee		cf. WT rmang lam, OC 夢 *məŋs
6	to welcome	kɤ qru	1khʊʊ		kh- < qh- (< qʁ- < *qr-?)

(The table lists all correspondence types but not all examples of each type.)
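
The test described above can be written out as a short script. This is a sketch under stated assumptions: only the examples in the table are included, a Tangut vowel counts as long when the same vowel letter is written twice in my transcription, and a GBR form counts as consonant-final when its last letter is not in the hypothetical `VOWELS` set. Historical codas (such as the *-ŋ noted for 'dream') are deliberately not encoded; only the attested GBR forms are compared.

```python
import re

# (GBR gloss, attested GBR form, Tangut form), copied from the table above.
PAIRS = [
    ("to cook", "kɤ sqa", "1ɣii"),
    ("girl", "tɯ me", "1miəə"),
    ("tail", "tɤ jme", "1miee"),
    ("four", "kɯ βde", "1lɨəəʳ"),
    ("clear (water)", "kɯ mgri", "1gii"),
    ("heavy", "kɯ rʑi", "1lɨəə"),
    ("name", "tɤ rmi", "2miee"),
    ("to sit", "kɤ mdzɯ", "2dzəəu"),
    ("soft", "kɯ mpɯ", "1vəə"),
    ("to steal", "kɤ mɯ rkɯ", "1kwiəəʳ"),
    ("dream", "tɯ jmŋo", "1miee"),
    ("to welcome", "kɤ qru", "1khʊʊ"),
]

# Assumption: these letters cover every vowel used in the forms above.
VOWELS = "aeiouɯɤəɨʊɪɛ"
LONG_VOWEL = re.compile(rf"([{VOWELS}])\1")  # the same vowel letter twice in a row


def has_long_vowel(tangut):
    """True if the Tangut transcription contains a doubled vowel letter."""
    return LONG_VOWEL.search(tangut) is not None


def ends_in_consonant(gbr):
    """True if the attested GBR form ends in something other than a vowel."""
    return gbr[-1] not in VOWELS


# A counterexample pairs a Tangut long vowel with a vowel-final GBR form.
counterexamples = [
    gloss
    for gloss, gbr, tangut in PAIRS
    if has_long_vowel(tangut) and not ends_in_consonant(gbr)
]
print(f"{len(counterexamples)} of {len(PAIRS)} pairs are counterexamples")
```

Since the table was compiled to illustrate the counterexample types, every pair in it fails the simplest version of the hypothesis; running the same check over the full cognate list would be the more informative test.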

OC *-ʔ and *-s may or may not be parts of roots.

The roots of 'name' and 'dream' originally had a final consonant *-ŋ lost in both GBR and Tangut. Final *-ŋ does not necessarily guarantee a Tangut long vowel: e.g., the Tangut cognates of GBR mbro < *-aŋ are 1rieʳ and 2riaʳ with short vowels.

Some possible explanations for the counterexamples:

1. Tangut had some other feature instead of vowel length: e.g., Arakawa reconstructed final -' (phonetically [ʔ]). Tangut could have kept this feature whereas GBR lost it. Note that Arakawa's -' does not consistently correspond to OC *-ʔ.

2. Tangut retained a vowel length distinction lost in the other languages.

3. Tangut long vowels are partly or wholly from suffixes unique to Tangut. Hence 1ɣii 'to cook' and 1ɣɪ̣ 'to cook' may share a common root *qi with different affixes:

*Cɯ-qi-C > 1ɣii (why not 1ɣ̣ɪɪ?)
*Sɯ-qi > 1ɣɪ̣

Similarly, 1miəə and 1miẹ 'woman' may share a root *m-:

*mə-C > 1miəə
*Sɯ-me > 1miẹ

I am not convinced Tangut had long vowels since Sanskrit long vowels were transcribed with Tangut short vowels plus the tangraph 'long' (Grinstead 1972: 68) rather than with Tangut 'long' vowels. That implies that Tangut 'long' vowels were distinguished by some other feature.

10.11.2.23:59: A FORM-AL BOW

(Dedicated to Petri Kallio.)

This Finnic sound change reminds me of one of the origins of modern Japanese long vowels:

The velar nasal *ŋ was vocalized to a semivowel in various positions (*joŋsi "bow" → jousi, *suŋi "summer" → suvi). In some cases further loss occurred (*müŋä "backside" → Estonian möö-, Finnish myö-).

Further examples can be found on pp. 232-233 of this paper by Petri.

Many Japanese long vowels are in borrowings from Chinese. These Sino-Japanese (SJ) long vowels have several types of sources: e.g., *V(p)u-sequences:

SJ *-a(p)u > -ou [oo]

SJ *-i(p)u > -yuu [jɯɯ]

SJ *-e(p)u > -you [joo]

SJ *-o(p)u > -ou [oo]

SJ *-uu > -uu [ɯɯ] (there was no *-upu)

Middle Chinese (MC) readings ending in *-ŋ are another source:

MC *-aŋ > SJ *-aũ > *-au > -ou [oo]

(There was no MC *-iŋ.)

MC *-uŋ > SJ *-uũ > -uu [ɯɯ]

MC *-eŋ > SJ *-eĩ > -ei [ee]

MC *-oŋ > SJ *-oũ > -ou [oo]
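
The four *-ŋ developments above amount to a small set of ordered rewrite chains. The sketch below simply replays them on a syllable; `CODA_CHAINS` and `derive` are hypothetical names, the stages are the ones listed above, and nothing else about Sino-Japanese phonology (initials, medials, tone) is modelled.

```python
# Stages after the MC rhyme, in chronological order, as listed above.
CODA_CHAINS = {
    "aŋ": ["aũ", "au", "ou"],
    "uŋ": ["uũ", "uu"],
    "eŋ": ["eĩ", "ei"],
    "oŋ": ["oũ", "ou"],
}


def derive(syllable):
    """Return every stage of a syllable, from the MC-like input to modern spelling.

    Syllables that do not end in one of the listed rhymes are returned unchanged.
    """
    for rhyme, stages in CODA_CHAINS.items():
        if syllable.endswith(rhyme):
            onset = syllable[: -len(rhyme)]
            return [syllable] + [onset + stage for stage in stages]
    return [syllable]


for form in ["kaŋ", "kuŋ", "keŋ", "yoŋ", "si"]:
    print(" > ".join(derive(form)))
```

The merger is visible in the output: inputs in -aŋ and -oŋ both end in -ou, so the modern spelling alone cannot recover which MC rhyme a given -ou came from.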

When I saw Proto-Finnic *joŋsi > Finnish jousi 'bow', I was reminded of modern SJ compounds pronounced youshi whose components were once like *yoŋ and *si: e.g.,

SJ 用紙 youshi 'form' < *yoũ (< MC *juoŋh 'use') + *si (< MC *tɕieʔ 'paper')

SJ 容姿 youshi 'appearance' < *yoũ (< MC *juoŋ 'appearance') + *si (< MC *tsi 'appearance')

However, note that the Finnic change is word-internal, whereas MC *-ŋ in medial or final position ends up being reflected in SJ as vowel length: e.g., the SJ reading for 'bow' in isolation is kyuu < MC *kɨwŋ.

11.3.1:06: The vocalization of a nasal coda (possibly *-ŋ) to a semivowel may also have occurred in Tangut if Gong's reconstruction of rhyme groups VII and XI is correct:

Gong's Tangut rhyme group	Pre-Tangut	Gong's reconstruction	This site
VII	*-eN	-ej	-ẽ
XI	*-oN	-ow	-õ

Compare the middle two columns with

MC *-eŋ > SJ *-eĩ > -ei [ee]

MC *-oŋ > SJ *-oũ > -ou [oo]

Perhaps these Tangut rhymes were like the earlier SJ rhymes with both nasalization and semivowels: -ẽj, -õw?

Gong reconstructed many Tangut long-vowel rhymes which I have more or less carried over into my reconstruction. These rhymes may also originate from lost consonantal codas:

*-VC > *-VG > -VV

10.11.1.21:41: THE GOLDEN GUIDE: LINE 89: TANGRAPHS 441-445

89. The first two surnames are uncommon in Chinese but may have been common in the Tangut Empire.

The first three tangraphs also represent three Chinese loanwords: 西 'west', 川 'river', and 凡 'ordinary', which were homophonous with the surnames 息, 傳, 范.

Tangraph number	441	442	443	444	445
Tangraph
Li Fanwen number	4293	1990	2052	5267	4710
My reconstructed pronunciation	1si	1tʃhwɨã	1xwiã	1lɨẽ	1lo
Tangraph gloss	west; (transcription of Chinese)	river; (transcription of Chinese)	ordinary; Sanskrit; (transcription of Chinese)	to tie; to take over; to contact; (transcription of Chinese)	(transcription of Chinese)
Word	the surname 息 Xi (*si)	the surname 傳 Chuan (*tʃhwɨã)	the surname 范 Fan (*fɨã)	the surname 廉 Lian (*liẽ)	the surname 羅 Luo (*lo)
Translation	Si, Chhwan, Hwan, Len, Lo

441: Is the choice of 'wood' for a phonetic meant to imply 'Root West', the name of the indigenous Tangut religion?

4293 1si 'west' (boxdiljeu) =

4250 1si 'wood' (boxdexdexcok; phonetic) +

3226 1niəə 'to shine upon' (diljeu; cognate to Old Chinese 日 *nit 'sun'?; semantic - the setting sun?; why is jeu 'eight' on the right?)

442: 1990 is a semantic compound:

1990 1tʃhwɨã 'river' (cirdaicok) =

3058 2ziəəʳ 'water' (cirzaa) +

2474 2raʳ 'to flow' (dexdaidex) +

2107 1tsəiʳ 'earth' (giigircok)

443: The graphic analysis of 2052 makes no phonetic or semantic sense and may be arbitrary:

2052 1xwiã 'ordinary; Sanskrit' (biipikbel) =

1995 2məi 'the 巽 wind trigram ☴' (biidexdak) +

3695 2ziuʳ 'broom' (gempik; 'grass' + 'hand') +

1976 2bie 'gold; the 兌 marsh trigram ☱' (baebeldexbel)

11.2.2:40: 1xwiã 'Sanskrit' is from Chinese *fwɨã < Late Old Chinese *bramh < Skt Brahma.

444: 5267 may be a fanqie tangraph, even though the rhyme of 0535 is oral rather than nasal:

5267 1lɨẽ (transcription of Chinese) (pekcox) =

1661 1lɨĩ (transcription of Chinese) (bospek) +

0535 1ʃɨe 'according to' (bumguxcox)

445: Could the structure of 4710 be loosely based on its Chinese near-homophone 廊 *lõ 'porch; corridor'? The graphic analysis makes no phonetic or semantic sense and may be arbitrary:

4710 1lo (transcription of Chinese) (biobaepikheu) =

5045 1kwĩ 'gentleman' (< Chinese 君 *kwĩ) (biofeodex) +

3508 2bi 'prime minister' (baepikpik) +

5464 2ʒɛʳ 'to live, reside' (tiiheu)

10.10.31.21:29: THE GOLDEN GUIDE: LINE 88: TANGRAPHS 436-440

88. All five tangraphs below are Chinese transcription tangraphs.

Tangraph number	436	437	438	439	440
Tangraph
Li Fanwen number	5491	2771	1227	4329	5737
My reconstructed pronunciation	1xʊ	1phɛ	2ʃɨew	1xwĩ	1tshwe
Tangraph gloss	(transcription of Chinese)
Word	the surname 胡 Hu (*xəu)	the surname 白 Bai (*phɛ)	the surname 邵 Shao (*ʃɨew)	the surname 封 Feng (*fɨũ)	the surname 崔 Cui (*tshwe)
Translation	Hu, Phe, Shew, Hwin, Tshwe

436: 5491 has a circular analysis:

5491 1xʊ 'the Chinese surname 胡 Hu (*xəu)' (halbilfir) =

4093 2xʊ 'a kind of tree' (boxhalbilfir) +

2796 2rieʳ 'the Tangut surname Rer' (bilhascin)

I don't understand why Chinese 胡 *xəu wasn't borrowed as 1xəu, a syllable which does exist in Tangut. 1xʊ is Grade II unlike Chinese 胡 *xəu and 1xəu which are both Grade I.

Were the Chinese Hu of the Tangut Empire somehow connected to the Tangut Rer?

4093 (analysis unknown) must consist of 'wood' atop its nearly homophonous phonetic 5491.

437: Were the Chinese Phe of the Tangut Empire (like the Hu above) somehow connected to the Tangut Rer?

2771 1phɛ 'the Chinese surname 白 Bai (*phɛ)' (dexhoecin) =

3366 1bɛ 'the Tangut surname Be' (dexhoe; phonetic) +

2796 2rieʳ 'the Tangut surname Rer' (bilhascin)

The radical hoe in 2771 and 3366 representing Pɛ-syllables may be based on Chinese 馬 *mbæ 'horse'.

438: 1227 2ʃɨew (analysis unknown) has no (near-)homophones with shared radicals. Could it be a semantic compound referring to characteristics of the Shao, or does it have cryptophonetics: components from tangraphs whose Chinese translations sounded like 2ʃɨew?

439: 4329 1xwĩ bears almost no phonetic resemblance to Chinese *fɨũ. Tangut had no f- or -ɨũ. Although the Tangut could have borrowed *f- as ph- (as Koreans do), they approximated it with xw-. -ĩ was the closest approximation of -ɨũ since Tangut -ɨ- could not follow a velar and Tangut -ũ could not follow a high vowel.

4329 1xwĩ 'the surname 封 Feng (*fɨũ)' (boxpikpik) =

4342 2dia 'perfective marker' (boxjaltun) +

5751 1dii 'to divide, distribute' (pikpik; 'hand' x 2)

The function of 4342 is unknown. Why would a perfective marker originating from a prefix indicating movement away from the speaker have box 'wood' on top and tun 'skin' on the bottom right?

5751 may be a cryptophonetic. Its Chinese translation was 分 *fɨũ, which was homophonous with the surname 封 *fɨũ in the dialect known to the Tangut.

440: 5737 is a fanqie tangraph. Note how bio in 4824 is reduced to half its width in 5737.

5737 1tshwe 'the surname 崔 Cui (*tshwe)' (pikbiohan) =

5760 1tshəu 'wide, thick' (pikquu) +

4824 1lwe 'rich, wealthy' (biohanpax)
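
A fanqie analysis takes the initial of the first component and the rhyme (with its tone) of the second. The sketch below makes that explicit for 5737 and, for comparison, for the fanqie reading of 5267 proposed in the entry on line 89. The `Syllable` class and `fanqie` function are hypothetical, and the split of each reconstruction into initial and rhyme (with medial -w- and -ɨ- counted as part of the rhyme) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syllable:
    tone: str     # "1" or "2", written before the syllable
    initial: str
    rhyme: str    # assumption: medials are counted as part of the rhyme

    def __str__(self):
        return f"{self.tone}{self.initial}{self.rhyme}"


def fanqie(first, second):
    """Combine the initial of the first speller with the rhyme and tone of the second."""
    return Syllable(tone=second.tone, initial=first.initial, rhyme=second.rhyme)


# 5737 = 5760 1tshəu + 4824 1lwe
result_5737 = fanqie(Syllable("1", "tsh", "əu"), Syllable("1", "l", "we"))

# 5267 = 1661 1lɨĩ + 0535 1ʃɨe
result_5267 = fanqie(Syllable("1", "l", "ɨĩ"), Syllable("1", "ʃ", "ɨe"))

print(result_5737)  # to be compared with the reconstruction 1tshwe
print(result_5267)  # to be compared with the reconstruction 1lɨẽ
```

Under these assumptions the first result matches the reconstructed reading of 5737 exactly, while the second comes out with an oral rhyme, which is the mismatch already noted for 5267.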
