Amaravati: Abode of Amritas

14.12.20.23:59: WHAT DO JAPHUG JC-CLUSTERS CORRESPOND TO IN TANGUT?

In my last entry, I asked if pre-Tangut *-Nb- could be from *-lb-.

In Japhug rGyalrong, *lC-clusters became jC-clusters, with the exception of *lʑ-, which became ldʑ- (Jacques 2004: 334). ldʑ- does not appear in Guillaume Jacques' 2006 list of Japhug-Tangut comparisons, but three other *lC-clusters do:

J jp- < *lp- : T ʔw- < *ʔV-p-

J jph- < *lph- : T v- < *CV-ph- (was *C- *l-?)

J jm(ŋ)- < *lm-* : T m- < *m- or *nm- < *lm-?

There are also several Japhug jC- clusters that Guillaume regarded as retentions. I do not know why he did not derive their j- from a preinitial *l-. *jC- and *lC- are in complementary distribution in his Proto-rGyalrongic reconstruction.

J jt- < *lt-? : T d- < *Nd- or *nd- < *lt-?

J jts- < *lts-? : T dz- < *Ndz- or *ndz- < *lts-?

J jtsh- < *ltsh-? : T t- < *St- (not tsh- or dz-!)

J jn- < *ln-? : T n- < *n- or *nn- < *ln-?

J jŋ- < *lŋ-? : T ŋ- < *ŋ- or *nŋ- < *lŋ-?

J jl- < *ll-? : T l- < *l- or *nl- < *ll- (dissimilation)?

These correspondences can be grouped into five categories:

1. No trace of *l- in Tangut: T ʔw-.

*l did become ʔ in Marquesan: e.g., *lima > ʔima 'five', but I doubt that happened in Tangut.

Is *l- a rGyalrongic innovation or was it lost in Tangut before a prefix *ʔV- was added?

2. Possible lenition conditioned by *lV-: T v-.

3. Possible absorption of *l- by following sonorant: T m-, n-, ŋ-, l-.

12.21.0:21: This must have happened before *SC-sequences became geminates that conditioned vowel tension, as none of these words have tense vowels.

Another possibility is that *l- was simply lost before other sonorants.

4. Possible fusion of *lC- as voiced obstruent: T d-, dz- (but no cases of J jp(h)- : T b-!).

5. Doubtful: J jtsh- and T t- differ in articulation and aspiration. This correspondence only occurs once:

J kɤ-jtshi 'to give a drink' :

T 4582 1tị < *Sti 'to feed'

I think those words are unrelated.

12.21.0:51: One might think J tsh- is a metathesis of *St-, but Guillaume already reconstructed *st- and *ɕt- which remained intact in Japhug.

One could then try to reconstruct *Sth- as the source of J tsh- and link the aspirate of kɤ-jtshi to Tangut

4658 1thi 'to drink'

but I think the aspiration of 4658 is from a prefix *K- added to a root *ti. The bare root does not survive in Tangut.

*Japhug -ŋ- in jmŋ was conditioned by an earlier velarized vowel: jmŋo < *lmaˠŋ 'dream'.

14.12.19.23:59: UNEXPECTED JAPHUG RGYALRONG CLUSTERS CORRESPONDING TO TANGUT B-

Last night, I wrote that "I should look into the exceptions" to the correspondence of Tangut b- to Japhug rGyalrong nasal-stop clusters. Here is a list of those exceptions:

Li Fanwen number	Tangraph	Reading	Rhyme	Japhug	Gloss
1386		bʌ	2.25	kɤ-ɕphɤt < *ɕph-	to repair
2200		bɑ	1.17	kɤ-ɣɤ-rʁaʁ < *rb-	to hunt
2451		bọ	2.62	kɤ-phɣo < *phaˠŋ	to flee
4567		bɑ̣	2.56	tɤ-jwaʁ < *lb-	leaf

I presume that pre-Tangut had a nasal prefix *N- absent from Japhug:

1386: Tangut b- may be from *N-ph-.

Is the rGyalrong sibilant (also in Zbu kɐ-spês) part of the root of 'to repair'? I do not reconstruct a pre-Tangut sibilant *S- which would have conditioned a tense vowel as in 2451 and 4567.

Old Chinese 補 *paʔ 'to mend' may be an unrelated lookalike, as it lacks final *-t.

2200: Is Tangut b- from *N-b- or a retention of root-initial *b-?

2451: Tangut b- may be from *S-N-ph- with an *S- to condition a tense vowel.

4567: Is Tangut b- from *S-N-b- or *S-b- with an *S- to condition a tense vowel? And could *-Nb- be from *-lb-*?

*12.20.0:31: I used to think pre-Tangut *l- might have merged with preinitial *r- and conditioned the retroflexion of vowels, but now I'm not so sure. Tangut did not have tense retroflex rhymes, so perhaps the retroflexion of 'leaf' was lost:

*SlbaH > *SrbaH > *SrbaʳH > *bbaʳH > *bbạʳH > *2bɑ̣ʳ > 2bɑ̣

The dating of tonogenesis (*-H > 2-) and the backing of *a relative to the other changes above (l-/r-merger, retroflexion, gemination, tension, degemination, and deretroflexion) is unknown. Grade I *a could have backed before Grade IV *ia simplified to *a:

*ia > *a > ɑ
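As a rough sketch (not a tested model of pre-Tangut), the 'leaf' chain above can be replayed as ordered string rewrites. The rule labels follow the list of changes named above, but the relative order of tonogenesis and backing is unknown, so the order used here is an assumption; the chain as written collapses degemination, backing, and tonogenesis into a single step, and the sketch splits them arbitrarily. The names `STEPS` and `derive` are hypothetical.

```python
import re

DOT = "\u0323"     # combining dot below: marks a tense vowel
RHOT = "\u02b3"    # superscript r: marks a retroflex vowel
BACK_A = "\u0251"  # the back vowel written ɑ

# (label, regex, replacement). The ordering is an assumption, not a claim.
STEPS = [
    ("l-/r-merger", "Sl", "Sr"),
    ("retroflexion", "a", "a" + RHOT),
    ("gemination", "Srb", "bb"),
    ("tension", "a", "a" + DOT),
    ("degemination", "bb", "b"),
    ("backing", "a", BACK_A),
    ("tonogenesis", r"^(.*)H$", r"2\1"),  # final *-H becomes tone 2
    ("deretroflexion", RHOT, ""),
]

def derive(form, steps=STEPS):
    """Return every stage of the derivation, starting with the input."""
    stages = [form]
    for _label, pattern, replacement in steps:
        form = re.sub(pattern, replacement, form)
        stages.append(form)
    return stages

stages = derive("SlbaH")
print(" > ".join(stages))
```

Reordering `STEPS` is a quick way to see which orderings still arrive at the attested form and which do not.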

14.12.18.23:59: RETOOLING TANGUT VOICED OBSTRUENTS (PART 3): 'BEE'

In Guillaume Jacques' 2006 list of Tangut-Japhug rGyalrong comparisons, Tangut b- usually corresponded to a Japhug nasal-stop cluster: mb-, mbr-, mbɣ-, ʑmb-, ʑmbr-, rmb-, ʁmbɣ-. (I should look into the exceptions elsewhere.) So I would expect it to correspond to Old Chinese (OC) *b- < *NP- and Written Tibetan (WT) ⁿb-. (I am switching to Guillaume's 2012 transcription system for Tibetan.)

The first Tangut b-word in Gong's 1995 list of Tangut-OC-WT-Written Burmese comparisons is

#12 'bee': 2462 1bowr : OC 蜂 *buŋ ~ *phjuŋ : WT buŋ

One of the OC forms has an unexpected voiceless initial ph- and WT lacks the expected prenasalization.

I would reconstruct the Tangut form as 1bõʳ and the OC forms as *N-phoŋ and *Nɯ-phoŋ.

The Tangut form goes back to *R-NPoN:

*R- conditioned the retroflexion of the vowel.

*-NP- (*Nph- or *Nb-?) fused into b- (possibly [mb]?); Gong would reconstruct *b- which would match OC and WT b- (but not OC ph-).

*-N conditioned the nasalization of the vowel.

I think the two OC readings may both derive from *Nɯ-phoŋ:

In one variant, the presyllabic vowel was lost before it could condition vowel bending (see below), and *N-ph- fused into *b-.

In the other variant, the presyllabic vowel conditioned vowel bending before the presyllable was lost in Late Old Chinese, leaving the bare root initial:

*Nɯ-phoŋ > *Nɯ-phuoŋ > *phuoŋ
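The two variants can be sketched as the same starting form run through different rule sequences: in one, early loss of the presyllabic vowel removes the environment for bending; in the other, bending applies first. This is only an illustration under assumptions: the function names are hypothetical, the rules are plain string replacements, and the voiced outcome *boŋ is inferred from the fusion of *N-ph- into *b- rather than quoted from a source.

```python
def lose_presyllabic_vowel(form):
    # *Nɯ- > *N- (the presyllable loses its vowel early)
    return form.replace("Nɯ-", "N-")

def bend(form):
    # Vowel bending *o > *uo, only while a presyllabic *ɯ is still present
    return form.replace("o", "uo") if "ɯ-" in form else form

def fuse(form):
    # *N-ph- > *b-
    return form.replace("N-ph", "b")

def lose_presyllable(form):
    # The whole presyllable *Nɯ- drops, leaving the bare root initial
    return form.replace("Nɯ-", "")

def run(form, rules):
    for rule in rules:
        form = rule(form)
    return form

# Variant 1: vowel loss comes first, so bending finds nothing to apply to.
voiced = run("Nɯ-phoŋ", [lose_presyllabic_vowel, bend, fuse])
# Variant 2: bending applies before the presyllable is lost.
aspirated = run("Nɯ-phoŋ", [bend, lose_presyllable])
print(voiced, aspirated)
```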

The presyllable *Nɯ- may have been a fuller version of the animal prefix *m- reconstructed by Sagart (1999: 85) and Baxter and Sagart (2011: 5): i.e., *mɯ-.

It is tempting to assume that Tangut *-NP- was *m-ph- as in OC, but without doublets, I have no Tangut-internal evidence to favor that reconstruction over some other nasal plus *-b- (as in WT) or even *-p-.

Nathan Hill (2012) reconciled the vowels of the OC and WT forms by reconstructing Proto-Tibeto-Burman (= my Proto-Sino-Tibetan) *əw which became OC *o and WT u. Another possibility is that the OC and WT forms reflect the same root *√Pwŋ with different grades: schwa-grade in OC and zero-grade in WT.

I wrote *P- because I don't know how to reconcile the OC and WT initials.

One outrageous solution would be to reconstruct a *bh- that became *ph- (cf. Greek) in OC but b- in WT (cf. Germanic). But I don't know if there are enough sets to support the reconstruction of a voiced aspirate series.

Another solution assumes that WT and one OC reading preserve an original *b- whereas *ph- of the other OC reading is secondary:

*bəwŋ > *boŋ

*Kɯ-bəwŋ > *Kɯ-boŋ > *Kɯ-buoŋ > *phuoŋ

The prefix *Kɯ- is modeled after the aspiration-conditioning prefix *K- that I reconstruct for pre-Tangut.

A third solution is to regard at least one word in this group as an unrelated lookalike. Could there be two unrelated roots with different initials and vowels: *buŋ (WT) and *phoŋ (OC)? If so, which word is cognate to the Tangut form? (12.19.0:19: Answering my own question, OC *phoŋ is more likely since its vowel matches Tangut 1bõʳ < *R-NPoN.)

In any case, I can't find any cognates of these forms outside these three languages at STEDT. Is *√Pwŋ reconstructible for a common ancestor of Tangut, Chinese, and Tibetan at a node below the Proto-Sino-Tibetan level: e.g., Blench's (2014: 4) Proto-East Sino-Tibetan?

14.12.17.23:59: RETOOLING TANGUT VOICED OBSTRUENTS (PART 2)

The initial consonants of Tangut period northwestern Chinese (TPNWC) reconstructed by Gong (2002: 131) are similar to those of his Tangut reconstruction:

*p-	*ph-	*m(b)-	*f-		*w-
*t-	*th-	*n(d)-		*l-
*k-	*kh-	*ŋ(g)-	*x-
*ts-	*tsh-		*s-
*tś-	*tśh-	*ń(dź)-	*ś-	*ź-	*j-

Green indicates TPNWC consonants shared with Tangut.

Red indicates a TPNWC consonant absent from Tangut.

The nasal allophones of blue TPNWC consonants matched Tangut nasal consonants in Gong's reconstruction: e.g., TPNWC *m- matched Tangut m-. If Tangut voiced obstruents were actually prenasalized, then both allophones of the blue consonants had Tangut matches, and the prenasalized allophone of the yellow consonant had a Tangut match. (But it is not clear whether Tangut had ń-, which is why I colored *ń(dź)- yellow.)

The considerable overlap between Tangut and TPNWC consonant inventories made me wonder if the phonologies of the two languages converged through contact. Original voiced obstruents devoiced and aspirated in TPNWC: e.g., *g- > *kɦ- > *kh-. (The intermediate step was reconstructed by Pulleyblank 1984.) Could that also have happened in Tangut? Could that explain why the Tangut '36 initials' list and phonetic tables have voiceless aspirates where voiced obstruents were expected?

The tables, the Tangraphic Sea, and Homophones tell us that

0153 1khu 'bluish green' and 3415 1khu 'detestation'

were not homophones in spite of Gong's reconstructions. The placement of those characters in the tables implies that their initials could have been kh- and g- (or kɦ-). However, Gong pointed out that their fanqie initial spellers were homophones in Homophones.

One could work around that problem by hypothesizing that the Homophones dialect (1) had shifted g- to kh- and (2) developed different phonemic tones to compensate for initial mergers:

0153 1khu > 1khu

3415 1gu > 1'khu (1' being a tone distinct from 1)
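A minimal sketch of that workaround, assuming syllables are modeled as (tone, initial, rhyme) triples: the function name is hypothetical, and only the g- > kh- shift mentioned above is included. It shows how the two syllables could end up with identical initials (hence homophonous fanqie initial spellers) while remaining distinct syllables.

```python
def homophones_dialect(tone, initial, rhyme):
    """Hypothetical shift: g- devoices to kh- and leaves a distinct tone 1'."""
    if initial == "g":
        return (tone + "'", "kh", rhyme)
    return (tone, initial, rhyme)

bluish_green = homophones_dialect("1", "kh", "u")  # 0153
detestation = homophones_dialect("1", "g", "u")    # 3415 under the g- hypothesis

print(bluish_green, detestation)
# Same initial and rhyme after the shift, but still different syllables:
print(bluish_green[1:] == detestation[1:], bluish_green != detestation)
```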

Homophones is organized by consonant classes and does not mention tones, so there is no guarantee that its dialect only had two (main*) tones like the Tangraphic Sea dialect.

But there is a bigger problem that cannot be circumvented. If my hypothesis were true, some Tangut voiceless aspirates should correspond to voiced obstruents in other languages**. Is that ever the case? I couldn't find any examples in Guillaume Jacques' 2006 list of Tangut-Japhug rGyalrong comparisons, and the only instance I found in Gong (1995) is questionable***. It's more likely that the Tangut learned the Chinese phonological tradition after TPNWC had merged *g- with *kh- and carried over the doubled *kh- into their own tradition. That still doesn't solve the mystery of why 0153 and 3415 aren't homophonous, though.

*12.18.5:48: There are at least eleven 'entering' tone characters at the end of the second volume of the Precious Rhymes of the Tangraphic Sea in addition to 'level' tone characters in the first volume and 'rising' tone characters in the rest of the second volume.

**12.18.6:48: I am only referring to correspondences between inherited words. There are many instances of Tangut voiceless aspirates corresponding to Middle Chinese voiced obstruents, but these are all borrowings from TPNWC after devoicing.

***12.18.7:03: I am not sure these words are cognate:

5354 2thə 'this' : Old Chinese 是 *Cɯ-Teʔ > *d-, Classical Tibetan Hdi [ndi] 'id.'

The Tangut word is part of a th-family of demonstratives without front vowels:

0388 2tha ~ 2019 1tha 'third person pronoun', 2173 2thy 'here', 0396 2tha 'there'

Its root initial may be th-. If it is related to Tibetan Hdi, that th- may be from an earlier *K-d-.

It is tempting to reconstruct the Old Chinese form as *Nɯ-deʔ to match Tibetan, but *Nɯ-theʔ with a *-th- matching Tangut but not Tibetan is also possible, as is *Nɯ-teʔ with a *-t- matching Old Chinese 之 *tə 'this'. Old Chinese 時 *də 'this' may be from 之 *tə 'this' plus *N-.

14.12.16.23:59: RETOOLING TANGUT VOICED OBSTRUENTS (PART 1)

While looking up 5505 'sheep' (next year's calendrical animal) in Li Fanwen's Tangut dictionary, I stumbled upon

5501 2gy 'tool, utensil'

a borrowing from Late Middle Chinese (LMC) 具 *gy. Until recently I agreed with Gong that Tangut had plain (i.e., nonprenasalized) voiced stops. So of course LMC *g- would be borrowed as Tangut g-. But now I'm not so sure.

The Sino-Japanese reading gu for 具 was once *ŋgu which in turn might be from an earlier *ŋgo, an approximation of Early Middle Chinese *guo. Early Japanese had no *g-, but it did have a prenasalized *ŋg- which was the best available match for Chinese *g-.

Was Tangut like early Japanese? Did it lack plain voiced obstruents? Were Chinese words with voiced obstruents borrowed with prenasalized obstruents?

Looking at Guillaume Jacques' 2006 Tangut-Japhug rGyalrong comparisons, I see that Tangut g- nearly always corresponds to a Japhug nasal-obstruent sequence: ɴq-, ŋg-, ŋgr-, ʑŋgr-, mg-, mgr-. (The reverse is not true.*)
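The generalization can be checked mechanically against the clusters listed above. The sketch below is only an illustration: the segment classes are deliberately tiny (just enough for these clusters), and the function name is hypothetical. It also covers the footnoted point that the reverse does not hold, since Japhug ɴɢ-, mq-, ŋk- contain nasal-stop sequences but correspond to Tangut k-.

```python
NASALS = set("mnŋɴ")
OBSTRUENTS = set("pbtdkgqɢ")  # only the stops needed for these clusters

def has_nasal_obstruent(cluster):
    """True if some nasal is immediately followed by an obstruent."""
    return any(a in NASALS and b in OBSTRUENTS
               for a, b in zip(cluster, cluster[1:]))

# Japhug initials corresponding to Tangut g-
G_CLUSTERS = ["ɴq", "ŋg", "ŋgr", "ʑŋgr", "mg", "mgr"]
# Japhug nasal-stop initials that correspond to Tangut k-, not g-
K_CLUSTERS = ["ɴɢ", "mq", "ŋk"]

print(all(has_nasal_obstruent(c) for c in G_CLUSTERS))
print(has_nasal_obstruent("c"))  # initial of Japhug co 'valley'
print(all(has_nasal_obstruent(c) for c in K_CLUSTERS))
```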

The one apparent exception involves words that may not be cognate:

2181 1ge 'valley' : Japhug co 'id.'

Guillaume Jacques (2004: 297) derived co from *twaŋ (cf. Written Burmese twaŋḥ 'well'). I can't imagine *tw- becoming a pre-Tangut *k- that would blend with a nasal prefix to become Tangut g-.

(If *tw- became pre-Tangut *k-, why is that cluster preserved in

0070 1thwə 'to open' : Japhug kɤ-cɯ < *-u 'id.'

with aspiration from preinitial *K-? I would not expect *tw- to back to *k- before the front vowel of 'valley'. But maybe the vowel of 'valley' was back *ɑ when the backing took place.)

If there are any Japhug words with voiceless velar or uvular initials that correspond to Tangut g-, I would reconstruct a nasal prefix in pre-Tangut that was absent from Japhug:

Japhug k/q- : Tangut g- < *N-k/q-?

Guillaume's list of cognates had no Japhug words with a simple initial g-. The only instance of a Japhug initial cluster with g- not preceded by a nasal (tɯ-zgrɯ 'elbow') corresponded to Tangut k-:

1298 1kiʳw 'elbow'

(12.17.1:34: The retroflex vowel is normally from pre-Tangut preinitial *r- or final *-r. I would not expect it to correspond to Japhug medial *-r-. Could pre-Tangut medial *-r- condition vowel retroflexion in certain environments? Maybe the two words aren't related. Could the Tangut word be related to Proto-Kuki-Chin *ki(i)w and rGyalrongic rk-forms like northern Ergong r̥kəu⁵³ and Danba Ergong ʐkiau?)

Did pre-Tangut *g- devoice to k-? Did Tangut then develop a new g- from prenasalized *Ng-?

Next: Another possible fate for pre-Tangut *g-?

*12.17.1:17: Japhug nasal-stop sequences may correspond to Tangut oral stops:

Japhug ɴɢ-, mq-, ŋk- : Tangut k-

I presume that either the Japhug nasal is an innovation or that Tangut lost a nasal prefix.

This pair suggests that the pre-Tangut dental-velar cluster *ng- fused into ŋ- instead of g-:

2857 2ŋo 'illness' : Japhug kɤ-ngo 'to be ill'

14.12.15.23:25: "MATHEMATICALLY ELEGANT" BUT INACCURATE

Last night, I wrote,

I do not believe in glottochronology or other mathematical shortcuts in historical linguistics.

Today I found Martin W. Lewis'* stance on such shortcuts (2012; emphasis mine):

Mathematically intricate though it may be, the model employed by the authors [of this Science article] nonetheless churns out demonstrably false information. Failing the most basic tests of verification, the Bouckaert article typifies the kind of undue reductionism that sometimes gives scientific excursions into human history and behavior a bad name, based on the belief that a few key concepts linked to clever techniques can allow one to side-step complexity, promising mathematically elegant short-cuts to knowledge. While purporting to offer a truly scientific approach, Bouckaert et al. actually forward an example of scientism, or the inappropriate and overweening application of specific scientific techniques to problems that lie beyond their own purview.

I haven't read the Bouckaert article, so I can't comment on it. However, I think Lewis' criticism could also apply to other studies. The temptation to "side-step complexity" is particularly strong when faced with huge language families like Indo-European, Austronesian, and Sino-Tibetan. Wouldn't it be so much easier to toss all that data into a digital vat, push a button, and get an answer? But is that answer meaningful?

I confess I've used probability in my 2008 paper on Old Chinese. And I've written in favor of phonostatistics:

By observing trends in languages with different phonostatistical [i.e., sound frequency] patterns, one might be able to make predictions about later changes or explain known (and sometimes baffling) changes.

So I wouldn't say all mathematics is irrelevant to historical linguistics. But there are limits to mathematical models. Languages are not like carbon-14 or lifeforms; they do not change at a fixed rate, and their changes are Lamarckian rather than Darwinian: i.e., acquired characteristics are 'inherited'. (Strictly speaking, no language is 'inherited'; learners 'recreate' languages by imitating the users around them. Biological metaphors are arguably inappropriate for linguistics, but we are stuck with them.) I predict that viable mathematical models for linguistics (1) will be developed by linguists and (2) will be sui generis instead of being inspired by, say, paleontology or biology.

Are there mathematical models that have churned out demonstrably true information in the social sciences? If so, could such models be modified for linguistics?

*Like me, Martin W. Lewis also prefers pixels to paper:

Rather than publishing my research findings in academic journals, I now post them on-line on the GeoCurrents.info weblog.

14.12.14.23:59: A GENERATIONAL CONSTRAINT ON SOUND CHANGE?

This passage from the Wikipedia article on Verner's Law made me wonder if it was possible to formulate a generational constraint on sound change (emphasis mine):

Moreover, the combination of the above-mentioned traditional order (Grimm's before Verner's) and the dating of Grimm's law to the 1st century BC requires an unusually fast change of the late Common Germanic at the turn of the millennium: within only a few decades, the three dramatic changes mentioned below would have had to happen in quick succession. This would be the only way to explain that all Germanic languages show these changes. Such a rapid language change seems implausible. Strictly speaking, it would have caused a child to be unable to understand his own grandparents.

I do not believe in glottochronology or other mathematical shortcuts in historical linguistics. So I hesitate to state that X amount of years is needed for a sequence of changes. Nonetheless it is hard for me to believe that intelligibility can be lost in as few as two generations. I presume that sound changes must be gradual if the language is to be transmitted at all.*

*12.15.2:15: That last sentence rhymes, but is it true?

Here's a simple scenario to demonstrate what I meant by gradual. Suppose we know that a language lost final stops: e.g., -k > -Ø. Is there any documented instance of speakers losing them in a generation or two? I would imagine that there would be a transitional generation with -ʔ, and that -k and -ʔ (and later, -ʔ and -Ø) would coexist within the same community for at least a generation or two.

How would gradualism work with, say, the Vietnamese consonant shift?

*s > *t > [ɗ]

I suppose the two consonants could have been ts or θ and d at an intermediate stage:

*s > *ts/θ > [t]

*t > *d > [ɗ]

θ might have hardened into the aspirated stop th of some Muong varieties: e.g.,

'hair' (data from SEAlang)

Proto-Vietic *-suk

Son La and Thanh Hoa Muong sak⁷

Hoa Binh Muong thắc

Vietnamese tóc [tawk͡p]

(There is more to the chain. Palatal ɕ became a new s, and in the north, retroflex ʂ merged with that s. However, that happened long after the rest of the shift was completed, which is why I didn't star ɕ and ʂ. Both fricatives are attested in 17th century Vietnamese romanization as <x> and <s>. There was no alveolar s in the dialect that was the basis of that romanization.)
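One constraint on any gradual scenario for the *s > t, *t > ɗ shift is that the two changes cannot simply apply one after the other in the order *s > t, then *t > ɗ, or the two consonants would merge. The toy sketch below makes that explicit; the form "sat" is a made-up string rather than a Vietic word, and the function names are hypothetical.

```python
SHIFT = {"s": "t", "t": "ɗ"}

def shift_simultaneously(word):
    """Map every segment through the shift in one pass."""
    return "".join(SHIFT.get(ch, ch) for ch in word)

def shift_sequentially(word, order):
    """Apply the changes one at a time, in the given order."""
    for old in order:
        word = word.replace(old, SHIFT[old])
    return word

toy = "sat"  # hypothetical form containing both consonants
print(shift_simultaneously(toy))           # contrast kept
print(shift_sequentially(toy, ["t", "s"])) # *t moves first: contrast kept
print(shift_sequentially(toy, ["s", "t"])) # *s moves first: merger
```

Intermediate stages such as ts/θ and d are one way to keep the two consonants apart while both are in motion.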