Amaravati: Abode of Amritas

10.12.4.21:45: EHRET'S PROTO-AFROASIATIC AND PROTO-SEMITIC CONSONANTS

This is a reorganized version of the table in Ehret (1995: 481-482), with Arabic added and the non-Semitic descendants of Proto-Afroasiatic excluded. I have placed 'emphatics' at the top since they interest me most.

	Proto-Afroasiatic	Proto-Semitic	Arabic
'emphatics'	*p'	*b (nonemphatic!)	b (nonemphatic!)
	*t'	t (t')	t
	*tl'	s (s')	s
	*s'	*s (nonemphatic!)	s (nonemphatic!)
	*c'	θ (tʸ')	ð or z
	*k'	k (k')	q
	*kʷ'	k (k')	q
labials	*p	*p	f
	*f	*p	f
	*b	*b	b
	*m	*m	m
	*w	*w	w
dentals, alveolars, palatals	*t	*t	t
	*d	*d	d
	*n	*n	n
	*ɬ	*ɬ	š
	*dl	d (ɮ) (emphatic?)	d (emphatic!)
	*l	*l	l
	*r	*r	r
	*ts	θ (tʸ)	θ
	*c	θ (tʸ)	θ
	*dz	ð (dʸ)	ð
	*j	ð (dʸ)	ð
	*s	*s	s (may be palatalized to š)
	*z	*z	z
	*š	*c	s
	*y	*y	y
velars, labiovelars	*k	*k	k
	*kʷ	*k	k
	*g	*g	j
	*gʷ	*g	j
	*ɲ	*n	n
	*ŋ
	*ŋʷ
	*x	*x	x
	*xʷ	*x	x
	*ɣ	*ɣ	ɣ
	*ɣʷ	*ɣ	ɣ
pharyngeals	*ħ	*ħ	ħ
	*ʕ	*ʕ	ʕ
glottals	*ʔ	*ʔ	ʔ
	*h	*h	h

12.4.23:50: Ehret is not certain whether Proto-Semitic had ejectives or emphatics.

It's not surprising that PAA *p' was lost since p' is an uncommon ejective. (Sanskrit b, a descendant of PIE *p', is only a third as common as Skt bh, a descendant of PIE *ph. Are there other languages in which *ph became *bh? Voiced consonants usually devoice, not the other way around.)

The shift of PAA *p' to PS *b has a parallel in Indo-European: PIE *p' > *b. (Could there have been an intermediate implosive stage: *p' > *ɓ > *b? Are there known cases of ejectives becoming implosives?) However, all PIE ejectives voiced:

*p' > *b
*t' > *d
*k' > *g
*kʷ' > *gʷ

whereas only PAA *p' voiced. PS *θ (*tʸ') later voiced to Arabic ð or z.

I wonder if there was a chain shift between PAA and PS:

*tl' > *ts' > *s' > *s

One Arabic emphatic (*d) has a non-emphatic/ejective source: PAA *dl (which may have become PS *ɮ). Why did *dl or *ɮ harden and become emphatic? Its voiceless counterpart deaffricated: PAA *tl' > Arabic s.

Note that there is no non-ejective *tl in PAA corresponding to ejective *tl' and voiced *dl. I'm surprised that UPSID has slightly more languages with ejective tl' (9) than with nonejective tl (7). The difference (2) may be insignificant. What would the difference be with a larger sample?
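As a rough check on whether a 9 : 7 split could be due to chance, here is a minimal sketch of an exact two-sided sign test. It assumes that the sixteen attestations are independent and that, under the null hypothesis, each is equally likely to be tl' or tl. Neither assumption comes from UPSID itself, and languages with both segments would complicate the picture. The function name is my own.

```python
from math import comb

def two_sided_sign_test(a, b):
    """Exact two-sided binomial (sign) test p-value for counts a vs. b at p = 0.5."""
    n = a + b
    k = max(a, b)
    # Probability of a split at least as lopsided as k : (n - k) in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double for the two-sided test; cap at 1 for the case a == b.
    return min(1.0, 2 * tail)

# UPSID counts quoted above: 9 languages with ejective tl', 7 with nonejective tl.
p_value = two_sided_sign_test(9, 7)
print(f"p = {p_value:.3f}")
```

Under these assumptions the p-value is far above any conventional threshold, so a difference of 2 would indeed be insignificant. A larger sample would be needed to say more.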

10.12.3.23:59: WHY IS DAAD SO ZAAD?

Looking at the Persian name رضا Rezaa, I wondered why the Arabic letter for emphatic d (ض daad) was pronounced z in Persian.

Persian has four letters for [z] - one (ز) for native [z] and three for [z] that correspond to Arabic consonants absent from Persian:

Letter	ذ	ز	ض	ظ
Arabic	ð	z	d	ð or z
Persian	z	z	z	z

The spellings of Arabic borrowings are more or less retained in Persian: e.g., Rezaa (< Arabic رضا ridaa) was not respelled as رزا with a regular ز z. (Arabic short i was borrowed as earlier Persian short i which later lowered to e.)

The substitution of Persian [z] for ð in Arabic borrowings has a parallel in Japanese: e.g., Jpn za < English the [ðə].

However, the substitution of Persian [z] for d in Arabic borrowings is surprising since Arabic t was borrowed as Persian [t]. I would expect Arabic d to have been borrowed as Persian [d]. Did Persians hear Arabic d and approximate it as [z]? I doubt it. According to Wikipedia, the Arabic letter now pronounced as an emphatic stop d was once an emphatic lateral fricative ɮ in Qur'anic Arabic. Could Persian [z] for ض be an approximation of earlier Arabic ɮ? Does Rezaa reflect an earlier Arabic riɮaa?

One might expect the Arabic emphatic stop t to have been Qur'anic Arabic ɬ, the voiceless counterpart of ɮ, but the stop quality of t goes all the way back to proto-Semitic. Proto-Semitic voiceless *ɬ became Qur'anic Arabic voiced ɮ according to Wikipedia.

Similarly, Huehnergard's Proto-Semitic voiceless *θ became Arabic voiced ð or z. In some modern Arabic dialects, q (the emphatic counterpart of k) became voiced g. Why did some emphatics voice?

There is no parallel phenomenon in Chinese. Old Chinese voiceless emphatics never became voiced. (The use of voiced consonant letters to write voiceless consonants in Pinyin romanization does not count: e.g., 東 Pinyin dong [tʊŋ] < Old Chinese *toŋ 'east'.)

12.4.1:04: According to Kaye (1987: 669), "[i]n Old Arabic, the primary emphatics were, in all likelihood, voiced":

Huehnergard's Proto-Semitic	Kaye's Old Arabic	Later Arabic
all voiceless	all voiced	half voiceless, half voiced
*s	z	s (same as Proto-Semitic!)
*ɬ	z^λ	d
*t	d	t (same as Proto-Semitic!)
*θ	ð	ð or z

(Bold indicates voicing. Superscript λ is Kaye's symbol for lateralization.)

Assuming Huehnergard's Proto-Semitic consonants are correct,

- why did all the emphatics voice in Old Arabic?

- why did only two emphatics (z and d) devoice in later Arabic, reverting to their Proto-Semitic values (though Arabic speakers would not have realized that)?

I would rather assume that Proto-Semitic *s and *t were retained in Arabic.

Ehret (1995) reconstructed three voiceless ejectives and one voiced lateral affricate as the Proto-Afroasiatic sources of the four Proto-Semitic emphatics in the previous table:

Ehret's Proto-Afroasiatic	Ehret's Proto-Semitic	Arabic
three voiceless, one voiced	three voiceless, one voiced	half voiceless, half voiced
*tl'	*s	s (same as Proto-Semitic!)
*dl	*d	d
*t'	*t	t (same as Proto-Semitic!)
*č'	θ or t^y' (sic for *t^y?)	ð or z

Why did one of Ehret's Proto-Semitic consonants voice in Arabic?

10.12.2.23:57: YOD IN GONG'S OLD CHINESE AND TANGUT RECONSTRUCTIONS (PART 2)

Gong reconstructed a yod in the Old Chinese (OC) and Tangut numerals for 'two', 'four', 'six', 'eight', and 'nine'. In this entry, I'll only look at 'two' and 'four' which have similar rhymes and save the rest for later. MC = Middle Chinese.

Gong	Old Chinese	Written Tibetan	Tangut
Expected pattern	*-j-	palatalized consonant	-j-
'two'	二 njids > MC ɲʑi	gnyis [gnjis] > gnyis* [gɲis]	1njɨɨ
'four'	四 ljids > MC si	b-lyi [blji] > bzhi* [bʑi]	1ljɨɨʳ

Both numerals conform to Gong's expected pattern.

But I don't think it's necessary to reconstruct a yod in these words:

This site	Old Chinese	Written Tibetan	Tangut
'two'	二 nis > MC ɲih	g-nis > gnyis [gɲis]	nəə > 1niəə*
'four'	四 s-hlis > MC sih	b-li > bzhi* [bʑi]	r-ləə > 1liəəʳ*

Old Chinese: Nonemphatic OC *n- palatalizes to MC *ɲ-. I don't project MC palatality back to OC.

Gong's OC *lj- normally becomes MC *z-, not the *s- of 'four'. I have adopted Sagart's (1999) reconstruction (*s-hl-) for the initial of 'four'.

Written Tibetan: Gong's system distinguishes between pre-Tibetan dental-i and dental-yi sequences. I propose that PT dental-i generally became WT palatal-i. Exceptional cases of WT dental-i may be

- due to dialect mixture (as proposed by Jacques 2004: 10)

- archaisms

- from dental-nonpalatal vowel sequences: e.g., PT *nɨ > WT ni?

- from uvular-dental-i sequences: e.g., PT *qni > WT ni?
- from emphatic dental-i sequences: e.g., PT *ni > WT ni?

- from lost presyllables with nonhigh vowels: e.g., PT *Cʌ-ni > WT (C)ni?

(cf. OC *Cʌ-ni > MC *nej with nonpalatal n and pre-Tangut *Cʌ-ni > Tangut nəi [unattested; see the table below]. In all these cases, *Cʌ- reduced the palatality of the following syllable.)

For further discussion of palatalization in Tibetan, see Jacques (2004: 9-10).

Tangut: Pre-Tangut *-əə automatically rose to *-ɨəə and then fronted to -iəə after dentals. Hence Tangut -i- in -iəə is an innovation, not a retention. Pre-Tangut *Cʌ- blocked raising of *ə: *Cʌ-Cəə > Cəə (not Ciəə).

The long Tangut rhymes may imply lost final consonants.

The retroflex rhyme of 1liəəʳ 'four' implies a pre-Tangut *r(ɯ)- prefix which may have had a high vowel.

I cannot explain why OC and WT i correspond to Tangut central vowels.

12.3.0:45: Tangut dental-i is more common than Tangut dental-əi:

Pre-Tangut	*(Cʌ-)Ti	*(Cɯ-)li	*Cʌ-Ti
Tangut initial \ Rhyme	-i	-ɨi	-əi
t-	4	0	1
th-	3	0	0
d-	11	0	1
n-	14	0	0
l-	0	18	9
lh-	10	0	1
Total	42	18	12

There is no simple li in my reconstruction. Perhaps Tangut l- was velar [ɫ] whereas lh- was nonvelar voiceless [l̥] or an alveolar fricative [ɬ].

I would expect the sum of simple *Ti-syllables and *Cɯ-Ti-syllables to outnumber *Cʌ-Ti syllables in Tangut: e.g., a 2 : 1 ratio. However, the actual ratio is 5 : 1. If (nondental?) l-syllables are excluded, the ratio is 14 : 1. Why are Təi-syllables so uncommon? Did -əi merge with -i after dentals? If so, are təi and dəi archaisms or borrowings? And why are di, ni, lhi more common than ti and thi? (d- n- lh-) do not comprise a phonetic class sharing a single characteristic distinguishing them from (t- th-).
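The two ratios can be recomputed from the table as a sanity check. This is only a sketch: the dictionary restates the counts from the table above, and the function name is my own.

```python
# Syllable counts from the table above: initial -> (count with -i, -ɨi, -əi).
counts = {
    "t-":  (4, 0, 1),
    "th-": (3, 0, 0),
    "d-":  (11, 0, 1),
    "n-":  (14, 0, 0),
    "l-":  (0, 18, 9),
    "lh-": (10, 0, 1),
}

def high_to_nonhigh_ratio(rows):
    """Ratio of (-i plus -ɨi) syllables to -əi syllables for the given rows."""
    high = sum(i + ii for i, ii, _ in rows)
    nonhigh = sum(ei for _, _, ei in rows)
    return high / nonhigh

all_rows = list(counts.values())
without_l = [row for initial, row in counts.items() if initial != "l-"]

ratio_all = high_to_nonhigh_ratio(all_rows)         # (42 + 18) / 12
ratio_without_l = high_to_nonhigh_ratio(without_l)  # (42 + 0) / 3
print(ratio_all, ratio_without_l)
```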

10.12.1.23:39: A SÓNG OF TWO WATERS

While searching for the vietograph for sóng 'wave' (㳥)* in Unicode, I discovered eleven sinographs with double water radicals (氵):

㳲㴢㴣㴺㵈㵉㵜㶂㶃㶙㶝

I've never seen sinographs with duplicate left-hand radicals before. None of the eleven are in the Kangxi dictionary (now online!) or the Taiwanese sinograph variant dictionary (with 106,230 entries!). According to the Unihan database, all eleven are in PKS C 5700-2 1994 from Korea, whatever that is. Are these made-in-Korea sinographs (koreographs)? They all have counterparts minus one water radical:

汰㳞洟浹㴌淋湳潢潕㶋㶐

I presume these are phonetic in the double-radical graphs. What is the function of the second water radical? Were the double-radical graphs devised to write Korean river names?

Endnotes added 12.2.3:10:

*㳥 sóng 'wave' is identical to an obscure sinograph 㳥 Md long 'name of a river' which is in the Kangxi dictionary but not in the Taiwanese variant dictionary. When vietographs look like little-known sinographs, it's not possible to determine whether the Vietnamese deliberately recycled existing rare characters or accidentally recreated rare characters unknown to them by combining components (in this case, 氵 'water' and the phonetic 弄, pronounced lộng in Vietnamese**).

The choice of an *l-phonetic indicates that 㳥 was chosen or devised at a time when Vietnamese *Cr- had not yet fused into s-. Since there were no *(C)r-phonetics, an *l-phonetic was the closest possible substitute. Unfortunately, 㳥 contains no clues to the identity of the original initial consonant *C-.

**The Mandarin reading nong for 弄 has an unexpected n- instead of l- and is presumably a borrowing from a dialect which had shifted *l- to n-. Vietnamese and other languages preserve the original l-. Xinhua zidian (1971 ed.) regards long as an "old" and "dialectal" (i.e., obsolete and nonstandard) Mandarin reading.

10.11.30.23:36: YOD IN GONG'S OLD CHINESE AND TANGUT RECONSTRUCTIONS (PART 1)

The Tangut word for 'one' is the odd man out according to Gong's set of correspondences:

	Old Chinese	Written Tibetan	Tangut
Expected pattern	*-j-	palatalized consonant	-j-
Gong-style reconstructions	隻 *tjik	g-tjig > gcig*	1lew (no -j-!)
My reconstructions	隻 tek (< tjak?)	k-tjik > gcig*	1lew (no -j-!)

(I don't have Gong's reconstructions of the OC and WT words for 'one' on hand, so I have reconstructed them according to my understanding of his system.)

As far as I know, Gong did not reconstruct dental stops as sources for Tangut l-, so he might have regarded lew as unrelated to *t-words for 'one'. On the other hand, I have proposed that the l- of 1lew is a lenited *t-: *Cʌ-tek > 1lew.

That takes care of the root initial mismatch, but what about the lack of a medial -j- in Tangut 1lew? Gong's Tangut reconstruction has no rhyme -jaw < *-jak. Perhaps Tangut 1lew is from *Cʌ-tjak, which in turn would be from a Proto-Sino-Tibetan *tj-k 'one'. WT gcig could be from 'zero grade' *tjik (with an -i- inserted to break up *tj- and *-k) whereas the other two words could be from an *a-grade *tjak. (See Schuessler 2007: 106-107 for other examples of OC *e : foreign ja.)

12.2.2:44: One may wonder why pre-Tangut *ja corresponds to Tangut Grade I e in 'one' rather than Tangut Grade IV ie. Perhaps the *ja > e shift predated the breaking of *e to ie in syllables with high vowel presyllables:

Stage 1	*(Cʌ-)Cjak	*Cɯ-Cjak, *Cɯ-Cek
Stage 2: *ja > *je > *e	*(Cʌ-)Cek	*Cɯ-Cek
Stage 3: *e > *ie after *Cɯ-	*(Cʌ-)Cek	*Cɯ-Ciek
Stage 4: Presyllable loss; stop coda lenition	Cew	Ciew
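The staged derivation above can be sketched as a sequence of ordered string rewrites, which makes it easy to see that the order of the stages matters. This is only an illustration: "C" stands for any consonant as in the table, Stage 2 is collapsed into a single step, and stop coda lenition is modeled only as *-k > -w, which is my simplifying assumption.

```python
import re

# Each stage is one rewrite applied to the output of the previous stage.
# "C" is a placeholder for any consonant, as in the table above.
STAGES = [
    # Stage 2: *ja > *je > *e (collapsed into a single step here).
    lambda s: s.replace("ja", "e"),
    # Stage 3: *e breaks to *ie only when a high-vowel presyllable *Cɯ- is present.
    lambda s: re.sub(r"^(Cɯ-C)e", r"\1ie", s),
    # Stage 4a: presyllables (*Cʌ- or *Cɯ-) are lost.
    lambda s: re.sub(r"^C[ʌɯ]-", "", s),
    # Stage 4b: the stop coda lenites; only *-k > -w is modeled (an assumption).
    lambda s: re.sub(r"k$", "w", s),
]

def derive(form):
    """Return the form after each stage, starting with the input itself."""
    history = [form]
    for rule in STAGES:
        history.append(rule(history[-1]))
    return history

for source in ["Cjak", "Cʌ-Cjak", "Cɯ-Cjak", "Cɯ-Cek"]:
    print(" > ".join(derive(source)))
```

If breaking (Stage 3) were ordered before the *ja > *e shift (Stage 2), *Cɯ-Cjak would have no *e to break and would end up as Cew rather than Ciew.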

10.11.29.23:59: YOD IN GONG'S OLD CHINESE AND TANGUT RECONSTRUCTIONS (PART 0)

In "The System of Finals in Proto-Sino-Tibetan", Gong Hwang-cherng (1995: 41-42) wrote,

It [Tangut] is very important for the reconstruction of Proto-Tibeto-Burman (PTB) as well as of PST [Proto-Sino-Tibetan] phonology, because it retains the medial /-j-/ of PST which is supposed to have been lost in WT [Written Tibetan] in all environments except before the high front vowel /-i-/ and WB [Written Burmese].

[...]

A comparison of OC [Old Chinese] with WT, WB, and Tangut shows that the PST medial /-j-/ was retained in OC and Tangut, but almost always lost in WT and WB except before the high front vowel /-i-/ in WT, where the medial /-j-/ has either been retained or has its reflect in the palatalization of the preceding consonants. In the latter case the medial /-j-/ was lost after causing the palatalization of the preceding consonant.

Here is Gong's proposed system of correspondences in tabular form:

Proto-Sino-Tibetan *-j-
	Old Chinese *-j-
	Proto-Tibeto-Burman *-j-
		Written Tibetan -y- [j] or palatalized preceding consonant before -i-; -Ø- elsewhere
		Written Burmese -Ø-
		Tangut -j-

His *-j-loss rule for Tibetan strikes me as strange. Is there any other language in which -j- was lost before all vowels other than i? I would expect a ji : i distinction to be lost before a ja : a distinction, not the other way around.

In later parts of this series, I will test this set of correspondences with numerals.

10.11.28.23:54: DO TIBETAN TRANSCRIPTIONS SUPPORT THE YOD OF GONG'S TANGUT RECONSTRUCTION?

The short answer is 'no'.

W. South Coblin wrote in his eulogy for the late Prof. Gong (emphasis mine):

It is clearly premature to venture a full-blown assessment of Professor Gong's scholarly life and work, if we mean by that a statement of the ultimate influence of his published oeuvre on the development of the field. But some reflections in this area are perhaps not unwarranted here. We may begin with Tangutology, a field on which the present writer is unqualified to venture an opinion. Clearly, it is incumbent upon specialists in it to examine Professor Gong's contributions and convey their conclusions to the rest of us in forms we can all understand. For, to undertake the herculean task of mastering the details of this arcanum, as Professor Gong to his great credit in fact did, is probably beyond the capacity of many of us. As mentioned above, Professor Gong's work on the reconstruction of Old Chinese phonology is somewhat distanced from the mainstream of this discipline as it is widely practiced today, in Europe and North America at least. Whether what he did will ultimately have a significant impact on the field at large is difficult to predict. But it seems clear that his work must be examined and taken into account by specialists in Old Chinese phonology, something that has not so far been done. To cite a case in point, the majority of present-day historical linguists outside China now reject the existence of the earlier ubiquitous high front medial glide [j], conventionally called yod, as a segmental feature in pre-Han Chinese. F. K. Li posited this yod in his Old Chinese system, and Professor Gong retained it in his, as well as in Proto-Sino-Tibetan. At first sight, this might appear to be rank conservatism, i.e. a sort of atavistic appeal to the earlier traditions of Li and Bernhard Karlgren. But such a conclusion would be erroneous. For Professor Gong's yod [for Old Chinese and Proto-Sino-Tibetan] is in fact buttressed by the presence of a comparable sound in his Tangut reconstruction. 
Hence, he concludes that methodological parsimony supports its retention in Old Chinese and, of course, in Sino-Tibetan. And anyone who rejects his yod must needs confront this conundrum. First of all, is Tangut yod correctly reconstructed by Professor Gong, or is it fallacious? And secondly, if it is correct, is it inherited from earlier stages of Tibeto-Burman, or on the contrary a Tangut innovation of some sort? To ignore the Tangut data is to leave a significant corpus of evidence out of the Sino-Tibetan/Old Chinese equation. And this question is but one of many raised by Professor Gong's work. Each of these must ultimately be addressed, and therein may lie his ultimate influence on the field.

My answers to Coblin's two questions are:

- Prof. Gong's yod in Tangut is dubious in many (but not all) cases.

- When it is not dubious, it may be inherited (primary) or innovative (secondary).

I'll expand on these points later this week.

Yod (-j-) appears in 48 of the 105 rhymes of Prof. Gong's reconstruction. I would expect this -j- to correspond to -y- in Tibetan transcription. Yet the vast majority of Tangut syllables that should have -y- in Tibetan transcriptions do not. Out of 524 tangraphs with Tibetan transcriptions, only twenty were transcribed with -y- in Tibetan (Tai 2008: 154). Out of these twenty, Prof. Gong reconstructed seventeen with -j-, two with -i-, and one without either -j- or -i-. If Tangut really had -j- in 46% of its rhymes, why was this -j- rarely transcribed? Were the transcribed dialects in the process of losing -j-, or did Prof. Gong overreconstruct -j-? I suspect the latter.

11.29.1:51: Although Gong's reconstructions for 308 of the 524 tangraphs with Tibetan transcriptions in Tai 2008 have -j-, only 5.5% (17) of those 308 were transcribed with -y- in Tibetan. What could account for a 94.5% rate of mismatch? The absence of coronal + -y- sequences in Tibetan orthography would explain why there are only two instances of coronal + -y- transcriptions:

3368: Gong: 1thjwị (my 1thwị) transcribed as thwyi and thyi (both unusual letter sequences in Tibetan)

but it would not explain why there are only two instances of labial + -y- sequences:

5415: Gong: 1bju (my 1bɨu) transcribed as Hbyu

0083: Gong: 1we (no -j-!; my 1vəi) transcribed as wyi (an unusual letter sequence in Tibetan)
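The percentages quoted in this entry can be rechecked from the raw counts. This is only a sketch: the constant and function names are my own, and the counts are the ones cited above from Gong's reconstruction and Tai (2008).

```python
def percent(part, whole):
    """Percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)

TRANSCRIBED = 524      # tangraphs with Tibetan transcriptions (Tai 2008)
WITH_Y = 20            # of those, transcribed with -y-
GONG_J_AMONG_Y = 17    # of the twenty, reconstructed by Gong with -j-
GONG_J_TOTAL = 308     # of the 524, reconstructed by Gong with -j-

rhyme_share = percent(48, 105)                      # rhymes with -j- in Gong's system
match_rate = percent(GONG_J_AMONG_Y, GONG_J_TOTAL)  # -j- actually transcribed as -y-
mismatch_rate = round(100 - match_rate, 1)
y_share = percent(WITH_Y, TRANSCRIBED)              # all -y- transcriptions

print(rhyme_share, match_rate, mismatch_rate, y_share)
```

The 46% above is the rounded form of 48/105, and the share of all transcribed tangraphs written with -y- is under 4%.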