Amaravati: Abode of Amritas

14.8.2.23:48: S-PRƏNG FROM SOME COMMON SOURCE?

The title is a reference to Sir William Jones' famous phrase "sprung from some common source" which is also the title of this book I borrowed two decades ago.

Schuessler (2009: 117) reconstructed the Old Chinese word 冰 for 'ice' as *prəŋ and suggested that it may be cognate to Proto-Tibeto-Burman *pam, the source of Tangkhul Naga pʰam and Kanauri pom 'snow' and rGyalrong (variety unspecified) ta-rpam 'ice'. I don't believe in the traditional binary view of Sino-Tibetan with Chinese in one branch and all other languages in a Tibeto-Burman branch. So I do not think there ever was a Proto-Tibeto-Burman - a single ancestor of all non-Chinese Sino-Tibetan (or 'Trans-Himalayan') languages. Nonetheless, could a word like *pam be reconstructed at the Proto-Sino-Tibetan level?

If such a word existed, how did it develop into Old Chinese *prəŋ? Let's look at two changes that occurred in similar syllables:

Old Chinese *-əm dissimilated to *-uŋ in Middle Chinese after labial initials unless blocked by a medial *-r-: e.g., *pəm 'wind' (written with a character 風 containing the phonetic 凡 *bam) became Middle Chinese *puŋ.

Early Old Chinese *-əŋ assimilated to *-uŋ in Late Old Chinese after labial initials unless blocked by a medial *-r-: e.g., 夢 *məŋ 'dream' became Late Old Chinese *muŋ(h). (The final *-h is irregular, but does not concern us here.)

Middle Chinese 冰 'ice' was *pɨŋ, not *puŋ, so it could not have come from Old Chinese *pəm or *pəŋ. Schuessler's Old Chinese *-r- blocked the assimilation of *-əŋ to *-uŋ after *p-, and *-rə- regularly became Middle Chinese *-ɨ-.
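The two changes and the blocking effect of *-r- can be sketched as a toy rule applier. This is only an illustration under my own simplifying assumptions: the function name, the small labial inventory, and the restriction to syllables of the shape labial + (r) + ə + m/ŋ are all hypothetical, and final glottals and tones are left out.

```python
import re

# Hypothetical toy pattern: labial initial, optional medial r, schwa, nasal coda.
# Only the syllable shapes discussed above are covered.
SYLLABLE = re.compile(r"(ph|p|b|m)(r?)ə(m|ŋ)")


def to_middle_chinese(old_chinese: str) -> str:
    """Map a toy Old Chinese syllable (no asterisk) to its Middle Chinese reflex."""
    match = SYLLABLE.fullmatch(old_chinese)
    if match is None:
        raise ValueError(f"syllable shape not covered by this sketch: {old_chinese}")
    initial, medial, coda = match.groups()
    if medial:
        # *-r- blocks the change to *-uŋ; *-rə- becomes *-ɨ- and the coda survives.
        return f"{initial}ɨ{coda}"
    # Without *-r-, both *-əm and *-əŋ end up as *-uŋ after a labial.
    return f"{initial}uŋ"
```

With these rules, pəm and pəŋ both come out as puŋ, while prəŋ comes out as pɨŋ, which is why only the form with *-r- can underlie Middle Chinese *pɨŋ.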

冰 'ice' rhymed with *-əŋ words in two poems in Shijing (2, V, 1, 6 and 2, V, 1, 6).

To force a connection between Old Chinese *prəŋ and pam-type words elsewhere in Sino-Tibetan, I would have to claim that final *-m irregularly dissimilated to *-ŋ after *p- long before such dissimilations were first attested. Moreover, such dissimilations were blocked by *-r- which did not block dissimilation in this scenario. Old Chinese 品 *phrəmʔ 'class' regularly developed into Middle Chinese *phɨmˀ, not *phɨŋˀ.

On the other hand, Old Chinese 稟 *prəmʔ 'to receive from above' corresponds to Mandarin bing, a homophone of 冰 bing 'ice' (disregarding tone), rather than the expected *bin which would be the regular reflex of Middle Chinese *pɨmˀ. Could 稟 bing descend from an Old Chinese dialect in which *-m dissimilated to *-ŋ after *pr- - a dialect not ancestral to the dialects underlying the Middle Chinese lexicographical tradition? Is Old Chinese 冰 *prəŋ 'ice' a word from such a dialect? Or is 稟 bing the result of a later change? In Mandarin -ng sporadically assimilated to -i- by fronting to -n (e.g., 拼 pin in pinyin is from *ping), but I don't know of any cases of the reverse.

Perhaps it would be preferable to find an internal etymology for 冰 *prəŋ 'ice' instead of forcing it into a foreign mold. I'll evaluate such an etymology next time.

14.8.1.23:54: PRE-TANGUT S-ʔɅ-PAM 'ICE'

When writing last night's entry on pre-Tangut *Si-pa 'snow', I rediscovered a 2009 entry in which I reconstructed pre-Tangut *sʌ-pam 'ice' as well as *Si-pa 'snow'. Although I still think *Si-pa is valid five years later, *sʌ-pam cannot be correct because it would become Tangut 1vọ, not

4053 1ʔwọ

which is one of the three actual Tangut words for ice. In 2009, I accidentally reconstructed 4053 as 1vọ with a class II (labiodental) initial v- instead of a class VIII (glottal) initial ʔ-. Errors in Tangut lead to errors in pre-Tangut. I now reconstruct the pre-Tangut ancestor of 1ʔwọ as *S-ʔʌ-pam:

*S- conditioned the tenseness of the vowel (indicated with subscript dot).

The unaccented presyllabic vowel *-ʌ- was later lost, though the presyllabic initial *ʔ- remained.

The unaccented presyllabic vowel could not have been high because a high vowel would have conditioned a high vowel in the main syllable: *S-ʔɯ-pam would have become *1ʔwiọ*.

*-p- lenited to -w- between the vowels *-ʌ- and *-a-.

*-am became *-o.
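The steps above can be strung together in a small sketch. It is a toy model, not a full account of pre-Tangut: the function name and the set of high vowels are my own assumptions, the root is fixed as *pam, and the first ('level') tone is simply stipulated as the prefix 1.

```python
# Assumed set of high presyllabic vowels for this sketch.
HIGH_VOWELS = {"i", "ɯ", "u"}


def derive_from_s_glottal_v_pam(presyllabic_vowel: str) -> str:
    """Derive the Tangut reflex of pre-Tangut *S-ʔV-pam for a presyllabic vowel V."""
    tone = "1"       # first ('level') tone, stipulated rather than derived
    initial = "ʔ"    # the presyllabic initial survives the loss of its vowel
    lenited = "w"    # *-p- lenites to -w- between vowels
    # A high presyllabic vowel would condition a high medial in the main syllable.
    medial = "i" if presyllabic_vowel in HIGH_VOWELS else ""
    rhyme = "ọ"      # *-am > -o, made tense (subscript dot) by *S-
    return tone + initial + lenited + medial + rhyme
```

Calling the function with ʌ yields 1ʔwọ, the attested form, whereas ɯ yields the unattested 1ʔwiọ.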

I am assuming** the root *pam is cognate to Proto-rGyalrong *lpaˠm 'ice' (as reconstructed by Jacques 2004: 249; see attested forms in item 1290 of Nagano and Prins' database). (Jacques first identified 4053 as a cognate of Japhug rGyalrong tɤ-jpʰɣom in 2003.)

Next: Does Chinese also share that root?

*8.2.0:28: There are no rhymes with Grade IV -io, -iọ, or -ioʳ after 1ʔ-. Are these chance gaps, or clues to a constraint of pre-Tangut phonological structure?

**8.2.0:52: This assumption may be false. 4053 1ʔwọ has many other potential sources:

-w- could also be original or a lenited reflex of *ph- or *b- or even *m- if nasals lenited. (So far I have not yet seen any evidence for Irish-style nasal lenition: e.g., v-/ʔw- ~ m- alternations within Tangut and/or Tangut v-/ʔw- words with m- cognates in other languages. However, I have not yet looked for such evidence. Hence I cannot rule out the possibility of nasal lenition.)

-o may also be from *-aŋ or *-o.

*ʔ- may be part of the root and might be from *q-, though I would expect a uvular to condition the Grade II medial -ɤ- that is absent in 4053 1ʔwọ.

Only *S- is certain.

14.7.31.23:31: PRE-TANGUT *SI-PA 'SNOW'

Blench and Post (2013) gathered words for 'snow' from 190 Sino-Tibetan (or as they prefer, Trans-Himalayan) languages and dialects and found that

there are some 30% unidentifiable forms [i.e., apparent isolates], the remainder assigned to some ten different roots, each of low frequency. In Sinitic, we find attestations of four of these roots suggesting that this may in fact represent a complex network of borrowing rather than reconstructions of great antiquity. Accordingly, the probability is low that 'snow' was part of the environment of early Sino-Tibetan speakers.

One of those unidentifiable forms was Tangut

4091 1vɨị 'snow'

That syllable could have a variety of pre-Tangut sources:

- the tenseness of its vowel (indicated by a subscript dot) is from a pre-Tangut *S-

- v- could be from *w- or from an unknown labial obstruent *-P- that lenited in intervocalic position

- the Grade III medial -ɨ- is automatic between v- and a high vowel; it need not be projected back into pre-Tangut.

- -i could be original or be from a pre-Tangut *-a that raised after a presyllabic *-i-

- if there was a coda in pre-Tangut, it was lost in this environment. Tangut generally did not permit nasalized tense vowels (from earlier *S- ... -vowel-nasal sequences) and lost all stops after *-a. That coda could not have been a final glottal and/or fricative *-H which would have conditioned a second ('rising') tone in Tangut instead of a first ('level') tone. (8.1.23:11: Nor could that coda have been *-ŋ or *-m; pre-Tangut *-aŋ and *-am became Tangut -o, not Tangut -a.)

The possibilities could be summed up as *(S(I))-Pi/a(C).

If the Tangut word is cognate to rGyalrong words for 'snow' (item 1288 in Nagano and Prins' database) such as Japhug tɤ-jpa as first proposed by Jacques in 2003, then its pre-Tangut ancestor was *Si-pa.

*Si-pa > *Si-pia > *Si-pi > *Si-βi > *Si-vi > *Svi > *vvi > *vvị > *vị > 1vɨị

The proto-rGyalrong root was something like *lpa(k). Some rGyalrong varieties (e.g., Yophyi, Marspang, Sabarkyo) have -k or -ʔ or even -ʔk; others like Japhug do not. I cannot confidently reconstruct *-k at the proto-rGyalrong level because Somang ta-jpâ 'snow' lacks the final *-k I would expect. I do not know whether the pre-Tangut form had *-k.

(23:39: Could pre-Tangut *-i- be from a preinitial *l-: *S-lpa > *S-jpa > *Si-pa?)

This *pa-type root may be unique to Qiangic. Some or even all of the words that Blench and Post derived from #pu[n] and #[te] van in Qiangic may actually be from *pa. But I don't think it's possible to link that root to their #pham which ends in a nasal. Could va-type words for 'snow' in Loloish be from *pa with an initial that lenited (as in Tangut)?

23:36: What about Naxi be 'snow'? Could it be from *pa with brightening of the vowel (see Jacques and Michaud) and an initial that voiced after a now-lost presyllable: *CV-p- > *CV-b- > b-?

8.1.1:39: Could Naxi b- be from *N-p- with a nasal prefix?

8.1.1:36: I forgot that I had already written about the Tangut word for 'snow' five years ago! But at least this time I included the character for that word.

14.7.30.22:45: SINO-TIBETAN AND/OR TIBETO-BURMAN AS SUBGROUPS OF TRANS-HIMALAYAN

Last night I rediscovered Blench and Post's 2013 paper "Rethinking Sino-Tibetan phylogeny from the perspective of North East Indian languages". After a year I have yet to fully absorb it, and here I only intend to comment on a couple of bits in it. However, before I get there, I need to outline pre-Blench/Post views on Sino-Tibetan.

The traditional view of the Sino-Tibetan family is that it consists of Chinese in the east and everything else ('Tibeto-Burman') in the west.

Sino-Tibetan
Tibeto-Burman: Tibetan, Burmese, Tangut ...	Chinese: Mandarin, Cantonese, Taiwanese ...

W. South Coblin (2010) observed that the late Gong Hwang-cherng's Proto-Sino-Tibetan reconstruction "was, phonologically at least, virtually the same language as Old Chinese."

Similarly, Matisoff's (2003) reconstruction of Proto-Tibeto-Burman resembles Classical Tibetan: e.g., his PTB *b-r-gyat 'eight' is almost identical to Classical Tibetan brgyad.

Are Old Chinese and Classical Tibetan really so archaic? Blench and Post would probably say no:

It cannot be emphasised too strongly that these [languages with early written records: i.e., Chinese and Tibetan] are, if not indeed irrelevant, of relatively very low significance for the reconstruction of proto-forms of a phylum the great majority of whose members have never been written and which must be far beyond the reach of epigraphy. This emphasis on 'major' languages has had another consequence: 'minor' and often poorly documented languages have generally been excluded from consideration. This is particularly true of the languages of North East India, where the way of life hardly matches the settled agricultural lifestyle depicted for Proto-Sino-Tibetan speakers. (p. 2)

In Blench and Post's model of Sino-Tibetan (or 'Trans-Himalayan') in figure 6 on page 18, at least three of the 'major' languages turn out to be the tip of just one branch of the family which I call 'Sino-Tibetan':

Trans-Himalayan (traditional Sino-Tibetan)

'Greater Nagish'

2-11 other primary branches

'Greater Kachinic-Karenic'

Tani

Nagish

West

Kachinic

Karenic

East

'Qiangic-Sino-Tibetan-Nungish'

Tujia

Bai

North Qiangic

South Qiangic

'Sino-Tibetan' redefined

Nungish

Sinitic

Lolo-Burmish-Naxi

Greater Tibetic (Bodish)

My placeholder names for nodes are in single quotes. I am not happy with 'Qiangic-Sino-Tibetan-Nungish' which is overly long (maybe just 'Qiangic-Nungish'?) or 'Greater Kachinic-Karenic' for the non-Tani-Nagish languages of the 'Greater Nagish' branch. ('Greater Kachinic-Karenic' and Tani are earlier and later offshoots of Nagish proper rather than sisters of Nagish proper.)

The number of primary branches is uncertain since Blench and Post are only certain about Mikir and Mruish as primary branches. Six other potential primary branches (Kamengic, Puroik, Mishmi, Miji, Hruso, Siangic) may not all be Trans-Himalayan. Blench and Post's tree has three more primary branches "for which there is apparently no data, so their position is simply a default."

The term 'Tibeto-Burman' could be recycled within the Blench-Post framework if Lolo-Burmish-Naxi and Bodish could be demonstrated to share an innovation absent from their sister Sinitic.

I don't know where Blench and Post would place Tangut.

Nishida (1976) regarded Tangut as "rather isolated from Lolo-Burmese proper", but still more closely related to Lolo-Burmese than to Sinitic or Bodish. If Tangut is Lolo-Burmese-Naxi, it would be 'Sino-Tibetan' under my new narrower definition.

More recently, Tangut has been regarded as Qiangic. Blench and Post split Qiangic into two branches. I presume Tangut would be a North Qiangic language as it is the northernmost Qiangic language. If so, it would not be 'Sino-Tibetan' in my new narrow sense.

In either case, Tangut is just part of 'Qiangic-Sino-Tibetan-Nungish' and not a primary branch.

All this reminds me of Austronesian. Just as Gong reconstructed Proto-Sino-Tibetan using just four languages (Chinese, Tibetan, Burmese, and Tangut), Dempwolff reconstructed Proto-Austronesian in 1934 using just three languages (Javanese, Tagalog, and Toba Batak). But eighty years later, we know that those three languages belong to just one branch of Austronesian, and that all the other primary branches are on Taiwan. Similarly, in the Blench-Post framework, all of Gong's four languages belong to just one branch of Trans-Himalayan, and all the other primary branches are in northeast India.

14.7.29.23:43: SAY YEYS TO NOTO SANS KOREAN

I ran out of time last night to write a substantial entry, and I don't have time to write part 2. I would, however, like to transition toward tonight's diversion by noting one major difference between Old Persian cuneiform and the Khitan small script: the former had a word divider character (𐏐), whereas characters of the latter were clustered into word blocks.

The hangul alphabet invented five centuries after the Khitan small script also has character clusters, though Korean clusters consist of letters for single syllables.

I've been able to type modern Korean clusters on my computers for 18 years. Stacking is automatic: e.g., if I type ㄸ tt and ㅐ ae, they combine into 때 ttae 'time'.

However, I wasn't able to type premodern Korean letters until the last decade or so, and I've never been able to combine them. But now I can. On Saturday I installed Google's free Noto Sans Korean font and tonight I discovered it has full support for premodern hangul. Now I can type ᄣᅢ pstay, the Middle Korean ancestor of modern 때 ttae 'time'. (You probably won't see a cluster for pstay without Noto Sans Korean or another font with similar capabilities.) Clustering is not yet automated. I have to build obsolete hangul clusters piece by piece from the Hangul Jamo Unicode block using BabelMap: e.g., ᄣᅢ consists of initial ᄣ- pst- (U+1123) followed by medial -ᅢ -ay (= modern ae, U+1162). (That syllable does not have a final component.) It will be awkward to stop to construct a cluster in the middle of touch typing: e.g., the title of the 1446 document introducing hangul was 훈민 hunmin (with clusters still used today) followed by the archaic clusters 져ᇰ tsyəng* and ᅙᅳᆷ ʔɯm. Nonetheless, I'm excited to be able to type archaic clusters at all. Perhaps some future edition of BabelPad will enable me to type, say, Yale romanization for Middle Korean and convert that into premodern hangul clusters.
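The difference between automatic modern stacking and piece-by-piece archaic clusters can be seen with Python's standard unicodedata module. This is a rough sketch: build_cluster is a hypothetical helper of mine, and U+1104 (the conjoining form of initial tt) is my own choice for the modern example, whereas U+1123 and U+1162 are the code points named above.

```python
import unicodedata


def build_cluster(*code_points: int) -> str:
    """Join Hangul Jamo code points and apply NFC normalization."""
    return unicodedata.normalize("NFC", "".join(chr(cp) for cp in code_points))


# Modern initial tt (U+1104, assumed here) + medial ae (U+1162):
# NFC composes the pair into one precomposed syllable.
modern_ttae = build_cluster(0x1104, 0x1162)

# Archaic initial pst- (U+1123) + medial -ay (U+1162):
# no precomposed syllable exists, so the sequence stays as two jamo
# and the font has to do the stacking.
archaic_pstay = build_cluster(0x1123, 0x1162)

# Official Unicode names of the two archaic components.
component_names = [unicodedata.name(ch) for ch in archaic_pstay]
```

Here modern_ttae is the single code point U+B54C, while archaic_pstay remains a two-code-point sequence. That is why a font with premodern hangul support is needed to see it as one cluster.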

The "yeys" in the title of this post is the Yale romanization of modern Korean yet 'old', spelled 옛 <Øyəys>. In Middle Korean, 'old' was 녯 nyəys.

*In modern hangul, ㅇ does double duty for zero and ng, but originally ㅇ Ø ~ ɣ and ㆁ ng were distinct letters.

14.7.28.23:55: WAS THE KHITAN SMALL SCRIPT LIKE OLD PERSIAN? (PART 1)

Last night I was reading the Encyclopædia Iranica biography of Karl Hoffmann who worked on the Old Persian script. What would an Iranist like him have thought of the Khitan small script? The two scripts superficially resembled their predecessors (cuneiform and sinography) and had characters for syllables, vowels, and a small number of words. Unlike Old Persian, the Khitan small script also had characters for diphthongs, vowel-consonant sequences (which may have also been read as consonant-vowel sequences) and consonants.

Old Persian character types	Khitan small script character types
syllable: 𐎣 <ka>	syllable: <qa>
	consonant: <dz>
vowel: 𐎠 <a>	vowel: <a>
	diphthong: <ai>
	vowel-consonant (~consonant-vowel): <al> (~ <la>?)
word: 𐏈 <AURAMAZDĀ>	word: <HEAVEN>

Next: Gaps in the Old Persian and Khitan small script syllabaries.

14.7.27.23:48: AVESTAN ALPHABETICAL ORDER

Last night I installed Google's Noto fonts. The first and so far only one I've used is Noto Sans Avestan. Over a decade ago I struggled with non-Unicode Avestan fonts, but now I can use BabelMap to input it.

The Unicode order of Avestan letters is close to the standard Indic order with several differences:

1. The e and o-vowels are before i and u instead of after.

2. Fricatives absent from Indic are where aspirated stops absent from Avestan are in Indic: e.g., 𐬑 x is where kh would be expected.

3. All nasals are grouped together after labial obstruents instead of being grouped with obstruents at the same point of articulation.

4. 𐬭 r follows 𐬫 y and 𐬬 v instead of being between them. (The Pazend non-Avestan letter 𐬮 l follows 𐬭 r.)
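The Unicode order can be listed directly from the character database. The sketch below assumes that the Avestan letters occupy U+10B00 through U+10B35 (the punctuation after them is left out) and that the installed Python ships a Unicode database that includes the block. The function name is mine.

```python
import unicodedata

# Assumed range of Avestan letters in Unicode (punctuation excluded).
AVESTAN_LETTERS = range(0x10B00, 0x10B36)


def avestan_order() -> list[tuple[str, str, str]]:
    """Return (code point label, character, Unicode name) in code point order."""
    return [
        (f"U+{cp:04X}", chr(cp), unicodedata.name(chr(cp)))
        for cp in AVESTAN_LETTERS
    ]


# Names only, for checking relative positions such as E before I.
letter_names = [name for _, _, name in avestan_order()]
```

Looking up positions in letter_names with list.index makes it easy to compare any two letters, e.g. to confirm that the e and o vowels precede i and u.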

What is the origin of this order? I see it in Bartholomae's Altiranisches Wörterbuch (1904). My first introduction to Avestan over twenty years ago was in Jackson's Avesta Grammar (1892) which had a more Indic-like order except for the placement of nasals. Later I saw Kanga's Practical Grammar of the Avesta Language (1891) whose order was as Indic as possible. And last year I finally got ahold of Beekes' Grammar of Gatha-Avestan (1988) which had an ABC-based order like Skjærvø's Old Avestan Glossary (2006).

Beekes (1988: 13) mentioned letters I didn't see in Jackson or Kanga:

𐬕 ġ was "of unknown value"

𐬜 δ̣ was a "graphic variant of δ?" (The letter Beekes transliterated as δ does not have its own code point; similarly, there is only one code point for two versions of t̰ which Skjærvø transliterated as t̰ and t̰₂.)

𐬂 å "only occurs in one manuscript"

𐬅 ą̇ was "of unknown use"

𐬪 ẏ was "a variant of y"

According to Skjærvø (2003: 10), å was for short a before ŋ (I am reminded of the *-aŋ > -o shift in Tangut and northwestern Tangut period Chinese), and ą̇ was originally for a nasalized schwa: *ə̨. Do others agree with these interpretations?

Skjærvø (2003: 1) seemed to regard ġ, δ̣ (his δ₂), t̰₂, and ẏ (his Y) as graphic variants of g, δ, t̰, and y. Are they truly interchangeable? Why did ġ and ẏ get their own code points while the variant forms of δ and t̰ did not? Will variation selectors permit distinctions between the two kinds of δ and t̰ in the future?