I left the retroflex sibilant out of my discussion of Zanabazar Square script retroflex characters in part 2. Unlike the other retroflexes, <ṣa> is a mirror image of a nonretroflex character, <śa>, as in Tibetan, and unlike in other Brahmic scripts such as Brahmi or Devanagari, where <ṣa> and <śa> are completely different.


Zanabazar Brahmi Transliteration
𑨯 𑀱 <ṣa>
𑨋 𑀓 <ka>
𑨲 𑀓𑁆𑀱 <kṣa>

The table above includes <kṣa>, which has a special character in the Zanabazar Square script. That character is clearly derived from <ka>, though its altered lower left corner bears little resemblance to <ṣa>.

Tibetan has a transparent stack of <ka> over <ṣa>.

In Brahmi, <ka> and <ṣa> are fused into a transparent ligature.

Only now after twenty-six years do I finally see the logic in Devanagari <kṣa> which I learned as a special character. The top left loop is what's left of <ka> and the bottom of the left side is what's left of <ṣa>.

2.11.15:03: I wonder if Zanabazar's <kṣa> was influenced by Devanagari <kṣa>. Both have similar bottom left-hand corners.

THE ZANABAZAR SQUARE SCRIPT (PART 2)

Thanks to Andrew West for providing me with a WOFF version of his font for the Zanabazar Square script. I have tagged part 1 of this series to employ that font.

The Tibetan roots of the script are implied in its characters for retroflex consonants which are derived from the characters for dental consonants:

Zanabazar Transliteration
𑨙 <ta>
𑨕 <ṭha>
𑨚 <tha>
𑨖 <ḍa>
𑨛 <da>
𑨘 <ṇa>
𑨝 <na>

'Implied' because the retroflexes are derived from the dentals in several different ways rather than being simple mirror images of the dentals as in Tibetan:



The Zanabazar retroflexes nonetheless are not as distinct from the dentals as they would have been if they had directly descended from Brahmic retroflex characters:

𑀢 <ta>
𑀞 <ṭha>
𑀣 <tha>
𑀟 <ḍa>
𑀤 <da>
𑀡 <ṇa>
𑀦 <na>

(If the Semitic hypothesis of the origin of Brahmi is correct, <tha da na> are original* and <ṭha ḍa ṇa> may be derived from them, but the relationships between them were no longer obvious in the Brahmic scripts of Zanabazar's time after centuries of graphic evolution: e.g., Devanagari <ṭha> and <tha>.)

The Tibetan script did not incorporate any descendants of the Brahmi retroflex characters. Here is the earliest extant account of the creation of the Tibetan script (translated by Sam van Schaik):

In India the script has 50 letters. Tönmi discarded the gha [voiced aspirate] group and the ṭa [retroflex] group, which do not appear in Tibetan speech.

The consequences of discarding the gha group are visible in part 1 where I explained how the Tibetan script and its Zanabazar derivative represented voiced aspirates without relying on descendants of the Brahmi voiced aspirate series.

Later the Tibetan script was extended for transcribing Sanskrit, and retroflex letters were created by mirror-imaging dental letters instead of Tibetanizing northern Indian retroflex letters that descended from Brahmi retroflex letters.

Was Zanabazar unaware of a Brahmic script with a retroflex series completely different from its dental series? If he was aware of one, he might have decided to follow the Tibetan precedent to some extent anyway because graphically related characters are easier to learn than graphically unrelated ones. (The same logic may underlie the voiced aspirates of the Zanabazar Square script.)

* Salomon (1998: 25) compares Brahmi <tha da na> to Phoenician and Aramaic <tˁ d n>. I don't see much similarity between Brahmi <da na> and Semitic <d n>, but Brahmi 𑀣 <tha> certainly does look like Phoenician 𐤈 <tˁ> and its Greek derivative Θ theta. However, coincidental overlaps between simple shapes are expected.

If Brahmi was based on a Semitic script, its <ṭa> seems to have been created ex nihilo unless Bühler's (1895) view that <ṭa> was a reduction of <ṭha> is correct.

A GOOD AMOUNT OF VARIATION: THE ORIGIN AND ORTHOGRAPHY OF VIETNAMESE TỐT (PART 1)

Thompson (1976: 116) reconstructed Proto-Viet-Muong *tʰoc 'good (beautiful)' on the basis of

(I have rewritten Thompson's segments in IPA but have retained his tonal notation.)

The trouble with reconstructing *tʰ is that there is a different set of sound correspondences also pointing to *tʰ found in 'medicine':

Thompson solved this problem by reconstructing two kinds of proto-*tʰ: one that deaspirated and one that didn't. But why would one deaspirate?

Premodern Chinese loans into Vietnamese point to a simpler solution. They have the following correspondences:

A chain shift occurred in Vietnamese after borrowing from Chinese:

*s > *t >

That shift postdates the split of Vietnamese from the other Viet-Muong languages.

Proto-Viet-Muong *s became aspirated in the ancestor of Mường Khến rather than unaspirated t as in Vietnamese. Some other Muong varieties retain *s.

Thus I reconstruct Proto-Viet-Muong *soc 'good'.
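The correspondences behind that reconstruction can be written out as a small lookup. This is only a sketch: it encodes just the changes stated in this post (the initial reflexes of *s in Vietnamese and Mường Khến, and the Vietnamese rhyme change mentioned in the discussion of Ruc), it ignores tones, and the names `INITIAL_REFLEXES`, `RHYME_REFLEXES`, and `reflex` are made up for illustration.

```python
# Reflexes of Proto-Viet-Muong segments as stated in this post.
# Tones are ignored; this is an illustrative sketch, not a full sound-change model.
INITIAL_REFLEXES = {
    "Vietnamese": {"s": "t"},    # *s > unaspirated t
    "Mường Khến": {"s": "tʰ"},   # *s > aspirated tʰ
}
RHYME_REFLEXES = {
    "Vietnamese": {"oc": "ôt"},  # *oc > ôt
    # No rhyme change is stated for Mường Khến in this post.
}


def reflex(initial, rhyme, language):
    """Return (initial, rhyme) reflexes; None marks a change this post does not state."""
    new_initial = INITIAL_REFLEXES.get(language, {}).get(initial)
    new_rhyme = RHYME_REFLEXES.get(language, {}).get(rhyme)
    return new_initial, new_rhyme


for language in ("Vietnamese", "Mường Khến"):
    print(language, reflex("s", "oc", language))
```

A single proto-initial *s with two language-specific reflexes covers both outcomes, so no second kind of *tʰ is needed for 'good'.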

Can that word be projected back into Proto-Vietic, or is it a Proto-Viet-Muong innovation? In other words, does it exist in any non-Viet-Muong Vietic languages, and if it does, is it native to those languages (rather than a borrowing from Viet-Muong)?

The only match I could find at SEAlang is Ruc tʰóːt 'good'. This looks like a loan postdating the fortition of Proto-Viet-Muong *s and the shift of Proto-Viet-Muong *oc to ôt in Vietnamese since

The aspirated initial rules out Vietnamese as a source. Is there a nearby Muong language with a word like tʰóːt for 'good'? Could the Ruc word be a composite of a Muong word with tʰ- and a Vietnamese word with [ot]? But there do not seem to be any Muong in Quảng Bình Province where the Ruc live. Puzzling.

I suppose Ruc haːj is the native word for 'good'.

Next: The many spellings of Vietnamese tốt in Chữ Nôm.

THE ONSET OF PROTO-TAI 'NEAR'

I know from experience that interrupting a series of posts leads me to drop the series midway and forget about it. (I do remember the Golden Guide series I never finished, though. That is too big to forget.) On the other hand, I also don't want to forget topics that come up in the middle of a series. This is one such topic.

In Siamese,

(I use Tai tone terminology [A1, C1] in lieu of IPA tone letters to facilitate comparison between Tai languages.)

are a minimal pair distinguished only by tone in pronunciation. Their different vowel symbols imply an earlier segmental distinction that was lost - and that can be confirmed by other Tai languages which preserve different rhymes: e.g., Yay caj A1 'far' and caɰ C1 'near'. (All non-Lao Tai data in this post is from Pittayaporn 2009.)

The Lao cognates of 'far' and 'near' are like those of Siamese apart from lacking a medial [l]:

So far, Siamese, Lao, and Yay seem to indicate that 'far' and 'near' should be reconstructed with the same initial in their common ancestor Proto-Tai. However, other Tai languages have different initials in the two words: e.g.,

Gloss Bao Yen Lungchow
'far' kwɤj A1 kwaj A1
'near' sɤɰ C1 kʰjaɰ C1

Therefore the two words must have had different initials in the proto-language. Pittayaporn (2009: 345) reconstructed them as sesquisyllables ('one-and-a-half syllables') *k.laj A and *k.raɰ C with a presyllable (his 'degenerate syllable') k.-. The presyllable-onset sequences *k.l- and *k.r- were distinct from the true clusters *kl- and *kr- which had different reflexes:

Pittayaporn's Tai subgroup
Bao Yen
*k.l- (only in 'far')
*kl-: e.g., 'rice seedling'
*k.r-: e.g., 'illness, fever'
kʰ- c-
*kr-: e.g., 'six'

Notice that the initials of the non-Yay reflexes of *k.raɰ C 'near' do not match those of the similar-sounding word *k.raj A 'illness, fever' in the table above:

'Near' is the only instance of Siamese kl- from *k.r-. Could the common ancestor of Siamese and Lao have irregularly altered 'near' to match the *k.l- initial of 'far'? However, no such analogy would motivate the initials of 'near' in Bao Yen and Lungchow.

Bao Yen has both kʰ- and s- as reflexes of *k.r-. Might they be reflexes of different presyllables?

Bao Yen, as its Vietnamese name implies, is spoken in northwestern Vietnam. It may be no coincidence that the kʰ- and s- reflexes of *k.r- are like the Mường Khến and northern Vietnamese reflexes of Proto-Viet-Muong *kr-:  x- (< *kʰ-) and [s].

As for Lungchow, it has three reflexes of *k.r-:

Perhaps *k.r- generally simplified to *kr- in pre-Lungchow but not in 'hard' and 'near', where it developed into kʰ(j)-. The medial -j- of kʰjaɰ C1 'near' is reminiscent of the -j- that is a reflex of *-l- in *kl-. kʰjaɰ C1 'near' looks like a compromise between *k.r- and *kl- variants of 'near' in pre-Lungchow. Could such variation go back to Proto-Tai, with some languages like Thai and Lao reflecting a version of the word with *-l- instead of *-r-?

But I think the Lungchow words for 'near' and 'hard' may actually reflect different presyllabic vowels:

A palatal presyllabic vowel conditioned -j- in 'near', whereas 'hard' had no such vowel and therefore never developed -j-.

That hypothesis might explain other cases of unexpected -j- in Lungchow. But do such cases exist? And might the s- of Bao Yen sɤɰ C1 come from the *kIr- I proposed above (as opposed to Bao Yen kʰ- < *kVr- in which V is not palatal)?

THE ZANABAZAR SQUARE SCRIPT (PART 1)

Yesterday I learned of the Zanabazar Square script from Andrew West and downloaded his font for it. In short it is like an extended version of the Tibetan script with additional characters for Sanskrit and Mongolian. If you do not have a Zanabazar Square font, you can see the characters here.

The first thing that caught my eye was that it has characters for voiced aspirated initial <gha ḍha dha bha dzha> that are not simply ligatures with <ha> like Tibetan གྷ ཌྷ དྷ བྷ ཛྷ <g.ha ḍ.ha d.ha b.ha dz.ha>. They are also not derived from the Brahmi characters for 𑀖𑀠𑀥𑀪𑀛 <gha ḍha dha bha jha>:
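The Tibetan half of that contrast can be checked against Unicode's own character data. The sketch below is an encoding-level illustration only: it says nothing about how the glyphs were actually designed. It relies on the fact that the precomposed Tibetan voiced aspirates have canonical decompositions into a base letter plus subjoined HA (U+0FB7), and it assumes that Zanabazar Square <gha> is U+11A0E. The helper name `splits_into_base_plus_ha` is invented for this example.

```python
import unicodedata

# Precomposed Tibetan voiced aspirates, given as escapes so the code points are unambiguous.
TIBETAN_VOICED_ASPIRATES = {
    "gha": "\u0F43",
    "ḍha": "\u0F4D",
    "dha": "\u0F52",
    "bha": "\u0F57",
    "dzha": "\u0F5C",
}
SUBJOINED_HA = "\u0FB7"

# Assumption: U+11A0E is the Zanabazar Square letter for <gha>.
ZANABAZAR_GHA = "\U00011A0E"


def splits_into_base_plus_ha(char):
    """True if canonical decomposition (NFD) yields a base letter followed by subjoined HA."""
    decomposed = unicodedata.normalize("NFD", char)
    return len(decomposed) == 2 and decomposed[1] == SUBJOINED_HA


for label, char in TIBETAN_VOICED_ASPIRATES.items():
    base = unicodedata.normalize("NFD", char)[0]
    print(label, "=", unicodedata.name(base), "+", unicodedata.name(SUBJOINED_HA))

# The default argument avoids an error on Python builds whose Unicode data
# predates the Zanabazar Square block.
print(
    unicodedata.name(ZANABAZAR_GHA, "unnamed in this Unicode version"),
    "splits into base + ha:",
    splits_into_base_plus_ha(ZANABAZAR_GHA),
)
```

Each Tibetan letter comes apart into a plain voiced letter plus <ha>, whereas the Zanabazar Square letter is encoded as an indivisible unit.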

There is no consistent graphic method of derivation. Moreover, the base characters are a mix of voiceless unaspirated initial characters (<ka ta>) and voiced initial characters (<ḍa ba dza>). Might that hint at how Mongolians perceived Tibetan pronunciations of Sanskrit voiced aspirates? Nonetheless, those derived characters are still easier to learn than hypothetical Brahmi-based characters for <gha ḍha dha bha dzha> whose shapes bore no relation to those of characters for phonetically similar consonants: e.g., 𑀖 <gha> looks nothing like 𑀓 𑀔 𑀕 <ka kha ga>, etc. You can see many more examples in Brahmi-descended scripts here.

RETURN TO THE SILVER RIVER (PART 4)

Given that the Tangut character for the Chinese loanword for 'river'


1990 1chhwan3 'river' < Tangut period northwestern Chinese 川 1chhwan3 'river'



Unicode Tangut component 036 / Nishida radical 181 / Boxenhorn code cir

the left side of


3058 2zyr'4 'water'

I would expect the Tangut character for the native word for 'river' to contain that component. But it doesn't. 1530 1ma4 'river' is analyzed in Tangraphic Sea as


- top right (not left side!) of 3058 2zyr'4 'water'

- right side of 0632 1vi1 'ripe, cooked'

0632 has no phonetic or semantic similarity to 1530.

Of course, there is water in a river, so 3058 is not surprising, though its abbreviation as


Unicode Tangut component 185 / Nishida radical 026 / Boxenhorn code fam

is. Nishida (1966: 242) regarded that component as 'stone', as it appears in


1074 1luq1 'stone'

But this is a case where labels for components are misleading; it makes no sense to call the top of 1530 'stone'.

If the Tangut script reflects the phonetic structure of a second Tangut language - 'Tangut B' - the characters imply that the Tangut B readings of 3058, 1530, and 1074 all have a common element X, and that 'river' and 'ripe' are near-homophones:

3058 'water': X + ? + ? (the left-hand element might be semantic and have no reading)

1530 'river': X + the sounds of 0632 'ripe'

1074 'stone': X + ? + ?
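The shared element X can be pictured as a set intersection over the components of the three characters. This is a sketch under a clear limitation: the component sets below list only the parts named in this post (Unicode Tangut components 036 and 185 and the right side of 0632), not full decompositions, and the name `COMPONENTS` is invented for the example.

```python
# Partial component inventories, limited to what this post mentions.
# "185" and "036" are Unicode Tangut component numbers; the other label is descriptive.
COMPONENTS = {
    "3058 'water'": {"036", "185"},
    "1530 'river'": {"185", "right side of 0632"},
    "1074 'stone'": {"185"},
}

# X is whatever all three characters have in common.
shared = set.intersection(*COMPONENTS.values())
print("Candidate for X:", shared)

# What is left of 'river' once X is removed would carry the sounds of 0632 'ripe'.
print("Rest of 1530:", COMPONENTS["1530 'river'"] - shared)
```

Under the Tangut B hypothesis, the shared component would stand for a sound sequence rather than for 'stone' or 'water'.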

Is there a language in the region in which 'river' sounds like 'ripe' preceded by a segment or syllable that would be the phonetic value of X?

2.6.1:09: I would also expect that language to have the same initial consonant (or syllable) in 'river' and 'stone'. I'm guessing no such language exists today. But did one exist in the past?

RETURN TO THE SILVER RIVER (PART 3)

3572 2ngwo1 'silver' was analyzed in Precious Rhymes of the Tangraphic Sea as


- the bottom left of 0136 2de'4 'ingot' (< Middle Chinese 鋌 *deŋˀ 'id.') +

- the right of 5722 2ngwo1 (first half of 3360 5722 0nwy0 2ngwo1 'eloquence')

Clearly 0136 is semantic and 5722 is phonetic. Case closed? Not quite.

First, why pick the bottom left of 0136 instead of the 'metal' radical


Unicode Tangut component 542 / Nishida radical 028 / Boxenhorn code tex

which is also absent from the character for another major metal in the analysis of 0136 2de'4 'ingot':


- top of 0152 1kiq2 'gold' +

- left of 3572 2ngwo1 'silver' +

- right of 2290 2lon1 'round' (but an ingot isn't round! is 0136 really 'ingot'?)

Why write the words for some metal objects with 'metal' but not others? That is quite different from the situation in Chinese where nearly all metals are written with 金 'metal' - one exception that comes to mind is 汞 gong 'mercury', a combination of 工 gong (phonetic) and 水 'water' (semantic).



Unicode Tangut component 529 / Nishida radical - / Boxenhorn code tau

is also phonetic in


5723 2ngwo1 'elephant' (only found in dictionaries) =

- bottom center of 0021 1bu2 'elephant, ox'? (the semantics of this word need closer examination)

- right of 3572 2ngwo1 'silver'

but the seven other characters containing that component are not read 2ngwo1. I would like to look into the other functions of that component after I wrap up this tetralogy on the Silver River.

