Amaravati: Abode of Amritas

14.2.15.16:50: WHAT'S SO BRISK ABOUT BREST?

Earlier this week I was reading about the Treaty of Brest-Litovsk, named after a city that is now called Brest in Belarusian. Almost all the other names end in -t or contain some consonant derived from *t (e.g., earlier Belarusian Bieraście < *Beraste). The one exception is Yiddish בריסק Brisk. Why does it end in -k instead of -t? Is it from an earlier *Brist-k? I don't think *-t dissimilated to *-k after *s since the Yiddish present second person singular verb ending is -st, not *-sk.

19:33: Is the Yiddish noun Brisk derived from a Slavic adjective like Polish brzeski 'Brest'?

14.2.15.14:42: THE LINEAGE OF 'LOVE': THE TWO DZU OF TANGUT

Tangut has two words for 'love' that were pronounced something like dzu:

Li 2008 #	Tangraph	Rhyme	Nishida 1966*	Nishida in Arakawa 1997	Sofronov 1968	Li Fanwen 1986	Gong 1997	Arakawa 1997**	This site
1338		1	ⁿdzǐu	1ndzu	1ndzu	1dzu	1dzu	1dzu	1dzəu
4973		2	ⁿdzǐu	1ndzǐu	1ndzɪ̭u	1dzǐu	1dzju	1dzyu	1dzɨu

This is the only case of alternation between rhymes 1 and 2 that I know of. How can I account for it?

These words may be cognate to Old Chinese 慈 *dzə 'affectionate, loving, kind' and 字 *dzə-s 'to love' (now 'character'!). Tangut -u may be from an earlier *-k. Hence I reconstruct

Pre-Tangut *dzə-k > 1dzəu

Pre-Tangut *Cɯ-dzə-k > *Cɯ-dzɨək > 1dzɨu

The prefix *Cɯ- (perhaps *mɯ-?; cf. Written Tibetan mdzaH 'to love') conditioned the partial raising of *ə to *ɨə which later simplified to ɨ. I do not know whether this raising and simplification preceded or followed the shift of *-k to -u.
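Whether the relative order of these changes matters for the outcome can be checked mechanically. The following is a minimal sketch, assuming that plain string rewriting is an adequate stand-in for the sound changes. The function names are mine, the tone digit and the morpheme boundary before *-k are left out, and the loss of the prefix is placed last purely for convenience (its timing is not something the reconstruction above specifies).

```python
def raise_after_prefix(form):
    # *ə > *ɨə, but only in a word that still carries the prefix *Cɯ-
    return form.replace("ə", "ɨə", 1) if form.startswith("Cɯ-") else form

def simplify(form):
    # *ɨə > ɨ
    return form.replace("ɨə", "ɨ")

def k_to_u(form):
    # final *-k > -u
    return form[:-1] + "u" if form.endswith("k") else form

def drop_prefix(form):
    # loss of *Cɯ- (placing it last is an assumption of this sketch)
    return form[len("Cɯ-"):] if form.startswith("Cɯ-") else form

def derive(form, rules):
    # apply the rules in the given order
    for rule in rules:
        form = rule(form)
    return form

RAISING_FIRST = [raise_after_prefix, simplify, k_to_u, drop_prefix]
K_SHIFT_FIRST = [k_to_u, raise_after_prefix, simplify, drop_prefix]

for order in (RAISING_FIRST, K_SHIFT_FIRST):
    print(derive("dzək", order), derive("Cɯ-dzək", order))
```

Under these assumptions both orderings give the same two outputs, so the modern forms alone would not reveal which change came first.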

The STEDT database has Proto-Tibeto-Burman*** *m-dza-k 'to love' on the basis of

NW (northwestern?) rGyalrong ndot tɕʰak 'to copulate'

but this variety of rGyalrong also has mdz-, so did *m-dz- really become tɕʰ-, or is tɕʰak just a lookalike? (Too bad 'love' isn't in this massive rGyalrong database.)

Jingpho ndžáʔ < *-k 'love'

Nanhua Yi ȵe̱³³ dʑᴀ̱³³ 'love'

but Nanhua Yi vowel constriction (indicated by underlining) could also point to another stop

Only the Jingpho form unambiguously points to *-k. Tangut is distant from Jingpho, so this *-k is not a shared innovation. It may also not be a shared retention if it was independently added in the two languages.

Oh great, I already wrote about this pair of words in 2010. And here's an even earlier article with *-k-less reconstructions. At least my comparative table and commentary on the STEDT reconstruction are new.

*Nishida's (1966) 西夏文字小字典 (Small Tangraph Dictionary) lists no tones. I don't know why the two words were reconstructed as homophones. They are in different homophone groups in Homophones and Mixed Categories of the Tangraphic Sea.

**Arakawa's (1997) 西夏語通韻字典 (Tangraph Rhyme Dictionary) does not contain reconstructions for either 'love' tangraph, but does contain reconstructions of their initials and finals which I have combined here.

***I don't believe in a Tibeto-Burman subgroup of Sino-Tibetan, so I regard STEDT Proto-Tibeto-Burman reconstructions as a vague approximation of Proto-Sino-Tibetan minus Chinese evidence.

14.2.14.14:32: WHAT PRECEDES INDRA'S POWER?

The father of the first king of all of Laos was Zakarine. I think the modern Lao spelling of his name is

ສັກກະລິນ

<sakkḥlin> [sakkalin]

Why was this name romanized with Z-? At no time would <s> have ever been [z]. Lao did not have a [z] in the 19th century.

The Thai Wikipedia article about him has the name

สักรินทรฤทธิ์

<sakrindrṛddhi̽> [sakkarinrit]

ินทร <indr> [in] is 'Indra' (< Skt Indra-) and ฤทธิ์ <ṛddhi> [rit] is 'supernatural power' (< Skt ṛddhi-), but what is สักร <sakr> [sakkar]? My guess is that it is a hybrid of Pali Sakka- and Sanskrit Śakra- 'The Mighty', i.e., Indra. However, I would expect the Thai spelling

ศักรินทรฤทธิ์

<śakrindrṛddhi̽> [sakkarinrit]

with a different first letter if the name was based on Sanskrit Śakra-.

ศักรินทร์

<śakrindr̽> [sakkarin]

with that letter is a Thai male name.

The reduction of the vowel sequence -a-i- to -i- in the underlying Sanskrit *Śakrindra- is possible in Pali but not in Sanskrit, which favors fusing the vowels into -e-: Śakrendra-.

Why does the Thai name have an extra element ฤทธิ์ <ṛddhi̽> [rit] absent from the full name

Samdach Brhat Chao Maha Sri Vitha Lan Xang Hom Khao Luang Prabang Parama Sidha Khattiya Suriya Varman Brhat Maha Sri Sakarindra

in the English article?

Zakarine's son was

ສີສະຫວ່າງວົງ

<sīsḥhw1āṅwŏṅ> [siːsavaŋvoŋ]

<sī> is from the Sanskrit honorific Śrī.

<wŏṅ> is from Sanskrit vaṃśa- 'family'. Is it short for Phoulivong, a name in the English but not the Lao, Thai, or French Wikipedia entries? What is the Indic source of Phouli?

<sḥhw1āṅ> [savaŋ] is Lao for 'dawn'. [sa] appears to be a prefix of unknown function added to a root [vaŋ] which is homophonous with 'free, idle'*.

Why did the Thai Wikipedia romanize [savaŋ] as Savangsa with a silent -sa?

Sisavang Vong's son, the last king of Laos, was

ສີສະຫວ່າງວັດທະນາ

<sīsḥhw1āṅwaɗdḥnā> [siːsavaŋvattʰanaː]

<waɗdḥnā> 'progress' is from the Pali neuter noun vaḍḍhana- 'increase'. Why does the name end in a long <ā> as if it were feminine? The common noun 'progress' is ວັດທະນະ <waɗdḥnḥ> [vattʰanaʔ]; its Thai equivalent ends in long <ā> like the king's name: วัฒนา <waḍhnā> [wattʰanaː].

*14:56: Pittayaporn (2009: 133) derived early Tai *(h)waːŋ B 'free, idle' (my reconstruction) from Middle Chinese 亡 ‘not present’ or 罔 ‘not have’; both were *wɔŋ in Annamese Middle Chinese but were something like *mɔŋ in early Cantonese. The semantic match is loose, the initials do not match if the Tai were not in contact with a *w-dialect of Middle Chinese, the vowels do not match, and the tones do not match (亡 had a tone corresponding to Tai A and 罔 had a tone corresponding to Tai C).

14.2.14.10:14: A MAXIMUM PRICE FOR A MINIMUM THAI ALPHABET?

In my last post, I wrote,

In theory it would be possible to reduce the number of [Thai and Lao] consonants even further and indicate tones solely with diacritics, but that would be a radical change to both systems.

The Thai and Lao scripts indicate tones through combinations of consonant class, syllable type (sonorant-final or stop-final, with vowel length mattering before stops), and tone marks. The main combinations for Thai* are in this table:

Consonant class \ Tone mark	Syllables ending in sonorants					Syllables ending in stops
	None	่ <1>	้ <2>	๊ <3>	๋ <4>	Short vowel	Long vowel
X-: ขฉฐถผฝศษสห	rising	low	falling	(not used)	(not used)	low	low
K-: กจฎฏดตบปอ	mid	low	falling	high	rising	low	low
G-: คฆงชซฌญฑฒณทธนพฟภมยรลวฬฮ	mid	falling	high	(not used)	(not used)	high	falling

There is no single way to write any of the five tones of Thai. Markers ๊ <3> and ๋ <4> only have one value each, but they are not the only ways to write high and rising tones, and the other markers (including zero) do not have single values.

Here are a few examples to demonstrate how the system works:

Consonant class \ Tone mark	Syllables ending in sonorants					Syllables ending in stops
Consonant class \ Tone mark	None	่ <1>	้ <2>	๊ <3>	๋ <4>	Short vowel	Long vowel
X-class: ข <kh> [kʰ] < *kʰ-	ขา <khā> [kʰaː] 'leg' (rising)	ข่า <kh1ā> [kʰaː] 'galangal' (low)	ข้า <kh2ā> [kʰaː] 'servant' (falling)	not used; [kʰaː] (high) written as <g2ā>	not used; [kʰaː] (rising) written as <khā>	ขัด <khaɗ> [kʰat] 'to polish' (low)	ขาด <khāɗ> [kʰaːt] 'to lack' (low)
K-class: ก <k> [k] < *k-	กา <kā> [kaː] 'crow' (mid)	ก่า <k1ā> [kaː] 'kind of duck' (low)	ก้าน <k2ān> [kaːn] 'stem' (falling)	ก๊วน <k3wn> [kuan] 'group' (high)	ก๋า <k4ā> [kaː] 'bold' (rising)	กัด <kaɗ> [kat] 'to bite' (low)	กาด <kāɗ> [kaːt] 'leafy plant' (low)
G-class: ค <g> [kʰ] < *g-	คา <gā> [kʰaː] 'stuck' (mid)	ค่า <g1ā> [kʰaː] 'price' (falling)	ค้า <g2ā> [kʰaː] 'to trade' (high)	not used; [kʰaː] (high) written as <g2ā>	not used; [kʰaː] (rising) written as <khā>	คัด <gaɗ> [kʰat] 'to choose' (high)	คาด <gāɗ> [kʰaːt] 'to expect' (falling)
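The main combinations in the table above amount to a lookup from spelling to tone. Here is a minimal sketch of that lookup, assuming only the combinations listed there; the function and dictionary names are hypothetical, mark 0 stands for "no tone mark", and the rare marked stop-final spellings mentioned in the footnote are deliberately left out.

```python
# Tone of a Thai syllable from its spelling, per the main combinations above.
# X/K/G are this post's class labels; mark 0 stands for "no tone mark".
SONORANT_FINAL = {
    "X": {0: "rising", 1: "low", 2: "falling"},
    "K": {0: "mid", 1: "low", 2: "falling", 3: "high", 4: "rising"},
    "G": {0: "mid", 1: "falling", 2: "high"},
}
# Unmarked stop-final syllables: tone depends on class and vowel length.
STOP_FINAL = {
    "X": {"short": "low", "long": "low"},
    "K": {"short": "low", "long": "low"},
    "G": {"short": "high", "long": "falling"},
}

def thai_tone(cls, mark=0, stop_final=False, length="long"):
    """Return the tone name, or raise ValueError for a combination
    outside the main table (e.g. the rare marked stop-final spellings)."""
    if stop_final:
        if mark != 0:
            raise ValueError("marked stop-final syllables are not modeled")
        return STOP_FINAL[cls][length]
    if mark not in SONORANT_FINAL[cls]:
        raise ValueError(f"mark {mark} is not used with {cls}-class letters")
    return SONORANT_FINAL[cls][mark]

print(thai_tone("X", 2))                                # ข้า 'servant'
print(thai_tone("G", 1))                                # ค่า 'price'
print(thai_tone("G", stop_final=True, length="short"))  # คัด 'to choose'
```

The first two calls return the same tone from different class and mark combinations, which is the redundancy the table illustrates.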

'Servant' and 'price' were *kʰaː and *gaː with different tones indicated by ้ <2> and ่ <1> in earlier Thai, but in modern Thai they are both [kʰaː] with the same falling tone. Nonetheless they are still spelled with different initial consonants and tone markers. But what if Thai were respelled without any regard for history? Each modern consonant would have only one letter, and each tone would have only one marker (or no marker at all):

Minimum Thai alphabet: 21 letters for 21 phonemes; no more consonant classes

	ก <k> [k]	ค <g> [kʰ]		ง <ṅ> [ŋ]
	จ <c> [tɕ]	ช <j> [tɕʰ]
ด <ɗ> [d]	ต <t> [t]	ท <d> [tʰ]		น <n> [n]
บ <ɓ> [b]	ป <p> [p]	พ <b> [pʰ]	ฟ <v> [f]	ม <m> [m]
ย <y> [j]	ร <r> [r]	ล <l> [l]	ว <w> [w]	ซ <z> [s]	อ <ʔ> [ʔ]	ฮ <ɦ> [h]

This alphabet is similar to the 1942-44 reformed alphabet minus ten letters:

1-7. All X-class letters (ขฉถผฝสห) were dropped in favor of their G-class homophones, which are associated with the mid tone if not accompanied by a tone marker: e.g., ข <kh> [kʰ] was dropped in favor of ค <g> [kʰ], etc.

8. ญ <ñ> (sans its final stroke) which was a less common letter for final [n] in the reformed alphabet was dropped in favor of น <n> for [n] in all contexts

9-10. The last two Indic voiced aspirates ธ <dh> [tʰ] and ภ <bh> [pʰ] were dropped in favor of their homophones ท <d> [tʰ] and พ <b> [pʰ].

Simple Thai tone marking: one marker per tone in all contexts

Sonorant-final syllables (spellings altered by reform in italics)

Former class \ Tone mark	None: mid	่ <1>: low	้ <2>: falling	๊ <3>: high	๋ <4>: rising
K-	กา <kā> [kaː] 'crow' (mid)	ก่า <k1ā> [kaː] 'kind of duck' (low)	ก้าน <k2ān> [kaːn] 'stem' (falling)	ก๊วน <k3wn> [kuan] 'group' (high)	ก๋า <k4ā> [kaː] 'bold' (rising)
X-/G-	คา <gā> [kʰaː] 'stuck' (mid)	ค่า <g1ā> [kʰaː] 'galangal' (low)	ค้า <g2ā> [kʰaː] 'servant' / 'price' (falling)	ค๊า <g3ā> [kʰaː] 'to trade' (high)	ค๋า <g4ā> [kʰaː] 'leg' (rising)

Stop-final syllables (spellings altered by reform in italics)

Former class	Vowel length	None: mid	่ <1>: low	้ <2>: falling	๊ <3>: high	๋ <4>: rising
K-	short	no independent words**	กั่ด <k1aɗ> [kat] 'to bite' (low)	ตุ้บ <t2uɓ> [tup] 'thud' (falling)	โต๊ะ <t3oḥ> [toʔ] 'table' (high)	จ๋ะ <c4ḥ> [tɕaʔ] 'yes?' (rising)
K-	long or diphthong	impossible	ก่าด <k1āɗ> [kaːt] 'leafy plant' (low)	อ้วก <ʔ2wk> [ʔuak] 'to vomit' (falling)	จ๊วก <c3wk> [tɕuak] 'pure (white)' (high)	impossible
X-/G-	short	no independent words**	คั่ด <g1aɗ> [kʰat] 'to polish' (low)	ค้ะ <g2ḥ> [kʰaʔ] 'yes' (falling)	คั๊ด <g3aɗ> [kʰat] 'to choose' (high)
X-/G-	long or diphthong	impossible	ค่าด <g1āɗ> [kʰaːt] 'to lack' (low)	ค้าด <g2āɗ> [kʰaːt] 'to expect' (falling)	เชิ๊ด <j3əɗ> [tɕʰəːt] 'shirt' (high)
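The respelling implied by the tables above can be sketched as a merger of initial letters plus a single tone-to-marker mapping. This is only an illustration of the hypothetical reform under the assumptions stated in this post; the names `MERGE`, `MARK`, `respell`, and `spell_aa` are mine. The merger of final ญ with น is omitted because the function only handles initial letters.

```python
# Initial letters dropped in the minimum alphabet, mapped to the
# homophonous letters that are kept.
MERGE = {
    "ข": "ค", "ฉ": "ช", "ถ": "ท", "ผ": "พ", "ฝ": "ฟ", "ส": "ซ", "ห": "ฮ",
    "ธ": "ท", "ภ": "พ",
}
# One marker per tone; the numbers are this post's <1>-<4>, 0 = no marker.
MARK = {"mid": 0, "low": 1, "falling": 2, "high": 3, "rising": 4}
# Unicode tone marks: MAI EK, MAI THO, MAI TRI, MAI CHATTAWA.
MARK_CHAR = {0: "", 1: "\u0E48", 2: "\u0E49", 3: "\u0E4A", 4: "\u0E4B"}

def respell(initial, tone):
    """Return (new initial letter, marker number) for a modern tone."""
    return MERGE.get(initial, initial), MARK[tone]

def spell_aa(initial, tone):
    """Spell initial + long [aː] in the hypothetical reformed orthography."""
    letter, mark = respell(initial, tone)
    return letter + MARK_CHAR[mark] + "\u0E32"  # U+0E32 is the vowel sign า

# 'servant' (old initial ข) and 'price' (old initial ค) are both falling
print(spell_aa("ข", "falling"), spell_aa("ค", "falling"))
# 'leg' (old initial ข, rising) and 'to trade' (old initial ค, high)
print(spell_aa("ข", "rising"), spell_aa("ค", "high"))
```

Because the mapping starts from the modern tone rather than the old spelling, 'servant' and 'price' necessarily come out identical.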

This reform could easily accommodate currently impossible combinations in the future: e.g.,

กาด <kāɗ> [kaːt] (mid)

คาด <gāɗ> [kʰaːt] (mid)

ก๋าด <k4āɗ> [kaːt] (rising)

คั๋ด <g4aɗ> [kʰat] (rising)

ค๋าด <g4āɗ> [kʰaːt] (rising)

However, the price of these reforms would be staggering for existing users. The spellings of most X-/G-class and stop-final syllables would be changed according to an alien logic, and many new spellings would be easily confused with old ones (e.g., the new spelling of 'servant' / 'price' looks like the old spelling of 'to trade'). Moreover, nearly all stop-final words would have to be written with tone markers, whereas such words were previously mostly written without them. Only new learners would benefit, but they would have difficulty learning to read anything written before the reform. Therefore I think this reform is a terrible idea for Thai. (Similar arguments would apply to an analogous reform for Lao.)

I think an optimal reform should make a system more consistent according to an existing logic rather than a newly imposed one. I like the existing Thai and Lao orthographies because each is consistent in its own way. Thai is etymological whereas Lao is phonetic, and both are constrained by the old letter class/tone marker system. I don't like the 1942-44 Thai reform discussed in my last post because it is a compromise between etymology and phonetics. And I don't like the reform in this post because it is phonetic at the expense of the old letter class/tone marker system. And that is an unnecessary expense because Lao proves the old system is compatible with phonetically transparent spelling.

*Rare combinations such as K-class consonant + ๊ <3> + short vowel + stop for a high tone (e.g., โต๊ะ [tóʔ] 'table') are not included. These combinations are in particles, onomatopoeia, and loanwords; they are not inherited from ancient times.

**Unstressed CV syllables in polysyllabic words have tones like CV-stop syllables in careful pronunciation but may have mid tones in rapid speech. These tones are unmarked in this new orthography as well as the old orthography.

14.2.12.17:52: HOW SIMILAR ARE THE 1942-44 THAI ALPHABET AND THE MODERN LAO ALPHABET?

In Creating Laos, Ivarsson (2008: 193) wrote,

With the exception of three letters, the Lao and [1942-44] Thai alphabets would now be [structurally] identical and the spelling employed in Thailand would be closely related to the 'simple etymological spelling' used in Laos.

I don't know what the orthographic standard for Laos was like in 1942, but I think the consonant letters were the ones still used today including ຣ <r> (eliminated from Phoumi's 1967 orthography). If my assumption is correct, then the 1942-44 Thai alphabet had four letters without Lao equivalents (ฉ <ch>, ซ <z>, ธ <dh>, and ภ <bh>):

Velars	(none)	<k>	<kh>	(<x>)	<g>	(<ɣ>)	(<gh>)	<ṅ>
Thai		ก	ข		ค			ง
Lao		ກ	ຂ		ຄ			ງ
Palatals		<c>	<ch>	(none)	<j>	<z>	(<jh>)	<ñ>
Thai		จ	ฉ		ช	ซ		ญ
Lao		ຈ	see <s>		ຊ	see <j>		ຍ
Dentals	<ɗ>	<t>	<th>		<d>	(none)	<dh>	<n>
Thai	ด	ต	ถ		ท		ธ	น
Lao	ດ	ຕ	ຖ		ທ		(none)	ນ
Labials	<ɓ>	<p>	<ph>	<f>	<b>	<v>	<bh>	<m>
Thai	บ	ป	ผ	ฝ	พ	ฟ	ภ	ม
Lao	ບ	ປ	ຜ	ຝ	ພ	ຟ	(none)	ມ
Miscellaneous	<y>	<r>	<l>	<w>	<s>	<h>	<ʔ>	<ɦ>
Thai	ย	ร	ล	ว	ส	ห	อ	ฮ
Lao	ຢ	(ຣ)	ລ	ວ	ສ	ຫ	ອ	ຮ

Did Lao have a letter in 1942 that it doesn't have anymore: e.g., <ch>?

The thirteen letters dropped from the 1942-44 Thai alphabet were

1-2. The original velar fricatives ฃ <x> and ฅ <ɣ> (which were restored to the modern alphabet, though they are not normally used to write any words*)

3-4. The Indic voiced aspirates ฆ <gh> and ฌ <jh>. (See below for ฒ <ḍh>.) Why were ธ <dh> and ภ <bh> retained? Why not consistently drop all Indic voiced aspirates?

5-12. The Indic retroflexes ฏ <ṭ>, ฐ <ṭh>, ฑ <ḍ>, ฒ <ḍh>, ณ <ṇ>, ษ <ṣ>, and ฬ <ḷ> and the Khmerized** retroflex ฎ <ɗ̣>.

13. The Indic palatal ศ <ś>
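The letter counts above can be cross-checked against the comparison table. The sketch below assumes the traditional inventory of 44 Thai consonant letters (a figure not stated in this post) and copies the Thai-Lao pairings from the table; the variable names are hypothetical.

```python
# The 44 consonant letters of the traditional Thai alphabet, in order.
FULL = "กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรลวศษสหฬอฮ"
# The thirteen letters listed above as dropped in 1942-44.
DROPPED = "ฃฅฆฌฏฐฑฒณษฬฎศ"
# Thai-to-Lao equivalents read off the comparison table above.
LAO = dict(zip("กขคงจชญดตถทนบปผฝพฟมยรลวสหอฮ",
               "ກຂຄງຈຊຍດຕຖທນບປຜຝພຟມຢຣລວສຫອຮ"))

# Letters that survived the reform, in alphabetical order.
reformed = [letter for letter in FULL if letter not in DROPPED]
# Surviving Thai letters with no Lao counterpart in the table.
without_lao = [letter for letter in reformed if letter not in LAO]

print(len(FULL), len(DROPPED), len(reformed))
print(without_lao)
```

If the table is read correctly, the last line lists the four Thai letters with no Lao equivalent, one more than the three that Ivarsson mentions.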

The letters ฉ <ch> and ซ <z> could not be dropped from the Thai alphabet because they represent contrasts lost in Lao:

Proto-Southwestern Tai	Thai	Lao
*cʰ	ฉ <ch> [tɕʰ] + X-series tones	ສ <s> [s] + X-series tones
*s	ส <s> [s] + X-series tones	ສ <s> [s] + X-series tones
*ɟ	ช <j> [tɕʰ] + G-series tones	ຊ <j> [s] + G-series tones
*z	ซ <z> [s] + G-series tones	ຊ <j> [s] + G-series tones

X- and G- respectively represent tone classes conditioned by consonants with voiceless friction and voiced consonants. See "Tone Codes".

The Thai and Lao scripts indicate tone series with different consonant letters and combinations. In theory it would be possible to reduce the number of consonants even further and indicate tones solely with diacritics, but that would be a radical change to both systems.

*The title of the 2006 movie ฅนไฟบิน <ɣnfaiɓin> Khon fai bin, lit. 'flying fire person', has a deliberately unusual spelling presumably intended to be appropriate for its historical setting. I think this spelling is pseudoarchaic since Jones (pre-1850) and Bradley (1873) have the spelling คน <gn>. Pittayaporn (2009: 76) reconstructed Proto-Tai *ɢwɯn A 'person' and a shift of *ɢ- to *g- (not *ɣ-) in Proto-Southwestern Tai, the ancestor of Thai and Lao.

**Some Indic ṭ became ɗ in Khmer and were borrowed into Thai and Lao as ɗ.

14.2.11.13:11: THE BEST SON? 2: REMNANT OF A REFORM

I'd like to read every entry on Rikker Dockum's Thai 101 blog. I just found an entry explaining the spelling of สอ เสถบุตร So Sethaputra's surname (see "The Best Son?"):

I've known for a few years that Thailand experimented with simplified spelling during World War II [...] I understand that it was mandated by Field Marshal Plaek Phibunsongkhram [on May 29, 1942], and even people's names had to be respelled under this system. One remnant legacy of this is that famous name in dictionaries, So Sethaputra, whose last name is spelled เสถบุตร <sethputr> to this day. The original spelling of his last name is เศรษฐบุตร <śreṣṭhputr>, but since his name became famous along with his first dictionary under its revised spelling, he was one of the few that didn't revert the spelling [on December 12, 1944] after Field Marshal Plaek was ousted.

I vaguely recall seeing this entry before but obviously I missed or forgot the part about Sethaputra. The name exemplifies two of the principles of the reform:

Character set reduction

The palatal ศ <ś> needed for Sanskrit but not for Thai was replaced with ส <s>.

The retroflex consonant letters needed for Sanskrit but not for Thai were replaced by dentals: e.g., ฐ <ṭh> was replaced with ถ <th>.

Another character that was dropped was ใ <aɨ> which originally represented *-aɰ but became homophonous with ไ <ai> for *-aj.

Silent letter elimination

The silent ร <r> in the initial cluster of <śreṣṭhputr> was dropped.

The ษ <ṣ> of เศรษฐบุตร <śreṣṭhputr> may have been regarded as silent (even though I think it corresponds to [t] in pronunciation) and was dropped instead of being converted into a dental ด <ɗ> (the usual spelling of final [t]).

Not all silent letters were eliminated: e.g., the final ร <r> of <sethputr> [sèːttʰàbùt] remained.

Two other changes mentioned by Dockum don't fall into those two categories:

Elimination of redundant clusters

ทร <dr> was respelled as its homophone ซ <z> [s] and อย <ʔy> was respelled as its homophone หย <hy> [j].

Elimination of initial <ñ>

Initial ญ <ñ> was respelled as its homophone ย <y> [j]. However, final ญ <ñ> was not respelled as its homophone น <n> [n]; it remained, albeit without its final stroke on the bottom right. (Was simplifying the shape of ญ <ñ> a problem for printers?)
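The character set reduction and silent letter elimination described above can be traced step by step on the surname itself. This is a minimal sketch hard-coded for this one name, not a general implementation of the 1942 reform: which letters count as silent cannot be read off the spelling, so the last two rewrites are simply assumed from the discussion above, and the names `STEPS` and `reform` are mine.

```python
# Ordered rewrites for this one name; each mirrors a rule described above.
STEPS = [
    ("ศ", "ส"),   # palatal <ś> replaced with <s>
    ("ฐ", "ถ"),   # retroflex <ṭh> replaced with dental <th>
    ("สร", "ส"),  # silent <r> of the initial cluster dropped
    ("ษ", ""),    # <ṣ> dropped
]

def reform(name):
    """Apply each rewrite once, in order, and return the respelled name."""
    for old, new in STEPS:
        name = name.replace(old, new, 1)
    return name

print(reform("เศรษฐบุตร"))
```

The final ร <r> is untouched because the third rewrite only matches ร directly after ส, which matches the observation that not all silent letters were eliminated.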

One might think that this reformed Thai spelling is like Lao in Thai characters, but it is still far from the phonetic simplicity of Lao. Compare the two spellings of Sethaputra with its Lao spelling. Similarities are in blue; differences in red.

Pre-reform Thai เศรษฐบุตร <śreṣṭhputr>
Reformed Thai เสถบุตร <sethputr>

Lao ເສດຖະບຸດ <seɗthḥɓuɗ>

Similarities

1. The use of <s> for [s] ultimately from Sanskrit <ś>

2. The absence of <r> after the first consonant

I wonder if the Sanskrit-style spelling <śreṣṭh> was created in Thailand in modern times; if so, a traditional Thai and Lao spelling might have been a Pali-like <seṭh>

3. The absence of <ṣ>

4. The use of <th> for [tʰ] ultimately from Sanskrit <ṭh>

Differences

1. The use of <ɗ> for all final [t] in Lao; the first [t] is unwritten in reformed Thai and the second is written as <t> followed by a silent <r>

2. The explicit marking of medial [a] as <ḥ> in Lao; it is unwritten in either Thai spelling, though <ḥ> does represent Thai [a] in other contexts

Googling, I see that not all Sethaputras kept the new spelling; the old spelling <śreṣṭhputr> is still in use today. That situation reminds me of what happened in China decades later:

Following the retraction of the second round [of Chinese character simplification in 1978], many people still kept the new forms as their surnames so that the three family names are now written six or seven different ways.

How many other surname pairs like เสถบุตร <sethputr>/เศรษฐบุตร <śreṣṭhputr> remain today, nearly seventy years after the end of the reform?

14.2.11.9:01: COLONEL SUPREME MERIT

I used to have to guess the non-Roman spellings of romanized names. Nowadays I love how I can easily find original spellings in Wikipedia, complete with pronunciations in IPA.

Unfortunately not all original spellings are included yet: e.g., the name of Colonel Bounleuth Saycocie in Lao script.

Bounleuth is

ບຸນເລີດ <ɓunləɗ> [bùnlə̂ːt]

a compound of [bùn] 'merit' from Pali puñña- or Sanskrit puṇya- 'id.'* plus [lə̂ːt] 'supreme' from Khmer លើស <ləs> 'surpass'**. Why was final [t] romanized as th?

I Googled "Saycocie" and ລາວ <lāw> 'Lao' so I could find a mixed-script text which might have both Latin and Lao spellings of the name. I found

ໄຊໂກສີ <jaikosī> [sájkòːsǐː]

which is a compound of [sáj] from Sanskrit or Pali jaya- 'victory' plus [kòːsǐː] 'Indra', ultimately from Sanskrit Kauśika- 'belonging to the Kuśikas; Indra'***.

*Gedney (1947: 146) regarded both forms as possible sources of Thai [bun] 'merit'. The Thai spelling บุญ <ɓuñ> is not necessarily evidence of Pali origin; it could be a Palified respelling of an earlier Sanskrit-based spelling *บุณย์ <puṇy̽> (cf. Khmer បុណ្យ <puṇya> [bon])

**The Thai version of this word เลิศ <ləś> [lə̂ːt] is written with a nonetymological final ศ <ś> normally only in Sanskrit loans. Was this word ever spelled as លើឝ <ləś> in Khmer?

The Khmer word is now [ləːh], but the Lao and Thai final [t] and the Khmer spelling point to *[ləːs] at the time of borrowing.

***The Lao spelling implies Pali *Kosī, but the only kosī I can find in The Pali Text Society's Pali-English Dictionary means 'sheath' (and may be derived from Sanskrit kośa- 'id.').

I cannot find a Thai version of the Lao word.

The Khmer version is កោសិយ <kosiya> [kaosəj] which implies a Pali *Kosiya-.

14.2.10.19:28: THE BEST SON?

Last week I rediscovered my copy of สอ เสถบุตร So Sethaputra's New Model English-Thai Dictionary and wondered about the etymology of his surname. Thai surnames are often Indic compounds and may even be romanized according to their Indic-style pronunciation: e.g., the บุตร <ɓutr> -putra of Sethaputra is from Sanskrit putra- 'son' but is actually pronounced [bùt] in Thai.

I would expect เสถ <seth> Setha- [sèːttʰà] to be from a Sanskrit or Pali setha-. However, there is no such word in either language. Moreover, there are no native Thai or Khmer morphemes of the type /CVCʰ(V)/.

My guess was that เสถ <seth> was from Pali seṭṭha- 'best' even though the latter would ideally be spelled with retroflex consonants as เสฏฐ <seṭṭh>. But I forgot that Gedney (1947: 467) had an entry for the surname Sethaputra in his dissertation. He explained that เสถ <seth> is from Pali seṭṭhi- 'merchant' (which in turn is derived from seṭṭha- < Sanskrit śreṣṭha- 'best', so I wasn't too far off). The same Indic root appears with a Sanskrit-style spelling เศรษฐ- <śreṣṭh> in words such as

เศรษฐกิจ <śreṣṭhkic> [sèːttʰàkìt] 'economy' (+ Pali kicca- 'duty')

เศรษฐศาสตร์ <śreṣṭhśāstr̽> [sèːttʰàsàːt] 'economics' (+ Sanskrit śāstra- 'science')

I used to think the first element of those words was 'best', but 'merchant' makes much more sense.

How did an Indic i-word come to end in -a in Thai compounds? Was Pali seṭṭhi- borrowed into earlier Thai as *seːt, just as Sanskrit/Pali jāti- 'birth' was borrowed as *i-less ชาติ <jāti> *ɟaːt (now [tɕʰaːt]) 'nation'? Indic borrowings with final -t in Thai tend to regain their original -a- when in compounds, so the combining form [sèːttʰà] for *seːt might be analogous to combining forms like [ráttʰà] for รัฐ <raṭh> [rát] 'nation' (< Pali raṭṭha- 'kingdom').

14.2.9.22:27: TONE CODES: XK'G- + -vqhslc

In part 3 of "Soni linguae Capitis" I resurrected and revised my old system of codes for Sinospheric tonal categories. That system left out a category of initials that are, as far as I know, unique to Tai: namely, glottals. In this latest version, initial codes are mostly indicated by capital letters that do not overlap with final codes:

Initial consonant class \ Final class	Voiced: -v	Glottal stop: -q	Fricative: -h	Short vowel + final nonglottal stop: -sc	Long vowel + final nonglottal stop: -lc
Voiceless friction: X-	Xv	Xq	Xh	Xsc	Xlc
Voiceless unaspirated: K-	Kv	Kq	Kh	Ksc	Klc
Glottal: '-	'v	'q	'h	'sc	'lc
Voiced: G-	Gv	Gq	Gh	Gsc	Glc

I chose velars (X-, K-, G-) to symbolize initials since they are almost as back as '- for a glottal stop.

I added -s- for 'short' to distinguish -sc tones from -c tones of languages without phonemic vowel length before final nonglottal stops.

I replaced the colon for long vowels with -l- for 'long' to avoid confusion with the colon as a punctuation mark.

I could refer to tones shared by rhymes with short and long vowels followed by nonglottal stops as -slc. Similarly I could refer to other composite categories by combinations of their codes.

Applying this terminology to Muong Yong Lue in northeastern Burma as described in Hudak (2008: 21):

Initial/final	*-V(nasal/glide)	*-ʔ	*-h	*-Vːk/c/t/p	*-Vk/c/t/p
*h/χ/x/s- ...	Xv: high rising	Xq: low level, glottalized	Xh: low rising	Xlc: low rising	Xsc: high rising
*k/c/t/p- ...	Kv: high rising	Kq: low level, glottalized	Kh: low rising	Klc: low rising	Ksc: high rising
*ʔ/ʄ/ɗ/ɓ- ...	'v: mid rising-falling	'q: low level, glottalized	'h: low rising	'lc: low rising	'sc: high rising
*g/ɟ/d/b- ...	Gv: mid rising-falling	Gq: falling, glottalized	Gh: mid level	Glc: mid level	Gsc: mid level

(The list of initials is representative but not comprehensive: e.g., the G-class also includes other voiced initials such as *ɢ-, *ɣ-, *ŋ-, etc.)

There are six tones:

1. XKv/XK'sc = Xv + Kv + Xsc + Ksc + 'sc

2. XK'q = Xq + Kq + 'q

3. XK'hlc = Xh + Kh + 'h + Xlc + Klc + 'lc

4. 'Gv = 'v + Gv

5. Ghslc = Gh + Gsc + Glc

6. Gq

All but 6 occupy two or more cells of the tone box.

XK-tones are identical to '-tones within each column except in the first column: 'v is mid like Gv, not high rising like XKv.

All -q tones have glottalization as a trace of a lost final glottal stop *-ʔ.

-h and -lc tones are identical. I have hence moved the -lc column next to the -h column.
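The six tones can be recovered from the tone box by grouping cells that share a tone value. The following is a small sketch of that grouping, with the Muong Yong Lue values copied from the table above; the names `ROWS`, `COLS`, `VALUES`, and `tone_groups` are mine.

```python
from collections import defaultdict

ROWS = ["X", "K", "'", "G"]
COLS = ["v", "q", "h", "lc", "sc"]
# Muong Yong Lue tone values, row by row, copied from the table above.
VALUES = [
    ["high rising", "low level, glottalized", "low rising", "low rising", "high rising"],
    ["high rising", "low level, glottalized", "low rising", "low rising", "high rising"],
    ["mid rising-falling", "low level, glottalized", "low rising", "low rising", "high rising"],
    ["mid rising-falling", "falling, glottalized", "mid level", "mid level", "mid level"],
]

def tone_groups(rows, cols, values):
    """Map each tone value to the list of tone-box cells that carry it."""
    groups = defaultdict(list)
    for row, line in zip(rows, values):
        for col, value in zip(cols, line):
            groups[value].append(row + col)
    return dict(groups)

groups = tone_groups(ROWS, COLS, VALUES)
for value, cells in groups.items():
    print(f"{value}: {' + '.join(cells)}")
```

Each printed line corresponds to one of the numbered tones, with the cell codes in table order rather than in the order used in the list.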

Twenty boxes are enough for most Tai varieties but not for Bac Va in Vietnam which

exhibits a tonal split in the voiceless friction sounds [i.e., the X-class] that does not occur in other [Tai] dialects Gedney has studied. In this case, the proto-initials in the top row [see table below] include aspirated stops and voiceless fricatives such as [f] and [h]. In the second row are the preaspirated sonorants [e.g., *ʰm = *m̥] and [s].

I could call the second row the S-class (with an upper case S to distinguish it from lower case s for short vowel):

Bac Va tones

Initial/final	*-V(nasal/glide)	*-ʔ	*-h	*-Vːk/c/t/p	*-Vk/c/t/p
*h/χ/x- ...	Xv: low rising	Xq: low-level, glottalized	Xh: low falling, glottalized	Xlc: low rising	Xsc: mid level
*s/ŋ̊/ɳ̊/n̥/m̥- ...	Sv: high rising	Sq: low level	Sh: low rising	Slc: low rising	Ssc: high level
*k/c/t/p- ...	Kv: high rising	Kq: low-level, glottalized	Kh: low rising	Klc: low rising	Ksc: high level
*ʔ/ʄ/ɗ/ɓ- ...	'v: high rising	'q: low level	'h: low falling, glottalized	'lc: low rising	'sc: high level
*g/ɟ/d/b- ...	Gv: mid level	Gq: high level	Gh: mid-falling, glottalized	Glc: mid-falling, glottalized	Gsc: mid level

There are eight tones:

1. Xv/SKh/XSK'lc

2. SK'v

3. Gv/XGsc

4. X'h

5. Ghlc

6. XKq

7. S'q

8. Gq/SK'sc

The S-class puzzles me because I would expect the fricative *s- to be in the X-class rather than in a class with voiceless sonorants. I doubt the latter were *s-clusters at the time of the tone split, though perhaps such clusters existed at an even earlier date. Moreover, why are S-class tones almost identical to '-class tones? *s- and voiceless sonorants are not more like glottals than *x- and aspirated stops.
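How close the S-class is to each of the other classes can be read off the table column by column. Here is a minimal sketch of that comparison, with the Bac Va values copied from the table above; the names `BAC_VA` and `differing_columns` are hypothetical, and the only liberty taken with the data is spelling "low-level" as "low level" throughout.

```python
COLS = ["v", "q", "h", "lc", "sc"]
# Bac Va tone values copied from the table above
# ("low-level" is normalized to "low level").
BAC_VA = {
    "X": ["low rising", "low level, glottalized", "low falling, glottalized",
          "low rising", "mid level"],
    "S": ["high rising", "low level", "low rising",
          "low rising", "high level"],
    "K": ["high rising", "low level, glottalized", "low rising",
          "low rising", "high level"],
    "'": ["high rising", "low level", "low falling, glottalized",
          "low rising", "high level"],
    "G": ["mid level", "high level", "mid-falling, glottalized",
          "mid-falling, glottalized", "mid level"],
}

def differing_columns(a, b):
    """Return the final classes in which rows a and b have different tones."""
    return [col for col, x, y in zip(COLS, BAC_VA[a], BAC_VA[b]) if x != y]

# Number of distinct tone values in the whole box.
print(len({value for row in BAC_VA.values() for value in row}))

for other in ["X", "K", "'", "G"]:
    print(f"S vs {other}:", differing_columns("S", other))
```

The per-class lists make it easy to see in which columns the S-class parts ways with each of the other rows.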

A bigger question is why some Kra-Dai (or should I say Kam-Tai?*) languages have so many tones: e.g., nine in Kam open syllables as opposed to a maximum of six for Chinese open syllables. The five-way split in Bac Va could have resulted in up to fifteen tones in open syllables (< *-V/*-ʔ/*-h) in the past. It is as if Kra-Dai speakers beat the Chinese at their own tonogenetic game. (Chinese developed tones early in the first millennium AD, and Kra-Dai languages probably soon followed.)

*Kra languages do not have large numbers of tones. According to Ostapirat (2000), Buyang has four, Pubiao has six, Paha has five, etc. Ostapirat (2000: 1) grouped Kam-Sui and Tai into a Kam-Tai subgroup, whereas Norquest (2007: 16) placed Kam and Tai in northern and southern subgroups of Kra-Dai. I have not yet investigated the problem of Kra-Dai subgrouping and do not have an opinion.