10.4.17.23:51: ĐỊA DANH DIVERGENCE: DEEP DEVOTION? (PART 1)
Last night, I predicted that (Sino-)Vietnamized names would be in decline in Vietnamese. I thought 深圳 Shenzhen might be a good test case for my hypothesis since it is a name that has only recently become famous. Its Chinese spelling contains a low-frequency character 圳 (#3695 on this list compiled in 1993-94, 13-14 years after Shenzhen began its rise to fame as a 'special economic zone'; it must be much more frequent now). This character is
- not in any dictionaries I can find earlier than 龍龕手鑑 Longkan shoujian (997 AD), postdating the independence of Vietnam (and the borrowing of the later layers of Sino-Vietnamese). Longkan shoujian contains 26,000+ characters, including many obscure variants. Apparently not even the gigantic Jiyun (1037) lists 圳 among its 53,525 characters.
- not in my pocket Sino-Japanese dictionary (大修館新漢和辞典) with 9,000+ character entries
- not in my pocket Sino-Korean dictionary (동아 現代活用玉篇) with 7,000+ character entries (but it is in my pocket SK dictionary 새字典 with 23,558 character entries!)
- not in 五千字 Ngũ thiên tự (Five Thousand Characters) a premodern Vietnamese primer with 5,000 characters (hence the title)
- not at nomfoundation.org (unfortunately huesoft is down, so I can't check there)
Thus I wonder whether 圳 was known to premodern Vietnamese literati. If it wasn't, it wouldn't have a Sino-Vietnamese reading. And even if it did have one, it might be difficult to find. So I guessed that Shenzhen would be called Shenzhen in Vietnamese.
However, the Vietnamese Wikipedia article for it is titled Thâm Quyến. Moreover, the online Vietnamese edition of Nhân Dân (人民 People), the official newspaper of the Communist Party of Vietnam, uses Thâm Quyến, not Shenzhen, which appears only in its English edition.
Thâm is the SV reading of 深 'deep', but I didn't expect Quyến as the SV reading of 圳 'drain'. 圳 is homophonous with 震 'shake' in Chinese, so I would have assumed that the SV reading of 圳 would be homophonous with 震 SV chấn, not 眷 SV quyến 'attached, devoted to'.
Next: What is the 元/源 nguyên (origin) of 圳 Quyến?
In my last post, I gave Têhêrăng 'Tehran' as an example of a loanword in Vietnamese borrowed after the shift of *t- to implosive ɗ- (spelled đ- since the 17th century). I think the word was probably borrowed from French because of its final -ng which I presume corresponds to a French nasal vowel: cf. Li-băng 'Lebanon', probably from French Liban. Têhêrăng appears in Ho Chi Minh's 1945 Tuyên ngôn độc lập (宣言獨立) 'Declaration of Independence'* predating American influence. Almost all Google hits for Têhêrăng link to copies of the text of that speech, so I assume that name is obsolete. The Vietnamese Wikipedia article for Tehran is titled "Tehran", though it notes that Tehran
đôi khi còn được viết thành Teheran hoặc Têhêrăng
'... is also sometimes still written as Teheran or Têhêrăng'
Offhand, I think there are five strata of foreign place names (địa danh [地名] 'earth names') in Vietnamese:
1. Traditional Sinospheric names converted into Sino-Vietnamese: e.g.,
Triều Tiên 'Korea' < Chn 朝鮮
Nhật Bản 'Japan' < Chn 日本
2. Local names with or without Vietnamization presumably dating from the nam tiến (南進) 'southern advance', the Vietnamese expansion to the non-Vietnamese areas to the south: e.g.,
Kon Tum (would be *Con Tum if it were Vietnamized; Ko violates Vietnamese spelling conventions**) < Bahnar kon 'village' + tum 'pool'
Pleiku (also semi-Vietnamized as Plei Cu, Plây Cu, Plây Ku and Plei Ku; p[l]- is not permissible in core standard Vietnamese phonology and Ku violates Vietnamese spelling conventions**; is the name Bahnar?)
3. Names borrowed from Chinese and converted into Sino-Vietnamese: e.g.,
Thổ Nhĩ Kỳ 'Turkey' < Mandarin 土耳其 Tuerqi 'earth ear its' (a phonetic transcription)
Cựu Kim Sơn 'San Francisco' < Cantonese 舊金山 Gaugamsaan 'Old Gold Mountain'
This stratum is structurally like stratum 1. The only difference is that these names are newer than the stratum 1 names. No one in ancient Vietnam had ever heard of Thổ Nhĩ Kỳ or Cựu Kim Sơn.
4. Names borrowed from French with or without Vietnamization: e.g.,
Maroc 'Morocco' (would be *Ma Rọc if Vietnamized)
Mauritanie 'Mauritania' (would be *Mo Ri Ta Ni if Vietnamized)
5. Postcolonial names with or without Vietnamization: e.g.,
Campuchia 'Cambodia' < Khmer កម្ពុជា Kampuciə
(would be *Cam Bu Chia if fully Vietnamese)
cf. English Kampuchea
cf. stratum 4 Cam Bốt and Căm Bốt < French Cambodge
cf. stratum 1 Cao M(i)ên < Chinese 高棉, transcription of Khmer
Zimbabwe or Vietnamized Gim-ba-buê 'Zimbabwe'
Kazakhstan or Vietnamized Ca-giắc-xtan 'Kazakhstan'
Here's a list of country names in Vietnamese. Cambodia is not the only country with three different names. Others are Argentina, Belarus, Canada, etc.
I hypothesize that over time
- stratum 1 names will remain stable for well-known Sinospheric places like Japan (which will not become Nihon, Nippon, or Japan in Vietnamese)
- stratum 2 names will also remain stable
- no new stratum 3 names will be created with the exception of Chinese names that will continue to be Sino-Vietnamized
- less frequent stratum 1, 3, and 4 names will become obsolete and be replaced by stratum 5 names: e.g., in Google
Campuchia has 3.06 million Vietnamese language hits (VLH)
Cam Bốt (stratum 4) has 280,000 VLH
Cao Miên (stratum 1) has 110,000 VLH
Căm Bốt (stratum 4) has 25,900 VLH
Cao Mên (stratum 1) has 13,600 VLH
My 1966 Vietnamese-English dictionary only has entries for Cao M(i)ên.
- all new place names will be non-Vietnamized stratum 5 names: e.g., if Somewherestan is established tomorrow, it will be called Somewherestan rather than Xơm-vê-rơ-xtan.
These trends also apply to personal names: e.g., my 1966 dictionary lists Cam-Địa < Md 甘地 Gandi ('sweet land'!) for 'Gandhi' but his Vietnamese Wikipedia entry is titled Mahatma Gandhi. (However, Hu Jintao 胡锦涛 is Hồ Cẩm Đào.)
Similar trends generally won't occur in other East Asian languages*** because Chinese characters, kana, and hangul all entail some degree of phonetic divergence from foreign originals, whereas Roman letter spellings can be carried over intact into the Vietnamese alphabet. There is no Chinese character or kana for syllables with initial st- and current mainstream hangul input methods and fonts do not allow users to easily type syllables with initial st-. (Such syllables did exist in Middle Korean.) So I don't think that
Md 哈萨克斯坦 Hasakesitan
Kor 카자흐스탄 Khajahŭsŭthan
Jpn カザフスタン Kazafusutan
will be replaced by Kazakhstan. Right now, Vietnamese Ca-giắc-xtan has 257,000 VLH, but Kazakhstan is catching up at 216,000 VLH, and the Vietnamese Wikipedia page titled "Kazakhstan" doesn't even mention the Vietnamized form.
*獨立宣言 'independence declaration' in other East Asian languages, which have modifier-modified order instead of the Vietnamese (and general Southeast Asian) modified-modifier order.
**Vietnamese spelling allows k only before front vowels (i, ê, e); c is written elsewhere:
| kia, kiê- | cưa, cươ- | cua, cuô- |
Hyphens indicate that another letter must follow. No written Vietnamese syllable can end in -iê, -ươ, -uô, -ă, or -â.
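The k/c rule in this footnote can be written as a tiny function. This is only a sketch under my own assumptions: the helper names `base_letter` and `spell_k` are hypothetical, the front vowel letters are taken to be i, y, e, and ê, and the separate qu- spelling is ignored.

```python
import unicodedata

def base_letter(ch: str) -> str:
    """Strip tone and vowel-quality marks by canonical decomposition (ế -> e, ư -> u)."""
    return unicodedata.normalize("NFD", ch)[0].lower()

def spell_k(rime: str) -> str:
    """Spell initial [k] as k before a front vowel letter (i, y, e, ê), otherwise as c."""
    # After decomposition, e and ê share the base letter e; ư, ơ, ă, â do not count as front.
    return ("k" if base_letter(rime[0]) in "iye" else "c") + rime

for rime in ["ia", "ưa", "ua", "on"]:
    print(rime, "->", spell_k(rime))
```

The last rime shows why Kon Tum would be *Con Tum if it followed the convention: o is not a front vowel letter.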
***Korean diverges less than Vietnamese from foreign names in at least one way. Sino-Koreanized Chinese names may be on the wane: e.g., the Google hits for Hu Jintao:
후진타오 Hu Chinthao: 475,000
호금도 Ho Kŭm-do (Sino-Koreanized): 128,000
cf. Vietnamese:
Hu Jintao (non-Vietnamized): 4,700
Hồ Cẩm Đào (Sino-Vietnamized): 5.82 million
However, the Korean numbers for his predecessor Jiang Zemin show the opposite trend:
장쩌민 Chang Tchŏmin: 56,300
강택민 Kang Thaeng-min (Sino-Koreanized): 69,500
And established older Sino-Koreanized Chinese names are probably here to stay: e.g., 모택동 Mo Thaek-tong instead of 마오쩌둥 Mao Tchŏdong for Mao Zedong.
10.4.15.23:59: VEXING VOICING: B- AND D- IN SOUTHEAST ASIA (PART 2)
Last night, I wrote about these sets of correspondences between Indic and Southeast Asian languages:
Indic voiced : SEA voiceless
| Sanskrit, Pali | Thai, Lao | Khmer |
| b- | ph- | p- |
| d- | th- | t- |
Indic voiceless : SEA voiced
| Sanskrit, Pali | Thai, Lao, Khmer |
| p- | b- |
| t- | d- |
I forgot to explicitly mention a third correspondence pattern (though I hinted at it with Khmer taaraa < Skt/Pali taaraa 'star'):
Indic voiceless : SEA voiceless
| Sanskrit, Pali | Thai, Lao, Khmer |
| p- | p- |
| t- | t- |
Pali paññaa 'intelligence'
Thai ปัญญา panyaa
Lao ປັນຍາ panñaa
Khmer បញ្ញា paññaa
Skt/Pali taala 'palm' (plant)
Thai ตาล taan
Lao ຕານ taan
Khmer តាល taalaʔ- (in compounds)
Why would Indic p- and t- be borrowed both as b- and d- and as p- and t-?
Here's a solution combining elements from Gedney (1947) and Ferlus (1992).
1. Khmer borrowed Indic p- and t- as p- and t-:
Khmer *paalii < Pali paali 'Pali'
Khmer *taaraa < Skt/Pali taaraa 'star'
2. Khmer p- and t- shifted to implosive ɓ- and ɗ- as in Vietnamese:
Khmer *ɓaalii < *paalii
Khmer *ɗaaraa < *taaraa
3. Khmer borrowed Indic words after this shift: e.g.,
Khmer *paññaa < Pali paññaa 'intelligence'
Khmer *taal < Skt/Pali taala 'palm' (plant)
Khmer *taaraa < Skt/Pali taaraa 'star' (reborrowed)
These new borrowings were not affected by the shift that predated them, so they have p- and t- corresponding to Indic p- and t-.
4. The ancestor(s) of Thai and Lao borrowed both layers of Indic borrowings from Khmer:
- the earlier layer with implosives: e.g.,
Thai baalii < Khmer *ɓaalii < *paalii < Pali paali 'Pali'
Thai daaraa < Khmer *ɗaaraa < *taaraa < Skt/Pali taaraa 'star'
- the later layer without implosives: e.g.,
Thai panyaa, Lao panñaa < Khmer *paññaa < Pali paññaa 'intelligence'
Thai, Lao taan < Khmer *taal < Skt/Pali taala 'palm' (plant)
4.16.1:46: Vietnamese also has two similar layers of borrowings:
- an earlier layer with implosives: e.g.,
binh [ɓiɲ] < *piɲ < Colonial Chinese 兵 *piŋ 'soldier'
đế [ɗe] < *té < Colonial Chinese 帝 *tèj 'emperor'
- a later layer without implosives: e.g.,
pin < French pile 'battery'
Têhêrăng [tehezaŋ] < French Téhéran? 'Tehran'
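The relative chronology behind the two layers can be mimicked with a toy simulation. This is a sketch, not a claim about how the changes actually operated: the event list, `apply_shift`, and `borrow_in_order` are my own hypothetical names, and the shift is simplified to word-initial p- and t- only.

```python
# Toy model: a sound shift affects only the words already in the lexicon when it applies.
IMPLOSIVE_SHIFT = {"p": "ɓ", "t": "ɗ"}

def apply_shift(word: str) -> str:
    """Shift word-initial p-/t- to implosive ɓ-/ɗ-; leave other initials alone."""
    return IMPLOSIVE_SHIFT.get(word[0], word[0]) + word[1:]

def borrow_in_order(events):
    """Process ('borrow', form) and ('shift', None) events in chronological order."""
    lexicon = []
    for kind, form in events:
        if kind == "borrow":
            lexicon.append(form)
        elif kind == "shift":
            lexicon = [apply_shift(word) for word in lexicon]
    return lexicon

events = [
    ("borrow", "paalii"),  # step 1: early Indic loans
    ("borrow", "taaraa"),
    ("shift", None),       # step 2: p-, t- become implosives
    ("borrow", "paññaa"),  # step 3: later loans escape the shift
    ("borrow", "taal"),
]
print(borrow_in_order(events))
```

Words borrowed before the shift come out with ɓ-/ɗ-, and words borrowed after it keep p-/t-, which is the pattern of Vietnamese binh and đế versus pin and Têhêrăng as well.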
10.4.14.23:59: VEXING VOICING: B- AND D- IN SOUTHEAST ASIA (PART 1)
In my experience, initial voiced obstruents more often become voiceless than the other way around: e.g.,
b- > p-
is more common than
p- > b-
and
ph- > b(h)-
seems to be nonexistent!
I specified initial position because voiceless obstruents often voice in intervocalic position (e.g., in Korean), where they are flanked by voiced vowels:
V1pV2 > V1bV2
p in this situation is assimilating to the voiced segments around it. If one thinks of phonemic features in terms of 'switches', the speaker turns on the 'voicing' switch for V1. Voicing -p- allows the speaker to keep the 'voicing' switch turned on all the way through V2 without having to turn it off and then on again for -p-:
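Before voicing assimilation
| segment | V1 | -p- | V2 |
| voicing | on | off | on |
After voicing assimilation
| segment | V1 | -b- | V2 |
| voicing | on | on | on |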
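The 'switch' metaphor can be made concrete by counting how often voicing changes state inside a word. This is a minimal sketch with a toy segment inventory that I am assuming for illustration; `VOICED` and `voicing_toggles` are hypothetical names.

```python
# Toy inventory (an assumption): vowels and b, d, g count as voiced; everything else as voiceless.
VOICED = set("aeioubdg")

def voicing_toggles(segments: str) -> int:
    """Count how many times the voicing 'switch' changes state between adjacent segments."""
    states = [segment in VOICED for segment in segments]
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)

print(voicing_toggles("apa"))  # voicing goes on, off, on
print(voicing_toggles("aba"))  # voicing stays on throughout
```

On this count, intervocalic voicing removes both toggles, which is one way of stating why V1pV2 > V1bV2 is such a natural change.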
Are voiceless obstruents favored in initial position because they are transitions between the preceding silence and a voiced vowel? Conversely, is the switch from silence to the full voicing of an initial consonant (e.g., ...b-, with '...' representing silence) relatively less common than the switch from silence to a voiceless initial (e.g., ...p-) because it's abrupt?
Recently, David Boxenhorn and I have been talking about the voiceless-to-voiced change
q > g
in Arabic. Was there a transitional stage *ɢ? David pointed out that such a voiced uvular stop was more marked than either q or g. Here are the frequencies of those three segments at UPSID:
For comparison, voiceless k is in 89.36% of the UPSID languages and 97.12% have some kind of k-like stop. (Which languages don't have any? I'll answer this question in a future post.)
According to UPSID, Somali has ɢ but not q. Ehret (1995: 236) lists an etymology deriving Somali q (phonetically [ɢ]?) from Proto-Afroasiatic *k'. According to Wikipedia, voiceless χ in Arabic loans is 'Somalized' as ɢ. (I would have expected voiceless h.)
The unusual voicing of native and borrowed voiceless stops in Somali reminds me of these phenomena in Southeast Asian languages:
Voicing in Vietnamese
Vietnamese has no initial p- because *p- merged with *b- to become implosive b- [ɓ].
Similarly, *t- and *d- merged into implosive đ- [ɗ]. Vietnamese t- is from *(t)s- and *(d)z-. I assume that *(t)s- became *t- after the old *t- became đ- [ɗ].
However, *c- and *k- did not voice whereas their voiced counterparts *ɟ- and *g- devoiced. Voiced palatal and velar implosives (ʄ, ɠ) are rare (2.22% and 1.11% in UPSID), so their absence is unsurprising.
*ts- and *dz- were only in Chinese loanwords, whereas all other initials including *s- were in native words. The early loanword chữ 'character' has ch- < *ɟ- for Colonial Chinese *dz- whereas the later loanword tự has t- < *(d)z-. chữ was borrowed before Vietnamese speakers added *(d)z- to their phonetic repertoire.
b- [ɓ] < *p-, *b-
đ- [ɗ] < *t-, *d-
t- < *(t)s-, *(d)z-
ch- [c] < *c-, *ɟ-
x- [ɕ] < *ch-
c-/k- [k] < *k-, *g-
(p- marginal; only in loanwords; not from native *p-)
ph- [f] < ph-
th- (the only aspirate left!)
x- [s] < [ɕ]
kh- [x] < kh-
I'll discuss the shift of *kj- to voiced gi- [z] in a future post.
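The mergers listed above can be restated as a lookup table. This is only a sketch: the dictionary covers just the unaspirated stops and sibilants from the list, the modern side uses Vietnamese spellings, and `REFLEXES` and `merger_groups` are hypothetical names of my own.

```python
# Earlier initial (without the asterisk) -> modern Vietnamese spelling, following the list above.
REFLEXES = {
    "p": "b", "b": "b",
    "t": "đ", "d": "đ",
    "ts": "t", "s": "t", "dz": "t", "z": "t",
    "c": "ch", "ɟ": "ch",
    "k": "c/k", "g": "c/k",
}

def merger_groups(reflexes):
    """Group earlier initials by the modern initial they merged into."""
    groups = {}
    for proto, modern in reflexes.items():
        groups.setdefault(modern, []).append(proto)
    return groups

for modern, protos in merger_groups(REFLEXES).items():
    print(modern, "<", ", ".join("*" + p for p in protos))
```

The chữ/tự pair falls out of the table: a Chinese *dz- heard as *ɟ- ends up as ch-, while one borrowed later as *(d)z- ends up as t-.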
Voicing in Thai, Lao, and Khmer
In these three languages, initial voiced obstruents usually devoiced: e.g.,
Skt braahmaṇa 'brahman':
Thai พราหมณ์ phraam
Lao ພາມ phaam
Khmer ព្រាហ្មណ priəm
Skt deśa, Pali desa 'country'
Thai เทศ theet
Lao ເທດ theet
Khmer ទេស teeh < *-s
The aspiration in Thai and Lao implies intermediate stages like
*b- > *bh- > ph-
*d- > *dh- > th-
which are reminiscent of the middle stage of Arabic q- > ɢ- > g- because voiced aspirates are unusual (though not as unusual as ɢ).
However, some initial voiceless obstruents in Indic loanwords voiced (cf. Vietnamese *p- > [ɓ] and *t- > [ɗ]):
Skt or Pali paada 'foot' (cognate to Eng foot):
Thai บาท baat 'baht' (the monetary unit)
Lao ບາດ baat
Khmer បាទ baat
Skt or Pali taaraa 'star':
Thai ดารา daaraa
Lao ດາຣາ daaraa
Khmer តារា daaraa (also taaraa; both are spelled with ត t)
What's going on here? Thai, Lao, and Khmer b and d go back to implosive *ɓ and *ɗ (and are still generally implosive in modern Khmer and sometimes implosive in Thai; I don't know if Lao b and d are really [b d] or not). But I've never heard of implosive allophones for p- and t- in any Indic language. Did Khmer once have Korean-like reinforced consonants (pp-, tt-) when it borrowed from Indic?
Skt/Pali taaraa > early Khmer *ttaaraa? > later Khmer *ɗaaraa > Southwestern Tai *ɗaaraa > Thai / Lao daaraa
Lee and Ramsey (2000: 62) compared Korean reinforced consonants to the voiceless unaspirated stops of French and Mandarin.
One Koreanization of Paris is 빠리 Ppari (355,000 Google hits); the other is 파리 Phari with an un-French aspirated ㅍ ph (7,009,000 Google hits).
Beijing [pejtɕiŋ] has been Koreanized as both 뻬이징 Ppeijing (28,900 Google hits) and 베이징 Peijing (7,004,000 Google hits).
4.15.2:35: There are two huge problems with the reinforced hypothesis.
First, voiceless reinforced consonants are rare. pp- is in only 1.11% of UPSID and Korean is the only language in UPSID with dental tt-. No modern Southeast Asian languages have them and there is no evidence for them in earlier SEA languages. I prefer not to reconstruct rare segments (though by that criterion, one wouldn't reconstruct Korean properly).
Second, UPSID has no language with just pp- and tt-. Korean has a full set (pp-, tt-, cc-, kk-) as do Shuswap and Siona in the Americas, but there is no evidence for cc- and kk- in Khmer.
4.15.4:09: There's no need to posit exotic consonants in early Khmer to explain why Indic p- t- correspond to Khmer b- d-. Khmer is like Vietnamese but with a twist I'll explain next time.
10.4.13.2:57: WAS THERE A *KRP-ORATE CLUSTER IN 'CLOTH'?
Last night I wrote about Southeast Asian words for 'skirt'. The Thai word ซิ่น sin has a longer variant ผ้าซิ่น phaa sin with ผ้า phaa 'cloth'. The Lao equivalent ຜ້າ phaa appears twice in this memoir. The spellings of the Thai and Lao forms for 'cloth' indicate a Proto-Southwestern Tai *phaa with tone C going back to Proto-Tai *phɯa (in Li's 1977 reconstruction system at Proto-Tai'o'matic) vaguely resembling Old Chinese 布 *pas 'cloth'. There are three problems with a Tai-OC comparison:
First, the initials do not match: PT *ph- : OC *p-
Second, the vowels do not match: PT *ɯa : OC *a
Third, the tone classes do not match: PT *tone C corresponds to OC *-ʔ, not *-s.
Pittayawat Pittayaporn's recent proposal may help solve the first problem. According to him, Proto-Tai had no aspirates:
Aspirated onsets in cognates found across modern Tai varieties developed mainly from
PT clusters with medial *-r-, e.g. *pr-
PT uvular consonants, e.g. *q-
Loanwords, especially from Chinese
The word for 'cloth' is of Austroasiatic origin. Schuessler (2007: 173) lists a Proto-Austroasiatic *k-rn-paas that is the source of Sanskrit karpaasa 'cotton', a word that traveled westward to become Greek κάρπασος and Latin carbasus. Ferlus reconstructed Proto-Viet-Muong *k-paas as the source of Vietnamese vải 'cloth'.
Perhaps the ph- of Thai and Lao phaa 'cloth' is from an earlier cluster:
a. *k-p- (cf. Middle Korean ph- < *kp-?) or
b. *pr- < *r-p- or even
c. *k-r-p- > *kh-p- > *x-p- > *px- > ph-.
Northern Tai dialects like Po-ai pɨɨ have a tone implying a *voiced initial in this word (Li 1977: 64). That *voicing may reflect an earlier voiced segment:
a. *pr- > *br- > *b- (4.15.0:22: cf. Laha blaa < *p-la 'fish' [Ostapirat 2000: 225]) or
b. *rp- > *rb- > *b- or
c. *rp- > *np- > *mp- > *mb- > *b-
d. *(k-r)n-p- > *mp- > *mb- > *b-
It may not be necessary to reconstruct a PT *phɯa if Tai languages borrowed their words for 'cloth' from two or more sources.
Po-ai pɨɨ < *-ɨə? could be a loan from a 'reversed type' Chinese form like *pɨəh < *Cɯ-pas. (The *C- might have been *k-.) Proto-Min *pio (Schuessler 2007: 173) may descend from such a form.
Note that Po-ai has tone B instead of tone C. Tone B regularly corresponds to Chinese *-h < *-s.
The *tone C of the Proto-Southwestern Tai form may imply a foreign original with *-ʔ, though all data point to *-s.
Did some Austroasiatic language shift *-s to *-t and then *-ʔ?
Or could this *tone C reflect a borrowing from a Chinese dialect that had shifted *-h to a 'departing tone' sounding like Proto-Southwestern Tai *tone C?
The tone C : 'departing tone' correspondence is reminiscent of the sắc/nặng : 'departing tone' correspondence in the main layer of Sino-Vietnamese, though that layer has bố for 'cloth' with -ố rather than an -a like the -aa of Thai and Lao.
Perhaps Proto-Southwestern Tai *phaa with tone C reflects a southern Chinese archaism *phà retaining Old Chinese *a (cf. other archaic words retaining *a: 怕 Md pa < OC *paks 'to fear', 他 Md ta < OC *hlaj 'he').
10.4.12.1:32: VOX POPULI LAOTIAE
I Googled the following romanizations of
'voice people' = Voice of the People
the title of the newspaper of the Lao People's Revolutionary Party. Nonetymological romanizations are starred.
siang pasason: 1,160 (77.7%)
siang paxaxon: 0
*xiang pasason: 0
*xiang paxaxon: 0
sieng pasason: 204 (13.7%)
sieng paxaxon: 3
*xieng pasason: 126 (8.4%)
*xieng paxaxon: 0
The most common romanization is the most obvious one. The s/x distinction lost in speech is on the wane, as is the use of -ie- for ຽ [iə] (reflecting an earlier, now extinct [ie] pronunciation?).
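The percentages in the list can be rechecked from the raw hit counts. This is a small sketch using only the counts given above; the variable names are my own.

```python
# Google hit counts per romanization, copied from the list above.
hits = {
    "siang pasason": 1160,
    "siang paxaxon": 0,
    "xiang pasason": 0,
    "xiang paxaxon": 0,
    "sieng pasason": 204,
    "sieng paxaxon": 3,
    "xieng pasason": 126,
    "xieng paxaxon": 0,
}

total = sum(hits.values())
# Each spelling's share of all hits, as a percentage rounded to one decimal place.
shares = {spelling: round(100 * count / total, 1) for spelling, count in hits.items()}

for spelling, share in shares.items():
    print(f"{spelling}: {hits[spelling]} ({share}%)")
```

The three hits for sieng paxaxon, which have no percentage in the list, come to about 0.2% of the total.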
Xieng is etymologically wrong because [siəŋ] 'voice' never had initial x- ([ɕ]?). Jonsson (1991) reconstructed Proto-Southwestern Tai *siaŋ 'sound'. Schuessler (2007: 460) thought the word looks like a loan from a southern Chinese language: cf. Proto-Min 聲 *šiaŋ ~ *tshiaŋ. (Northern Middle Chinese had a mid vowel: *ɕieŋ.) [siəŋ] also resembles Sino-Japanese shou < siyaũ, a borrowing from southern Early Middle Chinese via Paekche. Proto-Tai had no *š- or *ts(h)- so such initials could have been borrowed as *s-.
Vietnamese tiếng < *siə́ŋ 'voice, sound, noise, language' has a t- < *s- and a sắc tone instead of the th- < *ɕ- and ngang tone in thanh < *ɕɛɲ?, a later borrowing of the same Chinese word from Colonial Chinese *ɕiɛɲ. Perhaps there was an early Colonial Chinese form *siaŋ-ʔ with a nonpalatal *s- and a glottal stop suffix conditioning the sắc tone in Vietnamese. Does any modern Chinese language have a 'rising tone' for 聲 corresponding to the sắc tone of tiếng?
The *s- of Proto-Tai and early Vietnamese indicate that 聲 was borrowed long after Early Old Chinese (EOC). The phonetic series of 聲 suggests an initial other than simple *s-:
殸磬罄 EOC *qheŋs
謦 EOC *qheŋʔ
Schuessler (2009: 136) reconstructed 聲 as EOC *hjeŋ, similar to Starostin's *heŋ. I reconstruct *sɯ-qheŋ with a high-vowelled presyllable conditioning nonemphasis and vowel warping:
*sɯ-qheŋ > *sɯ-kheŋ > *skheŋ > *xeŋ > *xieŋ > *ɕieŋ (> southern *siaŋ?)
馨 'fragrance' is almost homophonous except for emphasis:
*s(V)-qheŋ > *sqheŋ > *χeŋ
Its *s(V)- lost its vowel and was absorbed into the following emphatic syllable.
馨 may be an ablaut nonemphatic variant of 香 'fragrance':
*sɯ-qhaŋ > *sɯ-khaŋ > *skhaŋ > *xaŋ > *xɨaŋ (and eventually the Hong of Hong Kong)
10.4.11.22:36: IS SYLLABLE-FINAL -H IN LAO ROMANIZATION ETYMOLOGICAL?
In Khmer romanization
- final -nh represents [ɲ] (as in Vietnamese spelling): e.g., ភ្នំពេញ Phnom Penh [pnum pɨɲ]
- final -ch represents a palatal stop [c] (also as in Vietnamese): e.g., សម្ដេច samdech [sɑmdac] (a Khmer royal title)
- final -h after other consonants represents an Indic aspirate: e.g., មុនីនាថ Monineath [muniiniət] is from Sanskrit muniinaatha. (I have no idea why u was romanized as o.)
I had long assumed that syllable-final -h in Lao romanization had similar functions:
-nh [n] represents an Indic or Khmer ñ
-ch [t] represents an Indic or Khmer c(h)
-h after other consonants represents an Indic aspirate: e.g., -th [t] < Indic th
However, I have found counterexamples with nonetymological h: e.g.,
ເສດຖາທິຣາດ Setthathirath [seetthaathiraat] < Skt śreṣṭhaadhiraaja 'best emperor'?*
I would expect -t or -j, not -th.
ພູມສະຫວັນ Phoumsavanh [phumsawan] < Skt bhuumisvarga 'earth heaven'?
I would expect -n, -nn, or even -ne, not -nh. (-nn and -ne are devices to indicate [n] as opposed to a nasalized vowel for French readers of Lao romanization.)
Are these uses of -h idiosyncrasies that somehow caught on, like the silent -h- of the Korean surname 李 Rhee [i] (no consonant!) < ri that corresponds to nothing in Korean?
*The Thai spelling เชษฐาธิราช implies Skt jyeṣṭhaadhiraaja 'eldest/best emperor', but I wonder if เชษ sêet is an attempt to transcribe the low falling tone of Lao ເສດ sêet without regard for etymology. The etymologically correct เศรษ sèet would have a low nonfalling tone.
4.12.0:02: This Lao author romanized ສິ້ນ [sin] (spelling implying *sin) 'Lao skirt' as sinh with a nonetymological -h. The word corresponds to
Thai ซิ่น sin (spelling implying *zin; no final -ญ corresponding to -nh) 'wrap-around skirt, sarong; skirt'
Khmer ស៊ឹង sɨŋ (without a final -ញ [ɲ] = -nh) 'kind of cotton skirt with floral pattern / stripes (commonly worn by young women and girls)'
I don't know why Khmer has a different nasal. Khmer -ŋ normally corresponds to Lao and Thai -ŋ, not -n. I guess Khmer borrowed the word from some language in which *-n had backed to -ŋ. The nonmatching Thai and Lao spellings imply that one or both may be nonetymological: i.e., one borrowed the sound of the word from the other without regard for etymology.
4.12.2:33: Vietnamese xiêm [siəm] 'skirt' has yet another coda! The x- may not necessarily indicate an original *ch-. The word could have been borrowed from an *s-original after *ch- > Middle Vietnamese ɕ-, or even after ɕ- became [s].
4.13.1:31: The trəysap diacritic in Khmer ស៊ឹង sɨŋ probably indicates a loanword. Vowels after Khmer s- without trəysap are nonhigh but are high(er) after s- plus trəysap:
សឹង səŋ 'almost all' (without trəysap)
ស៊ឹង sɨŋ 'kind of skirt' (with trəysap)
High(er) vowels normally reflect proto-voiced initials, so one might initially guess that sɨŋ is from earlier *zɨŋ. However, earlier Khmer had no *z-. Therefore sɨŋ may be a loan from a language which allowed voiceless s- to be followed by a high vowel.
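The two spellings above differ by a single code point, which makes the diacritic easy to detect mechanically. This is a sketch assuming that the trəysap is the character Unicode calls KHMER SIGN TRIISAP (U+17CA); `has_triisap` is a hypothetical helper name.

```python
import unicodedata

TRIISAP = "\u17ca"  # assumed to be the trəysap: KHMER SIGN TRIISAP

def has_triisap(word: str) -> bool:
    """Return True if the Khmer spelling contains the trəysap sign."""
    return TRIISAP in word

without_sign = "\u179f\u17b9\u1784"     # សឹង səŋ 'almost all'
with_sign = "\u179f\u17ca\u17b9\u1784"  # ស៊ឹង sɨŋ 'kind of skirt'

print(unicodedata.name(TRIISAP))
print(has_triisap(without_sign), has_triisap(with_sign))
```

A check like this only flags the spelling; whether a flagged word is really a loanword still has to be argued case by case.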