The Vietnamese words for 'gecko' look like borrowings:

tắc kè ~ cắc kè ~ cắc ké

Vietnamese roots are monosyllabic. If this were a typical native Vietnamese term, it should be a compound, but none of these disyllabic sequences make sense as compounds. Moreover, the variation of initials and tones (t ~ c and ˋ ~ ˊ) is alien to Vietnamese. Similar words are found not only in other Mon-Khmer languages but also in unrelated Southeast Asian languages: e.g.,


Mon kapke ~ kapkai

Khmer តឹកកែ  <tɨkkɛ>


Thai ตุ๊กแก túkkɛɛ (the high tone on the first syllable is atypical)
Lao ກັບແກ້ kápkɛ̑ɛ ~ ກັບໂກ້ kápkȏ


Burmese တောက်တဲ့ <toktaiˀ>

Kokborok toʔ-ke (in South Asia!)

Phuza tʌ⁵⁵ kɛ³¹


Indonesian tokék

Tagalog tuko

Ilocano tekka

The word has even entered English as tokay. Do Hmong-Mien languages have similar words?

I thought that maybe the word had spread from one language to the others, but I couldn't figure out what the source was. It turns out that there was no single source; these words are all renditions of the gecko's mating call. (Although I've lived with geckos all my life in Hawaii, I've never heard that call.) They are no more related than Japanese karasu 'crow' and English crow.

Vietnamese t- is usually from earlier *s-, and earlier *t- should have become đ- [ɗ]. Do the Vietnamese t-words for 'gecko' postdate the shift of *s- to t-, or are they onomatopoetic archaisms that were immune to the shift like the noises sheep make in Greek*?
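The two initial shifts mentioned above form a chain (*s- > t-, *t- > đ-), so the order of application matters. The following is a minimal sketch, assuming both changes apply in a single pass; the function name and the input forms are hypothetical placeholders, not real etyma.

```python
# Chain shift of initials as described in the text: *s- > t-, *t- > đ-.
# Hypothetical helper; the example forms are made up for illustration.
INITIAL_SHIFT = {"s": "t", "t": "đ"}

def shift_initial(form: str) -> str:
    """Shift the initial once, so new t- (from *s-) is not shifted again to đ-."""
    if not form:
        return form
    return INITIAL_SHIFT.get(form[0], form[0]) + form[1:]

for earlier in ["sa", "ta", "ka"]:
    print(earlier, ">", shift_initial(earlier))
```

Under this sketch, a t-word for 'gecko' could only be a regular outcome if it once began with *s-; a word that kept t- through the shift would have to be an exception of the onomatopoetic kind.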

*Harry Foundalis:

[...] direct evidence for [an earlier stop pronunciation of] beta comes from a fragment of Attic comedy where it is said that the voice of the sheep is BH-BH [i.e., βῆ βῆ [bɛː bɛː]]. In Modern Greek this would read as "vi-vi", rather un-sheepish-like; while in the reconstructed way it would be "beeh-beeh" [i.e., μπεε μπεε [bɛː bɛː]], exactly the sound that we, contemporary Greeks, attribute to the animal.

MOUNTAIN ... CAVE

What does the name of the world's largest cave mean? sondoongcave.org glossed hang Sơn Đoòng as 'mountain river cave'. Although hang is the native Vietnamese word for 'cave' and sơn is Vietnamese for 'mountain' (from Chinese 山), đoòng does not resemble any Vietnamese or Lao word for 'river'. (The cave is near the border with Laos.) Is it a Vietnamized minority language word for 'river'? The Katuic language Bru is spoken in Quảng Bình Province (where the cave is) and đoòng [ɗɔŋ͡m] resembles Sidwell's (2005) Proto-Katuic *rɔɔŋ 'river'. [ɗ] might be the closest possible substitute for foreign [r] in central Vietnamese, which lacks [d] and [r].

*WRDH 'GROW' COME FROM?

The first of Merriam-Webster's "Big Words on Campus" is culture - Thai วัฒนธรรม watthanatham < Pali vaḍḍhana- + Sanskrit dharma-. Pali vaḍḍha- is from Sanskrit vṛddha- whose root is √vṛdh. Its Greek cognate is (ϝ)ορθός. So I'd expect the Proto-Indo-European (PIE) root to be √*wrdh. However, Cheung (2007: 208) reconstructed Proto-Iranian *(H)u̯ard-, implying a PIE √*Hwrdh. Odder still, Watkins (2000: 24) reconstructed PIE *ʔ(e)rʕʷdh- (in my preferred notation) without any *w at all. How would Watkins have accounted for the Indo-Iranian and Greek forms?

9.13.0:43: Does any attested language have a cluster like rʕʷdh? Could the laryngeal have switched places to break up the cluster?

PIE *ʔerʕʷdh- > *ʕʷerdh-

But PIE *ʕʷ- did not become Sanskrit v- or Greek ϝ-.

Looking further down Cheung's entry for Proto-Iranian *(H)u̯ard-, I see another metathesis-based explanation:

This IIr. root is according to Schindler apud Krisch: 24 f. from (metathesized) IE *H1l(e)udh- (> Ir. *Hraud) that has given rise to a new ablaut series: > IIr. *Hurdh/Hu̯ardh. No explanation has been provided for the assumed metathesis of *ru > *u̯r. Perhaps, this root has been contaminated with semantically similar roots, notably *Hard1 [glossed by Cheung as 'to prosper' on p.163 and derived from PIE *ʕeldh-].

However, the *l in PIE *H1l(e)udh- (which I would write as *ʔl(e)udh-) would not have become the ρ in Greek (ϝ)ορθός.

105 RHYMES DATABASE UPDATE: HASHIMOTO (1965)

I added Hashimoto's 1965 Tangut rhyme reconstructions to my database (Excel / HTML). It is a shame that Hashimoto never published a full-scale reconstruction - or at least a list of initials.

Hashimoto's rhymes consist of a vowel (short or long) which may be flanked by glides and followed by -N:


Out of 2 x 6 x 2 x 3 x 2 = 144 possibilities (excluding combinations with medial -w-), he reconstructed only 98, leaving 46 for 'departing' or 'entering' tone syllables with rhymes not found with the 'level' and 'rising' tones.
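The arithmetic above can be checked with a short sketch. The factors are copied exactly as given; which dimension of the rhyme each factor stands for is not assumed here.

```python
from math import prod

# Factors exactly as stated in the text (2 x 6 x 2 x 3 x 2).
factors = [2, 6, 2, 3, 2]
possible = prod(factors)          # all combinations, excluding medial -w-
reconstructed = 98                # rhymes Hashimoto actually reconstructed
unused = possible - reconstructed

print(possible, reconstructed, unused)
```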

There are six possible vowels disregarding length: u, o, e, i, a, ə. (These are also the six vowels that I reconstruct for pre-Tangut.) Hashimoto wrote their long counterparts as U, ɔ, ɛ, I, A, э. They are never tense or retroflex, so all 98 rhymes have the same color in my file.

There are four grades:

vowel length \ medial    -Ø-    -j-
short                    II     III
long                     I      IV

I also reconstruct four grades, though mine are unrelated to vowel length:

Grade  Description / Base vowel        u    i    a    ə    e    o
I      lower vowels                    əu   əi   a    ə    e    o
II     even lower vowels               ʊ    ɪ    æ    ʌ    ɛ    ɔ
III    nonpalatal -ɨ- + higher vowel   ɨu   ɨi   ɨa   ɨə   ɨe   ɨo
IV     palatal -i- + high vowel        iu   i    ia        ie   io
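The grade table above can be read as a lookup from a base vowel and a grade to a reflex. The following is a minimal sketch of that lookup; the names are hypothetical, and it assumes that the one missing Grade IV cell falls under ə, since the five listed forms match u, i, a, e, and o by their vowel letters.

```python
# Base vowels in the column order of the table.
BASE_VOWELS = ["u", "i", "a", "ə", "e", "o"]

# Reflexes per grade, copied from the table.
# Assumption: the Grade IV gap belongs to the ə column (marked None).
GRADE_REFLEXES = {
    "I":   ["əu", "əi", "a", "ə", "e", "o"],
    "II":  ["ʊ", "ɪ", "æ", "ʌ", "ɛ", "ɔ"],
    "III": ["ɨu", "ɨi", "ɨa", "ɨə", "ɨe", "ɨo"],
    "IV":  ["iu", "i", "ia", None, "ie", "io"],
}

def reflex(base, grade):
    """Return the reflex of a base vowel in a grade, or None for the gap."""
    return GRADE_REFLEXES[grade][BASE_VOWELS.index(base)]

print(reflex("a", "II"), reflex("u", "III"), reflex("ə", "IV"))
```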

My a is a low central or back unrounded vowel: IPA [ä] ~ [ɑ]. Its lower counterpart æ may have been fronter - perhaps [a̞] in IPA.

105 RHYMES DATABASE UPDATE: NISHIDA (1964)

Five years ago, I uploaded a database comparing my Tangut reconstruction with those of Gong (1997) and Arakawa (1999). After a minor revision in 2009 (that I later forgot about - oops!), I finally got around to adding Nishida's 1964 reconstruction to my Excel file. (The file is also available in HTML.)

Although Nishida's reconstruction has only 102 rhymes instead of the 105 of most other reconstructions, it has many other characteristics that are still in my reconstruction nearly a half century later:

- a three-way distinction between plain, tense, and retroflex vowels

(Nishida also had tense retroflex vowels in his rhyme class XXII)

- a distinction between oral and nasal vowels

- thirteen vowels (disregarding the above distinctions) with five degrees of height

(twelve of the thirteen match if my æ and a are rewritten to match Nishida's a and ɑ; only Nishida's ʉ does not match my ʌ or ɤ)

Unlike the other three reconstructions, it lacks phonemic vowel length.

The most significant unique characteristic of his reconstruction is the final consonant -ɦ, which appears in nearly a third of his plain vowel rhymes. Does any Chinese-type language today have -ɦ as a coda?

NA-T EIGHT

Andrew West pointed out that the Tibetan transcription na for

1ʔiaʳ 'eight'

was actually for the similar character

2nia 'second person singular suffix'

2nia is probably an extended use of the homophonous pronoun

2nia 'thou'

which should go back to pre-Tangut *Cɯ-na-H. The presyllable is needed to account for the 'secondary yod' -i- absent from cognates like Written Burmese naŋ 'thou'. There was no *-ŋ in the pre-Tangut form for 'thou', as *-aŋ would have become Tangut -o. (That sound change was shared with the neighboring Chinese dialect.) Perhaps -2nia became a suffix after the presyllable was lost, as a phonologically complex form is less likely to be an affix. But was the Tangut person marking system really of recent origin? Moreover, why would a second person singular pronoun have a prefix?

Could the -i- of 2nia be due to analogy? There is an honorific second person pronoun


which might be a loan from Chinese 你 or be from *naH with a 'brightened' vowel. Was an earlier *2na changed to 2nia to match 2ni?

Maybe there was no -i-. If 'thou' were 2nia with -i-, why was it transcribed in Tibetan as na rather than nya? My -i- - and Gong's -i- and -j- - often correspond to zero in the Tibetan transcriptions. It is possible that the transcriptions reflect a dialect with simplified diphthongs and medial glide loss, but it is also possible that the reconstruction of -i- and -j- may simply be wrong. In many reconstructions, such medials distinguish otherwise identical rhymes: e.g.,

Rhyme number Nishida 1964 Hashimoto 1965 Sofronov 1968 Huang 1983 Li 1986 Gong 1997 Arakawa 1999 Sofronov 2012 This site Tibetan transcriptions
17 -ɑɦ -ÄwN -a -ɑ, -uɑ, -iɑ -a -a -a -a -a(H) (but -ar x 1)
18 -a -äwN -ɑ̃ -ia -ia -ya -ɑ̂ none
19 -ǐa -jäwN -i̭a -iɑ̃, -iã a -ja -a: -jɑ -ɨa -a(H)
20 -aɦ -ÄN -aC -æ̃ -a -a -ia -a (but -aH x 1)

All four of these rhymes are rendered simply as -a without any medial in my agnostic lay transcription.

2nia had rhyme 20, and no rhyme 20 word was ever transcribed with -y- in Tibetan. Most Chinese transcriptions of rhyme 20 also lack *-j-: 麻達辢怛[口+捺] (but 截宣[衤+旋] did have *-j-). If 2nia was just 2na, how did its vowel differ from that of rhyme 17 which was almost certainly *-a? Could Tangut preserve vocalic distinctions (e.g., two kinds of a) lost in the rest of its language family? Did Proto-Sino-Tibetan have different vowels in the ancestors of the Tangut pronouns

1ŋa 'I' and 2nia 'thou'

cognate to Old Chinese 吾 *ŋa and 汝 *Cɯ-naʔ?

'A:R-AKAWA'S EIGHT HORSES

In my previous entry, I proposed that Tangut

1ʔiaʳ 'eight'

retained a 'primary yod' (-i- in my notation) that was lost in Chinese. However, Arakawa (2006: 123) reconstructed 'eight' as 1'a:r (IPA [ʔaːʳ]?) without a palatal segment of any kind. This is surprising because in the Timely Pearl, 'eight' was transcribed in Chinese with *j-characters: 耶 (9.3, 12.1) and 盈 (36.5). Could Arakawa's long a: be a monophthongization of an earlier *ia with compensatory lengthening?

Tai (2006: 323) found three Tibetan transcriptions for 'eight': rye, na (sic!), and ?e.

Did rye reflect [rje] in a dialect without 'rj-eduction', or was r- an attempt to record vowel retroflexion rather than an initial [r]?

I have no explanation for na which is not even graphically similar to ya.

I bet the consonant below e was y.

I reconstruct both

1rieʳ and 2riaʳ 'horse' (this collocation is in Homophones 53B27)

with a 'secondary yod' -i-, but according to Kotaka's dictionary of Arakawa's readings, Arakawa reconstructed the first word as 1ryeq'2 (what is the -2?) and a homophone of the second word as 2ra:r (so I presume he reconstructs 'horse' as 2ra:r). Arakawa's reconstruction has a yod in 1ryeq'2 but not in 2ra:r.

My secondary yod was conditioned by high vowels in lost presyllables. What is the source of Arakawa's -y-? It happens to correspond with my -i- in the first word for 'horse' but it usually corresponds to zero in my reconstruction: e.g.,

Arakawa 2phyo : my 2pʰɔ < *K-Pro-H 'snake' (cf. Written Tibetan sbrul and Caodeng rGyalrong qa-preʔ)

Perhaps his secondary yod is from medial *-r-, just as Burmese *-j- is from an even earlier *-r-. That derivation would account for -y- in 2phyo 'snake', but not for -y- in 1ryeq'2 'horse' since I doubt ry- could come from *rr-.

9.9.10:09: Gong's -i- generally corresponds to Arakawa's -y-; that -i- would also come from *-r-. However, Gong did not reconstruct medial -i- in

1rjijr = Arakawa's 1ryeq'2 and my 1rieʳ

or in any other r-syllable. This could imply that his *-i- is always from *-r-. *ri- does not exist in his reconstruction because it would be from an improbable *rr-.

