The Thai title for The Karate Kid is

คิด คิด ต้องสู้

khit khit tɔɔŋsuu

A machine translation program might turn that into

'think think Karen'

since khit means 'think' and tɔɔŋsuu is a Thai word for the Karen people.

What does the Thai title really mean?

khit looks like a Thai version of English kid:

English k [kh] is like Thai aspirated kh (whereas Thai unaspirated k sounds like English g)

English and Thai i more or less match

English allows -d at the ends of words but Thai doesn't, so Thai -t is the best possible substitute; this -t is spelled ด <d> to indicate that it corresponds to a foreign -d.
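The three correspondences above can be written out as a toy substitution table. This is only a sketch: the table and the function name `adapt_to_thai` are illustrative assumptions, and it handles nothing beyond a simple consonant-vowel-consonant word like kid.

```python
# Toy loan-adaptation rules taken from the three points above.
# ONSET_MAP, FINAL_MAP and adapt_to_thai are hypothetical names,
# not a full model of Thai loanword phonology.
ONSET_MAP = {"k": "kh"}  # English k [kh] -> Thai aspirated kh
FINAL_MAP = {"d": "t"}   # Thai has no final -d; -t is the closest substitute


def adapt_to_thai(word: str) -> str:
    """Adapt a simple English CVC word using the toy rules above."""
    onset, vowel, final = word[0], word[1:-1], word[-1]
    return ONSET_MAP.get(onset, onset) + vowel + FINAL_MAP.get(final, final)


print(adapt_to_thai("kid"))  # khit
```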

The Thai titles of The Karate Kid, Part II, The Karate Kid, Part III, and The Next Karate Kid confirm my khit = kid equation:

คาราเต้ คิด 2

khaaraatee khit 2 (rather than khit khit tɔɔŋsuu 2!)

คาราเต้ คิด 3 เค้นเลือดสู้

khaaraatee khit 3 khen lɯat suu (lit. 'squeeze blood fight')

คาราเต้ คิด 4

khaaraatee khit 4

They also contain the Thai word for 'karate'.

Could tɔŋsuu be a Thai word for 'karate' as well as a term for the Karen? tɔŋsuu sounds like some hypothetical non-Mandarin Chinese reading of 唐手 'karate' (lit. 'Chinese (!) hand' even though karate is from Okinawa; now normally written 空手 'empty hand'):

Hakka thɔŋsu

Cantonese thɔŋsau

Amoy thɔŋsiu (lit.) or thŋtshiu (colloq.; first syllable has a syllabic nasal)

Chaozhou tɯŋtshiu

Fuzhou touŋtshieu

However, none of the above are a perfect match for tɔŋsuu.

The actual solution is much simpler: tɔɔŋsuu is two native Thai words: tɔŋ 'must' (with a short vowel) and suu 'fight'. -ɔɔŋ and -ɔŋ with a falling tone are written identically and spaces do not normally separate words in written Thai, so


tɔɔŋsuu 'Karen people' (both syllables with falling tones)


tɔŋ suu 'must fight' (both syllables with falling tones)

can only be distinguished through context.
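The ambiguity can be made concrete with a tiny dictionary-based segmenter. This is a minimal sketch, assuming a lexicon that contains only the three items discussed above; the names `LEXICON` and `segmentations` are hypothetical, and a real Thai segmenter would need a full dictionary plus context.

```python
# Toy lexicon: only the three entries discussed above (an assumption
# for illustration, not a real Thai dictionary).
LEXICON = {
    "ต้องสู้": "Karen people",
    "ต้อง": "must",
    "สู้": "fight",
}


def segmentations(text: str) -> list[list[str]]:
    """Return every way to split text into words from LEXICON."""
    if not text:
        return [[]]
    results = []
    for word in LEXICON:
        if text.startswith(word):
            # Segment the remainder recursively and prepend this word.
            for rest in segmentations(text[len(word):]):
                results.append([word] + rest)
    return results


for parse in segmentations("ต้องสู้"):
    print(" + ".join(f"{w} '{LEXICON[w]}'" for w in parse))
```

With this lexicon the unspaced string has two parses, one as a single word and one as 'must' + 'fight'. Nothing in the spelling chooses between them.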

I am still puzzled as to why khit appears twice in khit khit tɔŋ suu. Does the title literally mean 'kid kid must fight'? Does reduplication indicate plurality: 'kids must fight'? Cf. เด็ก ๆ dek dek 'children' < เด็ก dek 'child'. But note the presence of the reduplication sign ๆ in เด็ก ๆ dek dek and its absence in คิด คิด khit khit. Is the reduplication sign not used with the foreign word khit?

To confuse us all further, the Thai title

เดอะ คาราเต้ คิด

dəə khaaraatee khit

'The Karate Kid'

refers to the 2010 movie which involves kung fu in China!

CONVER-Z-ENCE AT 的 DE 底 BOTTOM?

In Ukrainian, two prepositions, съ 'with'* and из(ъ) 'from'**, have merged into one form (з) whose function is determined by the case of the following noun phrase: e.g.,

з Ольгою 'with Ol'ha' (instrumental case; cf. Russian с Ольгой 'with Ol'ga')

з Ольги 'from Ol'ha' (genitive case; cf. Russian из Ольги 'from Ol'ga')

The Ukrainian preposition з "is prone to devoice before voiceless consonants" (Shevelov 1993: 951). This may be the key to the fusion of the two forms:

- earlier из(ъ) 'from' could have become [is] before voiceless consonants

- [is]/[iz] 'from' is very close to [s]/[z] 'with'

- [i]-loss made the two identical in form (though the cases of following noun phrases were unaffected)
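The three steps can be checked mechanically. The following is a toy sketch under stated assumptions: forms are rough phonetic strings, both prepositions are assumed to agree in voicing with the following consonant, and the function names `assimilate` and `drop_i` are hypothetical.

```python
# Toy model of the merger proposed above; not a claim about actual
# Ukrainian historical phonology beyond the three steps listed.

def assimilate(prep: str, next_is_voiceless: bool) -> str:
    """Make the final s/z of the preposition agree in voicing with what follows."""
    return prep[:-1] + ("s" if next_is_voiceless else "z")


def drop_i(prep: str) -> str:
    """The [i]-loss step: [is]/[iz] -> [s]/[z]."""
    return prep.removeprefix("i")


for voiceless in (True, False):
    from_form = drop_i(assimilate("iz", voiceless))  # earlier 'from'
    with_form = assimilate("s", voiceless)           # earlier 'with'
    print(voiceless, from_form, with_form, from_form == with_form)
```

In both environments the two prepositions come out identical, which leaves the case of the following noun phrase as the only cue to the meaning.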

I presume voicing assimilation is also the reason for z-forms for 'from' in different Slavic languages: e.g.,

Polish z(e) (with nonpalatal consonant + e < *ъ)

Cassubian s ~ z (former in Vasmer; latter in Stone 1993: 784)

Polabian ~ (former in Polański 1993: 798, 800; latter in Vasmer)

cf. Slovak s otcom [zotsom] 'with father', so sestrou [zosestrou] 'with sister' in which s assimilates to a following vowel in pronunciation but not spelling (Short 1993: 536)

In my previous entry, I defended the view that the Mandarin adnominal/nominalizer 的 de is simply a respelling of Old Chinese 之 *tə. However, could 的 be a fusion of a colloquial *tə (< 之) and the source that Yap, Choi, and Cheung (YCC) proposed (Middle Chinese 底 *tejʔ 'bottom')? A grammaticalized *tejʔ could be pronounced as unstressed *təʔ or even just *tə, merging with existing *tə (< 之).

According to YCC (2007: 27):

[...] previous literature (e.g. Jiang 1999) has provided ample evidence to show that 底 di had clearly evolved locative and pronominal functions by the Tang and Song period, then further into adnominal and nominalizer functions as well

Perhaps this is what happened:

1. *tə maintained its adnominal and nominalizer functions in colloquial (unwritten) Middle Chinese

2. 底 *tejʔ 'bottom' developed locative and pronominal functions:

'bottom' > 'be at the bottom' > 'at the end' > 'at' (> relative clause marker?)

'bottom' > 'the below' > 'this'

(Are there other languages in which 'top', 'bottom', etc. became pronouns? Is Cantonese 邊個 bingo 'who' really from 'side' + 'item', as its spelling indicates? Cf. the use of 底 for 'what'.)

3. 底 became unstressed *tə(ʔ) in colloquial Middle Chinese, merging its functions with those of the spoken adnominal/nominalizer *tə.

4. At this point, *tə emerges as 底 and 的 in written texts with adnominal and nominalizer functions.

5. The spelling 底 becomes largely obsolete, though Lin Yutang (1972) notes that it is

Used in place of 的 as particle sign of adj., affected by some modn. writers to imitate European languages: 金屬底 metallic, 亞細亞底 Asiatic; dist. from possessive part. 的 (我的,你的) and from 地 as adv. part. corresponding to Eng. "-ly" (快快地 quickly)--usage among these imitators not established and not uniform, the three 的,底,地 being pr. alike as unaccented '[de1] (sic; I would write [de0]).

My assumption is that written texts lag behind speech.

*From Proto-Slavic *sъ(n), cognate to Greek homo 'same' and the San- 'with' of Sanskrit.

**From Proto-Slavic *jьz-, cognate to Latin ex 'from'. More cognates here.

GETTING TO 的 DE 底 BOTTOM OF CHINESE NOMINALIZATION

This 2007 paper by Yap, Choi, and Cheung (YCC) indirectly debunks two common myths about Chinese:

The Pictographic Myth: Chinese characters are WYSIWYG (what you see is what you get) symbols for ideas, not words. Pictures that directly convey meaning without (ew, gross) sound.

The Eternal Myth: An extension of the pictographic myth. Since Chinese characters are (supposedly) transparent semantic symbols, their meanings are eternal, and anyone literate in Chinese characters can read any text from any period.

YCC list ranges of meanings for 底 'bottom' and its near-homophones and potential cognates such as 砥 'grinding stone' (< something for flattening = making floor-like) over time (Tables 1-6). The graph 底 has 54 different meanings over roughly two millennia (Table 6). The two myths above do not predict such diversity. Moreover, during the following period (the Tang and Song Dynasties),

[...] the character de (的), which is derived from a lexical noun meaning ‘target’, came to be used interchangeably with di (底), and both gradually came to replace most of the adnominal and nominalizer functions of suo, xu, zhe and zhi. In time, de (的) came to completely replace di (底), and is now the ubiquitous adnominal as well as one of the most versatile nominalizers in Modern [Standard] Chinese [= Mandarin, not modern Chinese languages in general].

Notice the wording (emphasis mine): "the character [...] derived from a lexical noun". YCC confuse characters with words. The two categories overlap to a considerable extent in Chinese, but the overlap has never been absolute, as there have always been multicharacter, polysyllabic words in Chinese. Skimming YCC's paper, I don't see an attempt to deal with the issue of disentangling characters from words. One character can serve as a phonetic symbol representing a range of (nearly) homophonous words which may or may not be related.

Equating character with word can result in dubious etymologies: e.g., 地 Md di 'earth' is used to write the Mandarin adverbial suffix -de. That usage does not necessarily mean that the adverbial suffix -de is a grammaticalization (grammatical derivative) of di 'earth', in spite of one spelling for both items.

If one wrote English in a Chinese-like script and used the same character 目 (a drawing of an eye) for both eye and I, should one conclude that I is a grammaticalization of eye, or should one simply note that one character stands for two unrelated homophonous words?

I suspect that the one character 底 stood for members of two or more unrelated homophonous word families. Someone proposed that the modern Mandarin adnominal/nominalizer suffix 的 de is really the same word as Old Chinese 之 (read as zhi in Mandarin) in spite of the different spellings and modern pronunciations, and I agree. Here's what I think happened:

1. In Old Chinese, there was a pronominal root *t- with derivatives 之 *tə 'this' and 者 *tjaʔ (< *tə-aʔ?) 'person'.

2. 之 *tə was grammaticalized into an adnominal and a nominalizer.

3. *tə regularly developed into *tɕɨ in formal Middle Chinese reading pronunciation (in which every syllable would have been stressed), whereas the unstressed pronunciation *tə was preserved in colloquial Middle Chinese.

4. Middle Chinese speakers no longer perceived an etymological connection between stressed, formal *tɕɨ and unstressed, informal *tə.

5. There is a strong tendency to write etymologically related, nearly homophonous words with the same character (or at least with the same phonetic). However, *tɕɨ and *tə were no longer homophonous. When informal MC *tə crept into written formal Middle Chinese, it was written with the graphs for 底 *tejʔ 'bottom' and 的 *tek 'target' (implying a colloquial pronunciation like *təʔ?) since there were no graphs pronounced *tə in Middle Chinese.

6. The 'bottom' spelling fell out of favor, but the 'target' spelling persists to the present.

In this scenario, the root *t- has remained in use for over three millennia, though this fact is disguised by different spellings (之底的). If this sounds improbable, consider how some grammatical morphemes in Indo-European have remained stable over an even longer period of time.

In YCC's scenario, 之 *tə 'this' is replaced by a grammaticalization of 底 *tejʔ 'bottom' which happens to evolve into a homophone of the lost *tə 'this'. I would rather not reconstruct such a coincidence. I don't think *tə was lost at all, even though it lost its association with its original spelling 之.

Why such different conclusions? YCC are not historical phonologists. They examine texts through the anachronous prism of modern standard Mandarin. Just as a connection between MC *tɕɨ and *tə may have seemed improbable to a Middle Chinese speaker, a connection between Md 之 zhi and 底 di / 的 de may seem improbable to a modern standard Mandarin speaker. But a knowledge of sound change opens up the possibility that di/de may be archaisms reflecting an earlier pronunciation of zhi.

A known case of such a grammatical archaism is the Mandarin third person pronoun 他 ta [tha] which has survived more or less unchanged from Middle Chinese *tha 'other', bypassing the sound change

MC *-a > Md -uo after dentals: e.g.,

多 MC *ta > Md duo [two]

陀 MC *da > Md tuo [thwo]

儺 MC *na > Md nuo [nwo]

羅 MC *la > Md luo [lwo]

(The tones and vowels of MC *tha and Md ta may differ to some unknown degree.) The regular reading tuo for 他 is rarely used today.
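The regular change and the exceptional status of 他 ta can be laid out as a small check over the examples above. This is a sketch only: the data are copied from the list above, tones are ignored, "th" stands for an aspirated stop, and the helper names are hypothetical.

```python
# Dental onsets found in the examples above ("th" = aspirated t).
DENTALS = ("th", "t", "d", "n", "l")


def expects_uo(mc: str) -> bool:
    """True if the MC syllable is a dental onset + -a, so Md -uo is expected."""
    return any(mc == onset + "a" for onset in DENTALS)


def is_regular(mc: str, md: str) -> bool:
    """True if the Md reading shows the regular -uo outcome."""
    return expects_uo(mc) and md.endswith("uo")


# (character, MC form without the asterisk, Md pinyin)
EXAMPLES = [
    ("多", "ta", "duo"),
    ("陀", "da", "tuo"),
    ("儺", "na", "nuo"),
    ("羅", "la", "luo"),
    ("他", "tha", "ta"),   # everyday reading
    ("他", "tha", "tuo"),  # regular reading, rarely used today
]

for char, mc, md in EXAMPLES:
    label = "regular" if is_regular(mc, md) else "unshifted"
    print(f"{char} MC *{mc} > Md {md}: {label}")
```

Only the everyday reading ta of 他 comes out as unshifted; its rarely used reading tuo patterns with the other four characters.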

