Last night, I ate at a classy Korean restaurant in NYC and was surprised to see the common dish

육개장 (肉개醬)


lit. 'meat dog sauce*'

(see here; kaejang is short for 개장국 kaejangguk 'dog sauce soup', a.k.a. 보신탕 (補身湯) poshinthang 'repair body soup'; more here)

misspelled as

육계장 yukkyejang

on the menu. (Oddly, it was romanized on the menu as "Yookgae Jang" with "ae".)

This error reflects the homophony of -ae, -e, and -ye after consonants in modern Korean. In theory -ae is [ɛ] and -e is [e], but in practice, the two are often homophonous. And -ye is always homophonous with -e unless no consonant precedes it:

ye [je] (not [e])

Cye [Ce] (not [Cje])
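The pattern above can be sketched in code. This is only an illustration of the spelling-to-sound rule just described, not a full pronunciation model: `homophone_key` and its `merge_ae` flag are hypothetical names of mine, and the arithmetic assumes the standard Unicode layout of precomposed Hangul syllables.

```python
HANGUL_BASE = 0xAC00          # first precomposed syllable, 가
N_MEDIALS, N_FINALS = 21, 28  # counts used in the Unicode composition formula
N_SYLLABLES = 19 * N_MEDIALS * N_FINALS

INITIAL_NULL = 11             # ㅇ: placeholder written when no consonant precedes
AE, E, YE = 1, 5, 7           # medial indices of ㅐ, ㅔ, ㅖ


def homophone_key(text, merge_ae=True):
    """Respell text so that syllables described as homophonous compare equal."""
    out = []
    for ch in text:
        offset = ord(ch) - HANGUL_BASE
        if not 0 <= offset < N_SYLLABLES:
            out.append(ch)  # not a precomposed Hangul syllable; leave as-is
            continue
        initial, rest = divmod(offset, N_MEDIALS * N_FINALS)
        medial, final = divmod(rest, N_FINALS)
        if medial == YE and initial != INITIAL_NULL:
            medial = E      # Cye is pronounced [Ce]
        if merge_ae and medial == AE:
            medial = E      # optional: treat ae and e as merged
        code = HANGUL_BASE + (initial * N_MEDIALS + medial) * N_FINALS + final
        out.append(chr(code))
    return "".join(out)


print(homophone_key("예"))                      # 예 (ye keeps its y)
print(homophone_key("세례", merge_ae=False))    # 세레
print(homophone_key("육개장") == homophone_key("육계장"))  # True
```

With `merge_ae=False` only the silent y is removed, which models speakers who still distinguish ae from e.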

In premodern Korean orthography, -ye could appear after all consonants but in modern orthography, it appears only after k- and r-. Its appearance after r is justifiable on synchronic morphological grounds as well as historical phonology since rye-morphemes have ye-variants in initial position:

례 rye ~ 예 ye (禮) 'ritual'

세례 serye [sere] (洗禮) 'baptism' (< 'wash-ritual')

예절 yejŏl [jedʑɔl] (禮節) 'manners' (< 'ritual-integrity')

But why retain ye after k if kye-morphemes are never currently pronounced with y? Was ky- simplifying to k before e during or shortly after the orthographic reform of 1933 that is the basis (with subsequent modifications) for modern South and North Korean spelling? Similar questions could be asked about the silent y in the syllables

몌 mye [me], homophonous with 메 me [me]

폐 phye [phe], homophonous with 페 phe [phe]

혜 hye [he], homophonous with 헤 he [he]

The misspelling yukkyejang may also reflect the fact that yukkaejang (the dish, not the word) doesn't actually contain dog (kae).

kye is not a simple typo because on a standard tupulshik keyboard, ㅖ ye requires pressing a shift key (<shift> + <p>), whereas ae and e are on adjacent keys ㅐ <o> and ㅔ <p> not requiring a shift key. Hence I doubt the typist would misspell 개 kae 'dog' as 계 kye by accident.

Yukkyejang sounds like it could be wholly Sino-Korean. After all, two-thirds of it consists of Chinese loans: 肉 yuk 'meat' and 醬 jang 'sauce'. So if asked about the etymology of yukkaejang (not that this is ever likely to happen), the typist might assume [ke] was from some Sino-Korean morpheme kye: e.g., 界 計 系 係 械 溪 繼 契 階, etc. The only kye in this list (the first nine provided by Vista's input method) that I've seen in English is 契 kye 'mutual aid society'.
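The keyboard argument above can be spelled out as a small sketch. It records only the three vowel keys mentioned in the text; `KEYS` and `describe_slip` are hypothetical names, and the two flags are a rough way of counting what would have to go wrong, not a real model of typing errors.

```python
# Vowel keys on the standard two-set layout, as described in the text:
# jamo -> (key, needs_shift)
KEYS = {
    "ㅐ": ("o", False),  # ae
    "ㅔ": ("p", False),  # e
    "ㅖ": ("p", True),   # ye
}


def describe_slip(intended, typed):
    """Report which things must go wrong to type `typed` instead of `intended`."""
    key_i, shift_i = KEYS[intended]
    key_t, shift_t = KEYS[typed]
    return {"wrong_key": key_i != key_t, "shift_differs": shift_i != shift_t}


print(describe_slip("ㅐ", "ㅔ"))  # {'wrong_key': True, 'shift_differs': False}
print(describe_slip("ㅐ", "ㅖ"))  # {'wrong_key': True, 'shift_differs': True}
```

Typing ㅔ for ㅐ needs one slip to the neighboring key, whereas typing ㅖ for ㅐ needs both the wrong key and an added shift, which is why an accidental 계 for 개 seems unlikely.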

At least the typist got 게 ke 'crab' right elsewhere on the menu!

*11.29.3:19: 醬 'sauce' is the sho of 醬油 shoyu, Japanese for 'soy sauce'. 油 yu is 'oil'.

HOW DID HINDU END UP AS HODU?

In my last post, I mentioned the literal translation of the Hebrew term for 'turkey'. Here's the term itself:

תרנגול הודו

w-d-w-h l-w-g-n-r-t (right-to-left transliteration)

tarnegol hodu

'rooster India' = 'Indian rooster' = 'turkey'

And Balashon has an entire post about the term. (One correction: The Sanskrit term for the Indus river is Sindhu, not Sind.) Unfortunately it doesn't explain why hodu has -o- instead of -i-, and I'd like to know whether -n-loss really happens "often". I know of no variant of Indus with -o-.

Since I can't get online with my laptop, I am looking at the Wikipedia article on the Indus River on my iPhone and see two errors:

Sindhu is misspelled in Devanagari as



Now that my laptop's Internet connection is working again, I guess my iPhone may have trouble with Devanagari (a big ouch for Hindi-literate users!). The ि i should connect with स sa, cancelling out the a vowel, and न् n should form a ligature with धु dhu:



Even more bizarre is the 'Avestan', given in Arabic (!) script as


Hendu (modern Persian for 'Hindu')

but in romanization as Harahauvati (dubious; looks like a distorted cognate of Sarasvati).

The third Chinese name (印度河 Md Yindu He) is the normal name - the others (森格藏布 Md Senge Zangbu and 狮泉河 Md Shiquan He) are Mandarinized Tibetan and should not be listed first and second.

I don't feel like researching the other names.

I noticed that there's a Fiji Hindi page on the Indus River - in romanization!

What is Fiji Hindi?

The relation between Fiji Hindi and Hindi is similar to the relation between Afrikaans and Dutch.

I'm wary of that claim or anything else at Wikipedia at the moment. This shall pass ...

11.28.0:07: Balashon also has a post about the etymology of tarnegol 'rooster'.

DID YOU EAT A SEVEN-FACED BIRD?

In Japanese and Korean, turkeys are called 'seven-faced birds':

Jpn シチメンチョウ (七面鳥) shichimenchou

Kor 칠면조 (七面鳥) chilmyŏnjo

According to Wikipedia, the name refers to how the skin on an excited turkey's neck changes color (seven colors > seven faces).

I presume the name was invented in Japan and spread to Korea during the colonial period because the Chinese and Vietnamese use different terms:

Mandarin 火雞 huoji 'fire-chicken'

Vietnamese gà tây 'Western chicken' (tây < Chn 西 'west')

In Thai, a turkey is a ไก่งวง kai nguang 'proboscis chicken'. nguang is native, but kai is borrowed from southern Chinese and is hence cognate to Mandarin 雞 ji < *ki < *ke. *e broke to *ai in early southern Chinese (e.g., Cantonese kai 'chicken') but not in early northern Chinese (i.e., the ancestor of Mandarin).

23:43: Wikipedia has an article devoted to words for 'turkey':

Arabic: 'Roman rooster', 'Ethiopian bird'

Blackfoot: 'big bird'

Bulgarian: derived from 'Egypt'

Catalan: 'Indian chicken'

Egyptian Arabic: 'Greek bird'

French: 'from India'

Greek: 'French chicken'

Hebrew: 'Indian rooster'

Malay: 'Dutch chicken'

Portuguese: 'Peru'

Urdu: 'elephant chicken'

and in Turkish, hindi 'Indian'!

LITTLE COLUMNS ARE A BIG HELP

When I lived in Berkeley, I used to eat Kentucky Fried Chicken for Thanksgiving. Not a turkey, but at least it was a bird. So it's fitting that I post about the word colonel. Why is it pronounced like kernel, even though it doesn't contain an -r-?

According to DW Cummings' American English Spelling (p. 449):

"Etymologists do not agree completely on colonel, but whatever the historical dynamics of the word, it is a clear case of mixed convergence, the pronunciation of one, apparently earlier form, coronel, having become attached to the spelling of another."

According to the Oxford English Dictionary (2nd ed., vol. 3, p. 494), the word began as a diminutive of 'column': 'little column' > 'one who leads a little column'. French colonne 'column' has no r, so its derivative colonel didn't have one either. But there was also an earlier French variant coronel with an -r- to avoid having a sequence of two l-s in a row (dissimilation). Both colonel and coronel coexisted in English until 1650, when coronel dropped out of use in *writing*. But the [r] of coronel persisted in pronunciation even after the second vowel was dropped starting in the mid-17th century. A 1701 guide to pronunciation says that colonel is [kʌlnəl] - like 'cull-null'. That pronunciation is long gone, but the double-l spelling still lives.

If 'colonel' were literally translated into Chinese, it might be something like 柱子 (柱 = pillar; -子 = diminutive suffix). But none of the Sinospheric terms for 'colonel' have anything to do with pillars:

| Language | Orthography | Pronunciation | Etymology |
|---|---|---|---|
| Japanese | 大佐 | taisa | big assistant |
| Vietnamese | đại tá (大佐) | [ɗaaj taa] | big assistant |
| North Korean | 상좌 (上佐) | sangjwa | top assistant |
| South Korean | 대령 (大領) | taeryŏng | big leader |
| Mandarin | 上校 | shangxiao | top senior officer |

The North Korean term seems like a compromise between the Japanese and Chinese terms. (North Korea also has a rank 대좌 (大佐) taejwa 'big assistant', but this refers to a 'senior colonel', a rank without any Japanese equivalent. Its Chinese equivalent is 大校 daxiao 'big senior officer'.)

In East Asian languages, officers are divided into three groups generally regardless of branch of service. Each group has its own suffix preceded by 少 'few', 中 'medium', 上 'top', and/or 大 'big'. I find this prefix-suffix system easier to learn than its English equivalent. For brevity I have only listed US Army ranks in the top row. A naval lieutenant would be in group B, an admiral would be in group C, etc.

| Language | Group A: lieutenants and captains | Group B: majors and colonels | Group C: generals |
|---|---|---|---|
| Japanese | -尉 -i < wi | -佐 -sa | -将/將 -shou |
| Vietnamese | -uý (尉) | -tá (佐) | -tướng (將) |
| North Korean | -위 (尉) -wi | -좌 (佐) -jwa | -장 (將) -jang |
| South Korean | -위 (尉) -wi | -령 (領) -ryŏng | -장 (將) -jang |
| Mandarin | -尉 -wei | -校 -xiao | -将/將 -jiang |

-尉 means 'junior officer' and -将/將 means 'general'.
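The prefix-suffix system lends itself to a tiny lookup. The following is a minimal sketch using only the characters from the table above, on the assumption that South Korea shares the group A and C suffixes with North Korea; `PREFIXES`, `SUFFIXES`, and `rank` are hypothetical names, and the function does not check whether a given prefix is actually used in a given country's rank system.

```python
# Size prefixes and their literal glosses.
PREFIXES = {"少": "few", "中": "medium", "上": "top", "大": "big"}

# Group suffixes by language (characters only), following the table.
SUFFIXES = {
    "Japanese":     {"A": "尉", "B": "佐", "C": "将"},
    "North Korean": {"A": "尉", "B": "佐", "C": "將"},
    "South Korean": {"A": "尉", "B": "領", "C": "將"},  # A and C assumed shared
    "Mandarin":     {"A": "尉", "B": "校", "C": "将"},
}


def rank(language, prefix, group):
    """Compose a rank name from a size prefix and a group suffix."""
    if prefix not in PREFIXES:
        raise ValueError(f"unknown prefix: {prefix}")
    return prefix + SUFFIXES[language][group]


print(rank("Japanese", "大", "B"))      # 大佐
print(rank("North Korean", "上", "B"))  # 上佐
print(rank("South Korean", "大", "B"))  # 大領
print(rank("Mandarin", "上", "B"))      # 上校
```

The four calls reproduce the 'colonel' terms from the first table: the group is fixed at B and only the prefix and the group B suffix vary by country.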

North Korea carried over the group B -佐 suffix from Japanese, but South Korea has its own group B suffix -領.

IS THERE A CHINESE CHARACTER FOR 'WHITE MAN'?

(Originally written 09.10.6; revised 11.24-25.)

This Chinese character doesn't seem very PC. It looks like 白 'white' plus 男 'man'. My initial impulse is to guess that it fits the normal pattern:

semantic left-hand element + phonetic right-hand element

So I'd expect 白+男 to have something to do with whiteness like

jiao 'white; bright'

hao 'white and clean'

ai 'white and clean'

hao 'white; bright'

po 'white'

(definitions from 遠東袖珍英漢‧漢英辭典, 1985)

and sound like 男 nan 'man'. But 白+男 is read jiu. So is it a semantic compound for jiu 'white man'? No, there is no such word. 白+男 turns out to be one of a set of variants for 舅 jiu 'maternal uncle'. 男 is semantic and 白 'white' is a distortion of the phonetic 臼 jiu 'mortar'.

One of the variants of 舅 jiu 'maternal uncle' is 旧, identical to a simplification of 舊 jiu 'old' (with the same phonetic 臼 jiu 'mortar'). 旧 shows how far sinography is from 'ideography' - how does a line to the left of a 日 sun convey the idea of 'maternal uncle'?

There is another variant with 女 'woman' on the left, presumably to clarify that a 男 man who sounds like 旧 jiu 'old' is on the mother's side of the family.

SHOULD I LINK *XLAK AND *LAWK?

Last night, I looked at some members of the Old Chinese *xlak word family. I also linked to my old post on 的 *t-lewk 'bright' (now used to write the unrelated morphemes 'target' and 'possessive particle'), a member of the *l-wk 'bright' word family:

*r-lawk 'bright'

耀曜燿 *lawk-s 'brilliant'

*t-lawk 'to burn'

瀹爚 *lawk 'to shine'

*hlawk < ?*s-lawk 'to shine'

Could *xlak and *l-wk ultimately be derived from a single root *l-k 'bright'?

*xlak < ?*xV-lak

*l-wk < ?*l-k-pV or ?*l-k-u

And could the root *laŋ 'bright' in

*laŋ 'sunshine' > 'yang; male principle; opposite of yin'

*laŋ 'shining'

*r-ʔ-laŋ 'brilliant' > 'flower'

phonetic is 央 *ʔlaŋ 'center'

and perhaps

*t-hlaŋ or *k-hlaŋ 'splendid'

*s-t-laŋ 'crystal; sparkling'

also be related via a nasal suffix?

*laŋ < *lak-NV

Is it a coincidence that there is a similar-sounding root *raŋ 'light' in

*raŋ-s 'light'

*raŋ-ʔ 'bright'

*p-raŋ-ʔ 'bright'

*m-raŋ 'bright'

*s-raŋ-ʔ 'twilight'

*k-raŋ-s 'mirror'

*ʔ-raŋ-ʔ 'shadow'

Were *laŋ and *raŋ variants of a single earlier root? Just as *laŋ has a stop-final counterpart *lak, *raŋ may have a stop-final counterpart in

*rak 'to burn'

though that word could also be reconstructed as *kʌ-lak (cf. Siamese คลอก khlɔɔk < *gl- 'to burn', which may be a loanword from Chinese).

BIG TWO HUNDRED

I don't have two hundred unpublished posts, but I do have a lot, and since I am out of time, I'm going to start uploading them. Here's one I started writing on 09.10.6 and finally got around to completing tonight:

I saw the name 강대석 (姜大奭) Kang Tae-sŏk in this photo exhibition (via Brian Deutsch) and couldn't remember what the low-frequency character 奭* sŏk meant. It doesn't sound like its components

大 tae 'big'

百 paek 'hundred'

皕 pyŏk 'two hundred'**

in modern Korean pronunciation. 民衆活用玉篇 Minjung hwaryong okphyŏn (2006) glosses 奭 as

클 석 (大也) 'big sŏk' (大 'big' is obviously semantic in 奭)

성낼 석 (怒也) 'getting angry sŏk'

붉은모양 석 (赤貌) 'red appearance sŏk'

There is nothing in 奭 hinting at anger or redness. Were 'angry' and 'red' written with a graph 奭 for an unrelated homophone 'big'? See below.

According to Shuowen, 皕 pyŏk is the phonetic of 奭 sŏk. How is this possible?

In Early Old Chinese, 百 'hundred' was *prak. 皕 'two hundred' is not attested in Early Old Chinese, but was *pɨək in Late Old Chinese. I wonder if the vocalism of *pɨək is due to a lost prefix reduced from 二 *nis 'two':

*nis-prak > *ni-prak > *ni-priak > *ni-prɨək > *prɨək > *pɨək

EOC *prak, on the other hand, became LOC *pæk.

奭 might be expected to have an initial *p- in LOC because of its phonetic 皕, but its LOC reading was *ɕiak. Xu Shen confirmed the *ɕ-reading by writing that 奭 was read like the name 郝 LOC *ɕiak (with alternate readings *tɕhiak, *xɑk, *ɣɑk***). How could 奭 sound like 皕 and 郝?

I suspect that 奭 was *pɯ-hliak at the time of its creation. This is not too different from 皕 when it was *prɨək. 奭 lost its presyllable, and *hliak regularly developed into LOC *ɕiak.

Furthermore, I think 奭 *pɯ-hliak 'big; angry; red' could be cognate to

赫 LOC *xæk < EOC *r-xlak 'majestic; angry; red'

whose phonetic/cognate is

赤 LOC *tɕhiak < EOC *k-hliak < ?*ki-xlak 'red'

Maybe *pɯ-hliak goes back to *pi-xlak.

I have no idea how to explain the fanqie 㪯朱切 in Yupian indicating that 奭 was also read as Early Middle Chinese *kuo < LOC *kuo.

*奭 is less frequent in Chinese and Japanese than in Korean. Google has more hits for 奭 on .kr sites than elsewhere in East Asia:

.kr: 34,600

.cn: 30,600

.tw: 7,900

.jp: 6,630

.hk: 740

奭 is not frequent enough in Korea to be among the 1,800 Chinese characters taught in South Korean schools, but it is authorized for use in personal names in South Korea.

Chih-Hao Tsai found that 奭 appeared only 42 times in a corpus of nearly 172 million Chinese characters in Chinese Usenet postings. For comparison, the most frequent character 的 (analyzed here) appeared 6.5 million times, and 5,699 characters appeared more than 42 times.
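To put those counts on a common scale, here is a rough calculation. It uses the rounded figures quoted above ("nearly 172 million" characters, 6.5 million tokens of 的), so the results are approximations; `per_million` is a hypothetical helper name.

```python
CORPUS_SIZE = 172_000_000  # "nearly 172 million" characters (rounded)
COUNT_SEOK = 42            # occurrences of 奭
COUNT_DE = 6_500_000       # occurrences of 的 (rounded)


def per_million(count, corpus_size=CORPUS_SIZE):
    """Occurrences per million characters of running text."""
    return count / corpus_size * 1_000_000


print(round(per_million(COUNT_SEOK), 3))  # 0.244
print(round(per_million(COUNT_DE)))       # 37791
print(round(COUNT_DE / COUNT_SEOK))       # 154762
```

In other words, 奭 turns up roughly once every four million characters, while 的 accounts for nearly 4% of the corpus.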

**Some variants of 奭 have other components instead of 皕: e.g., 日 'sun' or 目 'eye' twice or 明 'bright' or even 人 'person' times four.

***The various readings of 郝 reflect different presyllables:

LOC *ɕiak < EOC *pi-xlak (no emphasis due to root harmony with nonemphatic prefix)

LOC *tɕhiak < EOC *ki-xlak (no emphasis due to root harmony with nonemphatic prefix)

LOC *xɑk < EOC *xlak (bare root with automatic emphasis due to low vowel)

LOC *ɣɑk < EOC *N-xlak (nasal prefix harmonized with emphatic root)

WHY DOESN'T MIDDLE KOREAN HAVE MORE S-CLUSTERS?

Middle Korean (MK) has many but not all of the s-clusters I would expect:

sp- st- sn- ss- sy- sk-

but not

sm- sr- s-ts- s-h-

There is only one sn-word in MK: snahʌy 'man' with later variants sʌnahʌy, sʌnahɯy, sʌnahi. There are also a number of sʌm-, sɯm-, and sʌn- words (but no sɯn-words - a chance gap?). Perhaps snahʌy is an irregular contraction of a trisyllabic word. Proto-Korean *s- + minimal vowel + nasal may have normally remained intact instead of contracting into an sN-cluster.

Proto-Korean also had an *l distinct from *r. I wonder if *sl and *sr merged:

| Stage 1 | Stage 2 | Stage 3 | Middle Korean |
|---|---|---|---|
| *sʌl-, *sɯl- | *sl- | *sr- | st- or sy-? |
| *sʌr-, *sɯr- | *sr- | *sr- | st- or sy-? |

*s-ts- may have merged with ts-, as in Lhasa Tibetan:

Classical Tibetan Lhasa Tibetan
ts- ts-

*s-h- may have simplified to s-, which is aspirated in modern pronunciation (and perhaps also in MK pronunciation?).

