On Thursday night, I wrote,

I would expect *metV to be transcribed as *met [e.g., 滅] or *me tV [e.g., 米都], not *met ʔV [i.e., 滅烏]

I should not have dismissed the third possibility. Here are three scripts allowing <C-V> junctures at morpheme boundaries before vowel-initial suffixes:

Akkadian cuneiform (Cooper 1996: 48):

<ip-ru-us-am> (instead of <ip-ru-sam>)

iprus-am ?'he came to decide'

'he decided' + ventive suffix

Khitan small script (Kane 2009: 142):

<oŋ-od-en> (instead of <o-ŋo-den>)

oŋ-od-en 'of princes'

'prince' + plural suffix + genitive suffix

Modern Korean hangul:

십이 <s-i-p Ø-i> (instead of 시비 <s-i p-i>)

shib-i 'twelve'

'ten' + 'two'
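
The hangul case can be checked mechanically. Here is a minimal Python sketch (my own illustration, not a standard tool) that splits precomposed hangul syllable blocks into initial, vowel, and final letters using the Unicode composition arithmetic. The name `decompose` is hypothetical. It shows that 십이 keeps ㅂ as the final of the first block and starts the second block with the empty initial ㅇ, whereas a purely phonetic spelling 시비 would put ㅂ at the start of the second block.

```python
# Jamo in the order used by the Unicode Hangul Syllables block (U+AC00..U+D7A3).
LEADS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"           # 19 initials
VOWELS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
TAILS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ",
         "ㄻ", "ㄼ", "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ",
         "ㅆ", "ㅇ", "ㅈ", "ㅊ", "ㅋ", "ㅌ", "ㅍ", "ㅎ"]   # 28 finals ("" = none)


def decompose(text):
    """Return a list of (initial, vowel, final) tuples, one per syllable block."""
    blocks = []
    for ch in text:
        index = ord(ch) - 0xAC00
        if not 0 <= index < 19 * 21 * 28:
            raise ValueError(f"not a precomposed hangul syllable: {ch!r}")
        lead, rest = divmod(index, 21 * 28)
        vowel, tail = divmod(rest, 28)
        blocks.append((LEADS[lead], VOWELS[vowel], TAILS[tail]))
    return blocks


# Morpheme-boundary spelling: final ㅂ, then an empty-initial block.
print(decompose("십이"))
# Purely phonetic spelling: ㅂ starts the second block.
print(decompose("시비"))
```

The initial ㅇ in the second block of 십이 corresponds to the Ø in <s-i-p Ø-i>.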

C-V junctures in Akkadian cuneiform could also represent CVʔV sequences: e.g.,


imʔidu 'they have become many'.

All of the above leads me to think that if 滅烏 'destroy crow' represented something like Koguryo *metV, that name must have been bimorphemic: *met(V)-V or *mer-V. (I still doubt that a Koguryo intervocalic *-r- would be written with a *-t Chinese character, though I now concede that possibility at a morpheme boundary.)

Another possibility is that 滅烏 stood for a monomorphemic *metVV with 烏 representing the final vowel.

There are still other possibilities: e.g., 滅 was pronounced like a Koguryo root for 'destroy' (or represented that root) and/or 烏 was pronounced like a Koguryo word for 'crow' (or represented that word).

There is no guarantee that 滅烏 had any phonetic or semantic similarity to the other Koguryo name 駒城 'colt fortress' or any later names. The two names could be as unrelated as 東京 Tokyo 'eastern capital' and 江戸 Edo 'river-door'. Or they could have a partial relationship like Nieuw-Amsterdam and New York. (Nieuw means 'new', but Amsterdam is not Dutch for 'York'!)

COLT FORTRESS DESTROYS CROWS

When I continue my series on Korean numerals, I'll be looking at the very earliest possible transcriptions of numerals on the Korean peninsula.

These transcriptions are in place names which have changed over time. Korean historical linguists tend to assume that if a place changes name from AB to CD, A is somehow equivalent to C and B is somehow equivalent to D. Thus if a place was once called XY and was later called 一Z (one-Z), X might have been a phonetic transcription of an early word for 'one'.

The trouble is that different names for the same place may not be related: e.g., 東京 Tokyo (Toukyou) 'eastern capital' was once known as 江戸 Edo 'river-door'. 東 Tou 'eastern' has nothing to do with 江 e 'river', and 京 kyou 'capital' has nothing to do with 戸 -do 'door'.

Are these names for modern Yong'in related? (Thanks to John Bentley for asking me about 滅 in the second name.)

| Period | Chinese characters | Modern Sino-Korean readings | Middle Korean Sino-Korean readings | Reconstructed Middle Chinese readings | Meanings of characters |
|---|---|---|---|---|---|
| Koguryo | 駒城 | kusŏng | ku syəŋ (< *seŋ?) | *kuo dʑieŋ | colt fortress |
| Koguryo | 滅烏 | myŏro | myərʔ ʔo (< *merʔ?) | *met ʔo | destroy crow |
| Late Shilla | 巨黍 | kŏsŏ | kə syə (< *se?) | *gɨə ɕɨəʔ | giant millet |
| Koryo | 龍駒 | yonggu | ryoŋ ku | *luoŋ kuo | dragon colt |
| Today | 龍仁 | yong'in | ryoŋ zin | *luoŋ ɲin | dragon benevolence |

(Needless to say, the Middle Chinese reconstructions are not relevant to Koryo or modern Korean.)

The place had two names in the Koguryo period.

駒城 'colt fortress' makes sense and may be a Chinese translation of a lost Koguryo name. (I will use 'Koguryo' to refer to the unknown language underlying the transcriptions.)

On the other hand, 滅烏 'destroy crow' makes less sense and therefore is likely to be a native Koguryo name transcribed phonetically as 滅烏. The problem is that we do not know for sure if 滅烏 was read as

a pair of Chinese(-based) readings: *met ʔo

a pair of Koguryo readings: *X Y

some mixture of the two: *X ʔo or *met Y

Itabashi (2003: 146) reconstructed 滅烏 as *meru 'colt' on the basis of the Chinese readings.

Lyu Lyel (1983: 173) reconstructed 滅烏 as *mira-kara/*mara-kuru, a mixture of a Chinese-based reading for 滅 with a hypothetical native Koguryo reading meaning 'black' for 烏 'crow' (inspired by Classical Mongolian qara 'black' and Japanese karasu 'crow'?).

All of these reconstructions have *-r- even though 滅 was not read with *-r in Chinese until around the 8th century, decades after the collapse of Koguryo in 668. If the intent was to transcribe a Koguryo word of the type *mVrV, why not transcribe it with Chinese characters for *mV lV, with Chinese *l for Koguryo *r?

I briefly considered the possibility that 滅 represented a form like *mer < *mör < *möri(n) < *morin (cf. Classical Mongolian morin 'horse'). Since Chinese had no *-r, *-t might represent Koguryo *-r. (But I think a medial rather than a final Koguryo *-r- would be transcribed with Chinese *l-; see above.)

However, I now wonder if 滅 represented a Koguryo cognate *met(s) of Middle Korean mʌyatsi 'colt'.

I don't know what 烏 represented. If 滅烏 is a very old transcription, 烏 may have represented Koguryo *a (cf. Late Old Chinese *ʔɑ). But I suspect 烏 had some Koguryo native reading which did not begin with a vowel. (I would expect *metV to be transcribed as *met or *me tV, not *met ʔV.) (Wrong!)

I think 巨黍 *kə se or *kə syə might be a Shilla transcription of a Sino-Koguryo pronunciation of 駒城 (*kɔ se?) which didn't sound like the Shilla pronunciation of 駒城 as *ku seŋ or *ku syəŋ (i.e., more or less identical to the Middle Korean pronunciation of 駒城 seven centuries later).

2.19.0:35: Why do later names have 龍 'dragon'? In Middle Korean, 'dragon' was mirɯ. MK -r- may be a lenition of an earlier *-t-, so a pre-MK (i.e., Old Korean / Shilla) word might've been *mitɯ. Perhaps Koguryo *met(s) 'colt' was Shillafied as *mitɯ 'dragon', a similar-sounding but unrelated Shilla word.

The modern name 龍仁 Yong'in is a combination of 龍駒 Yonggu and 處仁 Chŏin (now a district of Yong'in).

WHY IS RECONSTRUCTING KOREAN NUMERALS SO COMPLICATED?

The short answer is that

- there is so little data

- the data is extremely ambiguous

- and seemingly contradictory

I don't know where my copy of the 二中暦 Nichuureki transcriptions of Korean numerals is, and I don't have my copy of 鷄林遺事 Kyerim yusa with me, so I've completed the table below using data from a comment by <丶`∀´>(´・ω・`)(`ハ´  )さん. (I do not recommend reading that discussion.) I hope that data is accurate. I recall two different sets of transcriptions in Nichuureki, but I don't remember if the second one was from Kikaijima*. Fortunately, this is just a blog entry and not a serious academic publication, so errors are embarrassing but not fatal. MK = Middle Korean.

| Gloss | Long form (LF) | Short forms (SF) | Root (cf. day terms) | Old Korean hyangchal spelling | My revision of Kim Wan-jin's Old Korean reconstruction | Nichuureki: 高麗語 'Koryo language' | Nichuureki: 喜界島語 'Kikaijima language' | Chinese transcription in Kyerim yusa | My reconstruction of the Chinese readings of the Kyerim yusa transcriptions |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 하나 hana < MK hʌnah | han | *han- | 一等 | *hʌtʌn | katana | katana | 河屯 | *xɔtun |
| 2 | 둘 tul < MK turh < ?*tupurh | tu | ?*tupu- | 二尸, 二肹 | *tuɣɯl | tufuri | tofu | 途孛 | *tupur |
| 3 | 셋 set < MK səyh | se, sŏk, | *sə- | | | toki | tofi | | *ʃay |
| 4 | 넷 net < MK nəyh | ne, nŏk, | *nə- | | | saki | sofi | | *nay |
| 5 | 다섯 tasŏt < MK tasʌs | tat | *tas- | | | yesu | yeso | 打戍 | *tasür |
| 6 | 여섯 yŏsŏt < MK yəsɯs | yŏt | *yəs- | | | fasu | faso | 逸戍 | *yisür |
| 7 | 일곱 ilgop < MK nirkup | | *nir- | | | tariuni | saso | 一急 | *irkïp |
| 8 | 여덟 yŏdŏl < MK yətɯrp | | *yətɯr- | | | tiriuni | tiriuni | 逸答 | *yirtap |
| 9 | 아홉 ahop < MK ahop | | *a- | | | yetaro | yetaro | 鴉好 | *axo |
| 10 | 열 yŏl < MK yərh | | *yər- | | | yeturo | yero | | *yer |
| 1000 | 즈문 chŭmun < MK tsɯmɯn | chŭmu | *tsɯmɯ- | 千隱 | *tsɯmɯn | (none?) | (none?) | ? | ? |

I have no pre-MK data for modern/MK 온 on 'hundred'.

1. All earlier data point to a *-t- absent in MK. Did this consonant irregularly lenite to zero (rather than to the expected *-r-)? The k- of Nichuureki and the *x- of Kyerim yusa are attempts to transcribe a Korean *h absent in Japanese and Chinese at that time.

一等 is a combination of the semantogram 一 'one' and the phonogram 等 for the second syllable. It is not to be confused with 一等 ilttŭng 'first class'. Middle Chinese 等 *təŋʔ is an approximation of Old Korean *tʌn, a syllable absent from Middle Chinese.

2. Alexander Vovin proposed that the second consonant of 'two' lenited to *ɣ in the dialect transcribed in hyangchal poetry. This velar fricative is implied by the back fricative of 肹 (early Sino-Korean hɯrʔ). Kyerim reflects another dialect with *p. There was no special symbol for pu in Japanese when Nichuureki was compiled, so fu could represent *pu, *fu, *βu, etc.

二尸 and 二肹 are combinations of the semantogram 二 'two' with the phonograms 尸 and 肹 for *-l and *-l. 尸 was presumably used for its Old Chinese reading *hli to represent a final *-l.

3 and 4. The Nichuureki transcriptions are probably reversed. The t- of Nichuureki may represent [nd] or [d] corresponding to MK n-.

5. The Nichuureki transcription is probably for 'six'.

6. Is fa in the Nichuureki transcriptions an error for the ta- of 'five'?

7. Could saso represent a variant of 'five' with *t- > *s-? The ta- of tariuni matches 'five', but the rest doesn't.

8. The Nichuureki transcriptions could represent 'seven' if the t- represents [nd] or [d] corresponding to MK n-. But why is there no trace of -k- or -p? And what is -ni?

9. The Nichuureki transcription is probably for 'eight'. Maybe a final *-p was ignored in Nichuureki as well as in Kyerim yusa.

10. The -t- in Nichuureki may have lenited to zero like the *-t- of 'one'.

1000. 千隱 is a combination of the semantogram 千 'thousand' and the phonogram 隱 for the *-ɯn of the second syllable.

*Is this an error, or was Korean once spoken there?

WHY IS COUNTING SO COMPLICATED IN KOREAN? (PART 3)

The English words for 'twenty' through 'ninety' consist of 'two' through 'nine' plus -ty and some modifications: e.g., 'twenty' is not twoty. The modifications needed to derive the Korean words for 'thirty' and 'sixty' through 'ninety' are more drastic, and 'twenty', 'forty', and 'fifty' have their own roots. MK = Middle Korean.

| Gloss | Numeral | Number of tens (long forms) | Number of tens (short forms) | Root for number of tens (cf. day terms) | Notes | Type |
|---|---|---|---|---|---|---|
| 10 | 열 yŏl < MK yərh | 하나 hana < MK hʌnah | han | *han- | -h < *-k? | n/a |
| 20 | 스물 sŭmul < MK sɯmɯr | 둘 tul < MK turh < ?*tupurh | tu | ?*tupu- | Unrelated to 'two'; short form is 스무 sŭmu; same *-r suffix as two? | A |
| 30 | 서른 sŏrŭn < nonpreferred variant sŏrhŭn? < MK ? | 셋 set < MK səyh | se, sŏk, | *sə- | *-r-(h)ɯn; is the -h- in the variant original or due to analogy with other -hŭn forms? | B3? |
| 40 | 마흔 mahŭn < MK mazʌn | 넷 net < MK nəyh | ne, nŏk, | *nə- | Unrelated to 'four'; MK z normally doesn't become later h, so perhaps mahŭn (by analogy with other -hŭn forms?) and mazʌn are parallel descendants of an Old Korean word for 'forty' | A/B1 |
| 50 | 쉰 shwin < MK suyn < ?*suin(V); dialectal 쉬흔 shwihŭn | 다섯 tasŏt < MK tasʌs | tat | *tas- | Unrelated to 'five'; is dialectal form conservative or by analogy with other -hŭn forms? | A(/B1?) |
| 60 | 예순 yesun < yəysyun < MK ? | 여섯 yŏsŏt < MK yəsɯs | yŏt | *yəs- | *-syun; the -y- in the first syllable is presumably due to the influence of the -y- in the second syllable and is not an infix | C |
| 70 | 일흔 irhŭn < MK ? | 일곱 ilgop < MK nirkup | | *nir- | *-hɯn | B1 |
| 80 | 여든 yŏdŭn < yətɯn < MK ? | 여덟 yŏdŏl < MK yətɯrp | | *yətɯr- | *-ɯn; did *-r- disappear intervocalically: *yətɯr-ɯn > yətɯn? | B2 |
| 90 | 아흔 ahŭn < MK ? | 아홉 ahop < MK ahop | | *a- | *-hɯn | B1 |

Type A numerals (20, 40, 50) have unique roots: cf. Turkish which has unique roots for 'twenty' through 'fifty'. Did a unique root once exist for 'thirty'? If so, why did it fall out of use? Resemblance to a taboo word?

Type B numerals (30, 40, 70-90) end in -(h)ŭn. 'Thirty' has an unexpected -r- before this suffix.

The sole type C numeral (60) ends in -sun < -syun. I wonder if the type B suffix is a reduction of *-syun: e.g., MK mazʌn < ?*ma-syun 'forty'. MK -z- is partly from earlier *-s-.

WHY IS COUNTING SO COMPLICATED IN KOREAN? (PART 2)

Native Japanese numerals from one to nine all end in the suffix -tsu. There is no single suffix for their native Korean equivalents, which have long and short forms. The long forms are used in isolation. The short forms are used in constructions: e.g., han saram 'one person' (not hana saram), chhaek han kwŏn 'one book' (not chhaek hana kwŏn; chhaek = 'book', kwŏn = 'volume', the counter for books). When multiple short forms are listed, the choice of short form is partly phonologically conditioned*. MK = Middle Korean.

| Gloss | Long form (LF) | Short forms (SF) | Root (cf. day terms) | Notes | Type |
|---|---|---|---|---|---|
| 1 | 하나 hana < MK hʌnah | han | *han- | LF -ah < *-ak?; cf. hanak 'one' used before sshik 'each'. | A1 |
| 2 | 둘 tul < MK turh < ?*tupurh | tu | ?*tupu- | LF -rh < *-r-k? | A2 |
| 3 | 셋 set < MK səyh | se, sŏk, | *sə- | LF -yh < *-y-k? | A3 |
| 4 | 넷 net < MK nəyh | ne, nŏk, | *nə- | (as for 3) | A3 |
| 5 | 다섯 tasŏt < MK tasʌs | tat | *tas- | LF -Vs with vowel determined by harmony | B |
| 6 | 여섯 yŏsŏt < MK yəsɯs | yŏt | *yəs- | (as for 5) | B |
| 7 | 일곱 ilgop < MK nirkup | | *nir- | LF -kVp with vowel determined by harmony | C1 |
| 8 | 여덟 yŏdŏl < MK yətɯrp | | *yətɯr- | Is -p a remnant of the *-kVp suffix in '7' and '9'? Was '8' once *yətɯr-kup? | C3 or D? |
| 9 | 아홉 ahop < MK ahop | | *a- | LF -hVp < ?*-kVp with vowel determined by harmony | C2 |
| 10 | 열 yŏl < MK yərh | | *yər- | LF -h < *-k | A4 |

None of these numerals are related to Japanese numerals.

The long forms all incorporate suffixes:

Type A (1, 2, 3, 4, 10): -h < *-k suffixes usually preceded by other suffixes (*-a, *-r, *-y) except in 'ten'.

Type B (5, 6): *-Vs suffixes. The final -s of the roots may also be a suffix, but I know of no forms with ta- for 'five' and yŏ- for 'six' without a trace of -s-.

Type C (7, 8?, 9): *-kVp suffixes. Does 'eight' actually belong to a type D with an unrelated suffix *-p?
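
As a rough check on the grouping above, here is a small Python sketch that sorts the Middle Korean long forms by their final consonant alone. The function names and the decision to look only at the last letter are my own simplifications, not a real morphological analysis. 'Eight' lands in type C here simply because it ends in -p, which leaves the type D question open.

```python
# Middle Korean long forms as cited in this post.
MK_LONG_FORMS = {
    1: "hʌnah", 2: "turh", 3: "səyh", 4: "nəyh", 5: "tasʌs",
    6: "yəsɯs", 7: "nirkup", 8: "yətɯrp", 9: "ahop", 10: "yərh",
}


def suffix_type(mk_form):
    """Crude classification by final consonant only (an assumption)."""
    if mk_form.endswith("h"):
        return "A"   # -h < *-k
    if mk_form.endswith("s"):
        return "B"   # *-Vs
    if mk_form.endswith("p"):
        return "C"   # *-kVp, or possibly a separate *-p in 'eight'
    return "unclassified"


def group_by_type(forms):
    """Map each suffix type to the list of numerals that show it."""
    groups = {}
    for number, form in forms.items():
        groups.setdefault(suffix_type(form), []).append(number)
    return groups


print(group_by_type(MK_LONG_FORMS))
# {'A': [1, 2, 3, 4, 10], 'B': [5, 6], 'C': [7, 8, 9]}
```

The output makes the odd position of 'ten' easy to see: it is the only type A numeral that is not adjacent to the others.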

With the inexplicable exception of 'ten' which patterns like 'one' through 'four', each suffix type is associated with a set of adjacent numerals. What would these suffixes have meant: 'low numeral' (but what about 'ten'?), 'middle numeral', and 'high numeral'? And how did they originate? Are they reductions of earlier full words?

The obsolete higher native numerals have none of the above suffixes: 온 on 'hundred', 즈믄 chŭmŭ-n 'thousand'.

*For example, nŏk is a short form of 'four' that is used before s-, but one says ne saram 'four people' rather than nŏk saram (0 Google hits) even though one says nŏk sŏm (4 x 180.5 liters of rice).

WHY IS COUNTING SO COMPLICATED IN KOREAN? (PART 1)

Korean has a set of words for counting days that still puzzles me after twenty years:

| Numbers of days/day of the month | Short day term | Long day term | Corresponding native numeral | Day term type |
|---|---|---|---|---|
| 1 | 하루 haru | 하룻날 haru.n-nal | 하나 hana | A |
| 2 | 이틀 ithŭl | 이튿날 ithŭ.n-nal | 둘 tul | B'': root change |
| 3 | 사흘 sahŭl | 사흗날 sahŭ.n-nal | 셋 set | B': root vowel change |
| 4 | 나흘 nahŭl | 나흗날 nahŭ.n-nal | 넷 net | B': root vowel change |
| 5 | 닷새 tassae | 닷샛날 tassae.n-nal | 다섯 tasŏt | C |
| 6 | 엿새 yŏssae | 엿샛날 yŏssae.n-nal | 여섯 yŏsŏt | C |
| 7 | 이레 ire | 이렛날 ire.n-nal | 일곱 ilgop | D |
| 8 | 여드레 yŏdŭre | 여드렛날 yŏdŭre.n-nal | 여덟 yŏdŏl | D |
| 9 | 아흐레 ahŭre | 아흐렛날 ahŭre.n-nal | 아홉 ahop | E |
| 10 | 열흘 yŏrhŭl | 열흘날 yŏrhŭl-lal | 열 yŏl | B: no root vowel change |

The long terms are almost entirely predictable. They all contain the short terms plus 날 nal 'day'.

nal is always preceded by the genitive suffix -n- < -s- except in 'tenth'. This suffix is normally spelled as ㅅ s, but is spelled as ㄷ t in 'second', 'third', and 'fourth day'. (2.15.0:54: Are there dialects in which 'two', 'three', and 'four days' are pronounced with a final -t?)

'Tenth' ends in -lal instead of -nal because n- becomes l- before -l.
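
The rules for building the long terms can be written out as a short Python sketch. It works on my romanizations rather than on hangul, and it treats 'tenth' as a listed exception rather than deriving it. The names `NO_GENITIVE` and `long_day_term` are hypothetical.

```python
# Short day terms paired with the long day terms from the table.
DAY_TERMS = [
    ("haru", "haru.n-nal"), ("ithŭl", "ithŭ.n-nal"),
    ("sahŭl", "sahŭ.n-nal"), ("nahŭl", "nahŭ.n-nal"),
    ("tassae", "tassae.n-nal"), ("yŏssae", "yŏssae.n-nal"),
    ("ire", "ire.n-nal"), ("yŏdŭre", "yŏdŭre.n-nal"),
    ("ahŭre", "ahŭre.n-nal"), ("yŏrhŭl", "yŏrhŭl-lal"),
]

# 'Ten days' is the one short term that takes no genitive suffix (listed, not derived).
NO_GENITIVE = {"yŏrhŭl"}


def long_day_term(short):
    """Build the long day term from a romanized short day term."""
    if short in NO_GENITIVE:
        # Final -l is kept, and the n- of nal becomes l- after it.
        return short + "-lal"
    if short.endswith("hŭl"):
        # -hŭl is reduced to -hŭ before the genitive suffix.
        short = short[:-1]
    # Genitive suffix (pronounced -n-) plus nal 'day'.
    return short + ".n-nal"


for short, expected in DAY_TERMS:
    print(short, "->", long_day_term(short), "| matches table:", long_day_term(short) == expected)
```

That only one listed exception is needed is what I mean by 'almost entirely predictable'.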

However, the short terms cannot be easily derived from the numerals, or vice versa. I have grouped them in five categories:

Type A: 하루 haru 'one day; first' < Middle Korean hʌrʌ shares a root ha- < hʌ- with hana < hʌnah 'one' (shortened to han < hʌn before nouns). -ru < -rʌ (< *-tʌ?) is a suffix of unknown meaning.

Type B: The words for 'two', 'three', 'four', and 'ten days' end in -hŭl < -hɯl ~ hʌl (depending on vowel harmony). This suffix attaches to the roots for 'three' and 'four' with a vowel change:

sŏ- + -hŭl > sahŭl (not sŏhŭl; set 'three' < sŏ-i-s)

nŏ- + -hŭl > nahŭl (not nŏhŭl; net 'four' < nŏ-i-s)

but 'two days' is ithŭl instead of the expected tuhŭl or tahŭl. Could it- be an archaic root for 'two'? It does not resemble any other 'Altaic' word for two: e.g., Turkish iki, Proto-Mongolian *ji-ri-n, Manchu juwe, or Old Japanese puta-. Although it- has the same vowel as modern Sino-Korean 二 i 'two', SK i is from Middle Korean zi with an initial z- and a high pitch, whereas Middle Korean ithɯr had a low-high pitch pattern (implying that it- had low pitch).

(2.15.1:05: Martin 1992: 178 mentions 읻째, 잇째 it-tchae 'second' with the root it- which "seem to be artificial". I cannot Google any examples of these words.)

'Ten days' is yŏrhŭl with -r- since -l- becomes -r- before -(h)V.

2.15.00:35: -hŭl is reduced to -hŭ before the genitive suffix -n- < -s- except in 'tenth'. I briefly thought that -hŭl might be an abbreviation of *-hV-s-nal, but 'second day' is both ithɯr-s-nal and ithɯ-s-nal in early hangul texts. I cannot find any early attestations of 'third' and 'fourth' with the genitive suffix and -nal.

2.15.22:16: The -h- in the type B suffix is probably another suffix (see the analysis in part 2), so the type B suffix is really -ŭl.

Type C: The words for 'five' and 'six days' consist of the roots for 'five' and 'six' plus -sae. Although -ae < *-ay is a vowel harmony variant of -e in types D and E, I cannot reconcile the -s- of -sae with the zero of type D or the -r- of type E.

Type D: The words for 'seven' and 'eight days' consist of the roots for 'seven' and 'eight' plus -e. Note that an archaic root for 'eight' (yŏdŭl- with -ŭ-) appears before -e. -l- becomes -r- intervocalically: il-e > ire.

Type E: The word for 'nine days' consists of the root for 'nine' with an unexpected vowel change plus -re (with an -r- by analogy with the type D words?).

2.15.0:20: Perhaps type D once had the same *-rVy suffix (with vowel harmony variation) as type E. The *-r- of this suffix was lost after a root-final *-r:

*nir-rəy > modern ire

*yʌtʌr-ray > Middle Korean yətʌray > modern yŏdŭre

*aho-ray > Middle Korean ahʌray > modern ahŭre

