Amaravati: Abode of Amritas

13.11.16.23:51: KUDARA = BEARFORD?

While looking up bear in Wiktionary last night for my second 'bear'-iation post, I found Paekche (= Baekje) 固麻 koma listed as a translation equivalent.

In 2005, I proposed that the Japanese name Kudara < *kundara for Paekche could have contained a Paekche *qon 'hundred' cognate to Korean on 'hundred'. The word would have been borrowed into pre-Old Japanese as *kondara before its *o was raised to *u. I posted this etymology last year and listed five problems with it.

Last night I considered deriving Kudara from *kom(a)-dʌrʌ, a hypothetical Paekche cognate of komanʌrʌ 'bear-ford', the native Middle Korean equivalent of Ungjin, the Sino-Korean reading of 熊津, the capital of Paekche. This new etymology has its own set of issues in addition to carryovers from its predecessor*:

1. All other evidence points to disyllabic *koma for 'bear' in Paekche, not monosyllabic *kom like Middle Korean and modern Korean kom 'bear'. (The -a- in Middle Korean komanʌrʌ is an archaism.)

2. The presence of phonemic voiced stops in early Koreanic is controversial.**

3. The shift of *d- to *n- is highly speculative.***

4. The earliest Japanese form may have been *kutara****, and *-md- would not have been borrowed as *-t-.

*11.17.0:58: As I wrote last year, "There is no guarantee that the u of OJ [Old Japanese] Kundara is from *o rather than *u."

Moreover, the problem of relating *kom(a)-dʌrʌ to the Chinese character spelling 百濟 'hundred-cross' for Kudara remains. A ford (*dʌrʌ) can be crossed, but 'hundred' has nothing to do with 'bear' (*kom(a)).

11.17.2:15: My Paekche *qon 'hundred' (whose *q- is questionable) does, however, vaguely sound like *kom 'bear'. Was 百 a symbol for *KoN-syllables?

**11.17.1:04: Chinese character transcriptions of early Koreanic have voiceless and voiced obstruent alternations in spellings for the same name, implying that voicing was not phonemic. On the other hand, Ramsey proposed voiced obstruents as the source of some Middle Korean medial consonants. I think those voiced obstruents may have been short-lived products of intervocalic voicing in pre-Middle Korean and am hesitant to project them very far back.

***11.17.2:05: Late 19th and early 20th century northern Korean forms of 'four' have nd- or even d- corresponding to standard n- (see Martin 1992: 28 for details). It is tempting to reconstruct *d- as a source of standard n- in 'four' since other 'Altaic' languages have d-words for 'four': Proto-Turkic *dȫrt, Proto-Mongolic *dörbe-n (Janhunen 2003: 16), and Proto-Tungusic *dügin. (It is debatable whether Proto-Japonic *yə should be reconstructed as *də.) Starostin reconstructed Proto-Altaic *tṓj- (with *t- instead of *d-!) on the basis of Turkic, Mongolic, Tungusic, and Japonic (but not Koreanic).

But even if there was a pan-'Altaic' areal word 'four', no complete 'Altaic' numeral system can be reconstructed: e.g., 'three' is Proto-Turkic *üč, Proto-Mongolic *gurba-n (Janhunen 2003: 16), Proto-Tungusic *ilan, early Koreanic *s-ki, and Proto-Japonic *mi. Starostin derived Proto-Turkic *otuŕ 'thirty' and the Proto-Mongolic and Proto-Japonic forms for 'three' from a dubious Proto-Altaic *ŋ[i̯u].

****11.17.2:10: The Zushoryou edition of Ruiju myougishou has the spelling 久太良 indicating *kutara in Early Middle Japanese (Oono et al. 1990: 412).

13.11.15.23:20: 'BEAR'-IATION AND EAST SLAVIC INTERLANGUAGES

Tonight I wonder if Surzhyk and Trasianka words for 'bear', 'rose', 'apple', and 'cucumber' resemble my Russian-based guesses for Ukrainian and Belarusian. I can imagine "the imperfect attempts of urban professionals to use Ukrainian in spheres where they had only used Russian before" (Bilaniuk and Melnyk 2008: 71) to contain words such as *medvid' instead of vedmid' for 'bear'. And/or are there Surzhyk and/or Trasianka forms which are Ukrainian- and Belarusian-based guesses for Russian? E.g.,

Gloss	Russian	Ukrainian	Ukrainian-based guess for Russian	Belarusian	Belarusian-based guess for Russian
bear	medved'	vedmid'	vedmed' or vedmod'	mjadzvedz'	medved'
rose	roza	trojanda	trojanda	ruža	ruža
apple	jabloko	jabluko	jabluko	jablyk	jablyk
cucumber	ogurec	ohirok	ogerok or ogorok	ahurok	ogurok

My guesses above incorporate the following rules based on regular correspondences:

Ukrainian and Belarusian h > Russian g

Ukrainian i > Russian e or o (unpredictable without a knowledge of historical phonology that most Ukrainian speakers wouldn't have)

Belarusian unstressed ja > Russian e

Belarusian dz' > Russian d'

Belarusian unstressed a > Russian unstressed o [ʌ]
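The two Ukrainian rules above are simple enough to sketch in Python. This is only a toy illustration under my own assumptions: the function name russian_guesses is hypothetical, the input is the romanization used in the table, and every i is treated as ambiguous between e and o since the right choice is unpredictable without historical phonology. I have left out the Belarusian rules because they depend on stress, which the romanization doesn't mark.

```python
from itertools import product


def russian_guesses(ukrainian: str) -> list[str]:
    """Return candidate Russianized forms of a romanized Ukrainian word.

    Hypothetical helper: it applies only the two Ukrainian rules listed
    above and knows nothing about stress or historical phonology.
    """
    # Rule 1: Ukrainian h corresponds to Russian g.
    word = ukrainian.replace("h", "g")

    # Rule 2: each Ukrainian i may correspond to Russian e or o,
    # so generate one candidate per combination of choices.
    parts = word.split("i")
    guesses = []
    for choice in product("eo", repeat=len(parts) - 1):
        out = parts[0]
        for vowel, rest in zip(choice, parts[1:]):
            out += vowel + rest
        guesses.append(out)
    return guesses


if __name__ == "__main__":
    for w in ["vedmid'", "trojanda", "jabluko", "ohirok"]:
        print(w, "->", " or ".join(russian_guesses(w)))
```

For the four Ukrainian words in the table, this gives the same guesses I listed by hand: two candidates each for vedmid' and ohirok, and trojanda and jabluko unchanged.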

Belarusian nonpalatalized dz is "marginal" (Mayo 1993: 896), and I don't know what it would correspond to in Russian. I also don't know if -dz- in mjadzvedz' is palatalized or not. If it isn't, I would guess that a Belarusian speaker might Russianize nonpalatalized /dz/ as nonpalatalized /d/ by analogy with palatalized /dzʲ/ which corresponds to Russian palatalized /dʲ/.

13.11.14.23:32: 'BEAR'-IATION IN SLAVIC

Using my knowledge of Russian and rules of sound correspondences, I can roughly guess what the non-Russian cognates of Russian words are. But this imgur image gallery demonstrates the limits to this approach:

Gloss	Interslavic	Russian	Ukrainian	Belarusian	Polish
bear	medvěď	medved'	vedmid' (not *medvid'!)	mjadzvedz' (not *mjadvedz'!)	niedźwiedź (not *miedwiedź!)
rose	roza	roza	trojanda	ruža (not roza!)	róża [ruʐa] (not roza or róza!)
apple	jabloko	jabloko	jabluko (not *jabloko!)	jablyk (not *jablaka!)	jabłko (not *jabłoko!)
cucumber	(none)	ogurec	ohirok (not *ohurek!)	ahurok (not *ahurek!)	ogórek (not *ogurec, though ó is now homophonous with u)

Interslavic is an artificial generic Slavic.

Bear: Were m and v flipped in Ukrainian for taboo deformation? Did mi- /mʲ/ become ni- /ɲ/ in Polish for the same reason?

The final *-d of 'honey' seems to have assimilated to the following palatal wi- /vʲ/ in Polish, becoming palatal -dź-. Is Belarusian -dz- nonpalatalized like Russian -d- or palatalized /dzʲ/ corresponding to the Polish affricate -dź-?

Rose: The Belarusian word was borrowed from Polish after Polish ó raised to [u]. Why does Polish have -ż- [ʐ] instead of -z-?

Ukrainian trojanda is obviously unrelated to the others.

Apple: Why are the second vowels so different (or absent in the case of Polish)?

Cucumber: Why does Ukrainian have -i- which normally doesn't correspond to -u- in other Slavic languages? -ec, -ok, and -ek are diminutive suffixes; Czech and Slovak have another suffix -ka. (Is Hungarian uborka 'cucumber' a loan from Slovak uhorka, and if so, why does it have -b-?)

13.11.13.21:09: A PERPLEXING 'PASS'-IVE PHONETIC

In the Hua-Yi yiyu, Jurchen juwa 'ten' (from my last post) was transcribed as 撾 *tʂwa in Ming Dynasty Chinese. 撾 'to knock, beat' is still read as [tʂwa] in standard Mandarin today. It and its homophone 檛 'horsewhip' have semantic left components: 扌 is 'hand' and 木 is 'wood'. Right components are usually phonetic, but [tʂwa] does not sound like 過 [kwo] 'to pass'. Although phonetic mismatches are often cleared up by going back to earlier stages of Chinese phonology, that is not the case here: in Late Old Chinese, 撾 and 檛 were pronounced as *ʈwæi with an initial retroflex stop, whereas 過 *kwai(h) still had an initial velar stop. 𥬲, an equivalent of 檛 in Shuowen (c. 121 AD), has 竹 'bamboo' atop 朵 *twaiʔ which was a better phonetic match for *ʈwæi than 過 *kwai(h).

eastling.org has no Early Old Chinese reconstructions for 撾, though it lists (generates?) a Zhengzhang Shangfang-style reconstruction *kr'ool and a Pan Wuyun-style reconstruction *k-rool for 檛. However, I can't find any attestation of 檛 earlier than Han shu (111 AD), and by then it was read as *ʈwæi. I can't find any attestation of 撾 earlier than Hou Han shu (5th c. AD). So I am hesitant to reconstruct 𥬲/撾/檛 before the first millennium AD.

The change of the cluster *k(-)r- to a retroflex stop *ʈ- is phonetically plausible, but such a change is unlikely to have occurred as late as the first millennium AD when the 過 *k-spellings first appear. Moreover, the Shuowen spelling 𥬲 has a dental-initial phonetic 朵 whose Early Old Chinese reading was *toolʔ in both the Zhengzhang Shangfang and Pan Wuyun-style reconstructions at eastling.org.

I don't know of any characters pronounced *ʈwæi in Late Old Chinese other than 撾 and 檛. Was 過 chosen as a phonetic for 撾 and 檛 in spite of the mismatch in initials because there was no better match available (other than 朵)?

I would rather not reconstruct some exotic cluster like *ktr- or *tkr- to account for 過 in 撾 and 檛 since such clusters were gone in recorded mainstream varieties by the first millennium AD. (They could have survived in colloquial and/or peripheral varieties. Were 撾 and 檛 created by a speaker of such a variety?)

Some further complications:

撾 has a second reading -wo in standard Mandarin 老撾 Laowo 'Laos', a phonetic transcription of Lao ລາວ Laaw 'Lao'. How old is that reading and the word it appears in?

zdic.net lists some non-Mandarin readings of 撾 and 檛:

Sinograph	Cantonese	Chaozhou	Hakka
撾	zaa [tsaː], gwo [kwɔː]	tshuaⁿ, ko	(none listed)
檛	zaa	(none listed)	ko

Are the k-readings by analogy with the k- of 過, or are they evidence for reconstructing *k- in 撾 and 檛?

Why does Chaozhou tshuaⁿ have aspiration and nasalization?

13.11.12.23:59: WHAT'S SO SPECIAL ABOUT 'ELEVEN' IN JURCHEN?

Today is November 12, 2013 - 11/12/13 in a date format often used in the US - and that made me think about the Jurchen numerals

omšo jirhon gorhon

'eleven twelve thirteen'

The -hon of 'twelve' and 'thirteen' is not related to Jurchen juwa 'ten' or to omšo 'eleven', though I wonder if it's a Khitan compression of some longer form that was also ancestral to Proto-Mongolic *xarban 'ten' (as reconstructed by Janhunen 2003: 16):

*xarban > *xarvan > *xawn > *xɔn > *hon?

Last night I realized that omšo vaguely resembles Jurchen emu 'one' plus juwa 'ten'. But j > š is not a known Jurchen sound change, so omšo cannot be related to juwa.

Today I thought it might be fun if -šo were related to Japanese -so '-ty' in miso 'thirty' < mi 'three' + -so, etc.

Korean also has a '-ty' morpheme with various shapes: -hŭn, -ŭn, -un, -n (Martin 1992: 176).

John Whitman (1985: 169) reconstructed Proto-Koreo-Japonic *sh which became Japanese š and Korean h. Although I don't think Proto-Koreo-Japonic existed, what if there were some Manchurian source language word *šon 'ten' underlying Jurchen -šo (and Manchu -šon in omšon biya 'eleventh month'), Japanese -so, and Korean -(h)Un, but not the basic words for 'ten':

Gloss	'ten' (free)	'ten' (bound)
Jurchen	juwa	-šo
Manchu	juwan	-šon
Korean	yŏl	-hŭn, -ŭn, -un, -n
Japanese	tō < *təwO	-so

Om- could be a shortened version of emu with a vowel that was rounded to harmonize with -šo(n). However, I don't know of any other cases of regressive harmony in Jurchen compounds.

Andrew West (who wrote about the Jurchen numerals and their graphs last year) reminded me about Janhunen's (2003: 399) more plausible proposal to derive Jurchen-Manchu omšo(n) from a Para-Mongolic *omco cognate to Proto-Mongolic *onca 'special, additional'. Is there any other language in which 'eleven' is from 'special' or 'additional'?

13.11.11.23:32: BURNING QUESTION OF THE DAY

Today is Veterans Day in the US, so I've been thinking about the etymologies of both words in its name. Veteran goes back to Proto-Indo-European *wet 'year', but what is the origin of day? Watkins (2011: 1), like its 2000 predecessor, derived day from Proto-Indo-European *agh-, the root of Sanskrit ahan 'day'. But where does the d- come from? Watkins called it "obscure". Could Proto-Germanic *dagaz 'day' simply be a substratal word whose root *dag just happens to match Proto-Indo-European *agh- (which would go back to *ʕekʰ- in a Leiden-style reconstruction*)? (*-az is from Proto-Indo-European *-os.)

On the other hand, Wiktionary derived day from Proto-Indo-European *dʰegʷʰ- 'to burn'. I don't see any phonological problems, though the semantic fit is not absolute: perhaps 'burn' > 'time of heat' > 'daytime' > 'day'. I do, however, see phonological problems with deriving Proto-Slavic *žeťi 'to burn' from *dʰegʷʰ- since *dʰ- should normally become Proto-Slavic *d-: e.g., *dʰeʔ- 'to place' became Proto-Slavic *děti 'to do', not *žěti. According to Pokorny, the sequence *d-g- might have been assimilated to *g-g-: e.g., *degǫ 'I burn' became *gegǫ and then *žegǫ in Proto-Slavic. This assimilated version of the root coexisted alongside the *d-version which survived in Slovene dę́gniti 'to burn'** and Russian djogot 'tar'.

*See Beekes (1995: 124). Is there any language in which voiceless aspirated stops became voiced stops? The opposite change (*g- > kʰ-) is attested in Thai, Lao, and some varieties of Chinese.

**Is this word extinct? Googling "degniti" (which I think would be its spelling in modern Slovenian) in the .si domain, I only got five results, and none were in running text.

13.11.10.23:19: WATER + EARTH + SMALL

What is the Tangut word for 'island'? I couldn't find one in Li Fanwen 2008 or Grinstead 1972. I wish Kychanov and Arakawa's 2006 dictionary had a reverse index.

I wouldn't expect the landlocked Tangut to have a native word for 'island', but they might have had a phonetic borrowing of Tangut period northwestern Chinese 島 *taw 'island' or a calque of Tibetan gling phran 'small continent'. I don't know how the Tangut translated gling 'continent' (though 2lhiẹ 'country' might be cognate).

For fun I created a pseudo-Tangut character for 'island' out of 'water', 'earth', and 'small':

(no reading) 'island' = left of 2ziəəʳ 'water' + left of 2lɨə̣ 'earth' + right of 1tsẽ 'small'

It would be fun to see a Tangut version of this kanji creation contest.
