12.3.10.23:59: SINHALESE *C > H, *J > ?

Most of the Sinhalese numerals (see my last two posts) are cognate to English numerals, though sound changes have obscured their relationship. Hence hatərə 'four' is cognate to four as well as Sanskrit catvaaras 'four'. Turner's A Comparative Dictionary of the Indo-Aryan Languages lists another Sinhalese form satara with s-. I suspect that there was a shift

c > s > h

Other entries in Turner confirm this shift:

'sandalwood': Skt candana > Si san̆dun, han̆dun

(The s- of English sandalwood reflects a Greek s- standing for the c- of a Dravidian borrowing from Sanskrit, not Sinhalese s-.)

'moon': Skt candra > Old Sinhalese cada, Si san̆da, han̆da (n̆ = prenasalization)

The three s-consonants of Sanskrit also became h:

'five': Skt pañca > Si pas (indef. paha)

'six': Skt ṣaṣ > Si hayə

3.11.1:28: Turner also lists Si sa-, saya-, ha-; -ya may be a suffix

'seven': Skt sapta > Si hat

'ten': Skt daśa > Si daha

but 'eight': Skt aṣṭa > Si aṭə (with the cluster ṣṭ simplifying to ṭ; cf. how pt in 'seven' simplified to t; was there an intermediate stage hṭ?)

If c became h, would its voiced counterpart j have become voiced ɦ? I was surprised to learn that c and j developed in different directions: the former softened to a fricative but the latter hardened to a stop:

'person': Skt jana > Si danaa (cognate to gene)

'tongue': Skt jihvaa > Si diva

'life': Skt jiivita > Si divi (cognate to quick, vital, bio, zoo)

If original c and j became s/h and d, where did modern Sinhalese c and j come from? Clusters and/or loanwords?
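The word-initial correspondences above can be sketched mechanically. This is only a rough illustration: the table and the helper name initial_stages are my own assumptions, and the output strings are not attested forms, since the sketch ignores every vowel and cluster change that separates Sanskrit from Sinhalese.

```python
# Hypothetical sketch: models ONLY the word-initial shifts discussed above
# (c > s > h, s > h, j > d). It ignores vowel and cluster changes, so the
# strings it produces are mechanical outputs, not attested Sinhalese words.
INITIAL_SHIFTS = {
    "c": ["s", "h"],  # c > s > h
    "s": ["h"],       # s > h
    "j": ["d"],       # j > d
}


def initial_stages(word: str) -> list[str]:
    """Return the word followed by each later stage of its initial consonant."""
    stages = [word]
    for replacement in INITIAL_SHIFTS.get(word[:1], []):
        stages.append(replacement + word[1:])
    return stages


for skt in ["candana", "candra", "sapta", "jana", "jihvaa"]:
    print(" > ".join(initial_stages(skt)))
```

Words beginning with anything other than c, s, or j pass through unchanged, which is all the sketch can say about forms like 'eight'.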


12.3.9.23:59: ONE, TONE, THROUGH ...

I was surprised by some of the Sinhalese numerals. The first three are

eka 'one'

deka 'two'

tuna 'three'

The first looks like Sanskrit eka. Was it inherited more or less intact from Sanskrit eka (whose e is long [ee] unlike Sinhalese short e) or borrowed into Sinhalese?

The second and third are quite different from their probable Sanskrit sources.

Turner's A Comparative Dictionary of the Indo-Aryan Languages lists OSi. (Old Sinhalese?) de and do 'two'. I presume these are from Sanskrit dve (f./n. nom.) and dvau (m. nom.). Sinhalese has no grammatical gender, so perhaps do was lost, and de was changed to deka to match eka 'one'. (Cf. how *n- became d- in 'nine' to match the d- of 'ten' in Slavic: e.g., Russian devjat' and desjat', etc.)

tuna has an -u- unlike Skt trayas (m.), triiṇi (n.), or tisras (f.). Where did -u- come from?

3.10.1:35: Gair (2003: 784) lists the stems of the first three Sinhala numerals as

ek 'one' (inanimate definite ekə)

de 'two' (inanimate definite dekə)

tun 'three' (inanimate definite tunə)

Apparently Turner and Wikipedia transcribed the Sinhala spellings for the inanimate definite forms without distinguishing between the two possible inherent vowels [a] and [ə].

Turner lists OSi. (Old Sinhalese?) tiṇ and tun. The former resembles Skt triiṇi (n.), but the latter remains a mystery to me.


12.3.8.23:54: SINHALESE V > M?

According to the Wikipedia article on Sinhalese, "formal" navaya 'nine' (cf. Skt nava) corresponds to "contemporary spoken" namaya. Does a v to m shift also occur in other words?

Given that Skt nava is also 'new', I thought that Sinhalese might have a nama-like word for 'new', but sinhala-online.com translated 'new' as nava.

Turner's A Comparative Dictionary of the Indo-Aryan Languages lists words for 'nine' and 'new' with m: e.g., Kalami (Bashkarik) nim 'new' (f.*), num (also nab) 'nine'.

I'm accustomed to shifts in the other direction:

Middle Chinese *m before *u > w or v in some modern Chinese languages**: e.g.,

萬 Middle Chinese *muanh > Mandarin wan, Meixian Hakka van 'ten thousand'

Irish lenited *m > w, vʲ (both spelled mh): e.g., a mháthair 'mother' (voc. sg.; cf. nom. sg. máthair without lenition)

Lenition is more common than fortition, and loss of nasality is more common than gaining nasality.

*3.9.00:15: The masculine is illegible in the online edition: n*lm (sic). I assume *l is an OCR error for a vowel letter.

**3.9.2:48: A major exception is Cantonese which retains *m: e.g., maan 'ten thousand'.

I used to wonder if 旺角 Mong Kok in Hong Kong was an anomalous case of *w > m in Cantonese. The phonetic of 旺 is 王 wong. However, it turns out that the original name of Mong Kok was 芒角 Mong Kok, which was changed to 旺角 Wong Kok in the 1930s, though its English name Mong Kok did not change.

3.9.22:48: Chinese borrowings in Vietnamese seem to be from an early Cantonese-like dialect. 'Ten thousand' was borrowed twice: the older borrowing muôn retains the *m- absent from the newer borrowing vạn. Did earlier Cantonese have a *mw- or *mv- that was borrowed into Vietnamese as *w- or *v- but was simplified to m- in Cantonese?


12.3.7.23:57: 'RUSSIAN LANGUAGE IS.'

That's all that the Sanskrit Wikipedia entry for Russian literally says:

रूसी भाषा अस्ति।

ruusii bhaaṣaa asti (cognate to Latin est, German ist, English is, etc.)

Maybe it's longer by the time you read this. And maybe it's telling that there's no Sanskrit article on Russia.

Does anyone read the Sanskrit विकिपीडिया Vikipiiḍiyaa for information? What's the longest article there? I wouldn't be surprised if there are even shorter 'articles'.

At least there is a Sanskrit Wikipedia article on Russian. There is no such article on the Sinhala Wikipedia, which has 6,550 articles, fewer than the Sanskrit Wikipedia with ७,३७३ लेखाः '7,373 articles'. Which of the two Wikipedias has more readers?
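The Devanagari count can be read mechanically rather than by eye. A minimal sketch, assuming only that Python's built-in int() accepts any Unicode decimal digits (it does); parse_count is a hypothetical helper name of mine:

```python
def parse_count(text: str) -> int:
    """Parse a comma-grouped count written in any Unicode decimal digits."""
    # int() accepts Devanagari digits as well as ASCII ones.
    return int(text.replace(",", ""))


sanskrit = parse_count("७,३७३")  # count shown on the Sanskrit Wikipedia
sinhala = parse_count("6,550")   # count shown on the Sinhala Wikipedia
print(sanskrit, sinhala, sanskrit > sinhala)
```

The script prints both counts as ordinary integers, followed by whether the Sanskrit figure is the larger one.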

I visited the Sanskrit Wikipedia to see what its adjective for 'Russian' is. भाषा bhaaṣaa 'language' is feminine, so रूसी ruusii must be the feminine nominative singular, and I assume that रूसस् ruusas and रूसम् ruusam are the corresponding masculine and neuter (cf. the Latin noun endings -us and -um).


12.3.6.4:41: SIIRILIK

is Hindi for 'Cyrillic', judging from the title of this Wikipedia article. Why is the first vowel ii instead of i? I assume that the word was borrowed from English, and a careful English pronunciation of Cyrillic is [sɪrɪlɪk], not *[siirɪlɪk].

The article has a table of Cyrillic-Devanagari correspondences with the following oddities:

1. Cyrillic а <a> and и <i> are transliterated with both short and long Devanagari vowel symbols. What determines whether a short or long symbol is used? The Hindi Wikipedia article on Russian only lists long symbols for those vowels and a short इ <i> for ы (though язык is Devanagarized as यज़ीक् <yaziik>!).

Cyrillic е <je> and ю <ju> are only transliterated with short Devanagari vowel symbols.

Cyrillic ё <jo> (see below), о <o>, у <u>, э <e>, and я <ja> (see below) are only transliterated with long Devanagari vowel symbols.

But vowel length has nothing to do with the differences between all of the above Cyrillic vowel letters!

2. Cyrillic д and т are transliterated with both dental and retroflex Devanagari symbols, even though д and т are dental in Russian. Do the retroflexes reflect the influence of English? (English d and t are alveolar and are Devanagarized as retroflexes.)

3. Cyrillic е <je> is transliterated as ऍ <ɛ> [ɛ], but Cyrillic э <e> is transliterated as ये <ye> [jee] with <y> as well as ए <e> [ee]. The Devanagarizations give the false impression that е and э have different vowels.

4. Cyrillic ё <jo> is transliterated as ये <ye> [jee] rather than यो <yo> [joo].

5. Cyrillic щ <shch> [ɕɕ] is transliterated as the unusual character श़ <ṣ́> rather than श्च <ṣ́c>. This is the one oddity that makes sense.

6. The Cyrillic hard sign ъ is transliterated as ओ <o> [oo] even though it's no longer a vowel in modern Russian and has never represented long [oo] or even short [o].

7. The Cyrillic vowel ы <y> [ɨ] is transliterated as both the consonant य् <y> and the vowel इ <i>.

8. The Cyrillic soft sign ь is transliterated as ए <e> [ee] even though it's no longer a vowel in modern Russian and has never represented long [ee] or even short [e].

Whoever devised the Devanagarization seems unfamiliar with Russian pronunciation.

Given India's "strong strategic, military, economic and diplomatic relationship" with the USSR, I would have expected a much more straightforward Devanagarization of Cyrillic. Does such a system exist?


Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2012 Amritavision