Amaravati: Abode of Amritas

19.8.24.23:57: THE ALTERNATE SCRIPT BUREAU'S KHMER SCRIPT FOR ENGLISH (PART 2)

1. Unlike modern Khmer script, which deviates from the one-symbol-per-sound ideal with two or more consonant letters per consonant phoneme and two or more readings per vowel symbol (see part 1), the Alternate Script Bureau's (ASB) proposal for writing English in the Khmer script has just one symbol per consonant (not including subscript variants) and one reading per vowel symbol. Compare Huffman and Proum's (1983: 31; hereafter H&P) transcription of English for modern Khmer speakers with the ASB system and my own system:

English		H&P		ASB		This site
spelling	IPA	Khmer script	transliteration	Khmer script	transliteration	Khmer script	transliteration
pay	pʰej	ផេ	phe	បែ	pɛ	បេ	pe
pea	pʰiː	ភី	bhī	បី	pī	បី	pī

ASB, my own English-in-Khmer system, and my Khmer transliterations are based on the old values of Khmer characters: e.g.,

ប <p> represents English /p/ (initial [pʰ]) which is quite different from [ɓ], its primary phonetic value in Khmer.
ី <ī> represents English [iː] which is quite different from [əj], its phonetic value in Khmer after *voiceless obstruents.

A Khmer reader would pronounce

ASB បែ <pɛ> as [ɓae], not [pʰej]
my បេ <pe> as [ɓej], not [pʰej]
ASB and my បី <pī> as [ɓəj], not [pʰiː]
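
The three mismatches above follow mechanically from the modern values cited in these posts. Here is a minimal Python sketch of that lookup. The table names and the function modern_reading are my own labels for illustration, not part of ASB's or H&P's systems, and the tables contain only the letters and readings mentioned here (the reading of ែ after *voiced consonants is not given, so it is left out).

```python
# Toy tables built only from values cited in these posts.
# Each consonant letter maps to (modern sound, historical *voicing series).
CONSONANTS = {
    "ប": ("ɓ", "voiceless"),   # <p>
    "ផ": ("pʰ", "voiceless"),  # <ph>
    "ភ": ("pʰ", "voiced"),     # <bh>
}

# Each vowel sign maps a series to its modern reading.
VOWELS = {
    "េ": {"voiceless": "ej", "voiced": "eː"},  # <e>
    "ី": {"voiceless": "əj", "voiced": "iː"},  # <ī>
    "ែ": {"voiceless": "ae"},                  # <ɛ>; *voiced reading not cited
}


def modern_reading(syllable):
    """Read a consonant + vowel-sign syllable with modern Khmer values."""
    consonant, vowel = syllable[0], syllable[1]
    sound, series = CONSONANTS[consonant]
    return sound + VOWELS[vowel][series]


for spelling in ("បែ", "បេ", "បី", "ផេ", "ភី"):
    print(spelling, "[" + modern_reading(spelling) + "]")
```

The first three spellings come out with [ɓ] and warped vowels, whereas the H&P spellings ផេ and ភី come out as the intended English syllables.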

ASB and my system agree most of the time but not all of the time: e.g., the treatment of English [ej]. I'll start covering the differences in part 3.

2. 稚内 <YOUNG INSIDE> Wakkanai is an Ainu name in disguise. Ainu wakka 'water' is written as 稚 <YOUNG> since Japanese waka (not wakka!) is 'young'. And Ainu nai 'river' is written as 内 <INSIDE> since it sounds like Sino-Japanese nai 'inside'. This makes me wonder

- how many other cases of -VCCV- sequences (like wakka) are written with characters normally read with -VCV- sequences (like 稚 waka)

- if any Ainu words are written semantically in Japanese place names: e.g., is wakka 'water' ever written as 水 <WATER>? Is nai 'river' ever written as 川 <RIVER> or 河 <RIVER> or 江 <RIVER>?

Ainu nai 'river' coincidentally resembles Middle Korean nayh 'river'. The resemblance fades if the Middle Korean word is traced to Old Korean 川理 <RIVER ri> *(na?)ri (cf. its Paekche cognate 'stream', recorded in Japanese as 那禮 ~ ナレ nare ~ ナリ nari).

3. Today I was listening to 坪能克裕 Tsubonō Katsuhiro's score for Aura Battler Dunbine (1983-84). Tsubonō's name is spelled as an unusual combination of a native Japanese 坪 <TSUBO> tsubo 'unit of area' with a Sino-Japanese 能 <CAN> nō. Is the name a variant of 坪野 <TSUBO FIELD> Tsubono with a long final vowel?

4. Today I discovered 金芝河 Kim Chi-ha's respellings of the names of what he called 五賊 오적 Ojŏk 'Five Bandits':

Normal hanja	Kim's hanja	Hangul	Pronunciation	Translation
財閥	狾䋢	재벌	chaebŏl	conglomerate
國會議員	𠣮獪狋猿	국회의원	kukhoe ŭiwŏn	National Assembly member
高級公務員	跍礏功無獂	고급공무원	kogŭp kongmuwŏn	high-ranking public official
將星	長猩	장성	changsŏng	general
長・次官	瞕𤠝矔	장차관	changchhagwan	minister and vice-minister

I'm out of time, so I'll comment on Kim's respellings and Brother Anthony of Taizé's translations later.

5. Via Bitxəšï-史 today: Blažek et al.'s Altaic Languages: History of Research, Survey, Classification and a Sketch of Comparative Grammar (2019) can be freely downloaded here (click on "KE STAŽENÍ"). I use 'Altaic' on this site to refer to an areal grouping of languages, but that book treats it as a genetic language family. A quick look at the book leaves me unconvinced.

19.8.23.23:59: THE ALTERNATE SCRIPT BUREAU'S KHMER SCRIPT FOR ENGLISH (PART 1)

Last night I discovered the Alternate Script Bureau's (ASB) proposal for writing English in the Khmer script.

It reminded me of how I came up with a way to write English in hangul when I first learned that alphabet in 1987. Unaware of the wealth of obsolete hangul letters, I recall inventing a letter 巳 for /l/ based on ㄹ /r/. I might have made up other letters as well.

Khmer has so many characters that it's not necessary to invent new ones for English.

Obstruent devoicing and vowel warping conditioned by the *voicing of preceding consonants have resulted in many pairs of homophonous consonant characters on the one hand and vowel characters with double readings on the other: e.g.,

ផ <ph> *ph > [pʰ]
ភ <bh> *bh > [pʰ]
េ <e> *eː > [ej] after *voiceless consonants, [eː] after *voiced consonants
ី <ī> *iː > [əj] after *voiceless consonants, [iː] after *voiced consonants

A Khmer script for English designed for maximum compatibility with modern Khmer script for Khmer would carry over those characteristics. Huffman and Proum's (1983: 31) transcription of English for modern Khmer speakers has those characteristics: e.g.,

ផេ <phe> for pay [pʰej]
ភី <bhī> for pea [pʰiː]

Note how English p [pʰ] has to be written differently depending on the following vowel.
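
To make that dependency concrete, here is a small Python sketch that runs the lookup in reverse: given a target pronunciation, it searches for the consonant + vowel spellings whose modern reading matches. The names CONSONANTS, VOWELS, and spellings are hypothetical labels of mine, and the tables hold only the four characters listed above, so this is an illustration of the principle rather than H&P's actual procedure.

```python
from itertools import product

# Consonant letter -> (modern sound, historical *voicing series)
CONSONANTS = {
    "ផ": ("pʰ", "voiceless"),  # <ph>
    "ភ": ("pʰ", "voiced"),     # <bh>
}

# Vowel sign -> modern reading per series
VOWELS = {
    "េ": {"voiceless": "ej", "voiced": "eː"},  # <e>
    "ី": {"voiceless": "əj", "voiced": "iː"},  # <ī>
}


def spellings(target):
    """Return every consonant + vowel spelling read as `target` in modern Khmer."""
    matches = []
    for consonant, vowel in product(CONSONANTS, VOWELS):
        sound, series = CONSONANTS[consonant]
        if sound + VOWELS[vowel][series] == target:
            matches.append(consonant + vowel)
    return matches


print(spellings("pʰej"))  # the syllable of English pay
print(spellings("pʰiː"))  # the syllable of English pea
```

With these tables each English syllable has exactly one match, and the two matches use different consonant letters for the same [pʰ].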

ASB takes a simpler approach which I'll describe next time.

(8.24.0:12: Huffman and Proum probably have no transcriptions like ភេ <bhe> or ផី <phī> because the syllables [pʰeː] and [pʰəj] do not exist in American English.)

19.8.22.23:57: KHMER INDEPENDENT VOWEL LIGATURES

I didn't figure out that these Khmer independent vowel characters were ligatures until two days ago:

ឨ <°û> (= ឧក <°uka> /ʔok/) < ឧ <°u> + ក <ka> (<ˆ> symbolizes the upper 'hair' stroke of <ka>)

ឪ <°ǔ> (= ឩវ <°ūva> /ʔəw/) < ឧ <°u> (not ឩ <°ū>!) + វ <va> (<ˇ> symbolizes the upper 'hair' stroke of <va>)

Duh. Two mysteries down, more to go:

Why was the now-obsolete ligature ឨ <°û> created? One might guess there was a high-frequency word /ʔok/ (< earlier /ʔuk/). There is no likely candidate for such a word in modern Khmer. Here are all the meanings of ឧក <°uka> ~ អុក <ʔuka> /ʔok/ in Headley's dictionary at SEAlang:

1. 'bellyband, cinch, girth (of a harness)'

2. 'to reproach, blame, censure; to scold; to abuse, to criticize severely'

3a. 'to slam something down; to fall down hard'

3b. 'check, checkmate'

4. 'kind of vulture'

None of those words would seem to be frequent enough to motivate the creation of a ligature for them.

However, there is an Old Khmer word /ʔuk/ 'also' (by coincidence resembling Dutch ook 'also'!) that Jenner transliterates as <ukk> ~ <uk> ~ <ukka>. So I'm guessing that's the word that was once represented by ឨ; once that word became obsolete, its ligature vanished along with it.

What I don't understand about Jenner's transliteration system for Old Khmer is when he chooses to write final <-a>. He and Sidwell do not explain this in their Old Khmer Grammar. I don't know how <ukk> differed from <ukka> in the original script. (I confess I have only seen Old Khmer in transliteration.) Jenner's <ukka> is presumably ឧក្ក <°ukka>, but what is Jenner's <ukk>? Is it ឧក្ក៑ <°ukk·> with a virāma? In Old Khmer Grammar, final <-a> is only in Indic loanwords with the exceptions of CV syllables (ka 'clause conjunction', ta 'subordinating conjunction', sa 'white'). Did Old Khmer scribes carefully write virāmas all over the place - a practice abandoned in modern Khmer?

(Wikipedia says the Khmer virāma is "mostly obsolete". Huffman [1970: 53] says it is "sometimes used in the transcription of Sanskrit words"; his exercises do not mention it at all. I do see it in វេយ្យាករណ៍សំស្ក្រឹត <veyyākaraṇa˟ saṁskrïta> 'Sanskrit Grammar' [1999].)

I have wondered if Jenner simply omits <-a> in transliterations of native Khmer words ending in consonants, but that does not explain cases like <ukk> ~ <ukka> or

<ʼāyatta> ~ <ʼāyatt> 'dependence' < Skt āyatta-

If I were right, <ukka> and <ʼāyatt> shouldn't exist, but they do. Is Jenner in fact consistently writing all word-final consonant symbols without any other dependent symbols as <Ca>? If so, then do transliterations like <ukk> and <ʼāyatt> reflect spellings of words before consonant initials of other words?

That new hypothesis predicts that the courtesy title <poñ> should always precede a consonant. And yet ... Old Khmer Grammar example 307 begins with

<poñ uy oy kñuṃ ...> (K.557/600N: 1, 612 AD)

poñ Uy give slave

'The poñ Uy has given slaves ...'

I would expect ˟<poña uya oy kñuṃ>. How were those words written in the original? As

(a) four akṣaras without virāmas

បោញុយោយ្ក្ញុំ

<po ñu yo ykñuṁ> (I prefer <ṁ> for anusvāra to <ṃ> which I reserve for Pyu subscript dots.)

(b) four akṣaras with virāmas <·>

បោញ៑ឧយ៑ឱយ៑ក្ញុំ

<poñ· uy· oy· kñuṁ>

(c) something even more bizarre with subscript independent vowel symbols ឧ <°u> and ឱ <°o> instead of the dependent vowel symbols in (a)?

If not for Jenner's transliteration, I would have imagined something with seven akṣaras like

បោញឧយឱយក្ញុំ

<po ña °u ya °o ya kñuṁ>

But Jenner's dictionary doesn't list a spelling <poña> for the courtesy title.

For comparison, a modern Khmer word-for-word translation of the phrase would be

បងឧយឲ្យខ្ញុំ

<pa ṅa °u ya °oya¹ khñuṁ>

without any indication of which <a> are silent.
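
The akṣara counts for spelling (a), the seven-akṣara guess, and the modern translation can be checked with a rough Python sketch. It assumes a simplified rule of my own: a new akṣara starts at any consonant or independent vowel (U+1780 to U+17B3) that is not preceded by the subscript-forming sign U+17D2, and everything else attaches to the akṣara before it. The function name aksaras is hypothetical. Under this rule a consonant followed by a virāma counts as its own akṣara, so the sketch does not reproduce the four-way grouping of spelling (b).

```python
COENG = "\u17D2"  # sign that turns the following consonant into a subscript


def is_base(ch):
    """True for Khmer consonants and independent vowels (U+1780..U+17B3)."""
    return 0x1780 <= ord(ch) <= 0x17B3


def aksaras(text):
    """Split Khmer text into akṣaras under the simplified rule described above."""
    clusters = []
    for i, ch in enumerate(text):
        starts_new = is_base(ch) and (i == 0 or text[i - 1] != COENG)
        if starts_new or not clusters:
            clusters.append(ch)
        else:
            # dependent vowels, signs, and subscripts join the current akṣara
            clusters[-1] += ch
    return clusters


for label, spelling in [
    ("(a)", "បោញុយោយ្ក្ញុំ"),
    ("seven-akṣara guess", "បោញឧយឱយក្ញុំ"),
    ("modern", "បងឧយឲ្យខ្ញុំ"),
]:
    parts = aksaras(spelling)
    print(label, len(parts), " ".join(parts))
```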

If I am right about ឪ <°ǔ> being from ឧ <°u> + វ <va>, I might expect to find earlier spellings of ឪ-words with <°uva>. The most important ឪ-word might be <°ǔbuka> /ʔəwpuk/ 'father'. The earliest attestations of this word that I can find are <°ābbhūka> (1599) and <°ābuka> (1602) which are close to Sanskrit āvuka- 'father'. A regular reflex of <°ābuka>, អាពុក <°ābuka> /ʔaːpuk/, exists today alongside <°ǔbuka> /ʔəwpuk/. How did /aː/ change to /əw/? I don't know of any other instance of such a change.

I think /əw/ is the regular reflex of */əw/ (not */aː/!) after *voiceless consonants. (Is /ʔəwpuk/ < */ʔaːbbuk/ with reduction of */aː/ and lenition of */b/?)

*/əw/ raised to /ɨw/ after *voiced consonants. Compare:

ត្រូវ <trūva> /trəw/ < *trəw 'correct'
នូវ <nūva> /nɨw/ < *nəw 'with'

Do the <ū> spellings reflect earlier vowel length? Conversely, the absence of the extra stroke for vowel length in ឪ <°ǔ> could be interpreted as indicating an absence of earlier vowel length, but I don't think ឪ <°ǔ> and <ūva> had distinct *rhymes. The extra stroke of ឩ <°ū> may have been dropped from the abbreviation ឪ <°ǔ> as redundant since there was no contrast between <uva> and <ūva>. Did a transitional character combining ឩ <°ū> with the 'hair' of វ <va> ever exist?

¹In modern Khmer, ឲ្យ /aoj/ 'to give' has an unusual spelling <°ȯya> (not ˟ឱយ <°o ya>!). My transliteration has no space to indicate that <-ya> is a subscript rather than an independent akṣara យ <ya>. That subscript is unusual because it represents a final glide /j/ rather than a medial glide /j/ (as in ខ្យល់ <khya la'> /kjɑl/ 'wind') or zero (as in ពាក្យ <bā kya> /piəʔ/ 'word' < Skt vākya-).

ឲ <°ȯ> is a rare character that is, as far as I know, unique to ឲ្យ <°ȯya>. I transliterate it as ឲ <°ȯ> to distinguish it from regular ឱ <°o>.

Could <°ȯya> have originated as an abbreviation of ឲយ្យ <°ȯyya> (my guess for what Jenner's 17th century oyya represents)? Or is subscript <y> for final /j/ a remnant of this practice?

"Subscripts were previously also used to write final consonants; in modern Khmer this may be done, optionally, in some words ending -ng or -y, such as ឲ្យ aôy ('give')."

I would like to know which other words have subscript <ṅ> and <y> for codas.

19.8.21.23:41: XIANGNAN TUHUA INTERLUDE: GITHUB JIANGYONG

1. Tonight I discovered sgalal's list of '江永 Jiangyong dialect' readings for 女書 Nüshu 'women's writing' characters on GitHub. This dialect, which I'll call GJY (GitHub Jiangyong), differs yet again from the ones I talked about last night:

Sinograph	Gloss	MC	OXT	Daoxian	Baishuicun	GJY	Mandarin
話	speech	*ɣwæjʰ	fuə³³	xu⁵²	fə³³	fwe⁴⁴	xwa⁵¹
花	flower	*xwæ		xu⁵⁴	fə⁴⁴	fwe⁴⁴	xwa⁵⁵
會	association	*ɣwajʰ		ui²²	uɯ³³ ~ fɯ³³	vwe³³	xwej⁵¹

GJY is written without non-ASCII characters, so I am unsure if w is really ɯ, e is really ə, etc. In any case, GJY is distinct from OXT, Daoxian, and Baishuicun.

All three morphemes are written with one character in OXT Nüshu, but are written with two different characters in GJY Nüshu. Given the mismatches above, I expect many other dialectal differences in Nüshu spelling.
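
As a rough illustration, the following Python sketch groups the three morphemes by reading in each variety, on the assumption (mine) that a syllabic script would ideally use one character per homophone class, tone included. The names READINGS and homophone_sets are hypothetical. OXT is left out because the table above gives an OXT reading only for 話, and for Baishuicun 會 only the first of the two variant readings is used.

```python
from collections import defaultdict

# Readings copied from the table above.
READINGS = {
    "Daoxian": {"話": "xu⁵²", "花": "xu⁵⁴", "會": "ui²²"},
    "Baishuicun": {"話": "fə³³", "花": "fə⁴⁴", "會": "uɯ³³"},  # 會 also fɯ³³
    "GJY": {"話": "fwe⁴⁴", "花": "fwe⁴⁴", "會": "vwe³³"},
}


def homophone_sets(readings):
    """Group sinographs that share an identical reading (segments and tone)."""
    groups = defaultdict(set)
    for graph, reading in readings.items():
        groups[reading].add(graph)
    return list(groups.values())


for variety, readings in READINGS.items():
    sets = homophone_sets(readings)
    print(variety, len(sets), sets)
```

GJY comes out with two homophone classes, matching its two Nüshu characters, while Daoxian and Baishuicun would need three if tones are distinguished in writing.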

I found GJY via nushuscript.org which also has two different hanzi-to-Nüshu converters.

2. Those converters use images to allow people without Nüshu fonts to see Nüshu characters. I'm one of those people. I went looking for a free Nüshu font and found a page about Chelsy Jiayi Wu's NVSHU SANS (V being a common substitution for Mandarin Ü). Some of the issues she mentions are relevant for Tangut, Jurchen, and Khitan as well as Nüshu: e.g.,

"Every handwritten sample reflects the varying styles of its author. Without a history of standardization, it is difficult for me to identify what elements of each character are necessary for letterform identification. Where should a stroke begin and end? Which elements are ornamental and which are absolutely essential? Where should this dot be positioned relative to the stroke?"

Chelsy Wu has an interesting background: "Born in Tokyo, raised in Shanghai", and a triple native speaker of Japanese, Mandarin, and English. (No Shanghainese?) She runs the site Explorations in Global Language Justice.

19.8.20.23:35: OMNIGLOT'S XIANGNAN TUHUA SAMPLE (PART 1: INTRODUCTION)

1. Omniglot has a sample of 女書 Nüshu 'women's writing' characters with readings in an unspecified variety of 湘南土話 Xiangnan Tuhua 'Southern Hunan local speech'. This variety (hereafter 'OXT') is not the same as the 道縣 Daoxian and 白水村 Baishuicun¹ varieties at 小學堂 Xiaoxuetang. Compare their reflexes of Middle Chinese 話 *ɣwæjʰ 'speech':

OXT fuə³³
Daoxian xu⁵²
Baishuicun fə³³

Daoxian and Baishuicun are about 45 km apart. I have no idea how far they are from OXT.

Omniglot gives the local name of Xiangnan Tuhua as [tifɯə] without specifying tones. I suspect [tifɯə] is etymologically 地話 'earth speech', so [fɯə] may be from a fourth variety of Xiangnan Tuhua (OXT2?).

Unless I'm misreading the OXT sample, the OXT reflexes of Middle Chinese 話 *ɣwæjʰ 'speech', 花 *xwæ 'flower', and 會 *ɣwajʰ 'association' are homophonous. But that is not the case with Daoxian and Baishuicun:

Sinograph	Gloss	MC	OXT	Daoxian	Baishuicun	Mandarin
話	speech	*ɣwæjʰ	fuə³³	xu⁵²	fə³³	xwa⁵¹
花	flower	*xwæ		xu⁵⁴	fə⁴⁴	xwa⁵⁵
會	association	*ɣwajʰ		ui²²	uɯ³³ ~ fɯ³³	xwej⁵¹

會 in 會不會 lit. 'would or wouldn't' (hard to translate; more examples here) has another pronunciation in Daoxian: xo⁵². As association-會 and 會不會-會 are usually homophonous, I suspect that ui²² and xo⁵² belong to different strata of Daoxian: at least one may be borrowed, and if both are borrowed, one is newer than the other.

I added a Mandarin column to show how different Xiangnan Tuhua varieties are from it as well as from each other.

That glimpse at Xiangnan Tuhua-internal variation makes me wonder how that variation maps onto Nüshu. I betray my ignorance of Nüshu here with some basic questions:

How diverse are the varieties of Xiangnan Tuhua represented in Nüshu?
Is it possible to determine that Nüshu was originally developed to represent a specific variant? E.g., if Nüshu has a single character to represent 'speech', 'flower', and 'association', that could suggest it was originally developed to represent OXT or a dialect like OXT rather than varieties like Daoxian or Baishuicun in which the three morphemes are not homophones.
Is there internal evidence within Nüshu suggestive of sound changes that occurred since it was first developed? For example, does Nüshu have two homophonous characters that could have represented two different syllables that later merged? Conversely, have there been recent phonemic splits not reflected in Nüshu?

2. Last night - shortly after mentioning Wanzi Gelao - I was horrified to learn of this fake 'Gelao' manuscript. A fake is bad enough; a fake that is simply disguised Chinese is even worse. To pretend that the language that is replacing Gelao is 'ancient Gelao' is tasteless.

¹Five years ago, I wrote a ten-part series on Baishuicun: 1-4 / 5-8 / 9-10.

19.8.19.23:59: THE VELAR NASAL IN MARQUESAN AND TAHITIAN

(I started writing this last night but got distracted and didn't finish before midnight. So I missed a day of blogging for the first time in a long while. But since I barely post anything I write anymore, that lacuna makes no difference for the public. And at least I started writing an entry - I hadn't completely forgotten to blog.)

Three days ago I learned of the reflexes of Proto-Polynesian *ŋ in Marquesan dialects - and tonight I found out that Tahitian has a glottal stop reflex:

Proto-Polynesian	*ŋ	*k	*n	*t
Tai Pi Marquesan	ŋk	k	n	t
South Marquesan	n	ʔ	n	t
North Marquesan	k	k	n	t
Tahitian	ʔ	ʔ	n	t
Hawaiian	n	ʔ	n	k/t
Samoan (formal)	ŋ	ʔ	n	t
Samoan (colloquial)	ŋ	ʔ	ŋ	k

I was surprised by Tai Pi ŋk because

I rarely see a language in which a nasal became a prenasalized voiceless stop (the only other one I know of is Wanzi Gelao in which *ŋ > ŋk)
Unlike Wanzi Gelao in which *n > nt and *m > mp, *n and *m survive intact in North and South Marquesan (and, I imagine, in Tai Pi). (Wanzi Gelao also has nasal reflexes of proto-nasals. Ostapirat [2000: 111] says "It is unclear whether these variants might point to an early distinction or are simply due to dialect mixture.")
ŋk is the only complex consonant I've ever seen in a Polynesian language

I suppose North Marquesan k might be from a Tai Pi-like *ŋk whereas South Marquesan n may be directly from *ŋ.

In Tahitian, *ŋ may have become *ŋk as in Tai Pi and then *k as in North Marquesan. That *k - a merger of earlier *ŋ and original *k - then shifted to ʔ.

In Hawaiian, *k shifted to ʔ, but *ŋ merged with *n (as in South Marquesan) rather than with *k.

In Samoan, *k shifted to ʔ, but n merges with ŋ in colloquial speech. This n > ŋ shift is parallel to the colloquial shift of t > k.

In standard Hawaiian, t shifted to k, but n did not undergo a parallel, Samoan-like shift to ŋ. t survives on Niihau and - at least as of forty years ago - in pockets on Maui and Molokai (Elbert and Pukui 1979 apud Schütz 1994: 116).
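
The mergers described above can be read off the correspondence table mechanically. Here is a minimal Python sketch that does so. The names REFLEXES and mergers are hypothetical, the data are just the rows of the table, and Hawaiian *t is entered as standard k rather than k/t.

```python
# Reflexes of four Proto-Polynesian consonants, copied from the table above.
REFLEXES = {
    "Tai Pi Marquesan":    {"ŋ": "ŋk", "k": "k", "n": "n", "t": "t"},
    "South Marquesan":     {"ŋ": "n",  "k": "ʔ", "n": "n", "t": "t"},
    "North Marquesan":     {"ŋ": "k",  "k": "k", "n": "n", "t": "t"},
    "Tahitian":            {"ŋ": "ʔ",  "k": "ʔ", "n": "n", "t": "t"},
    "Hawaiian":            {"ŋ": "n",  "k": "ʔ", "n": "n", "t": "k"},  # standard; table has k/t
    "Samoan (formal)":     {"ŋ": "ŋ",  "k": "ʔ", "n": "n", "t": "t"},
    "Samoan (colloquial)": {"ŋ": "ŋ",  "k": "ʔ", "n": "ŋ", "t": "k"},
}


def mergers(reflexes):
    """Return {reflex: [proto-consonants]} for reflexes shared by two or more."""
    by_reflex = {}
    for proto, reflex in reflexes.items():
        by_reflex.setdefault(reflex, []).append("*" + proto)
    return {r: protos for r, protos in by_reflex.items() if len(protos) > 1}


for language, reflexes in REFLEXES.items():
    print(language, mergers(reflexes) or "no mergers among these four")
```

Only Tai Pi and formal Samoan keep all four consonants apart. *ŋ merges with *k in North Marquesan and Tahitian, and with *n in South Marquesan, Hawaiian, and colloquial Samoan.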

Tangut Yinchuan font copyright © Prof. 景永时 Jing Yongshi
Tangut character image fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2019 Amritavision