1. Fonts are the blood of the digital world. We can't read on machines without them. And all fonts have typefaces. As ZURB puts it,

A font is a container of type.


A typeface is the design of a set of characters — letters, numbers and punctuation.

In other words, fonts can't be empty containers. There are no fonts without designs. And as the Department of Typography & Graphic Communication at the University of Reading explains, typeface designs

rely on a deep web of historical, cultural, and technical understanding, as well as plain-old form-making skills. From the impact of traditional forms of writing, the developments in the technologies of type-making and typesetting, the typeface designer needs to be aware of how texts are transmitted and shared in each society, and respond to the editorial practices and conventions of each market.

That university's MA Typeface Design (MATD) program trains students to tap into that "deep web", to be capable of producing scholarly knowledge as well as applying that knowledge to the practical task of creating beautiful texts in many scripts.

I've been slowly reading Zachary Quinn Scheuren's MATD dissertation "Khmer Printing Types and the Introduction of Print in Cambodia: 1877-1977".

I found that last night while trying to Google whether Franklin Huffman's coauthor Im Proum's surname was Im or Proum. Google Scholar treats Proum as the surname, but I still don't really know.

2. This morning I found this long list of samples of predigital Khmer script at khmerfonts.info whose front page is a list of samples of digital Khmer fonts.

And tonight I found khmerfonts' page on how to make a Khmer font. (But see my next entry!)

3. Not long ago I wrote a post showing my ignorance of Old Khmer script. I no longer have an excuse to be in the dark. This morning I found SEAlang's Old Khmer images page which allows me to see Old Khmer texts. But at the moment I can't figure out how to get these features to work:

Clicking an image will load either:

depending on which button in the upper right is blue.

But maybe I'm just a prisoner of my computer illiteracy (exemplified by the deliberately primitive design of this website - my philosophy is not to do anything I don't understand ... which doesn't explain the innumerable forays into the unknown [for me] on this blog ... so maybe that's not my philosophy).

Fortunately, I can check my readings of the texts using the Corpus of Khmer Inscriptions. Unfortunately, only inscriptions with Jenner's readings are up, so I'm out of luck with K.27 which isn't one of them. Looking at K.28, I see virāmas which look like superscript dashes. When did they fall out of favor in Khmer?

4. I've also been slowly reading Meredith McKinney's "Classical Prose" in The Routledge Handbook of Literary Translation (Kelly Washbourne and Ben Van Wyke, eds.). McKinney mentions Flora Best Harris, an early translator of Japanese into English. I wonder how Harris learned Japanese in those days.

THE ALTERNATE SCRIPT BUREAU'S KHMER SCRIPT FOR ENGLISH (PART 14)

Indic scripts generally have two types of vowel symbols:

The Khmer script has both types of symbols, though they are not quite used the way one might expect:

The Alternate Script Bureau's (ASB) proposal for writing English in the Khmer script uses the independent vowel symbols for many (not all) vowels, even though it would be possible to write all English word-initial vowels as <ʔa> + vowel character combinations. Some of the phonemic assignments of independent vowel symbols surprised me:

independent vowel symbol
modern Khmer
dependent vowel symbol transliteration modern Khmer
after *voiceless
after *voiced

<ā> (not the inherent vowel <a>!)
<ʔa"> -

<ʔa'> -

<a'> -

ʔə, ʔəj, ʔɨ
<i> e, ə i, ɨ /ɪ/

<ī> əj


ʔo, ʔu, ʔao
<u> o

ʔou, ʔuː ុអ់ <u'> (not <°ū>!) -


ej, ə eː, ɨ /ɛ/
<°ai> ʔaj
<°aiʔa'> -
<aiʔa'> -




<°o₂ḥ> -



<°au> ʔaw
ឳអ់ <°auʔa'> -
<auʔa'> -
<yu> (not <yū>!) ju

អឹអ់ <ʔïʔa'> -
ឹអ់ <ïʔa'> -

<va> (not <va'>!) vɔː ុះ

(9.7.21:59: Added the next six paragraphs and greatly expanded the table above.)

I use hyphens to indicate that <'> and <"> have no sound values of their own:

Other hyphens indicate that a symbol combination is not used in Khmer as far as I know.

Note how the phonemic assignments of ASB independent vowels and their dependent counterparts do not always match: e.g., independent <°ǔ> corresponds to dependent <ū> since Khmer has no dependent <ǔ>.

Some ASB dependent vowels have no independent counterparts. I presume they are written as <ʔa> + vowel character combinations.

ASB takes advantage of the existence of two <°o>-characters to assign them to different vowels.

Once again (see part 11), ASB <ḥ> represents /j/.

ASB regards <yu> and <va> as independent vowel symbols.

¹9.7.21:49: Khmer has two homophonous independent vowel symbols for <°o>. I transliterate them as ឱ <°o₁> and ឲ <°o₂>. Their Unicode names are KHMER INDEPENDENT VOWEL QOO TYPE ONE and KHMER INDEPENDENT VOWEL QOO TYPE TWO. Huffman (1970: 118) says ឱ <°o₁> "is the more common of the two", so I'm not surprised by the numbers in the Unicode names.
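The Unicode names can be checked with Python's standard unicodedata module. This is only a small sketch: it assumes the two characters are the code points U+17B1 and U+17B2, and the names `OO_CHARS` and `describe` are made up for illustration.

```python
import unicodedata

# Assumption: the two independent vowel characters in this footnote
# are U+17B1 and U+17B2. Escapes are used so the code does not depend
# on Khmer font support in the editor.
OO_CHARS = ("\u17b1", "\u17b2")


def describe(ch):
    """Return a 'U+XXXX NAME' string for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Look up both characters and print one line per character.
descriptions = [describe(ch) for ch in OO_CHARS]
for line in descriptions:
    print(line)
```

The printed names come from whatever version of the Unicode database ships with the local Python.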

²I was taught ឲ្យ <°oya> which is the main spelling in the online editions of Headley's dictionaries and the only spelling given in Huffman's 1970 textbook and Jacob's 1974 dictionary. Ehrman's grammatical sketch in Contemporary Cambodian has ឲយ <°oya> (with full-sized rather than subscript <ya>) as the only spelling. Has the regular spelling <ʔoya> become popular in recent years?

³9.7.12:54: I think /ɛɪ/ in the ASB key to independent vowel symbols should be /eɪ/ as in the ASB key to dependent vowel symbols.

THE ALTERNATE SCRIPT BUREAU'S KHMER SCRIPT FOR ENGLISH (PART 13)

The Alternate Script Bureau's (ASB) proposal for writing English in the Khmer script is based on a nonrhotic dialect. Thus it has symbols for vocalic sequences corresponding to /Vr/-sequences in rhotic dialects:

This site
example word
Khmer script transliteration Khmer script transliteration Khmer script transliteration
ីរ <īra>
ូរ? <ūra>? ុះ
<āyïra>? ៃអ់

យើរ <yīera> ឹអ់

Question marks indicate my guesses for sequences I couldn't find in Huffman and Proum (1983). (here is on p. 43 and cure is on p. 44 of H&P.) The spelling <īe> represents [əː] after *voiced consonants in Khmer (e.g., <y>). Does Huffman pronounce cure as [kʰjəːɹ]?

I'm surprised there's no ASB symbol for /ɛə/ as in square. Perhaps the ASB dialect has no /ɛə/. Did it shift /ɛə/ to /ɪə/? The ASB /ə/-vowel subsystem is almost symmetrical except for the lack of a /jɪə/:

/ɪə/ /ʊə/
/aɪə/ /aʊə/

<ḥ> has no consistent function in the ASB system; it corresponds to /ə/ above and to /j/ in <uaḥ> for /ɔj/.

I would have expected /ə/ to be <uʔa'> instead of <uḥ>. (<ua> isn't available because ASB already assigned that to /ɔ/.)

Modern standard Khmer is also nonrhotic. However, unlike nonrhotic English varieties, *-r has been lost without a trace in modern standard Khmer: e.g.,

(Examples from Huffman 1970: 20.)

I used to think there were a few exceptions ending in <-ăra> and Sanskrit <-arCa>: e.g.,

(Examples from Huffman 1970: 50.)

I regarded the final [ə] as a trace of /r/, but it's not - [ɔə] is the regular reflex of short *a (via *ɔ) after *voiced consonants and before *nonvelar codas. Khmer words could not end in *short vowels. It seems that *ɔ-breaking occurred before *-r was (recently?) lost:

modern spelling: ជ័រ, ធម៌, មាន់
stage 1
stage 2: *a-raising after *voiced consonants: *ɟɔr, *dhɔr, *mɔn
stage 3: *ɔ-breaking: *Cɔər, *Chɔər, *mɔən
stage 4: *r-loss: [cɔə], [tʰɔə], [mɔən]
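The ordering argument can be tried out with a toy rule cascade in Python. This is a rough sketch under loud assumptions: the stage 1 inputs with *a (*ɟar, *dhar, *man) are inferred from the description of stage 2 rather than given, obstruent devoicing is left out entirely, and the function names and the tiny lists of initials and codas are invented for these three words only.

```python
# Assumption: only the initials and codas of the three example words.
VOICED_INITIALS = ("ɟ", "dh", "m")
NONVELAR_CODAS = ("r", "n")


def raise_a(form):
    """Stage 2: *a > *ɔ after a *voiced initial."""
    for initial in VOICED_INITIALS:
        if form.startswith(initial + "a"):
            return initial + "ɔ" + form[len(initial) + 1:]
    return form


def break_o(form):
    """Stage 3: *ɔ > *ɔə before a *nonvelar coda."""
    for coda in NONVELAR_CODAS:
        if form.endswith("ɔ" + coda):
            return form[:-len(coda)] + "ə" + coda
    return form


def lose_r(form):
    """Stage 4: final *-r is dropped."""
    return form[:-1] if form.endswith("r") else form


def apply_rules(form, rules):
    """Apply the rules in order and return every intermediate form."""
    stages = [form]
    for rule in rules:
        stages.append(rule(stages[-1]))
    return stages


# Breaking before *r-loss (the order proposed above).
attested_order = apply_rules("ɟar", [raise_a, break_o, lose_r])
# *r-loss before breaking: the coda that conditions breaking is gone.
reversed_order = apply_rules("ɟar", [raise_a, lose_r, break_o])
print(attested_order)
print(reversed_order)
```

With *r-loss ordered first, the toy derivation never breaks the vowel, which is the reason for placing *ɔ-breaking earlier.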

9.6.0:10: I have left the consonants for 'resin' and 'dharma' unspecified in stage 3 since I do not know whether obstruent devoicing preceded or followed stage 3.

THE ALTERNATE SCRIPT BUREAU'S KHMER SCRIPT FOR ENGLISH (PART 12)

1. Here are the last two vowel symbols¹ in the Alternate Script Bureau's (ASB) proposal for writing English in the Khmer script with their counterparts in Huffman and Proum (H&P; 1983) and my own preferences:

This site
example word
Khmer script transliteration Khmer script transliteration Khmer script transliteration




<ï> យូ

In modern Khmer, <o> is pronounced [ao] after *voiceless consonants and [oː] after *voiced consonants. H&P must have the first phonetic value in mind.

In modern Khmer, <au> is pronounced [aw] after *voiceless consonants and [ɨw] after *voiced consonants. H&P must think English /aw/ is closer to Khmer [ao] than Khmer [aw].

H&P and I have the historical sound values of Khmer symbols in mind. In earlier Khmer, there was no [ao], so <au> would have been the best choice for English /aw/.

H&P do not have a special symbol for /juː/, so I speculate they would write /juː/ with their symbols for /j/ and /uː/.

ASB uses the short neutral (i.e., nonpalatal and nonlabial) vowel symbol ឹ <ï> for the palatal-labial sequence /juː/ even though <ï> is pronounced [ə] after *voiceless consonants and [ɨ] after *voiced consonants in Khmer.

9.5.0:29: The logic here seems to be that a simple, common Khmer symbol is preferred to a symbol sequence for a common English phoneme sequence.
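The series-dependent readings mentioned above for <o>, <au> and <ï> can be put in a small lookup table. This is just an illustrative Python sketch: the table holds only those three vowels, and the names `READINGS` and `reading` are hypothetical.

```python
# Modern Khmer readings of three vowel symbols, keyed by the series
# (*voiceless or *voiced) of the preceding consonant, as given above.
READINGS = {
    "o": {"voiceless": "ao", "voiced": "oː"},
    "au": {"voiceless": "aw", "voiced": "ɨw"},
    "ï": {"voiceless": "ə", "voiced": "ɨ"},
}


def reading(vowel, series):
    """Return the modern reading of a transliterated vowel symbol.

    `series` must be 'voiceless' or 'voiced'; unknown vowels or
    series raise a KeyError instead of guessing.
    """
    return READINGS[vowel][series]


# Show both readings of each symbol.
for vowel in READINGS:
    print(vowel, reading(vowel, "voiceless"), reading(vowel, "voiced"))
```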

¹From a rhotic speaker's perspective. ASB is designed for nonrhotic English, as part 13 will make clear.

2. On Sunday I learned of three martial arts that originated in Hawaii. They all have interesting names that I could call 英制和語 <ENG MAKE JPN WORD> Eisei wago 'Japanese words made by English speakers' or 布制和語 <HI MAKE JPN WORD> Fusei wago 'Japanese words made in Hawaii²' - terms intended to sound like the actual term 和製英語 <JPN MAKE ENG WORD> Wasei eigo 'made-in-Japan English words':

2a. カジュケンボ Kajukenbo is from 空手 <EMPTY HAND> karate + 柔道 <SOFT WAY> jūdō + 拳法 <FIST METHOD> kenpō 'martial arts' (see 2b below) + boxing. Note how the long vowel of jū is absent from Kajukenbo. It could be spelled in kanji as 空柔拳法菩 'bo(dhisattva) of the empty and soft martial arts'.

2b. 唐法拳法 Kara-ho Kempo looks redundant in kanji:

Kara is the archaic Japanese word for continental Asia (China and Korea; the word is ultimately cognate to Korea). Here it is written as <TANG> (i.e., Tang dynasty) to specify that Kara refers to China rather than Korea.

法 <METHOD> is read as hō in most contexts (but see below). Kara-hō is presumably 'Chinese method'³.

拳 <FIST> ken (pronounced [kem] before p-) in Japanese is homophonous with 劍 <SWORD> ken, so 劍法 <SWORD METHOD> 'swordsmanship' (now spelled 剣法 in Japan) is also kenpō (or kempō if one prefers to romanize phonetically). That is not a case of 50/50 ambiguity, though. In Google, 拳法 kenpō 'martial art' outnumbers 剣法 kenpō 'swordsmanship' by a ratio of almost 32 : 1 (1.81 million to 57,000).

法 <METHOD> appears again at the end but is read as pō after ken. 法 was originally borrowed with initial p- in Japanese, but that p- was weakened to h- except in the clusters -np- and -pp-.

Tonight I was puzzled by "DIAN HSUHE" on the official Kara-Ho shield until I figured it referred to Mandarin 點穴 diǎn xué <POINT HOLE>, a.k.a. the 'touch of death'. "HSUHE" is from the Wade-Giles romanization hsüeh with the letters of eh reversed.

2c. 檀山流 Danzan-ryū 'Sandalwood Mountain School' contains a Japanization of Chinese 檀山 'Sandalwood Mountain' (Taansaan in the Cantonese spoken by most Chinese here), an archaic name for Hawaii unknown in Japanese.

I just realized that sandal- in sandalwood looks like an Anglicization of Sanskrit candana- 'sandalwood'. (Middle Chinese 檀 *dan is an abbreviation of 栴檀那 *tɕiendanna, a borrowing of candana-.) It's not - Wiktionary shows that the Europeanization of candana- occurred much earlier in Greek which borrowed the word as σάνδανον sándanon. (Latin in turn borrowed the Greek word as sandalum with an unexpected -l-. Perhaps the word was remodelled after the similar-sounding but unrelated word sandalium, the source of English sandal.)

²9.5.0:27: 布 fu is short for 布哇 Hawai 'Hawaii' which looks as if it should be read Fuai: i.e., the sum of its parts 布 fu and 哇 ai. I've never been able to explain how Hawai came to be spelled 布哇. Usually mysteries of this type can be solved by reading the kanji in Mandarin (i.e., the spelling is imported from Chinese), but 布哇 isn't in use in Chinese (the Chinese name for Hawaii is 夏威夷), and as far as I know, 布 is not read ha in any language.

³唐法 Kara-hō is an invented 湯桶 yutō-style collocation unique to this proper noun. If I didn't already know that noun, I would read it as Tōhō Kenpō with the Sino-Japanese reading for 唐, since two-kanji words are mostly read with two Sino-Japanese readings, often even from the same stratum of borrowing.

3. I can't remember anymore if I ever wrote a guide to how I assign grades to Tangut syllables, so here goes:

In general, I follow Gong Hwang-cherng's grade assignments though I do not use his notation:

How do I determine whether Gong's -j- corresponds to my -3 or -4?

STEP 1: Is the j-rhyme listed twice in Gong's reconstruction? For instance, Gong reconstructs both rhyme 10 and rhyme 11 as -ji.

If the rhyme is listed twice (like rhyme 10/11 -ji), go to step 2. If not (like rhyme 62 -jụ), go to step 3.

STEP 2: If there are two j-rhymes that Gong reconstructs identically, I assign Grade III to the first rhyme and Grade IV to the second: e.g., Gong's rhyme 10 -ji is my -i3 and his rhyme 11 -ji is my -i4.

STEP 3: If Gong only reconstructs a j-rhyme once, I assign grades mechanically depending on the initial. I assign Grade III if the initial is

All other j-syllables with a nonduplicate j-rhyme have Grade IV.
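The three steps can be sketched as a short Python function. This is a hedged illustration, not a full implementation: the duplicate-rhyme table holds only the one pair named above (rhymes 10 and 11), the set of Grade III initials (v-, l-, zh- and the class VII initials) is an assumption taken from the group of initials this post associates with Grade III, and all names are hypothetical. It is meant for syllables with Gong's -j- only.

```python
# Step 1 data. Assumption: only the pair named in the text is listed;
# a real table would cover every rhyme Gong reconstructs twice.
DUPLICATE_J_RHYMES = {10: "first", 11: "second"}

# Step 3 data. Assumption: the initials treated as Grade III.
GRADE_III_INITIALS = {"v", "l", "zh"}


def assign_grade(rhyme_number, initial, is_class_vii=False):
    """Return 'III' or 'IV' for a syllable with Gong's -j-.

    rhyme_number: number of the rhyme in Gong's reconstruction.
    initial: romanized initial consonant (hypothetical spelling).
    is_class_vii: True if the initial belongs to class VII.
    """
    # Steps 1 and 2: a rhyme reconstructed twice gets its grade
    # from its position in the pair.
    position = DUPLICATE_J_RHYMES.get(rhyme_number)
    if position == "first":
        return "III"
    if position == "second":
        return "IV"
    # Step 3: otherwise the initial decides.
    if is_class_vii or initial in GRADE_III_INITIALS:
        return "III"
    return "IV"


print(assign_grade(10, "k"))  # rhyme 10 is the first of a pair
print(assign_grade(62, "l"))  # rhyme 62 is not duplicated
```

The initial only matters when the rhyme itself does not settle the grade, so the duplicate-rhyme lookup comes first.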

That assignment is not arbitrary; it follows the general pattern of initials in syllables to which I assigned Grade III and IV according to the methodology in step 2.

That pattern seems to be phonetically motivated. Grade IV was apparently more palatal than Grade III, and the initials associated with Grade III may have been 'antipalatal': v-, l- (phonetically velar or velarized?), and the class VII initials and zh- (phonetically retroflex?).

9.5.2:40: I am reminded of Polish which has retroflex consonants with Tangut parallels:





Polish nonpalatalized velarized [ɫ] became [w] in standard Polish (but is retained in some dialects). Tangut l- and v- could have been like Polish [ɫ] and [w].

The nonpalatalized [l] ~ [w] alternations of Ukrainian and Belarusian also come to mind:

The masculine forms originally ended in *-l.

In all of the above Slavic languages, a lateral and [w] originated from nonpalatalized *l, whereas in Tangut, l- and v- are distinct initial phonemes with distinct histories. I do not intend to draw any deep parallels between Slavic and Tangut. I cite Slavic merely to show how a lateral and [w] can be phonetically similar enough so that one can change into the other. l- and v- must have been phonetically similar in Tangut too.

As for why l- and v- behave like the retroflexes, I am reminded of the unetymological -w- after some Mandarin retroflexes: e.g., in 霜 shuāng [ʂwaŋ] 'frost' < Late Middle Chinese *ʂaŋ. And Wikipedia agrees with my perception of English /tʃ, dʒ, ʃ, ʒ/ as "often slightly labialized: [tʃʷ dʒʷ ʃʷ ʒʷ]." So the Grade III consonants are united by some sort of w-ish-ness.

