14.11.29.23:58: ROY ANDREW MILLER (1924-2014)
Today I learned that Roy Andrew Miller had passed away in Honolulu last August. My understanding was that he had been living here since retiring from the University of Washington in 1989. I heard he was using the University of Hawaii library while I was a graduate student in the 90s. But I didn't recognize him because I didn't know what he looked like until tonight. I might have unknowingly walked past the man whose books changed my life.
About twenty-five years ago, I read Miller's The Japanese Language. At the time I was taking introductory linguistics. It was one thing to learn generic principles; it was another to see Miller apply those principles to a language that had always been part of my life. He made abstractions concrete and relevant for me.
I walked away from Chomskyan linguistics after a year, but I became a fan of Miller's, and I moved on to his other books.
I had learned of the hypothesis of a shared ancestor of Korean and Japanese back in high school when I did a term project on Korea. Miller's Japanese and the Other Altaic Languages went even further, introducing me to the idea of 'Altaic', a huge family encompassing Turkic, Mongolic, and Tungusic as well as Korean and Japanese.
I went to graduate school intent on proving the existence of Altaic. However, I actually ended up rejecting the Altaic hypothesis once I studied Old Turkic, Mongolian, and Manchu. To this day I don't think the five 'branches' of Altaic are related; they are certainly part of the same linguistic area, but that does not entail shared ancestry. Neighbors need not be blood relatives.
Although I now disagree with Miller about Altaic, I still owe him a great debt for expanding my horizons beyond Korea and Japan. I knew of the Mongols and Manchus from Chinese history in high school, yet I had never thought about their languages until I read Miller's book on Altaic. Studying Mongolian and Manchu led me in turn to the study of Khitan and Jurchen, two lesser-known languages that fascinate me even now.
Miller's books not only got me interested in the history of Japanese but also inoculated me against the vast mythology that grew around the language in the 20th century. Japan's Modern Myth and Nihongo: In Defence of Japanese appealed to me as a fan of Fight Back! with David Horowitz, James Randi, and John DeFrancis' The Chinese Language: Fact and Fantasy. How I love to see bogus claims debunked!
Linguists shouldn't just talk to themselves; they should reach out to the general public. Miller showed me how. I could understand his books as a teenager without extensive linguistic training. As an adult I struggle* to set the record straight about languages even in casual conversations.
Thank you for everything, RA Miller. Requiescat in pace.
*11.30.4:11: I wrote "struggle" because the burden is on me to make myself understood. I must get my point across without jargon or oversimplification. And it's hard to find a balance between clarity and accuracy.
14.11.28.23:46: WHAT THE *-HƐːK IS GOING ON?
In languages with Chinese-type phonologies, tonal categories 'split' following the loss of a voicing contrast in initial consonants: e.g.,
Before the split: three tones
following *voiced and *voiceless consonants | A | B | C |
After the split: six tones
following consonant that was once *voiceless | A1 | B1 | C1 |
following consonant that was once *voiced | A2 | B2 | C2 |
Examples
All nonimplosive obstruents became voiceless:
*pa A > pa A1
*ba A > pa A2
All sonorants became voiced:
*hma B > ma B1
*ma B > ma B2
Original implosives may have conditioned series 1 or 2 tones depending on the language:
*ɓa C > ba C1 (as in standard Thai)
*ɓa C > ba C2 (as in the Thai of Chiang Mai)
Although consonants may have changed, the series of the tone (1 or 2) gives away the original quality of the consonant.
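The split can be sketched in a few lines of Python. This is only an illustration of the schematic examples above, not a model of any particular language: the function name `tone_split`, the `REFLEXES` table, and the `implosive_series` parameter are all hypothetical, and the table covers only the five initials used in the examples.

```python
# Reflexes of the schematic proto-initials used in the examples above.
# Each entry maps a proto-initial to (modern initial, tone series).
REFLEXES = {
    "p": ("p", 1),   # *voiceless obstruent stays voiceless: series 1
    "b": ("p", 2),   # *voiced obstruent devoices: series 2
    "hm": ("m", 1),  # *voiceless sonorant becomes voiced: series 1
    "m": ("m", 2),   # *voiced sonorant stays voiced: series 2
}


def tone_split(initial, rime, tone, implosive_series=1):
    """Return (modern syllable, tone code) for a proto-syllable.

    implosive_series is an assumption the caller supplies, since
    implosives condition series 1 in some languages and series 2 in others.
    """
    if initial == "ɓ":
        return "b" + rime, f"{tone}{implosive_series}"
    reflex, series = REFLEXES[initial]
    return reflex + rime, f"{tone}{series}"


if __name__ == "__main__":
    print(tone_split("p", "a", "A"))   # *pa A
    print(tone_split("b", "a", "A"))   # *ba A
    print(tone_split("hm", "a", "B"))  # *hma B
    print(tone_split("ɓ", "a", "C", implosive_series=2))  # Chiang Mai type
```

The two outputs for *pa and *ba share the same modern initial; only the tone series keeps them apart, which is the point of the paragraph above.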
The six tones may later partly merge: e.g., Saigon Vietnamese has a five-tone system because tones B1 and B2 merged.
The six tones of Vietnamese
following consonant that was once *voiceless | A1: ngang | B1: hỏi | C1: sắc |
following consonant that was once *voiced | A2: huyền | B2: ngã | C2: nặng |
Certain consonant-tone combinations may be aberrant. Suppose there is a language which never had *voiceless sonorants. So all its sonorants should be followed by series 2 tones. Yet the language occasionally has series 1 tones after sonorants. How is that possible? It turns out the odd sonorant-1 sequences are only in recent loanwords and onomatopoeia that were created after the tonal split. All words, native or otherwise, predating the split follow the regular pattern.
Last week, I wrote about Tai Viet letters which I think were created to handle such anomalous consonant-tone combinations. I thought those combinations might have been in Vietnamese loanwords.
It turns out that Vietnamese itself also has such odd combinations. In Vietnamese, voiceless aspirates and fricatives normally precede series 1 tones: e.g.,
*kʰih > khỉ B1 'monkey' (*-h conditioned B tones)
*haːl > hai A1 'two'
(I will redundantly write the letter-number code for each tone even though the tone is indicated in Vietnamese orthography.)
Exceptions tend to be Chinese loanwords: e.g.,
佛 *but > *fət > Phật C2 'Buddha'
核 *ɣɤek > *ɣɤac > hạch C2 'nucleus'
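Such exceptions can be flagged mechanically from the spelling. The following is a rough sketch, assuming the letter-number codes in the table of six Vietnamese tones above and a deliberately short, illustrative list of voiceless aspirate and fricative initials (kh, ph, th, h); the names `tone_code` and `is_aberrant` are hypothetical. It relies on Unicode NFD normalization to separate the tone mark from the vowel letter.

```python
import unicodedata

# Combining tone marks mapped to the letter-number codes used in this post.
# An unmarked syllable is ngang (A1).
TONE_MARKS = {
    "\u0300": "A2",  # grave: huyền
    "\u0309": "B1",  # hook above: hỏi
    "\u0303": "B2",  # tilde: ngã
    "\u0301": "C1",  # acute: sắc
    "\u0323": "C2",  # dot below: nặng
}

# Illustrative assumption: a partial list of spellings for voiceless
# aspirates and fricatives, not a full inventory.
VOICELESS_INITIALS = ("kh", "ph", "th", "h")


def tone_code(word):
    """Return the tone code of a single Vietnamese syllable."""
    for ch in unicodedata.normalize("NFD", word):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return "A1"


def is_aberrant(word):
    """True if a voiceless aspirate or fricative precedes a series 2 tone."""
    w = word.lower()
    return w.startswith(VOICELESS_INITIALS) and tone_code(w).endswith("2")


if __name__ == "__main__":
    for w in ["khỉ", "hai", "Phật", "hạch", "hạt"]:
        print(w, tone_code(w), "aberrant" if is_aberrant(w) else "regular")
```

Vowel-quality diacritics such as the circumflex in Phật are ignored because they are not in `TONE_MARKS`. The check says nothing about why a word is aberrant; it only finds candidates.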
But what about exceptions that are not obviously Chinese: e.g., hạt C2 'seed'? One could mechanically reconstruct voiced aspirates and fricatives to account for them: e.g., *ɦat. In fact, that is what Thompson (1976: 1185) did; he reconstructed Proto-Viet-Muong *ɦot*. However, my impression is that other Mon-Khmer languages which did not lose a lot of initial consonants do not have voiced aspirates and fricatives. (I can't claim to have seen the phonemic inventory of every Austroasiatic language.) In fact the SEAlang Mon-Khmer Comparative Dictionary (MKCD)'s IPA input method doesn't even offer the option of typing voiced ɦ, implying that the consonant may not occur in the data. It's possible that Vietnamese preserved a *ɦ lost elsewhere, but I'd rather not go that far.
The MKCD doesn't list hạt C2 'seed', but it does list hạch C2 'seeds' which I can't find anywhere else. I thought hạch C2 meant 'nucleus' (as I glossed it above). That Chinese loanword must have entered Proto-Vietic, as the MKCD has Proto-Vietic *-hɛːk 'seed' which is close to Early Middle Chinese *ɣɤek (phonetically *[ɣɤɛk] with vowel height dissimilation that later became even more pronounced after the palatality moved to the coda in *ɣɤac). I suppose the hyphen indicates the presence of a lost prefix. Perhaps that prefix had a voiced initial and was present at the time of the tone split, conditioning a series 2 tone that remained after the prefix was lost.
Conversely, maybe words like
*CV-taːw > *CV-ðaːw > dao A1 'knife'
had prefixes with voiceless initials at the time of the tone split, conditioning a series 1 tone that remained after the prefix was lost.
Returning to hạt C2 'seed', it has no lookalikes outside Vietnamese in MKCD other than Tho hɒːt⁸ with rounding absent in Vietnamese. I assume the even number indicates a series 2 tone. (If I am correct about Thompson's o being a typo, perhaps his Muong Khen hot should be hat.) Hạt resembles hạch, but not as much as the spelling might imply, as hạt has a long vowel and hạch a short vowel. In southern Vietnamese, hạch is pronounced [hat]. Could hạt [hat] be a phonetic respelling of hạch [hat]? That would not account for the difference in vowel length. Are the two words unrelated lookalikes? Is Tho hɒːt⁸ a borrowing from Vietnamese?
*I don't know why he reconstructed *o. I think the o in his data and reconstruction is a typo for a.
14.11.27.23:57: CAN RAM'S HORNS ACCOUNT FOR SINO-KOREAN GRADE II?
Middle Chinese (MC) Grade II rhymes in premodern Sino-Korean usually have the vowel a: e.g.,
-ang in 江 kaŋ 'river'.
-aj in 解 haj 'untie, understand' (from "Stumped by the Sea Camel")
Most of the exceptions* have long puzzled me. They are unlike the corresponding rhymes in Sino-Japanese and Sino-Vietnamese and modern Chinese languages: e.g.,
Sinograph | Premodern Sino-Korean | Sino-Japanese (Kan-on) | Sino-Vietnamese | Cantonese | Mandarin |
隔 | kjək | kaku | cách [kac] | gaak | ge [kɤ] |
庚 | kjəŋ | kau | canh [kaɲ] | gang | geng [kɤŋ] |
界 | kjəj | kai | giái [zaːj] | gaai | jie [tɕjɛ] < *kjaj |
皆 | kʌj | kai | giai [zaːj] | gaai | jie [tɕjɛ] < *kjaj |
更 | kʌjŋ | kau | canh, cánh [kaɲ] | gang | geng [kɤŋ] |
客 | kʌjk | kaku | khách [xac] | haak | ke [kʰɤ] |
Kan-on has a in all six readings.
Sino-Vietnamese has short a before palatals and long a elsewhere.
Cantonese has short a before -ng and long a elsewhere.
Mandarin has either [ɤ] or [ɛ] (both romanized as e).
MC reconstructions** (disregarding tones) don't match those Sino-Korean forms either:
Sinograph | Karlgren | Wang Li | Li Rong | Shao Rongfen | Zhengzhang Shangfang | Pan Wuyun | Pulleyblank | Baxter | This site until recently |
隔 | *kɛk | *kæk | *kɛk | *kɐk | *kɣɛk | *kɯæk | *kjaːk | keak | *kɛk |
庚 | *kɐŋ | *kɐŋ | *kɐŋ | *kaŋ | *kɣæŋ | *kɯaŋ | *kjaːjŋ | kaeng | *kæŋ |
界 | *kăi | *kɐi | *kɛi | *kɐi | *kɣɛi | *kɯæi | *kjaːj | keaj | *kɛj |
皆 | |||||||||
更 | *kɐŋ | *kɐŋ | *kɐŋ | *kaŋ | *kɣæŋ | *kɯaŋ | *kjaːjŋ | kaeng | *kæŋ |
客 | *kʰɐk | *kʰɐk | *kʰɐk | *kʰak | *kʰɣæk | *kʰɯak | *kʰjaːjk | khaek | *kʰæk |
In June, I proposed that MC Grade II was characterized by a medial -ɤ- ('ram's horns') from an earlier emphatic *-ʀˁ-. Ram's horns is the vocalic counterpart of Zhengzhang's medial consonant *-ɣ-.
I think premodern Sino-Korean had three different simplifications of MC *ɤV-clusters:
1. *ɤa > a (ignoring the first vowel; e.g., in 江 *kɤaŋ > SK kaŋ)
2. *ɤe > *e > jə (ignoring the first vowel; e.g., in 界 *kɤej > SK *kej > kjəj)
3. *ɤe > ʌ (ignoring the second vowel; e.g., in 皆 *kɤej > SK kʌj)
Either of the last two could apply to *ɤe.
Two shifts occurred prior to those simplifications in the Chinese source dialect:
The second vowel of *ɤa sometimes assimilated to a following palatal:
*ɤaJ > *ɤeJ (e.g., in 庚 *kɤajŋ > *kɤejŋ > SK *keŋ > kjəŋ and 更 *kɤajŋ > *kɤejŋ > SK kʌjŋ; Tongguk chŏng'un has SK kʌjŋ for both)
Conversely, the second vowel of *ɤe sometimes dissimilated from a following palatal:
*ɤeJ > *ɤaJ (e.g., in 介 *kɤej > *kɤaj > SK kaj)
Notice how two originally homophonous Middle Chinese syllables (界 and 介) became distinct in the source dialect and remained distinct in Korean.
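To see which Sino-Korean outcomes this proposal allows for a given Middle Chinese form, the rules can be applied mechanically. The sketch below is a rough illustration only: the function names are hypothetical, J is limited to j, and the loss of j before ŋ on the *e route is an assumption read off the 庚 example (*kɤejŋ > *keŋ), not a stated rule. Because the two source-dialect shifts are treated as optional, the sketch overgenerates: it lists possible outcomes, not attested ones.

```python
def source_dialect_variants(mc):
    """Apply the two optional shifts before a palatal (J = j here)."""
    variants = {mc}
    if "ɤaj" in mc:
        variants.add(mc.replace("ɤaj", "ɤej"))  # assimilation
    if "ɤej" in mc:
        variants.add(mc.replace("ɤej", "ɤaj"))  # dissimilation
    return variants


def simplify(form):
    """Apply the three Sino-Korean simplifications to one source form."""
    outcomes = set()
    if "ɤa" in form:
        outcomes.add(form.replace("ɤa", "a"))  # 1. *ɤa > a
    if "ɤe" in form:
        e_form = form.replace("ɤe", "e")
        # Assumption based on 庚: j is lost before ŋ on this route.
        e_form = e_form.replace("ejŋ", "eŋ")
        outcomes.add(e_form.replace("e", "jə"))  # 2. *ɤe > *e > jə
        outcomes.add(form.replace("ɤe", "ʌ"))    # 3. *ɤe > ʌ
    return outcomes


def possible_readings(mc):
    """All outcomes reachable from a Middle Chinese form."""
    readings = set()
    for variant in source_dialect_variants(mc):
        readings |= simplify(variant)
    return readings


if __name__ == "__main__":
    for mc in ["kɤaŋ", "kɤej", "kɤajŋ"]:
        print(mc, sorted(possible_readings(mc)))
```

For *kɤej the sketch yields all three attested readings (界 kjəj, 皆 kʌj, 介 kaj), which is the three-way outcome described above.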
11.28.1:04: Standard Mandarin may preserve *ɤ in 隔, 庚, 更, and 客. The [jɛ] of 界 and 皆 is from *ɤej > *ɤaj > *eaj > *jaj.
*11.28.1:20: The one exception that isn't surprising is SK -jo from MC *-jaw < *-eaw < *-ɤaw. This is parallel to SK -o from MC *-ɑw. Korean does not have syllables ending in -Vw.
**11.28.1:08: Forms are from http://ytenx.org/ except for Baxter's, mine, and Karlgren's *kɛk (a correction from GSR).
Strictly speaking, Baxter's MC forms are transcriptions, not reconstructions. They are not meant to be taken literally: e.g., keaj is not [keaj].
14.11.26.23:52: A HEART THAT REPORTED ALL TO MY LORD
Having mentioned the Early Middle Korean poem 悼二將歌 To i chang ka 'A Song Mourning Two Generals' (1120 AD) last night, I thought it might be interesting to examine its first line and demonstrate the problems involved in trying to decipher hyangga (early Korean poetry). The poem is in the hyangchhal script mixing semantograms (red) and phonograms (blue). Purple indicates words written as combinations of semantograms and phonograms. For simplicity I have transliterated all scholars' reconstructions in the same way except for Kim Wan-jin's which reflects the vowel shift hypothesis that I and others have rejected (e.g., Oh Sang-suk 1998 and Ko Seongyeon 2013).
Sinograph | 主 | 乙 | 完 | 乎 | 白 | 乎 | 心 | 聞 |
Premodern Sino-Korean reading | tsyu | ɯr | wan | ho | pʌyk | ho | sim | mun |
Chinese gloss | lord | second Heavenly Stem | complete | question particle; preposition | white, to report (< make clear [i.e., white]) | question particle; preposition | heart | hear |
Late Middle Korean translation equivalent | nim | - | o(ɣ)ʌro | hɯy- 'white', sʌrp- 'report' | - | mʌzʌm | tɯt- | |
Yang Chu-dong (1942) | *nim | *ɯr | *oʌrɣo | *sʌrβ-ɯn | *mʌzʌm-ʌn | |||
Chi Hyŏn-yŏng (1948) | *orɣo | *sʌrp-on | *mʌzʌm-ɯn | |||||
Kim Wan-jin (1980) | *ni(li)m | *ər | *uɔrɣu | *sɔrβ-ən | *mɔzɔm-ɔn | |||
Yu Chhang-gyun (1994) | *nim | *ɯr | *oʌrɣo | *sʌrβ-ɯn | *mʌzʌm-ɯn | |||
This site | (*nilim?) | *(oɣʌr?-ɣ/h)o | *(sʌr?)ɣo(n?) | *(mʌzʌ?)m-Vn | ||||
Gloss | lord | ACC | wholly | report-ed | heart-TOP | |||
Translation | As for the heart that reported all to my lord ... |
1. 主
There is no way to be sure how this word was read. We know that in the Koreanic Paekche language, 'lord' was transcribed in sinographs as 爾林 *ɲi(e) lim which represented something like *n(y)elim or *nilim, cognate to Late Middle Korean nim. Medial -l- survived in nali 'stream' in the Koryŏ kayo 'songs of Koryŏ' (Kim Wan-jin 1980: 211). So it is likely - though not certain - that 主 was *nilim. (As a convention I write the lost liquid of Korean as l to differentiate it from r which was retained.) The Chinese-Korean glossary Jilin leishi only tells us that 主 'lord' was 主 'lord' in Korean, which doesn't help; the informant may have used the Sino-Korean word to try to impress his Chinese interlocutor.
2. 乙
This cannot be a semantogram because the calendrical term 'second Heavenly Stem' makes no sense here. It must be a phonogram for the accusative ending *-ɯr, which surprises me since I would expect 'lord' to be the indirect object.
3. 完乎
The first sinograph is a semantogram whose reading is uncertain.
I don't know why Chi did not reconstruct *ʌ; Late Middle Korean o(ɣ)ʌ did not arise from the breaking of o.
乎 could represent *o (cf. early Sino-Japanese wo from Sino-Paekche), *ɣo, or *ho. Could Late Middle Korean o(ɣ)ʌro be a contraction of Early Middle Korean *o(ɣ)ʌrɣo? Or is 完乎 some unrelated synonymous adverb ending in *-(ɣ/h)o?
4. 白乎
If 白 is a semantogram for *sʌrp- and if lenition took place (if *ɣ was in the previous word - there was no *ɣ before lenition), then perhaps *p lenited to *ɣ before *-o in this dialect. (In standard Late Middle Korean, *p lenited to the β that Yang, Kim, and Yu projected back into Early Middle Korean.)
乎 *o/ɣo/ho is a poor phonetic match for Yang and Yu's *-ɯn and Kim's *-ən. Those reconstructions were influenced by the Late Middle Korean adnominal suffix -ʌn which would be expected in this context. Maybe *-o is an Early Middle Korean ending without a Late Middle Korean descendant. Or Chi is right and the ending was *-on. Its *o could have been reduced to ʌ in Late Middle Korean, and the final *-n might not have been written because it assimilated with the initial m- of the next word: *-on m- > *-om m- (analyzed in writing as <-o.m->). Further examples of <-V.N-> for expected *-VN N-sequences could verify this hypothesis.
5. 心聞
心 is a semantogram for an *-m-final word for 'heart'. If the ɣ-s above are correct - i.e., if lenition took place - then the medial consonant of 'heart' probably already lenited to *-z- (unless this poem was composed after *ɣ-lenition but before *z-lenition).
One might be tempted to regard 聞 as a verb 'hear', but that's not possible since Korean sentences do not end in bare verb stems. I think it represents the final *-m of 'heart' followed by a topic suffix *-ɯn that split into Late Middle Korean -ʌn and -ɯn depending on the height of the vowels of the preceding noun. There was no character with a Sino-Korean reading *mɯn, so 聞 mun was the best available match.
Yang projected Late Middle Korean -ʌn back into Early Middle Korean even though lower mid unrounded *ʌ is not a good match for the high rounded u of Sino-Korean mun. Kim's *-ən has the same problem.
The only way to make Yang and Kim's readings work is to suppose that the scribe had the early Sino-Korean reading *mən for 聞 (cf. Sino-Japanese mon < Sino-Paekche) in mind.
If Chi's -on became Late Middle Korean -ʌn, then perhaps there was a rounded allomorph *-un of the topic particle due to labial harmony after 'heart' (whose Late Middle Korean ʌ might be a reduction of an earlier *a or *o) that was reduced to -ʌn and -ɯn in Late Middle Korean:
*ma/ozom-un > *-ɯn > mʌzʌm-ʌn
(11.27.1:36: *mazam and *mozam would not trigger labial harmony since the vowel closest to the suffix would not be labial. There is no way to be certain about the vowels since Jilin leishi only tells us that 心 'heart' in Korean sounded like Chinese 心 'heart' pronounced like 尋 which sounded like 心 with a different tone.)
I am skeptical about vowel harmony in Old and Early Middle Korean. If it existed, it might have worked differently from that of Late Middle Korean: e.g., the potential labial harmony after 'heart'.
Oddly, Yang and Kim respectively reconstructed *-ʌn and *-ɔn in accordance with vowel harmony after 'heart' but violated vowel harmony by reconstructing *-ɯn and *-ən after 'report' which belonged to the same lower vowel class as 'heart'. On the other hand, Chi and Yu disregarded vowel harmony after 'heart'.
14.11.25.23:56: A NEW BOOK: KIAER'S THE OLD KOREAN POETRY
I didn't know about The Old Korean Poetry: Grammatical Analysis and Translation or its author Jieun Kiaer until today. I would like to see it.
The The in its title is unusual; I'm surprised it wasn't removed in the editing process.
I am also surprised that a syntactician wrote that book. I initially assumed she was a historian whom I had not heard of.
I wonder how she dealt with Old Korean whose hyangchhal script is highly problematic. I have seen eight different complete decipherments, and I would like to see the decipherment in Ryu and Pak (2003). As Lee and Ramsey (2011: 57) wrote,
[... I]nterpretation of the hyangga remains a monumental task. We quite honestly do not know what some hyangga mean, much less what they sounded like.
I am curious to see if she has her own decipherment.
Moreover, since the description only mentions fourteen hyangga (early Korean poems) in Old Korean*, I assume the other twenty poems covered in the book are in (Late) Middle Korean, as only twenty-six** hyangga have survived.
11.26.1:39: I am not sure what this line means:
This book provides linguistic explanations for each poem and essential vocabulary – both in Middle and Contemporary Korean.
What are the criteria for "essential" status? I assume all grammatical morphemes are included.
Does this vocabulary accompany each poem, or is it in an appendix?
There is no mention of Old Korean vocabulary, even though two-fifths of the book is about Old Korean poetry. Is that because there is no universally accepted decipherment of Old Korean?
Are the Contemporary Korean forms translations of the Middle Korean forms?
*11.26.2:24: I am guessing that these are the fourteen Shilla poems preserved in Samguk yusa (late 13th c. AD).
**11.26.2:46: I am counting the Koryŏ poem 悼二將歌 To i chang ka 'A Song Mourning Two Generals' (1120 AD) in the total. Lee and Ramsey (2011: 57) exclude it.
14.11.24.23:54: STUMPED BY THE SEA CAMEL
Last night, I rediscovered the Haitai (해태 Haethae) brand and noticed that the English Wikipedia lists the Chinese characters for it as 海陀 'sea hill', one of the many variants of the name of the xiezhi:
Sinographs | Sino-Korean | Source | Late Middle Chinese | Late Old Chinese | Character 1 gloss | Character 2 gloss | Notes |
獬豸 | 해태 haethae, 해치 haechi | Naver | *xɤaj ʈhɤaj ~ ʈhi | *ɣɤeʔ ɖɤɑjʔ ~ ɖɨɑjʔ | first syllable of 'xiezhi' | worm; crawl like a feline beast or reptile; disperse | no other uses |
獬廌 | 'xiezhi' | ||||||
解廌 | Shuowen* | understand (< 'untie') | |||||
懈怠 | 해태 haethae | Daum | *xɤaj thəj | *ɣɤeh dəɰʔ | slack (cognate to 'untie') | idle | normally 'laziness' |
懈惰 | 해타 haetha | *ɣɤaj thwɑ | *ɣɤeh dwɑjʔ/h | lazy | |||
咳唾 | *xəj thwɑ | *ɣəɰ thwɑjh | cough | spit | normally 'spittle' | ||
海苔 | 해태 haethae | *xəj thəj | *xəɰʔ dəɰ | sea | moss | normally 'seaweed' | |
海陀 | 해타 haetha, 해태 haethae | Wikipedia | *xəj thɑ | *xəɰʔ dɑj | hill; usually a phonetic symbol | no other uses? | |
海駝 | Daum | camel |
The inclusion of Middle and Late Old Chinese readings does not mean all these terms existed in Middle and/or Late Old Chinese. See below.
陀 is normally read 타 tha, not 태 thae, a reading which seems to simultaneously reflect Late Middle Chinese *thɑ (< Early Middle Chinese *dɑ) and Late Old Chinese *dɑj. How is such a mixture of new and old possible? Although *th- ~ *d-variation is possible**, I doubt that Sino-Korean thae reflects an Old Chinese variant reading *thɑj for 陀 in 海陀, as I cannot find any attestation of that word before the History of Liao (1344), centuries after the Middle Chinese period. (How old is the Chinese place name 海陀山?)
Here's what I think happened (revised and expanded 11.25.23:23):
- The word may be a late 1st millennium BC loan from some non-Chinese language, as it has no Chinese etymology and I cannot find any attestations prior to 獬廌 in the Records of the Grand Historian (c. 109 BC). Moreover, all spellings are either partly or completely phonetic.
- 'Xiezhi' developed an abbreviated monosyllabic form 廌 (unless 廌 is a loanword and 獬廌 is a Chinese-foreign hybrid compound 'understanding 廌').
- Some of the phonetic variation may indicate multiple borrowings of the same word from different dialects of the same language or a set of related languages.
- Some of the spellings appear to be puns, and some may be of Korean origin, as I have not seen the last six for 'xiezhi' in a Chinese text.
- The earliest forms had two syllables with voiced initial consonants. Forms that would have been read with one or two voiceless initial consonants may be later spellings coined after voiced initials had devoiced (and often aspirated) in Late Middle Chinese: e.g., it is unlikely that 咳唾 is an old spelling because it would have been pronounced *ɣəɰ thwɑjh with a voiceless aspirate in Late Old Chinese absent from the earliest attested forms.
- The 海駝 'sea camel' spelling could reflect a folk etymology.
- The pronunciation haethae spread to the spellings 海陀 and 海駝 in Korean.
11.25.23:30: The mismatch between the characters 海陀/海駝 and the pronunciation haethae in Korean is reminiscent of the mismatch between the spelling colonel and its pronunciation. Cummings (1988: 449) wrote,
Etymologists do not agree completely on colonel, but whatever the historical dynamics of the word, it is a clear case of mixed convergence, the pronunciation of one, apparently earlier form, coronel, having become attached to the spelling of another.
*11.25.1:51: Obviously Shuowen is not a Korean reference source, but any word in a Classical Chinese text has a Sino-Korean reading.
**11.25.23:33: Late Old Chinese 太 *thɑs 'greatest' and 大 *dɑs 'great' go back to Early Old Chinese *hlats and *lats and share a root *lats. The *hl- of 太 must be from a voiceless prefix plus root-initial *l-.
14.11.23.23:53: AVERAGING THAI SONG TONES
Two nights ago, I wrote,
I have no data on Thai Song, the third language written with Tai Viet, but I expect its *implosives to follow the same [tonal] pattern as Black Tai and White Tai.
In other words, I expected Thai Song tones to tend to be higher after reflexes of *voiced initials (other than *glottals including *implosives) and lower after reflexes of *voiceless and *glottal initials. Hence the heights of the tones would roughly match the names of the consonant letters they were associated with: i.e., HIGH and LOW.
Last night I found Somsonge Burusphat's 2012 compilation of Thai Song tones at twelve locations*. Her paper even includes tonal contours of individual speakers. Here are the average heights of each tone on a five-point scale (1 = lowest, 5 = highest) as described on page 37. I did not include varieties whose tones were only described in words in my calculations. As an example of how I calculated the averages, the Loei A tone is 241, 2 + 4 + 1 = 7, and 7 divided by 3 is 2.33. Then I added 2.33 to 3 (< 24, the average of Donyaihom), 3 (< 24, the average of Dontoom), 3.5 (< 34, the average of Suantaeng), etc. and divided that total by 11 (the number of languages with numerical tone descriptions), resulting in 2.9.
Proto-Tai tone | A | B | C | D |
'low' tone class: *voiceless and *glottal initial | 2.9 | 3.6 | 2.3 | 3.6 |
'high' tone class: *voiced initial | 3.8 | 3.2 | 3.2 | 3.6 |
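The averaging procedure described above can be sketched as follows. The function names are hypothetical, and the sample contains only the four A-tone contours quoted in the worked example, not the full set of eleven varieties, so its mean is not the 2.9 in the table.

```python
def tone_height(contour):
    """Average the digits of a contour such as '241' on the 1-5 scale."""
    digits = [int(d) for d in contour]
    return sum(digits) / len(digits)


def mean_height(contours):
    """Average the per-variety heights of one tone across varieties."""
    heights = [tone_height(c) for c in contours]
    return sum(heights) / len(heights)


# Only the contours quoted in the text; the other seven are not reproduced.
sample_a_tones = {
    "Loei": "241",
    "Donyaihom": "24",
    "Dontoom": "24",
    "Suantaeng": "34",
}

if __name__ == "__main__":
    for place, contour in sample_a_tones.items():
        print(place, contour, round(tone_height(contour), 2))
    print("mean of sample:", round(mean_height(sample_a_tones.values()), 2))
```

Note that a single number per tone discards the shape of the contour: 24 and 42 both average to 3.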
As in Black Tai and White Tai, the *voiced A and C tones are higher than their *voiceless/glottal counterparts, but there is little or no height difference between the *voiced and *voiceless/glottal B and D tones. The *voiced vs. voiceless/glottal distinction correlates with contours that are masked by single-number averages:
Thai Song: usually level (sometimes falling) vs. rising
Black Tai: level vs. rising
White Tai: falling vs. rising
11.24.23:16: Here are the average heights of Thai Song tones at three points followed by their average contours:
Average starting points
Proto-Tai tone | A | B | C | D |
'low' tone class: *voiceless and *glottal initial | 2.3 | 2.5 | 2.5 | 2.7 |
'high' tone class: *voiced initial | 3.3 | 3.3 | 3.7 | 3.6 |
Average mid points
Proto-Tai tone | A | B | C | D |
'low' tone class: *voiceless and *glottal initial | 3 | 3.6 | 2.3 | 3.7 |
'high' tone class: *voiced initial | 4.5 | 3.3 | 3.6 | 3.6 |
Average ending points
Proto-Tai tone | A | B | C | D |
'low' tone class: *voiceless and *glottal initial | 3.5 | 4.6 | 2.1 | 4.6 |
'high' tone class: *voiced initial | 3.5 | 3.1 | 2.1 | 3.5 |
Average contour
Proto-Tai tone | A | B | C | D |
'low' tone class: *voiceless and *glottal initial | 24 | 35 | 32 | 35 |
'high' tone class: *voiced initial | 354 | 33 | 42 | 44 |
The composite *voiced tones always start higher, though this is obscured in the average contour table since 2.5 is rounded up to 3 and 3.3 is rounded down to 3.
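The average contours can be rebuilt from the three tables of averages with a short sketch. Two things here are assumptions rather than anything stated above: halves are rounded up (2.5 becomes 3, 4.5 becomes 5), and the midpoint is written only when it is a peak or a dip, i.e. when it does not lie between the rounded start and end. The function names are hypothetical. Python's built-in round is avoided because it rounds halves to the nearest even number.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with .5 always going up."""
    return math.floor(x + 0.5)


def average_contour(start, mid, end):
    """Build a contour string from average start, mid, and end heights."""
    s, m, e = (round_half_up(v) for v in (start, mid, end))
    # Assumption: drop the midpoint unless it is a peak or a dip.
    if min(s, e) <= m <= max(s, e):
        return f"{s}{e}"
    return f"{s}{m}{e}"


# (start, mid, end) averages copied from the three tables above.
AVERAGES = {
    ("low", "A"): (2.3, 3, 3.5),
    ("low", "B"): (2.5, 3.6, 4.6),
    ("low", "C"): (2.5, 2.3, 2.1),
    ("low", "D"): (2.7, 3.7, 4.6),
    ("high", "A"): (3.3, 4.5, 3.5),
    ("high", "B"): (3.3, 3.3, 3.1),
    ("high", "C"): (3.7, 3.6, 2.1),
    ("high", "D"): (3.6, 3.6, 3.5),
}

if __name__ == "__main__":
    for (tone_class, tone), points in AVERAGES.items():
        print(tone_class, tone, average_contour(*points))
```

Under these assumptions the sketch gives the same eight contours as the table, including the rounding of 2.5 and 3.3 to the same starting point 3.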
I have not tried to average the presence or absence of glottalization in the C tone.
*11.24.23:24: I have excluded the Black Tai data from Vietnam, so all figures here are from eleven locations in Thailand. Although Black Tai and Thai Song lie on eastern and western ends of a spectrum, I was only interested in the tones of Thai Song (i.e., the varieties spoken in Thailand). See this table for Black Tai tones.