Ostapirat (2000) reconstructed only one Proto-Kra etymon with retroflex *dʐ-:

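| Pages of Ostapirat (2000) | Gloss | Ostapirat's Proto-Kra | This site's Proto-Kra | Western Kra: Qiaoshang | Western Kra: Laozhai | Western Kra: Wanzi | Western Kra: Lachi | Southern and Central-Eastern Kra |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 114, 210, 229 | mountain | *dʐu A | (not reconstructible) | zɤu | dʐɯ | tsha | tɕɦi | (no cognates?) |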

I cannot reconstruct this word in Proto-Kra because I can't find any Central-Eastern Kra forms in Ostapirat (2000). The word may only be in Western Kra. Moreover, I can't find any cognates outside Kra. If the word were in Kam-Tai, then it could be a KT loan into Western Kra or (less likely) vice versa. Or it could be a Proto-Kra-Dai word inherited by Proto-Kam-Tai and Proto-Kra but lost in all non-Western branches of Kra.

I tentatively assume the word is a Proto-Western Kra innovation. Since I don't reconstruct retroflexes in earlier stages of Kra, I think the PWK (and Proto-Gelao) form was something like *rV-dzu:

- the Proto-Gelao presyllabic vowel conditioned medial lenition (*-dz- > -z-) in Qiaoshang

- Proto-Gelao *rV-dz- fused into dʐ- in Laozhai

- the Proto-Gelao presyllable was lost without a trace in Wanzi; Wanzi tsh- could also come from *dz-

- PWK *rV-dz- fused into pre-Lachi *dʐ- which then partly devoiced to become Lachi tɕɦ- (whereas PWK *dz- became Lachi tɦ- [or did it?; I'll explain my doubts later])

DID PROTO-KRA HAVE RETROFLEX INITIALS? (PART 8: PROTO-KRA ETYMA WITH RETROFLEX *tʂ)

Ostapirat (2000) reconstructed three Proto-Kra etyma with retroflex *tʂ-. Forms implying a lost presyllable are in bold.

Pages of Ostapirat (2000) Gloss Ostapirat's Proto-Kra This site's Proto-Kra Southwestern Kra Central-Eastern Kra
Qiaoshang Laozhai Wanzi Laha Paha Pubiao
114, 232 pillar *m-tʂu A *N-tsu tɕɯ sa cou dʑhuu tɕau
114, 243 teach *tʂun A *tsun tʂɿ səɯ (no cognate) (no cognate) θuan
245 one *tʂəm A *tsəm (unknown) tʂɿ (no cognate) cam tɕja

I don't think a presyllable is reconstructible for any of these three etyma at the Proto-Kra level.

The Qiaoshang word for 'pillar' implies an earlier presyllable, but its rhyme does not regularly correspond with the other forms. I would expect Qiaoshang zɤu, not which implies an earlier *CV-Ti (*T = an unknown alveolar). So I don't think it's cognate.

The A2 tone of Laha (implying a *voiced initial) and the voiced initial dʑh- of Paha point to an earlier Proto-Kra cluster *N-ts-.

I do not reconstruct *NV-ts- because a presyllabic vowel might have lenited the medial *-ts- in pre-Paha, resulting in ðhuu. I would rather not claim that presyllables lenited pre-Paha alveolars to Paha ð(h)- with the exception of pre-Paha *-ts-. A single rule operating on all pre-Paha alveolars following presyllables is simpler.

Both Guillaume Jacques and I independently noticed that *N-tsu resembles the Chinese word for 'pillar':

柱 Old Chinese (?*N-rtoʔ >) *droʔ > Late Old Chinese *ɖuoʔ

I would expect the Proto-Kra borrowing to be *(N-)duʔ or *(N-)doʔ. (Since it's not clear when *N- disappeared from OC, its preservation in the OC dialect known to PK speakers cannot be ruled out. Ostapirat's [2000] PK reconstruction has no diphthong *-uo, so PK speakers would have to approximate it with *-u or *-o.)

Maybe Proto-Kra borrowed *N-tsu from a southern LOC form like *N-tsuoʔ (< OC *N-toʔ without *-r- and with an alternate, Japanese-like development of *-t- to *-ts- after *-u). However, I have no independent evidence to suggest the existence of such a Proto-Kra-like form, and the correspondence of PK zero to OC *-ʔ remains irregular.

There is no Central-Eastern branch evidence for a presyllable in 'teach', so the lenited initial of Qiaoshang and the retroflexion in Laozhai could reflect a rhotic-initial prefix *rV- added at some point between their common ancestor Proto-Gelao and its ancestor Proto-Southwestern-Kra.

The only evidence for a presyllable in 'one' is in Laozhai. Perhaps a rhotic-initial prefix *rV- was an innovation at some point between pre-Laozhai and Proto-Southwestern-Kra. I assume that -ɿ is the regular Laozhai reflex of Proto-Gelao *-an < PK *-əm after retroflexes, though Ostapirat (2000: 131-132) did not explicitly state that.

I am puzzled by the two different reflexes of PK *ts- in Pubiao: tɕ- and θ-.

| Pages of Ostapirat (2000) | Gloss | Pubiao | Ostapirat's Proto-Kra | This site's Proto-Kra |
| --- | --- | --- | --- | --- |
| 181, 228 | root | tɕaaŋ | *tsaŋ A | *tsaŋ |
| 243 | pillar | tɕau | *m-tʂu A | *N-tsu |
| 245 | one | tɕja | *tʂəm A | *tsəm |
| 243 | teach | θuan | *tʂun A | *tsun |
| 229 | hail | θap | *tsep D | *tsep |

I wonder if one reflex was conditioned by lost presyllables which may not have corresponded to presyllables in other early Kra languages.

THE FURY OF ETYMOLOGY

Boë, Bessière, and Vallée (2003) reject Ruhlen's monogenetic hypothesis on probabilistic grounds:

We demonstrate, by a simple application of probability theory, that the world roots proposed for a Proto-Sapiens language by Merritt Ruhlen in The Origin of Languages are the result of random chance ... The author used too few roots, too many equivalent meanings, too many languages per family, and too many phonological equivalences for a too small number of different phonological shapes ... The null hypothesis, which stipulates that similarities among words of the world languages arise by chance, cannot be rejected since its probability is equal to one!
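The flavor of this argument can be shown with a toy calculation. This is only a minimal sketch and not the model that Boë, Bessière, and Vallée actually used: it assumes that every word is an independent, uniformly random draw from a fixed inventory of phonological shapes, and the function name and all of the numbers are hypothetical.

```python
def p_family_match(n_shapes, n_equiv_shapes, n_meanings, n_languages):
    """Chance that at least one word in a family 'matches' a proposed root.

    Toy assumption (not from the paper): every word is an independent,
    uniformly random draw from n_shapes possible phonological shapes.
    """
    # Probability that one random word counts as a match
    p_single = n_equiv_shapes / n_shapes
    # Each language offers one candidate word per accepted meaning
    n_candidates = n_meanings * n_languages
    return 1 - (1 - p_single) ** n_candidates


# Hypothetical numbers, chosen only to show the trend
strict = p_family_match(n_shapes=1000, n_equiv_shapes=1, n_meanings=1, n_languages=1)
loose = p_family_match(n_shapes=1000, n_equiv_shapes=20, n_meanings=10, n_languages=50)

print(f"strict criteria: {strict:.4f}")
print(f"loose criteria:  {loose:.4f}")
```

Under these assumptions, widening the set of acceptable shapes, meanings, and languages pushes the probability of a chance match toward one, which is the trend the quoted passage describes.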

Do we splitters have any equivalent of la furia dell’etimologia?

We would expect Ruhlen to select world roots without any semantic overlap. But his quest for equivalences led him to extend the meaning of each root. Eco coined the expression 'la furia dell’etimologia' to dub this frantic etymological hunting.

This paper is part of a larger enterprise:

For almost five years, this phenomenon of propagation that Dan Sperber calls 'contagion of ideas' has seemed to us important to analyze. We came up with a project to try to understand why some interdisciplinary fields, such as linguistics, are attracted by theories that are ill founded. We intend to investigate why there is continued propagation of such theories while the proof of their validity has not been established or even that they have been falsified. Our project is called Representation and diffusion of scientific ideas in speech and language sciences.

I'd love to learn more about their project.

.PNG, .PDF, AND .XLS FILES AT STARLING.RINET.RU

Although I've been using the site of the late Sergei Starostin for ten years, I don't keep track of all of the changes since I usually focus on whatever I'm looking for at the moment. So I didn't notice these additions to the databases page until Sunday:

.png language family trees: The most controversial is this tree of 'Borean' and its offshoots (e.g., 'Nostratic' and its unnamed sister [Austric-Dene-Caucasian?]), complete with dates presumably generated through glottochronology.

.pdfs: These fall into four categories:

1. Comparative Swadesh 100-word lists

2. Comparative vocabulary databases (going beyond the Swadesh 100)

3. Merritt Ruhlen's typological database; I haven't used this, as I would rather use WALS and UPSID. Ruhlen and I are at opposite ends of the lumper-splitter spectrum.

.xls files: Same data as the .pdfs but in a more convenient format. Unfortunately, the typology database is full of sinographs where I would expect phonetic symbols.

"STRATIFICATION IN THE PEOPLING OF CHINA"

I found this 2004 paper by Roger Blench on Saturday. For me, it was a nice follow-up to Guillaume Jacques' "The Geography of South Eurasian Languages", though it was obviously written earlier. Below are some highlights and comments.

p. ii: There are "around 200" languages in China, and non-Chinese "minorities constitute some 91,000,000". Here is a list of the 56 officially recognized ethnic groups. Note, however, that not all members of minorities speak their ancestral language: e.g., there were roughly 10 million Manchu in 2000, but the Manchu language is almost extinct. We may soon hear of the last Manchu speaker. (Xibe is considered to be distinct from Manchu.) 200 languages sound like a lot, but

this is probably a fraction of the number of languages that used to exist; the spread of the Han over the last 3000 years has probably eliminated considerably more diversity.

p. 3: Blench wrote that

since both archaeology and linguistics are direct reflections of human activities, they must, in some way, be congruent ... This potential for congruence is not necessarily the case with genetics; genes are not people, and they have a distributional logic quite different from languages and cultures.

I agree that language and culture are more closely tied to each other than either is tied to genes. Nonetheless, the tie between language and culture is hardly absolute. One can adopt elements of a culture without adopting the language associated with that culture. Could anyone tell that Finns, Hungarians, and Basques spoke non-Indo-European languages solely on the basis of archaeological evidence?

p. 4: "[A]rchaeology in particular is often prone to hijacking by nationalist agendas." So is historical linguistics - and history in general. Sinocentrism is like anthropocentrism:

It is interesting to compare these [Sinocentric models] with Stephen J. Gould’s strictures on models of evolution that are structured so as they always finish with the evolution of modern humans, rather than being full of byways and forking paths that lead nowhere.

Do all roads lead to Han China and man? If China is the 'central country', is man the 'central species'?

p. 5: Blench correctly pointed out the artificial nature of Old Chinese reconstruction, though I disagree with the details of his stance:

Much historical scholarship has gone into the reconstruction of Old Chinese, a language that would consistently account for the system of ancient texts. But there is, and can be, no evidence that such a language was ever spoken, and no necessary link with proto-Sinitic, a language reconstructed from the wide range of modern dialects.

All reconstructions of ancestral languages are composites that cannot reflect their actual diversity. There is no single system underlying ancient texts. I view 'Old Chinese' as a generic guess.

Nonetheless, I still think that the reconstruction of OC is a worthwhile endeavor because texts contain features that have long since been lost in all of the modern Chinese languages: e.g., the final *-s in Chinese transcriptions of foreign (mostly Indic) words. It would be mad to discard Latin in favor of Proto-Romance.

Probably if we had better proto-Sinitic, there would less problem about its place within the larger Sino-Tibetan schema.

Certainly, but the pure comparative method will result in a poorer Proto-Sinitic. An eclectic methodology is best.

Historical linguists tend to work with ‘tree’ models, where languages split, usually in binary fashion ... But some linguists are sceptical of these models and it is clear that languages do not always develop in such a convenient fashion. Indeed it seems likely that the common pattern of mainland East Asian languages with reduced morphology, complex tones and simplified word structures represents massive convergence between different language phyla.

I too am skeptical of tree models. I also think that the Altaic pattern, like the Sinoid pattern, is the result of "massive convergence".

p. 7: I am surprised that Tangut is closer to Lolo-Burmese than to rGyalrongic and Qiang in this version of the 'fallen leaves' model.

I've long wanted to look at the 白狼歌 Bailangge (Bailang Song), but have never gotten around to doing so. I was unaware of the 蠻書 Manshu (Book of Southern Barbarians) and 雲南記 Yunnanji (Records of Yunnan) transcriptions of Bai until now.

p. 8: This brings to mind what Guillaume Jacques wrote about the Chinese loanwords in Proto-Hmong Mien:

One important implication of this is that loanwords can be embedded to such a depth that it is difficult to distinguish them from fundamental vocabulary.

p. 9: Could members of the early 'Chinese' Hongshan culture (4000-3000 BC) really have been speakers of Mongolic or Koreanic? Why not Tungusic?

If Blench is correct, modern Hmong-Mien languages are but the tip of the iceberg, and reconstructions of Proto-Hmong-Mien lack details that could have been recovered if other branches had not gone extinct:

It seems as if the other more diverse relatives of Miao-Yao must have been eliminated by the Han expansion and the languages still in existence are the result of a secondary expansion.

The situation may be even more dire with Koreanic. Only one Koreanic language has survived, though others once existed.

4000 BP (why this date?) seems too early for the dispersal of Hmong-Mien. None of the Chinese loans I have seen in Hmong-Mien look like pre-OC.

p. 10: Are there any serious Altaicists who include Ainu in Altaic?

Khitan gives us a glimpse of earlier Mongolic diversity; it has apparently undergone some major changes and is not simply Mongolian in a siniform script.

p. 11: The Chinese and Korean forms listed here are modern.

The Old Chinese word for horse was *mraʔ, probably from an even earlier *mʌraŋʔ which resembles mʌr, the earliest attested Korean word for horse.

p. 12: Biao is listed as a sister of Kam-Sui here, but Ethnologue treats it as a Kam-Sui language. I wonder what Ostapirat thinks.

Daic looks like a branch of proto-Philippines ...

Why "proto-Philippines" rather than Proto-Malayo-Polynesian?

p. 13: Does "geometric cord-marked pottery that is found in South China prior to 5000 BC" tell us anything about proto-Austroasiatic? I say that pottery can't talk.

p. 14: What is the basis for dating the splits within Austroasiatic? Or the 17,000 year figure for the population expansion of Austroasiatic speakers? Genetics? But Austroasiatic speakers are genetically diverse. Do Munda speakers and Vietnamese have common ancestry?

p. 18: Did the 'mummies' speak Prakrit? What is Koränian?

I have never heard of 瓦乡话 Waxianghua (lit. 'Tile Village Speech'):

A language that is still puzzling is Waxianghua, spoken by 300,000 people in a 6000 km2 area in western Hunan Province, Wuling Mountains, including Yuanling, Chunxi, Jishou, Guzhang, and Dayong counties. It differs greatly from both Southwestern Mandarin (Xinan Guanhua) and Xiang Chinese (Hunanese), but is relatively uniform within itself. Neighbouring Han Chinese, Miao and Tujia people do not understand it.

p. 19: I think not!

This paper [Wang et al. 2000] compares mtDNa from ancient Chinese populations with those of today. It purports to show that the people of Linzi in Shandong province were closer to the Welsh than to modern-day Chinese and therefore that there has been significant population replacement. This has been comprehensively discredited.


p. 21:

Historical linguistics has a very long way to go ...


I'm going to interrupt my Proto-Kra retroflexion series for three posts consisting of links to files that may help to put Kra into a bigger perspective.

A week ago, I discovered Guillaume Jacques' "The Geography of South Eurasian Languages" from 2005.

The one must-see slide for readers of this site is number 19, showing where Tibetan, Tangut, Chinese, and Burmese (Gong's big four literary Sino-Tibetan languages) were spoken in 1100 AD.

My comments on a few other slides follow:

15: The 'big tent' version of Sino-Tibetan that I never believed in.

16: One common view of Sino-Tibetan in the West which I used to believe in.

17: I am inclined to accept this view because of the absence of innovations distinguishing Tibeto-Burman as a whole from Chinese. One such innovation might be Proto-Sino-Tibetan > *a (Schuessler 2007: 105) but Tangut may have different reflexes for PST and *a. I may expand on this point in the future after I wrap up my retroflexion series.

20-25: Which of these Sino-Tibetan Urheimat proposals is correct? I don't know.

27, 29, 31, 35: I don't know how the dates for proto-languages in these slides were estimated. My agnostic stance on dates is shared by RMW Dixon (1997: 48):

Surely, the only really honest answer to questions about dating a proto-language is 'We don't know.'

27: It's amazing that 98% (1248/1268!) of Austronesian languages descend from Proto-Malayo-Polynesian (PMP) which is just one branch at the bottom. The other branches are in Taiwan.

29: If "most of the reconstructible [Hmong-Mien] vocabulary [is] of Chinese origin", does that mean much of the basic vocabulary was replaced by Chinese loanwords? I'd like to see a Proto-Hmong-Mien Swadesh word list.

31: I wonder where Ostapirat would place Lakkia.

35: Why do Khasi and Khmu form their own branch separate from Mon-Khmer? What innovation do they share? And what innovations distinguish the Khmer-Viet branch from the Nicobar-Mon branch? I don't remember hearing about this tree in Diffloth's 1997 Austroasiatic class.

37-40: I'm a 'splitter', not a 'lumper'. So I don't endorse any of these macrophyla, though a connection between Kra-Dai and Austronesian is becoming more attractive to me. It's hard to evaluate such hypotheses in detail because one must have an in-depth diachronic and synchronic knowledge of multiple language families. Much careful lower-level work still needs to be done. As RMW Dixon (1997: 38) wrote,

But one should always proceed a step at a time. To suddenly link together seven (or eight) families is a little implausible, especially when, for some of them ... the proto-language of that family still needs considerable work.

There is still not much consensus on the reconstruction of Old Chinese, which is better understood than some other proto-languages of southeast Asia: e.g., I have yet to see a full-scale reconstruction of Proto-Austroasiatic.

Tonight, I learned that Guillaume's presentation was only one of many. I'd like to read some of the others soon.

DID PROTO-KRA HAVE RETROFLEX INITIALS? (PART 7: PROTO-KRA ETYMA WITH RETROFLEX *ɳ)

Ostapirat (2000) reconstructed four Proto-Kra etyma with retroflex *ɳ-. He reconstructed two etyma ('think' and 'give') with PK *n- even though their initial correspondences are like those of his *ɳ-etyma. Forms implying a lost presyllable are in bold.

Pages of Ostapirat (2000) Gloss Ostapirat's Proto-Kra This site's Proto-Kra Southwestern Kra Central-Eastern Kra
Qiaoshang Wanzi Laha Paha Pubiao
114, 224 bird *ɳok D *nok zau ntau nok nhook nokŋ
114, 230 snow *ɳui A *nui (unknown) ntai (no cognate) nii nɦei
114, 235 fat *(m-)ɳəl A *nəl nan mnal nan nɦin
114, 236 salty *ʔ-ɳəŋ B *CV-nəŋh za naŋ (no cognate) ðaŋ (no cognate)
114, 237 thick *C-na A *na ze ntau naa naa nɦee
114, 240 give *nak D *nak ni nak nhaak (no cognate)

(The -h- of Paha and the -ɦ- of Pubiao are tonally conditioned and are not evidence for presyllables.)

I think only 'salty' can be reconstructed with a PK *CV-n- sequence corresponding to *ʔ-ɳ- in Ostapirat's reconstruction. 'Salty' is the only etymon that has bold forms in both major divisions of Kra (Southwestern and Central-Eastern). All other bold forms are in the Southwestern branch (and mostly in Qiaoshang). Presyllables in 'bird', 'fat', 'thick', and 'give' could have been innovations in pre-Qiaoshang (or possibly Proto-Southwestern Kra in the case of 'fat') rather than retentions from Proto-Kra.

The initial of the presyllable of 'salty' must have been voiceless (like Ostapirat's *ʔ-) because it conditioned series 1 (< *voiceless initial) tones in Wanzi and Paha (but not Qiaoshang which has a series 2 tone! - did pre-Qiaoshang have a different presyllable with a voiced initial?).

I cannot find any explicit statement by Ostapirat deriving Paha ð- from his *(ʔ-)ɳ-, but that would not be an unreasonable assumption given how he derived Paha ð- from other *retroflexes. In my system, pre-Paha *-n- denasalized and lenited to -z- before a presyllable that was later lost.

The nt-/n- distinction in Wanzi might have something to do with the presence or absence of presyllables: e.g., a presyllable blocked denasalization:

*n- > nt-

but *CV-n- > n-

However, Qiaoshang z- < *CV-n- corresponds to Wanzi nt- and n-, implying that presyllables were irrelevant to the Wanzi distinction or that Wanzi had a different distribution of presyllables (i.e., Wanzi had monosyllables corresponding to some Qiaoshang sesquisyllables).
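The mismatch can be made explicit with a small check. This is a minimal sketch that uses only the three rows of the table above that have both a Qiaoshang and a Wanzi form; the function names are hypothetical, and the check assumes that Qiaoshang z- signals a presyllable and that Wanzi shared Qiaoshang's presyllables, which is exactly the assumption in doubt.

```python
# Forms copied from the table above (Qiaoshang and Wanzi columns only)
forms = {
    "bird":  {"Qiaoshang": "zau", "Wanzi": "ntau"},
    "salty": {"Qiaoshang": "za",  "Wanzi": "naŋ"},
    "thick": {"Qiaoshang": "ze",  "Wanzi": "ntau"},
}


def predicted_wanzi_initial(qiaoshang_form):
    """Hypothesis under test: *CV-n- > Wanzi n-, bare *n- > Wanzi nt-.

    Assumption: Qiaoshang z- marks a lost presyllable, and Wanzi had
    the same presyllables as Qiaoshang.
    """
    had_presyllable = qiaoshang_form.startswith("z")
    return "n" if had_presyllable else "nt"


def wanzi_initial(wanzi_form):
    """Classify the attested Wanzi initial as nt- or plain n-."""
    return "nt" if wanzi_form.startswith("nt") else "n"


# Glosses where the attested Wanzi initial contradicts the prediction
mismatches = [
    gloss for gloss, f in forms.items()
    if predicted_wanzi_initial(f["Qiaoshang"]) != wanzi_initial(f["Wanzi"])
]
print(mismatches)
```

Only 'salty' behaves as the blocking hypothesis predicts; 'bird' and 'thick' have Wanzi nt- where n- would be expected.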

PK *nok 'bird' looks like it may be related to/borrowed from Proto-Malayo-Polynesian *manuk 'bird'. The presyllable in pre-Qiaoshang *CV-nuk 'bird' could have been *mV- or something else added by speakers who had no idea the word once had an initial *m-. Note that Laha nok 'bird' does not have mn-, implying that the original *mV- could have been lost in Proto-Southwestern Kra, the common ancestor of Laha and Qiaoshang.

I have no idea why Ostapirat reconstructed the PK etymon for 'snow' with a retroflex since he did not cite any Qiaoshang form with z-.

I don't know if the *mV- presyllable implied by Laha mnal 'fat' was the same as the presyllable in pre-Qiaoshang that conditioned the shift of *-n- to z-. Moreover, I don't know if *mV- in 'fat' was inherited from PK or was an innovation at the Southwestern or Southern Kra levels.

If Ostapirat's *C- in PK *C-na A 'thick' is voiceless, it might account for the unexpected A1 instead of A2 tone for Paha naa.

I don't understand why Ostapirat didn't reconstruct *ɳ- for 'thick' or 'give'. If he reconstructed a retroflex nasal in PK *ɳok D 'bird' solely on the basis of Qiaoshang z-, he could have done so for those two etyma as well.

The methodology I outlined in part 6 only allows me to reconstruct PK presyllables if traces are found in both major divisions of Kra. A Qiaoshang form with z- is not sufficient evidence. Hence I regard presyllables in 'thick' and 'give' as pre-Qiaoshang innovations.
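The criterion can be sketched as a simple filter. This is a minimal illustration, not a full statement of the method from part 6: the branch assignments follow the table header above, the lists of languages with presyllable traces are as read from the discussion in this post, and the names BRANCH, traces, and reconstructible_at_pk are hypothetical.

```python
# Branch membership, following the column headers of the table above
BRANCH = {
    "Qiaoshang": "Southwestern",
    "Wanzi": "Southwestern",
    "Laha": "Southwestern",
    "Paha": "Central-Eastern",
    "Pubiao": "Central-Eastern",
}

# Languages whose forms imply a lost presyllable (as read from this post)
traces = {
    "bird":  ["Qiaoshang"],
    "fat":   ["Qiaoshang", "Laha"],
    "salty": ["Qiaoshang", "Paha"],
    "thick": ["Qiaoshang"],
    "give":  ["Qiaoshang"],
}


def reconstructible_at_pk(languages):
    """A presyllable counts as Proto-Kra only if both divisions show traces."""
    branches = {BRANCH[lang] for lang in languages}
    return branches == {"Southwestern", "Central-Eastern"}


pk_presyllables = [
    gloss for gloss, langs in traces.items() if reconstructible_at_pk(langs)
]
print(pk_presyllables)
```

With these inputs only 'salty' passes the filter. 'Fat' has traces in two languages, but both are Southwestern, so its presyllable is not projected back to Proto-Kra.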

