Why are prospects for the survival of small languages so bleak?

It's difficult to make generalizations about speakers of thousands of different languages, but let me try. They

- have few other people they can talk to in their own language

(otherwise their languages wouldn't be small)

- have no countries (or governments) of their own

(and we have already seen how a government cannot sustain a language)

- are not wealthy

- are dependent on an outside world that doesn't speak their language

Their children speak the languages of the outside world - the keys to opportunity. They're busy working. They are not affluent middle-class people with time to sit around and study. They might be able to attend evening classes, but one or two hours a week cannot lead to a significant command of a language. None can study their language full-time like these Marines (via Sarah). Moreover, even if they did somehow - against all odds - attain an adult-level total command of their ancestral language, what would they do with it? How would knowledge of Ainu - or Tangut's surviving sisters, the Qiangic languages - improve their daily lives? Feeling a connection with one's ancestors is not sufficient incentive for most people.

If laypeople can't keep their ethnic group's languages alive, what can linguists do? In The Rise and Fall of Languages, RMW Dixon urges linguists to document those small languages before they all die. But he "doubt[s] if one linguist in twenty is doing this." (p. 157) That was back in 1997. I don't think the situation has changed since then.

Why aren't linguists taking Dixon's advice? Again, there is no economic incentive to do so. A graduate student in linguistics typically wants to get out of school and get a tenure-track position as quickly as possible. The easiest way to do this is to focus on some narrow aspect of a language one already knows. Learning an obscure language that is completely different from any of the big languages one has studied is time-consuming and difficult when there are no dictionaries or textbooks - when one has to reverse-engineer its grammar from speakers who have no conscious knowledge of why they say what they do. Fieldwork veteran Dixon estimated that "At least 3 years is needed to do a good job [of documenting a language]; the total cost will be (at 1997 values) around [US or Australian?] $200,000." (Dixon is Australian.)

But in fact the total cost for the fieldworker is even higher. The fieldworker has to abandon the comfort of an office in a college town for a rural area, possibly in the Third World. No Starbucks there.

Moreover, what happens after he comes back? Suppose he submits his grammar as his PhD dissertation. Knowing that he has made a permanent contribution to the world's linguistic knowledge doesn't pay the bills the way that a dissertation on a 'name' language based on a fashionable theory will. Universities are not looking for specialists in thousands of languages almost nobody ever heard of. A versatile, do-it-all linguist like Dixon has a good shot at employment, but most linguists are 'tunnelers' who lack Dixon's range. If they can only do one thing, they want that one thing to be marketable (e.g., Mandarin syntax rather than Muya). Still, it is appalling if the world's only expert on (insert name of obscure language here) cannot get a job.

I was told not to write my dissertation on Tangut for that reason. And Tangut is far more famous than, say, its living relative Muya. How many languages have an entire movie devoted to their lost empire?

Although it is in mankind's interest to preserve these languages before it's too late, it is usually not in the self-interest of either ethnic group members or linguists to do so.

Next: More reasons why few want to listen to Dixon.

THE NEXT GREAT EXTINCTION

Life on Earth has gone through at least five mass extinctions that have wiped out 99% of all species that have ever existed. We could be living through a sixth extinction.

Languages also go extinct. In The Rise and Fall of Languages, RMW Dixon wrote,

It is estimated that of the 5,000 or so languages spoken in the world today at least three-quarters (some people say 90% or more) will have ceased to be spoken by the year 2100, as a consequence of the punctuations [disruptions] engendered in the first place by European colonization. (pp. 116-117)

One of those languages will be Ainu, the focus of my last three posts. In the very near future, the "over 10 Ainu people who are now in their 70s and 80s" mentioned in Bengtson's interview with Itabashi will die and take their language with them.

Of course, the death of Ainu in Japan will have nothing to do with "European colonization". Dixon is overgeneralizing. Languages are dying everywhere.

How long are the remaining relatively close relatives of Tangut going to last in China? Southern Qiang is said to have no monolinguals; it's spoken at home by "[o]lder adults" with a "[n]egative language attitude". They use Mandarin or Tibetan outside the home. I presume they speak to their children in Mandarin, the language of education (and employment). If they have TVs, the programming is in Mandarin. Eventually, their grandchildren will be in the position of today's last Ainu speakers:

No Ainu people are routinely using their language, but they are still able to speak their language because they grew up routinely hearing the language from their grandparents who lived together with them when they were very young.

These Ainu speakers almost certainly lack the full range of competence that monolingual speakers had. Their grandparents probably spoke to them in a mixture of Ainu and Japanese, and the Ainu they heard was only a subset of an adult's knowledge of Ainu. Fortunately, Ainu is not being recorded for the first time today. We have access to earlier records like Batchelor's 1905 dictionary. Nonetheless, there are many questions a dictionary cannot answer.

And they won't be answered by students taking Ainu classes:

... there are about ten weekly operated private Ainu night schools in Hokkaido (the northernmost island, where two thirds of the Ainu population live) to try to revive the language. (Many of the students are Japanese and they meet once or twice a week in the evenings.) But so far we have not succeeded in reviving the language. We lack Ainu teachers. The teachers are mostly middle-aged Japanese (many of them are university professors who teach Ainu at their university) who acquired the language at a university. The old Ainu people cannot afford to teach because of their poor physical conditions. There are no public organizations to support the Ainu language.

Although I am glad there is public (and non-Ainu - i.e., Japanese) interest in Ainu, none of these efforts can ever succeed "in reviving the language." If people who take foreign language classes several times a week for years cannot master a language, how can people taking a class once or twice a week sustain Ainu, even if their classes were taught by native speakers? Not every native speaker can teach a language, and there is no time to train elderly Ainu. "[P]ublic organizations to support the Ainu language" cannot make people keep Ainu alive.

Irish is barely alive in spite of the Irish government's support of the Irish language: e.g., mandatory Irish classes. (I seriously doubt there will ever be compulsory Ainu classes in Japan.) Only 3% of Irish people use Irish as their "main community and household language", and the actual proficiency of the 42% who "regard themselves as competent Irish speakers" is questionable. Are nearly half of all Irish claiming that the school system has made them "competent Irish speakers"? How many are claiming they know Irish simply because it's the PC thing to say - because they're embarrassed to admit the truth about their abilities in 'their' language?

In 2008, the Irish government received suggestions like these to promote Irish:

The introduction of an optional 6 month Gaeltacht [Irish-speaking region] work experience for Transition Year students in secondary school.

The creation of a national young persons' radio station in the language.

The introduction of statutory naming committees at council level to name new residential developments where the use of Irish names would be encouraged

The first will definitely help boost students' abilities in Irish, but how many would sign up, and how many would retain that boost in the long run?

As for the second, how many would actually listen to an Irish-language radio station? People do watch the official ostensibly Irish-language TV station TG4, but not for the 'right' reasons (emphasis mine):

Most of TG4's viewership, however, tends to come from showing Gaelic football, hurling, soccer and rugby union matches [i.e., events that require little Irish language knowledge] and also films in English, and English pop music programmes, although some of its Irish language programmes attract large audiences. In 2007 TG4 reported that overall it "has a share of 3% of the national television market". This market share is up from about 1.5% in the late 1990s [Why? Because of increased English-language programming?].

Until people need to learn Irish to keep up with fads and other things they're interested in, why bother? A youth Irish station would probably be redundant, transmitting information that is easily available in English. I was driven to learn Japanese when I was growing up because there were almost no English translations of the comics, TV shows, and movies I wanted to enjoy. No Irish children are in my situation. None need to read, say, Harry Potter in Irish, except perhaps as an academic exercise.

The third is superficial. A language is more than just its names. There may be more Hawaiian names in Hawaii today than a century ago, but Hawaiian is far less alive now than then.

Nationalistic pride prevents the Irish from admitting what is actually happening. TG4 has the "full support of all political parties in parliament" even if it is a largely futile exercise. No one can afford to say in public that Irish is going to die. That would be political suicide. So the pretense must continue:

On 19 December 2006 the government announced a 20-year strategy to help Ireland become a fully bilingual country. This involved a 13 point plan and encouraging the use of [the Irish] language in all aspects of life.

Does anyone seriously believe that Ireland will be "fully bilingual" in 2026? Does anyone in Ireland actually read the Irish on signs like this? The Irish for 'skin care', etc. isn't there because someone wouldn't know what "Skin Care" means. It's there to reassure somebody that they're doing their patriotic duty to keep Irish alive - and/or some marketer to promote an Irish image for their store. It's no surprise that

the "100% Irish" SuperValu [an Ireland-based supermarket chain] has few if any Irish signs, and the German retailers Aldi and Lidl have none at all [in their stores in Ireland].

Ainu will never enjoy the level of promotion that Irish has. And if Irish is doomed, so is Ainu.

I want to conclude by pointing out that language extinction is not new. Languages have always been dying. The language(s) of Ireland, which has been inhabited since 8000 BC, were replaced by the ancestor of Irish, which was brought there during the last few centuries BC. Pictish was displaced by Irish's sister Scottish Gaelic. Ainu is probably but one of many pre-Japanese languages of Japan. Whatever happened to the languages of the Kumaso and Hayato, or of the many tribal states named in Wei zhi's account of 3rd century Japan: e.g., 末盧 Matlo, 伊都 Ito, 不彌 Putmi, 投馬 Dəumæ, 斯馬 Siemæ, or 己百支 Kəpækkie? English and Japanese may appear triumphant for now, but what if both were replaced by some third language we can't foresee?

IS THE ITAK AN ISOLATE? (PART 3)

If Ainu isn't related to its neighbor Japanese, could it be related to some more distant languages? John Bengtson links Ainu to the hypothetical Austric family whose existence is questionable. I wrote last night,

It is surprisingly easy to create lists of lookalikes in two or more languages.

It's even easier to create such a list if you have many languages to choose from. Austric would be a huge family if it were real, so Bengtson can find Ainu-like words in multiple languages rather than just in, say, Japanese.
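To put rough numbers on that point, here is a minimal sketch. It assumes, purely for illustration, that each language independently has the same small chance of containing a lookalike for a given Ainu word; the 2% figure and the function name are made up, not measured.

```python
def chance_of_lookalike(p_single: float, n_languages: int) -> float:
    """Probability that at least one of n languages offers a chance lookalike.

    Assumes every language is an independent trial with the same
    per-language probability p_single (a simplification).
    """
    return 1.0 - (1.0 - p_single) ** n_languages


# Hypothetical 2% chance per language; compare 1, 10, and 100 languages.
for n in (1, 10, 100):
    print(n, round(chance_of_lookalike(0.02, n), 3))
```

Whatever the real per-language figure is, the probability can only grow as more languages are added to the pool, which is why a huge proposed family makes lookalike lists cheap.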

His first example is Proto-Ainu *nOk 'egg' : -ok-type words for 'bird' in various Austric languages. Although there was a Kuril Ainu form n'ok 'bird's egg' that could bridge the two meanings, it's also possible that the Proto-Ainu meaning was 'egg' which narrowed to 'bird's egg'. If one allows comparisons of words with only loosely related meanings (eggs don't even have to be from birds), then the chance of finding unrelated lookalikes goes up (and the chance of finding genuine cognates or loanwords goes down). Later on, Bengtson compares Proto-Ainu *nOqi=pe 'brain' with, of all things, Proto-Tai *ŋ[ui] 'marrow'.

The Ainu and Austric words in his third example 'blood' seem to share nothing but a final -m (except for Proto-Miao which has final *-ŋ). Using this kind of methodology, I could claim the following -n words are cognates in a 'Eurasian' family:

English man

German Person

Thai khon 'person', chhon 'person; people'

Vietnamese nhân 'person'

Mandarin 漢 han 'man' (it even rhymes with man!)

but on closer investigation, this falls apart:

German Person is not native German - it and English person are both borrowings from Latin (which in turn got persona from Etruscan phersu), and borrowings cannot be used to establish 'genetic' relationships.

Thai chhon 'person' is ultimately a borrowing from Sanskrit or Pali jana 'person; people' (cognate to Latin genus). About 15 years ago, I suggested Jana as a name for one of Robinson's future species on Hadanus.

Vietnamese nhân 'person' is (1) not even a word but a morpheme that cannot stand by itself and (2) is a borrowing from southern Late Middle Chinese 人 *ɲən 'person'. (Japanese nin 'person' is a borrowing from southern Early Middle Chinese 人 *ɲin 'person'.)

Mandarin 漢 han is originally a river name that came to be used for the Han Dynasty, the Chinese people in general, and 'man' in some contexts. 漢 has never been the normal word for 'person' in any Chinese language. Even if 漢 were the normal word for 'person' in some Chinese language today, it wouldn't matter because 漢 did not have this meaning in Old Chinese. If Chinese 漢 were really descended from some proto-Eurasian word for 'person', it's unlikely that it lost that meaning at an early period, somehow became a river name, and then cycled back to its 'original' meaning.

The problem with megafamilies like 'Austric' is that each word has its own story and it's impossible to know all the stories behind all the words in the relevant languages. I happen to know the stories behind all of the above words, but when I look at Bengtson's lists of forms, I can often only take them at face value and assume that they have no typos, have accurate glosses, etc. But even in a best case scenario, I am reluctant to accept lists of words sharing a single consonant as evidence for a 'genetic' relationship.

Bengtson increases the frequency of lookalikes by allowing himself to find Ainu matches for both the first and second halves of Austric words: e.g.,

Proto-Ainu *nOt 'chin, jaw' : first half of Proto-Austronesian *ŋut'uʔ 'lips' (I don't remember a *t' in any Proto-Austronesian reconstruction. Is it a typo?)

Proto-Ainu *nOk 'egg' : second syllable of Proto-Austronesian *manuk 'bird'

He even allows 'telescoping' matches for the beginnings and endings of Austric words: e.g.,

Proto-Ainu *sum 'oil' : Mundari sunum 'oil' (with an -nu- absent from Ainu)

This means that for any non-Ainu word ABCDE, Bengtson can propose an Ainu cognate resembling

1. ABC (first half)

2. CDE (second half)

3. ABE or ADE (telescoped)
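The three options above can be sketched in a few lines. This is only my toy model of the method, not anything Bengtson wrote: it treats every letter as one segment, and the function name and the exact cut points are my own assumptions.

```python
def partial_targets(word: str) -> dict:
    """List the partial shapes of `word` that would count as a 'match'.

    Toy model: one character = one segment. For a five-segment word
    ABCDE this yields ABC, CDE, ABE, and ADE.
    """
    n = len(word)
    return {
        "first half": word[: (n + 1) // 2],
        "second half": word[n // 2:],
        "telescoped (ABE type)": word[:2] + word[-1:],
        "telescoped (ADE type)": word[:1] + word[-2:],
    }


for w in ("ABCDE", "sunum", "manuk"):
    print(w, partial_targets(w))
```

Under this model, Mundari sunum offers sun, num, sum, and sum as acceptable targets, and Proto-Austronesian *manuk offers man, nuk, mak, and muk. Each long word provides several short shapes to hit, so a short Ainu word has several chances to 'match' it.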

Here are five Old Chinese partial lookalikes of English water using that methodology:

*wəts 'name of a river'; *-s may be a suffix

*wit 'flow' (this also looks like English wet)

*tujʔ (Starostin's reconstruction) 'water'; Old Chinese *-j is sometimes from *-l

*tlurʔ (Sagart's reconstruction) 'water' (rare word)

*wen 'to flow'; Old Chinese *-n is sometimes from *-r, so one might claim *w-n was once *w-r, telescoped from *w-t-r

Real reduction in languages tends to be monodirectional and not random: e.g., in Phanrang Cham, words have shorter variants retaining their second halves (Alieva 1994 in Sagart 1999: 15): e.g.,

tahlaʔ, thlaʔ, laʔ 'I' (but not ta or telescoped taʔ)

cơlan, clan, lan 'road' (but not cơ or telescoped can)

Conversely, in Hindi, words have lost their final vowels and consonants: e.g.,

Sanskrit śiras > Hindi sir 'head' (cognate to cranium, cerebral)

Sanskrit daśa > Hindi das 'ten' (cognate to decade)

I can't think of a language in which telescoping is the norm, though it can happen if the medial consonant is lenited: e.g., English rain < Old English regen.

The only worthwhile part of Bengtson's article is a depressing interview in Appendix D at the end:

Bengtson: Do you record Ainu from people who are still speaking it?

Itabashi: Yes, we can still record Ainu.

Bengtson: Is Ainu really extinct (as usually regarded by us in the USA)?

Itabashi: We still have over 10 Ainu people who are now in their 70s and 80s. No Ainu people are routinely using their language, but they are still able to speak their language because they grew up routinely hearing the language from their grandparents who lived together with them when they were very young.

Bengtson: Is there any attempt to revive the Ainu language among the younger Ainus, either by the Japanese government or private organizations?

Itabashi: Yes, there are about ten weekly operated private Ainu night schools in Hokkaido (the northernmost island, where two thirds of the Ainu population live) to try to revive the language. (Many of the students are Japanese and they meet once or twice a week in the evenings.) But so far we have not succeeded in reviving the language. We lack Ainu teachers. The teachers are mostly middle-aged Japanese (many of them are university professors who teach Ainu at their university) who acquired the language at a university. The old Ainu people cannot afford to teach because of their poor physical conditions. There are no public organizations to support the Ainu language.

My teacher Alexander Vovin, author of A Reconstruction of Proto-Ainu, estimated that only 15 people in Japan spoke Ainu in the 90s.

IS THE ITAK AN ISOLATE? (PART 2)

It is surprisingly easy to create lists of lookalikes in two or more languages. For example, Ainu itak 'language' looks like English talk (and is even closer if one ignores the silent -l- of talk). And Ainu pone looks like English bone. (No Ainu word can start with b-.) But the resemblance of these words is purely coincidental. (Can you guess the real etymology of pone? Answer here.*) Just last night I stumbled on Arabic قطع qata` 'cut' (corresponding to Hebrew קטע qata`) which looks like English cut and Middle Chinese 割 *kat 'cut' (< Old Chinese *qat?).

Historical linguists have to sort out the meaningful from the merely coincidental. Let's apply historical linguistic methodology to the five Ainu-Japanese pairs I presented yesterday to show why they are not evidence of a 'genetic' relationship between Ainu and Japanese:

1. Ainu cho 'lock' : Jpn jou 'lock'

This word isn't even originally Japanese; it's from southern Middle Chinese 錠 *diaŋ, borrowed into Old Japanese as *ndiyaũ, which eventually ended up as modern Japanese jou, which was in turn borrowed into Ainu as cho (since Ainu [...]). The meaning 'lock' for 錠 is a Japanese innovation. The 2nd century Chinese dictionary Shuowen defines 錠 as a 鐙 ritual vessel.

2. Ainu hitsuji 'sheep' : Jpn hitsuji 'sheep'

The earliest form of this word I can find is Middle Japanese fituzi. The Ainu form reflects modern Japanese changes (f > h, t > ts, z > j), so it must be a recent borrowing into Ainu.
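Here is a minimal sketch of that reasoning as ordered rewrite rules. The changes f > h, t > ts, and z > j are the ones named above; the conditioning environments (except before u, before u, before i) and the earlier p > f step are simplifications I am adding for illustration, and the rule list is a toy, not a full account of Japanese historical phonology.

```python
import re

# Ordered sound changes as (regex, replacement) pairs.
# The environments are illustrative assumptions, not a complete analysis.
RULES = [
    (r"p", "f"),        # Old Japanese p > Middle Japanese f
    (r"f(?!u)", "h"),   # f > h, except before u
    (r"t(?=u)", "ts"),  # t > ts before u
    (r"z(?=i)", "j"),   # z > j before i
]


def modernize(form: str) -> str:
    """Apply each rule in order to an older romanized form."""
    for pattern, replacement in RULES:
        form = re.sub(pattern, replacement, form)
    return form


print(modernize("fituzi"))  # hitsuji
print(modernize("pone"))    # hone
```

If Ainu had borrowed the word before these changes, it should show the older shape, as Ainu pone 'bone' does with its p. Because Ainu hitsuji already shows the output of every rule, the borrowing has to postdate them.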

3. Ainu kamui 'god' : Jpn kami 'god'

One might initially dismiss this pair as coincidental, since Ainu has a -u- absent from Japanese, and it is hard to explain why the Ainu would borrow Jpn kami as kamui. However, 'god' was something like *kamu-i in early Japanese, so the Ainu form must be a very old borrowing predating the fusing of early Japanese *-ui into -i.

('God' is one of a number of Japanese words with an independent form incorporating a suffix *-i and a combining form consisting of the raw root - in this case, *kamu-. An early word for 'kamikaze' [divine wind] is kamukaze with the combining form.)

(Leon Serafim speculated that the name of the Kamo Shrine could reflect a proto-root *kamo for 'god'.)

(Note that Japanese kami 'top' has, as far as I know, always been kami, and therefore is unrelated to the root *kamu- or *kamo- 'god'.)

4. Ainu nupuri 'mountain' : Jpn noboru 'climb'

One might dismiss this pair on semantic and phonetic grounds (noun vs. verb, Ainu u and i instead of Japanese o and u), but I wonder if the Ainu form is a borrowing from an otherwise unattested nonstandard old dialect form like *nuburi 'mountain' (< 'that which is climbed'), a noun derived from *nubur- 'climb' corresponding to the standard root nobor-.

5. Ainu umma 'horse' : Jpn uma 'horse'

Like 'lock', 'horse' is a continental borrowing which is probably somehow connected to Mandarin ma < Old Chinese *mraʔ (unless its root is *ra), Korean mal < Middle Korean mʌr, and Mongolian morin (borrowed into Manchu). The earliest Japanese form is muma. Alexander Vovin has speculated that the mu- in muma and mume (from Chinese 梅; > later ume 'plum') represents part of an early Chinese cluster, but mu- ~ u- variation also occurs in native Japanese words that never had clusters.

7.23.00:33: According to Torii (in Batchelor, part 2, p. 30), Kuril Ainu has rosot 'horse' < Russian лошадь loshad' 'id.' (Like Japanese, Ainu has no l, so r is the closest equivalent of Russian l.)

None of the above words are basic vocabulary like 'fire', 'water', etc. - i.e., words that are unlikely to be borrowed.

As far as I know, borrowing from Ainu to Japanese has been very limited: e.g., the place name element -betsu < Ainu -pet 'river'. Batchelor's list of Ainu derivations for Japanese place names has many examples of -betsu : -pet.

If Ainu isn't related to Japanese, what might it be related to? In part 3, I'll critique arguments for a surprising answer and talk about the current status of Ainu.

*Ainu pone 'bone' is borrowed from some earlier Japanese word for 'bone' like fone or even earlier *pone. (Old Japanese p became Middle Japanese f which then became h except before u. Also see hitsuji 'sheep' above.)

IS THE ITAK AN ISOLATE? (PART 1)

When I talked about Japonic languages not being demonstrably related to other languages, Robinson asked me about Aynu itak 'Ainu language'. I think that Ainu is also an isolate.

Ainu does superficially resemble Japanese in a few ways: e.g.,

- Ainu has five vowels: a e i o u (but its u is a real [u] unlike standard Japanese u [ɯ])

- Ainu has pitch accent like Japanese

- Ainu has subject-object-verb order

The problem is that these characteristics are not unique to Japanese and can be found in unrelated languages:

- Spanish also has five vowels.

- Swedish and Norwegian also have pitch accent.

- Subject-object-verb order is the most common in the world. It only seems exotic to most Westerners because European languages don't have SOV basic word order. (German and Dutch have SOV order in subordinate clauses, but SVO in main clauses.)

Ainu also has words that look like Japanese: e.g.,

Ainu cho 'lock' : Jpn jou 'lock' (83)

Ainu hitsuji 'sheep' : Jpn hitsuji 'sheep' (148)

Ainu kamui 'god' : Jpn kami 'god' (205)

Ainu nupuri 'mountain' : Jpn noboru 'climb' (301)

Ainu umma 'horse' : Jpn uma 'horse' (475)

(Numbers refer to pages in Batchelor's dictionary.)

Can you guess why these words don't constitute evidence for a genetic relationship between Ainu and Japanese?

WHO FIRST PROPOSED THAT JAPANESE IS ALTAIC?

When I linked to Sergei Starostin's Wikipedia entry last night, I was shocked to read that (emphasis mine)

He devoted much of his later life to developing the theory, originated by Abu al-Ghazi Bahadur Khan in the 17th century, but really revived by Gustaf John Ramstedt in the early 20th century, that Japanese is an Altaic language.

The entries for Abu al-Ghazi Bahadur Khan and Altaic do not mention this. The Altaic entry credits the inclusion of Japanese in Altaic to Anton Boller in 1857.

You can read a neutral account of the possible 'genetic' affinities of Japanese in Shibatani (1990) which also mentions Boller:

... one of the factors contributing to the myths surrounding Japanese is the uncertainty of its genealogy. Indeed, Japanese is the only major world language whose genetic affiliation to other languages or language families has not been conclusively proven. Since the middle of the nineteenth century, this challenging topic has been attacked by both foreign and Japanese scholars alike, and various hypotheses connecting Japanese to a large number of languages and language families have continuously been proposed.

My stance is like Juha Janhunen's: Japanese is not an isolate because it is undoubtedly related to the Ryukyuan languages (not 'dialects'). The Japonic languages (Japanese and Ryukyuan) don't have to be demonstrably related* to any extant modern languages. Their relatives could all be extinct. And if some of their relatives still survive, a relationship with them may never be proven to everyone's satisfaction.

*Distinct from the absolute absence of a relationship. If I cannot prove a relationship, that doesn't necessarily mean it never existed.

DHUE YOU KNOW ...

... where Laurie Dhue's name comes from, and why it is spelled with a dh-? I've never seen her on TV, but her unusual surname has been stuck in my mind ever since I first saw it online years ago.

I normally associate dh with a voiced aspirated stop [dʱ] found in South Asian languages: e.g., Sanskrit dharma. The -h- of dharma is not decorative; dh is distinct from d in Sanskrit, Hindi, etc., and the two have different letters in South Asian alphabets: e.g., dh (ध) and d (द) in Devanagari. (UPSID lists no languages with [dʱ] outside South Asia.)

(Karlgren and - many decades later - Starostin reconstructed Old Chinese voiced aspirated stops *bh, *dh, *gh which have generally been rejected by other researchers. UPSID lists only two languages with an aspirated voiced affricate like Karlgren and Starostin's *dzh.)

Dh can also represent Arabic [ð] (ذ) in some romanization systems, though in Riyadh it represents emphatic d (ض).

Lastly, dh can variously represent velar (!) [ɣ], palatal [j], or zero in Irish and velar (!) [ɣ] or palatal [ʝ] in Scottish Gaelic. Although none of those sounds are d-like, dh was [ð] in Old Irish and is ultimately a lenited [d].

Laurie Dhue is obviously not Indian or Arab, and dh- doesn't appear in initial position in Irish (and, I assume, in Scottish Gaelic). Is Dhue just an idiosyncratic spelling of an English name like Dew, or an abbreviation of something like Goodhue?

Wikipedia lists Laurie Dhue's ethnicity as Dutch, though dh and ue are un-Dutch spelling patterns and Googling for Dhue at .nl sites mostly results in references to her. Dhue in American English sounds like Dutch doe [du] 'do'.

howmanyofme.com says,

There are fewer than 114 people in the U.S. with the last name Dhue.

I suspect that 114 is the estimated cutoff point for inclusion on the US Census list of surnames.

It took me years to learn that Katie Couric is of Breton ancestry and that Hoda Kotb is Egyptian. How long will I have to wait before I learn what to do with Dhue?

Tangut fonts by Mojikyo.org
Tangut radical font by Andrew West
All other content copyright © 2002-2009 Amritavision