Or, in Jurchen,

< so.nggiyan DOG DAY> songgiyan indahūn inenggi

1. I've glanced at Pirahã phonology before but never noticed two things until today:

1a. Pirahã has no nasal vowels. So why does the exonym of the Hi'aiti'ihi 'Straight Ones' have a nasal vowel? Is it a Portuguese borrowing from some other indigenous language? (And what does the apostrophe represent? Is it another way to write the glottal stop which is written as x elsewhere?)

(3.3.20:05: No, the apostrophe indicates a high tone; it seems to be an easily typeable substitute for the acute accent that Everett uses. Vowels not followed by apostrophes have low tones.)

(3.3.23:58: And even if Pirahã has no nasal vowels now, maybe it once did. The name could date back to the first contact with Portuguese speakers.)

1b. Pirahã has three vowels


which are not in a 'top-heavy' classical 'triangle':


I have never seen a Pirahã-type 'left-heavy' triangle before. Do any languages have the other two hypothetically possible layouts?





3.4.0:33: Today (3.3) I thought it would be interesting to see what distributional phenomena and allophony would motivate such analyses. Here are some syllables in hypothetical languages with the latter two types of vowel systems:

'Right-heavy' with nine phonetic vowels


'Bottom-heavy' with nine phonetic vowels

/ɨ/ /æ/ /ɑ/
[kɨ] [kæ] [kɑ]

I've designed the allophones so each phonemic symbol matches one allophone. But what if they don't? What if the 'bottom-heavy' language had only two phonetic high vowels distributed like this?


Do those syllables share a single vowel phoneme? What if speakers rhymed [i] and [u]? Should that vowel phoneme be symbolized as /ɨ/, halfway between front [i] and back [u], even though [ɨ] isn't actually in the language? Does it make sense for /t/ to palatalize before nonpalatal /ɨ/? That is, in fact, what I think happened in Late Old Chinese: e.g., 之 *tə > *tɨə > *tɕɨə 'genitive marker'.

1c. In Don't Sleep, There Are Snakes: Life and Language in the Amazonian Jungle (2008), Daniel L. Everett mentions Steve Sheldon's Pirahã neologism for the Christian God, Baíxi Hioóxio 'Up-high Father'.

3.4.0:32: And I forgot to mention why I mentioned that ... I went on to write item 2 without finishing 1c.

If [k] is an allophone of /hi/, then is it possible to pronounce Hioóxio as Koóxio? What would be the phonetic motivation for hardening /hi/ into [k]?

3.4.21:02: Answering my first question, no:

The sequences [hoa] and [hia] are said to be in free variation with [kʷa] and [ka], at least in some words.

But why wouldn't [ha] be in free variation with [ka]? I thought perhaps at one time pre-Pirahã had *[q] and *[k] with Mongolian-type distribution: *[qa] but *[ku] and *[ki]. *[qa] became [ha], whereas *[kua] and *[kia] became [hoa] ~ [kʷa] and [hia] ~ [ka]. However, if *high vowels conditioned *[k], why aren't [hi hu] in free variation with [ki ku]?

I'm surprised to see presumably disyllabic [hia] in free variation with monosyllabic [ka]. Does such variation only apply if /i/ and /a/ have the same tone? Do /hía/ (high + low tone) and /hiá/ (low + high tone) exist? If they do, do they have monosyllabic free variants?

2. I have long been puzzled by the correspondences of the codas in the early written Sino-Tibetan languages. Today it finally occurred to me to see how much more confusion adding Evans' (2001) Proto-Southern-Qiang tones would cause. In chronological order from left to right (except for Proto-Southern-Qiang which can't be dated; I've put it last since it alone has the innovation of losing most codas):

Old Chinese
Old Tibetan
Proto-Southern Qiang

-s (< *-ds?)
(*low tone)

*low tone

*low tone
(*high tone)
(*high tone)
(*low tone)

*low tone

The Tangut numerals from 'one' to 'nine' all have the 'level tone' (1- in my notation which I adopted from Arakawa Shintarō), whereas 'ten' has the 'rising tone' (2- in my notation) from a source I symbolize as *-H, possibly a glottal consonant. I doubted there would be any correlation between the two Tangut tones and the two tones of Proto-Southern Qiang, its closest relative among the languages above. And of course there was none.

3.4.0:34: Did tone 1 spread through the closed set of Tangut numerals 'one' through 'nine'?

3.4.21:14: Notes on individual numerals:

'One': Straightforward. Tangut and Proto-Southern Qiang do not preserve any final stops.

'Two': Pre-Burmese points to *-t, Old Chinese, Old Tibetan, and Tangut are ambiguous, and Pyu has an open syllable.

The function of the *-s in Old Chinese and Old Tibetan is unknown.

I have no idea what Tangut *X is; it is a dummy symbol for the source of the equally mysterious feature -' which distinguishes certain rhymes in Tangut. I have never found any correlation between *X/-' and any feature in any other language. It could be a Proto-Sino-Tibetan feature preserved only in Tangut, though I doubt that.

'Three': At first I was pleased to see -ḥ in both Pyu and pre-Burmese. But look at 'four', 'five', and 'nine' where pre-Burmese has a -ḥ absent from Pyu. Pre-Burmese -ḥ doesn't correlate with Old Chinese *-ʔ.

'Four': Might pre-Burmese -ḥ here be from *-s rather than *-ʔ? Why was this *-s added? There is no trace of it in Pyu (where *-s probably became -ḥ) or pre-Tangut (where *-s may have become *-H).

'Five': Pre-Burmese -ḥ corresponds to Old Chinese *-ʔ, but Pyu lacks the expected -ḥ.

'Six': See 'one'.

'Seven': Tibetan has a unique root for 'seven'.

'Eight': See 'one'.

'Nine': Pre-Burmese -ḥ corresponds to Old Chinese *-ʔ, but Pyu lacks the expected -ḥ.

'Ten': The languages do not share a common root. This is the only pre-Tangut word with *-H in the set, and that *-H / Tangut 'rising' tone corresponds to Proto-Southern Qiang low tone ... just like pre-Tangut *-Ø / Tangut 'level' tone which can also correspond to Proto-Southern Qiang high tone!

THE DAY OF THE RED CHICKEN

Or, in Jurchen,

< RED.nggiyan CHICKEN DAY> fulanggiyan tiko inenggi

1. It's actually still the day of the green sheep for me as I write this item, but it's already the day of the red chicken in what was once the Jurchen Empire.

Viacheslav Zaytsev linked to this video of the Jurchen text by the Arkhara River discovered by Prof. Andrey Zabiyako (h/t Andrew West, who has written the definitive article on the subject in English).

That site is not far from Birobidzhan. I just learned that Biro- is a reference to the Bira River - 'River River'. Bira is 'river' in Jurchen, Manchu, and other Tungusic languages; the word can be reconstructed for Proto-Tungusic. I wonder what specific language is the source of that name and of the name of the Bidzhan River. Wiktionary does not have etymologies for either name.

2. I had no idea Li Fang-Kuei's brother-in-law 徐道鄰 Hsu Dau-lin was once Chiang Ching-kuo's tutor upon the latter's return from the USSR.

3. While looking at Evans' (2001) reconstructions of Proto-Southern Qiang numerals, I realized why his PSQ *a (low tone) corresponds to Tangut 5981 𗈪 0a1 'one' rather than †i4 < *a. Brightening (*a > i) in Tangut might only have applied in word-final position, and *a 'one' only appeared before other words, so its vowel remained intact.

A wilder possibility is that 0a1 is from *ʕa with a pharyngeal *ʕ- that blocked brightening and conditioned Grade I, but there is no evidence for such a pharyngeal in pre-Tangut.

I reconstruct 𗈪 0a1 'one' with Grade I (hence -1 in my notation) because it was transcribed in late 12th century northwestern Chinese as 阿 1a1 in the Pearl in the Palm glossary.

The 0 indicates that I don't know the tone of 'one'. Maybe it literally had 'zero' tone in the sense that its tone may have been neutral.

4. Speaking of numerals, I was surprised to learn that Dmitri Mendeleev used Sanskrit numeral prefixes (eka- 'one', dvi- 'two', tri- 'three') in the periodic table he submitted for publication 150 years ago today. Why Sanskrit?

5. Looking at Alexander Vovin's (2017) reconstruction of Old Korean (OK) *-arari for a verbal suffix 下里 <BELOW.ri> that he seems to regard as cognate to Middle Korean (MK) àráj 'bottom' made me wonder how it lines up with John R. Bentley's 2000 reconstruction of *arUsI 'below, lower' for Paekche (P):

Old Korean
Middle Korean


The correspondence of P *rU and OK *ra may point to a Proto-Koreanic (or Proto-South Koreanic?) *ɔ.

The P and OK words may have different suffixes added to a shared root *arɔ. If the Old Korean liquid had been *l, I might propose a Proto-Koreanic voiceless *l̥ that became P *s and OK *l. But OK *l would not have lenited to zero in MK: OK †arali would have become MK †àrári.

6. What are the characters 𠡙, 𠧭, 烞, and U+2C1D1 (⿺气朴) for? I found them in the Wiktionary entry for 朴 when writing this addendum to "The Day of the White Hare". Of course I don't know a lot of characters. What makes those so special?

- I have never seen 朴 as a phonetic before

- 朴 was not a phonetic in Old Chinese, so its derivatives must postdate Old Chinese

- 力 'strength', 大 'big', 火 'fire', and 气 'air' are not normal left-hand components

烞 has a Wiktionary entry with a Mandarin reading but no meaning. zdic.net defines 烞 as 'the sound of cracking from heat'. It has no definitions for the other three characters.

7. 加藤昌彦 Katō Atsuhiko (2009) reconstructs a ten-vowel system for Proto-Pwo Karen including two unrounded high nonfront vowels on the basis of dialects preserving a contrast between them. I do not recall ever seeing a description of a living language with a /ɨ ɯ/ contrast before. There seems to be a common assumption that Proto-Sino-Tibetan had a small number of vowels. How such a small inventory expanded into the larger inventories of languages like Proto-Pwo Karen remains to be explained.

8. Today is the centennial of Korea's 三一運動 Samil undong, the March 1st Movement. Looking at the text of the Korean Declaration of Independence (image / English), I was surprised by how relatively modern it looks. It lacks the obsolete vowel symbol arae a (ㆍ), perhaps the most striking characteristic of old hangul orthography. It does have ᄯ <st> for modern ㄸ <tt> and instances of standalone ㅣ <i> instead of 이 <Øi>: e.g., ㅣ며 <i myŏ> as well as modern 이며 <Øi myŏ> for i-mye 'be-and' after vowel-final words.
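The <Øi> transliteration of 이 mirrors how Unicode treats the syllable: the precomposed block decomposes into a null initial (ieung) plus the vowel ㅣ. The following is only a minimal sketch using Python's standard `unicodedata` module; the helper name `jamo_names` is hypothetical and not from any source cited here.

```python
import unicodedata

def jamo_names(syllable: str) -> list[str]:
    """Decompose precomposed hangul into conjoining jamo (NFD)
    and return the Unicode name of each jamo."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", syllable)]

# Modern 이 <Øi>: a null initial followed by the vowel i.
for name in jamo_names("\uc774"):
    print(name)
```

A standalone ㅣ, by contrast, is a single letter with no initial to decompose.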

9. The Jurchen word for 'honey' apparently only survives in Chinese transcription as 希粗 *xi tsʰu in the vocabulary of the Bureau of Interpreters (#1025). The corresponding Manchu word is hibsu [xipsu].

Does *tsʰ represent [tsʰ] < *ps in that Jurchen dialect (which could not be ancestral to Manchu which preserved *ps), or does the transcription conceal a Jurchen [ps]?

How would the Jurchen ancestor of hibsu have been written? Neither a phonogram <hip> nor a logogram <HONEY> has been found. Would the word have been written <hi.pu.su>? We probably do not yet have a complete set of Jurchen characters. Parts of the Jurchen Character Book are missing; there are characters in inscriptions and the Sino-Jurchen vocabularies that are not in that presumably early catalog, and there may be characters that are not in any of those sources.

THE DAY OF THE GREEN SHEEP

Or, in Jurchen,

<nion.nggiyan SHEEP DAY> nionggiyan honi inenggi

1. Today is the 900th birthday of Emperor Xizong of the Jurchen Empire.

His Jurchen name, transcribed in Jin Chinese as 合剌 *xo la, was probably either *Hola or *Hora.

He has been credited with the apparently short-lived Jurchen small script. If Aisin-Gioro Ulhicun is right, these are the only two remaining blocks in the small script:

None of the components look like Khitan small script components except for the one at the bottom of the second block resembling 쇼 whose reading has yet to be identified.

쇼 also looks like the hangul spelling of Middle Korean syo 'cow', a word I have yet more to say about.

2. I was hoping Guillaume Jacques would give examples of Tangut *P-causatives in "The Labial Causative In Trans-Himalayan" (2019), and he did. Here are two more examples from Gong (1988: 45-46).

ghost, demon, devil
𘘏 0622
*Pɯ.ʔ[o/ə] to bring evil
𘔚 1671
𗽫 2765
to turn red

It may be significant that all five examples of causatives are Grade III/IV syllables (written by Guillaume with -j- following Li Fanwen and by me with -3/-4). I hypothesize that Grade III/IV was conditioned by *high-vowel presyllables. So the causative prefix may have been *Pɯ-. (*ɯ is my symbol for an unknown high vowel. Maybe I should just write *I or *Y.)

Gong also gives examples of zero ~ -w- alternations in Tangut without any obvious semantic function. Those pairs outnumber the causative pairs and need further investigation. Some may be doublets involving *P-preinitials or presyllables that had nothing to do with causative *Pɯ-: e.g., perhaps

𗪺 3354 1ghi2 'power'

𘏐 5307 1ghwi2 'power'

are two different reflexes of a pre-Tangut noun *Pʌ.gr[a/e] 'power'. (Note the *nonhigh vowel in the presyllable needed to condition both lenition and Grade II.) One lost its presyllable before *P- could condition *-w-:

Stage 1: The earliest reconstructible form
Stage 2: Grade II for syllables with *-r- and lower vowels (*ʌ, *a, *e) compensating for *-r-loss: *Pʌ.g[a/e]2
Stage 3: *-g-lenition between sonorants
Stage 4: *Pʌ-loss: *ɣ[a/e]2 vs. *Pʌ.ɣ[a/e]2
Stage 5: *a/e-merger: *ɣe2 vs. *Pʌ.ɣe2
Stage 6: Presyllabic vowel loss
Stage 7: Labial metathesis: *PC- > *Cw-
Stage 8: *e-raising

(Tone 1 is automatically assigned to pre-Tangut syllables without *-H.)

The exact relative chronology of changes is unknown, though the following suborders are certain:

*-g- must lenite before presyllabic vowels are lost

*Pʌ- must be lost after its vowel conditioned lenition

It would be simpler but not necessary for *-r- to be lost before labial metathesis. Keeping *-ɣr- intact up until metathesis would require *P- to 'jump' over two consonants, not just one: *P.ɣr- > *ɣrw-.

It would be simpler but not necessary for *-r- to be lost before lenition. Keeping *-r- intact up until lenition would require lenition to occur between sonorants in general rather than just vowels.
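The eight stages above can be run mechanically to check that they yield both 1ghi2 and 1ghwi2 from *Pʌ.gr[a/e]. What follows is only a sketch under several assumptions of mine: the `Syllable` class and all function names are hypothetical, "[a/e]" is carried as an opaque placeholder for the undetermined vowel, lenition is simplified to "after a vowel-final presyllable", and gh in the final notation is taken to spell the lenited velar.

```python
from dataclasses import dataclass, replace

LOWER_VOWELS = {"ʌ", "a", "e", "[a/e]"}  # "[a/e]" = undetermined *a or *e

@dataclass(frozen=True)
class Syllable:
    pre: str    # presyllable: "Pʌ", "P", or "" once lost
    onset: str  # main-syllable onset, e.g. "gr"
    vowel: str
    grade: str  # "" until a grade is assigned

    def __str__(self):
        dot = "." if self.pre else ""
        return f"*{self.pre}{dot}{self.onset}{self.vowel}{self.grade}"

def r_loss(s):  # Stage 2: *-r- is lost and leaves Grade II behind
    if "r" in s.onset and s.vowel in LOWER_VOWELS:
        return replace(s, onset=s.onset.replace("r", ""), grade="2")
    return s

def lenition(s):  # Stage 3 (simplified): *g > *ɣ after a vowel-final presyllable
    if s.pre.endswith("ʌ") and s.onset.startswith("g"):
        return replace(s, onset="ɣ" + s.onset[1:])
    return s

def presyllable_loss(s):  # Stage 4: affects only one member of the doublet
    return replace(s, pre="")

def merger(s):  # Stage 5: *a and *e merge as *e
    return replace(s, vowel="e") if s.vowel in {"a", "e", "[a/e]"} else s

def presyllabic_vowel_loss(s):  # Stage 6: *Pʌ- > *P-
    return replace(s, pre=s.pre.rstrip("ʌ"))

def metathesis(s):  # Stage 7: *PC- > *Cw-
    if s.pre == "P":
        return replace(s, pre="", onset=s.onset + "w")
    return s

def raising(s):  # Stage 8: *e > i
    return replace(s, vowel="i") if s.vowel == "e" else s

def derive(start, early_loss):
    """Return the form at every stage; early_loss=True applies Stage 4."""
    stages = [r_loss, lenition]
    if early_loss:
        stages.append(presyllable_loss)
    stages += [merger, presyllabic_vowel_loss, metathesis, raising]
    forms = [start]
    for stage in stages:
        forms.append(stage(forms[-1]))
    return forms

def notation(s):
    # Tone 1 is prefixed because the source syllable has no *-H.
    return "1" + s.onset.replace("ɣ", "gh") + s.vowel + s.grade

start = Syllable(pre="Pʌ", onset="gr", vowel="[a/e]", grade="")
for early_loss in (True, False):
    forms = derive(start, early_loss)
    print(" > ".join(str(f) for f in forms), "=", notation(forms[-1]))
```

Reordering the functions is a quick way to test the suborders: if presyllabic vowel loss is applied before lenition, the presyllable no longer ends in a vowel and *g stays unlenited.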

I think Gong Xun may be right about Grade II being uvularization: i.e., i2 was phonetically [iʶ] conditioned by a medial *-r- that was perhaps uvular *[ʁ] in the vicinity of low vowels.

*r- by a high vowel like the *rɯ- of *rɯ.nej 'red' was not uvular and did not condition uvularization/Grade II; *rɯ.nej became Grade IV 1ne4.

3. I saw this in Theraphan Luangthongkum's "A View on Proto-Karen Phonology and Lexicon" (2019) and thought, 'no!'

PTB *b-r-gyat > *b-g-ryat > PK *grɔtD ‘eight’

Aside from the macroproblem of Proto-Tibeto-Burman (PTB) probably not existing, a microproblem is the proposed survival of 'PTB' *g- in Proto-Karen. PTB *b-r-gyat is a projection of Written Tibetan brgyad 'eight' back into the past, complete with a *-g- that is a Tibetan innovation - the product of Li Fang-Kuei's law. The consonant cluster †bry- does not exist in Written Tibetan.

I think Tibetan and PK have different prefixes attached to a common *r-root for 'eight'. (But what were those prefixes for?) Chinese has a labial prefix like Tibetan (八 *pret 'eight') whereas Japhug kɯrcat 'eight' and Evans' (2001: 246) Proto-Southern Qiang *khr[a/e] 'eight' have velar prefixes. (Northern Qiang also has a velar prefix: e.g., Mawo khaʳ.) Tangut 𘉋 1ar4 < *rjat 'eight' may preserve the bare root. It might have had a presyllable *Cɯ-, but there is no internal evidence pointing to either *p- or *k-. If the Tangut form had a presyllable, I would guess it started with *k- since Tangut is more closely related to Japhug and Qiang than to Tibetan and Chinese.
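The prefix pattern just described can be tabulated with a few lines of code. This is only a rough sketch: the forms are the ones cited above with asterisks removed, and classifying a prefix by the first letter of the form is my own crude assumption, not a real segmentation; `prefix_type` is a hypothetical helper.

```python
from collections import defaultdict

# Forms for 'eight' as cited in the text (asterisks removed).
EIGHT = {
    "Old Chinese": "pret",
    "Written Tibetan": "brgyad",
    "Japhug": "kɯrcat",
    "Proto-Southern Qiang": "khr[a/e]",
    "Mawo (Northern Qiang)": "khaʳ",
    "Proto-Karen": "grɔtD",
    "pre-Tangut": "rjat",
}

LABIALS = set("pb")
VELARS = set("kg")

def prefix_type(form: str) -> str:
    """Classify the prefix by the first letter only (a simplification)."""
    first = form[0]
    if first in LABIALS:
        return "labial"
    if first in VELARS:
        return "velar"
    return "none"  # bare *r-root, or a prefix this sketch cannot see

groups = defaultdict(list)
for language, form in EIGHT.items():
    groups[prefix_type(form)].append(language)

for kind in ("labial", "velar", "none"):
    print(kind, groups[kind])
```

On these forms the labial group is Chinese and Tibetan, the velar group is Japhug, Qiang, and Proto-Karen, and pre-Tangut alone shows no prefix.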

The vowel of PK *grɔtD ‘eight’ is surprising because other languages lack rounded vowels in the word: e.g., Pyu, sometimes thought to be Karenic, has hrat·ṁ /r̥ät/ 'eight'. (Could Pyu /r̥/ be from *gr- via *ɣr-? There is no gr- in Pyu.)

You can see the diversity of forms for 'eight' in Sino-Tibetan at STEDT.

Tangut Yinchuan font copyright © Prof. 景永时 Jing Yongshi
Tangut character image fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2019 Amritavision