Amaravati: Abode of Amritas

14.2.22.23:59: THREE RECONSTRUCTIONS OF PRE-TANGUT 'FIVE'

Here's a closer look at one of the cognate sets I brought up in "Sino-Tibetan Numerals".

Looking only at Old Chinese, Tibetan, and Burmese evidence, one would reconstruct the Proto-Sino-Tibetan word for 'five' with *-a. However, the Tangut word is

1ŋwə

with schwa rather than -a. Normally Tangut -a corresponds to -a elsewhere. How can I account for this irregular correspondence?

Solution A

I reconstruct a preinitial labial *P- as the source of the -w- of 1ŋwə. Related languages have labials preceding the ŋ-root of 'five': e.g., m- in many rGyalrongic varieties. What if this *P- was the initial of a presyllable whose vowel *Ə conditioned the raising of *a to schwa?

*PƏŋa > *PƏŋə > *Pŋə > 1ŋwə

(I use capital letters to indicate presyllabic vowels that 'color' the vowel in the main syllable: e.g., *I caused *a to front and raise to i, etc.)
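The Solution A derivation can also be read as an ordered list of rewrite rules. Below is a minimal Python sketch, assuming each change can be approximated by a regular-expression substitution on the transcription (tone number omitted). The rule labels and the `derive` helper are illustrative names, not part of any established tool.

```python
import re

# Ordered sound changes for Solution A as (label, pattern, replacement).
# Labels and rule granularity are assumptions made for illustration.
RULES = [
    ("*a raises to schwa after presyllabic *Ə", r"(Ə.)a", r"\1ə"),
    ("presyllabic vowel *Ə is lost", r"Ə", ""),
    ("preinitial *P- surfaces as medial -w-", r"P(ŋ)", r"\1w"),
]

def derive(form, rules=RULES):
    """Apply the rules in order and return every stage of the derivation."""
    stages = [form]
    for _label, pattern, repl in rules:
        stages.append(re.sub(pattern, repl, stages[-1]))
    return stages

print(" > ".join(derive("PƏŋa")))  # PƏŋa > PƏŋə > Pŋə > ŋwə
```

Rule order matters here: if the presyllabic vowel were deleted first, nothing would be left to condition the raising of *a.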

But is there any external evidence for *Ə? (2.23.0:52: There are other Sino-Tibetan languages with pə- in 'five', but other prefixes are attested, and even if I only look at p-prefixes, I can find all sorts of vowels after p-: e.g.,

Khezha paŋu

Chokri pɤŋu ~ püŋu

Angami peŋu

Mao pongo

Wakung pə ŋ (one syllable or two?)

Lushai pəŋa

It would be cherry-picking to say that Wakung and Lushai preserve a pə- that was lost everywhere else except in pre-Tangut.)
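To make the cherry-picking worry concrete, here is a small Python tally of the vowel that follows p- in the forms listed above. It is only a sketch: it assumes the prefix vowel is the single character after p- and counts the two Chokri variants separately.

```python
import unicodedata
from collections import Counter

# p-prefixed forms of 'five' quoted above; variants are listed separately.
FORMS = {
    "Khezha": ["paŋu"],
    "Chokri": ["pɤŋu", "püŋu"],
    "Angami": ["peŋu"],
    "Mao": ["pongo"],
    "Wakung": ["pə ŋ"],
    "Lushai": ["pəŋa"],
}

def prefix_vowel(form):
    """Return the character right after initial p- (NFC keeps ü as one character)."""
    return unicodedata.normalize("NFC", form)[1]

tally = Counter(prefix_vowel(f) for forms in FORMS.values() for f in forms)
for vowel, count in tally.most_common():
    print(vowel, count)
```

Schwa turns up in only two of the seven forms, alongside five other prefix vowels.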

Solution B

The pre-Tangut word was *PV́ŋa with stress on the first syllable. The original *a was unstressed and reduced to schwa. When stress shifted to the second syllable and the original first syllable was lost, the schwa of the second syllable remained as the only vowel in the word:

*PV́ŋa > *PV́ŋə > *PVŋə́ > *Pŋə > 1ŋwə

But there is no external evidence for initial stress in 'five'.

Solution C

Proto-Sino-Tibetan had a vowel *ʌ that raised to schwa in Tangut but lowered to a elsewhere:

*Pŋʌ > *Pŋə > 1ŋwə

But is it wise to reconstruct a vowel at the Proto-Sino-Tibetan level simply to account for an irregularity in Tangut?

None of these three solutions satisfy me. Maybe there's a fourth.

2.23.1:09: Could the vowel of 'five' have changed to match that of the adjacent numeral *T-ləC 'four' at a stage when 'four' had lost *-C but before its vowel had become retroflex: *T-ləə? But that raises the question of why 'four' has schwa instead of the expected *i.

14.2.21.23:59: SINO-TIBETAN NUMERALS: EVIDENCE FOR NUMEROUS VOWELS?

Matisoff's (2003) reconstruction of Proto-Tibeto-Burman looks a lot like Written Tibetan (WT), and recent Old Chinese reconstructions also resemble WT. One might expect their common ancestor Proto-Sino-Tibetan to also be Tibetan-looking. But are these reconstructions going in the right direction, or will they be viewed the same way that we now view a Sanskritish 19th century reconstruction of Proto-Indo-European?

WT has only five vowels: a, i, e, o, and u. Recent Old Chinese reconstructions and my pre-Tangut reconstruction have six: the five of WT plus schwa. One might expect these vowels to line up nicely, and they more or less do with the exception of Tangut. Does the inclusion of Tangut data necessitate the reconstruction of a more complex Proto-Sino-Tibetan system with more vowels such as *ɨ, *ʌ, and *y?

Gloss	Old Chinese	Written Tibetan	Tangut	Pre-Tangut	Written Burmese	Matisoff's (2003) Proto-Tibeto-Burman*	Proto-Sino-Tibetan?**
'one'	*tek 'single'	gcig	1lew	*Cʌ-tek	tac	*g-t(y)ik	*tek
'two'	*nis	gnyis	1niəə	*nəC	hnac	*g-ni-s	*nɨ
'three'	*səm	gsum	1sọ	*S-so	sumḥ	*sum	*səm
'four'	*slis	bzhi	1lɨəəʳ	*T-ləC	leḥ	*ləy	*lɨ
'five'	*ŋaʔ	lnga	1ŋwə	*P-ŋə	ṅāḥ	*ŋa	*ŋʌ
'six'	*ruk	drug	1tʂʰɨiw	*K-trik	khrok	*d/k-ruk	*ryk
'eight'	*pret	brgyad	1jaʳ	*rja	hrac	*b-r-gyat ~ *b-g-ryat	*rjat
'nine'	*kuʔ	dgu	1gɨəə	*gəC	kuiḥ	*d/s-kəw	*kəw

I have excluded 'seven' and 'ten' since no common ancestors can be reconstructed for them:

Gloss	Old Chinese	Written Tibetan	Tangut	Pre-Tangut	Written Burmese
'seven'	*tshit	bdun	1ʂɨạ	*Sɯ-ša	khunac
'ten'	*gip	bcu	2ɣạ	*Sʌ-KaH	chay, kyip

Are WT and Old Chinese really as conservative as many assume them to be***, or could Tangut be preserving Proto-Sino-Tibetan vowel distinctions lost in those other two languages?
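A rough way to see how poorly pre-Tangut lines up is to compare main-syllable vowels in the Old Chinese and pre-Tangut columns of the first table. The Python sketch below uses a crude heuristic (first vowel of the last hyphen-separated syllable). The `main_vowel` helper is a hypothetical name and the check is no substitute for real comparison.

```python
# Old Chinese and pre-Tangut forms from the first table (asterisks dropped).
NUMERALS = {
    "one": ("tek", "Cʌ-tek"),
    "two": ("nis", "nəC"),
    "three": ("səm", "S-so"),
    "four": ("slis", "T-ləC"),
    "five": ("ŋaʔ", "P-ŋə"),
    "six": ("ruk", "K-trik"),
    "eight": ("pret", "rja"),
    "nine": ("kuʔ", "gəC"),
}
VOWELS = set("aeiouə")

def main_vowel(form):
    """First vowel of the last hyphen-separated syllable (a crude heuristic)."""
    root = form.split("-")[-1]
    return next(ch for ch in root if ch in VOWELS)

pairs = {g: (main_vowel(oc), main_vowel(pt)) for g, (oc, pt) in NUMERALS.items()}
mismatches = [g for g, (oc, pt) in pairs.items() if oc != pt]
print(f"{len(mismatches)} of {len(pairs)} numerals differ:", ", ".join(mismatches))
```

By this count only 'one' has the same main vowel in both columns.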

2.22.12:36: I have added Written Burmese and Matisoff's (2003) Proto-Tibeto-Burman.

*I don't believe in a Tibeto-Burman subgroup of Sino-Tibetan (i.e., all Sino-Tibetan languages other than Chinese in a single branch of Sino-Tibetan). And even if such a subgroup existed, I do not think its ancestral language would be as similar to Tibetan as Matisoff's (2003) reconstruction: e.g., his PTB *b-r-gyat 'eight' is almost identical to Written Tibetan brgyad and even contains the Tibetan innovation of -g- between *r and *y. Nonetheless I include his reconstruction as an example of a proto-language with only six vowels.

**The Proto-Sino-Tibetan forms here are only offhand guesses, not the product of large-scale systematic comparison. I merely intend them to illustrate an alternative to mainstream reconstructions with four to six vowels (Gong 1995 and Hill 2012). I consider Hill (2012) to be the state of the art in Sino-Tibetan vowel reconstruction. Until recently I had assumed that the unusual Tangut vowels were conditioned by lost presyllabic vowels and/or stress (e.g., *Pʌ́-ŋə > *Pʌ-ŋə́ > *Pŋə > 1ŋwə 'five'), but now I am not so sure.

***No one assumes Written Burmese is vocalically conservative: e.g., the sound change *-ik > -ac in 'two' and 'seven' is well established.

14.2.20.23:59: RED SPREAD: THE *A-XODUS OF TANGUT VOWELS

Below I have color-coded pre-Tangut vowels using the colors I used for Yiddish:

*i	*ə	*u
*e	*a	*o

The Yiddish colors were selected to mimic a spectrum mapped to five vowel types in AEIOU order. Unfortunately that order does not nicely map onto phonetic reality since A is actually between front EI and back OU. Nonetheless I have retained the colors for ease of comparison with my previous post.

I assigned white to symbolize the achromatic nature of schwa, a vowel absent in the Yiddish vowel codes I've been exploring.

So far, it seems that the Tangut vowel types are fairly stable through time* with a few exceptions.

1. Fronting: *a, *u, *o fronted to i- and e-type vowels under uncertain circumstances. I think fronting was conditioned by presyllabic vowels *I and *E, but there is no external evidence to back up this hypothesis.

2. Bleaching: *a, *i, *e, *u, and *o were reduced to ə-type vowels, possibly when originally unstressed. Again, there is no external evidence to back up this hypothesis.

3. *aŋ backed to o-type vowels (a change shared with the local Chinese dialect).

4. Dissimilation: *u fronted to *i before *-ɰ from *-k. (But perhaps it was a presyllabic *I rather than a final glide that conditioned fronting. See 1.)

There may be even more exceptions that have not yet been found.

Out of the six Tangut vowel types, only a and u are derived exclusively from earlier *a and *u (and hence retain their original colors):

i < *i, *a, *u	ə < *a, *i, *e, *o, *u	u < *u
e < *e, *a, *o	a < *a	o < *o, *a

*a could become any vowel other than an u-type vowel. Hence "Red Spread" and "*A-xodus".
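The two claims above (only a and u have a single source, and *a can end up as anything but u) can be checked mechanically by inverting the source table. This is a minimal Python sketch; the dictionary simply transcribes the table, and `outcomes` is a hypothetical helper name.

```python
# Tangut vowel type -> pre-Tangut source vowels, transcribed from the table above.
SOURCES = {
    "i": {"i", "a", "u"},
    "ə": {"a", "i", "e", "o", "u"},
    "u": {"u"},
    "e": {"e", "a", "o"},
    "a": {"a"},
    "o": {"o", "a"},
}

def outcomes(sources):
    """Invert the table: pre-Tangut vowel -> Tangut vowel types it can yield."""
    result = {}
    for tangut, earlier in sources.items():
        for vowel in earlier:
            result.setdefault(vowel, set()).add(tangut)
    return result

OUT = outcomes(SOURCES)
# Tangut types whose only source is the matching pre-Tangut vowel.
stable = sorted(t for t, earlier in SOURCES.items() if earlier == {t})

print(sorted(OUT["a"]))  # ['a', 'e', 'i', 'o', 'ə']
print(stable)            # ['a', 'u']
```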

2.21.0:21: But is a Red Spread realistic? A quick reading of what I just wrote suggests that almost any Tangut vowel could develop into any other Tangut vowel for uncertain reasons. Must we wait for this chaos to be explained, or is there a more orderly alternative? I'll explore the latter in my next post.

*E.g., pre-Tangut *i developed into Tangut ə(ə)i, ɪ(ɪ), ɨi(i), i(i), əị, əiʳ, etc. which are all still i-like.

14.2.19.23:59: THE RUBIK'S CUBE OF YIDDISH VOCALISM

David Boxenhorn asked me to expand the table of Yiddish vowels from my last post, so I've done so below using the correspondences in Katz (1978: 8-13):

Dialect \ Vowel: 22-23, 32-33, 42-43, 52-53
Netherlandic: [ă], [oː], [ă]/[oː], [ɛ], [ɛj], [aː], [eː], [ĭ], [iː], [ɛj], [ɔ], [ɔw], [aː], [ŏ], [u], [ɔw]
Polish: [u], [aj], [eː] ~ [ej], [aː], [ɔj], [ĭ], [iː], [oː] ~ [ow]
Lithuanian: [a], [ɔ], [ej], [ɛ], [i], [aj], [ej], [u], [ɔj]

The presence of a breve is phonemic rather than phonetic; it indicates the presence of a length opposition. So [ă] and [a] may be phonetically alike, but [ă] contrasts with [aː] whereas [a] does not.

In Netherlandic Yiddish, 13 is short [ă] in Germanic words but long [oː] in Hebrew words.

My quick guess at a Proto-Yiddish vowel system:

31 *ĭ	32-33 *iː	34 *ij	51 *ŭ	52-53 *uː	54 *uw
21 *ĕ	22-23 *eː	25 *ej	41 *ŏ	42-43 *oː	44 *ow
		24 *aj	11 *ă	12 *aː	13 *aw

34 and 54 may have been phonetically *[ɪj] and *[ʊw] since *[ij] and *[uw] would be hard to distinguish from 32-33 *[iː] and *[uː] which they never merged with. The vowels of 34 and 54 later lowered even further.

All three Yiddish dialects have no synchronic vowel length distinction for high back vowels.

Netherlandic Yiddish vowel inventory

[ĭ]	[iː]	[u]
	[eː]	[ŏ]	[oː]
[ɛ]	[ɛj]	[ɔ]	[ɔw]
		[ă]	[aː]

Almost symmetrical (if one considers [ă]/[aː] central) except for the absence of [e] and a length distinction for [u].

Light green for [ɛj] indicates its mixed origins: part yellow *eː, part green *ij.

A different light green for [aː] indicates its mixed origins: part yellow *aj, part blue *ow.

Polish Yiddish vowel inventory

[ĭ]	[iː]	[u]
	[eː] ~ [ej]		[oː] ~ [ow]
[ɛ]		[ɔ]	[ɔj]
	[aj]	[ă]	[aː]

No phonemic height distinction for mid vowels. Phonemically [ɛ] could be the short counterpart of [eː] ~ [ej].

Gray for [ĭ]/[iː] indicates their mixed origins: part green *ĭ/iː, part purple *ŭ/*uː.

Lithuanian Yiddish vowel inventory

[i]		[u]
	[ej]
[ɛ]		[ɔ]	[ɔj]
	[aj]	[a]

No length distinctions. Symmetrical if one treats [ej] as the glide-final counterpart of [ɛ]. One could group the vowels into two classes: one with [i]/[j] and one without.

[u]	[i]
[ɛ]	[ej]
[ɔ]	[ɔj]
[a]	[aj]

I think there may have once been a three-way opposition between [ɛ], [e], and [ej] that ended once secondary *e lowered to [ɛ]:

primary *e > [ɛ] (or maybe *e was [ɛ])

primary *ej > secondary *e > [ɛ] (or maybe *ej was [ɛj] and lost its glide)

*eː, *aj > secondary [ej]

*iː > *əj > [aj] (I couldn't reconstruct *ej as an intermediate stage since this vowel did not merge with primary or secondary *ej)

Light green for [ej] indicates its mixed origins: part yellow *eː/*aj, part blue *oː/*ow.

Gray for [ɔ] indicates its mixed origins: part red *aː, part blue *ŏ.

2.20.15:36: I have added color to the tables and comments regarding mixed colors. Reconstructing Yiddish vowel history is like solving a Rubik's cube: e.g., how did the red area split in Netherlandic and Polish Yiddish?

14.2.18.23:33: WEINREICH'S YIDDISH VOWEL CODES

When I discovered Max Weinreich's system of codes for facilitating the comparison of vowels across Yiddish dialects last week, I thought of my letter codes for Sinospheric tones:

First digit \ Second digit	1. Short	2. Primary long	3. Secondary long	4. Diphthong	5. Special long
1. a	11. ă	12. ā	13. ă̄	(no 14)	(no 15)
2. e	21. ĕ	22. ē	23. ĕ̄	24. eG	25. é
3. i	31. ĭ	32. ī	33. ĭ̄	34. iG	(no 35)
4. o	41. ŏ	42. ō	43. ŏ̄	44. oG	(no 45)
5. u	51. ŭ	52. ū	53. ŭ̄	54. uG	(no 55)

The diacritics and G for 'glide' are my notational choices. The acute accent for length is taken from Czech, Slovak, and Hungarian.
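Since each code is just a pair of digits, decoding one is simple arithmetic. Here is a minimal Python sketch of the table above; the `decode` function and the dictionary names are illustrative, and the gaps are exactly the cells marked "(no ...)".

```python
# First digit: vowel quality. Second digit: series. Both follow the table above.
QUALITY = {1: "a", 2: "e", 3: "i", 4: "o", 5: "u"}
SERIES = {
    1: "short",
    2: "primary long",
    3: "secondary long",
    4: "diphthong",
    5: "special long",
}
# Codes marked "(no ...)" in the table.
MISSING = {14, 15, 35, 45, 55}

def decode(code):
    """Split a two-digit vowel code into (quality, series)."""
    first, second = divmod(code, 10)
    if first not in QUALITY or second not in SERIES or code in MISSING:
        raise ValueError(f"no vowel {code}")
    return QUALITY[first], SERIES[second]

print(decode(25))  # ('e', 'special long')
print(decode(42))  # ('o', 'primary long')
```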

Questions (numbered, of course!)

1. What conditioned the secondary length of the -3 series?

2. Why did the -2 and -3 series merge with the exception of 12 and 13 in Netherlandic Yiddish?

Dialect \ Vowel	11	12	13	41	42-43	44	51	52-53	54
Netherlandic	[a]	[aː]	[o]	[ɔ]	[ɔw]	[aː]	[o]	[u]	[ɔw]
Polish		[u]			[ɔj]		[i]	[iː]	[oː] ~ [ow]
Lithuanian		[ɔ]			[ej]		[u]		[ɔj]

3. How did the long low vowel 12/13 rise in Polish and Netherlandic Yiddish without merging with 41 [ɔ] as in Lithuanian Yiddish?

Short 11 remained low [a] in all three dialects described in the article. Sanskrit has the opposite pattern: long /ā/ is low and short /a/ is schwa-like.

4. Why did 42-43 front to [ej] while all other back vowels remained back in Lithuanian Yiddish?

5. What conditioned the secondary length of 25?

6. Why is 25 the only -5 vowel?

Glancing at table 2.18 in Katz (1978: 12), I have a new explanation for the fronting in Polish and Lithuanian Yiddish 42-43: the distinction between *-j and *-w was lost after *o, so 42-43 merged with 44 *oj (my guess), the vowel of 'tree':

Proto-Germanic *baumaz

German Baum

Proto-Yiddish *boym (?)

Netherlandic Yiddish [baːm]

Polish Yiddish [bɔjm]

Lithuanian Yiddish [bejm]

Dutch boom

English beam (obviously no longer 'tree')

How can I bridge the gap between Proto-Germanic *au and Proto-Yiddish *oj?

*au > *æu > *æɥ >

Lithuanian Yiddish [ej]

Polish Yiddish *œɥ > [ɔj]

Yiddish vowel number codes got me thinking about my old ideas about number codes for Tangut vowels. Everyone agrees that the dictionary dialect(s) of Tangut had 105 rhymes and that - for example - the first seven rhymes were something like -u. But there is no agreement on how to reconstruct those rhymes. Codes like u1.1 (first u-type rhyme of the first rhyme cycle), u2.2 (second u-type rhyme of the second rhyme cycle), etc. are more informative than 'rhyme 1', 'rhyme 62', etc. Unfortunately, there is no consensus on the vowel types and cycles of some rhymes: e.g.,

Rhymes	Arakawa 1999		Gong 1997 and this site
Rhymes	Vowel type	Cycle	Vowel type	Cycle
77-79	e	2	e	3
99	o	3	i
100	i		y
101	e		i
103	a	4	o

The only neutral labels for these rhymes are 77-79, etc.
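A code like u1.1 is easy to take apart mechanically, which is part of its appeal. The Python sketch below assumes the number before the dot is the cycle and the number after it is the position within that vowel type and cycle. The two examples in the text (u1.1, u2.2) do not settle that order, so treat it as a guess. `RhymeCode` and `parse` are hypothetical names.

```python
import re
from typing import NamedTuple

class RhymeCode(NamedTuple):
    vowel_type: str
    cycle: int   # assumed: number before the dot
    index: int   # assumed: number after the dot

CODE = re.compile(r"([aeiouəy])(\d+)\.(\d+)")

def parse(code):
    """Parse a code such as 'u1.1' into its three parts."""
    m = CODE.fullmatch(code)
    if m is None:
        raise ValueError(f"not a rhyme code: {code!r}")
    return RhymeCode(m.group(1), int(m.group(2)), int(m.group(3)))

# Rhymes 77-79 from the table above: same vowel type, different cycle.
LABELS_77_79 = {"Arakawa 1999": ("e", 2), "Gong 1997 and this site": ("e", 3)}

print(parse("u1.1"))
print(len(set(LABELS_77_79.values())) > 1)  # True: the two analyses disagree
```

Any such code bakes in one analysis of vowel type and cycle, which is why the bare rhyme numbers remain the only neutral labels.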

14.2.17.23:19: THE BROADER RELEVANCE OF BRISK VOWELS

All roads seem to lead back to the Tangut Empire - even if they start in Belarus.

Last night I realized that the correspondences in "Broad Bread" were like those between Tangut and its relatives: e.g.,

front vowel : back vowel

Litvish breyt 'bread' : Standard Yiddish broyt 'id.'

Dutch been 'bone' : English bone

Tangut 2rieʳ 'bone' : Written Tibetan རུས་ rus 'id.'

Does this mean that the correspondences have similar origins? No. I think the Tangut vowels fronted to harmonize with lost front-vowel presyllables: e.g.,

*CI-ro-H > *CI-rø-H > *Ci-re-H > *Ci-rie-H > 2rieʳ

On the other hand, Dutch ee is from a Proto-Germanic *ai that lost its *i in Old English and rounded to o in Middle English. And maybe Litvish ey is the product of dissimilation followed by assimilation:

*aw > *ew (vowel fronting to be further away from back *u)

*ew > *eɥ > ey (glide fronting to assimilate to *e)

So perhaps my hypothetical *øɥ was wrong - or was it? David Boxenhorn reported øɥ existed in some variety of Yiddish. Maybe the vowel tree was something like this:

*oː
*ow
*aw
Germanic Yiddish aw	*ew
	*eɥ
	Litvish ey	*øɥ
		Polish Yiddish oy	? Yiddish øɥ
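The tree above can be written down as a parent map, which makes each variety's path explicit. This Python sketch encodes one reading of the layout (Germanic aw and *ew both from *aw, Litvish ey and *øɥ both from *eɥ). That branching is an assumption about how the tree is meant to be read, and `lineage` is a hypothetical helper.

```python
# Child -> parent, following one reading of the tree above (an assumption).
PARENT = {
    "*ow": "*oː",
    "*aw": "*ow",
    "Germanic Yiddish aw": "*aw",
    "*ew": "*aw",
    "*eɥ": "*ew",
    "Litvish ey": "*eɥ",
    "*øɥ": "*eɥ",
    "Polish Yiddish oy": "*øɥ",
    "? Yiddish øɥ": "*øɥ",
}

def lineage(node):
    """Return the path from the root of the tree down to the given node."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]

print(" > ".join(lineage("Litvish ey")))
print(" > ".join(lineage("Polish Yiddish oy")))
```

On this reading Polish oy passes through one more stage (*øɥ) than Litvish ey does.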

ey reversed is ye = ie, which with retroflexion is the vowel of Tangut 2rieʳ 'bone'. There is no escape from the lost tongue of the Great State of White and High. It's ultimately all about me, no, I mean

1mi 'Tangut'

14.2.16.23:16: BROAD BREAD

Reading about Al Jaffee's years in Lithuania yesterday led me to the Wikipedia articles on Lithuanian Jews and the Brisk tradition, bringing me back to the Brest half of Brest-Litovsk.

According to "Lithuanian Jews",

Litvaks have an identifiable mode of pronouncing Hebrew and Yiddish; this is often used to determine the boundaries of Lita (area of settlement of Litvaks). Its most characteristic feature is the pronunciation of the vowel holam as [ej] (as against Sephardic [oː], Germanic [au] and Polish [oj]).

Apparently Sephardic [oː] is the most conservative pronunciation. How did the others develop? My guess is that

*oː became *ow

which Germanic Jews dissimilated to [aw] (= [au])

which fronted to *øɥ

and dissimilated to [oj] in Polish Jewish speech

but delabialized to [ej] in Lithuanian Jewish speech

I would not have expected *øɥ in Polish or Lithuanian Jewish speech since neither Polish nor Lithuanian has front rounded vowels or labiopalatal glides. *øɥ reminds me of standard German eu/äu [ɔʏ]. What would be the motivation to front the vowel of *browt 'bread' (or to at least shift *w to *j)? Such fronting also occurred elsewhere in Germanic: e.g., English bread (front and nonlabial) and Swedish bröd (front and labial) as opposed to Dutch brood and German Brot (both back and labial). More cognates here.

The title was inspired by this passage:

In all probability, Minsk also became the only capital city in Europe where, as late as 1937, one could see a truck passing through the city streets, distributing bread to local cooperatives, with the word "Bread" written in Yiddish on it. Actually, as someone noticed at the time, lamenting the level of "ignorance and provincialism" of the text on the bread lorry, the writing did not appear in standardized literary Yiddish, but in Litvish Yiddish, or the dialect spoken in Minsk and its environs. (Instead of reading "broyt" - or "bread" in Standard Yiddish - the text on the truck read "breyt," which in the Litvish dialect indeed means "bread" but in standard Yiddish means "wide.")

Conversely,

A Lithuanian Jew, who pronounces the word broyt (bread) with the diphthong /ei/, may spell it breyt if he is not familiar with standard orthography, and if he is very deficient in orthography and he wants to write in a "literary" fashion, he may come out with a statement that the street is broyt (bread), instead of breyt (wide).

English has the adjective broad corresponding to the noun breadth. What is the origin of that alternation which is also in

long : length

strong : strength

wrong : wrength

Never heard of wrength before, but make no mistake - I like it.

Looking at Wiktionary, I get the impression that

Proto-Germanic *ai > Old English ā > Middle English o

and confirmed that in Beekes (1995: 157), but why wouldn't that change also affect the noun which was Middle English brede (not *brode)?

That change is parallel to

Late Old Chinese *ai > Middle Chinese *a > Cantonese o

and the first stage has already occurred in southern US English dialects with 'ah' for I.

I've long been puzzled by vowel correspondences between English and other Germanic languages. Beekes (1995: 159) addressed the very words that bother me (emphasis mine):

PIE *ou > ī: ear < OE ēare < PGmc *aurōn- < *ous (but note exceptional developments of PIE *ou in death, great, high, red [cf. Dutch dood, groot, hoog, rood]).

For starters, how did Proto-Germanic *au become Old English ēa? I write in English, but I'm embarrassed to admit that I don't know much about English.

2.17.0:51: Ah, I think I see now: *u lowered and delabialized to harmonize with a fronted *æ:

Proto-Germanic *au > *æu > *æo > *æɔ > *æa > Old English ēa

But I still don't understand where the length in the Old English diphthong came from.

14.2.16.0:40: BALTIC VOWEL LENGTH: REPRESENTATION AND EVOLUTION

The Litovsk in Brest-Litovsk means 'Lithuanian'. When I read about the Treaty of Brest-Litovsk a few days ago, I had no plans to blog about Lithuanian, but then yesterday I read about Al Jaffee's years in Lithuania. I reacquainted myself with the Lithuanian alphabet and wondered:

1. Why is there no consistent method of indicating long vowels of nonnasal origin?

a, e [æ], and o can be short or long, but their length is not indicated in writing

(Short o is only in loanwords since Proto-Balto-Slavic shifted original short *o to *a. Perhaps the length of o was not indicated since one could just shorten it in loanwords. But what about a and e?)

ė [eː] has a dot

short i [i] and long y [iː] are written with different letters

ū [uː] has a macron

If I redesigned the Lithuanian alphabet, I would add the letters ā and ē for long a and e and ŏ for borrowed short o which I assume is less frequent than native long o.

2. How did the three-way distinction between [æ], [æː], and [eː] develop from Proto-Balto-Slavic which only had *e and *eː? (Those [æː] written as ę are of nasal origin, but not those written as e.)

3. Why is there no ǫ [oː] < *õ? Are some long o [oː] from *ó̃? Had *ó̃ already denasalized and merged with *oː at the time Lithuanian orthography was established?

The post-1946 Latvian alphabet has length indicated with a macron for all vowels other than o. Ō continues "to be used in print throughout most of the Latvian diaspora communities, whose founding members left their homeland before the post-World War II Soviet-era language reforms." Apparently long o is in loanwords (whereas in Lithuanian it is short o that is in loanwords). My guess is that original long o shortened in Latvian, which later gained a new long o in loanwords.

Why was uo for [uɔ] discarded in favor of o in 1914?

Why are [æ]/[æː] and [e]/[eː] both written with e/ē?

Did Latvian ever have nasalized vowels, and if so, what happened to them?
