Amaravati: Abode of Amritas

08.8.30.23:59: STRAINER UNDER A SILKEN ROOF

TT3967 ?dʑjɨ R31 'patterned; silk'

is the most eye-catching member of its homophone group (Homophones 35B71-76):

It looks like a drawing of a house, but its rooflike semantic element

Li Fanwen radical 114

probably means 'silk'. Unlike other elements, which sometimes have no obvious or consistent semantic or phonetic function, LFWR114 always signifies something to do with materials for clothes.

LFWR114 has no obvious Chinese source. It doesn't look like the Chinese semantic element 糸 'silk' (which often appears in the sinographs translating tangraphs with it). Could it be influenced by the top and enclosing elements of 繭 'silkworm cocoon', minus the center divider? (The upside-down ш shape in 繭 seems to be unique to that sinograph.)

The element enclosed by LFWR114 'silk' in TT3967 is

TT0160 lo R51 2.42 [Sofronov 1968 and Gong have 1.49] 'strainer; skimmer' (according to Arakawa and Kychanov; Grinstead defined this as 'fence'!)

TT0160 cannot be semantic in TT3967, but it is probably phonetic since it also appears in

TT2517 ?dʑjɨ R31 'pull out; exempt; save from work'

which is homophonous with TT3967.

But no other tangraphs with TT0160 (= A&K0002) have similar readings:

Archetype	A&K number and reading
L-type	0002 lo 2.42 or 1.49 'strainer', 0009 lo 1.49 'look for; search; dig'
	0003 lhje 1.14 'pull out; pluck', 0004 lhje 1.14 'pull out a root; large piece of meat; weed (latter two 'things to be pulled out'?)'
	0005 lwa 1.20 (2nd half of rhyming binoms ka-lwa 1.17 1.20 'suddenly', 'wisdom' [< 'quick wit'?], 'interrogate' [why?] and tsa-lwa 1.20 1.20 'rapidly')
NJ-type	0006 ndʑê ? 'pull out; exempt; save from work', 5788 ndʑê ? 'patterned; silk'
TH-type	0011 thə̣ 1.68 'quick witted; intellectual; curiosity; knowledge; investigate' ('to strain through facts'?)
TH-type	0007 thu 1.1 'draw a bow' ('pull string away from bow'?)
KH-type	0008 khu 2.5 'dig; look for; turn over'
CH-type	4214 tɕhjeɯ 1.45 'haze; misty; fogs; vaporous; steaming'
S-type	3156 sjə 2.29 'pinch; shake off; knock off, beat out'
n/a; 0002 not phonetic; added as a graphic 'echo' of 0002 in 0005	0010 ka 1.17 (2nd half of nearly rhyming binom ka-lwa 1.17 1.20; see 0005 above)
	0012 tsa 1.20 (1st half of rhyming binom tsa-lwa 1.20 1.20; see 0005 above)

(Table added 8.31.1:19. The numbers and reconstructions are from Arakawa and Kychanov 2006. The results would be the same regardless of reconstruction.)
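The spread of readings in the table can be tallied with a short sketch. The rows are copied from the table above; the tuple layout and the function name `group_by_archetype` are illustrative choices only, and 0010 and 0012 are left out because 0002 is not phonetic in them.

```python
from collections import defaultdict

# (A&K number, reading, archetype), copied from the table above.
# 0010 and 0012 are omitted: 0002 is only a graphic 'echo' there.
ROWS = [
    ("0002", "lo", "L"), ("0009", "lo", "L"),
    ("0003", "lhje", "L"), ("0004", "lhje", "L"), ("0005", "lwa", "L"),
    ("0006", "ndʑê", "NJ"), ("5788", "ndʑê", "NJ"),
    ("0011", "thə̣", "TH"), ("0007", "thu", "TH"),
    ("0008", "khu", "KH"),
    ("4214", "tɕhjeɯ", "CH"),
    ("3156", "sjə", "S"),
]


def group_by_archetype(rows):
    """Collect (number, reading) pairs under each archetype label."""
    groups = defaultdict(list)
    for number, reading, archetype in rows:
        groups[archetype].append((number, reading))
    return dict(groups)


groups = group_by_archetype(ROWS)
for archetype, members in groups.items():
    print(f"{archetype}-type: {len(members)} tangraph(s)", members)
```

On this count, twelve tangraphs fall into six archetypes, and only the NJ-type pair matches the reading of TT3967.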

TT0160 seems to be a highly polyphonous element. Extreme polyphony can exist: e.g., the many readings of 生 in Japanese depending on context. To the Tangut, sinography also contained polyphonous phonetic elements: e.g., 者 *tʃa was in

都 *tu

諸 *tʃy

奢 *ʃy

緒 *sy

with four different initials and three different rhymes in Tangut period northwestern Chinese.
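The count of initials and rhymes can be checked mechanically. This is only a sketch: it assumes each reading splits into initial and rhyme at its first vowel, and the vowel set and the helper name `split_reading` are made up for this illustration. 者 itself is included in the count.

```python
# Tangut-period northwestern Chinese readings as given above.
READINGS = {"者": "tʃa", "都": "tu", "諸": "tʃy", "奢": "ʃy", "緒": "sy"}

# Assumption: these are the only vowels occurring in the five readings.
VOWELS = set("auy")


def split_reading(reading):
    """Split a reading into (initial, rhyme) at its first vowel."""
    for i, ch in enumerate(reading):
        if ch in VOWELS:
            return reading[:i], reading[i:]
    raise ValueError(f"no vowel found in {reading!r}")


initials = {split_reading(r)[0] for r in READINGS.values()}
rhymes = {split_reading(r)[1] for r in READINGS.values()}

print(len(initials), sorted(initials))
print(len(rhymes), sorted(rhymes))
```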

The Tangut could not have known that all five once shared a common *TA-like core in Old Chinese:

者 *tʃa < OC *tja < ?*i-ta

都 *tu < OC *ta < ?*(Cʌ-)ta

諸 *tʃy < OC *ta < ?*Cɯ-ta

奢 *ʃy < OC *sta < ?*sɯ-ta

緒 *sy < OC *sda < ?*sɯ-da < ??*sɯ-nɯ-ta

Such Chinese 'polyphony' is an unintended side effect of centuries of sound changes. On the other hand, Japanese polyphony is the result of layers of borrowing from Chinese plus multiple native morphemes corresponding to a single Chinese morpheme: e.g. (not a complete list),

Native Japanese	Sino-Japanese	Chinese
i (used to write first syllable of various 'life' words with ik-), u (used to write first syllable of various 'birth' words with um-), nama, ki- 'raw', o, na, ha (used to write first syllable of various 'grow' verbs with o-, na-, and ha-), -fu 'place where plants grow' (no Chinese counterpart; extended usage of 生 'grow'?)	shou (early layer of borrowing), sei (later layer of borrowing)	生 'life', 'birth', 'raw', 'grow'

What is the motivation behind Tangut polyphony? Deliberate obscurantism, or something more practical, like an interface between two different languages?

Next: A 'strained' semantic solution.

08.8.29.23:59: PERSON + TANGUT = ?

While looking up

TT3316 mi R11 2.10 'Tangut'

in Arakawa and Kychanov (2006: 333, entry 2123) when writing "Top 10 Endangered Languages", I discovered a tangraph that looks like 'person' + 'Tangut':

TT3344 + TT3316 = TT3739

TT3739 is listed in chapter VII of Homophones (35B71), so it must have a palatal initial. However, everything else about it is unclear. There is no agreement on its initial, rhyme, or meaning, and its tone is unknown:

Sofronov (1968 II: 360): ndʑêi R35

Li Fanwen (1986: 376): dʑie or ɲie R10 番 'Tangut'

Gong Hwang-cherng in Li Fanwen (1997): dʑjɨ R31; defined by Li Fanwen (1997) as a 族姓 clan surname

(8.30.2:45: Although I have not been able to find TT3739 in Mixed Tangraphs' lists of Tangut names, that does not necessarily mean that Li Fanwen was wrong.)

Arakawa and Kychanov (2006: 333): ndʑêi R35 (same as Sofronov 1968 II) 'тангуты' (Tanguts), 'Tangut', 番人 (Tangut person)

Neither TT3739 nor its homophones (35B72-B76) appear in the Tangraphic Sea.

TT3739 has no entry in Nevsky 1960.

A&K (2006: 260, entry 1491.2) listed a compound

?'тангут', 'Tangut', 番人

which also appears in Homophones (35B71; translated by Li Fanwen as 西夏人 'Tangut person'). The first element is TT0405 lhwiẹ R64 1.61 'Tangut' and appears in the compound

'страна тангутов', 'Tanguts' country', 番人國 (A&K 1491.1)

with the second element TT2350 ʔɔ̣ R74 1.71 'territory' (A&K 4245; Grinstead 1972: 107 has 'init. part.'!).

If TT3739 is a surname, is it like surnames such as England, etc.? But it doesn't sound like any other native Tangut autonym: mi, mɨ niaa (cf. Tibetan mi.nyag)*, lhwiẹ. Could it represent a foreign name for the Tangut that sounded something like 'Ji'?

*8.30.1:34: It later occurred to me that Li Fanwen's ɲie R10 does sound like TT5745 niaa R21 2.18.

Is TT5745 a bound morpheme? I've never seen it outside the disyllabic word

mɨ niaa 'миняги; тангуты', 'Miniag; Tanguts', 黨項

A&K 2966.2 list the pronunciation of this name in Sofronov's reconstruction as mɪ tɕjei 2.28 1.35, but this is an error for mɪ njaɯ 2.28 2.18.

TT5748 Sofronov's tɕjei (= my tɕɨe) R36 1.35 (name of a bird) and

TT5745 Sofronov's njaɯ (= my niaa) R21 2.18

look very similar. Their bottom right elements differ: TT5748 has 'bird' whereas TT5745 has 'person'. Both have 'bird'/'waist' on their left sides. No analysis is available for TT5745, so the sources of its components are unknown.

08.8.28.1:44: TOP 10 ENDANGERED LANGUAGES

My blogfather James Hudnall sent me Peter K. Austin's list. Unfortunately, Austin's book 1000 Languages: The Worldwide History of Living and Lost Tongues is not yet available at amazon.com.

Two of my teachers worked extensively on two of the languages on the list: Alexander Vovin (Ainu, #3) and Robert Blust (Thao, #4).

I had never heard of Ket (#10) until Sasha Vovin talked about it in one of my classes. Ket is the last survivor of the Yeniseian languages of Siberia that may be related to the Na-Dené languages of North America.

Obviously, my number one lost tongue is

mi² ŋwəəu¹ 'Tangut'

No, I haven't forgotten about it ...

08.8.27.3:15: THE CONSONANT PHONOTACTICS OF GEORGIAN (PART 3)

Writing about languages one doesn't specialize in is always hazardous, and drawing typological generalizations from those languages is even more dangerous. I feel fairly confident when I talk about languages that I have personally studied, but I get a bit nervous when I leave my 'comfort zone'. It's hard to strive for accuracy in the dark. So when I look at these odd statements in Butskhrikidze's (2002) chapter on phonotactics, I wonder if I've said anything as strange:

p. 10: "In Sanskrit words are never represented as bare stems."

I presume "words" refers to nongrammatical words. Although a typical Sanskrit word has an affix, some bare consonant-final stems can be words: e.g., vaak 'voice' (nominative singular). (But historically, vaak is from *wook-s with a suffix; cf. Latin vox.)

(3:45: p. 10: Butskhrikidze described the rules for Sanskrit stress accent which postdate the original Vedic pitch accent. She seemed to treat Sanskrit as a "fixed [i.e., nonphonemic] accent" language but it originally had mobile phonemic accent: e.g., váacas [nom. pl. of vaak 'voice'] vs. vaacás [acc. pl.; váacas also possible].)

p. 22: "Consider for instance, phonotactic restrictions on monomorphemic words of English. The generalisations are taken from Davis (1985). Words of the type C_iC_jVC_j are not attested. Thus, forms like *flil are not possible. Similarly: C_iC_jVC_jC_i forms, e.g. forms like flilf, are not attested."

What about stat and, if [ej] is analyzed as a single vowel, state or flail? friar would be a counterexample if [aj] is analyzed as a diphthong and final r is analyzed as /r/ rather than as /ɚ/ (in rhotic pronunciation).
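The C_iC_jVC_j template can be tested against these candidates with a small sketch. The segmentations are hand-made and follow the analyses mentioned above ([ej] and [aj] as single vowel units, final r as /r/); the function name `is_cicjvcj` and the requirement that C_i differ from C_j are assumptions made for the illustration.

```python
# Vowel units used in the hand segmentations below (an assumption).
VOWELS = {"æ", "ɪ", "ej", "aj"}


def is_cicjvcj(segments, vowels=VOWELS):
    """True if segments match C_i C_j V C_j: two different onset
    consonants, one vowel unit, and a final consonant identical
    to the second onset consonant."""
    if len(segments) != 4:
        return False
    c1, c2, v, c3 = segments
    consonants_ok = all(s not in vowels for s in (c1, c2, c3))
    return consonants_ok and v in vowels and c1 != c2 and c2 == c3


CANDIDATES = {
    "flil": ["f", "l", "ɪ", "l"],    # the starred form from the quotation
    "stat": ["s", "t", "æ", "t"],
    "state": ["s", "t", "ej", "t"],
    "flail": ["f", "l", "ej", "l"],
    "friar": ["f", "r", "aj", "r"],
    "flat": ["f", "l", "æ", "t"],    # control: final differs from C_j
}

results = {word: is_cicjvcj(segs) for word, segs in CANDIDATES.items()}
for word, matches in results.items():
    print(word, matches)
```

Under these segmentations every word except the control flat fits the supposedly unattested template.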

p. 31: Japanese has "no word-final consonants". But in fact Japanese has many words ending in -n [ɴ]: e.g., Nihon 'Japan'.

p. 31: English has "no word-initial [ž]". But what about genre [ʒɑnrə]?

p. 56: "Zubkova (1990) says that in Vietnamese consonant combinations [? - should be "sequences" -A] in C1VC2 words have two main characteristics:

Restrictions on C₁ and C₂ in C₁VC₂ words in Vietnamese

a) C₁ and C₂ form a rising sonority contour.

b) C₁ and C₂ belong to different phonemic sets.

Thus, in Vietnamese the Sonority Sequencing Principle works on the word domain and consonants form separate sets depending on the positions in a word: initial and final. For instance, voiced fricatives occur only in word-initial position, while semivowels occur only in word-final position."

This seems to imply that Vietnamese initials can never be finals, though they often are: e.g., mắm 'salted fish'. Sequences with falling sonority are common: e.g., một 'one'. Moreover, some Vietnamese dialects do allow the semivowel [j] as an initial.

p. 57: Russian лук, рот, рак are listed as examples of consonant sequences "dispreferred" in Vietnamese, which in fact has such syllables with falling sonority: e.g., lục 'six', rốt 'last', rác 'garbage'.
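The sonority contours of the Vietnamese examples above can be sketched as follows. The five-step scale is a generic textbook-style ranking chosen for the illustration, not one taken from Butskhrikidze or Zubkova, and the consonant classes are assigned by hand from the spelling. Orthographic r is treated as a liquid here; classing it as a fricative would still give a falling contour before a stop.

```python
# Hypothetical sonority scale (higher = more sonorous).
SONORITY = {"stop": 1, "fricative": 2, "nasal": 3, "liquid": 4, "glide": 5}

# (word, class of C1, class of C2), classified by hand.
WORDS = [
    ("mắm", "nasal", "nasal"),
    ("một", "nasal", "stop"),
    ("lục", "liquid", "stop"),
    ("rốt", "liquid", "stop"),
    ("rác", "liquid", "stop"),
]


def contour(c1_class, c2_class):
    """Compare the sonority of the initial and final consonants."""
    first, last = SONORITY[c1_class], SONORITY[c2_class]
    if first < last:
        return "rising"
    if first > last:
        return "falling"
    return "level"


contours = {word: contour(c1, c2) for word, c1, c2 in WORDS}
for word, shape in contours.items():
    print(word, shape)
```

None of the five words comes out as rising, which is the contour the quoted restriction (a) requires.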

p. 63: "Japanese has a constraint against the occurrence of two separate voiced obstruents within a morpheme"

This constraint is absent from modern Japanese: e.g., debu 'fat' and loanwords like debyuu 'debut'.

p. 65: Very few specialists in Japanese would describe Japanese as "Altaic/Austroasiatic".

I presume "affixes" only refers to inflectional affixes. Japanese has derivational prefixes and Vietnamese has derivational prefixes and suffixes (see Table 8 of Alves 1999 for examples).

pp. 66-67: "Vietnamese words are characterised by the occurrence of certain phoneme classes only in certain positions of a word: initial, medial or final. For instance, the obstruents th and kh are typical initial consonants (i.e. C₁) ... the typical word-medial consonants are glides, while the consonants p and c are the typical final ones (i.e. C₃) ... The typical word-initial nasal is /m/, while the typical word-final is the nasal /n/ ... Thus, in general terms, one can conclude that Vietnamese C₁VC₂VC₃ words, in which all consonants belong to the lexical morpheme/word, are characterised by a rising and falling sonority contour."

Although it is true that th and kh can only appear initially, there is only one medial glide /w/, and c [k] can also occur as an initial. /m/ and /n/ are both initials and finals. C₁VC₂VC₃ seems to be an orthography-based misinterpretation of syllables like Nguyên [ŋwiən] (not disyllabic [ŋujen]!).

ADDENDUM 1: Butskhrikidze (2002: 36) cited Sanskrit gemination rules from Vennemann (1972). I wasn't taught these rules and I can't find them in Whitney's or Macdonell's grammars. I learned nongeminated forms of the examples cited: e.g., maargam 'road' (accusative singular), sapta- 'seven' instead of maarggam and sappta-. All dictionaries I have ever used only list nongeminated forms. Are the rules in W. Sidney Allen's Phonetics in Ancient India, which I haven't read in 13 years? Is this gemination subphonemic?

ADDENDUM 2: This isn't in Butskhrikidze, but while I'm writing about typology ...

I was puzzled by the treatment of Japanese in the indefinite articles map at WALS. Does Japanese have an "indefinite word distinct from 'one'"? It does have aru (lit. 'that exists') which could be translated as 'a' or 'some' but is certainly not as common as European indefinite articles. Someone could look at this map and think that Japanese has an indefinite article since English, Dutch, etc. were also categorized as having an "indefinite word distinct from 'one'". I would distinguish between indefinite words and indefinite articles.

Strangely, even though Jpn aru can be translated into Korean as OttOn (lit. 'what kind of') or OnU (lit. 'which'), Korean was not put into the same category as Japanese. I would treat both as languages without definite or indefinite articles.

08.8.26.2:13: THE CONSONANT PHONOTACTICS OF GEORGIAN (PART 2)

Butskhrikidze (2002:1) began her book by citing prckvna 'to peel' as an example of a Georgian word with a complex consonant cluster. prckvna cannot be a bare stem because it has too many consonants. The longest possible vowel-final stems only have two consonants: CCV (p. 196). It must consist of a stem plus one or more affixes. Butskhrikidze analyzed it as

/pVc͡kʷ-en-a/

CVC stem + -VC suffix + -V suffix

even though its surface form is CCCCCV (p. 149)!

I find this unconvincing. Although the Megrelian (= Mingrelian) cognate purckon-u-a 'to peel' implies an earlier polysyllabic form in the common ancestor of the two languages, I do not believe that speakers 'relive' history to generate surface forms. Underlying forms must be learnable. How can children hear [prckvna] and figure out that it is 'really' /pVc͡kʷ-en-a/ with two inaudible vowels, one of which is unspecified?

According to Butskhrikidze, -r- is optional between p and c. Is pckvna an acceptable pronunciation of 'to peel'? I presume there are no minimal pairs of pck and prck. Synchronically, -r- seems to be inserted to break up a stop sequence pc-, though -r- probably was originally part of the stem (cf. Mingrelian).

The unit phoneme /c͡kʷ/ looks like an ad hoc means to avoid positing /ckʷ/ (two consonants) or /ckv/ (three consonants). I have nothing against unit phonemes as long as one can demonstrate that they pattern like single phonemes. Butskhrikidze provides such arguments for 'harmonic clusters' like /c͡kʷ/ in chapters 3-5. I have only looked quickly through her book, so I will reserve judgment. I can only say that I don't think East Asian languages have anything like her harmonic clusters.

08.8.25.3:40: THE CONSONANT PHONOTACTICS OF GEORGIAN (PART 1)

A lot of non-Altaic (South)east Asian languages have undergone or are undergoing what I call 'The Great Collapse' at different speeds: disyllables becoming sesquisyllables which in turn end up as monosyllables. To understand this process better, I have been looking for other 'collapsed' languages: e.g., Polish, in which *gъdanьskъ (four syllables*) became the monosyllable Gdańsk (Comrie 1987: 326).
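The size of the Polish collapse can be counted with a toy sketch. It assumes that every vowel letter, including the two yers ъ and ь, stands for one syllable nucleus; the vowel set and the function name `count_syllables` are made up for this illustration and would not do for general syllabification.

```python
# Plain vowels plus the two yers, each counted as one nucleus.
VOWELS = set("aeiouъь")


def count_syllables(word):
    """Count syllable nuclei by counting vowel letters."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


print(count_syllables("gъdanьskъ"))  # reconstructed form
print(count_syllables("Gdańsk"))     # modern form
```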

The back cover of Butskhrikidze (2002) suggests that Georgian is a 'collapsed' language:

The central topic of this thesis is the study of Georgian consonant sequences: e.g., forms of the CCC type. It demonstrates that the complexity of the Georgian consonant clusters is related to morphological complexity and to processes of vowel reduction and complex segment formation. Thus the Georgian 'complex' CCC complexes are derived from structures of the CVCVCV type.

Similarly, I believe that Old Chinese consonant clusters are usually morphologically complex and largely if not wholly "derived from structures of the CVCVCV type".

Sagart (1999: 20) proposed that Old Chinese roots had a simple structure: *CV(C)(ʔ). His precise formulation is open to debate but the monosyllabicity of OC roots is not. (Whether pre-OC roots were monosyllabic is another question.) Since OC had no inflectional morphology, there is no need to distinguish stems from words. Most OC stems/words consist of a monosyllabic root with or without subsyllabic affixes. (Exceptions include compounds, reduplicated forms, and borrowings.)

Are Georgian stems (which may or may not be morphologically complex) predominantly monosyllabic? Butskhrikidze (2002: 193) wrote,

Ertelishvili points out that the most common Georgian stem type is monosyllabic.

But the statistics she cited indicate otherwise (unless only verbal stems were meant):

	Nonsyllabic	Monosyllabic	Disyllabic	Trisyllabic
Nominal stems	0	2764	5166	0
Verbal stems	115	3224	1895	247
Total	115	5988	7061	247

(The verbal stem totals are mine.)
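The totals and the most common stem type per row can be rechecked with a few lines. The figures are those in the table above; the variable and function names are illustrative only.

```python
CATEGORIES = ["nonsyllabic", "monosyllabic", "disyllabic", "trisyllabic"]

# Stem counts copied from the table above.
STEMS = {
    "nominal": [0, 2764, 5166, 0],
    "verbal": [115, 3224, 1895, 247],
}

# Column-wise sums give the 'Total' row.
totals = [sum(column) for column in zip(*STEMS.values())]


def most_common(counts):
    """Return the category label with the highest count."""
    return CATEGORIES[counts.index(max(counts))]


print("totals:", totals)
for kind, counts in STEMS.items():
    print(kind, "->", most_common(counts))
print("all stems ->", most_common(totals))
```

Monosyllabic stems are the largest group only among verbal stems; disyllabic stems lead among nominal stems and overall.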

Monosyllabic stems may begin or end with as many as five consonants (p. 194). Fourteen verbal stems consist of four consonants without a vowel (p. 195). This does not necessarily mean that Georgian has verbs with no vowels. A Georgian word typically consists of a stem plus a suffix, with the exception of vowel-final verbal roots (p. 10). (I presume Butskhrikidze was excluding monomorphemic grammatical words.) So words containing CCCC verbal stems would presumably be CCCCV(C). (Georgian doesn't have final obstruent clusters [p. 10].)

Next: Peeling prckvna apart.

*Three years ago, I reconstructed Slavic-style minimal vowels *ъ (nonpalatal) and *ь (palatal) in Old Chinese. I have since reinterpreted those vowels as *ʌ (low) and *ɯ (high).

08.8.24.20:44: მხედრული MXEDRULI: WARRIOR WRITING (PART 2)

Georgian has a number of obsolete letters which are placed after the end of the current alphabet in Unicode:

1. 10F1 ჱ [ej] > [e] : Greek η

[ej] sounds like an attempt to imitate Greek long [ee]

2. 10F2 ჲ [j] (no Greek counterpart, even though it resembles Ω!)

3. 10F3 ჳ [wi] : Greek υ

[wi] sounds like an attempt to imitate Greek [y]

Holisky (1996: 366): "უ u originated as a fusion of ო o and ჳ wi [cf. Greek ου [u]], subsequently replacing ჳ and assuming its numerical value." Hence Georgian has two letters for '400'.

4. 10F4 ჴ [q] (according to Holisky 1996: 366; no Greek counterpart)

I would expect [q] to be aspirated [qh] like the other nonejective voiceless stops.

This letter is [qh] in Svan.

5. 10F5 ჵ [ow] : Greek ω

[ow] sounds like an attempt to imitate Greek long [oo].

Holisky did not list a later pronunciation [o] which would have been parallel to [e] < [ej].

The following letters are not in Holisky (1996: 366):

6. 10F6 ჶ (Unicode 'fi') : cf. Greek φ

Was this used to represent the non-Georgian sound [f]? Was Greek [θ] borrowed as [f] (cf. Russian Федор 'Theodore')?

7. 10F9 ჹ (Unicode 'turned gan')

I have no idea what an inverted გ g represents.

8. 10FA ჺ (Unicode 'ain')

The name and even the shape are reminiscent of Arabic ﻋ `ayn (initial form). Is this a coincidence? Does it stand for [ʕ]?

The following letters are used for Mingrelian and Svan:

9. 10F7 ჷ (Unicode 'yn'; Svan [ə] ~ [ɨ])

10. 10F8 ჸ (Unicode 'elifi'; Svan [ʔ])

I presume their Mingrelian sound values are similar.
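The official Unicode names of the ten letters listed above can be looked up with Python's standard unicodedata module. This is only a sketch: the names returned depend on the Unicode version bundled with the interpreter, so a fallback string (my own wording) is supplied for any code point it does not know.

```python
import unicodedata

names = {}
for cp in range(0x10F1, 0x10FA + 1):
    # The second argument is returned if the code point has no name
    # in the interpreter's Unicode database.
    names[cp] = unicodedata.name(chr(cp), "<no name in this Unicode version>")
    print(f"{cp:04X} {chr(cp)} {names[cp]}")
```

The names are transliterations of letter names and say nothing about sound values, so they do not settle the questions about 10F9 and 10FA.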