For years I have been puzzled by the Khitan large script character

used to transcribe Liao Chinese 聖 *3ʂiŋ 'sage'. Why would 'sage' have been written as an apparent combination of 夕 *4siʔ 'evening' and the name 卞 *3pian?

Tonight I found that an identical-looking Chinese character 𫝢 has been in Unicode since version 6.0 in 2010. 𫝢 turns out to be quite different from the sum of its apparent parts; it is a variant of 升 *1ʂiŋ 'rise', a near-homophone* of 聖 *3ʂiŋ 'sage' that is attested as far west as Dunhuang (Huang Zheng 2005: 361), though it is not in Longkan shoujian (997) compiled in the Khitan Empire. A dotless variant 㚈 is even more similar to 升. TLS lists even more variants from 異體字字典 including 𢦑 which reminds me of Tangut

2544 2ʂɨẽ 'sage' < northwestern Chinese 聖 *3ʂɨẽ 'sage'

Although I thought the Tangut character might have been derived from Khitan 𫝢, I now wonder if its shape combined the parallel diagonal lines of a 𢦑-type variant with the right-hand vertical line of a 㚈-type variant. (7.13.0:13: The earliest attestation I can find for 𢦑 is in 四聲篇海 from the Jin Dynasty. However, Shuowen from 100 AD has a similar form.)

Note that in the dialect of Chinese known to the Tangut, 升 *1ʂɨĩ 'rise' and 聖 *3ʂɨẽ had different vowels, whereas the two words were homophonous except for their tones in the dialect of Chinese known to the Khitan: *1ʂiŋ and *3ʂiŋ. Moreover, the merger of their rhyme categories in the east dates from the Liao Dynasty (see the table in Kane 2010: 242).

Hence I conclude that the idea of writing 聖 as 𫝢 probably originated among the Khitan in Liao times and may not have been a retention from the elusive, hypothetical Parhae script. (If Chinese pronunciation in Parhae was like Sino-Korean, which was based on an eighth-century eastern dialect, 聖 and 升 would have been read something like *3sjəŋ and *1sɯŋ.)

The Tangut might not have thought of writing 'sage' as a 𢦑/𫝢-like character on their own because 聖 *3ʂɨẽ did not sound like 升 *1ʂɨĩ 'rise' in the Chinese dialect they knew. So I think they got the idea from the Khitan.

How many other Tangut characters reflect Khitan influence? Probably not many, but 2544 might not be an isolated instance.

Also, how many other Khitan large script characters are based on unfamiliar variants of Chinese characters?

*Numbers in front of syllables indicate tones in Chinese as well as Tangut. 升 *1ʂiŋ 'rise' had a 'level' tone whereas 聖 *3ʂiŋ 'sage' had a 'departing' tone. The contours and voice qualities of these tones are unknown. Using the same notation for Chinese and Tangut facilitates comparisons, though it also may erroneously imply that Chinese and Tangut had identical or similar tones. The lack of systematic treatment of Chinese tones in Tangut transcriptions of Chinese and Tangut borrowings from Chinese may indicate that they sounded very different.

WAS TANGUT RHYME 50 GRADE I?

Rhymes 50 and 51 were similar in the first published full-scale Tangut reconstruction known to me (Kychanov and Sofronov 1963) but not in most subsequent reconstructions: e.g., Sofronov (2012: 428) reconstructed rhyme 50 as Grade III -joˁ (identical to part of his rhyme 55!) and rhyme 51 as Grade I -o. Arakawa (1999) is an exception in which rhymes 50 and 51 are both Grade I -o.

Given that Tangut rhymes are generally grouped into sets with ascending grades (I-III or I-IV depending on the reconstruction), treating 50 as Grade I avoids the problem of an odd grade sequence found in other reconstructions (e.g., III-I-II-III/IV for o-rhymes in Sofronov 2012), though it also raises the question of why there are two Grade I rhymes in a row - something not found elsewhere in Arakawa's reconstruction. Moreover, rhyme 50 is almost always preceded by class VII initials which otherwise only precede Grade II and III rhymes.
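The grade-sequence argument can be checked mechanically. The following is a minimal Python sketch, not part of any published reconstruction: the function name `is_ascending` and the second sequence (rhyme 50 treated as Grade I with the other o-rhymes left unchanged) are assumptions made for illustration, and the final III/IV of Sofronov's sequence is simplified to III.

```python
# Map grade labels to integers so sequences can be compared.
GRADE = {"I": 1, "II": 2, "III": 3, "IV": 4}

def is_ascending(grades, strict=False):
    """Return True if the grade sequence never descends.

    With strict=True, repeated grades (e.g. two Grade I rhymes
    in a row) also count as a violation.
    """
    nums = [GRADE[g] for g in grades]
    pairs = zip(nums, nums[1:])
    if strict:
        return all(a < b for a, b in pairs)
    return all(a <= b for a, b in pairs)

# o-rhymes 50-53 in Sofronov 2012 as cited above (III-I-II-III/IV);
# the final III/IV is simplified to III here.
sofronov_2012 = ["III", "I", "II", "III"]

# Hypothetical sequence if rhyme 50 is treated as Grade I and the
# other rhymes keep the grades above (an assumption for illustration).
rhyme_50_as_grade_1 = ["I", "I", "II", "III"]

print(is_ascending(sofronov_2012))                     # False
print(is_ascending(rhyme_50_as_grade_1))               # True
print(is_ascending(rhyme_50_as_grade_1, strict=True))  # False
```

The non-strict check captures the problem that treating rhyme 50 as Grade I avoids, and the strict check captures the new problem it raises: two Grade I rhymes in a row.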

Reconstructions of class VII initials preceding rhyme 50

Nishida 1964 tš- tšh- ndž- š-
Sofronov 1968 tś- tśh- ndź- ś-
Li Xinkui 1980 tʂ- tʂh- dʐ- ʂ-
Huang Zhenhua 1983 tś- tśh- ȵtś- ś-
Li Fanwen 1986 tɕ- tɕh- dʑ- ɕ-
Gong Hwang-cherng 1997 tś- tśh- dź- ś-
Arakawa 1999 c- ch- j- sh-
This site tʂ- tʂh- dʐ- ʂ-

Let me try to explain those anomalies.

In my reconstruction, the four grades are differentiated by their medials:

Grade | I | II | III | IV
Medial | zero | -ɤ- | -ɨ- | -i-

Grade I and perhaps Grade II vowels are lower and perhaps also backer than their Grade III and IV counterparts: e.g., Grade I (rhyme 8), Grade II -ɤi (or -ɤɪ?; rhyme 9), Grade III -ɨi (rhyme 10), and Grade IV -i (rhyme 11).

Why did the shibilants of class VII almost always precede the Grade II and III medials? Here's what I think happened:

1. Early pre-Tangut had palatals: tɕ-, tɕh-, dʑ-, ɕ-.

The first three could also have been palatal stops: c-, ch-, ɟ-.

2. These palatals were followed by medial -i-.

3. Pre-Tangut developed retroflexes from *dental-r-clusters: e.g., *k-tr- became *tʂh- in 'six' (cf. Tibetan drug 'six').

4. The local dialect of Chinese merged its palatals and retroflexes. Modern northwestern Chinese dialects later developed new palatals from old dentals and velars before *i:

Middle Chinese | Tangut period Chinese | Modern northwestern Chinese
*palatals | *retroflexes | retroflexes (and labiodentals before *u)
*alveolars | *alveolars | alveolars and palatals
*velars | *velars | velars and palatals

The table above is simplified: e.g., it does not account for northwestern dialects like Xi'an which have different reflexes for Middle Chinese palatals and retroflexes. See Coblin (1994: 97-105) for a more detailed overview of the development of palatals, retroflexes, alveolars, and velars in northwestern dialects.

Moreover, it is not clear whether modern northwestern Chinese dialects are descended from earlier northwestern Chinese dialects or if the latter were substrata for the former which were newcomers.

5. Pre-Tangut merged its palatals and retroflexes:

Pre-Tangut | Tangut
*palatals | retroflexes
*dental-r sequences | retroflexes

6. The -i- that followed palatals became -ɨ-.

7. This -ɨ- spread to retroflexes that did not come from palatals (i.e., those from *Tr-sequences).

Another possibility is that *Tr-sequences became *Tɨ-sequences which affricated into *Tʂɨ-sequences. No such affrication occurred when nondentals were followed by *-r-: e.g., *kr- > *kɨ-.

8. This -ɨ- lowered to *-ɤ- for height harmony if it was

- preceded by presyllabic *ʌ

- not preceded by presyllabic *ɯ and followed by a 'low'* vowel (*a *e *o):

Vowels (before) | Heights | Vowels (after) | Heights
*-ɨu | high-high | *-ɨu | high-high
*-ʌ-ɨu | low-high-high | *-ʌ-ɤu | low-low-high
*-ɯ-ɨu | high-high-high | *-ɯ-ɨu | high-high-high
*-ɨi | high-high | *-ɨi | high-high
*-ʌ-ɨi | low-high-high | *-ʌ-ɤi | low-low-high
*-ɯ-ɨi | high-high-high | *-ɯ-ɨi | high-high-high
*-ɨa | high-low | *-ɤa | low-low
*-ʌ-ɨa | low-high-low | *-ʌ-ɤa | low-low-low
*-ɯ-ɨa | high-high-low | *-ɯ-ɨa | high-high-low
*-ɨə | high-high | *-ɨə | high-high
*-ʌ-ɨə | low-high-high | *-ʌ-ɤə | low-low-high
*-ɯ-ɨə | high-high-high | *-ɯ-ɨə | high-high-high
*-ɨe | high-low | *-ɤe | low-low
*-ʌ-ɨe | low-high-low | *-ʌ-ɤe | low-low-low
*-ɯ-ɨe | high-high-low | *-ɯ-ɨe | high-high-low
*-ɨo | high-low | *-ɤo | low-low
*-ʌ-ɨo | low-high-low | *-ʌ-ɤo | low-low-low
*-ɯ-ɨo | high-high-low | *-ɯ-ɨo | high-high-low
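The lowering rule of step 8 can be sketched in Python. This is only an illustration under stated assumptions: the function names are invented, the presyllabic vowels *ʌ and *ɯ are read off the table, and the height labels are computed from the high/low classification given in the footnote (*i *ə *u high, *e *a *o low), with *ɨ and *ɯ counted as high and *ɤ and *ʌ as low.

```python
# Phonological height classes (an assumption based on the footnote:
# *i *ə *u are 'high', *e *a *o are 'low'; ɨ/ɯ pattern as high, ɤ/ʌ as low).
LOW = {"e", "a", "o", "ʌ", "ɤ"}

def lowers(presyllable, nucleus):
    """Step 8: does medial *ɨ lower to *ɤ?

    presyllable is "ʌ", "ɯ", or None (no presyllable).
    """
    if presyllable == "ʌ":
        return True
    return presyllable != "ɯ" and nucleus in LOW

def heights(vowels):
    """Label each vowel as 'high' or 'low' and join with hyphens."""
    return "-".join("low" if v in LOW else "high" for v in vowels)

def apply_step8(presyllable, nucleus):
    """Return (form, heights) after step 8 for an *ɨ-medial syllable."""
    medial = "ɤ" if lowers(presyllable, nucleus) else "ɨ"
    vowels = ([presyllable] if presyllable else []) + [medial, nucleus]
    prefix = presyllable + "-" if presyllable else ""
    return prefix + medial + nucleus, heights(vowels)

for pre in (None, "ʌ", "ɯ"):
    for nucleus in ("u", "i", "a", "ə", "e", "o"):
        print(pre, nucleus, apply_step8(pre, nucleus))
```

Running the loop lists one outcome for each combination of presyllable and following vowel, which makes it easy to see that lowering without a presyllable depends only on the height of the following vowel.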

9. The presyllables were lost, so ɤ was no longer a predictable allophone of /ɨ/.

10. -ɨ- generally merged with -i- after initials other than retroflexes, v-, and l- (which may have been velar [ɫ]).

Grade II -ɤV and Grade III -ɨV syllables were assigned to different rhymes except perhaps in the case of rhyme 55 which I should investigate.

Reconstruction Grades of rhyme 55
Hashimoto 1965 III
Sofronov 1968
Gong 1997 II III
Arakawa 1999 II
Sofronov 2012 II III IV

What if -ɤ- and/or -ɨ- were occasionally lost after retroflexes? Then those retroflex-initial syllables would be Grade I: i.e., without medials.

Grade II *dʐɤo or Grade III *dʐɨo > 1955 Grade I R50 1.48 *dʐo?

I am using Arakawa's reconstruction of rhyme 50 here.

This solution has several problems.

First, why weren't these -o syllables assigned to Grade I rhyme 51 -o?

Second, was the loss of medial -ɤ- and/or -ɨ- after retroflexes sporadic within the Tangut prestige dialect, or were rhyme 50 words borrowings from one or more dialects that had regularly lost medials after retroflexes?

Third, it cannot explain why rhyme 50 has liquid-initial syllables such as

2912 Grade I R50 1.48 lo (in Arakawa's reconstruction).

Why isn't 2912 in the same homophone group as

4710 Grade I R51 1.49 lo (in Arakawa's reconstruction)

in the Tangraphic Sea and the Homophones?

Tonight another solution occurred to me. Rhymes 44-49 had front vowels followed by -w. What if rhyme 50 was -ow with an -o like rhyme 51 but a -w like rhyme 49?

49. -iw'

50. -ow

51. -o

Tangut -w is from earlier *-w and *-k. If this solution were correct, I would expect rhyme 50 words to have cognates ending in *-w and/or *-k. Unfortunately, I do not know of any cognates for rhyme 50 words.

7.12.0:54: If rhyme 50 was -ow, why would it be almost exclusively preceded by shibilants? What would make -ow more shibilant-friendly than rhyme 51 -o?

Why would medial -ɤ- and/or -ɨ- be lost before -ow but not other -w rhymes?

Conversely, if rhyme 50 was -ɤow and/or -ɨow, why was there no simple -ow?

*Phonologically, *ə was a 'high' vowel and *e and *o were 'low' vowels, though they were phonetically all mid vowels. I could also call *i *ə *u 'higher' vowels to distinguish them from the 'lower' vowels *e *a *o.

DID TANGUT RHYME 51 HAVE A LONG VOWEL?

Last night I added the first published full-scale Tangut reconstruction that I know of (Kychanov and Sofronov 1963) to my database (Excel / HTML). Kychanov and Sofronov's reconstruction has three rhymes with macrons. The first of these is rhyme 51 following 50 which lacks a macron:

Rhyme KS Nishida 1964 Hashimoto 1965 Sofronov 1968 Huang 1983 Li 1986 Gong 1997 Arakawa 1999 Sofronov 2012 This site
50: 1.48 -oɦ -jəw -i̭o -iən -ǐo̭/-ǐo -jwo -o -joˁ -wɨo
51: 1.49/2.42 -ʌ̄ -ɔɦ -ɔwN -o -uẽ, -uɐ̃ -ǐəu/-ǐuo -(w)o -o -(w)o
52: 1.50/2.43 -ʌ' -ǐow -owN -uõ -ǐo̭/-ǐo/-ɪo̭ -io -yo -ɤo
53: 1.51/2.44 -i̭ʌ -ǐɔɦ -jowN -i̭o -iõ, -ïõ, -iɔ̃ -ǐou -j(w)o -o: -jo, -ö -(w)ɨo, -(w)io

I list two reconstructions for rhymes 50 and 51 in the column for Li (1986). The first is from pages 165-166 and the second is in the rest of the book. On page 165, Li wrote rhyme 51 as -ǐəo which is also his reconstruction for rhyme 49 on the previous page. I assume that -ǐəo is supposed to be -ǐəu.

I have included rhyme 53 which some consider to be similar to rhyme 50.

I have also included rhyme 52 to complete the set of -o-rhymes. On page 188, Li wrote rhyme 52 as -ɪo̭, but elsewhere he wrote it as -ǐo̭ or -ǐo like rhyme 50.

Kychanov and Sofronov observed that 50 and 51 were in complementary distribution: 50 appeared after shibilants and what they reconstructed as r, whereas 51 appeared after all other types of initials. This still mostly holds true in my reconstruction:

Grade Rhyme Shibilants l- ɬ- v- Other initials
III 50: -wɨo X X X
I 51: -o X X X
51': -wo X X X
II 52: -ɤo X X
III 53a: -ɨo X X
53a': -wɨo X X X X
IV 53b: -io X X X
53b': -wio X X X X

Rhymes 50 and 51 are not in perfect complementary distribution, as both can occur after l-:


R50 1.48: 2732 lwɨo : R51 1.49: 1018 and 1595 lwo
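A check for complementary distribution amounts to asking whether two rhymes share any initials. The following is a minimal Python sketch; the function names are invented for illustration, and the data are only the handful of syllables cited in this post (in my reconstruction), not the full inventories of the two rhymes.

```python
# Initial -> character numbers, limited to syllables cited in this post.
# These are tiny samples, not complete lists for either rhyme.
RHYME_50 = {
    "l": ["2732"],           # lwɨo
    "dʐ": ["1955", "2784"],  # dʐwɨo
    "ʂ": ["0009"],           # ʂwɨo
}
RHYME_51 = {
    "l": ["1018", "1595"],   # lwo
}

def shared_initials(rhyme_a, rhyme_b):
    """Return the set of initials attested in both rhymes."""
    return set(rhyme_a) & set(rhyme_b)

def in_complementary_distribution(rhyme_a, rhyme_b):
    """Two rhymes are in complementary distribution if they share no initials."""
    return not shared_initials(rhyme_a, rhyme_b)

print(shared_initials(RHYME_50, RHYME_51))                # {'l'}
print(in_complementary_distribution(RHYME_50, RHYME_51))  # False
```

A single shared initial is enough to break the complementarity, which is why the l-syllables carry so much weight in the argument.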

Unlike Kychanov and Sofronov (1963), I do not reconstruct r- before rhyme 50 (or any rhyme in this o-set).

Rhymes 50 and 53 are in complementary distribution only if tones are taken into consideration. The following syllables would be homophonous if tones were ignored:


R50 1.48 1955 dʐwɨo, 2784 dʐwɨo : R53 2.44 2207 dʐwɨo, 5586 dʐwɨo

Why did the Tangut regard first tone -wɨo as a rhyme category distinct from first tone -ɨo while placing second tone -wɨo in the same category as second tone -ɨo?

Tone\rhyme | -ɨo | -wɨo
1 | R53 1.51 | R50 1.48
2 | R53 2.44 | R53 2.44

Tonight I noticed that the Grade II rhyme 52 never has -w-, whereas rhyme 50 always has -w-. Moreover, all the initials of rhyme 50 can also occur in Grade II. Could rhyme 50 be the -w-version of rhyme 52: i.e., could I reconstruct rhyme 50 as -wɤo?

Tone\rhyme -ɤo -wɤo
1 R52 1.50 R50 1.48
2 R52 2.43 (none)

Different grades for rhyme 50 imply different origins:

Grade III -wɨo < *Pɯ-o and/or *Cɯ-wo

Grade II -wɤo < *P-o and/or *-wo (but lwɤo would be from the improbable *P-lro or, worse yet, *lwro, so I am inclined not to regard rhyme 50 as Grade II)

In either case, I do not understand why the non-Grade I rhyme 50/1.48 precedes the Grade I rhyme 51/1.49. Arakawa avoided that problem by reconstructing both rhymes 50 and 51 as -o, though I am not sure how he accounted for the difference between


R50 1.48: 2732 lwɨo : R51 1.49: 1018 and 1595 lwo

I wish there were a publicly available complete list of Arakawa's reconstructions. Kotaka's partial list of Arakawa's reconstructions (now offline) has 1ldwo for 1595 (and presumably its homophone 1018 would also be 1ldwo)*. I think 2732 would be 1lo in Arakawa's reconstruction. So he might say they had different initials and medials. However, Arakawa (1997: 134, 135) reconstructed the syllable lo in both rhyme 50 and rhyme 51. What determined whether a given lo was assigned to rhyme 50 or 51?

Any solution must explain why rhymes 50 and 51 have largely nonoverlapping initials. One might interpret Kychanov and Sofronov's -ʌ̄ as a long vowel, though I don't understand why a long vowel would not follow shibilants and their r-. Moreover, I don't think they intended -ʌ̄ to be a long vowel:

Поэтому, чтобы подчеркнуть связь между этими двумя гласными, обозначим его как ʌ̄.

'Therefore, to emphasize the connection between these two vowels [of rhymes 50 and 51], we will denote it [rhyme 51] as ʌ̄.'

I suppose their macron is a bit like the (over)bar of mathematics:

A bar (also called an overbar) is a horizontal line written above a mathematical symbol to give it some special meaning.

In this case, ʌ̄ might mean 'special variant of rhyme 50 after shibilants and r'. They did not specify if or how rhyme 51 phonetically differed from rhyme 50.

*7.11.0:41: I have followed Gong in reconstructing only a small number of liquids (l-, ɬ-, ɮ-, ʐ-, r- = Gong's l-, lh-, z-, ź-, r-). However, Tai (2008: 201) has made a case for a larger set of liquids, and I have yet to integrate his ideas into my own reconstruction. If Tai is correct, then


R50 1.48: 2732 : R51 1.49: 1018 and 1595

is not a true minimal set. 2732 had an initial belonging to liquid fanqie chain 1 (generally transcribed in Tibetan as l- with or without preinitials), whereas the other two had an initial belonging to liquid fanqie chain 4 (generally transcribed in Tibetan as ld- or zl-). Tai reconstructed the initials of liquid fanqie chains 1 and 4 as l- and ld-.

7.11.1:21: Sanskrit o is always long [oː]. If the version of Sanskrit heard by the Tangut preserved long [oː] and if rhyme 50 were short and 51 were long - not that anyone said they were - I would not expect rhyme 50 in transcriptions of Sanskrit -o-syllables. Nonetheless

R50 1.48 0009 ʂwɨo

transcribed Sanskrit śo [ɕoː]. (I do not reconstruct [ɕ] in Tangut.) Moreover, its -w- corresponds to zero in Sanskrit. (Its -ɨ- is not a problem, as ʂo is not possible in my Tangut reconstruction.) However, this usage of a rhyme 50 -w-character is an isolated instance and could be regarded as an error.

Most instances of Sanskrit -o were transcribed with rhyme 51 characters, but that does not necessarily mean that rhyme 51 was long. Exceptions mostly had initials absent from rhyme 51: shibilants and r-. Shibilants were not possible before Grade I rhymes, and r- was not possible in first cycle rhymes other than rhyme 43.

TANGUT RHYME DATABASE: 9 JULY 2014 EDITION

I updated my Tangut rhyme database (Excel / HTML) for the first time since September to include

- my latest reconstruction of the Tangut rhyme system (in the rightmost column named "*new")

See "G-*r-adation in Tangut (Part 2)" for an explanation of the vowels.

- corrections in my previous reconstruction which I used (with variations) between 2008 and this summer (in the column named "old")

- Kychanov and Sofronov's 1963 reconstruction which is the first published full-scale Tangut reconstruction to the best of my knowledge; I started that column last September and regret not completing it in time for the fiftieth anniversary of that reconstruction's publication.

I do not know how to best represent their diacritics in Unicode and have used hangul letters as similar-looking placeholders for the time being.

Eventually I will include Sofronov's 2012 reconstruction. I may also add Huang Zhenhua's 1983 reconstruction and Li Fanwen's 1986 reconstruction.

MARRIAGE, TANGUT STYLE (PART 2)

The character for the first syllable of the Tangut word for 'to get married'

0225 1ɣɨə

is interesting because it has a right half (Boxenhorn code wuu) that is in no other character, whereas its left half is in 53 other characters.

In the Tangraphic Sea, 0225 is analyzed as


0225 1ɣɨə = left of 1085 1ɮi 'man' + top and bottom right of 0050 1ni'*

Why wasn't all of 0050 used as the right half of 0225? Because it would have been too complex? The omitted bottom left component is

'not' (Nishida radical 041 / Boxenhorn code cia).

Li Fanwen (2008: 9) defined 0050 as 'to marry'. 0050 is not in either Kychanov's or Nishida's dictionaries. Its Tangraphic Sea analysis and definition are

2ʂɨe 1ɣəu 1dziẽ 2ŋõʳ

'request head relation-by-marriage whole'

'0050 is composed of the top half of 0147 'request' over all of 1965 'relation by marriage'.'

1ni' 1tia 1ɮi 1ni' 1lɨə

'0050 TOPIC man 0050 is'

'0050 is as in 'to 0050 a man' [?] and'

1ɣɨə 1ɮwị 1vɨi 1ʔie 1ʔiə

'wedding do GEN say'

'how one says to wed'

The D version of Homophones has the note

1ni' - 1ɣɨə 1ɮwị 1ɮi

'0050 - get married man' = a verb-noun sequence 'married man'?

0050 might mean something like 'to marry a man'. Li lists no examples of 0050 outside dictionaries, so I cannot speculate any further about its semantics. It is obviously semantically relevant to 0225, whose structure can be interpreted as an abbreviation of an object-verb sequence 'man + marry'.

0050 in turn is another semantic compound: a noun-verb sequence 0147 + 1965 'requested relation by marriage'.

The analysis of 0147 (Boxenhorn code wus) is unknown. It occurs in only one other character:


5150 1thʊʊ 'to request' = center of 2364 1sew 'to survey' + all of 0147 2ʂɨe 'to request'

I'll look at the analysis of 1965 in part 3.

*I am no longer comfortable with writing long vowels in Tangut readings since I no longer believe Tangut had any such vowels. If Tangut had, for instance, a distinction between -i (rhyme 11) and -ii (my old reconstruction of rhyme 14), I would expect Sanskrit syllables ending in short -i and long -ī to be respectively transcribed with rhyme 11 and 14 characters. However, rhyme 14 occasionally appears in transcriptions of Sanskrit -i but never appears in transcriptions of Sanskrit -ī (Arakawa 1999: 111-112):

Rhyme 10 11 14 30 31 37 46 84
Sanskrit \ Tangut -ɨi -i -ii -ɨə -iə -ie -iew -iʳ
-i 5 22 4! 1 1 2 1 2
-ī 2 3 0!   1

That suggests rhyme 14 was inappropriate for both -i and -ī. Nonetheless, it must have been -i-like, as it was transcribed as -i in Tibetan. Hence I will write rhyme 14 as -i', with an apostrophe as a typographical substitute for a prime symbol indicating a rhyme that was somehow different from its counterpart without an apostrophe.

MARRIAGE, TANGUT STYLE (PART 1)

The Tangut word for 爲婚 'to get married' in the Tangut-Chinese handbook The Timely Pearl in the Palm (1190) is

0225 1851 1ɣɨə 1ɮwị (page 34, column 3, characters 3-4)

Although Li Fanwen (2008: 39, 308), Kychanov and Arakawa (2006: 517, 142), and Grinstead (1972: 130, 78) all list each half as an independent verb, all textual examples in Li (2008) contain the two halves together. That leads me to conclude that this is a disyllabic word rather than a sequence of two words (though it may have originated as a sequence of two words).

In an earlier stage of Tangut, this word may have been *Cɯ-K(r)ə *S-P-ɮi:

- The vowel of *Cɯ- conditioned the lenition of *K- to ɣ- before being lost.

- *K is an unknown back stop; it could have been velar (*k, *kh, *g) or even uvular (*q, *qh, *ɢ).

(7.8.0:00: I am assuming ɣ- is derived rather than original.)

- Medial *-r- lenited to -ɨ-; if there was no medial *-r-, *ə might have broken to *ɨə after ɣ if it was uvular *[ʁ].

- Preinitial *S- conditioned tension (indicated by a subscript dot) and preinitial *P- conditioned medial -w-:

*S-P-ɮi > *zbɮi > *zɮbi > *zɮβi > *zɮwi > *ɮ̣ẉị > ɮwị
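The chain above can be replayed step by step in Python. This is only a sketch for this one word: the function name is invented, the string rules are my own formalization of each arrow (not general sound laws), asterisks are omitted, and tenseness is written with the combining dot below (U+0323), so the results are in decomposed (NFD) form.

```python
import unicodedata

DOT = "\u0323"  # combining dot below: marks tenseness in this notation

def derive_second_syllable(pre="S-P-ɮi"):
    """Return the stages from *S-P-ɮi to ɮwị (asterisks omitted).

    Each rule is a hypothetical string rewrite matching one arrow
    in the chain; none is meant as a general sound law.
    """
    stages = [pre]
    # Preinitials fuse with the root: *S- > z, *P- > b.
    s = pre.replace("-", "").replace("S", "z").replace("P", "b")
    stages.append(s)                        # zbɮi
    s = s.replace("bɮ", "ɮb")               # metathesis
    stages.append(s)                        # zɮbi
    s = s.replace("b", "β")                 # lenition of the stop
    stages.append(s)                        # zɮβi
    s = s.replace("β", "w")                 # fricative becomes medial -w-
    stages.append(s)                        # zɮwi
    # z is lost and leaves tenseness on every remaining segment.
    s = "".join(ch + DOT for ch in s[1:])
    stages.append(s)
    # Tenseness is finally written only on the vowel (word-final here).
    s = s.replace(DOT, "") + DOT
    stages.append(s)
    return stages

# Normalize to NFC for display so dotted letters use precomposed forms.
print(" > ".join(unicodedata.normalize("NFC", st)
                 for st in derive_second_syllable()))
```

Keeping every intermediate stage in a list makes it easy to compare the sketch against the chain arrow by arrow.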

The root of the second half may be

1085 1ɮi 'man'

whose character in fact is the source of the left half of the first half according to the Tangraphic Sea (more on this in part 2).

Was *S-P-ɮi a derived verb meaning something like 'obtain a man'?

7.8.1:40: *S- derived causative verbs in some cases: e.g.,


4906 2gwi (< *Ni-gwa-H) 'to wear, to put on (clothes)' : 3146 1gwị (< *S-Ni-gwa) 'to make to wear, to clothe (v.t.)' (both stem 1 which is used in all environments but two; see below)


3686 2gwio (< *Ni-gwa-w-H) 'to wear, to put on (clothes)' : 0539 1gwiọ (< *S-Ni-gwa-w) 'to make to wear, to clothe (v.t.)' (both stem 2 which is used when the subject is first or second person singular and the object is third person)

See Gong (2002: 51-54) for many more examples.

*P- derived verbs from nouns in at least two cases (pairs and glosses from Gong 2002: 45-46; the Tangut and pre-Tangut reconstructions are mine):


3003 1ʔɨu (< *ʔru) 'ghost, demon, devil' : 0622 1ʔwɨu (< *P-ʔru) 'bring an evil'


3259 1dzi 'a state of abstraction, meditation, without anxiety and hinderance' : 3411 1dzwi (< *P-dzi) 'cause the mind to be in the state of abstraction'

I have not included the pair 3943 'sole of shoe' and 3961 'to sole' since I follow Gong's later reconstruction of 3943 with -w-.

So could *S-P-ɮi be more precisely glossed as 'to cause to get a man'?

UNAMI GEMINATES

I wonder if Unami geminate obstruents (in bold) sounded like Korean tense obstruents: e.g.,

kkə́ntkaan 's 'then you danced' vs. ná kə́ntkaan 'then there was dancing'

ppɔ́ɔm 'his thigh' vs. ní pɔ́ɔm 'the ham'

nsaassaakkənə́mən 'I stuck it out repeatedly' vs. nsaasaakkənə́mən 'I stuck it out slowly'

(I have replaced the hard-to-see dot · for length with a doubling of the previous letter.)

Unami has a geminate xx reminiscent of Middle Korean ㆅ hh (which was lost sometime after the seventeenth century). This xx is distinct from /xh/ which is pronounced [xk] in medial position: e.g., /màxhee/ 'to be red' is [màxkee]. I wonder how initial /xh/ in /xhook/ 'snake' is pronounced. (The word appears in this 1889 dictionary as achgook. Did initial /xh/ result from apheresis: axg- > /xh/?)

Korean tense consonants generally originated from earlier obstruent sequences: e.g.,

ttae < pstay 'time'

There are, however, cases of tense consonants in words that originally never had obstruent sequences: e.g.,

kkot < 곶 kos 'flower'

ssang < 솽 swang 'double' (< Chinese 雙; sw- is an obstruent-sonorant cluster)

sshi < 시 si 'courtesy title, family' (< Chinese 氏)

cf. native 씨 sshi < psi 'seed'

The tense consonant of 氏 might incorporate genitive -s-: e.g., I-sshi 'the Lee family' could be a reanalysis of *Ri-s-si 'Lee-GEN-family'.

I suspect that Tangut at one time also had tense consonants from obstruent sequences. Their tenseness was lost after it spread into the following vowel: e.g.,

0359 *Sʌ-tuŋ > *stʊ > *ttʊ > *ttʊ̣ > 1tʊ̣ 'thousand' (cf. Written Tibetan stong 'id.')

Did Unami geminates originate from earlier clusters? (7.6.21:18: Here is a list of Proto-Algonquian clusters.)

