184.108.40.206:59: THE SOUND OF THE DOUBLE-SKINNED MOUTH
Last night I mentioned
4620 1ka 'how' =
left of 2247 1tu (first half of 1tu 1muʳ 'stupid'; arbitrary source for the 'mouth' radical?) +
all of 1326 1kə (perfective prefix; phonetic)
as one of three Tangut transcriptions of Sanskrit ka. It also transcribed Sanskrit krā, ga, and kiṃ (Nevsky 1960 I: 574).
The title refers to its structure: 'mouth' on the left and what appears to be 1dʐə 'skin' doubled on the right. I have no idea why one would write a perfective prefix with 'skin'.
Nor do I have much of an idea of why 4620 transcribed Chinese syllables other than *ka (Li Fanwen 2008: 733):
|Tangut text||Sinograph||Middle Chinese||Tibetan transcriptions of Tang NW Chinese||Tangut period NW Chinese||Liao Chinese||Phags-pa Chinese|
|Forest of Categories||吉||*kit||kyir||*ki||*kiʔ||ꡂꡦꡞ gÿi [kji]|
|Forest of Categories, Sunzi||建||*kɨanʰ||(Used by Amoghavajra to transcribe Skt -kaṇ-, kañ-)||*kɨã||*kien||ꡂꡠꡋ gen [kɛn]|
|Forest of Categories||蹇||*kɨanˀ, *kɨenˀ||(Amoghavajra used a homophone 謇 to transcribe Skt -khaṇ-, -kan-)|
|Forest of Categories (see Nevsky 1960 II: 83)||堅||*ken||kyan, kyen||*kiã||ꡂꡦꡋ gÿan [kjɛn]|
Although my Tangut period NW Chinese reconstruction is based on Gong's, which in turn is based on Tangut evidence, none of the readings of those characters match 1ka.
I have included Liao and Phags-pa Chinese (with Coblin 2007's transliteration and phonetic reconstructions) for reference. Neither the reconstruction of Liao Chinese nor the attested forms of Phags-pa Chinese is dependent on the reconstruction of Tangut. Both varieties were spoken to the east of northwestern Chinese, and of course Phags-pa Chinese postdates the fall of the Tangut Empire.
No single reconstruction of 4620 can account for all of its uses*:
|Source||Reconstruction||Sanskrit ka||Sanskrit krā||Sanskrit ga||Sanskrit kiṃ||Chinese *ki||Chinese *kia-type syllables|
|Nishida 1966||1kǐɑ||partial match||full match except for nasality|
|Li Fanwen 1986; This site now||1ka||full match||partial match||weak match||partial match|
|Arakawa 1997 (Nishida-style)||1kaɦ||partial match||weak match||partial match|
|Gong Hwang-cherng 1997||1kja||partial match||full match except for nasality|
|This site 2008-2014||1kia||partial match||full match except for nasality|
|Kotaka 2012 (Arakawa-style)||1ka:||full match except for vowel length||partial match||weak match||partial match|
How do I explain those mismatches?
1. Tangut knowledge of Sanskrit was probably limited, so some inaccuracy was inevitable.
1a. Tangut probably did not have contrastive vowel length (contra Arakawa and Gong), which explains why transcription characters such as 4620 did double duty for short- and long-vowel syllables.
1b. Tangut had no Cr-consonant clusters. Sanskrit kr could be misheard as k.
1c. If Tangut g was prenasalized [ŋg], then Tangut k could have been an acceptable approximation of Sanskrit g.
2. Educated Tangut were well versed in Chinese. Hence it is inconceivable that they made glaring errors in transcription.
2a. Tangraphs may have had multiple readings, and the lexicographical tradition only listed basic readings. Readers could supply nonbasic readings from context. Nonbasic readings of 4620 may have been closer to Chinese *ki and *kia.
2b. The readings of tangraphs underlying transcriptions may have been from nonstandard dialects: e.g., a dialect in which *kia had not simplified to ka or a dialect in which *kia had simplified to ki. The situation may have been comparable to the use of non-Mandarin-based transcriptions in written Mandarin today.
2c. The readings of the sinographs being transcribed may have been colloquial forms that were irregular from the viewpoint of the Chinese lexicographical tradition: e.g., 吉 may have had a reading like *ka (cf. Sino-Vietnamese cát [kaːt] instead of the expected regular *cất [kət]) or *kia in addition to *ki. I cannot explain the *a of my speculative *ka. (Is SV cát the product of taboo deformation?) *ia may have been conditioned by a low-vowel prefix:
Standard reading: *klit > *kit > *kir > *ki
Alternate reading: *Cʌ-klit > *Cʌ-kleit > *ket > *kiet > *kier > *kia?
(That in fact is the evolution of 結 which was transcribed in Tangut by 2219 1ke which was also the transcription character for Sanskrit ke.)
Amoghavajra's transcriptions with 建 and 謇 (a homophone of 蹇) may suggest that they had alternate *kan-like readings in eighth century northwestern Chinese, though it is more likely that he used *kɨan-type characters because *kan was *[qɑn] with an un-Sanskrit uvular initial whereas *kɨan was *[kɨan] with a velar initial.
10.31.0:45: The problem with 2c is that it is unlikely that colloquial readings would be used to pronounce the Classical Chinese texts translated by the Tangut. And surely Tangut who could not only speak colloquial Chinese but also read Classical Chinese would know better than to mix informal and formal pronunciations of words. Then again, the line between colloquial and literary is not absolute: e.g.,
However, some dialects of Hokkien, such as Penang Hokkien as well as Philippine Hokkien (Lan-lang-oe) overwhelmingly favor colloquial readings. For example, in both Penang Hokkien and Philippine Hokkien, the characters for 'university,' 大學, are pronounced toā-ȯh (colloquial readings for both characters), instead of the literary reading tāi-hȧk, which is common in Taiwanese and Mainland Chinese [Hokkien] dialects.
10.31.0:49: Grinstead's dictionary (1972: 144) defined 4620 as 'Skt. ke' but I think ke may be a typo for ka. His table of dhāraṇī transcription tangraphs on p. 184 does not list 4620 as ke, his list of Tangut phonetics on p. 190 equates 4620 with ka, and no other scholar has ever equated 4620 with Sanskrit ke.
220.127.116.11:59: TANGUT GRADE III -A('): RHYMES 19 AND 21 (PART 2)
I started what was meant to be a
series almost three weeks ago. Then I got caught up correcting my
own mistakes - the ones I noticed, that is: a wrong rhyme and a wrong fanqie speller. (There
must be even more errors in my Tangut reconstruction that I haven't
even noticed yet!) I thank David
Boxenhorn for reminding me of my plans to write about Tangut
'apostrophe' rhymes like 21 -ɨa'.
I got the apostrophe notation from Arakawa Shintarō. He uses
apostrophes to indicate glottal stops in initial position, so maybe
they indicate final glottal stops in his reconstruction. I, on the
other hand, use apostrophes simply to mean 'different in some unknown way'.
In the Tangraphic Sea, nonapostrophe rhymes are followed by
similar apostrophe rhymes in the first group of rhymes (1-60).
Apostrophes and tenseness are always mutually exclusive, and
apostrophes and nasality are almost mutually exclusive. Their
coexistence in 59-60 should be investigated. There are also some
anomalous combinations of nasality with tenseness (65 and 76) and
retroflexion (97-98). I am fairly confident about the classification of
rhymes up to 60; later rhymes, particularly 97 and up, are iffy, and
others interpret them very differently (e.g., Arakawa only has two
retroflex apostrophe rhymes: 88-89). The ordering pattern breaks down
after 62: e.g., 63 is an e-type rhyme rather than an i-type
rhyme. 104 and 105 look like last-minute additions.
I do not rule out the possibility of a reinterpretation of the later
rhymes in the future. For now, let's focus on 19 and 21.
In my reconstruction, both 19 and 21 are Grade III rhymes, so in
theory they should have Grade III
initials (in green). The reality is messier. Unexpected initials are
in pink, and initial types with minimal pairs are in red.
I used to follow Gong and write glottal fricatives as if they were
velar, but I thought it was odd to have x- and ɣ- under
"Glottal", so I now follow Arakawa and write the voiceless fricative as
h-. By analogy I write its voiced counterpart (absent in
Arakawa's reconstruction) as ɦ-.
There are no labial or lateral fricative (ɬ- ɮ-) initials
which were generally somehow incompatible with Grade III. The absence
of lateral fricatives is due to a larger constraint against alveolar
fricatives in Grade III (but see below!).
Arakawa reconstructed Grade III as vowel length and reconstructed 21
as the only Grade IV rhyme in his system with both vowel length and
medial -y-. But I don't understand why those features would be
incompatible with labials. If pya and pa: were possible
(Arakawa 1997: 128), why not *pya:?
In my reconstruction, labials are rarely followed by Grade III -ɨ-.
Let's look at all the anomalies and see if they can be explained (or
at least have notable features):
||Li Fanwen 2008 number
||bent, winding, crooked (only in dictionaries)||No Grade IV rhyme 20 *kwa; regular reflex of pre-Tangut *Cɯ-kwa or *Pɯ-ka?|
||1tsɨa||to broil, roast (only in dictionaries); cognate
to Grade IV rhyme 20 1tsa 'hot'
||Why isn't this Grade IV rhyme 20 1tsa?
Minimal pair with Grade IV rhyme 20 1tsa
||first half of 1hɨa 1ʂɤe 'to condemn'
(only in dictionaries)
||No Grade IV rhyme 20 *ha;
regular reflex of pre-Tangut *Cɯ-ha?
||2ɦɨa||second half of 1dzəʳ 2ɦɨa 'fast, rapid';
cognate to 2521 with voiced initial conditioned by lost prefix and
second tone conditioned by suffix *-H
||No Grade IV rhyme 20 *ɦa;
regular reflex of pre-Tangut *Cɯ-Ka
(with lenition of intervocalic *-K-)?
||cover, lid, to cover; borrowing from
Late Middle Chinese 盒 *xɑ(p) 'box' or some related word with
voiced initial and vowel bending conditioned by high-vowel prefix?
||umbrella of a carriage (specialized usage of
||to make a detailed inquiry
||No Grade III rhyme 19 *lwɨa;
regular reflex of pre-Tangut *Cɯ-lwa or *Pɯ-la?
||second half of 1bạ 1lwa 'lower limbs,
||1dɨa'||second half of 1ti 1dɨa' 'to drip' (only in dictionaries); < Tangut period northwestern Chinese 滴答 *ti tɑ (but final vowels don't match - did front vowel of 1ti condition breaking of an earlier *ɑ in the following syllable?)||Minimal pair with
Grade IV rhyme 24 1da' for transcribing Sanskrit ḍa
||No Grade IV rhyme 24 *na'; regular reflex of pre-Tangut *Cɯ-naX?|
||2nɨa'||to not be
||second half of 2mə 2nɨa' 'Tangut'; Tibetan minyag 'Tangut' may reflect an earlier or nonstandard form; may be derived from 0176 'black' plus a suffix *-H conditioning the second tone|
Sanskrit ka and kā
||No Grade IV rhyme 24 *ka'; regular reflex of pre-Tangut *Cɯ-kaX (except for transcription character, of course)?|
||foundation, basis, burden; transcription of Sanskrit ka and kā|
||pedestal, plinth (same word as 3985 above)
||1khɨa'||transcription of Sanskrit kha||No Grade IV rhyme 24 *kha'|
||No Grade IV rhyme 24 *ʔ(j)a'; regular reflex of pre-Tangut *Cɯ-ʔaX?|
||horn (only in dictionaries)
||second half of 2vɪ 2ʔɨa' 'singing' (with
2vɪ 'to sing'; both halves only in dictionaries)
||gold (less common synonym of 1kɤẹ)
Six anomalies in Gong's reconstruction (3456, 3502, 5584, 5763 in rhyme 20 and 0357, 0837 in rhyme 24) are not listed because they are no longer anomalies if they are reconstructed with ld- (following Tai 2008) instead of l-. ld- may have been a lateral affricate with the same pattern of distribution as the lateral fricatives ɬ- and ɮ-: i.e., in Grades I, II, and IV but not III.
Out of the remaining twenty-four anomalies, only two have
corresponding Grade IV syllables (3408 and 2936). Those minimal pairs
force me to reconstruct 19 and 20 differently (unlike Arakawa and Gong who seem to reconstruct them as
All others are in complementary distribution, albeit not in the ideal pattern of complementary distribution. Did the Tangut dictionary tradition reflect a mixture of dialects with different sound changes: e.g.,
- in dialect A, *Cɯ-tsa became Grade III rhyme 19 1tsɨa
- in dialect B, *Cɯ-tsa became Grade IV rhyme 20 1tsa
and the dialect A form was chosen to be the standard form for 'to
broil' while the dialect B form was chosen to be the standard form for
'hot'. I would rather not reconstruct different prefixes to account for
the different vocalism of 'to broil' and 'hot' which probably share the
same root *tsa.
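The dialect-mixture idea can be made concrete with a small Python sketch. It is only an illustration of the hypothesis above: the dialect labels A and B are the hypothetical ones used in this post, and the dictionary and function names are invented for the example.

```python
# Hypothetical reflexes of pre-Tangut *Cɯ-tsa in the two dialects posited above.
REFLEX_OF_CU_TSA = {
    "A": "1tsɨa",  # Grade III rhyme 19
    "B": "1tsa",   # Grade IV rhyme 20
}

# Which dialect supplied the standard (dictionary) form of each word.
# This assignment is the speculation in the text, not an established fact.
STANDARD_SOURCE = {
    "to broil": "A",
    "hot": "B",
}

def standard_form(gloss):
    """Return the dictionary form of a word sharing the root *tsa."""
    return REFLEX_OF_CU_TSA[STANDARD_SOURCE[gloss]]

for gloss in STANDARD_SOURCE:
    print(gloss, standard_form(gloss))
```

Under this model one pre-Tangut form yields both attested syllables, so no second prefix has to be reconstructed.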
Only two of the anomalies (3948 and 4823) are characters created for
transcribing Sanskrit, and one of them (3948) is homophonous with a
native word (3985). Why were Sanskrit ka, kā, and kha
transcribed with the Tangut rhyme -ɨa' containing -ɨ-
and the mysterious apostrophe feature absent from Sanskrit? (No Tangut
transcription of Sanskrit khā is known.) Was that practice
influenced by the Chinese transcriptions 迦 *kɨa and 佉 *khɨa
for those syllables? (In earlier Chinese, *k(h)a was [q(ʰ)ɑ]
with an un-Sanskrit uvular, so velar-initial syllables with medial *-ɨ-
were regarded as closer matches.) The Tangut transcription of Sanskrit ka
without -ɨ- may reflect Sanskrit filtered through Tibetan or even Sanskrit itself. Are transcriptions with 4620 closer to (Tibetanized) Sanskrit? Conversely, are transcriptions with
3948 1kɨa', 3985 1kɨa', and 4823 1khɨa'
based on Sinified Sanskrit? Or were those two types of transcriptive characters randomly mixed up?
*Although both 19 and 20 are -a: in Arakawa's notation, Arakawa's (1997: 128) table appears to list two subtypes of each of those rhymes (not including subtypes with -w-).
18.104.22.168:42: 'CROSSED NINE' IN THE KHITAN SMALL SCRIPT
Today I was looking at the Khitan small script fish tally in Bushell (1897: 18) which ends with character 089 resembling Chinese 九 with a bar across it:
Kane (2009: 45) wrote,
Aisin Gioro 2004: 51 notes that the title for a [Khitan] lady of high rank, 別胥 biexu [in modern standard Mandarin pronunciation] was normally written
<b.ɥ.dz.ü> ~ <p.ɥ.dz.ü>
but in Gu [i.e., 故耶律氏銘石 Gu Yelü shi mingshi, the epitaph of Mme. Yelü, 1115] it is written
suggesting that [089] is similar to <dz>. [089] is only found in [native] Kitan words. In the rhymed sections of the Xingzong inscription, [089] rhymes with <u>.
The evidence for the pronunciation of 089 points in contradictory directions:
1. 別胥 was something like *pje(ʔ)sy in Liao Chinese.
2. 089 was interchangeable with 258 <dz> (used for Chinese unaspirated *ts).
3. 089 may have fused with 289 <ü> to represent a syllable [Cy].
4. 089 rhymed with <u>.
I think 089 might have been <su> or <sy>:
1'. The Chinese transcription 胥 *sy suggests [s].
2'. Although 258 <dz> was created to transcribe Chinese *ts, an affricate absent from Khitan, the Khitan often spelled that foreign consonant with 244 <s>. Perhaps even those who spelled Chinese loanwords with 258 <dz> may have pronounced them with [s]. So interchangeability with 258 <dz> may indicate either [dz] or [s].
3'. Maybe there was a rule of assimilation: <su.ü> > [sy(ː)] or <ɥ.us> > [y(ː)s] (if <su> had an alternate reading [us] after consonant characters; 082 <ɥ> is usually a semivowel, though perhaps it was [ø] in the title transcribed as 別胥)
4'. It is simplest to assume that 089 ended in [u] if it rhymed with <u> [u], though the possibility of the rhyming of similar vowels ([y] and [u]) cannot be ruled out.
The Chinese loanword data in Kane (2009) lacks the syllables *su and *sy. Would such syllables have been transcribed as 089?
If 089 had an alternate reading [us], how would it have differed from
068 and 103 <us>?
Are there any instances of 089 alternating with those characters?
I also considered the possibility that the alternation between <089.ü> and <dz.ü> might have indicated a reading like [dʑu] or [dʑy] for 089, but if 089 were <ju>, it would be homophonous with
147 ~ 148 ~ 149 <ju>
and therefore redundant. And there are no known cases of Liao Chinese *tɕy transcribed as 089. (The Khitan consonants written as voiced obstruents corresponded to Chinese voiceless unaspirated obstruents.)
(10.29.0:29: The modern standard Mandarin pronunciation of 九 'nine' as [tɕjow] is not evidence for pronouncing 089 as [dʑu] or [dʑy]. In Liao Chinese, 九 was *kiw, and the Khitan borrowed it as
with a velar stop, not a palatal affricate.)
089 appears at the end of
<284.089> which "must refer to the emperor, the throne, or affairs of state" (Kane 2009: 69): i.e., 'imperial'
Are there any continental 'Altaic' terms for rulers ending in something like -su or -us? The first word that comes to mind is Mongolian ulus 'people, nation' which has already been proposed as a potential cognate of
<xu.177> (see the discussion of this mysterious word in Kane 2009: 162-165)
Could <284.089> have meant 'national'?
The vertically stacked variant of <284.089> is from the fish tally. (The handwritten copy of the fish tally on p. 623 of Qidan xiaozi yanjiu has the regular horizontal combination.) The significance of vertical stacks, if any, is unknown. Back in May I started to collect vertical stacks for a future post, but I never finished.
22.214.171.124:14: LOST WORD FAMILIES
Today I read Stephen Wootton Bushell's account of the end of the Tangut Empire. I looked through Li Fanwen's (2008) Tangut dictionary to translate 亡國 'lost country' and found six equivalents of 亡 'to be lost, die':
1. 0316 1xwɤa 'to lack, die, kill'
2. 0788 2me (second half of 1sə 2me 'death'; the first half is 'to die')
3. 1508 1bɛ 'to lose, fail'
4. 1839 1ɬø 'to lose, fail'
5. 2194 1me 'to not exist, not have'
6. 4007 1phɑ 'to damage, lose'
The m-words belong to an m-family of Tangut negatives related to *m-negatives in Old Chinese (e.g., 亡 *Cɯ-maŋ 'to be lost') and elsewhere in Sino-Tibetan. Only two (1918 and 2376) contain 'not' (Nishida radical 041):
0788 2me < *CE-ma-H or *Cɯ-ma-j-H
This bound morpheme is homophonous with 1064 'not yet', but may have a different pre-Tangut origin because 1064 precedes rather than follows the verb 'to die', and 'not yet die' for 'death' makes no sense.
0944 1mʌ̣ < *Sʌ-mə 'not'
1064 2me < *CE-ma-H or *Cɯ-ma-j-H 'not yet'
1918 1mi < *CI-ma 'not' (the most general negator)
2194 1me < *CE-ma or *Cɯ-ma-j 'to not exist, not have'
2376 2mẹ < *SE-ma-H or *Sɯ-ma-j-H 'nothing, not'
5643 1mə < *mə 'not' (for auxiliary verbs)
I have been tempted to include 1943 2nɨa' 'not' (before 'be') in that list, but I cannot prove that n- is from *mj-. Nor can I explain why an *m-family word would have a medial *-j-. My pre-Tangut reconstruction has no *-j-infix.
If 0316 1xwɤa 'to lack, die, kill' is from *P-xra, it might be related to the first syllable of
3913 4862 1xɤə 2lɨa' < *xrə (Cɯ-)laXH 'to leave' (only in dictionaries).
4862 (also written 4951) is 'frontier, border' by itself, but it would be odd to have a noun in second position if 3913 4862 was a verb-noun sequence. Is 3913 a prefix that derived a verb out of a noun? Or is 4862 a phonetic symbol for a syllable unrelated to 'border' in 3913 4862? Could 3913 4862 be a sequence of verbs: e.g., 'vacate leave'? Then the second half of 3913 4862 might be related to a lateral-initial family of 'loss' words
1068 1lɨə < *lə 'to fall, sink'
1839 1ɬø < *Kɯ-lo < *-əw < *-ə-k? 'to lose, fail'
3545 1ɬəʳ' < *R-K-lə 'to lose, fall'
related to Old Chinese 失 *l̥it 'to lose'.
Li Fanwen (2008: 252) regarded 1508 1bɛ (Grade I rhyme 34) as a loan from Chinese 敗 'to lose', but the two may be unrelated lookalikes, as I would expect Middle Chinese 敗 *bɤajʰ (Grade II) to correspond to Tangut *bɤe (Grade II rhyme 35). Gong (2002: 421) regarded 1508 as an irregular loan. See Gong (2002: 421) for examples of the regular correspondence between Middle Chinese *-ɤaj (his *-aj) and Tangut Grade II rhyme 35 in loanwords.
Li Fanwen (2008: 643) translated 4007 1phɑ 'to damage, lose' as Chinese 破 'to break, smash'. I agree with Gong (2002: 417) who regarded 4007 as a loan from Middle Chinese 破 *phɑ.
126.96.36.199:21: THE GOLDEN GUIDE: LINE 102: TANGRAPHS 506-510
102. Translating lists of Chinese surnames in tangraphy is relatively easy, but not as interesting as translating coherent text. No wonder I haven't been motivated to translate the Golden Guide since I got stuck in the surname section in 2010. If only I had more patience. The last surname is just four lines away!
|Li Fanwen number||4579||2736||4807||2177||2476|
|My reconstructed pronunciation||2l
|Tangraph gloss||the surname element Lu||the surname element Ba||to lose (< Chn 棄 *khi)||big||flower (< Chn 華 *xwɤa)|
|Word||the surname 呂 Lü (*lɨu)||the surname 馬 Ma (*mbɤa)||the surname 杞 Qi (*khɨi) or 祁 Qi (*khɨi)||the surname 不 Bu (*pʌ)||the surname 華 Hua (*xwɤa)|
|Translation||Lü, Ma, Qi, Bu, Hua|
506: The analysis of 4579 is unknown, but its structure is obviously inspired by its Chinese soundalike 呂 (also 吕) which looks like a stack of two 口 mouths. (It is actually a drawing of the spine.) The left side of 4579 consists of two Tangut
mouth radicals. The right side
has no known independent function and could be from 547 (!) other tangraphs. Some radicals can stand alone while others require it as an apparent filler: e.g.,
0764 1reʳ 'horse' (with a radical derived from Chinese 馬 'horse' on the left)
which brings us to the next tangraph.
507: The analysis of 2736 is also unknown, but it is obviously related to 0764 'horse' (above).
2736 2bɤa' sounds like Chinese 馬 *mbɤa 'horse', the translation equivalent of 0764 1reʳ. The mysterious phonetic feature that I write as an apostrophe must not have made 2bɤa' sound too different from Chinese *mbɤa. (Tangut b- might have been prenasalized [mb].)
It would be nice if the top left radical of 2736 were a diacritic indicating that a tangraph was to be read like its Chinese translation. However, I doubt that is the case. I don't have time to investigate all 42 tangraphs with that radical on the left at the moment, so for now I'll pick one at random which may not be representative:
2314 2ʔɨu 'death' (only in dictionaries?; analysis unknown)
does not sound like any Chinese word for 'death', and subtracting its left-hand radical results in
5156 1vɑ (name and transcription character; see 369) =
left of 5489 2ryʳ (surname element rur) +
right of 1925 2bɨu (surname element -bu)
which has nothing to do with death and doesn't even sound like 2ʔɨu. I presume that a va-family had something to do with families whose names contained the syllables rur and bu.
Miscellaneous Tangraphs (27.7.11-12, #837) lists these last two Chinese surnames in the opposite order:
2736 2bɤa' 'Ma' and 4579 2lɨu 'Lü'
508: The other three tangraphs have surviving analyses; ironically one of them is 'to lose':
4807 1khi 'to lose' =
top of 4910 2ve (second half of 1ʂwo 2ve 'to clear away, clean up'; semantic) +
all of 3545 1ɬəʳ' 'to lose, fall' (cognate to Old Chinese 失 *l̥it 'to lose'?; semantic)
3545 has a circular analysis:
3545 1ɬəʳ' 'to lose, fall' =
bottom left of 1068 1lɨə 'to fall, sink' (cognate to 3545?; semantic) +
bottom right of 4807 1khi 'to lose' (semantic)
3545 looks like a semantic compound of 'die' (Nishida's radical 045) and 'hand':
509: 2177 is a semantophonetic compound:
2177 1pʌ 'big' =
left of 2892 2khwɛ 'big' (< Chn 魁 *khwɛ) (semantic) +
all of 2306 1pʌ (second half of 2tsoʳ 1pʌ 'small colt') (phonetic)
2306 has a dubious circular analysis:
2306 1pʌ =
center of 2177 1pʌ (phonetic) +
right of 2132 2ʔjew 'achievement' (why?)
The analysis of 2132 also leads back to 2177:
2132 2ʔjew =
2477 2thọ (second half of 1dza 2thọ 'to grow up'; semantic) +
2177 1pʌ 'big' (semantic)
I think 2306 came first, followed by 2177 and then 2132.
510: I'm not surprised the tangraph for the loanword for 'flower' is derived from the tangraph for the native word, but what is 'head' doing?
2476 1xwɤa 'flower' (< Chn 華 *xwɤa) =
left and center of 2750 1ɣɤu 'head' (why?) +
right of 2467 1vạ 'flower' (semantic)
2467 1vạ superficially resembles Old Chinese 華 *wra, the source of *xwɤa, but it goes back to *Sɯ-wa which has no *-r-. If the medial *-r- of OC *wra is a metathesized prefix -
*T-wa > *r-wa > *wra
- then perhaps the Chinese and Tangut words for 'flower' are related. But if Baxter and Sagart (2014) are right, 華 was OC *qʷʰˁra, sharing nothing in common with Tangut *Sɯ-wa other than a vowel.
188.8.131.52:52: THE GOLDEN GUIDE: LINE 101: TANGRAPHS 501-505
101. I couldn't resist the opportunity to try out my newest Tangut vowel reconstruction in a
continuation of where I left off last
year (even if doing so entailed inconsistency with my
reconstructions of lines 1-100).
|Li Fanwen number||4695||5087||2259||3951||2042|
|My reconstructed pronunciation||1giw'
|Tangraph gloss|| the name Giw
||the Chinese surnames Yang and Wang||the surname element Me
||to talk, speak
|Word||the surname 牛 Niu (*ŋgɨiw)||the surname 酒 Yang (*jø̃)||the surname 孟 Meng (*mɤẽ)||the surname 杜 Du (*thu)||the surname 家 Jia (*kɤa)|
|Translation||Niu, Yang, Meng, Du, Jia
501: 4695 1giw' contains 1909 1guʳ
'ox, cattle' as a 'xenophonetic' (i.e., a phonetic element chosen for
the pronunciation of its translation in another language: in this case,
Tangut period northwestern Chinese *ŋgɨiw 'ox'). However, 1909
is not in its Tangraphic Sea analysis:
4695 1giw' 'the name Giw' =
top of 4940 2ʔjə 'the surname Y' (a family associated with the Giw?) +
bottom of 4107 1giw' (first syllable of 1giw' 1kie 'a kind of plant')
Nor is 1909 in the analysis of 4107 which takes us back to 4695:
4107 1giw' (first syllable of 1giw' 1kie 'a kind of plant') =
top of 4303 1kie (second syllable of 1giw' 1kie 'a kind of plant') +
4695 1giw' 'the name Giw'
The Tangraphic Sea analysis of 1909 is dubious; surely its 'sources' are actually its derivatives:
1909 1guʳ 'ox, cattle' =
part of the bottom of 4704 2rɛʳ 'ox, elephant' (i.e., large mammal?) +
part of the bottom of 0021 2bɨu 'ox, elephant' (synonym of 4704)
1909 is not a simple pictograph. It seems to contain 'not' (left), a Tangut derivative of 羊 'goat' (center), and a mysterious right-hand element whose function eludes me. 'Not' must be an abbreviation of some other tangraph.
Chinese 牛 *ŋgɨiw < *ŋʷəʔ and Tangut 1guʳ < *Nʌ-gur or *Tʌ-ŋgu are vaguely similar but difficult to relate. A zero grade Tangut derivative of the root *ŋʷʔ would be *2ŋu, not 1guʳ.
502: The analysis of this surname tangraph makes me wonder if there was a Yang family associated with sheep and birds.
5087 1ʔjø̃ 'Yang' =
center of 3452 2ʔje 'sheep' +
left of 2262 1dʐwɨõ 'bird' +
right of 2107 1tsɪʳ 'earth'
I am not sure it is necessary to reconstruct a glottal stop before *j-.
It is odd that Tangut had ʔj- but no simple j-, and I
cannot account for the ʔ- in 1ʔjaʳ < *rjat
'eight' unless it is a remnant of a prefix.
5087 also transcribed the surname 王 *wɨõ 'Wang'. Another Tangut transcription was in 412:
0403 1võ (Chinese transcription character)
503: 2259 is a straightforward semantophonetic compound:
2259 2mɤe (the surname element Me) =
left of 2888 'surname' (semantic) +
center and right of 1966 1mɤe 'to call, greet' (phonetic)
2259 2mɤe did not have a nasal vowel like Chinese 孟 *mɤẽ 'Meng', but perhaps the Tangut thought it was appropriate to write a Chinese surname with a tangraph for a similar-sounding syllable in indigenous surnames such as
0493 2259 2sə 2mɤe 'Syme' and 2259 0714 2mɤe 1tʂɤew 'Mechew'.
504: 3951 is a phonosemantic compound used as a
transcription of an unrelated Chinese name:
3951 1thu 'to talk' =
left of 3949 1thu (second syllable of 2kyʳ 1thu 'skill') (phonetic) +
right of 1045 2dạ 'speech' (semantic)
505: I could guess the analysis of this tangraph
even before seeing it in Li Fanwen (2008: 339):
2042 2kɤa 'duck' =
left of 3058 2ɮəʳ' 'water' +
right of 2262 1dʐwɨõ 'bird'
Such transparent tangraphs are rare, which is why I continue to wonder how tangraphs were learned.
184.108.40.206:16: ALBANIAN 'SALT' FROM 'GROATS'?
Two months ago I was looking up the reflexes of Proto-Indo-European *seʕl- 'salt' and was surprised to see Albanian ngjelmët 'salty'. I've long been puzzled by how *s- became gj- (part 1 / part 2). Today I found Matasović's (2012: 14) reconstruction of the stages between *s- and gj- which are like mine from two years ago:
*s- > *ś- > *ź- > gj-
But where did the n- in ngj- [ɲɟ] come from? Orel (1998: 298) reconstructed Proto-Albanian *en-salma. What is the prefix *en- doing? Is it *en 'in' which is a verbal prefix (Orel 2000: 168)? Or is it another prefix? Wiktionary has a prefix *(a)n- without any attribution.
The unrelated Albanian noun kripa < *krūpā 'salt' (Orel 1998: 197) is a loan from ... Slavic 'groats' (e.g., Russian krupa)! What is the semantic bridge between 'salt' and 'groats'?
220.127.116.11:59: UMBROUS UMBRELLA
Umbrellas have been in the news lately. The Sino-Vietnamese (SV) reading of Cantonese 遮 ze 'umbrella' (< 'to obstruct') is già with an irregular huyền 'dark' tone. In Middle Chinese (MC), 遮 was *tɕja. Normally
MC *tɕ- corresponds to SV ch- [c]
MC *-ja corresponds to SV -a
which reminds me of my recent derivation of Tangut rhyme 20 -a from *-ia
the MC 'yin level' tone corresponds to the SV ngang 'level' tone
so I would expect the SV reading of 遮 to be *cha with a ngang tone indicated by the absence of a tonal diacritic. However, the actual reading già on the surface not only has initial gi- [z] ~ [j] but also has a huyền tone implying a *voiced initial.
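As a rough check on that expectation, here is a minimal Python sketch that applies only the three correspondences just listed. The table and function names are mine and purely illustrative, and the tables contain nothing beyond what this post states.

```python
# Regular correspondences quoted in the text above (nothing else is assumed).
INITIALS = {"tɕ": "ch"}          # MC *tɕ- : SV ch- [c]
RHYMES = {"ja": "a"}             # MC *-ja : SV -a
TONES = {"yin level": "ngang"}   # MC 'yin level' : SV ngang (no diacritic)

def expected_sv(initial, rhyme, tone):
    """Return (spelling, tone name) predicted by the regular correspondences."""
    return INITIALS[initial] + RHYMES[rhyme], TONES[tone]

predicted = expected_sv("tɕ", "ja", "yin level")   # 遮 MC *tɕja
attested = ("già", "huyền")                        # the actual SV reading

print("predicted:", predicted)
print("attested: ", attested)
print("regular?  ", predicted == attested)
```

The mismatch in both the initial and the tone is what the rest of this post tries to explain.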
The initial gi- turns out to be regular.* Annamese Middle Chinese (AMC)* *tɕ- was borrowed as Old Vietnamese (OV) *c- before all rhymes other than *-ja. (See further exceptions here.**) Both *k- and *c- voiced before *-j- in Old Vietnamese, merging into Middle Vietnamese (MV) [ɟ] and leniting to [z] or [j] in New Vietnamese (NV):
OV *kj- > *gj- > MV [ɟ] > NV [z] ~ [j]
OV *cj- > *ɟj- > MV [ɟ] > NV [z] ~ [j]
The spelling gi- reflects a *ɟ-like pronunciation in 17th century Middle Vietnamese.
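The two derivations can be written as an ordered list of changes. The following Python sketch is only a restatement of the chains above, not a general sound-change engine; the stage labels and the function name are my own.

```python
# Each stage is a list of (old initial, new initial) pairs taken from the
# two derivations above. Initials are written without the leading asterisk.
STAGES = [
    ("OV voicing before -j-", [("kj", "gj"), ("cj", "ɟj")]),
    ("merger in MV", [("gj", "ɟ"), ("ɟj", "ɟ")]),
    ("lenition in NV", [("ɟ", "z ~ j")]),
]

def derive(initial):
    """Return the chain of forms an OV initial passes through."""
    chain = [initial]
    for _name, rules in STAGES:
        for old, new in rules:
            if chain[-1] == old:
                chain.append(new)
                break  # at most one change per stage
    return chain

for ov in ("kj", "cj"):
    print(" > ".join(derive(ov)))
```

Both chains end in the same NV outcome, which is the merger described above. An initial that matches no rule is returned unchanged.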
The voicing implied by Vietnamese tones reflects primary rather than secondary voicing: e.g.,
加 MC *kæ > AMC *kja > OV *kja > *gja > SV gia 'to add'
加 has a ngang tone reflecting its original *voiceless initial and not its secondary *voiced initial gj-.
伽 MC *gɨa > AMC *kjà > OV *kjà > *gjà > SV già 'transcription of Indic ga'
Similarly, the huyền tone of 伽 reflects its original *g- rather than the new *g- that developed in OV.
I wondered if MC *tɕj- had become AMC *dʑj- with 'yang' tones, but MC *tɕj- non-'level' tone syllables have SV tones implying *voiceless initials:
者 MC *tɕjaˀ with 'rising' tone > AMC *tɕjả > OV *cjả > *ɟjả > SV giả (not SV *giã) 'nominalizer'
蔗 MC *tɕjaʰ with 'departing' tone > AMC *tɕjá > OV *cjá > *ɟjá > SV giá (not SV *giạ) 'sugar cane'
If *tɕj- became AMC *dʑj- only in 'level' syllables, what would be the phonetic motivation for such a limited change? Why would 'oblique' (i.e., non-'level') tones be anti-voicing?
Original MC *dʑj- and MC *ʑj- apparently merged into AMC *tɕʰ- with 'yang' tones***: e.g.,
社 MC *dʑjaˀ > AMC *tɕʰjã > > OV *cʰjã > MV [ɕã] > SV xã 'altar for the god of the soil'
蛇 MC *ʑja > AMC *tɕʰjà > > OV *cʰjà > MV [ɕà] > SV xà 'snake'
cf. 車 MC *tɕʰja > AMC *tɕʰja > OV *cʰja > MV [ɕa] > SV xa 'cart' with an original *voiceless initial and ngang tone (i.e., a 'yin' tone)
The aspiration of OV *cʰ- might have blocked voicing before *-j-. Conversely, *-j- could have become voiceless after *cʰ-: *cʰj- > *cʰj̊-.
MC *dʑj- and MC *ʑj- merged into AMC *ɕ- with 'yang' tones when not followed by *-j-: e.g.,
臣 MC *dʑjin > AMC *ɕə̀n > > OV *sʰə̀n > MV [tʰə̀n] > SV thần 'minister'
神 MC *ʑjin > AMC *ɕə̀n > > OV *sʰə̀n > MV [tʰə̀n] > SV thần 'god'
cf. 申 MC *ɕin > AMC *ɕən > OV *sʰən > MV [tʰən] > SV thân 'ninth Earthly Branch' with an original *voiceless initial and ngang tone (i.e., a 'yin' tone)
Summing up the history of shibilants in SV (with some more details):
|*kɤ- > *kɣ- > *kɰ-||*kj-||*kj- > *gj-||[ɟ]||gi-||加|
|*tɕj-||*tɕj-||*cj- > *ɟj-||遮|
I can't explain why there was a four-way merger of MC *tɕʰj-, *dʑj-, *ʑj-, and *ɕj- but only a three-way merger of MC *dʑ-, *ʑ-, and *ɕ-. Was there a three-way merger of *Cj-clusters in AMC and OV parallel to the other three-way merger?
I am reluctant to reconstruct aspirated fricatives in OV, but they allow me to formulate a single rule covering two changes:
OV *s(ʰ)- > MV [t(ʰ)]
Reconstructing palatal fricatives in OV forces me to formulate two rules:
OV *s- > MV [t]
OV *ɕ- > MV [tʰ]
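The difference between the two formulations can be seen in a small Python sketch. This is my own illustration, with invented function names: the first function keeps aspiration as an optional feature carried through one rule, while the second needs a separate branch for each initial.

```python
import re

def ov_to_mv_single(form):
    """One rule: OV *s(ʰ)- > MV [t(ʰ)]; aspiration, if present, is kept."""
    return re.sub(r"^s(ʰ?)", r"t\1", form)

def ov_to_mv_two_rules(form):
    """Two rules for the palatal alternative: *s- > [t] and *ɕ- > [tʰ]."""
    if form.startswith("ɕ"):
        return "tʰ" + form[1:]
    if form.startswith("s"):
        return "t" + form[1:]
    return form

print(ov_to_mv_single("sʰən"))     # 申 with aspirated *sʰ-
print(ov_to_mv_two_rules("ɕən"))   # the same word with palatal *ɕ-
```

Both routes give the same MV form for 申; the single-rule version simply avoids stipulating that a palatal fricative hardens into an aspirated stop.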
I do not know of any modern Vietic language with [ɕ]. Then again, I do not know of any modern Vietic language with [sʰ] which is of course a rare sound in the world's languages.
I could reconstruct palatal stops instead of affricates in AMC or palatal affricates instead of stops in OV, but I presume that AMC had affricates like other Chinese dialects and OV had palatal stops like modern Vietnamese. There is no guarantee that was the case: e.g., AMC could have had palatal stops due to Vietnamese influence. (One could use a term like 'Annamese' to avoid the anachronism of 'Vietnamese' as a name for the early Vietic language of Annam.)
*AMC is the dialect of Middle Chinese that developed in Annam and later became extinct after the independence of Vietnam. See Phan (2013).
I write AMC tones using Vietnamese tone marks for convenience. I would not be surprised if the phonologies of the two languages had converged.
**炙 MC *tɕjaʰ 'to roast meat' corresponds to SV chả (< *cả) and chá (< *cá) with ch- [c] instead of gi-. The tone of SV chả indicates that it is an older loan borrowed before the convention of borrowing *tɕj- as *cj-. The tone of SV chá with a tonal reflex characteristic of newer loans may indicate that tonal borrowing patterns changed shortly before the convention of borrowing *tɕj- as *cj- with a *-j- that later conditioned the voicing of the preceding *c-.
***I am using 'yin' and 'yang' as shorthand for 'normally**** conditioned by voiceless initial' and 'normally conditioned by voiced initial'. Here are the six written***** Vietnamese tones and their 'yin/yang' status:
The name of each tone contains its characteristic diacritic (or no diacritic in the case of unmarked ngang).
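The remark about diacritics can be checked mechanically. The sketch below is mine, not part of the original analysis: it assumes the usual pairing of the six tones into 'yin' and 'yang' registers (consistent with the examples above, e.g. giả vs. *giã and giá vs. *giạ) and uses Unicode decomposition to pull the tone mark out of each tone name. Vowel-quality marks such as the circumflex and breve are deliberately ignored.

```python
import unicodedata

# Combining characters that mark tone in Vietnamese spelling.
TONE_MARKS = {
    "\u0300": "grave",
    "\u0301": "acute",
    "\u0303": "tilde",
    "\u0309": "hook above",
    "\u0323": "dot below",
}

# ASSUMPTION: the conventional yin/yang pairing of the six written tones.
REGISTER = {
    "ngang": "yin",
    "huyền": "yang",
    "hỏi": "yin",
    "ngã": "yang",
    "sắc": "yin",
    "nặng": "yang",
}


def tone_mark(word: str):
    """Return the name of the tone diacritic in `word`, or None if it is unmarked."""
    for ch in unicodedata.normalize("NFD", word):
        if ch in TONE_MARKS:
            return TONE_MARKS[ch]
    return None


for name, register in REGISTER.items():
    print(name, register, tone_mark(name))
```

Only ngang comes back as `None`, matching the statement that it is the unmarked tone.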
****There are 'yang' tones in Chinese in syllables with *voiceless initials: e.g., standard Mandarin 國 guó < *kwək 'country' which has a 'yang level' tone even though it originally had a 'yin entering' tone.
*****Southern Vietnamese speakers merge the ngã tone with the hỏi tone, but that is not reflected in spelling which mostly reflects Middle Vietnamese.
18.104.22.168:57: AURAL DOUBLES (PART 2)
A recap of part 1: Tangut had two syllables with similar fanqie ('to hear' as an initial speller plus a rhyme 20 final speller):
3369 1mia 'transcription character for Skt ma, mā' and sixteen homophones = 5026 1mi 'to hear' + 3853 1tia 'topic marker'
5025 2mia 'transcription character for Skt mya' = top and bottom left of 5026 1mi 'to hear' + left of 5314 2ʔia 'transcription character for Sanskrit ya'
If both syllables were mia (disregarding tones), why was 3369 1mia used to transcribe Sanskrit ma and mā without -y-? And why create 5025 2mia as an 'aural double' of 3369 1mia etc. if 1mia was already a good match for Sanskrit mya?
The answer to both questions is the same: 3369 etc. were actually 1ma, not 1mia, so a special character had to be created to transcribe Sanskrit mya.
But wait - if rhyme 20 was -a, then I can't reconstruct rhyme 17 as -a anymore. What was rhyme 17? To answer that question and the questions I asked at the end of part 1 -
Why did I reconstruct -i- in rhyme 20? Can this -i- be salvaged?
- I need to write about 'grades'. I've already covered the topic in "G-*r-adation in Chinese" (part 1 / part 2) and "G-*r-adation in Tangut" (part 1 / part 2), but I've changed my mind about a few things over the past day.
In the Yunjing rhyme tables for some unknown variety of Late Middle Chinese, a-type syllables were placed in four tables:
|Grade \ Table||27||28||29||30|
Vietnamese, Korean, and Japanese loans from Late Middle Chinese have /-(w)a/ for all of those rhymes, so their vowels must have been a-like. One could reconstruct a single Yunjing phoneme */a/ and compress the four tables into two:
|Grade \ Table||27+29||28+30|
But why didn't the author of the Yunjing do that? I think it's because */a/ had two allophones, back *[ɑ] and central and/or front *[a]. The *[ɑ] rhymes were placed in tables 27 and 28, while the *[a] rhymes were placed in tables 29 and 30. I reconstruct these allophones on the basis of correspondences with standard Mandarin and Cantonese. (The latter two languages are probably not descendants of the Yunjing language, but their ancestors were probably similar to it.)
|Grade \ Table||27+29||28+30|
|I||[ɤ] after velars, [wɔ] elsewhere|
after *back initials, [a] elsewhere
|Grade \ Table||27+29||28+30|
|II||[aː]|| [waː] after velars, [aː]
The Cantonese pattern is quite clear:
- Grade I: back vowel
- Grade II: central vowel
- Grades III and IV: front vowels
The Mandarin pattern is complicated by these shifts:
*[ɔ] > [ɤ] after velars, [wɔ] elsewhere
*[wɨɑ] > *[wja] > *[ɥa] > *[ɥɛ]
*[ɛ] > [ɤ] after retroflexes
Sino-Vietnamese, Sino-Korean, and Sino-Japanese data for some non-a
rhymes indicate that Grade IV was more palatal than Grade III (which
may have been entirely nonpalatal in the source dialects of SK and the
Go-on layer of SJ): e.g.,
||Sino-Korean (premodern spelling)
||-ŏn after back initials; -yŏn
||-on < *-ən
||-iên with palatalization of labial
initials: *pʲ- > t-, etc.
Similarly, Mandarin Grade IV [jɛ] is more palatal than Grade III [ɤ].
All these diverse sources give us some idea of what the four grades in Chinese were like:
- I was backer than the others
- IV was more palatal than III
We do not know for sure that Tangut also had grades. I do not know of any Tangut term for 'grade'. However, patterns of correlation between Tangut rhymes and Chinese grades in transcriptions have been known for over half a century. Moreover, those patterns also correlate with Tangut initials.
Here is a new Tangut-internal definition of 'grades'. One could identify the grade of a Tangut rhyme by looking at which initials may precede it:
|Grade \ Initial
The table above is only a first approximation.
I classify rhymes which can be preceded by any initial as Grade
III/IV. One could also consider such rhymes Grade V, though such a term
would have no parallel in the Chinese tradition.
Compare that distribution of initials with the distribution of
Chinese initials in Yunjing:
|Grade \ Initial
||*w- and labiodentals
The two patterns are not identical, but there are similarities:
- Labiodentals and r- never appeared in Grade IV.
- Dentals and sibilants were in near-complementary distribution with shibilants.
- l- was infrequent in Grade IV.
I think these similarities were due to Chinese influence on Tangut. Of course, Tangut had its own history, which is why the parallels are not absolute: e.g.,
- Tangut had Grade I and II v- unlike Chinese (in which *w- became *ɣw- in Grades I and II - a change absent from Tangut).
- Tangut had Grade I r- unlike Chinese (in which *r- became *l-; Yunjing *r- is from *n-)
The distribution of initials in each grade tells us whether certain
grades were 'friendly' or 'hostile' toward certain initials. Such
'attitudes' give us clues about the phonetic characteristics of both
grades and initials. For example, the fact that shibilants never occur
in Grade IV, the most palatal of the grades, tells us that they were
not palatal in either the Yunjing language or Tangut. That is
why I reconstruct retroflex shibilants. One can also make historical
inferences from (near-)complementary distribution: e.g., Chinese
shibilants derived from dentals and sibilants, and Tangut shibilants
may have partly derived from dentals and/or sibilants.
Having established strong parallels between grades in the two
languages, I used to think that Grade IV a was the same in Yunjing
Chinese and Tangut: i.e., -ia. But I could not explain why
Tangut rhyme 20 -ia
- transcribed Sanskrit -a and -ā (and nearly all rhyme 20 characters for Sanskrit -ya syllables were fanqie tangraphs combining part of an initial speller with the left side of the character transcribing Sanskrit ya: e.g., 5025)
- was transcribed as -a(H) in Tibetan
Now I think I have a solution:
||Transcribed in Tibetan as
||-a (rare; only after shibilants)
If the Chinese dialect known to the Tangut was similar to the Yunjing language, it had four kinds of a-rhymes which were similar to Tangut rhymes 17-20.
The Grade IV a-type rhyme of late Tang Dynasty northwestern
Chinese was transcribed in Tibetan as -ya, matching the *-ia
I reconstructed for the Yunjing language. Maybe that rhyme was
still *-ia in the eleventh century, and the Tangut thought its
front (?) *a was like the front vowel of their rhyme 20.
The Tangut transcribed Sanskrit central a and ā - vowels absent from their language - with both back ɑ (rhyme 17) and front a (rhymes 18-20).
I suspect that rhyme 20 was once an *-ia that simplified to -a
after all initials except glottal stop. Hence the rhyme 20 tangraphs
1767 1ʔia and 5314 2ʔia
transcribed Sanskrit ya and yā. There was no rhyme 20 *ʔa. The *i of pre-Tangut *-ia may have been conditioned by a preceding presyllable with a high vowel, as Japhug cognates identified by Guillaume Jacques (2006) lack i:
0335 1pha < *Cɯ-pha : J ɯ-phaʁ 'side'
1530 1ma < *Cɯ-ma : J smar 'river'
2098 2ŋa < *Cɯ-ŋa-H : J aʑo < *ŋa-jaŋ 'I' (also cf. Old Chinese 吾 *ŋa 'I')
4225 1sa < *Cɯ-sa : J kɤ-sat 'to kill' (also cf. Old Chinese 殺 *ksat 'to kill')
4459 2ba < *Nɯ-ba-H 'to cut': J kɤ-mbaʁ 'to be cut'
Tangut 3926 and 4601 2na < *Cɯ-naH 'thou' and second person singular verb suffix
correspond to Old Chinese 汝 *Cɯ-naʔ 'thou'.
If Grade IV rhyme 20 lacked -i-, and Tangut Grade IV was
characterized by frontness contrasting with the backness of Grade I, I
can revise my
vowel reconstructions as follows:
|IV: fronter/higher||i||e < *ie
||ə < *iə||a < *ia
||y < *iu
||ø < *io|
That table is not as simple as its predecessor from four months ago, but it fits the Tibetan and Sanskrit transcription evidence better.
22.214.171.124:59: AURAL DOUBLES (PART 1)
I remain troubled by my reconstruction of Tangut rhyme 20 (1.20/2.17) as -ia. Let's look at the transcription evidence for (or should I say against?) the syllable 1mia from my last post:
1. In Pearl in the Palm,
0092 1mia 'mother'
was transcribed in 12th century northwestern Chinese as 麻 *mbɤa. Granted, there was no Chinese *mia, so this does not necessarily mean 1mia is wrong.
2. On the other hand, it is possible to write mya in the Tibetan script, and yet 0092 was transcribed eight times as ma. Moreover, all rhyme 20 syllables were consistently transcribed without -y-. The Tibetan evidence favors reconstructions of rhyme 20 like Arakawa's -a: and Sofronov's -a (see this table).
3. Moreover, rhyme 20 was often used to transcribe Sanskrit -a and -ā. That is another point in favor of Sofronov's -a. Sofronov did not reconstruct a length distinction in Tangut, whereas Arakawa did. I would expect Arakawa's length distinction to correspond to the length distinction of Sanskrit, but it doesn't: e.g., Arakawa's long -a: may correspond to Sanskrit short -a as well as long -ā, and vice versa. (10.23.1:09: Gong's length distinction that I used to carry over into my reconstruction also did not correspond to Sanskrit length:
|Sanskrit||Tangut rhyme||Sofronov||Arakawa||Gong||This site until recently||This site now|
|23||-âˁ, -jaˁ, -äˁ||-ya'||-iaa||-ææ||-ɤa'|
|a, ā||24||-aɯ, -âɯ||-a:'||-jaa||-iaa||-ia'|
Colors indicate length: pink = short, green = mixed, blue = long.
Rhyme 22 could not have been a simple -a or -aa, as it was never used to write Sanskrit. Rhymes 18 and 23 were also un-Sanskrit.)
If rhyme 20 were -ia, there would be no reason to create a special fanqie character
5025 2mia = top and bottom left of 5026 1mi 'to hear' + left of 5314 2ʔia 'transcription character for Sanskrit ya'
to transcribe Sanskrit mya, since one of the seventeen 1mia characters with the fanqie
5026 1mi 'to hear' + 3853 1tia 'topic marker'
would have been sufficient. However,
actually transcribed Sanskrit ma and mā without -y-!
(10.23.0:33: One might think that 5025 was created for Sanskrit mya because the second tone was favored for Sanskrit words. But tones in Sanskrit transcription seem to be random: e.g.,
- ma was transcribed with both 3369 1mia and 4737 2ma
- mi was transcribed with 5026 1mi, the initial fanqie speller for 5025 and 3369
- Cya syllables were transcribed with both first and second tone tangraphs
I doubt tones in Tangut transcriptions of Sanskrit had anything to do with Vedic pitch accent which was absent from Buddhist Sanskrit.)
In Arakawa's (1997) Nishida-style reconstruction, the reason for 'aural doubles' - tangraphs with slightly different fanqie containing 5026 1mi 'to hear' - is clear: 5025 was 2myaɦ, whereas 3369 and its sixteen homophones were 1maɦ without -y-.
In Arakawa's own reconstruction, 5025 might be 2mya: contrasting with 3369 and sixteen other 1ma:. (Yet there is no 2mya: or 2ma: on pp.128-129 of Arakawa's 1997 syllabary, though there are seventeen 1ma:.)
|Tangraph||Li Fanwen number||Sanskrit transcription value||Nishida-style from Arakawa 1997||Arakawa 1997?||Gong
|5025||mya||2myaɦ||(2mya:?; not in his syllabary)||2mja
|3369||ma, mā||1maɦ||1ma: (with long vowel for Skt ma!)||1mja
||1mia (with -i- for -i-/-y-less Skt ma, mā!)|
(10.23.1:46: I don't know how Sofronov would reconstruct 5025 and 3369 today. In 1968, he reconstructed them as 2ma and 1ma.)
At least everyone agrees that rhyme 20 was a-like, which is why I render it as -a in my lay transcription of Tangut.
Next: Why did I reconstruct -i- in rhyme 20? Can this -i- be salvaged?
126.96.36.199:59: WHY SO MIA-NY?
I have been writing about names of Kumārajīva
lately (part 1 / part 2) such as Tangut
3948 3369 3284 2152 3284 (again!) 1kɨa' 1mia 2lɨa 1ʂɨi 2lɨa
The tangraph transcribing mā was one of the rhyme 1.20 syllables in the Tangraphic Sea that I listed last week. Most were written with one or two tangraphs, but 1mia was written with seventeen! (For comparison I have also included the corresponding rising tone syllable 2mia with rhyme 2.17.)
|Tangraph||Li Fanwen number||Reading||Li Fanwen gloss||Type (* = only in dictionaries)|
|0092||1mia||mother (cf. 3334)||free morpheme 1|
|0409||former times (only in dictionaries?; combines with regular word for 'day')||bound morpheme 1*|
|1178||first half of 1mia 2nie 'end' (only in dictionaries; cf. 3369)||free morpheme 1 in a compound 'end-tail'*|
|1215||first half of 1mia 2mɤe' 'to think of, to long for' (only in dictionaries)||morpheme half 1*|
|1216||ten thousand (loan from Late Old Chinese 萬 *mɨanh 'id.'?)||free morpheme 2|
|1458||second half of 2ni' 1mia 'salamander' (only in dictionaries)||bound morpheme 2* after a Chinese loanword 鯢 'salamander'|
|1530||river||free morpheme 3|
|1721||stirrup||free morpheme 4|
|1803||first half of 1mia 1ɬiu' 'gray', name of an ancestor (only in dictionaries)||morpheme half 2*, free morpheme 2*|
|2270||last syllable of (2mɪ) 2mɪ 1mia 'a kind of bird' (only in dictionaries)||morpheme part 3*|
|2648||first half of 1mia 1khiu 'underground' (1khiu is 'under')||bound morpheme 1|
|3334||female, woman (cf. 0092)||free morpheme 1|
|3369||end, tail, east (only in dictionaries; cf. 1178); first syllable of 1mia 2ɬiụ 'plantain' and 1mia ?xa 'water buffalo'; transcription of Sanskrit ma, mā||free morpheme 1*, morpheme half 1, morpheme half 2, (not in Tangut words)|
|3527||analogy; generally; doubt, fear (i.e., uncertain); and; few; should (i.e., to be time for), time; clothes||free morphemes 5-11|
|3569||fishing hook||free morpheme 12|
|3718||second half of 1ɣa 1mia 'doorframe' (1ɣa is 'door')||bound morpheme 2|
|5118||second half of 1niu 1mia 'earring' (1niu is 'ear')||bound morpheme 3|
|5025||2mia||transcription of Sanskrit mya||(not in Tangut words)|
Why are there so many 1mia - and no native 2mia? The lower frequency of second tone syllables indicates that the source of the second tone must have been something extra which I reconstruct as a final glottal *-H by analogy with Chinese.
I reconstruct *Cɯ-ma(C) as the pre-Tangut source of 1mia. The high presyllabic vowel conditioned the breaking of the main vowel:
*C₁ɯ-ma(C₂) > *C₁ɯ-mɨa > *mɨa > 1mia
I don't know when the final consonant was lost relative to vowel breaking.
The various 1mia may have had different presyllabic and/or final consonants in pre-Tangut: e.g.,
*kɯ-map, *tɯ-mak, *pɯ-ma, etc.
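The derivation above can be sketched as a small Python function showing how distinct pre-Tangut forms would collapse into one syllable. This is only an illustration under stated assumptions: the function name is hypothetical, the final consonant is dropped at the breaking step purely for convenience (its real timing is unknown, as noted above), and the first tone is hard-coded because no *-H is involved.

```python
def derive(presyllable: str, onset: str, coda: str = "") -> list[str]:
    """Trace *C₁ɯ-ma(C₂) > *C₁ɯ-mɨa > *mɨa > 1mia for one pre-Tangut form."""
    stages = [f"*{presyllable}-{onset}a{coda}"]

    # Breaking: a high presyllabic vowel (ɯ) turns the main vowel a into ɨa.
    nucleus = "ɨa" if "ɯ" in presyllable else "a"
    # ASSUMPTION: the coda is dropped here; its true relative timing is unknown.
    stages.append(f"*{presyllable}-{onset}{nucleus}")

    # Loss of the presyllable.
    stages.append(f"*{onset}{nucleus}")

    # ɨa > ia; tone 1 is assumed because there is no final *-H.
    stages.append("1" + onset + nucleus.replace("ɨ", "i"))
    return stages


for form in [("kɯ", "m", "p"), ("tɯ", "m", "k"), ("pɯ", "m", "")]:
    print(" > ".join(derive(*form)))
```

All three hypothetical inputs end in 1mia, which is one way to picture where the many homophones could have come from.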
I count 24 types of 1mia:
17 in texts (not just dictionaries; pink):
- 12 free morphemes (0092 = 3334, 1216, 1530, 1721, 3527 [seven homophones!?], 3569)
- 3 bound morphemes (2648, 3718, 5118)
- 2 parts of polysyllabic morphemes (3369 [two homophones])
7 only in dictionaries (blue; possible 'ritual language' words and/or words that didn't happen to appear in Buddhist, Confucian, military, etc. texts: e.g., 'salamander'):
- 2 free morphemes (1178 = 3369, 1803)
- 2 bound morphemes (0409, 1458)
- 3 parts of polysyllabic morphemes (1215, 1803, 2270)
Green indicates a tangraph (3369) that represents one morpheme only in dictionaries and parts of words in texts.
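As a quick sanity check on the count, the tallies can be added up in a few lines of Python. The dictionary names are mine; the numbers are copied from the breakdown above.

```python
# Counts copied from the breakdown above; variable names are illustrative only.
IN_TEXTS = {
    "free morphemes": 12,
    "bound morphemes": 3,
    "parts of polysyllabic morphemes": 2,
}
DICTIONARY_ONLY = {
    "free morphemes": 2,
    "bound morphemes": 2,
    "parts of polysyllabic morphemes": 3,
}

# The twelve free morphemes in texts, by tangraph (0092 and 3334 write one morpheme).
FREE_IN_TEXTS = {"0092 = 3334": 1, "1216": 1, "1530": 1, "1721": 1, "3527": 7, "3569": 1}

text_total = sum(IN_TEXTS.values())
dictionary_total = sum(DICTIONARY_ONLY.values())
print(text_total, dictionary_total, text_total + dictionary_total)
```

The subtotals come to 17 and 7, giving the 24 types counted above.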
Further analysis may be able to reduce the number of types of 1mia: e.g., the 1mia in 1458 2ni' 1mia 'salamander' may be 'river' and the 1mia in 4681 5118 1niu 1mia 'earring' may be 'hook'.
Although one could describe tangraphy as 'logography' (i.e., as a word-per-character writing system), 3527 might have represented up to seven unrelated words! Conversely, the word 1mia 'female' was written with two tangraphs (0092 and 3334) depending on whether it referred to mothers or females in general. And 1mia 'end' was written differently depending on whether it was an independent word (3369) or in the compound 1178 5734 1mia 2nie 'end-tail'.
10.22.1:54: A high degree of homophony is tolerable: e.g., English can can mean
1. to be able
2. a container
3. to place in a container
4. prison (if preceded by the?)
5. toilet (if preceded by the?)
6. to be ready for release (in the can)
7. to be released from employment (mostly passive: was/got canned?)
8. Canada (e.g., in Canwest)
and various other meanings I have never encountered. Context is sufficient to disambiguate these many uses.
None of those meanings are opposites. One might look up
1530 1mia and 2648 1mia
in Li Fanwen (2008) and think they are near-opposites ('river' and 'land'), but in fact the latter apparently only occurs in the disyllabic expression
2648 5399 1mia 1khiu 'underground'
and I suppose that is much more common than
1530 5399 1mia 1khiu 'under a river'
so there is little risk of ambiguity. (In Google, under a river has 8.74 million hits, which sounds like a lot, but underground has 335 million hits! And many references to under a river involve underwater construction that would have been unimaginable to the Tangut nearly a thousand years ago.)
188.8.131.52:21: 'ZEN': A REMNANT OF TANGUT EMPIRE CHINESE?
KJ Solonin's article made me think about the Tangut name for Zen
3504 1ʂɨã =
all of 2833 2diẽ 'calm, quiet' (probably 'not' + top and bottom right of 'to move') +
left of 5593 1bɤo' 'to look, watch, observe'
as well as the Tangut names of Kumārajīva (part 1 / part 2). 1ʂɨã is a borrowing from Tangut period northwestern Chinese 禪 *ʂɨã which in turn is from Late Old Chinese (LOC) *dʑian, a Sinified form of Pali jhāna- (< Sanskrit dhyāna 'meditation'). (Japanese Zen is from Middle Chinese 禪 *dʑien.) Coblin (1994: 323) reconstructed 禪 as *śan ~ *źan in the 9th and 10th centuries AD on the basis of these Tibetan transcriptions:
大乘中宗見解: shan, zhan
南天竺國菩提達摩禪師觀門: zhan, Hzhan
LOC *dʑ developed differently in premodern northwestern Chinese and in Mandarin in 'level' tone syllables:
|Premodern northwestern Chinese||*ź > *ś > *ʂ|
|Mandarin||ch [tʂʰ]||sh [ʂ]|
I don't understand the phonetic motivation for the split. Why were 'nonlevel' tones incompatible with a voiced affricate? (Voiceless affricates were possible before 'nonlevel' tones.)
Although modern northwestern Chinese generally has Mandarin-style reflexes of *dʑ, 禪 'Zen' still has a fricative initial in some varieties (Coblin 1994: 323):
Early 20th century Xi'an (as recorded by Karlgren): ʂæ̃ (tone unknown)
I thought these fricatives might be substratum retentions. I had either forgotten or overlooked this passage earlier in Coblin (1994: 101):
Occasional exceptions are found [to the Mandarin pattern of reflexes of *dʑ ...], e.g. 禪  (QYS źi̯än) "Zen Buddhism": [mid-Tang Chang'an] *dźan > *źan; CSZ [colloquial Suzhou] *śan (~ *źan?); XN [Xining]: ʂã⁴⁴; DH [Dunhuang]: ʂæ̃²⁴. These exceptional modern reflexes appear to derive directly from forms like those found in CSZ.
I looked for those "occasional exceptions" and found
蟬 LOC *dʑian 'cicada' is ʂæ̃²⁴ as well as tʂʰæ̃²⁴ (cf. standard Mandarin chan) in Xiaoxuetang's Xi'an data
辰 LOC *dʑin 'fifth Earthly Branch' is ʂɛ̃ (tone unknown) in Karlgren's Xi'an data (Coblin 1994: 361) and ʂẽ²⁴ as well as tʂʰẽ²⁴ (cf. standard Mandarin chen) in Xiaoxuetang's Xi'an data
This last graph has two Sino-Korean readings, chin (without aspiration!) and shin. The first reading may be an old borrowing from Early Middle Chinese *dʑin; the second is from Late Middle Chinese *ɕin.
The multiple Sino-Korean readings of 什 (in 鳩摩羅什 'Kumārajīva') may also be from different strata of borrowing: 집 chip from Early Middle Chinese *dʑip and 십 ship from Late Middle Chinese *ɕip. (집 chip becomes -jip with secondary voicing after a sonorant. That voicing is due to a Korean phonological rule and does not preserve the voicing of Early Middle Chinese *dʑip.)
A third Sino-Korean reading 습 sŭp is difficult to explain; it may be from a different Late Middle Chinese dialect in which *-ip became *-ɨp rather than vice versa.
The Xining reading of 禪 'Zen' also has an irregular 'yin level' tone (which would normally reflect an earlier *voiceless initial) instead of the expected 'yang level' tone (reflecting an earlier *voiced initial). I don't think the tone of 禪 'Zen' indicates that it had a voiceless initial in pre-Xining. I hypothesize that the original dialect of the region had a 'yang level' tone that sounded like the 'yin level' tone of the Mandarin dialect that displaced it.
If I am correct, then a study of irregular tones in Xining may reveal something about the substratal tone system. Unfortunately, it may not reveal the exact values of the tones at the time of borrowing because all tones - substratal and superstratal - may have changed since then. So I don't know if 44 was the 'yang level' tone contour in the substratum dialect.
It would be interesting if other modern northwestern dialects also have a seemingly 'yin level' tone for 禪 'Zen'.
Dunhuang only has one 'level' tone which may be a merger of earlier 'yin level' and 'yang level' tones.
I don't know the modern Xi'an reading of 禪 'Zen', but I do know that both the substratal fricative-initial and superstratal affricate-initial readings of 蟬 'cicada' and 辰 'fifth Earthly Branch' have 'yang level' tones in modern Xi'an. Were the tones of the substratal readings shifted to match the superstratal tones?
One last question: Why would northwestern Chinese retain an old word for 'Zen'? The answer probably has something to do with the religious history of the region.
I am reminded of how Japanese Buddhist terminology consists of Early Middle Chinese-based borrowings (呉音 Go-on) that were not displaced by Late Middle Chinese borrowings (漢音 Kan-on) during the Tang Dynasty: e.g., 禪 Zen was not replaced by a newer borrowing *Sen. (One might think that Zen Buddhism was practiced in Japan before the Tang Dynasty, but in fact it took root in the 12th century when 1ʂɨã 'Zen' was practiced in the Tangut Empire. An old reading Zen was used for a new school because of the strong association between Go-on and Buddhism in Japan.)
On the other hand, Korean Buddhist terminology generally consists of Late Middle Chinese borrowings: e.g., 禪/선 Sŏn 'Zen' probably replaced an earlier borrowing that would have become modern 전 *Chŏn. A rare exception is the 什 -jip in 鳩摩羅什/구마라집 Kumarajip. But that is not the most common reading of 鳩摩羅什. Here are Google frequencies for the three readings of the name:
구마라십 Kumaraship: 215,000
구마라집 Kumarajip: 21,900
구마라습 Kumarasŭp: 19,300
The newer reading 십 ship outnumbers the older reading 집 jip by nearly ten to one.
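The ratio can be reproduced with a few lines of Python. The hit counts are the ones quoted above; the variable names are mine, and search-engine counts are of course only rough estimates.

```python
# Google hit counts as quoted above (approximate by nature).
hits = {"ship": 215_000, "jip": 21_900, "sup": 19_300}

ratio = hits["ship"] / hits["jip"]
share = hits["ship"] / sum(hits.values())

print(f"ship : jip = {ratio:.1f} : 1")
print(f"ship share of all three readings = {share:.0%}")
```

The first figure is about 9.8, i.e. just under ten to one.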
The older voiced affricate reading of 禪 'Zen' has left no trace in Sino-Vietnamese. The only Sino-Vietnamese reading of 禪 is Thiền from southern Late Middle Chinese *ʑien; there is no *Chiền from southern Early Middle Chinese *dʑien.
184.108.40.206:54: THE TANGUT NAMES OF KUMĀRAJĪVA (PART 2)
The third Tangut name of Kumārajīva shares no characters with the other two:
1429 4575 4710 4867 1kiew 2mo 1lo 1ʂɨəʳ
It is obviously based on Tangut period northwestern Chinese 鳩摩羅什 *kɨwmbɔlɔʂɨi from a 4th century *kumaladʑip.
As I mentioned yesterday, 1429 is also the transcription character for 鳩 in the Tangut translation of the Forest of Categories (Gong 2002: 438).
4575 and 4710 are also transcription characters for Sanskrit mo and lo (Arakawa 1997: 111).
4867 was also used to transcribe other Chinese characters pronounced *ʂɨi (十實失室) and 涉 *ʂɨa (Li 2008: 770). The retroflexion in Tangut may have reflected subphonemic vowel retroflexion in Chinese after retroflex fricatives: /ʂi/ = [ʂɨiʳ] and /ʂia/ = [ʂɨaʳ].
In theory the name could have been borrowed in a more Sanskrit-like form as *kʊ ma raʳ dzi va via Tibetan kumaradziba [kumaradziwa] or directly from the variety of Sanskrit known to the Tangut which had [dz] for j. (My Tangut reconstruction has no rhyme -u. Retroflexion was almost always obligatory after r- in Tangut.)
I was curious to see how Kumārajīva was rendered in other languages. Judging from Wikipedia entry titles:
Czech Kumáradžíva preserves the long vowels.
Polish Kumaradżiwa [kumaradʐiva] has retroflex dż for Sanskrit palatal j [dʑ]. I would have expected *Kumaradziwa [kumaradʑiva] with palatal dz (pronounced like dź [dʑ] before i). The combination of retroflex dż and palatal i is unusual in Polish. I wonder if that i is pronounced [ɨ] as in the normal Polish combination ży [ʐɨ].
Ukrainian Кумараджива [kumaradʐɪva] has [ɪ] instead of [i]. I presume the spelling was taken from Russian Кумараджива [kumaradʐɨva].
Korean 쿠마라지바 [kʰumaradʑiba] has an un-Sanskrit (and English-influenced?) initial aspirate. I presume it is a modern term. Older names are 鳩摩羅什 Kumarasŭp/Kumaraship/Kumarajip (the last character is read three different ways) and 羅什 Nasŭp (with initial r- becoming n- before a-).
THE TANGUT NAMES OF KUMĀRAJĪVA (PART 1)
Having just written about Chinese transcriptions of Indic, I thought it was neat that I then stumbled upon KJ Solonin's tentative identification of
2152 3284 1ʂɨi 2lɨa
as a Tangut transcription of the name of Kumārajīva (1998: 411, 414 #80), translator of the Lotus Sutra and other Buddhist texts into Chinese. Kumārajīva's Chinese name was 鳩摩羅什, pronounced *kumaladʑip in the 4th century AD. In the Tangut period northwestern dialect of Chinese, it would have been read as *kɨwmbɔlɔʂɨi. If the two names are connected, the Tangut name might be an accidental inversion of
*3284 2152 2lɨa 1ʂɨi
corresponding to 羅什 *lɔʂɨi, an abbreviation of 鳩摩羅什 *kɨwmbɔlɔʂɨi.
(This abbreviation was obviously created by a Chinese speaker, as a
natural break in the Sanskrit would be between Kumāra 'boy, prince'
and jīva 'life'.)
Unfortunately, the name 2lɨa 1ʂɨi only appears once in the
text that Solonin translated. However, a transcription of the full name
鳩摩羅什 *kɨwmbɔlɔʂɨi does appear in the Hongchuan preface of
the Lotus Sutra (Li 2008: 533; see Nishida 2004
on the Tangut Lotus Sutra):
3948 3369 3284 2152 3284 (again!) 1kɨa' 1mia 2lɨa 1ʂɨi 2lɨa
There are several things that are odd about this spelling.
First, 3948 1kɨa' is a poor match for Chinese 鳩 *kɨw. It is a transcription character for Sanskrit ka and kya (Arakawa 1997: 110, 116; Kychanov and Arakawa 2006: 692). In the Tangut translation of the Forest of Categories, 鳩 *kɨw was transcribed as
1429 1kiew
which is a much better match (Gong 2002: 438). 1429 is also a
transcription character for Sanskrit (?) kyu (Grinstead 1972:
111) and is the first character in a different transcription I'll
Second, 3369 1mia (rhyme 20) has an -i- that
corresponds to zero in Chinese 摩 *mbɔ and Sanskrit and Tibetan ma
(Arakawa 1997: 110, Kychanov and Arakawa 2006: 234).
Maybe I should follow Sofronov and Arakawa and stop reconstructing -i- in rhyme 20.
Third, 3284 2lɨa (rhyme 19) has an -ɨ- that corresponds to zero in Chinese 羅 *lɔ and Sanskrit la (Arakawa 1997: 110).
I have yet to see a fully satisfactory solution to the problem of
reconstructed Tangut medials seemingly reflecting nothing in
transcriptions of Chinese and Sanskrit.
Fourth, 3284 appears again, corresponding to zero in the
four-syllable Chinese name. The first four syllables of this longer
Tangut name are obviously based on Chinese (hence 2lɨa for 羅 *lɔ
rather than *raʳ for Sanskrit ra). I would have
expected a fifth syllable to be
a transcription of Chinese 婆 *phɔ < *ba for Sanskrit va in longer Chinese names for Kumārajīva:
鳩摩羅什婆 *kɨwmbɔlɔʂɨiphɔ < *kumaladʑipba
鳩摩羅時婆 *kɨwmbɔlɔʂɨiphɔ < *kumaladʑɨba
鳩摩羅耆婆 *kɨwmbɔlɔtʂɨiphɔ < *kumalatɕiba
Having not seen the text where Li found this longer transcription, I don't know if this second 3284 is a typo (I doubt that, as even the Chinese translation has a doubled syllable: 鳩摩羅什羅) or in the original. Kychanov and Arakawa (2006: 692) do not list any words beginning with 3948. Maybe this longer name is a confused blend of *1kɨa' 1mia 2lɨa 1ʂɨi and the short inverted name 1ʂɨi 2lɨa.
At least 2152 1ʂɨi is a perfect match for Chinese 什 *ʂɨi, and is attested as a transcription of the last syllable of the name 李七什 *lɨi tshi ʂɨi (Li 2008: 356).
Next: Another Tangut name for Kumārajīva.
220.127.116.11:51: TESTING STAROSTIN'S 'LATE-RAL' SCENARIO
(I rhyme lateral [ˈlætəɹo] and scenario [səˈnæɹio]. 'Late-ral' is [ˈlejtəɹo] with a linking schwa to preserve the resemblance to [ˈlætəɹo].)
One of the biggest sound changes in Chinese was the loss of laterals:
Old Chinese *l- in type A syllables > Middle Chinese *d-
Old Chinese *hl- in type A syllables > Middle Chinese *th-
Old Chinese *l- in type B syllables > Middle Chinese *j-
Old Chinese *hl- in type B syllables > Middle Chinese *ɕ-
(The nature of the Old Chinese type A/B distinction is disputed, but the Middle Chinese initials are uncontroversial.)
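The four correspondences above amount to a lookup keyed on the Old Chinese initial and the syllable type. Here is a minimal Python sketch; the table and function names are mine, and the initials are written in the same notation as the rules above.

```python
# (Old Chinese initial, syllable type) -> Middle Chinese initial, per the four rules above.
LATERAL_SHIFT = {
    ("l", "A"): "d",
    ("hl", "A"): "th",
    ("l", "B"): "j",
    ("hl", "B"): "ɕ",
}


def mc_initial(oc_initial: str, syllable_type: str) -> str:
    """Return the Middle Chinese reflex; non-lateral initials pass through unchanged."""
    return LATERAL_SHIFT.get((oc_initial, syllable_type), oc_initial)


print(mc_initial("hl", "A"))  # e.g. 通 *hloŋ > MC *thoŋ
print(mc_initial("hl", "B"))  # e.g. 聖 *hlieŋh > MC *ɕieŋʰ
```

The sketch covers only the initials; it says nothing about what the type A/B distinction was phonetically.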
In my last entry, I mentioned two conflicting chronologies for the lateral shift in Chinese. Schuessler (2009) reconstructed Middle Chinese-like initials (*j-, *ɕ-, *d-, *th-) in his Later Han Chinese (i.e., Eastern Han / Late Old Chinese), whereas Starostin mostly reconstructed transitional fricatives or laterals for that period:
|Old Chinese syllable type||Early Old Chinese||Late Old Chinese||Middle Chinese|
|A||*l- (Starostin: *l- and dɮ-)||*l-||*d-|
|*hl- (Starostin: *tɬ-)||*hl-||*th-|
|A and B||*r-||*l-|
|*hl- (Starostin: *tɬ-)||*ɕ-|
(I use the same notation regardless of scholar for ease of comparison. I list Starostin's reflexes of his Early Old Chinese *tɬ- and *dɮ- because they correspond to *hl- and *l- in others' reconstructions. Starostin's EOC *hl- behaved differently from others' *hl-; it became Late Old Chinese and Middle Chinese *h- [= others' *x-]. For arguments against Starostin's lateral affricates, see Sagart 1999. I have included EOC *r- for comparison.)
To test Starostin and Schuessler's reconstructions of Late Old Chinese (LOC), let's look at Eastern Han transcriptions of Indic from Coblin (1983).
If Starostin is right:
- LOC *l- should transcribe Indic l
- LOC *hl- shouldn't be used in transcription because there was no Indic voiceless hl
- LOC *r- should transcribe Indic r
If Schuessler is right:
- LOC *d- from EOC *l- could transcribe Indic d
- LOC *th- from EOC *hl- could transcribe Indic th
- LOC *l- from EOC *r- should transcribe both Indic l and r (since LOC no longer had *r-)
Both would agree that LOC *ɕ- should transcribe Sanskrit ś [ɕ].
As I already noted last time, the correspondence of Starostin's *ʑ- / Schuessler's *j- to Indic y- [j] is ambiguous since Starostin would have said that *ʑ- was the closest available initial due to the absence of *j- in his LOC. Correspondences between this LOC initial and Sanskrit c-, j- [ɟ], ś- [ɕ], and s- suggest that it was "a fricative or affricate of some sort" (Coblin 1983: 63): e.g., Starostin's *ʑ-.
In the transcriptions of 安世高 An Shigao (mid-2nd c. AD) we find that:
- Indic d and even intervocalic -t- were transcribed with Starostin's LOC *l- / Schuessler's *d- (18, 19; the numbers are from Coblin 1983)
- Indic l was transcribed with Starostin's LOC *r- / Schuessler's *l- (13, 15, 28)
These patterns are not quirks of An Shigao; they can also be found in the transcriptions of 支婁迦讖 Zhi Loujiachen/Lokakṣema (mid-2nd c. AD; his name has 婁 Starostin's LOC *r- / Schuessler's *l- for Sanskrit l-) and 康孟詳 Kang Mengxiang (late 2nd-early 3rd c. AD). All three men were non-Chinese who settled in Luoyang, so their transcriptions probably represent the same dialect.
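The comparison of predictions and transcriptions can be made mechanical. Here is a minimal Python sketch under a toy encoding: the class labels, the dictionary names, and the reduction of each transcription pattern to a single Indic consonant are illustrative assumptions, not anything taken from Coblin (1983). It counts how many of the two patterns listed above each hypothesis predicts.

```python
# Each hypothesis maps a word class (labelled by its Early Old Chinese initial)
# to the Late Old Chinese initial that the hypothesis predicts for it.
# Assumption: initials are reduced to bare consonant letters for comparison.
HYPOTHESES = {
    "Starostin": {"EOC *l- (type A)": "l", "EOC *r-": "r"},
    "Schuessler": {"EOC *l- (type A)": "d", "EOC *r-": "l"},
}

# Patterns summarized from the text: the Indic consonant that each class
# was used to transcribe (item numbers are those cited from Coblin 1983).
OBSERVATIONS = [
    ("EOC *l- (type A)", "d"),  # items 18, 19
    ("EOC *r-", "l"),           # items 13, 15, 28
]


def score(predicted):
    """Count the observations whose Indic consonant equals the predicted initial."""
    return sum(1 for word_class, indic in OBSERVATIONS if predicted[word_class] == indic)


scores = {name: score(predicted) for name, predicted in HYPOTHESES.items()}
print(scores)
```

Under this encoding Schuessler's values match both patterns and Starostin's match neither. The sketch ignores the intervocalic -t- cases and treats "closest available initial" substitutions as mismatches, so it is only a rough tally.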
The only Indic th in An Shigao's transcriptions was transcribed with 替 whose EOC initial is ambiguous. It definitely had *th- in Middle Chinese and must have had *th- here. Starostin might have taken that as evidence for reconstructing 替 with *th- in EOC.
th is a low-frequency consonant, so it's not surprising that there are no instances of it transcribed with original or secondary *th-. (Oddly Lokakṣema transcribed it as the coda-onset sequence -t s- in 55.)
I conclude that the following chain shift had occurred in the Luoyang dialect of LOC by the mid-2nd century AD:
*r- > *l- (type A) > *d-
This is contrary to Starostin's 'late-ral' scenario in which the laterals hardened later.
I also reconstruct a parallel change
*hl- (type A) > *th-
on the grounds that it would be odd if *hl- lagged behind its voiced counterpart *l-. Unfortunately there is no Indic transcription evidence for that.
Phonetic glosses such as
'聖 *hlieŋh (type B; > MC *ɕieŋʰ) is read like 通 *hloŋ (type A; > MC *thoŋ)' (Xu Shen 1063, b. in 召陵 Zhaoling 200 km SE of Luoyang, fl. c. 100 AD)
'天 *hlein (type A; > MC *then) read as 身 *hlin (type B; > MC *ɕin)' (Gao You 243, b. in 涿 Zhuo, fl. c. 200 AD)
indicate that *hl- did not harden in other LOC dialects during the early centuries of the first millennium AD. The glosses would not make sense if *hl- had already become *th-.
10.17.23:17: Some LOC glosses that seem bizarre might make more sense if we don't try to shove the words into the standard paradigm defined by the Chinese lexicographical tradition. For instance, perhaps Xu Shen pronounced 通 as something like *hliøŋ with a front diphthong similar to 聖 *hlieŋh. The expected Old Chinese reconstruction 通 *hloŋ is mechanically derived from Middle Chinese *thoŋ, whereas my hypothetical *hliøŋ would have vowel warping conditioned by a presyllable in an Old Chinese variant *Cɯ-hloŋ or *Hɯ-loŋ. Perhaps *Hɯ-loŋ was the earliest form which developed along two paths:
Early fusion: i.e., before *ɯ conditioned vowel warping
*Hɯ-loŋ > *hloŋ > *thoŋ (Middle Chinese prestige form recorded in dictionaries)
Late fusion: i.e., after *ɯ conditioned vowel warping
*Hɯ-loŋ > *Hɯ-luoŋ > *hluoŋ > *hlioŋ > *hliøŋ (> Middle Chinese *ɕyøŋ?; nonprestige and extinct?)
For more examples of variation between fused and unfused presyllables, see the discussion of Phan Rang Cham (Austronesian) and Ruc and Nha Heun (Austroasiatic) in Sagart (1999: 15-17).
18.104.22.168:12: TILTED TONGUE
On Monday, I wrote,
I wrote the pre-Tangut source of ld- as *L-. External evidence may help us identify what *L- was.
Last night I mentioned
3190 1ldwia 'tongue' = (4226 1ldwị + 0537 1pia) + 1223 2phɤo' (Mixed Categories of the Tangraphic Sea 11.122)
as one of the syllables with a fanqie including the mysterious additional character 1223.
1ldwia is probably related to the many l-words for 'tongue' in Sino-Tibetan: e.g.,
Old Chinese 舌 *mɯ-lat or *m(ɯ)-ljat (Baxter and Sagart 2014: *mə.lat)
also cf. 舐/舓/咶 *mɯ-leʔ or *m(ɯ)-ljeʔ (B&S 2014: *Cə.leʔ) 'to lick'
and perhaps 舔 *hlˁimʔ < *qlimʔ or *Hʌ-limʔ (? - I can't find any attestations before the 13th century AD; nonetheless it resembles lem-words for 'tongue' elsewhere in Sino-Tibetan and may be very old) 'to lick'
It is not possible to determine whether Middle Chinese *ʑ- in 'tongue' and 'lick' is from *mɯ-l- or *m(ɯ)-lj-. Coblin (1986) reconstructed medial *-i- for 'tongue' at the Proto-Sino-Tibetan level.
If the third word is related, and if the root was *√lj, then I can reconstruct
*m(ɯ)-lj-a-t (a-grade)
It is tempting to reconstruct *m(ɯ)-lj-a-j-ʔ (a-grade), but the phonetics 氏 and 易 point to *e.
*qli-m-ʔ or *Hʌ-li-m-ʔ (zero grade; the *j of the root became *i if no vowel followed)
Classical Tibetan ljags /ldʑags/ < *n-ljaks (Jacques, "The laterals in Tibetan")
CT j is an affricate /dʑ/, whereas pre-Tibetan *j is a glide.
Although it would be nice if Tibetan had *m- like Chinese, *m-lj- would have developed into mj- /mdʑ/, not lj- /ldʑ/ (Jacques, "The laterals in Tibetan").
Written Burmese hlyā
I cannot explain the variation in final consonants (Old Chinese *-t and *-[m?]ʔ, pre-Tibetan *-ks, Written Burmese zero). I presume they are all suffixes.
The pre-Tangut source of 1ldwia must be a combination of the following elements:
ld- may be from a consonant prefix plus root *l-
-w- is from a labial prefix *P- (and that prefix might have combined with *l- to form ld-)
-i- is from *-j- and/or a presyllabic *-ɯ-
a final stop could have been lost without a trace
the tone indicates there was no final *-H
If the root was *√lj, that narrows down the possibilities.
The simplest reconstruction would be *m-lja whose *m- would combine with *l- to form ld- and condition the medial glide -w-.
A more complex reconstruction *P-N-lja would have separate sources of -d- and -w-.
Forms for 'tongue' in Horpa varieties seem to be from *P-lj-: fʑa, vɮɛ, etc. See STEDT and the rGyalrongic Languages Database (item #36).
According to Guillaume Jacques ("The laterals in Tibetan"), Li Fang-Kuei, Coblin, and Gong all reconstructed *n-l- as the source of Written Tibetan ld- (whereas Jacques reconstructed *d-l- since his *n-l- became WT Hd- /nd/). Perhaps *N-l- similarly became ld- in Tangut. *N- may have been an *n- as in pre-Tibetan *n-ljaks 'tongue' or an *m- as in Chinese *m(ɯ)-ljat.
The only other word out of the eight I discussed yesterday that might have a cognate - with emphasis on might - is
0841 1ɬwiẹ 'oblique, slanting, inclined' = (2814 2ɬị + 3439 1piẹ) + 1223 2phɤo' (Mixed Categories of the Tangraphic Sea 12.122)
Before I go on to a possible cognate, I realize what 1223 is doing here and in various other cases. I think 1223 in such contexts means 'combine the initial of one syllable with a labial-initial syllable to form a syllable with medial -w-': e.g.,
1ɬiẹ + 1piẹ = 1ɬpiẹ > 1ɬwiẹ
Could this suggest that -w- was [v] or [β] and that Tangut labials lenited in coherent speech (as opposed to words pronounced in isolation): i.e., 1ɬiẹ 1piẹ was pronounced [ɬiẹ viẹ] or [ɬiẹ βiẹ]?
Another possibility is that labials were followed by a subphonemic glide [w]: e.g., 1piẹ /piẹ/ was [pwiẹ] and
1ɬiẹ + [1pwiẹ] = 1ɬwiẹ
There was no contrast between /P/ and /Pw/ in Tangut.
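The proposed function of 1223 can be stated as a small procedure. The following Python sketch is only an illustration of that hypothesis: the `Syllable` class, the function name `combine_with_1223`, and the short list of labial initials are assumptions made for the example, and the syllables are segmented by hand. The tone and rhyme come from the final speller, as in ordinary fanqie.

```python
from dataclasses import dataclass

# Assumption: a simplified set of labial initials, for this sketch only.
LABIAL_INITIALS = {"p", "ph", "b", "m"}


@dataclass(frozen=True)
class Syllable:
    tone: str     # "1" (level) or "2" (rising)
    initial: str  # e.g. "k", "p", "b"
    rhyme: str    # everything after the initial, medials included

    def text(self) -> str:
        return f"{self.tone}{self.initial}{self.rhyme}"


def combine_with_1223(initial_speller: Syllable, final_speller: Syllable) -> Syllable:
    """Hypothetical reading of a fanqie followed by 1223.

    Take the initial of the first speller, replace the labial initial of the
    second speller with medial -w-, and keep the second speller's tone and rhyme.
    """
    if final_speller.initial not in LABIAL_INITIALS:
        raise ValueError("final speller does not begin with a labial")
    return Syllable(
        tone=final_speller.tone,
        initial=initial_speller.initial,
        rhyme="w" + final_speller.rhyme,
    )


# 1029 = (2503 1kʊ̣ + 5528 1baʳ) + 1223
result = combine_with_1223(Syllable("1", "k", "ʊ̣"), Syllable("1", "b", "aʳ"))
print(result.text())
```

A final speller without a labial initial makes the function raise `ValueError`, which is one way of flagging fanqie that this hypothesis cannot handle.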
That does not explain the highly anomalous fanqie for 2417 (which does not have a labial-initial final speller; moreover, its final speller has a different rhyme with the wrong tone!):
2417 1ʂwɨọ 'to need, want' = (0245 2ʂwɨi + 1449 2tʂhwɨoʳ̣̣) + 1223 2phɤo' (Tangraphic Sea 55.222)
Moreover, 1223 is redundant in cases like the one above and
5679 1khwɤa 'remnants' = (2554 1khwɤe + 4314 1bɤa) + 1223 2phɤo' (Tangraphic Sea 26.211)
in which the initial speller has -w-. Perhaps this use of 1223 originated in fanqie for words like 0841 and was overextended.
Back to cognates: 0841 1ɬwiẹ could go back to *S-P-KE-la:
*S- conditioned the tense vowel
*P- conditioned -w-
*K- fused with *l- to form ɬ-
*-E- conditioned the raising and breaking of *a to ie
The root *la would be shared with Old Chinese 邪 'awry' *sla (spelled 斜 from the 2nd century BC onwards for 'slanted'). But it is not clear if 邪 had an *l-root.
First, other *l-less reconstructions of 邪 are possible: e.g.,
*sja (Schuessler 2009 and this site)
*sə.ɢA (B&S 2014, which reconstructs the left side 牙 of 邪 as *m-ɢˤ<r>a; Schuessler 2009 reconstructs *ŋrâ and I reconstruct *ŋra)
Second, the lateral phonetic 余 *la of the later spelling 斜 is not strong evidence for an *l-root if
- Baxter and Sagart's *sə.ɢa is correct
- *l- had shifted to *ʑ- by the 2nd century BC
- *sə.l-, *s-l-, *s-ɢ-, and *sə.ɢ- had merged into something like *sj- or *zj- (i.e., a *ʑ-like cluster) by the 2nd century BC
However, Starostin reconstructed a different chronology in which laterals remained lateral as late as the 2nd century BC (i.e., during the Western Han):
邪 *lhia > 邪/斜 Western Han *lhia > Eastern Han *zhia
余 *dɮa > Western Han *la > Eastern Han *ʑa
Eastern Han transcriptions of Sanskrit y- are ambiguous. Starostin might have said that Chinese *ʑ- was used for Sanskrit y- because there was no *j-. On the other hand, Schuessler would say that Chinese *j- was used for Sanskrit y-.
22.214.171.124:20: BIRD WORDS
At the end of my last entry, I asked what 1223 was doing in this Tangut fanqie:
1363 1swia 'time' = (5323 1swi + 0537 1pia) + 1223 2phɤo' (Tangraphic Sea 29.132)
The analysis of 1223 2phɤo' 'gentle, harmonious, together, pair' is unknown, but it looks like 'bird' + 'word':
It is in eight fanqie in the first and third surviving volumes of the Tangraphic Sea. It might have been in the lost second volume as well.
|Volume/Page/position||Li Fanwen number||Initial class||Rhyme||Reading (Nishida-style, Arakawa 1997)||Reading (this site)||Fanqie initial speller||Fanqie final speller||Gloss|
|1.26.211||5679||V||1.18||1khamba||1khwɤa||2554 1khwɤe||4314 1bɤa||remnants (only in dictionaries?)|
|1.29.132||1363||VI||1.20||1špwaɦ||1swia||5323 1swi||0537 1pia||time, transcription character for Chinese 宣 *swiã, 修 *siu|
|1.55.222||2417||VII||1.48||1štšhor||1ʂwɨo||0245 2ʂwɨi||1449 2tʂhwɨoʳ||to need, want|
|1.84.253||1029||V||1.80||1kwɑr||1kwaʳ||2503 1kʊ̣||5528 1baʳ||to cry, weep, sob|
|3.11.111||0732||IX||1.64||1hlwạ||1ɬwiạ||1770 1ɬwi||5370 1piạ||ash, dust|
|3.11.122||3190||IX||1.20||1ɬwaɦ||1ldwia||4226 1ldwị||0537 1pia||tongue|
|3.12.111||2238||IX||1.67||1hlwị||1ɬwị||0239 1ɬiə||5212 1pị||the surname Lhwi|
|3.12.122||0841||IX||1.61||1lwɛ̣||1ɬwiẹ||2814 2ɬị||3439 1piẹ||oblique, slanting, inclined|
What is the function of 1223? It can be translated into Chinese as 合 'together', the word used in Middle Chinese transcriptions of Sanskrit to indicate that two syllables were to be read as one: e.g.,
娑婆二合 *sa ba TWO TOGETHER for Sanskrit sva
One might expect 1223 to appear in fanqie for Sanskrit transcription characters, but it doesn't; in fact, one of the fanqie is for the basic word 3190 1ldwia 'tongue'. Why wasn't its fanqie simply
4226 1ldwị + 0537 1pia
without 1223? Fanqie are by definition combinations of initials and finals; wouldn't 1223 be redundant?
In any case, 1223 is not a carryover from the Chinese lexicographical tradition, since 合 does not appear in Chinese fanqie.
1223 is interpreted in at least three ways in Arakawa's Nishida-style reconstruction:
1. Read as a sequence of two syllables:
(1kĭɛ2 + 1mba) TOGETHER = 1khamba
This is the only disyllabic reading in Arakawa's Nishida-style reconstruction. Why isn't the combination 1kĭɛ2mba or 1kamba (if the second rhyme is copied in the first syllable)?
2. Read as a combination of the initials of the two syllables and the rhyme of the second syllable:
(1swiɦ + 1paɦ) TOGETHER = 1špwaɦ
(2ši + 2tšhɔr) TOGETHER = 1štšhor (not 2štšhɔr!)
3. Redundant in the other five instances which might as well be normal fanqie
The first two interpretations are highly unlikely. I don't know of any transcriptions of 5679. And I doubt Chinese 宣 *swiã and 修 *siu would have been transcribed with a very un-Chinese cluster špw-.
So that leaves the third interpretation which is also unsatisfying. What, if anything, does 1223 indicate that differentiates these eight syllables from all others in the Tangraphic Sea? I can't help but fear that the instances of 1223 in the lost second volume might not shed light on this mystery.
A PHONETIC KEY TO TANGRAPHIC SEA RHYME 1.20
Nearly fifty years have passed since the Russian translation of the Tangraphic Sea, and the Chinese translation of that dictionary turned thirty last year. An English translation would be nice but perhaps also redundant since Tangutologists should be able to read Russian and/or Chinese. Of course, English would be nice for many non-Tangutologists. What I would like to see (and make) is a Tangraphic Sea with reconstructed character readings. Since I have been writing about rhyme 20 syllables lately, here are the readings for the
rhyme 20 1sia 'to do (only in dictionaries?); transcription character for Chinese *sa, *sã and Sanskrit sa, sā'
entries in the first (level) tone* volume of the Tangraphic Sea. You can see the characters in Andrew West's online Tangraphic Sea. I have added the initial classes from Homophones. Groups are divided by circles in the original text.
|Page/position||Initial class||Group||Reading||Fanqie||Number of tangraphs|
|29.132||VI||1swia||(5323 sw- + 0537) 1223||1|
The absence of classes II (v-) and VII (retroflex shibilants) is a trait of Grade IV rhymes. Class IV (ɲ-?) is rare.
Some groups divided by circles correlate with homophone groups (e.g., 1-4), but others don't: e.g., the fifth group is a mixture of class III and V syllables.
Fanqie initial speller 3031 is ambiguous (see "When Rhyme 21 Is Really Rhyme 20" and "When 1825 Is Really 1829"). I would not expect 3031 to represent dz- here, since dz-tangraphs were placed in the Mixed Categories volume of the Tangraphic Sea.
I see now that I mixed up the fanqie of 1829 and 1825 (as well as those characters themselves) last week. Great. For the record, the correct fanqie are
1829 'to heat up, burn' 1tshia = 3278 1tshi + 1693 1sia (Tangraphic Sea 28.271)
1825 1tshwia 'to roast, warm up' and 5041 1tshwia 'stove, furnace' =
0311 1tshwiə + 1289 1lwia (Tangraphic Sea 29.141-29.142)
1825 is from 1829 with a prefix *P- in addition to the *Kɯ- that conditioned aspiration and vowel breaking:
*Kɯ-tsa > 1829 tshia
*P-Kɯ-tsa > 1825 tshwia
(The bare root is in Tibetan tsha 'hot' whose initial aspiration is secondary. More cognates here.)
5041 is presumably an extended use of 1825 (i.e., 'where food is warmed up', 'device for heating').
In theory one might expect only one fanqie final speller for all rhyme 1.20 syllables or two (one for -ia and another for -wia), but in fact there are ten! That does not mean there were ten subtypes of rhyme 1.20 syllables. Nearly all of those ten can be linked in a complex fanqie tree:
Members of that tree are in pink in the first table. (I have colored 0537 somewhat differently since it is followed by 1223. I will write about 1223 in my next entry.)
I placed 1693 at the root since its fanqie final speller is ... itself! 1693 is the final speller of 3179 and 0618, 3179 is the final speller of 4620 which is the final speller of 3583 and 2019, etc.
The final spellers 1289 and 1825 for -wia form a closed circle. 1289 is the final speller of its final speller 1825 (see above for the fanqie of 1825).
1289 1lwia 'lower limbs, legs' = 2302 1lɨə + 1825 1tshwia (Tangraphic Sea 29.143)
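The links in the tree can be followed mechanically. This Python sketch is a hedged illustration: `FINAL_SPELLER` contains only the links spelled out in this entry (not all ten final spellers), and the function name `follow_chain` is my own. It walks from a tangraph to its final speller until a tangraph repeats, which identifies either a self-spelled root or a closed circle.

```python
# Tangraph (Li Fanwen number) -> the final speller in its fanqie.
# Assumption: only the links explicitly mentioned in the text are included.
FINAL_SPELLER = {
    "1693": "1693",  # spelled with itself, hence the root of the tree
    "3179": "1693",
    "0618": "1693",
    "4620": "3179",
    "3583": "4620",
    "2019": "4620",
    "1289": "1825",  # the two -wia spellers spell each other
    "1825": "1289",
}


def follow_chain(start):
    """Return (path, cycle): the spellers visited and the loop the path ends in."""
    path = [start]
    while True:
        next_speller = FINAL_SPELLER[path[-1]]
        if next_speller in path:
            # The chain has closed; the loop begins where next_speller first appeared.
            return path, path[path.index(next_speller):]
        path.append(next_speller)


print(follow_chain("3583"))
print(follow_chain("1289"))
```

A one-member loop corresponds to a self-spelled root such as 1693, and a two-member loop corresponds to the closed circle formed by 1289 and 1825.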
I don't know why 1363 swia wasn't spelled with either 1289 or 1825:
1363 1swia 'time' = 5323 1swi + 0537 1pia + 1223 2phɤo' (Tangraphic Sea 29.132)
Next: What is 1223 doing in that fanqie?
10.14.21:21: The numbers at the ends of
1ldia1 'to come' and 1ldia2 'to return, transport'
indicate that they were treated as nonhomophonous (heterophonous - why isn't that word used more in linguistics?) in the Tangraphic Sea (and in Homophones!) even though their fanqie seem to indicate they are homophones. Their final spellers belong to the same tree (see above), and the initial speller of 1ldia2 is derived from the initial speller of 1ldia1. See "Come Again?" for details.
"1.20" in the title of this post refers to tone one, rhyme 20.
*The Tangraphic Sea volume for the second [rising] tone has been lost. Rhyme 2.17 is the rising tone counterpart of rhyme 1.20. The rhyme numbers do not match since not all level tone rhymes have rising tone counterparts and vice versa: e.g., 1.6, 1.13, and 1.16 lacked rising tone versions. Arakawa (1997) lists rhyme 1.20 and 2.17 tangraphs side by side.
126.96.36.199:43: THE COMING CLAN
Yesterday I reconstructed a Tangut word for 'come' with ld-. Other words for 'come' have the same fanqie initial speller (0475), so they can also be reconstructed with ld-:
3456 1ldia < *Cɯ-La 'to come'
*C- might be the *S- conditioning vowel tension (indicated with a subscript dot) in the words below. *Sɯ- could have been lost after the vowel conditioned breaking (see below) but before *S- could condition tension.
Normally *ɯ conditions the breaking of *a to ɨa after *l-. Did *a break to ia after *L-?
4106 1ldɨə̣ < *S-Lə 'to come'
2373 1/2ldɨẹ < *Sɯ-La/ə-j(-H) 'to come'
The root vowel is ambiguous.
The Precious Rhymes of the Tangraphic Sea has two entries for this character, one in the level tone volume and the other in the rising tone volume. Although there are other characters with two readings, I don't know of any other case in which the two readings only differ in tone.
5727 1ldɨə̣ < *S-Lə 'to transport, come' (homophone of 4106; cf. how 3456 is nearly homophonous with 3502 'to transport', written as a mirror image of 5727 and derived from it:
I wrote the pre-Tangut source of ld- as *L-. External evidence may help us identify what *L- was. There are many Sino-Tibetan words for 'come' with l-; at least one (Mandarin 來 lai < *mʌ-rək) is not related to the others. Do the Tangut words belong to this clan of l-words? If so,
- do the other languages preserve a root-initial l- that gained a prefix in Tangut?
cf. how *d-l- became ld- in Tibetan (Jacques, "The Laterals in Tibetan")
- or does Tangut preserve a cluster reduced to l- in other languages?
- or are both Tangut ld- and non-Tangut l- from a third source in Proto-Sino-Tibetan?
188.8.131.52:15: COME AGAIN?
(23:09: The title refers to this idiom and to the fact that 3456 'come' is followed by 3502, another Tangut character containing it in Homophones.)
After two steps backward ... one step forward ... I hope.
In my last post, I mentioned
3456 1lia (Grade IV) 'to come' = 0475 1liu (Grade IV) + 3583 1tia (Grade IV)
which has no homophones: it is in the isolated liquid-initial section of Homophones (A edition, 55A54).
Right below it in Homophones (A edition, 55A55) is
3502 1lia (Grade IV) 'to return, transport' = 4464 1lɨə̣ (Grade III) + 2019 1thia (Grade IV)
which looks like 3456 'come' plus 'hand' and is derived from all of 'come' and the left side of 5727 1lɨə̣ 'transport, come' (also containing 'hand' and 'come' in reverse order) in Tangraphic Sea:
I have followed Gong who reconstructed 3456 and 3502 as homophones in spite of the fact that they are isolates. It would also be hard to distinguish them in context since both are motion verbs. But if they weren't homophones, what was the difference between them?
Could they have had different initials? Their initial spellers are of different grades (III and IV). So perhaps 3456 had Grade IV [l] whereas 4464 had Grade III velarized [ɫ]. If they had identical finals, I would have to posit a phonemic distinction between /l/ and velarized /ɫ/. Sofronov (1968 II: 308) reconstructed 3456 as 1la and 4464 as 1lda. But how could there be such a distinction if the two initial spellers were part of the same fanqie chain?
4464 1lɨə̣ (Grade III) = 0475 1liu (Grade IV) + 1493 siə̣ (Grade IV)
(There was no /ɨə̣/ : /iə̣/ distinction; the quality of the first vowel was dependent on the initial.)
Tai (2008: 201) reconstructed the initial of that chain as ld- since it was transcribed in Tibetan as ld- (11 times) and zl- (3 times), but never as a simple l- (Tai 2008: 198).
That initial was transcribed in late 12th century northwestern Chinese as *l- which is not necessarily evidence for reconstructing Tangut l-. Chinese *l- would have been the best available substitute for an un-Chinese *ld-. (There was no *d- in that Chinese dialect.) Hence there seem to have been two kinds of 1ldia.
I cannot reconstruct either 3456 or 3502 with -w- since the fanqie do not contain such a medial. The final spellers were transcribed in Tibetan without -w- (Tai 2008: 210):
3853: ta (37 times)
2019: tha (9 times)
3853 was also used to transcribe Sanskrit ṭa, ta, and tā without -v- (Sanskrit had no -w-).
The Chinese transcriptions 怛 *ta and 達 *tha for 3853 and 2019 lack *-w-.
None of the transcription evidence supports the -i- required by my Grade IV hypothesis or Gong's -j-. Sofronov's (2012) -a is much more likely for rhyme 20 which he regarded as Grade I, not IV. The l- from earlier in this post would be unusual before a Grade IV rhyme but normal before a Grade I rhyme. Sofronov (2012) sometimes reconstructed more than one value for a single Tangut rhyme, but rhyme 20 was not one of them. At this point I can only combine Tai's ld- with Sofronov's 1-a and be agnostic about the difference between the two 1lda-like syllables (3456 and 3502).
184.108.40.206:21: WHEN 1825 IS REALLY 1829
What's worse than having to publicly correct a mistake on a blog? Having to publicly correct that correction!
Andrew West pointed out that the correct fanqie for Tangut character 3371 (and its homophones 0596 and 1283) is
3371, 0596, 1283 1dzia = 3031 + 1829 (not 1825!)
I got the idée fixe that 1825 was the final speller and didn't notice 1829, which has the same left-hand radical 'fire' and a similar right-hand radical, in the fanqie of the handwritten copy of the Tangraphic Sea in Wenhai yanjiu (1983) or Arakawa's Seikago tsūin jiten (Tangut rhyme dictionary, 1997).
Notice that I have not supplied readings for 3031 or 1829.
I have already explained why 3031 is ambiguous, and I will add one more complication here:
- 3031 is the initial speller for 3371, 0596, and 1283 which are in the Mixed Categories of the Tangraphic Sea. For some reason, all dz-, dʐ-, and ɬ-syllables were placed in Mixed Categories along with a seemingly random smattering of other syllables. That suggests 3371, 0596, 1283 had dz-.
- On the other hand, 3031 is in the 'rising' tone volume of Precious Rhymes of the Tangraphic Sea instead of the Mixed Categories volume. That implies 3031 did not have dz-.
The fanqie for 1829 indicates -w- ... or does it? There is no transcription evidence for the -w- of 1829, its final speller 1289 1lwia or 0259 1lwia, the only homophone of 1289. -w- is an attempt to account for why 1289 1lwia is not in the same homophone group as
3456 1lia 'to come'
whose Chinese transcription 辢 *la has no *-w-. (Then again, that transcription is not ironclad proof 3456 didn't have -w-, because the Chinese known to the Tangut had no syllable *lwa. Nonetheless a Tangut lwia could have been transcribed in Chinese as 辢合 *laCLOSED with a small 合 'closed (mouth)' diacritic to indicate -w-.) 1289 and 3456 had the same initial (l-) and rhyme (1-ia), so they presumably had different medials (-w- and zero).
If 1289 was 1lwia, then 1829 was 1tshwia, and 3371, 0596, and 1283 were 1dzwia ... which conflicts with the use of 0596 as a transcription character for Sanskrit ja without -v- (there is no -w- in Sanskrit).
Let's suppose that 3371, 0596, and 1283 were 1dzia without -w- and that their fanqie final speller 1829 was 1tshia without -w-. 1829 and 1825 were in different homophone groups even though they had the same initial (tsh-) and rhyme (1-ia), so they presumably had different medials (zero and -w-). But if 1825 was 1tshwia, why was it transcribed in Tibetan as tsha instead of tshwa? Was the subscript -wa character accidentally omitted?
This is so frustrating. I want to end on a more positive note. Andrew West recently created an online Homophones lookup tool. You can input the Li Fanwen 2008 numbers I use for tangraphs to see that
- 3371, 0596, 1283 1dz?(w)ia are in the same homophone group (31A46-48; all Homophones numbers here are from the A edition; different editions have different numbers)
- 1829 1tsh(w)ia (the final speller for those three syllables) and 1825 1tsh(w)ia (which I confused with 1829) are in different homophone groups (31B36 [which has no homophones] and 33A13-14 [a set of two homophones: 5041 and 1825])
- 1289 1lwia (the final speller for 1829) and 3456 1lia are in different homophone groups (53B78-54A11 [a set of two homophones: 1289 and 0259] and 55A54 [which has no homophones])
Alas, Homophones does not give any concrete information about the homophone groups beyond their initial classes: e.g., 3371, 0596, 1283, 1829, and 1825 belong to the sixth class (alveolars) and 1289 and 3456 belong to the ninth class (liquids). The Tangraphic Sea lists homophone groups organized by rhyme with fanqie, but fanqie for most 'rising' tone syllables are lost, and readings for fanqie spellers are dependent on a mixture of transcription evidence and educated guesswork (e.g., the reasoning for reconstructing -w- above).
WHEN RHYME 21 IS REALLY RHYME 20
(10.11.18:25: Formerly titled "Tangut Grade III -a('): Rhymes 19 and 21 (Part 2)", but I changed the title since this entry has nothing to do with either rhyme apart from my confusion of rhymes 20 and 21.)
If you don't want to constantly make a fool of yourself in public, don't blog about Tangut.
For the past couple of days, I've been reconstructing 3371 as 1dzɨa' with Grade III rhyme 21 which would be unusual after dz-, but its fanqie in the Mixed Categories of the Tangraphic Sea clearly indicates that it has Grade IV rhyme 20 which is normal after dz-:
3371 1dzia = 3031 2dzi + 1825 1tshwia (sic; should be 1829!)
Even this corrected (?) reading remains problematic for several reasons.
First, the initial might be ts-. The evidence is ambiguous:
1. There is no fanqie for 3031, the initial speller of 3371.
2. 3031 was used to transcribe
Chinese characters with *ts-readings
Sanskrit ci (pronounced [tsi] in the variety of Sanskrit known to the Tangut, probably via Tibetan which had [ts] for Sanskrit c).
3. 3031 was transcribed in Tibetan as both Hdza and Htsa. The phonetic value of H- is uncertain: it could have represented prenasalization or a voiced back fricative.
4. Another character
1290 2?-ew 'ordinal suffix, class, limitation'
with 3031 as a fanqie initial speller was transcribed in Tibetan as tsa, tsi(H), gtsiH, and gdzi(H).
5. 3371 was homophonous with
0596 'to grow'
a transcription character for Sanskrit ja (pronounced [dza] in the variety of Sanskrit known to the Tangut, probably via Tibetan which had [dz] for Sanskrit j).
Second, it would be odd for a -wia graph (1825; sic - should be 1829!) to be a fanqie final speller for -ia without -w-. But it would also be odd for Sanskrit ja [dza] to be transcribed as dzwia instead of dzia.
The Tibetan transcription of 1825 is tsha, not tshwa. So maybe 1825 lacked -w- after all. And maybe it lacked -i- as well. A Sofronov-style reconstruction of 1825 as 1tsha may be best. But then how can one explain the different fanqie for the other 1tsha (or 1tshia) in the Tangraphic Sea?
1829 'to heat up, burn' 1tshia = 0311 1tshwiə + 1289 1lwia
(10.14.20:00: This is actually the fanqie for 1825!)
Maybe 1829 had -w- and 1825 and its homophone
5041 'stove, furnace'
did not. Their fanqie has no -w- in either speller:
1825 and 5041 1tshia = 3278 1tshi + 1693 1sia (used to transcribe Sanskrit sa)
(10.14.20:00: This is actually the fanqie for 1829!)
I will revise my reconstructions accordingly:
|Tangraph||Sofronov 1968||Li Fanwen 1986||Gong||Nishida-style reconstruction in Arakawa 1997||This site|
|1829||1tsha||1tsha||1tshja||1tshaɦ||1tshwia (formerly 1tshia)|
|1825||1tshwa||1tshɛ||1tshjwa||1tshaɦ²||1tshia (formerly 1tshwia)|
(10.14.20:04: No, judging from the corrected fanqie, Sofronov and Gong were right to reconstruct -w- in 1825 and 5041! Which means that the equation below is still 'broken' or 'unbalanced', depending on your preference in metaphors.)
Plugging that revised reconstruction of 1825 back into the fanqie at the beginning of this post results in a balanced equation:
3371 1dzia = 3031 2dzi + 1825 1tshia
The two homophones of 1825 listed in Mixed Categories of the Tangraphic Sea share that fanqie and should also be read as 1dzia:
0596 'to grow' and 1283 'stomach' (attested only in dictionaries)
This entry demonstrates how errors and their corrections can cause chain reactions in Tangut reconstructions.
I have eliminated one type of apparent anomaly in rhyme 21: the combination of an alveolar initial dz- with the Grade III medial -ɨ-. But other anomalies remain, and I will examine them in future entries.
220.127.116.11:55: TANGUT GRADE III -A('): RHYMES 19 AND 21 (PART 1)
Last night I mentioned the words (phrases?)
3371 0378 1dzɨa' 2ʔʊ 'curled hair' and 3371 1144 1dzɨa' 2dị 'bun (of hair)'
and noted that their first syllables had an anomalous initial-rhyme combination. (No, actually they don't!)
3371 has the Grade III rhyme 21 (= 1.21/level tone rhyme 21 and 2.18/rising tone rhyme 18). (10.10.20:01: The true rhyme of 3371 is 20.) Here are the latest reconstructions of that rhyme and its immediate neighbors in the first rhyme cycle:
|Rhyme||Tibetan transcription||Gong 1997||Arakawa 1999||Sofronov 2012||This site|
In Gong's reconstruction, there is no Grade III/IV distinction, and many rhymes are redundant: e.g., rhymes 21 and 24. Hence Gong regarded
3371 1dzjaa (rhyme 21; = my 1dzɨa') 'hair worn in a bun; peak' and 4075 1dzjaa (rhyme 24; my 1dzia') 'thrifty'
(10.10.20:02: 3371 should be 1dzia with rhyme 20.)
as homophones in spite of their placement into different rhymes and homophone groups in the Tangraphic Sea. They are not homophonous in the other three reconstructions.
In Arakawa's reconstruction, rhyme 21 is the only Grade IV rhyme, and it has a combination of the -y- of his Grade II and the vowel length of his Grade III.
Sofronov's reconstruction is very different from all others: e.g., it has Grade II and Grade IV variants of rhyme 21. Sofronov reconstructs five subtypes of a-rhymes corresponding to three subtypes in the other reconstructions.
In my reconstruction, Grade III rhymes are characterized by medial -ɨ- and are distinct from Grade IV rhymes with -i-. Grade III and IV rhymes typically have different initials:
III: v- (= w- in most other reconstructions), shibilants (tʂ-, tʂh-, dʐ-, ʂ-, ʐ-), l- (cf. Grade II which occurs with shibilants but not sibilants or r-)
All of these initials are associated with Grade III in the Late Middle Chinese (LMC) of the rhyme table tradition. (So are many other LMC initials other than sibilants and *ɣ-.) In LMC, Grade III was nonpalatal and Grade IV was palatal. Assuming that the Tangut carried over that distinction into their analysis of their own language, Tangut Grade III initials must have been nonpalatal. Tangut l may have been velarized [ɫ].
IV: all other initials (cf. Grade I which occurs with all non-shibilants)
However, this correlation between grade and initial is not absolute: e.g., 1dzɨa' has a dz- that normally should precede a Grade IV rhyme. Hence the distinction between medial /ɨ/ and /i/ is phonemic as well as phonetic, and the Tangut created separate rhyme categories whenever the medial could not be predicted on the basis of the initial. Minimal pairs like 3371 and 4075 above necessitated the separation of rhymes 21 and 24. (10.10.20:05: 3371 1dzia [not 1dzɨa'!] and 4075 1dzia' actually differ in terms of the presence or absence of the mysterious feature that I write as -', not in terms of medials.)
On the other hand, I presume all medials in rhyme 27 were nondistinctive (and predictable?*) as suggested by the mixture of Grade III and IV in this rhyme 27 fanqie:
1ʂɨã (Grade III) = 2ʂɨu (Grade III) + 1kiã (Grade IV!)
Hence there was no need to create separate rhyme categories for -ɨã and -iã syllables.
I'll start looking at the unpredictable medials of rhyme 21 and its -'-less counterpart 19 this weekend.
*It is possible that -ɨ- and -i- were completely interchangeable in rhymes like 27: e.g.,
1ʂɨã ~ 1ʂiã (cf. Grade III rhyme 36 1ʂɨe; there is no Grade IV rhyme 37 *1ʂie)
1kɨã ~ 1kiã (cf. Grade IV rhyme 37 1kie; there is no Grade III rhyme 36 *1kɨe)
It is also possible that rhyme 27 had only one medial (-ɨ- or -i-) after all initials, so all rhyme 27 syllables were Grade III or IV.
It is not possible to choose between these alternatives at this point. It might be more accurate to write the medial of rhyme 27 with an algebraic symbol like -I-. However, I have already used that symbol to represent a lost unstressed presyllabic vowel conditioning the raising and fronting of pre-Tangut *a to i. I assign medials to rhyme 27 syllables following the general pattern: -ɨ- after shibilants (there are no v- or l-rhyme 27 syllables) and -i- after other initials.
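The default pattern described above can be written as a small lookup. This is only a sketch of the general tendency (Grade III -ɨ- after v-, shibilants, and l-; Grade IV -i- elsewhere); it does not capture the rhymes in which the medial is unpredictable, and `default_medial` is a hypothetical helper name.

```python
# Initials that typically precede Grade III rhymes in this reconstruction.
SHIBILANTS = {"tʂ", "tʂh", "dʐ", "ʂ", "ʐ"}
GRADE_III_INITIALS = SHIBILANTS | {"v", "l"}


def default_medial(initial: str) -> str:
    """Return the medial predicted by the initial alone: ɨ (Grade III) or i (Grade IV)."""
    return "ɨ" if initial in GRADE_III_INITIALS else "i"


for initial, rhyme in [("ʂ", "ã"), ("k", "ã")]:
    print("1" + initial + default_medial(initial) + rhyme)
```

For rhyme 27 only the shibilant branch matters in practice, since there are no v- or l- syllables in that rhyme.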
18.104.22.168:32: WHIP = TSU + SHARP + ?
If 0219 2tseʳw 'whip' has three sources, the first two might be one of three tangraphs with a TSU-type reading and 3767 1reʳw 'sharp, pointed end':
What might be the third? There are nine tangraphs sharing a right side with 0219 that I didn't cover last Saturday:
|0054||1tswa||hair worn in a bun or coil||HAIR|
|0375||1ka||second syllable of 2phʊ 1ka 'boots worn in rain or snow'||HAIR (fur boots?)|
|0378||2ʔʊ||second syllable of 1dzɨa' 2ʔʊ 'curled hair'||HAIR|
|1144||2dị||second syllable of 1dzɨa' 2dị 'bun (of hair)'||HAIR|
|2279||1swa||second syllable of 2siọ 1swa 'a kind of grass'||SWA|
|4021||1swa||second syllable of 1niu 1swa 'ear ornament'||SWA|
|4371||1dạ||second syllable of 2me 1dạ 'hair'||HAIR|
|5133||2rieʳ||wool, feather, fine hair||HAIR|
All of the above characters either represent (parts of) words for hair or syllables homophonous with 1swa 'hair'. So 2tseʳw 'whip' is either 'TSU + hair' or 'TSU + sharp + hair'.
Two of the above characters (0378, 1144) are only attested after
3371 1dzɨa' 'hair worn in a bun or coil; peak (< like a bun of hair on the top of the head?)' = 2750 1ɣɤu 'head' + 1lwʊ̣ 'to mix, blend'
They may be adjectives modifying 1dzɨa'.
Both the structure and pronunciation of 3371 are odd to me (10.10.20:15: because I reconstructed 3371 incorrectly! It should be 1dzia with a Grade IV rhyme, not 1dzɨa' with a Grade III rhyme.) I wouldn't describe a bun or coil as mixed and blended hair. And Grade III rhymes with -ɨ- normally don't follow alveolars. I will take a closer look at -ɨa' tomorrow.
22.214.171.124:57: WERE TANGUT WHIPS SHARP?
On Sunday I concluded that the left side of 0219 2tseʳw 'whip' might be an abbreviation of some tangraph with a TSU-type reading, though I admit the phonetic match is poor:
2tseʳw 'whip' < left of 1tshwiu, bottom left of 2dziu', or right of 2dʐwɨiw?
I also identified the rest of 0219 as being from
2061 2pɤẹ̃ 'hair'
as a whole. And on Saturday I used Google to demonstrate that whips are associated with hair in English, though of course there is no guarantee the Tangut also had such an association.
2061 of course consists of two components. Maybe each of those components in 0219 2tseʳw 'whip' is from a different source. Let's look at eleven possible sources of
the center of 0219:
|2434||1bie||to mend, patch||BE, TATTER (i.e., to fix tatters)|
|3088||1bie||second syllable of 2bə 1bie 'dung beetle'||BE|
|3090||2ɬọ||first syllable of 2ɬọ 2ɬwi 'ugly and old'; can it stand alone?||UGLY|
|3558||2pɤẹ̃||first syllable of 2pɤẹ̃ 2ba 'flattery'||BE|
|3767||1reʳw||sharp, pointed end||SMOOTH (left and center from 1963 'smooth')|
|4330||1ʔị||ladle, scoop||I (bottom center and right from 3101 2ʔị 'to repeat')|
|4817||?ɬə||plane for carpentry||LHY|
I have excluded five tangraphs containing 2061.
The classes can be grouped into three families:
TATTER < BE > HAIR
SMOOTH > LHY > (UGLY if 2ɬọ had a ?ɬə tangraph as phonetic)
The last is an unusual case, as the shape of the bottom center component of 4330 1ʔị 'ladle' does not match its source 3101 2ʔị 'to repeat' in its Tangraphic Sea analysis:
The source of the top and bottom left of 4330 1ʔị 'ladle' is 4368 2dwʊ 'chopsticks'.
Among these characters, the best candidate for a source of 0219 'whip' is 3767 1reʳw 'sharp, pointed end'. I wish I knew more about Tangut material culture. Did Tangut whips have sharp ends?
126.96.36.199:06: THE APPEARANCE OF ANGER
Two of the Tangut words in yesterday's table
0924 2niạ 'anger, rage' and 0996 2mə 'appearance, spirit'
were borrowings from Chinese 惱 'angry' and 模 'pattern' according to Li Fanwen (2008: 156, 167).
The first etymology would work only if there was a pre-Tangut prefix *Sɯ- of unknown function (!) added to *nawʔ from Middle Chinese *nawˀ. The *S- of the prefix conditioned vowel tension (indicated by a subscript dot) and the high vowel *ɯ of the prefix conditioned the -i- in the main syllable:
*Sɯ-nawʔ > *Sɯ-nɨawʔ > *Sɯ-nɨaɯʔ > *S-nɨaɯʔ > *nnɨaɯʔ > *ṇɨaɯʔ > *ṇɨ̣ạɯ̣ʔ > 2niạ
The relative chronology of changes is not entirely clear, though *a-breaking must have preceded *ɯ-loss and *S-tension.
I once thought Tangut rhymes ending in the algebraic symbol -' (corresponding to what I used to reconstruct as long vowels) once had final consonants:
-V' (= -VV) < *-VC
If that were the case - and I don't think it was* - then the absence of -' in 0924 2niạ would not rule out an earlier final consonant (i.e., *-w) since -' could not occur with tense vowels. This complementary distribution is a clue to the identity of -' which had to have some phonetic characteristic that was incompatible with tense vowels.
The second etymology is highly improbable because Middle Chinese 模 *mo 'pattern' should correspond to Tangut *2mʊ, not 2mə. (See Gong 2002: 413 for examples of MC *-uo : Tangut -u which is equivalent to MC *-o : Tangut -ʊ in my reconstruction. I regret not including the raising of *-o to *-ʊ in pre-Tangut.)
There are isolated instances of the correspondences
Tangut -ə : Japhug rGyalrong -u < *-o, -ɯ < *-u
in Jacques (2006), but the general pattern is clear:
Tangut -ʊ (= Jacques' -u) : Japhug rGyalrong -u < *-o, -ɯ < *-u
2mə 'spirit' may be an unrelated homophone of 2mə 'pattern' that was written with the same character.
The Precious Rhymes of the Tangraphic Sea analyzed the graph 0996 for 2mə as being from
the top of 1365 and the bottom of 4744 2ʔiõ 'appearance' (a loan from Middle Chinese 樣 *jɨaŋʰ or Tangut period northwestern Chinese *jõ).
Li may have been tempted to derive 2mə from Middle Chinese 模 *mo 'pattern' since the word appears with the clarifying character 4744 in Homophones:
He translated that collocation as 模樣 'pattern' which would have been read as *mo jɨaŋʰ in Middle Chinese - a near-mirror image of 2ʔiõ 2mə! I think this resemblance is coincidental. In Tangut period northwestern Chinese, 模樣 was something like *mbʊ jõ which would have been borrowed into Tangut as *bʊ 2ʔiõ. (Tangut tones for Chinese loans are unpredictable, so I have not indicated the hypothetical tone of the first syllable.)
The analysis of 0924 2niạ 'anger, rage' is unknown. Perhaps it was from the top and bottom left of 0948 1na 'to steal' (phonetic) plus 'demon' (semantic) extracted from one of forty-nine different possible characters:
('Demon' has left-hand and right-hand forms which are interchangeable in tangraphic analyses.)
None of the other 'demon' characters mean 'anger', so none stand out as more likely sources than others.
*My old -V' < *-VC hypothesis would not predict Tangut-Japhug rGyalrong comparisons such as these from Jacques (2006):
'nose': 5700 2ni' (not *2ni) : J sna
Correlations between Tangut -' and Japhug final consonants in sets such as
'needle': 4935 1ɣa (not *1ɣa') : J ta-qaβ
'fruit': 2436 1mia' : J sɯ-mat
may be coincidental.
188.8.131.52:59: WHAT PLUS 'HAIR' EQUALS 'WHIP'?
If the center and right components of
0219 2tseʳw 'whip'
are from
2061 2pɤẹ̃ 'hair',
what is the source of the left-hand component?
None of the 69 other characters with that component are a plausible semantic match for 2tseʳw 'whip' which may belong to the TSU phonetic class:
|LFW2008||Reading||LFW2008 gloss||Class(es)||Class codes|
|0009||1ʂwɨo||to appear; to raise (< 'cause to appear'?)||APPEAR||S1|
|0020||1tʂɨa||road, way (literal and metaphorical: 'manner'); to lay bricks||CHA, ROAD||P1, S2|
|0486||2paʳ||horse with white trotters||PAR||P4|
|0503||1tʂɨa||the surname Cha||CHA||P1|
|0745||2vɨe||the surname Ve||VE||P5|
|0752||1tʂɨa||ceremony, courtesy||CEREMONY, CHA||S3, P1|
|0760||2dʐɨe||to judge, decide||JE||P6|
|0948||1na||to steal, rob||NA||P7|
|1003||1lew||full, filled, satisfied||not HOLLOW?, LU? (but analysis has 1630 2dziẽ 'carve'!)||S4|
|1026||1tʂwɨa||the name Chwa; luck||CHA||P1|
|1071||2dziu'||first half of 1071 1226 'to hide, conceal'||HIDE, TSU?||S5, P8?|
|1082||2riʳ||second syllable of surnames ending in Rir||RIR||P2|
|1094||2ʐɨə||to go without a burden||GO||S6|
|1226||?T-||second half of 1071 1226 'to hide, conceal'||HIDE, TU?||S5, P3?|
|1360||1va||to hide, conceal||HIDE||S5|
|1364||1ŋa||hollow, void||HOLLOW, NGA||S4, P9|
|1588||1tʂɨa||sheep guardian god||CHA, SHEEP||P1, S8|
|1630||2dziẽ||to carve||CARVE, JE||S9, P6|
|1641||2dʐɨa||lamb||CHA?, SHEEP? (but analysis has 1043 1lew 'full')||P1?, S8|
|1651||1tshwiu||to salute||CEREMONY, TSU?||S3, P8?|
|2663||1kwiə̣||to kowtow, worship on bended knees||CEREMONY||S3|
|2755||2lwəʳ||the surname Lwyr||LWYR||P10|
|2972||1ŋa||to spread; Grinstead: 'empty'||HOLLOW?, NGA||S4?, P9|
|3049||1xwaʳ||to melt, thaw; to confess (< 'melt down' and release information?)||XA, SPEAK||P11, S10|
|3575||2ni||to listen, hear||EAR||S7|
|3579||2kie||impressive and dignified, eminent||APPEAR (i.e., prominent?), CEREMONY?||S1, S3?|
|3813||2vɨẹ||to see someone off||VE||P5|
|3821||2lʊ||to enjoin; to tell; to give a present||CEREMONY?, LU, SPEAK||S3?, P12, S10|
|3828||1tʂɨə||to give a present; to enjoin; to tell; to know||CEREMONY?, CHA, SPEAK? (but no CEREMONY, CHA, or SPEAK graph in analysis which has 3813 2vɨẹ 'to see someone off')||S3?, P1?, S10?|
|3874||1ʔiə||hunger||HOLLOW (lacking food)||S4|
|3920||1kiụ||to bow, salute||CEREMONY||S3|
|4153||2lɨiw||to gather, assemble; transcription character||LU?||P12?|
|4201||?kha||casket, small box||XA?||P11?|
|4469||2ʂɨi||to go toward, depart||GO||S6|
|4475||?xa||to puff, blow; transcription character||XA||P11|
|4534||2dʐwɨiw||hungry||HOLLOW (lacking food; but analysis has 130 'source')?, TSU?||S4?, P8?|
|4681||1niu||ear||EAR, NU||S7, P16|
|4682||2khiə'||chimney, window, hole, space||HOLLOW, KHY||S4, P13|
|4696||1bạ||cymbals||BA, CYMBAL||P15, S11|
|4744||2ʔiõ||appearance, shape; transcription character||APPEAR||S1|
|4761||1ʂwɨa||to speak, say||SHA, SPEAK||P14, S10|
|4762||1tʂhɨe||to go, walk||GO||S6|
|4766||2bə||a kind of vegetable||BA||P15|
|4812||2rioʳ||to brush, wipe, whisk||RIR?||P2|
|4822||2dzwiə||to go, walk||GO||S6|
|4849||1niu||the surname Nu||NU||P16|
|4894||1mio||to listen, hear||EAR||S7|
|5126||1lɨu'||to carve, engrave||CARVE, LU||S9, P12|
|5412||2lwəʳ||ceremony, rite; to get a haircut; transcription character||CEREMONY, LWYR||S3, P10|
|5693||1vɪʳ||to listen, hear||EAR, VE||S7, P5|
|6010||1kiụ||to bow, salute (= 3920)||CEREMONY||S3|
I have numbered phonetic (P) classes by order of first occurrence in the table above. Class names are in my lay romanization for Tangut which ignores the four grades, vowel tension, and the unknown distinction indicated by -'. Y represents central nonlow vowels.
Phonetic classes organized by Homophones chapter
|Chapter||Initial type||Phonetic class|
|I||Labials||P4. PAR, P15. BA|
|III||Dentals||P3. TU, P7. NA, P16. NU|
|V||Velars||P9. NGA, P13. KHY|
|VI||Alveolars||(no pure VI classes)||P6. JE,
|VII||Alveopalatals (actually retroflex shibilants?)||P1. CHA, P14. SHA|
|IX||Liquids||P2. RIR, P10. LWYR, P12. LU|
Some of the phonetic classes could be combined (P4. PAR + P15. BA, P1. CHA + P14. SHA, P10. LWYR + P12. LU).
P6 and P8 might be split, as I am not certain that mixing class VI and VII initials was permissible in Tangut phonetic series.
I have also numbered semantic (S) classes by order of occurrence:
10.6.0:59: Some of those 27 classes could be combined into even bigger classes using ambiguous graphs as pivots: e.g., 0020 can either be CHA or ROAD, so CHA and ROAD graphs could be grouped together. Here is one particularly large group containing 18 classes:
That diagram is meant to be read from left to right: e.g.,
CEREMONY > LU > CARVE > JE
Two smaller groups are
CYMBAL > BA > PAR
EAR > NU, VE
Three classes cannot be combined with others: GO, NA, RIR.
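Grouping classes through ambiguous graphs amounts to finding connected components: two classes end up in the same group if some chain of graphs links them. Below is a minimal union-find sketch over a hand-picked subset of the table, using only rows whose class assignments carry no question mark. Because uncertain rows (e.g., 3821) and the proposed mergers of phonetic classes are left out, the groups it finds are smaller than the ones described above. `GRAPHS` and `merge_classes` are illustrative names.

```python
from collections import defaultdict

# LFW2008 number -> classes, copied from unambiguous rows of the table.
GRAPHS = {
    "0020": ["CHA", "ROAD"],
    "0752": ["CEREMONY", "CHA"],
    "1588": ["CHA", "SHEEP"],
    "5412": ["CEREMONY", "LWYR"],
    "5126": ["CARVE", "LU"],
    "1630": ["CARVE", "JE"],
    "4681": ["EAR", "NU"],
    "5693": ["EAR", "VE"],
    "4696": ["BA", "CYMBAL"],
    "4469": ["GO"],
    "0948": ["NA"],
}


def merge_classes(graphs):
    """Group classes that are linked by at least one graph belonging to both."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for classes in graphs.values():
        root = find(classes[0])
        for other in classes[1:]:
            parent[find(other)] = root  # the graph acts as a pivot between classes

    groups = defaultdict(set)
    for cls in parent:
        groups[find(cls)].add(cls)
    # Largest groups first, then alphabetical, so the result is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


for group in merge_classes(GRAPHS):
    print(" + ".join(group))
```

Adding the uncertain rows as extra entries in `GRAPHS` is an easy way to see how much of the large group depends on them.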
Thus one could say there are six kinds of that left-hand component.
But I doubt literate Tangut actually looked at, say, 0760 2dʐɨe 'to judge' and thought, "Its left side indicates that it has a JE-like reading like 1630 2dziẽ 'carve', derived from the right side of 5126 1lɨu' 'to carve', in turn derived from the bottom left of 3821 2lʊ 'to give a present', in turn derived from the center of 5412 2lwəʳ 'ceremony'."
How did the Tangut learn and perceive their own script?
184.108.40.206:57: ARE WHIPS LIKE HAIR?
It was fun to use tentative Unicode code points for Tangut characters and components in my last post, but now I'm going to use Li Fanwen 2008 numbers again.
I've been trying to figure out the graphic etymology of
0219 2tseʳw 'whip'
The left side is shared with 69 other characters which don't seem to have any phonetic or semantic similarity to 2tseʳw 'whip'. I'll look at them again and post a list tomorrow.
The center and right components appear in five other characters. I already mentioned the first yesterday:
|LFW2008||Tangraph||Reading||LFW 2008 gloss||Character structure|
|2pɤẹ̃||hair||left of 'hair' + left of another graph for 'hair'|
|2mioʳ||second syllable of 2177 0227 1pə 2mioʳ 'rude, coarse, careless'||'language' + 'hair': i.e., coarse words are rude|
|2phʊ||boots worn in rain or snow||'boots' next to 'hair': i.e., furry boots|
|2giu||silk, silkworm||'bug' atop 'hair' (i.e., silk thread)|
|2ɬɤi||smooth, glossy||'not' next to 'hair'|
If the right two-thirds of 0219 were taken as a unit, then 'hair' is the most likely source. Although a whip is not much like a hair, it is even less like 'rude', 'rain and snow boots', 'silk(worm)', or 'smooth'. Moreover, none of the five sound like 2tseʳw.
I'll break up that two-thirds and see if I can find more plausible graphic sources.
10.5.0:30: Are whips associated with hair on Google?
"whip like a hair": 0 results
"whips like hair": 2 results
"whips made of hair": 7 results
"hairs like whips": 229 results
"hairs whip": 374 results
"hair whips": 32,100 results
"hair like whips": 39,400 results
"whip hair": 62,200 results
"hair whip": 93,500 results
"hair like a whip": 273,000 results
Of course modern English usage is not the key to the ancient Tangut mind. Nonetheless, the whip-hair connection is stronger in the 21st century than I had thought.
220.127.116.11:51: UNICODE TANGUT COMING IN JUNE 2016
This has been an exciting week. First, Baxter and Sagart's new Old Chinese reconstruction, then the catalog of Khitan large script characters, and in less than two years, 6,126 Tangut characters plus the Tangut iteration mark and 753 Tangut radicals. Andrew West has documented the long road his team has taken. Bravo!
Finding Tangut characters is easy in Unicode. For example, if I want the first character I mentioned on Wednesday, I can just search for its Li Fanwen 2008 number (0219) on this code chart, and voila!
U+17366 2tseʳw 'whip'
And I can find the second character I mentioned on Wednesday (Li Fanwen 2008 number 1877) by looking through the range of characters sharing its left-hand radical U+1896E (= Nishida 219, gloss unknown). Oddly the source graph for its left side according to the Combined Homophones and Tangraphic Sea has a different radical (U+18954 = Nishida 218 'dog/fox'):
U+1785F 2ʔiəʳ 'whip' =
left (!) of U+175EF 2khɤi 'yak'
all of U+18571 2phʊ 'tree'
Why does 'yak' plus 'tree' equal 'whip'?
The analysis of U+17366 2tseʳw 'whip' is unknown. There are 69 other characters containing the component
U+1892C (= Nishida 103, gloss unknown),
16 other characters with the middle component
U+18942 (= Nishida 275, gloss unknown),
14 other characters with the right-hand component
U+18975 (= Nishida 134, gloss unknown),
and five other characters containing the middle and right hand components: e.g.,
U+173F3 2pɤẹ̃ 'hair'.
Is a whip like a giant hair? Maybe. Or maybe there's a more likely source of the right two-thirds of U+17366 2tseʳw 'whip'. I'll look at the possibilities tomorrow.
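The code points cited in this post fall into two ranges, one for whole characters and one for radicals and components. A small sketch can sort them, assuming the block boundaries of the published standard (Tangut at U+17000..U+187FF, Tangut Components at U+18800..U+18AFF). The code points in this post were tentative, so an individual character may sit at a different position in the final code charts; `block_of` is a hypothetical helper.

```python
TANGUT = range(0x17000, 0x18800)             # Tangut block, U+17000..U+187FF
TANGUT_COMPONENTS = range(0x18800, 0x18B00)  # Tangut Components, U+18800..U+18AFF


def block_of(label: str) -> str:
    """Classify a 'U+XXXXX' label by the Tangut-related block it falls in."""
    if not label.upper().startswith("U+"):
        raise ValueError(f"expected a label like 'U+17366', got {label!r}")
    code_point = int(label[2:], 16)
    if code_point in TANGUT:
        return "Tangut"
    if code_point in TANGUT_COMPONENTS:
        return "Tangut Components"
    return "other"


for label in ["U+17366", "U+1785F", "U+18571", "U+1896E", "U+1892C"]:
    print(label, block_of(label))
```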
18.104.22.168:59: THE KHITAN LARGE SCRIPT IN SRI LANKA
I never expected Khitan to be discussed in
Sri Lanka <ś.ri l.ang.k.a>
at WG2 meeting 63. To be more precise, it was the Khitan large script that came up, not the Khitan small script above. I'm much less confident about this attempt to write the name in the large script:
<ś(i) ri la ang ka>
Even if one or more of those characters turns out to be inappropriate for transcribing Sri Lanka, I'm certain that a large script spelling would take up more space than its small script equivalent since the former is not clustered into word blocks like the latter.
The first of the large script characters is identical to the Chinese character 已 pronounced i in Liao Chinese, the northeastern dialect known to the Khitan a millennium ago. Should Khitan large script characters be unified with Chinese characters in Unicode?
The unification was proposed to minimize the security issues caused by co‐existence of similar shaped characters in the CJK Unified Ideograph [i.e., Chinese character] block and Khitan Large Script block.
Not knowing what the security issues are, I oppose unification. Unifying Chinese characters and the Khitan large script would be like unifying Latin A, Greek Α, and Cyrillic А. Would Greek and Cyrillic lookalike letters (e.g., Γ and Г) be assigned to one or the other alphabet while letters unique to Greek or Cyrillic (e.g., Δ and Д) were assigned to separate alphabets? My mind reels.
I also don't think unifying Jurchen (large) script characters resembling Khitan large script characters is a good idea. To me, Chinese characters, Khitan large script characters, and Jurchen (large) script characters are like the Latin, Greek, and Cyrillic alphabets: related scripts that should be kept apart in spite of partial visual overlap.
Encoding issues aside, I've been excited to browse the longest list of Khitan large script characters I have ever seen:
Proposal on Encoding Khitan Large Script in UCS
Part 1: Characters 0001-0472
Part 2: Characters 0473-0963
Part 3: Characters 0964-1455
Part 4: Characters 1456-1930
Part 5: Characters 1931-2218
(10.3.1.1:30: This last file does not include 已 <ś(i)> attested in the epitaph for 多羅里本 Duoluoliben [a.k.a. 突呂不 Tulübu, 1081], though it does include 己 [#1938] and 巳 [#1941] which also look like Chinese characters.)
I especially appreciate the inclusion of images of original characters. (10.3.0:06: But I wish I understood the codes for their sources.) I wanted to continue my series on Baxter and Sagart's new Old Chinese reconstruction, but I had to interrupt it to mention this breakthrough in Khitanology.
Alas, that list does not include any characters that Viacheslav Zaytsev may have discovered in Nova N 176, the longest known Khitan text in either script. As much as I'd love to be able to type the Khitan large script in Unicode as soon as possible, I wonder if it might be a good idea to wait until the characters in that book have been catalogued. It might be odd to have a first Khitan large script encoding covering all texts but Nova N 176. Typing words from what may be the most important Khitan text in the far future might involve going back and forth between a primary Khitan large script block and a Khitan Extended-A block. Awkward.
10.3.1:18: ADDENDUM: The Khitan large script proposal lists several inscriptions that I have never heard of before:
1. 耶律大王墓誌 Epitaph for Prince Yelü (personal name not given; 1051)
2. 耶律準墓誌銘 Epitaph for Yelü Zhun (1068)
3. 耶律李家奴墓誌銘 Epitaph for Yelü Li Jianu (1081)
4. 留隱太師墓誌銘 Epitaph for Master Liuyin (1109)
I wish I could see them.