In 1979, EG Pulleyblank proposed that Old Chinese (OC) had a system of twenty-two consonants consisting of eleven pairs distinguished by a palatalization vs. labialization contrast.  In that same article, he also proposed the following rhyme categories for OC:

OC rhyme type (don't confuse with my notation for the heavenly stems!) | 'yin' (non-nasal, non-stop final) | 'yang' (nasal final) | 'entering' (stop final)
IV | -al, -as (< earlier -ats) | -an | -at
IX | -əw | -əng w | -ək w
X | -aw | -ang w | -ak w
XI | -ah w | | -a' w

This chart ignores final glottal stops and -s, which have been Pulleyblank's sources for the Middle Chinese 'rising' and 'departing' tones* since 1962 (p. 142).

In "Why So Many W-s and Y-s", I asked,

What's odd about l wəw'?

which was what I thought Pulleyblank's 1979 reconstruction of the OC tenth earthly branch (酉) would have been.  Now I can broaden the scope of that question: what's odd about his 1979 OC rhyme categories (if everything else Pulleyblank has proposed is correct)? The existence of only two vowels (a and ə) has been a given for Pulleyblank since 1963, so that doesn't count.  What does count is the curious fact that ten of the eleven final classes contain consonants that are not in his list of twenty-two:

I, II: -m, -p (without palatalization or labialization)

III, IV: -l, -s, -n, -t (without palatalization or labialization)

V, VI: -y, -ñ, -c (completely absent from his list; c = voiceless palatal stop)

VII, VIII: -G, -ng, -k (without palatalization or labialization)

IX, X: -w (without palatalization or labialization)
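The mismatch can be checked mechanically. Here is a minimal sketch in Python, assuming the eleven base consonants of the 1979 initial system (p, m, t, n, l, s, k, ng, G, ', h) and treating every final listed above as a plain consonant. The names BASES, INVENTORY, FINALS and both helper functions are mine, not Pulleyblank's.

```python
# Pulleyblank's 1979 initials: eleven base consonants, each either
# palatalized ("y") or labialized ("w"); there is no plain series.
BASES = ["p", "m", "t", "n", "l", "s", "k", "ng", "G", "'", "h"]
INVENTORY = {(base, sec) for base in BASES for sec in ("y", "w")}

# Final consonants of the rhyme classes as listed above.
# Every one of them is written without a secondary articulation.
FINALS = {
    "I, II": ["m", "p"],
    "III, IV": ["l", "s", "n", "t"],
    "V, VI": ["y", "ñ", "c"],
    "VII, VIII": ["G", "ng", "k"],
    "IX, X": ["w"],
}


def missing_from_inventory(finals):
    """Per class, the finals that are not members of the 22-consonant set."""
    return {
        cls: [c for c in consonants if (c, None) not in INVENTORY]
        for cls, consonants in finals.items()
    }


def unknown_bases(finals):
    """Finals whose base consonant is not among the eleven bases at all."""
    return sorted({c for cs in finals.values() for c in cs if c not in BASES})
```

Under these assumptions every listed final is missing from the inventory, because the inventory contains no plain consonants, and y, ñ, c and w are not even among the eleven bases.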

One could try to reconcile those final consonants with his 22C system by rewriting them slightly:

I, II: -m y/w, -p y/w (palatalization-labialization contrast neutralized in rhyming, or was there a distinction no one has found yet?)

III, IV: -l y/w, -s y/w, -n y/w, -t y/w (palatalization-labialization contrast neutralized in rhyming, or was there a distinction no one has found yet?)

V, VI: -G y, -ng y, -k y (palatalized velars instead of palatals)

IX, X: -G w (to match labialized velars -ng w and -k w)

so OC l wəw'  could be re-reconstructed as  l wəG w '

(but what about the final plain glottal stop?  was it ' w? )

but I don't know what to do with the plain velars -ng and -k in VII and VIII, since I've already reassigned the palatalized velars to V-VI and Pulleyblank already used up the labialized velars in IX-X.  Would it be possible for OC speakers to have plain velars (or glottal stops and -s, etc.) in final position that they never invented letters for because they never existed in initial position?  Maybe.  It is possible for a language to have final consonants that do not occur initially: e.g., Vietnamese has final -p but not initial p- (except in loanwords; why?**).

This has been fun for me but it ultimately doesn't matter because Pulleyblank has moved on since 1979.  In his 1991 revision of his 22C proposal (p. 51), he posits "the same set of consonants in syllable initial and final position" (table below adapted from p. 52): 

Series | Voiceless stops | Voiceless fricatives | Nasals | Others
Plain velars | k | x | ng | G
Labialized velars | k w | x w | ng w | w
Palatalized velars | k y | x y | ng y | y
Palatolabialized velars | | | ng ẅ |

(Such a velar-heavy inventory is reminiscent of some versions of Proto-Indo-European with three series of velars: plain, labialized, and palatalized.  Guess what language Pulleyblank thinks Chinese is related to.  His IE-Sino-Tibetan idea, however, long predates this 1991 inventory.)

Here are two more problematic calendrical terms.  Assuming once again that Pulleyblank is correct, what's odd about his reconstructions of

the sixth stem (己) as xə̀ẅ' (1991: 57)

and the tenth stem (癸) as k əẅ'  (1991: 74; sic; there should be a grave accent on the schwa; what do the accents in Pulleyblank's reconstruction signify?***)

(for comparison, these are AMR OC klə' and kwi')

And just to show that the forms above aren't a typo on my part, here are four more OC reconstructions of his (1991: 73):

鳥 k w yə́ẅ'  'bird' (AMR OC: tew')

久 k w ə̀G'  'to be a long time' (AMR OC: k w ə')

古 k w áG'  'old (n.)' (AMR OC: k w a')

ăk w ə̀G's 'old (adj.)' (AMR OC: ngk w ə's)

(there is clearly a root 'long time/old' here: Pulleyblank's k w-G', my k w-')

ă is not a short a, but is "a pharyngeal glide" (p. 43).  I could rewrite it as H to make its consonantal nature clearer.

Next: How is xə̀ẅ' odd?

*Middle Chinese had four tone classes: 'level', 'rising', 'departing', and 'entering'. The first two seem like straightforward descriptions; the latter two are more opaque.  Most researchers believe that the four classes reflect OC consonantal distinctions:

MC 'level' rhymes < OC 'yin' and 'yang' rhymes without final glottal stop or -s

MC 'rising' rhymes < OC 'yin' and 'yang' rhymes with final glottal stop

MC 'departing' rhymes < OC 'yin' and 'yang' rhymes with final -s

MC 'entering' rhymes < OC 'entering' rhymes ending in final stops (other than glottal stop)


OC -a, -aw, -ay ('yin'), -am, -an, -ang ('yang') > MC 'level' tone

OC -a', -aw', -ay' ('yin'), -am', -an', -ang' ('yang') > MC 'rising' tone

OC -as, -aws, -ays ('yin'), -ams, -ans, -angs ('yang') > MC 'departing' tone

OC -ap, -at, -ak ('entering') > MC 'entering' tone
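The correspondences above amount to a rule keyed on the last segment of the OC rhyme. Here is a minimal sketch in Python, assuming the plain ASCII notation of the examples above. The function name mc_tone is mine, and finals written with a secondary articulation (e.g., -ak w) are not handled.

```python
def mc_tone(oc_rhyme: str) -> str:
    """Map an OC rhyme such as "ang'" to its Middle Chinese tone class."""
    if oc_rhyme.endswith("'"):
        return "rising"      # final glottal stop
    if oc_rhyme.endswith("s"):
        return "departing"   # final -s
    if oc_rhyme.endswith(("p", "t", "k")):
        return "entering"    # final stop other than the glottal stop
    return "level"           # 'yin' or 'yang' rhyme with nothing added


for rhyme in ["a", "ang", "aw'", "ans", "ak"]:
    print(rhyme, "->", mc_tone(rhyme))
```

The glottal stop and -s are tested first because they follow the rest of the rhyme: -ang' is 'rising' even though the syllable also has a nasal final.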

Tai and Vietnamese tones had similar origins.  The Tangut 'level'/'rising' tone distinction might also have arisen from lost final consonants.

The extremely rare Tangut 'entering' tone words might be a handful of words that had not lost their final consonants (and were hence still 'entering') or loanwords (though no foreign prototypes for them are known yet).

**Original Vietnamese p- had shifted to b-.  Similarly, original Vietnamese t- had shifted to d- (written đ-; but original Vietnamese k- did not shift to g-!). Vietnamese acquired a new t- from original s-, and in turn developed a new [s] (spelled x) from palatal sh (and also retroflex Sh in the north):

d- < t- < s- < sh-, (Sh-)

However, Vietnamese did not develop a new p- from another source.

***Since 1973, Pulleyblank (1994: 79) has reconstructed

a prosodic difference in Old Chinese between syllables with prominence on the second mora (Type A) and syllables with prominence on the first mora (Type B).

This implies that all OC syllables were bimoraic.

He writes type A syllables with acute accents and type B syllables with grave accents.

In my OC reconstruction, type A syllables have 'emphatic' initial consonants (written with underlining) and type B syllables have plain initial consonants.  Other OC reconstructions like Gong's posit a -y- in type B syllables absent in type A syllables:

OC syllable type | Pulleyblank OC | AMR OC | Gong OC
Type A | táG | t̲a | tag
Type B | tàG | ta | tyag

My OC follows the Western and Russian schools in permitting open syllables, whereas Pulleyblank and Gong allow no open syllables in their OC reconstructions.  Gong goes so far as to forbid open syllables even in Proto-Sino-Tibetan (1996).  I know of no language which only has closed syllables.  Moreover, Pulleyblank's OC -G and Gong's OC/PST -g (corresponding to zero in the Western/Russian school) are extremely common - perhaps more common than -k - whereas I know of no language with -G and/or -g outnumbering -k.

WHY SO MANY W-S AND Y-S?

EG Pulleyblank's 1979 reconstruction of the Old Chinese initial consonant system consists of eleven pairs distinguished by a palatalized (-y-) vs. labialized (-w-) contrast:

Palatalized | p y- | m y- | t y- | n y- | l y- | s y- | k y- | ng y- | G y- | ' y- | h y-
Labialized | p w- | m w- | t w- | n w- | l w- | s w- | k w- | ng w- | G w- | ' w- | h w-

There are no plain consonants without palatalization or labialization: e.g., p-, t-, k-. I know of no language which lacks plain consonants. Of course, that doesn't mean no such language exists.

Pulleyblank (1979: 31) cited Old Church Slavonic as an example of "a language in which all consonants are either palatalized or labialized". Although I think there may be historical parallels between OCS and OC (Old Chinese), this is not one of them. Krause and Slocum imply that OCS had a three-way contrast between hard, soft, and palatalized consonants. (their 'soft' = mildly palatalized? their 'palatalized' = 'palatalized followed by -y-'? = hyperpalatalized?) They do not mention labialized consonants.

Instead of positing such an exotic system, why didn't Pulleyblank fill up the twenty-two slots with less unusual choices like voiced b, d, g, or the aspirates ph, th, kh? (All six of those consonants are attested in Middle Chinese.)

Pulleyblank (1979: 24) pointed out that the Middle Chinese readings of the twenty-two calendrical terms lack voiced stops and contain only a single aspirated stop (retroflex Th). All scholars would agree with him (and would more or less accept his Middle Chinese initials).

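Heavenly stem | I | II | III | IV | V | VI | VII | VIII | IX | X
Pulleyblank's 1979 OC initial | ' w- | ' y- | p w- | t y- | n w- | k w- | k y- | s y- | m y- | t w-
Pulleyblank's 1979 Middle Chinese initial | k- | '- | p- | t- | m- | k- | k- | s- | ñ- | k-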

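Earthly branch | i | ii | iii | iv | v | vi | vii | viii | ix | x | xi | xii
Pulleyblank's 1979 OC initial | p y- | h w- | n y- | m w- | G y- | l y- | ng y- | ng w- | h y- | l w- | s w- | G w-
Pulleyblank's 1979 Middle Chinese initial | ts- | Th- | y- | m- | j- | z- | ng- | m- | sh- | y- | s-w- | G-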

Note that four consonants occur more than once:

k-: I, VI, VII, X

y-: iii, x

s-: VIII, xi

m-: V, iv, viii
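The tally above can be reproduced by grouping the 22 terms by Middle Chinese initial. Here is a minimal sketch in Python; the dictionary MC_INITIALS and the function shared_initials are my own names, and the values are the Middle Chinese initials quoted above (xi is counted as s-, following the list).

```python
from collections import defaultdict

# Middle Chinese initials of the ten stems (upper case) and the
# twelve branches (lower case), hyphens omitted.
MC_INITIALS = {
    "I": "k", "II": "'", "III": "p", "IV": "t", "V": "m",
    "VI": "k", "VII": "k", "VIII": "s", "IX": "ñ", "X": "k",
    "i": "ts", "ii": "Th", "iii": "y", "iv": "m", "v": "j", "vi": "z",
    "vii": "ng", "viii": "m", "ix": "sh", "x": "y",
    "xi": "s",  # counted as s-, as in the list above
    "xii": "G",
}


def shared_initials(initials):
    """Return only the initials that begin more than one calendrical term."""
    terms_by_initial = defaultdict(list)
    for term, initial in initials.items():
        terms_by_initial[initial].append(term)
    return {c: terms for c, terms in terms_by_initial.items() if len(terms) > 1}


for initial, terms in shared_initials(MC_INITIALS).items():
    print(f"{initial}-: {', '.join(terms)}")
```

Only four initials survive the filter, so the 22 terms yield just 15 distinct Middle Chinese initials.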

At first this would seem to indicate that the twenty-two terms could not possibly represent an alphabet. However, one could try to explain this by hypothesizing that, for example, the four k-s were originally four different consonants: e.g., k, k w, kh, g. But then one would have to hypothesize that Middle Chinese kw, kh, g were not continuations of the original OC k w, kh, g that merged with original k (at least in the calendrical terms and in phonetically similar syllables).

Pulleyblank did something similar: he derived the four k-s from ' w, k w, k y, t w and regarded kh and g (and other aspirated and voiced stops) as resulting from prefixation. (I also think many - but not all - Middle Chinese aspirated and voiced stops originate from prefixed voiceless stops.)

But that still doesn't answer the title question. What is the purpose of positing palatalization and labialization everywhere in Pulleyblank's 1979 reconstruction - or throughout the velars in his 1991 reconstruction?

I think the answer partly lies in Pulleyblank's even older hypothesis of a two-vowel system for Old Chinese, with only a and ə.


Most scholars today would agree that Old Chinese had a smaller vowel system than Middle Chinese, if length is disregarded: e.g., Li Fang-kuei and Gong (whose Tangut reconstructions are often cited here) posit four vowels, and several scholars in the PRC, Russia, and the West (myself included) posit six vowels:

i, ï ~ ə, u, e, a, o

(The sixth vowel was ï or ə, depending on scholar. It might've phonetically been both, depending on environment.)

But how does one get from Pulleyblank's two vowels a and ə to the vocalic rainbow of Middle Chinese with vowels such as e, i, ï, o, O, u in addition to the original two in his 1979 reconstruction? Secondary articulations are the key. They 'colored' adjacent vowels until they were no longer recognizable as [a] or [ə]: e.g.,

丁 'fourth stem': OC t yang y > t yeng y > MC teyng

酉 'tenth branch': OC l wəw' > l wïw' > MC yuw'

(Only the OC initials and the MC forms are from Pulleyblank's 1979 reconstructions; the OC rhymes are extrapolated from what I know of his reconstruction. For comparison, I reconstruct

丁 'fourth stem': OC teng > MC teng

酉 'tenth branch': OC (rə?)lu' > MC yu'.)

Next: Why only two vowels? No, wait ... before I get to that, look at the OC reconstruction of the 'tenth branch' again. Assume that Pulleyblank's 1979 reconstruction of twenty-two consonants for OC is correct. What's odd about l wəw'?

A CHINESE ABC? Y OR N?

The answer is 'N' if you assume my reconstructions of the heavenly stems and the twelve earthly branches are correct. But what if they're wrong? Even if you used the reconstruction of any other scholar besides Pulleyblank (Karlgren [the old school], Starostin [the Russian school], Baxter/Sagart [the Western school]), you'd get a similar result: two or more stems/branches sharing the same initial, and not enough initials to cover the whole of Old Chinese.

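Reconstructions of OC initials (stems in upper case, branches in lower case)

Stem / branch | Karlgren code | Karlgren 1957 | Starostin | Baxter / Sagart-style reconstruction | AMR 2006 | Pulleyblank 1979 | Pulleyblank 1991
I. 甲 | 629a | k- | k-r- | k-r- | k-r- | ' w- | k w-
II. 乙 | 505a | '-y- | '-r- | '-r- | '-r- | ' y- | y-
III. 丙 | 757a | p-y- | p-r- | p-r- | p-r- | p w- | p-
IV. 丁 | 833a | t- | t- | t- | t- | t y- | t-
V. 戊 | 1231a | m-y- | m- | m- | m- ~ m- | n w- | m-
VI. 己 | 953a | k-y- | k-l- | k-(l)- | k-l- | k w- | x-
VII. 庚 | 746a | k- | k-r- | k-r- | l-r-k-l- | k y- | k-
VIII. 辛 | 382a | s-y- | s- | s- | s- | s y- | s-
IX. 壬 | 667a | ñ-y- | n- | n- | n- or m-y- | m y- | n-
X. 癸 | 605a | k-y-w- | k w- | k w- | k w- | t w- | k -
i. 子 | 964a | ts-y- | ts- | ts- | ts-, s-t-, s-k-, s-p-? | p y- | k y-
ii. 丑 | 1076a | th-n-y- | s-n-r- | hn-r- | r-hn- (< r-s-n-) | h w- | x w-
iii. 寅 | 450a | d-y- | dl- | l- | ng...l- | n y- | ng y-
iv. 卯 | 1114a | m-l- | mh-r- | m-r- | m-r- | m w- | ng -
v. 辰 | 455a | d y-y- | d(h)- | d- | n-t- | G y- | G-
vi. 巳 | 967a | dz-y- | lh- | s-l- | s-l- | l y- | x y-
vii. 午 | 60a | ng- | ng(h)- | ng- | s...ng- | ng y- | ng-
viii. 未 | 531a | m-y-w- | m- | m- | m-(k)- | ng w- | ng w-
ix. 申 | 385a | sh-y- | s-l- | hl- | hl- (< s-l-) | h y- | x y-
x. 酉 | 1096a | z-y- | l- | l- | l- | l w- | ẅ-
xi. 戌 | 1257h | s-y- | s-wh- | s-m- | s-m- | s w- | x-
xii. 亥 | 937a | gh- | g(h)- | g- | ng-k- | G w- | w-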

( I don't have all of Baxter and/or Sagart's reconstructions for the 22 calendrical terms, so I have posted my guesses for the initials using their similar reconstruction systems. My system is a modification of theirs.)

(Consonants have been broken up with hyphens so that complex unit consonants [e.g., ts-] are distinguished from clusters [e.g., s-t-]. Hyphens do not necessarily correspond to morpheme boundaries. Secondary articulations are written as superscripts. ẅ (w with two dots) represents IPA [ɥ], the glide version of ü. Underlining represents pharyngealization.)

Pulleyblank's reconstructions are the only ones I know of in which all 22 calendrical terms have distinct single initials (which are the only possible initials in his OC reconstructions). All others have clusters and non-distinct initials (e.g., I have four to six terms beginning with s-) - though the precise clusters and initial overlaps vary from reconstruction to reconstruction. There is not a single branch or stem whose initial everyone agrees on (and I haven't even discussed vowels, final consonants, or tones!). This illustrates how much disagreement there is about the pronunciation of Old Chinese. (And there are many other styles of reconstruction out there!)

Pulleyblank's initials stand out from the rest. Not only do their points of articulation often not match the others (e.g., his Gy- and G- corresponding to others' dentals for branch v), but they frequently contain secondary articulations that may correspond to nothing in the others. His 1979 initials all have secondary articulations.


I wish I had brought EG Pulleyblank's 1975 paper introducing the notion of an Old Chinese alphabet. As if that weren't surprising enough, he also hypothesized that the OC alphabet might have been related to our own. I haven't read that paper in over a decade. Maybe I'll look at it again when I go back to Hawai'i and dig up my own copy. For now, I can only quote his 1979 paper at length (pp. 36-37):

Soon after I began to investigate the possibility of a phonetic interpretation of the kan-chih signs [i.e., for the ten heavenly stems [kan] and the twelve earthly branches [chih]], I was struck by many apparent resemblances between these signs and signs of the Phoenician alphabet, which also had exactly twenty-two signs. As I proceeded, I became convinced that such a high degree of similarity, involving both sound and shape, could not be the result of mere coincidence, however improbable it might seem on other grounds that there could be any connection between phenomena that were separated by the whole length of the Asian continent, even though they were more or less contemporary in time. In my public lecture at Stanford in 1975 which was entitled, "The Chinese Cyclical Signs and the Origins of the Alphabet", I pointed out these resemblances and attempted to outline a hypothesis to explain the common origin of the two writing systems.

Though the impression of similarity persisted and though my recent revisions in the sound values postulated for the kan-chih signs even seemed to improve the correspondence in certain respects, notably in suggesting a relationship between the orders of the kan-chih and the alphabet, certain difficulties have also persisted and increased as I have learned more about the early history of alphabetic writing in the Near East. One disquieting difficulty of a formal kind is that in some cases the kan-chih sign agrees better with a later than with an earlier western form ...

The number twenty-two, which is of the essence in the case of the kan-chih signs, appears to be more or less accidental in the case of Semitic, corresponding to the number of [consonantal] phonemes in Phoenician around the end of the second millennium. Other early Semitic alphabets, for example the South Arabic, contain more letters. Particularly significant is the Ugaritic cuneiform alphabetic writing of the fourteenth century BC, which has five more consonants than the Phoenician alphabet, but arranges the letters in an order which is basically the same as the later standard alphabet with a few variations. This suggests that there was already an early form of the Semitic alphabet in existence by that time which contained more than twenty-two letters.

I am regretfully forced to the conclusion that, remarkable as some of the similarities seem to be, they probably are a matter of coincidence and that it is best to proceed on that assumption rather than to speculate further about possible common origins. Much is unknown about the origins of the alphabet and even more about the beginnings of writing in China, but there seems to be little advantage to either side in connecting the two problems.

To abandon the proposed western connection does not weaken the case for a phonetic interpretation of the kan-chih signs, which is based on internal Chinese evidence and will receive its verification by its success or otherwise in furthering the analysis of the early Chinese script and the reconstruction of proto-Chinese phonology.

The characters for the stems that I listed yesterday - and the characters for the branches that I'll list tonight - may sometimes seem too complicated to have been used as letters, but complicated letters do exist (e.g., Thai ฐ for th) and the forms used in OC times were much simpler than the ones in my font: e.g., the first stem, now written 甲, used to be 十 (which looks like the modern graph for 'ten' - an ancient graph for 'ten' was simply a vertical line |). 壬, the ninth stem, looked like a capital I (with short horizontal lines on the top and bottom), and 癸, the tenth stem, looked like an X (rather than a compound of 癶 'trampling feet' and 天 'heaven'). Even the simpler characters looked different: e.g., the earliest form of 丁, the fourth stem, was a square 囗.

Here are the characters for the branches in both Chinese and Tangut:

Earthly branch | Associated animal | Sinograph | Old Chinese | Tangraph | Gong's reconstruction | Rhyme
i | rat | 子 | tsə' (ts- was a single consonant; a variant of 子 also represented earthly branch vi slə', so earthly branch i might have sounded similar: e.g., stə' or stlə' instead of tsə'?) | | |
iii | tiger | 寅 | ngələr (ng- was a single consonant) | | le | 2.7
vii | horse | 午 | nga' (ng- was a single consonant: a uvular nasal?); possibly disyllabic singa in some archaic [southern?] dialect (borrowed into the Tai language Ahom as shingaa; also cf. Viet ngựa 'horse' < late OC ngïa' < earlier OC singa'?; a native Viet word for 'horse' would have been t... < proto-Austroasiatic seh?) | | gyiy | 1.36
viii | sheep; also used for unrelated homophone 'not yet' | 未 | m(k?)əts < m 'not' + 旣 kəts 'already'? (Pulleyblank 1991: 59-60); is Hakka nguy 'not yet' < mkəts? | | myo | 2.44
ix | monkey | 申 | hlin (hl- was a single consonant) | | wyị | 1.67
xii | pig | 亥 | ngkə' (ng- was a single consonant: a uvular nasal?) | | gyu | 1.3

None of the Chinese branch characters are drawings of animals. Nor are their readings the regular names of those animals in Chinese. (I'll discuss the Tangut characters later. As usual, they are more complex than their Chinese counterparts.)

The data for the stems is in "A Seventh Sort of Sinograph?"

Now that I have listed my OC reconstructions of all 22 stems and branches, you may be better equipped to answer the questions:

- Why won't Pulleyblank's phonogram idea work with my reconstructions?

- How could Pulleyblank still salvage it?

Sorry if you were expecting the answers. But I thought providing the rest of the data might be preferable to giving answers based on less than half the data.


Djamouri (2006: 10) classified the characters for the cyclical terms of Chinese as semagraphs.

(It is not clear that he regards all 22 as semagraphic, since his cyclical examples

甲 丙 丁

Old Chinese krap, prang', teng

'first, third, fourth in a series of ten'

are drawn only from the set of ten heavenly stems; none are from the set of twelve earthly branches. The two sets of terms are paired to form the sixty stem-branch names of the sexagenary cycle.)
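The pairing works by advancing both sets in step, so only 60 of the 120 conceivable stem-branch combinations ever occur (60 is the least common multiple of 10 and 12). Here is a minimal sketch in Python; the names STEMS, BRANCHES and sexagenary_cycle are mine.

```python
STEMS = "甲乙丙丁戊己庚辛壬癸"          # ten heavenly stems, I-X
BRANCHES = "子丑寅卯辰巳午未申酉戌亥"    # twelve earthly branches, i-xii


def sexagenary_cycle():
    """Pair stem and branch in lockstep until both return to the start."""
    return [STEMS[i % 10] + BRANCHES[i % 12] for i in range(60)]


cycle = sexagenary_cycle()
print(cycle[0], cycle[1], "...", cycle[-1])
```

Because both indices advance together, a stem in an odd-numbered position only ever pairs with a branch in an odd-numbered position, which is why half of the combinations are never used.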

In any case, most of the cyclical characters appear not to be graphic compounds (and hence probably could not be huiyi or xingsheng compounds). If they originated as xiangxing or zhishi representational drawings (e.g., 戊 'fifth heavenly stem' looks like "a drawing of some kind of weapon" [Karlgren 1957: 315]), their cyclical usage must be jiajie. In any case, they are simpler than their Tangut counterparts:

Heavenly stem | Elemental association | Sinograph | Old Chinese | Tangraph | Gong's reconstruction | Rhyme
V | earth | 戊 | mus ~ mus | | we | 2.7
IX | water | 壬 | nəm (myəm?) | | ney | 2.30
X | water | 癸 | kwi' (kw- was a single consonant) | | duu | 1.5

(I will discuss the characters for the earthly branches in future posts. Their complexity in both Chinese and Tangut is comparable to the complexity of the heavenly stem characters.)

According to EG Pulleyblank (1991: 69), "there is no generally accepted explanation" for the sinographs of "the majority of stems and branches". He proposed (1991: 39)

that the twenty-two signs ... originated as phonograms; that is, as names of the consonants of the Chinese language at the time of the invention of the script.

This idea dates back to a 1975 paper of his and has been revised over the years. The latest revision I know of is from 1991. Without knowing anything about Old Chinese other than the reconstructions for the heavenly stems in the chart above, can you guess

- why the phonogram idea won't work

- and how Pulleyblank could still salvage it?


Djamouri (2006: 10) lists a nontraditional sixth sort of sinograph: the sign graph (his translation of Qiu Xigui's term 記號文字 Md jihao wenzi). I'll call them semagraphs (< Greek σημα sema 'sign'). Such characters are "abstract ... referring directly to specific words, without representing any object in a figurative or iconic way". I initially had a hard time distinguishing this category from the traditional zhishi category, but now I think I see the difference: zhishi characters are iconic, whereas semagraphs are nonrepresentational and arbitrary:

zhishi: 一 'one', 二 'two', 三 'three', 亖 'four' (rare)

semagraphs: 四 'four' (normal), 五 'five', 六 'six', 七 'seven', 八 'eight', 九 'nine', 十 'ten'

these do not consist of four, five, six lines, etc.

(Djamouri [2006: 11] cites 'five' through 'seven' as examples; I assume the others also fall into this category.)

Djamouri mentions the possibility that these characters might represent "digital gestures" (which themselves would still be nonrepresentational and arbitrary). Such an explanation would not account for the Tangut numeral characters:



They are much more complex than their Chinese counterparts. They are not pictures of anything. They are not iconic for the most part (though ONE and TWO seem to contain Chn 一 'one' and 二 'two'). Janhunen (1994) pointed out that they lack a common 'numeral' semantic element. The element that some (but not all) of them share looks like PERSON. Why is PERSON in some numeral graphs but not others? What is so PERSON-al about FOUR, SIX, or NINE? Why is EIGHT on the bottom of SEVEN? Like Janhunen, I think these tangraphs are phonetic. SEVEN (shyạ 1.64) sounds like EIGHT ('yar 1.82) plus a consonant sh- (represented by the top element of SEVEN?; unfortunately, this element is unique to SEVEN, though sh- was a common initial in Tangut). I think they are more like the English written forms one, two, three than the symbols 1, 2, 3 (or Chn 一 'one', 二 'two', 三 'three').

An alien who knew nothing about the Roman alphabet might guess that English seven and ten rhyme because they are both written with -en. Similarly, the Tangut A words for 'seven' and 'eight' nearly rhymed and were both written with a common element. The PERSON in the numeral tangraphs may have represented a common initial consonant or syllable in Tangut B: in left-hand position, it would be like the s of English six and seven rather than Chn 亻 'person'.

(There are Chinese precedents for writing numerals with 亻 'person'. Chn 五 'five' has a derivative 伍 'group of five' with 亻 'person' that can also represent 'five'. Chn 億 'hundred million' is written with 亻 'person' and 百 'hundred' and 千 'thousand' have variants with 亻 'person' (佰仟) used when writing checks [by analogy with 伍 'five' and 億 'hundred million'?]. However, it is still not clear why PERSON is in some but not all numeral tangraphs.)

Next: Beyond Djamouri's six sorts. Is there a category VII?

ÉCDUeR 5: THE SIX SORTS OF SINOGRAPHS: 形聲 XINGSHENG

The fifth sort of sinograph is 形聲 xingsheng, translated by Djamouri (2006: 10) as 'form and sound'. I prefer the nearly alliterative 'shape and sound'. Xingsheng characters are compounds of phonetic and semantic elements. The term xingsheng and another term for the same category, 諧聲 xiesheng, are themselves written with xingsheng / xiesheng characters (Md = Mandarin; OC = Old Chinese):

形 Md xing < OC Nkeng 'shape':

'arable land'* [Karlgren 1957: 213] (not a standalone character; phonetic element in graphs for syllables like keng; not to be confused with 开, the simplified form of 開 Md kai < OC khəy 'open' used in the PRC)

彡 OC rsam/rshlam?, rsam(')/rshlam(')? 'hair decoration' (used as a semantic element meaning 'pattern')

聲 Md sheng < OC skheng 'sound':

殸 Md qing < OC khengs 'musical stone'** (phonetic) +

耳 Md er < OC nəng' 'ear' (semantic)

for the loss of -ng, see Sagart (1999: 61-62)

諧 Md xie < OC Nkrəy 'in harmony':

言 Md yan < OC ngan 'speech'**** (semantic) +

皆 Md jie < OC krəy 'all'***** (phonetic)

I had originally intended to describe xingsheng characters as 'clarified jiajie (false loans)' since many would be jiajie characters (rebuses) if their semantic elements were removed: e.g.,

妣 OC pi', pi's 'ancestress' - 女 'woman' (semantic) = 比 OC pi' 'compare', pi's, Npi's 'follow'; also jiajie for OC pi', pi's 'ancestress' (phonetic)

The addition of 女 'woman' distinguishes 妣 OC pi', pi's 'ancestress' from 比 OC pi' 'compare', pi's, Npi's 'follow'.

However, some phonetic elements really aren't 'false loans' since they may represent semantically relevant, etymologically related words: e.g.,

聲 OC skheng 'sound'

謦 OC kheng' 'clear the throat'******

contain the phonetic element 殸 OC khengs 'musical stone' (rather than a phonetic element having nothing to do with sound). All three OC words appear to be derivatives of a root kheng 'sound' with different affixes (s-, -', -s).

(There is also variation in the root consonant: kh ~ kh [phonetically [qh]?]; perhaps uvulars became velars following the s- of skheng [followed by a high and/or front vowel?]?: s(i)kh- > s-kh-?)

Note that there is no correlation between graphic elements and affixes: e.g., 耳 'ear' does not signify the s- of 聲 OC skheng 'sound'. A morphologically complex word may have a simple graph, and an unaffixed root may have a complex xingsheng or huiyi graph.
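The structure of the examples above can be modeled as records with one semantic and one phonetic element each. Here is a minimal sketch in Python; the class Xingsheng, the list GRAPHS and the two functions are my own names, and the data are limited to four of the characters analyzed above.

```python
from collections import defaultdict
from typing import NamedTuple


class Xingsheng(NamedTuple):
    graph: str
    semantic: str
    phonetic: str
    oc: str       # Old Chinese reading(s) as given above
    gloss: str


GRAPHS = [
    Xingsheng("妣", "女", "比", "pi', pi's", "ancestress"),
    Xingsheng("聲", "耳", "殸", "skheng", "sound"),
    Xingsheng("謦", "言", "殸", "kheng'", "clear the throat"),
    Xingsheng("諧", "言", "皆", "Nkrəy", "in harmony"),
]


def by_phonetic(graphs):
    """Group graphs into phonetic series keyed by their phonetic element."""
    series = defaultdict(list)
    for g in graphs:
        series[g.phonetic].append(g.graph)
    return dict(series)


def strip_semantic(g):
    """Removing the semantic element leaves the bare phonetic (the rebus spelling)."""
    return g.phonetic


print(by_phonetic(GRAPHS))
print(strip_semantic(GRAPHS[0]))
```

Nothing in such a record encodes affixes, which matches the observation that the graphic elements do not correlate with s-, -' or -s.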

Djamouri (2006: 24) cites an even more striking case of related words written with xingsheng characters containing various semantic elements attached to a shared phonetic element representing a common root n-w-ng 'thick':

農 OC nung (< n-w-ng?) 'to be thick or dense'

(could also represent unrelated words 'to plant'; a name nung)

濃 OC rnong, nong (< n-ə/o-w-ng?) 'thick dew'

(with the semantic element 氵 'water')

襛 OC rnong, nong 'thick clothes'

(with the semantic element 衤 'clothes')

穠 / 檂 / 欁 OC rnong, nong 'dense vegetation'

(with the semantic elements 禾 'millet', 木 'wood' [once: 檂 or twice: 欁; 'surrounder' semantic elements (木...木) are uncommon])

醲 OC rnong, nong 'thick alcohol'

(with the semantic element 酉 'wine')

(note the homophony with 'thick dew'; two uses of the same word 'thick fluid' written differently for different referents?)

噥 OC nung 'strong taste'

(with the semantic element 口 'mouth')

(could also represent unrelated word nung 'mutter, murmur')

膿 / 癑 OC nung 'pus'

(with the semantic elements 月 'flesh', 疒 'illness')

(犭 + 農) Late OC noung, Næu, nau 'dog with thick fur'

(with the semantic element 犭 'dog')

(not attested in early OC; could have been from earlier OC nung, rnaw, naw?; why was the root reduced to n-w?)

齈 Middle Chinese nowng, nongh 'nasal catarrh'

(with the semantic element 鼻 'nose')

(not attested in OC; could have been from earlier OC nung, nongs?)

The vast majority of sinographs belong to the xingsheng category. With some exceptions, one can count on similar-looking graphs to have a common semantic domain or a similar pronunciation (in Old Chinese, but not necessarily in modern Chinese languages like Mandarin).

That is not the case in Tangut. Tangraphic series that appear to be like the n-w-ng series examined above (農 濃 襛 穠 檂 欁 醲 噥 膿 癑 齈 ...) may have no common denominator: e.g. (Grinstead 1972: 62),

TT4918 EXIST (Gong dyu 1.3)

TT3866 MANY (Gong 'yi 1.11)

looks like PERSON + EXIST

do not sound alike or have an obvious semantic relationship.

Compare that pair with its Chinese graphic equivalent:

有 侑

Md you < OC wə' 'exist'

Md you < OC wə's 'assist; encourage to drink; forgive'

(with the semantic element 亻 'person'; could also be written 佑 with a different phonetic element 右 OC wə' 'right')

Chn 有 'exist' and 侑 'assist' sound alike and are written nearly alike, whereas Tangut EXIST and MANY are written nearly alike even though they do not have a semantic or phonetic common denominator.

MANY cannot be easily explained as a huiyi (semantic compound) tangraph since PERSON + EXIST could imply INHABITED, not MANY. Tangraphic Sea analyzed MANY as

left of TT3539 HAVING (Gong lheew 2.41) +

all of TT4918 EXIST (Gong dyu 1.3)

which does not obviously imply MANY. Moreover, there is nothing in MANY that tells the reader that PERSON was taken from HAVING as opposed to one of hundreds of tangraphs with PERSON: e.g.,

TT4478 HOW (Gong lyọ 2.64)

TT4479 BLACK (Gong nyaa 1.21)

TT5365 DOG (calendrical term; Gong na 1.17)

sounds like BLACK but neither it nor BLACK sound like HOW

and how can one explain PERSON in these three graphs?

And EXIST in turn was analyzed as

(top) right of TT3866 MANY (Gong 'yi 1.11) +

bottom of TT0203 RICH (Gong lo [rhyme unknown])

which does not obviously imply EXIST.

I suspect that MANY - and many other tangraphs - could be crypto-xingsheng tangraphs consisting of semantic elements plus phonetic symbols for Tangut B readings: e.g.,

Tangraph | Tangut A reading | Tangut B reading
EXIST | dyu 1.3 | X
MANY | 'yi 1.11 | X, or something similar to X (the PERSON element on the right could be a silent semantic element of dubious worth, or it could represent a Tangut B syllable Y - if so, then MANY was YX whereas EXIST was X)

Next: Category VI: Sign graphs.

*Pulleyblank might reconstruct the similar standalone character 井 'well' as kyàngy' using his 1991 OC reconstruction and regard 井 'well' as phonetic in 形 'shape' (his akángy?).

For now I go along with the mainstream by reconstructing 井 'well' as tseng' and not regarding it as phonetic in 形 'shape' (Nkeng in my reconstruction), but I do find Pulleyblank's OC system tempting at times.

Wait ... 井 'well' could also be skeng' in my system, so maybe it is phonetic in 形 Nkeng 'shape' after all.

(In Middle Chinese, 井 'well' was tsyeng, and MC ts- could come from OC ts-, sk-, st-, or sp-; sk- and sp- probably shifted to st- which then metathesized to ts-.)


**殸 OC khengs 'musical stone'

"a drawing of a musical stone [声***] and a hand wielding a club [殳] for beating it" (Karlgren 1957: 220)

has a xingsheng variant

磬

with the semantic element 石 OC dyak 'stone' added. In turn, this variant 磬 could serve as a jiajie character for the unrelated, homophonous word OC khengs 'be visible; be like'.

***In the PRC and in Japan, 声 is used as an abbreviated form of 聲 'sound'.

****I have seen

言 OC ngan 'speech'

explained as lines representing what comes out of a mouth (口), but in reality this character is a jiajie recycling of a drawing representing OC ngan 'big flute' (a visible, drawable object, unlike speech). 言 'big flute' had other jiajie uses: it could also represent

OC ngan (ngar?) 'I, we'

cf. 吾 OC nga 'I', 我 OC ngay' (< ngal'?) 'we; (later) I'

OC ngan 'high and large'

(was 'big flute' an extended use of this word?: 'large (one)' > 'large flute'?)

OC ngən 'contented'


*****諧 OC Nkrəy 'in harmony'

is a prefixed form of 皆 OC krəy 'all; always, everywhere; in accord; complete, plentiful'. Other members of this word family are

偕 OC krəy (~ Nkrəy?) 'together; plentiful; numerous'

喈 OC krəy 'in unison'

also jiajie for the unrelated word OC (N)krəy 'cold', also written as 湝 with the semantic element 氵 'water'

and perhaps also

楷 OC krəy 'model'

(something one harmonizes with?)

揩 OC krəy 'wooden box, beaten to mark time in music?'

(again, something one harmonizes with?)

written with different semantic elements:

亻 'person', 口 'mouth', 木 'wood', 扌 'hand'

The only OC words written with the phonetic element 皆 OC krəy 'all' that don't belong to that family are

湝 OC (N)krəy 'cold'

階 OC krəy 'steps'

(that which has numerous parts in harmony? Too much of a stretch.)


******謦 OC kheng' 'clear the throat'

is a compound of

殸 OC khengs 'musical stone' (phonetic) +

言 OC ngan 'speech' (semantic; clearing the throat, like speaking, is a noise coming out of the mouth)