Amaravati: Abode of Amritas

06.10.21.23:59: SINITIC SANS SIBILANTS? (PART 1)

... or, strictly speaking, alveolar sibilants.

In "The nu-clear Solution", I tried to explain why the Old Chinese (OC) alveolar sibilants s-, z-, ts-, tsh-, dz- did not become palatals like the OC dentals t-, th-, d-, n-, hn-. I proposed that the consonants previously reconstructed as alveolars might have been palatal shibilants (stage 1 below) that shifted to alveolar sibilants (stage 2), leaving a gap to be filled by new palatals derived from old dentals (stage 3). The gap left by the departed dentals was then filled by 'deemphasized' dentals (stage 4). (The chart below is simplified.)

OC consonant class	Dentals		Alveolars		Palatals
OC consonant class	Nonemphatic	Emphatic	Nonemphatic	Emphatic	Nonemphatic	Emphatic
Stage 1	t-, th-, d-, n-, hn-	t-, th-, d-, n-, hn-	none	none	sh-, ch-, chh-, j-	sh-, ch-, chh-, j-
Stage 2	t-, th-, d-, n-, hn-	t-, th-, d-, n-, hn-	s-, ts-, tsh-, dz-	s-, ts-, tsh-, dz-	became nonemphatic alveolars	became emphatic alveolars
Stage 3	became nonemphatic palatals	t-, th-, d-, n-, hn-	s-, ts-, tsh-, dz-	s-, ts-, tsh-, dz-	ch-, chh-, j-, ñ-, sh- (from old nonemphatic dentals)	none
Stage 4 (loss of emphatics)	t-, th-, d-, n- (from old emphatic dentals)	became nonemphatic dentals	s-, ts-, tsh-, dz- (partly from old emphatic alveolars)	became nonemphatic alveolars	ch-, chh-, j-, ñ-, sh-	none
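
The four stages in the chart can be modeled as ordered mappings over consonant classes. What follows is only a minimal Python sketch under stated assumptions: the (place, emphatic) tuples and the names SHIFTS, apply_shift, and history are hypothetical conveniences, and individual consonants are collapsed into whole classes.

```python
# Each consonant class is written (place, emphatic). This notation is an
# assumption made for illustration; it is not the chart's own notation.
SHIFTS = [
    # Stage 1 -> 2: palatal shibilants become alveolar sibilants.
    {("palatal", False): ("alveolar", False),
     ("palatal", True): ("alveolar", True)},
    # Stage 2 -> 3: nonemphatic dentals fill the empty palatal slot.
    {("dental", False): ("palatal", False)},
    # Stage 3 -> 4: emphatics lose their emphasis.
    {("dental", True): ("dental", False),
     ("alveolar", True): ("alveolar", False)},
]


def apply_shift(inventory, shift):
    """Apply one stage to a dict of {stage-1 class: current class}."""
    return {origin: shift.get(current, current)
            for origin, current in inventory.items()}


stage1 = [("dental", False), ("dental", True),
          ("palatal", False), ("palatal", True)]

# history[0] is stage 1, history[3] is stage 4.
history = [{c: c for c in stage1}]
for shift in SHIFTS:
    history.append(apply_shift(history[-1], shift))

for number, inventory in enumerate(history, start=1):
    print("Stage", number, sorted(set(inventory.values())))
```

Tracking each class by its stage 1 origin makes it easy to see which later dentals, alveolars, and palatals descend from which earlier ones.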

This is a bad idea for several reasons. First (and hardly foremost), I have never heard of emphatic palatals. Maybe they should be reconstructed as emphatic alveolars:

OC consonant class	Dentals		Alveolars		Palatals
OC consonant class	Nonemphatic	Emphatic	Nonemphatic	Emphatic	Nonemphatic only
Stage 1 (revised)	t-, th-, d-, n-, hn-	t-, th-, d-, n-, hn-	none	s-, ts-, tsh-, dz-	sh-, ch-, chh-, j-

Second, even if the revised chart above were correct, I know of no language that has emphatic alveolars (e.g., an emphatic s) but no nonemphatic alveolars (e.g., a plain s). If a language has to have one s, that s will be a nonemphatic s.

One could say, 'ah, but that's why the palatals shifted to alveolars - to fill the gap!', but that doesn't explain why the gap would have arisen in the first place*.

Next: A simpler solution.

*06.10.22.1:22: Middle Vietnamese (the stage of the language more or less preserved in Vietnamese spelling) had such a gap - and the reason for that gap is known:

Early Vietnamese had an s- (stage 1):

Consonant class	Dentals	Alveolars	Retroflexes	Palatals
Stage 1	t- ...	s- ...	none	ch-, chh- ...

However, t- became the implosive alveolar D-, leaving the language without any t- (stage 2):

Consonant class	Dentals	Alveolars	Retroflexes	Palatals
Stage 2	(no t)	D-, s- ...	(Sh-?, T-?)	ch-, chh-?/sh-? ...

s- became the new t- (stage 3):

Consonant class	Dentals	Alveolars	Retroflexes	Palatals
Stage 3	t ...	D- (no s-)	(Sh-?, T-?)	ch-, chh-?/sh-? ...

At some point after stage 1, retroflexes developed from earlier clusters:

consonant + -r- > Sh-

consonant + -l- > T-

and the affricate chh- lost its stop element, becoming sh-.

Although there was an s in Middle Vietnamese romanization, that symbol represented a retroflex [Sh] (IPA [ʂ]), not an alveolar s:

Consonant class	Dentals	Alveolars	Retroflexes	Palatals
Stage 4	t ...	D- (no s-)	Sh-, T-	ch-, sh- ...
Romanization	t ...	đ- ...	s-, tr-	ch-, x-

(I have read that this use of s reflects the Portuguese use of s for [sh]. However, the [sh]-pronunciation of Portuguese s does not occur in initial position, whereas Vietnamese s- occurs only in syllable-initial position. Moreover, I do not know whether the [sh]-pronunciation of Portuguese s existed in the seventeenth century when Vietnamese romanization was first devised. The use of x- for palatal [sh] (IPA [ɕ]) and nh- for palatal [ñ] (IPA [ɲ]) definitely reflects Portuguese influence.)

In (most? all?) dialects of modern Vietnamese, [sh] has shifted to [s] (but is still spelled x-):

Consonant class	Dentals	Alveolars	Retroflexes	Palatals
Hanoi	t ...	D-, s-	(Sh- became s-, T- became ch-)	ch- ... (sh- became s-)
Saigon	t ...	D-, s-	Sh-, T-	ch- ... (sh- became s-)
Romanization (same for both dialects)	t ...	đ-, x- ... (and s- in Hanoi; Hanoi [s] < earlier retroflex [Sh] and palatal [sh])	s-, tr- (Saigon only)	ch- ... (and tr- in Hanoi)
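
The Vietnamese changes described above can be sketched as ordered rules applied to initials. This is a rough Python illustration, not a full account of Vietnamese historical phonology: the names RULES, MODERN, evolve, and reflex are hypothetical, "Cr" and "Cl" are placeholder symbols for 'consonant + -r-' and 'consonant + -l-', and placing the cluster changes in the stage 4 step is an assumption (the text above says only that they happened at some point after stage 1).

```python
# Ordered changes from Early to Middle Vietnamese, as described in the text.
RULES = [
    ("stage 2", {"t": "D"}),                  # t- becomes implosive D-
    ("stage 3", {"s": "t"}),                  # s- becomes the new t-
    ("stage 4", {"chh": "sh",                 # chh- loses its stop element
                 "Cr": "Sh", "Cl": "T"}),     # clusters become retroflexes
]

# Later dialect-specific changes.
MODERN = {
    "Hanoi": {"sh": "s", "Sh": "s", "T": "ch"},
    "Saigon": {"sh": "s"},
}


def evolve(initial):
    """Run one Early Vietnamese initial through stages 2-4 in order."""
    for _, rule in RULES:
        initial = rule.get(initial, initial)
    return initial


def reflex(initial, dialect):
    """Return the modern reflex of an Early Vietnamese initial."""
    middle = evolve(initial)
    return MODERN[dialect].get(middle, middle)


early = ["t", "s", "ch", "chh", "Cr", "Cl"]
middle = [evolve(c) for c in early]
print("Middle Vietnamese:", middle)
for dialect in MODERN:
    print(dialect + ":", [reflex(c, dialect) for c in early])
```

Because the rules apply in order, the t- that comes from old s- in stage 3 is never fed back into the stage 2 change t- > D-. That ordering is what leaves Middle Vietnamese with a t- but no alveolar s-.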

Hence foreign words with [s] are spelled with x- in Vietnamese (Nguyễn Đình-Hoà 1966: 551):

xà-lách 'salad' (< Fr salade)
(-ach is pronounced [at] in southern Vietnamese)

xà-lông 'living room (furniture)' (< Fr salon)
xà-phòng 'soap' (< Fr savon)
(surprisingly not xà-vòng with -v-)

The spellings for these words were of course devised after the shift of palatal [sh] to [s]. They are not evidence for a [sh]-pronunciation of French s-words.

There are French [sh]-initial words which have been Vietnamized with x-:

xà-lan 'barge, scow' (< Fr chalande)
xà-lúp 'longboat' (< Fr chaloupe)

Were these borrowed when Vietnamese still had a palatal [sh]? I doubt it. Their x- was probably pronounced [s] at the time of borrowing.

French ch- [sh] has also been borrowed as Vietnamese s- (Hanoi alveolar [s], Saigon retroflex [Sh]) (Nguyễn Đình-Hoà 1966: 386):

săm 'rented room' (< Fr chambre)

06.10.20.2:45: THE NU-CLEAR SOLUTION

Here's how I would reconstruct the initials of the 丑 'second branch' phonetic series (Karlgren 1076). This solution is largely based on the proposals of Laurent Sagart. The reconstruction of Old Chinese (OC) r- in initial rather than medial position is due to the influence of Zev Handel's "Rethinking the medials of Old Chinese: Where are the r's?"

Sinograph	Old Chinese	Middle Chinese
丑杻/杽	Initial 1: rhnu' (hn is a voiceless nasal, not an h-n cluster)	Thu' (Th = aspirated retroflex stop)
杻狃紐	Initial 2: rnu'	Nu' (N = retroflex nasal)
忸	Initial 2: rnuk	Nuk (N = retroflex nasal)
羞	Initial 3: snu	su
手	Initial 4: hnu'	shu' (sh = palatal fricative, not s + h)

The OC reconstructions all share a common core nu which may be followed by a back consonant (a velar stop k or a glottal stop '). This does not mean that they all share a common root (symbolized below by √). Two pairs of words appear to be related:

手 hnu' (< s-nu'?) 'hand' and 杽/杻 rnu' (< rə-nu'?) 'manacles'
< √nu' 'hand'
(Could 丑 1076a-d rhnu' 'second branch' have originally represented a hand [hnu']? The early clawlike graphs in Karlgren [1957: 277] resemble the early forms of 又 995a wə', a drawing of a right hand. 丑 might have been a jiajie [phonetic loan] for 'second branch', just as 又 was used to write the unrelated, homophonous, abstract word wə' 'also'.)
(I just checked Sagart [1999: 155], who mentioned that Unger [1995] also noticed the similarity between early forms of 丑 'second branch' and 又 'also'. Unger cited another meaning of 狃, glossed by Karlgren as 'be familiar with': 'animal's fingers' from the 爾雅 Erya.)
羞 snu (< sə-nu?) 'shame' and 忸 rnuk (< rə-nu-k?) 'ashamed'
< √nu 'shame'

The other words are presumably not related. It is not clear whether the consonants surrounding their 'nu-clear' cores are affixes or part of the root: e.g., was 丑 1076a-d rhnu' 'second branch'

< r- + √hnu + -' (or even < r- + s- + √nu + -') or
< √rhnu'?

I favor the latter interpretation until I can identify other words with similar meanings containing a similar phonetic core.

The change of 羞 OC snu 'shame' to Middle Chinese (MC) su is not surprising, but the other OC initials have undergone more complex changes:

OC r- + dental clusters > MC retroflexes (r-flavored consonants):
Initial 1: rhn- > (metathesis) hnr- > (denasalization) htr = thr (ht/th represents a voiceless aspirated dental stop) > Th (voiceless aspirated retroflex stop)
Initial 2: rn- > (metathesis) nr- > N- (retroflex nasal)

(Why did OC hn- denasalize before non-nasal r, unlike n? Was this because hnr- was harder to pronounce than nr-? But what about the denasalization of OC hn- before a vowel without an intervening -r-: e.g.,
Nonemphatic hn- > sh-:
恕 94t 'generous': OC hnas > MC shuh
[phonetic is 女 OC rna' 'woman' > MC Nïə']
Emphatic hn- > th-:
嘆 152a / 歎 152c 'sigh': OC hnan(s) > MC thanh
[phonetic also in 難 152d 'difficult' OC nan, 'difficulty' OC nans > MC nan, nanh])
Clusters like rn- are attested in Written Tibetan whose spelling reflects earlier phonology: e.g., rna 'ear' (borrowed from 耳 OC nəng' 'ear' after it lost its final nasal and with a prefix r- added?).

OC dentals > MC palatals:
Initial 4: hn- > hñ- > hỹ- > hy- > sh-
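
The derivations of the four initials can be written down as chains of intermediate stages and applied to the reconstructed forms. This is a small Python sketch with hypothetical names (CHAINS, to_mc). It covers only the nonemphatic initials of this series, and the single-step chain for sn- is an assumption, since no intermediate stages are given for it.

```python
# Each OC initial maps to its chain of stages, ending in the MC initial.
CHAINS = {
    "rhn": ["rhn", "hnr", "thr", "Th"],      # metathesis, denasalization, retroflexion
    "rn": ["rn", "nr", "N"],                 # metathesis, retroflexion
    "sn": ["sn", "s"],                       # assumed single step
    "hn": ["hn", "hñ", "hỹ", "hy", "sh"],    # palatalization (nonemphatic only)
}


def to_mc(oc_form):
    """Replace the OC initial of a form with the last stage of its chain."""
    # Try longer initials first so that a longer cluster is never misread.
    for initial in sorted(CHAINS, key=len, reverse=True):
        if oc_form.startswith(initial):
            return CHAINS[initial][-1] + oc_form[len(initial):]
    raise ValueError("no chain for " + oc_form)


for oc_form in ["rhnu'", "rnu'", "rnuk", "snu", "hnu'"]:
    initial = next(i for i in sorted(CHAINS, key=len, reverse=True)
                   if oc_form.startswith(i))
    print(oc_form, ":", " > ".join(CHAINS[initial]), "=>", to_mc(oc_form))
```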

I don't know why OC alveolar sibilants did not palatalize: i.e.,
OC s-, ts-, tsh-, dz-
did not become
MC sh-, ch-, chh-, j-

whereas the dentals
OC t-, th-, d-, n-
did become
MC ch-, chh-, j-, ñ-

Perhaps the OC sibilants were actually palatal shibilants that underwent a chain shift:
Early OC sh-, ch-, chh-, j- > Later OC s-, ts-, tsh-, dz-

leaving the sh-, ch-, chh-, j- slots open to be filled by
Early OC hn-, t-, th-, d- > Later OC sh-, ch-, chh-, j-

(I wanted to briefly discuss the pros and cons of this hypothesis, but I fell asleep. More later.)

06.10.18.7:59: MUDDLING THRU' PART THREE

I wanted to call part two "Second Thoughts", but I also wanted to plant a hint. I may use that title if I ever get around to blogging about the second stem (which alliterates - second branch doesn't).

I asked,

All of these words (MC Thu', Nu', su, Nuk, shu') presumably had similar initials in OC. Can you guess what those initials might have been?

One could claim 'they all had initial ThNssh- and different consonants fell off' but

1. That's a bizarre four-consonant cluster. Having four consonants in a row is not strange (e.g., Written Tibetan brgyad 'eight'), but having, for instance, an s next to a palatal sh is. Reordering the four consonants won't help, as I don't know of any language with a cluster containing Th, N, s, and sh.

2. Deriving all those MC words with different initials from OC words with a single cluster would violate the principle of regular sound change. In general, an earlier form A will not mutate at random into later forms X, Y, Z, etc.:

ThNssh- > Th-, N-, s-, sh-

Four different later forms are assumed to have come from four different earlier forms:

Mystery OC initial 1 > MC initial Th-
Mystery OC initial 2 > MC initial N-
Mystery OC initial 3 > MC initial s-
Mystery OC initial 4 > MC initial sh-

Thus MC Thu', Nu', su, Nuk, shu' could not have come from a single OC form; they must reflect five different OC forms that were similar enough to be written with the same phonetic element 丑 'second branch/ox' in most cases: 丑杻狃紐羞忸. (手 MC shu' 'hand' is the sole exception.) What were those five forms?
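
The regularity argument can be restated as a check that each earlier form has exactly one later outcome. The Python sketch below is only an illustration: the function name irregular_sources and the labels "initial 1" to "initial 4" are hypothetical, and the check ignores conditioning environments, which can legitimately split one earlier sound into several later ones.

```python
from collections import defaultdict


def irregular_sources(correspondences):
    """Return the earlier forms that are claimed to have several outcomes.

    correspondences is a list of (earlier, later) pairs.
    """
    outcomes = defaultdict(set)
    for earlier, later in correspondences:
        outcomes[earlier].add(later)
    return {earlier: sorted(later_forms)
            for earlier, later_forms in outcomes.items()
            if len(later_forms) > 1}


# Hypothesis A: one giant cluster with random loss of consonants.
single_cluster = [("ThNssh", "Th"), ("ThNssh", "N"),
                  ("ThNssh", "s"), ("ThNssh", "sh")]

# Hypothesis B: four distinct (still unknown) OC initials.
four_initials = [("initial 1", "Th"), ("initial 2", "N"),
                 ("initial 3", "s"), ("initial 4", "sh")]

print(irregular_sources(single_cluster))
print(irregular_sources(four_initials))
```

Hypothesis A is flagged because one source has four outcomes; hypothesis B passes because every source has exactly one.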

All the MC forms have the vowel u, so we can assume their OC predecessors also had the vowel u. (This may not necessarily be true, but let's not complicate things unless the evidence compels us to.)

We can also project the MC final consonants -' and -k back to OC. Apparently 丑 was used as a phonetic to write words ending in -u' ~ -uk. A final glottal stop sounds like a -k, so this is reasonable. But why would an open syllable -u (in 羞 'shame') also be written with 丑? Karlgren and Gong would say that this syllable had lost an earlier final -g. So one could then say that 丑 represented syllables ending in -u' ~ -uk ~ -ug. However, most Chinese historical linguists today would not reconstruct a -g in 羞 'shame', for reasons I won't go into.

I was going to state what I thought the four mystery OC initials were, but I fell asleep in the middle of writing this. So I'll give you an extra day to fill in the blanks:

Sinograph	Old Chinese	Middle Chinese
丑杻/杽	Mystery initial 1: _u'	Thu' (Th = aspirated retroflex stop)
杻狃紐	Mystery initial 2: _u'	Nu' (N = retroflex nasal)
忸	Mystery initial 2: _uk	Nuk (N = retroflex nasal)
羞	Mystery initial 3: _u	su
手	Mystery initial 4: _u'	shu' (sh = palatal fricative, not s + h)

Note that an 'initial' can be a single consonant or a consonant cluster.

06.10.18.4:10: I NEED TO SNU-ZE

so I'll have to make this post very short.

Before returning to the problem of how 丑 'second branch; ox' could be phonetic in 丑+女 'love', let's look at its xiesheng series. The OC glosses and Mandarin readings are taken from Karlgren (1957: 277) and the Middle Chinese reconstructions are my own.

Karlgren code	Sinograph	Semantic element	Old Chinese gloss	Middle Chinese	Mandarin
1076a-d	丑	itself?	'cyclical character [second earthly branch; ox]'	Thu' (Th = aspirated retroflex stop)	chou
1076e	杻	木 'tree'	'a kind of tree'	Nu' (N = retroflex nasal)	niu [nyow]
1076f	狃	犭 'dog'	'be familiar with; treat with contempt; repeat, practise'	Nu'	niu [nyow]
1076g	紐	糸 'thread'	'to tie, knot'	Nu'	niu [nyow]
1076h	羞	羊 'sheep'	'nourish; viands; to present; diffidence; shame'	su	xiu [shyow]
1076k	忸	忄 'heart'	'ashamed; practise'	Nuk	nu (sic; now read as niu [nyow])

Also note that 杻 1076e 'a kind of tree' has another reading meaning 'manacles'. This reading is homophonous with 丑 1076a-d MC Thu', Md chou 'second branch/ox'. 杻 'manacles' has another graph

杽

with 手 1101a 'hand' as phonetic. Sagart thinks 杻 'manacles' is cognate to 手 MC shu', Md shou 'hand'. (MC sh- was a palatal fricative, whereas Md sh- is a retroflex fricative.)

All of these words (MC Thu', Nu', su, Nuk, shu') presumably had similar initials in OC. Can you guess what those initials might have been?

06.10.17.00:51: HAO WERE PHONETIC ELEMENTS CHOU-SEN?

That's one big unresolved question in Old Chinese phonology. The obvious, easy answer is, 'syllables were written with graphs for similar-sounding syllables'. That's indisputable, but the limits of 'similarity' are open to debate. Where did OC scribes draw the line between an acceptable and an unacceptable phonetic element? When did they think to themselves, 'That doesn't sound similar enough?'

Note my use of the plural. Chinese writing (sinography) was not invented by a single individual at a single point in time (whereas Tangut writing [tangraphy] might have been). Different writers might have had different ideas about what constituted phonetic similarity. And of course there was no uniform 'Old Chinese' across space and time*. A valid phonetic element for one writer at space/time point A would be invalid for another at space/time point B. William Boltz (2006: 66) wrote,

There are two formidable problems that bear on a claim that a given pronunciation [for a Chinese character] is "similar enough [to another pronunciation] to satisfy the demands of the xiesheng [phonetic compound, a.k.a. xingsheng] principle." The first is that our present knowledge of the sound system of Old Chinese is still very imperfect and fraught with uncertainties, notwithstanding the great progress both in factual detail and in methodological rigor that the field has enjoyed in the past three or four decades. The second is that it is likewise far from clear what the phonetic requirements or parameters were for a viable xiesheng series. Phrased another way, this means simply that we cannot say precisely how close two pronunciations must be to each other in order for them to be writable with the same characters (the basis in principle of a xiesheng series.) The phrase "similar enough" of the foregoing statement is not precisely specifiable at the present stage of our knowledge. Beyond this, it is surely a misrepresentation even to imply, as my formulation here does, that the same single Old Chinese language underlies all instances of jiajie [phonetic loan] usage and xiesheng characters that have made their way into the received orthography, and that there is a constant set of phonetic criteria that governs the formation of all xiesheng series. Both the sound system of the language and the phonetic structure of a xiesheng series surely varied over time [and space! - AMR], and since all characters and all xiesheng series were not created at the same moment [and point] in the linguistic history of Old Chinese, we are pursuing a phantom if we expect a precise phonetic description of each of these phenomena that will apply to these characters.

Without transchronal telepathy, we will never have the perfect answer to the title question. Nonetheless, trying to make generalizations about phonetic elements is still worthwhile, not only to better understand sinography, but also to better understand tangraphy. The Tangut must have confronted similar problems when they devised their script, and it is possible that they reinvented solutions devised by the Chinese two millennia earlier.

(One must, however, be careful about viewing tangraphy solely through a sinocentric lens, as parallels with other writing systems cannot be ruled out. We must not be blinded by a geographical bias. The Tangut did not have to merely imitate their neighbors. [And we must not forget that the Khitan as well as the Chinese were next door to the Tangut.])

To outline the borders of OC phonetic 'similarity', I am always on the lookout for characters with unexpected phonetic components. Pulleyblank's proposals for Old Chinese alphabets contain a number of such characters as well as some unusual phonetic glosses and word families. He uses all of this as evidence for his twenty-two consonant systems. Although I reject his systems, I cannot reject his evidence: it must be explained, one way or another. Here is one example of his that really bugs me at the moment:

好 'love' (now read Md hao), usually explained as a huiyi (semantic compound) character consisting of 女 'woman' plus 子 'child', has a variant 丑+女 with 丑 'second branch; ox' (now read Md chou) instead of 'child' (Pulleyblank 1991: 55). This can't be a huiyi character since 'second branch' and 'ox' have no known semantic connection with 'love'. So it must be a xiesheng character. But in my reconstruction system,

丑 'second branch; ox' was OC rhnu' (r-hn-u'; hn- is a voiceless nasal, not an h-n- cluster)
and 好 'love' was OC xu's

They belonged to similar rhyme categories (-u' and -u's), but their initials were very different. I assume that xiesheng character series tended to share a common core surrounded by varying peripheral elements (e.g., r- in rhnu'). In this case, the core definitely ended in -u', but it is hard to reconcile a voiceless dental nasal hn- with an 'emphatic' velar (uvular?) fricative x-.

Moreover, these Tai borrowings of the Chinese word for 'second branch' (Li 1945; cited in Pulleyblank 1979: 25, 1991: 55) have initials at a third point of articulation: labial!

Ahom plaaw
Lü paw

Dioi (aka Buyei) piaw

The other Tai names for the branches are clearly Chinese borrowings, and so I assume that these are also a borrowing despite the labial initials.

How can all this be reconciled?

Next: Second branch, second thoughts.

*The 'Old Chinese' reconstructions on this site are composite portraits embodying characteristics of diverse dialects. They are approximations that are surely unlike any particular variety spoken by anyone. But a 'correct' reconstruction should bear a strong resemblance to the real thing.

06.10.16.3:21: NAMING THE SYLLABLES OF TANGUT B

For almost a year now, I've been seriously entertaining the hypothesis that there were actually two Tangut languages, Tangut A and Tangut B. There is no direct evidence for Tangut B; its existence is something I have inferred from the structure of the Tangut script. There are many Tangut characters (tangraphs) which cannot be explained as semantic compounds or on the basis of Tangut (A) phonetics. I suspect these characters reflect the phonetics of Tangut B. I assume that Tangut writing (tangraphy) must have been phonocentric because it was extremely successful. There is much more written in Tangut than in the two other sinoform (Chinese-like) scripts of the period, Khitan and Jurchen. Although the success of a script is due to societal as well as linguistic factors, I think that tangraphy caught on because it was relatively easy to learn. It was certainly not something as simple as an alphabet or a syllabary since it contains semantic elements*. However, the question is whether tangraphy is semantocentric unlike any other known script or phonocentric like all other scripts.

Names can be used to discover the phonetic component of a script. Unless a name has an obvious concrete referent, it is difficult to draw and must be written phonetically: e.g., the Greek name Ptolemy (Ptolemaios) was spelled in Egyptian hieroglyphs as p-t-w-r-m-y-s. (The Wikipedia entry has p-t-o-l-m-i-i-s [i-i =y], but there was no letter o in Egyptian, and the letter r could represent foreign l.)

Tangraphy has many onomagraphic (< Greek onoma 'name') characters whose sole purpose is to transcribe names. For a long time I've assumed that these tangraphs would be the key to the script. Like all tangraphs, they have monosyllabic Tangut (A) readings. However, if the Tangut B hypothesis is correct, they may also have had polysyllabic Tangut B readings. What relationship was there, if any, between the Tangut A and B readings of onomagraphs? Tonight it occurred to me that Tangut A readings might sometimes be 'key' syllables extracted from polysyllabic Tangut B names. These readings would be similar to Chinese monosyllabic names extracted from polysyllabic Sinified foreign names: e.g.,

America > 阿美利加 Md Ameilijia > 美 Md Mei 'America'

Hence if an onomagraph consisted of components ABC representing Tangut B syllables X, Y, and Z, its Tangut A reading might have been similar to X, Y, or Z. (Not terribly constrained, but still better than no constraints on Tangut B readings at all!) If this was indeed the case, then we should expect recurring components with strong but not absolute tendencies to occur in similar-sounding onomagraphs: e.g., onomagraphs containing component A might tend to be pronounced like X. (I wrote "tend" because 'nonextracted' components would be visible but would not be phonetically reflected in Tangut A.)

Suppose that English were Tangut B, and some language preferring short words were Tangut A. The name Elizabeth would be written as ABCD (E-li-za-beth) reflecting English but might be pronounced as Bet in the other language. Component D (Beth/Bet) might occur in a graph DEF (Beth-a-ny) whose final component might occur in a graph FG (Ni-cole), etc.
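
The 'extraction' idea can be illustrated with a toy Python sketch. Everything here is hypothetical: the function names similar and extraction_candidates are invented, and 'similarity' is reduced to a crude prefix test on romanized syllables, which is far weaker than any real measure of phonetic similarity.

```python
def similar(component, reading):
    """Crude stand-in for phonetic similarity: one is a prefix of the other."""
    a, b = component.lower(), reading.lower()
    return a.startswith(b) or b.startswith(a)


def extraction_candidates(components, reading):
    """Return the component syllables that the short reading could reflect."""
    return [c for c in components if similar(c, reading)]


# Long names split into the syllables their components would represent.
elizabeth = ["E", "li", "za", "beth"]
america = ["A", "mei", "li", "jia"]   # 阿美利加 Md Ameilijia

print(extraction_candidates(elizabeth, "Bet"))
print(extraction_candidates(america, "Mei"))
```

If the hypothesis holds, running a check like this over many onomagraphs should show the same components turning up again and again as the candidate for similar-sounding short readings.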

This 'extraction' technique might also explain the complexity of tangraphs for Sanskrit syllables. A tangraph for, say, Skt va might be written with components representing Tangut B va, ja, ra (for Skt vajra 'thunderbolt', a common Buddhist term) and pronounced wa in Tangut A.

Next: How did the Tangut write Skt va?

*If Tangut B really existed, it is possible that elements previously assumed to be purely semantic may have been phonetic in Tangut B. Suppose that English were Tangut A and Vietnamese were Tangut B. Now suppose there is a character pronounced jail that looks like HOUSE+STONE. It's clearly a semantic compound, right? Well, it is ... sort of, because Vietnamese nhà đá, lit. 'house stone', means 'jail'. The structure of the character reflects a Vietnamese bimorphemic word, even though the character also has an unrelated monomorphemic English reading.

Similarly,

TT4106 RIDE (Gong dzeey 2.34) analyzed as
TT4105 PERSON (Gong dzywo 2.44) atop
TT5233 HORSE (Gong ryi^ry 1.74)

could have been a bimorphemic word in Tangut B composed of the Tangut B words for 'person' and 'horse' whose Tangut A equivalent was a(n unrelated?) monomorphemic word dzeey 2.34 'ride'.

(dzeey 2.34 'ride' does look like a potential fusion of dzywo 2.44 'person' and ryi^ry 1.74 'horse', implying that dzeey 2.34 'ride' might have been cognate to the polysyllabic Tangut B word for 'ride' [whose components were still transparent?]. However, other tangraphs of this type do not have seemingly 'fused' readings: e.g.,

TT4062 DESERT (Gong kha^r 1.80) analyzed as

left of TT4023 EARTH (Gong lyị̈ 2.61)
bottom of FIRE (Gong məə 1.31)

does not sound like lyəə [i.e., lyị̈ + məə].)

An actual script with many mismatches between character structure and morphemic structure is Japanese: e.g.,

鷄

'chicken'

is pronounced niwatori (its native Japanese reading), a compound of niwa 'garden' and tori 'bird'. But the graph is a compound of 奚 'slave/where/how' (not 庭 niwa 'garden') and 鳥 tori 'bird'. This reflects the fact that the character was originally intended to write an unrelated Old Chinese word ke 'chicken' which sounded like 奚 OC ngke 'slave'.

奚 OC ngke 'slave' was also used to write the unrelated homophonous word OC ngke 'where, how', probably related to

胡 OC ngka 'why'
何 OC ngkay 'what, why, how'
曷 OC ngkat 'what, why, how'
(the latter two look like ngk(w)a 'why' with suffixes; -t might be from 之 OC tə 'it')

OC ngk- would be the OC (non-cognate!) equivalent of English wh-. (How, like the wh-words, originated from Germanic hw-words.)

In a Pulleyblank-style reconstruction, the four words above would all begin with ăká-:

奚 OC ăkáy, 'where, how'

胡 OC ăkáG 'why'
何 OC ăkál 'what, why, how'
曷 OC ăkát 'what, why, how'

(The phonetic of 胡 OC ăkáG 'why' is 古 k^wáG' 'old (n.)'. I thought of reconstructing 胡 with -^w- but decided not to since there is no evidence for labials in its relatives. It might be possible that -^w- reflects some earlier labial prefix: e.g., in my reconstruction system: ^p/_m-ngka > mngka > ngk^wa or 'u-ngka > 'u-ngk^wa > ngk^wa.)

06.10.15.23:55: GLOTTAL GHOST AND PHANTOM PHARYNGEAL

In "Where's w'?" I presented EG Pulleyblank's 1991 proposal of twenty-two consonants (22C) in Old Chinese (OC) and asked what was odd about his reconstructions of

己 xə̀ẅ' 'sixth stem'
癸 k ^ẅəẅ' 'tenth branch'

鳥 k ^wyə́ẅ' 'bird'
久 k ^wə̀G' 'to be a long time'

古 k ^wáG' 'old (n.)'
舊 ăk ^wə̀G's 'old (adj.)'

Although ă looks like a vowel, it was actually Pulleyblank's (1991: 43) symbol for "a pharyngeal glide". It is not in his list of twenty-two OC consonants. Neither is the glottal stop (') which is the final or penultimate consonant in the six reconstructions above.

Since Pulleyblank hypothesized that the twenty-two OC consonants were represented by twenty-two phonograms with readings beginning with those consonants, one could explain the glottal stop's absence by assuming that the glottal stop never occurred in initial position. However, Pulleyblank (1991: 60) also hypothesized "that glottal stop was an automatic onset for vowels": i.e., it automatically occurred before a vowel without any other preceding consonant. So glottal stops did occur in initial position, just like his "voicing prefix" ă- (1991: 43).

Were ' (which was predictable and hence nonphonemic initially) and ă somehow considered to be 'less' worthy of inclusion than the other twenty-two?

The omission of glottal stop is understandable since most English speakers are probably unaware of the glottal stops at the beginning of English 'vowel-initial' words (though that perception could be an artifact of English spelling).

However, an initial pharyngeal glide ă- was obviously as salient as any other consonant, or else pairs like p- : ăp- (corresponding to p- : b- in later Chinese) would not have been distinct. Pulleyblank proposed that ă- is "probably cognate" to "Tibetan ha-chung and the Burmese prefix 'a-". Perhaps ă- was still a vowel (with an automatic glottal stop initial: ['a]?) at the time the twenty-two OC consonants were identified and became a glide after the stems and branches were established.

Yet another possibility is that the Chinese came up with the number of twenty-two by adding twelve (the number of months) and ten (the base of their numeral system) and assigned consonants to them on the basis of a preexisting arbitrary order (cf. the arbitrary order of the alphabet and the assignment of numerical values to letters in Hebrew). The inclusion of glottal stop and ă- would have increased the number to twenty-four, so they were left out*.

I don't actually think any of that happened. But the only way I can test Pulleyblank's hypothesis is to assume it is correct, and pursue its ramifications. Pulleyblank has been the primary influence on my thinking about Chinese over the past thirteen years, and I share his

attitude ... that one should be prepared to entertain provisionally even seemingly outlandish hypotheses provided one is very strict in testing them against counterevidence and in looking for alternative hypotheses that will account for the same facts. The great danger is in becoming so enamoured of one's own ideas and so unwilling to reconsider them that one tries at all costs to discount any possible counterevidence.

(1996 ms., pp. 10-11, printed in International Review of Chinese Linguistics 1.1)

*A crude analogy: Suppose that a language had twenty-six initial sounds that just so happened to correspond to the twenty-six letters of the Roman alphabet. Now suppose that speakers of that language wanted to map those sounds onto twenty-two calendrical terms. They would have to throw out four of the twenty-six initials. So they might get rid of e, i, o, u and keep a to represent vowel initials in general. Then they would assign the first ten sounds to the 'stems' -

a, b, c, d, f, g, h, j, k, l

- and the last twelve sounds to the 'branches':

m, n, p, q, r, s, t, v, w, x, y, z
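
The arithmetic of this analogy is easy to verify. A minimal Python sketch, with hypothetical variable names (dropped, kept, stems, branches):

```python
import string

# Twenty-six initials, one per letter of the Roman alphabet.
letters = string.ascii_lowercase

# Throw out e, i, o, u and let a stand for vowel initials in general.
dropped = set("eiou")
kept = [c for c in letters if c not in dropped]

# First ten go to the 'stems', the remaining twelve to the 'branches'.
stems, branches = kept[:10], kept[10:]

print(len(kept), "initials kept")
print("stems:   ", ", ".join(stems))
print("branches:", ", ".join(branches))
```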