The English Wikipedia derives Spanish cerdo 'pig' from Latin seta 'bristle' and the Spanish Wikipedia derives it from Latin setula, a diminutive of seta. Are those folk etymologies? I see several problems:

1. Is hair really the most prominent feature of a pig?

2. Latin s- should remain s- and not become c-.

3. Latin -t- should become -d-, not -rd-.

4. Latin -tul- should become -j- or -ld-, not -rd-: cf.

viejo 'old person' < vetulus 'little old'

espalda 'back' < spatula 'broad, flat piece'

Seda 'silk' looks like the regular Spanish reflex of Latin seta.
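The regular development invoked in objection 3 can be sketched as a toy rule. This is a minimal illustration, assuming a single rule (Latin -t- between vowels > Spanish -d-) applied to plain ASCII spellings; the function name is hypothetical, and vita > vida is an extra example not mentioned above.

```python
import re

VOWELS = "aeiou"

def voice_intervocalic_t(word: str) -> str:
    """Toy rule: Latin -t- between vowels becomes Spanish -d-."""
    # Lookbehind/lookahead keep the flanking vowels untouched.
    return re.sub(rf"(?<=[{VOWELS}])t(?=[{VOWELS}])", "d", word)

for latin in ("seta", "vita"):
    print(latin, ">", voice_intervocalic_t(latin))
```

Under this rule alone seta comes out as seda; nothing in it produces an initial c- or an -rd- cluster.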

Steven Schwartzman derives cerda from Vulgar Latin *cirra 'a tuft of hair in an animal's mane'. But pigs don't have manes. And I would expect Spanish to retain *rr rather than shift it to rd: e.g., Latin carrus 'wagon' became Spanish carro, not cardo.

Might cerdo 'pig' have no Latin etymology? Might it be a borrowing from some substratal language?

WAS TANGUT 2WUQ1 'TO AID' BORROWED FROM CHINESE? (PART 2)

In my last post, I expressed doubts about


0645 2wuq1 'to aid'

being a borrowing from Tangut period northwestern Chinese (TPNWC) *3wu3 or an earlier form (e.g., Early Middle Chinese *wuʰ) on the basis of its initial: why would Chinese *w- be borrowed as Tangut w- [ʔw]? Tangut had no simple initial [w]; the two obvious choices for imitating Chinese *w- were v- and w- [ʔw]. Gong (2002) did not identify any instances of Tangut w- [ʔw] for what I reconstruct as Chinese *w- before -u-type rhymes, but I can't say that it would be impossible for a Tangut wu to be from a Chinese *wu.

I thought the tense vowel rhyme of 2wuq1 would even more strongly rule out a borrowing scenario. As Gong first proposed, Tangut tense vowels derive from an earlier preinitial which I write as *S.-: e.g.,


0359 1tuq1 < *S.toŋ 'thousand' (cf. Written Tibetan stong 'id.'; more cognates at STEDT)

I use capital *S.- to indicate the possibility of multiple sources of tenseness-triggering *S.-. In that particular case I think *S.- really was [s], but in the case of another numeral, I am not so sure:


0359 1soq1 < *S.sum 'three' (cf. Written Tibetan gsum 'id.'; more cognates at STEDT)

I suspect that *S.s- was originally *ks- which then merged with *ss- via a *xs-stage.

For now I reconstruct all Tangut tense vowel syllables as having *S.- in pre-Tangut. But perhaps I should reconsider given that Gong (2002: 425) identified six Chinese loanwords with tense vowels. Likely Chinese sources are in bold.

Li Fanwen #
Tangut period NW Chinese
Middle Chinese
𗐯 4719


to write
𗒨 4696


*mujʰ taste

*ɕiˀ arrow
𗄭 1941

to gather

Chinese had long ago lost *sC-clusters, so the tense vowels in the Tangut borrowings do not reflect a Chinese *s-.

At least two of those loans postdated Middle Chinese:

- 'to write' reflects the Chinese sound change *-jæ > *-e

- 'taste' reflects the Chinese sound change *mu- > *v-

('World' is ambiguous.) Am I to believe that a prefix *S.- was present as late as the turn of the millennium and added to those loans which almost immediately developed tense vowels? E.g.,

TPNWC *3ke2 > *2S.-ke2 > *2kke2 > *2kkeq2 > 2keq2

all in the space of about a century?
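The chain above can be walked through step by step. The following is only a toy sketch under stated assumptions: it works on the bare ASCII syllable "ke" (tone and grade digits omitted), and the four function names are hypothetical labels for the stages, not established terms.

```python
def add_preinitial(syl: str) -> str:
    """Stage 1: attach the hypothetical preinitial *S.-."""
    return "S." + syl

def assimilate_preinitial(syl: str) -> str:
    """Stage 2: *S.- assimilates to the following consonant (S.k > kk)."""
    if syl.startswith("S.") and len(syl) > 2:
        return syl[2] + syl[2:]
    return syl

def tense_after_geminate(syl: str) -> str:
    """Stage 3: a geminate initial triggers vowel tenseness (written -q)."""
    if len(syl) >= 2 and syl[0] == syl[1] and not syl.endswith("q"):
        return syl + "q"
    return syl

def degeminate(syl: str) -> str:
    """Stage 4: the geminate initial simplifies (kk > k)."""
    if len(syl) >= 2 and syl[0] == syl[1]:
        return syl[1:]
    return syl

def derive(syl: str) -> list[str]:
    """Return every stage of the hypothesized chain, starting from the loan."""
    stages = [syl]
    for step in (add_preinitial, assimilate_preinitial,
                 tense_after_geminate, degeminate):
        stages.append(step(stages[-1]))
    return stages

print(" > ".join(derive("ke")))
```

Four ordered changes have to apply in sequence for the loan to end up with a tense vowel, which is what makes the short time frame hard to accept.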

Three loans are early:

- 'cymbals' preserves¹ Middle Chinese *b-

- 'arrow' underwent the Tangut *-i > -y shift which seems to have postdated Middle Chinese; it may date from the late first millennium AD (see 'to gather' below)

- 'to gather' underwent that same shift and preserves¹ Middle Chinese *dz-. Compare with 'taste' which has a post-Middle Chinese initial but did not undergo the Tangut *-i > -y shift, a change that must have occurred before it was borrowed. The potential of using loanwords to date Tangut sound changes has yet to be fully explored.

But they are not so early that they would have had *sC-clusters that would become single consonants + tense vowels in Tangut.

I can think of five ways to deal with the problem of why those six loans have tense vowels.

1. They are unrelated native Tangut lookalikes that once had *S.-.

I'd buy this if I had internal etymologies for at least some of the six, but I don't.

2. They are the random byproducts of misperception.

But what in Chinese could sound like tense vowels to Tangut ears?

3. They are sporadic attempts to emulate Chinese phonetic features absent in Tangut.

It may not be a coincidence that all the loans are from Chinese words with tones 2-4 from final glottals or stops. The trouble is that the two late loans, 'to write' and 'taste', had no final glottals in Chinese by the time they were borrowed.

4. They acquired tense vowels by analogy with other words with tense vowels.

But which words would have been the models for analogy?

5. Perhaps 'taste' acquired a tense vowel by assimilating with


1079 2lenq3 'sweet' (this resembles lem-type words for 'sweet' in Sino-Tibetan, but a pre-Tangut *S.lem would have become lonq, not lenq.)

in the compounds

𘕉𗗘 𗗘𘕉

1viq3 2lenq3 and 2lenq3 1viq3, both 'sweet' (see Gong 2002: 352-353 for attestations).

That is, an earlier *1vi3 2lenq3/*2lenq3 1vi3 became 1viq3 2lenq3/2lenq3 1viq3, and 1viq3 retained a tense vowel even as an independent word.

I could then claim that 'world' acquired a tense vowel by assimilating with


0359 1soq1 'three'

in the phrase


1soq1 2keq2 'three worlds' (a calque of Chinese 三世 'three worlds' or Tibetan dus gsum 'three times': i.e., past, present, and future).

But I think that's pushing it, and I have no phrases to explain the tenseness in the other four loanwords.

Should 2wuq1 be added to that set of loanwords with anomalous tense vowels? Maybe.

¹It would be more precise to say "preserves the voicing of", since Middle Chinese voiced obstruents were oral, whereas they were borrowed into Tangut as prenasalized stops b- [mb] and dz- [ndz].

WAS TANGUT 2WUQ1 'TO AID' BORROWED FROM CHINESE? (PART 1)

In my last post, I remarked upon the similarity of Tangut


0645 2wuq1 < *Sʌ-ʔwə/oH 'to aid'

to the Sino-Korean reading 우 u for 祐 'to aid'. I considered and rejected the possibility that the Tangut and Chinese words were cognates: i.e., inherited from Proto-Sino-Tibetan.

But I didn't consider yet another possibility: could the Tangut word be a borrowing from Chinese? That would explain the similarity between 2wuq1 and Sino-Korean u: they were both borrowed from roughly contemporary varieties of Chinese. 2wuq1 looks like Edwin G. Pulleyblank's Early Middle Chinese 祐 *wuwʰ (= my *wuʰ) and Tangut period northwestern Chinese (TPNWC) *3wu3.

However, "looks" does not mean "sounds". My w- is [ʔw], not a true [w] like Pulleyblank's *w-. Middle Chinese *w- corresponds to Tangut v- ([v]? [ʋ]?), not w- [ʔw] in Gong's list of Chinese loans in Tangut (2002: 407-408):


0403 1von1 : 王 *wɨaŋ 'the surname Wang'


2340 1von1 : 旺 *wɨaŋʰ 'bright'

I wrote "corresponds" because 'Middle Chinese' is a Platonic entity distinct from whatever northwestern dialect the Tangut were in contact with.

On the other hand, Gong's list of Tangut transcriptions of Chinese in the Forest of Categories (2002: 436-437, 444-445) shows vacillation between v- and w- for Chinese *w-syllables (correspondence types A and E). That seems to imply that the Tangut lacked a simple initial [w]: they could only approximate a Chinese initial [w] with either v- ([v]? [ʋ]?) or w- [ʔw].

Homophones B chapter and homophone group
Li Fanwen number
Tangut reading
Tangut period NW Chinese
Early Middle Chinese
Correspondence type
𗍁  II 1

*wɨejʰ A: v- : w-
II 2


𘍵 II 3



II 9



𗍾  II 9



II 26


B: Ø- : w-
*1hun3 *wuŋ C: h- : w-
𗭴 VIII 5087
*wɨaŋ B: Ø- : w-
𗇝 VIII 4689

*wɨet D: yw- : w-
𗫖 VIII 2094

E: w- : w-
𗤭 VIII 3128


𗨂 VIII 3685

*wɨep B: Ø- : w-
VIII 3628
*wɨen F: gh- : w-
*2/3wen3 *wɨenˀ/ʰ

*wuŋ C: h- : w-

There are also four other types of correspondences:

B: Tangut Grade IV Ø-syllables may have begun with [j], and Chinese Grade III *w- may have become [ɥ], a glide absent from native Tangut words. (But see correspondence D below.)

C: Unique to transcriptions of 雄 *1hun3 (for †1wun3) which must have developed the same irregular fricative found through much of Chinese: e.g., Cantonese hoŋ and Mandarin xiong < *hjuŋ.

D: Tangut ywa [ɥa] is a special rhyme in the readings of only three characters:

𗇝 4689 1ywa4 'glittering'

𗇜 5014 1ywa4 'to go fast; quick' (only attested in the Tangraphic Sea dictionary)

𗮞 5099 1shywa3 'transcription character for Sanskrit śva'

The first two words may be borrowings from 'Tangut B', the non-Sino-Tibetan language that I think is the source of much Tangut vocabulary and possibly even reflected in the structure of the more obscure characters.

F: Tangut ghw [ɣw] might have been an attempt to approximate Chinese [w] without the initial stop of Tangut w- [ʔw]. ghw- is from Gong's reconstruction; it corresponds to w- in Sofronov and Nishida's reconstructions converted into my notation. If Sofronov and Nishida are right, the use of 3628 is simply another instance of correspondence E.
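The six correspondence labels can be collected into a small lookup. This is a minimal sketch that encodes only the labels as given above (Tangut initial : Chinese *w-); the function name and the "unclassified" fallback are assumptions made for illustration.

```python
# Tangut initial (without the trailing hyphen) -> correspondence type,
# all for Chinese *w-.
CORRESPONDENCE_TYPES = {
    "v": "A",
    "Ø": "B",   # zero initial
    "h": "C",
    "yw": "D",
    "w": "E",   # phonetically [ʔw]
    "gh": "F",  # Gong's reconstruction
}

def correspondence_type(tangut_initial: str) -> str:
    """Return the type label for a Tangut initial transcribing Chinese *w-."""
    key = tangut_initial.rstrip("-")
    return CORRESPONDENCE_TYPES.get(key, "unclassified")

for initial in ("v-", "w-", "gh-"):
    print(initial, ":", "w-", "=> type", correspondence_type(initial))
```

If Sofronov and Nishida's w- is preferred over Gong's ghw-, the "gh" entry would simply map to "E" instead of "F".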

Given that TPNWC 右 *2wu3, the phonetic of TPNWC 祐 *3wu3, was transcribed in Tangut as both 1vi3 and 2ew4, I would expect TPNWC 祐 *3wu3 (or an earlier Early Middle Chinese *wuʰ) to have been borrowed as †1vi3 or †2ew4 with initial †v- or †Ø-, not 2wuq1 with initial w-. However, the existence of correspondence pattern E (Tangut w- [ʔw] : Chinese *w-) weakens an initial-based argument against a borrowing scenario. Note, however, that pattern E is not attested with the rhyme type of 右 and 祐. That may suggest that w- [ʔw] was inappropriate for 右 and 祐 even though it was appropriate for TPNWC 雲 *1wun3 and 員 *1wen3. TPNWC *w- could have had different allophones before different rhymes.

As I will explain in part 2, I think the rhyme of


0645 2wuq1 'to aid'

may even more strongly rule out a borrowing scenario.

