Here are the remaining types of 白水村 Baishuicun (BSC) 'White Water Village' -ai forms that are not from non-*a vowels followed by nasals (see part 5).

4. 白 pai and : Middle Chinese *bæk

These look like loans from standard Mandarin bai [paj] and southwestern Mandarin pə. The 小學堂 Xiaoxuetang database does not list Hunan Mandarin forms, so the southwestern Mandarin forms in this post are from Guangxi to the west of Hunan.

5. 擲 tsai : Middle Chinese *ɖiek

I would expect *tsiə. A loanword? But the rhyme doesn't match southwestern Mandarin tsɿ. Maybe tsɿ was borrowed as *tsə whose schwa shifted to -ai. See the next class.

6. 而兒爾 ai : Middle Chinese *ɲɨ, *ɲie, *ɲieˀ

All three are ə in southwestern Mandarin. Perhaps all three were borrowed as *ə which later broke to -ai. That shift is parallel to the shift of Old Chinese *-əˁ to Mandarin -ai (see part 6 for examples of Mandarin -ai forms borrowed into BSC).

7. 日入 ai : Middle Chinese *ɲit, *ɲip < Old Chinese *nit, *nip; 栗 lai : Middle Chinese *lit

These are i, y, and li in southwestern Mandarin. All three once had a final glottal stop. Were they borrowed into BSC as *iʔ, *iʔ, and *liʔ whose *i broke to ai before a now-lost glottal stop? Other BSC forms which may be loans of -i forms without glottal stops do not end in -ai.

The first two have alternate forms.

BSC 入 y (if I am reading Xiaoxuetang correctly) looks like a recent direct loan from southwestern Mandarin.

BSC 日 ɲi is close to Middle Chinese *ɲit and may be an old loan postdating the palatalization of Old Chinese *n-.

BSC 日 na and 入 na may be native. Could their n- be a retention of Old Chinese *n-?

8. 睡 fai could be a borrowing from a southwestern Mandarin form resembling Lingui suei and Luorong suɐi. f- may be a sporadic simplification of *sw-. I cannot find any other examples of f- from sibilants in BSC.

When I wrote part 6, I thought there were eight classes of exceptions, but now I see ten. (I broke up one class into classes 4, 5, and 7. Although 4 and 5 both had *-k, 7 is completely dissimilar and its inclusion was a mistake.)

9. BSC 甫 pai : Middle Chinese *puoˀ

I have no idea why this form has a -i.

10. BSC 些 s-, l- + -ou, -iə, -ai, -əi : Middle Chinese *sjæ

I don't know for sure which initials go with which finals. I suppose the first form is sou and the last one is ləi. BSC -iə regularly corresponds to Middle Chinese *-jæ, so I guess the second form is siə. If the third form is lai, it cannot be related to the first two since BSC l- is not from *s-.

A DIP INTO WHITE WATERS (PART 7): AI-XCEPTION CLASSES 2-3

In part 5 I proposed the following chain shift in 白水村 Baishuicun (BSC) 'White Water Village':

*-VN > *-an > *-ai > *-oi > -o

*V was a non-*a vowel.

Most BSC -ai are from *-VN with the exception of eight types of forms.

The first type in part 6 was borrowed after the shifts took place.

There is only one example of the second type: 鏘 *tshɨaŋ 'jangling noise'. Onomatopoetic words may be exceptions to general developments.

There is also only one example of the third type: 黌 *ɣwæŋ 'school' whose coda may have later fronted to *-ɲ.

Types 2-3 may be borrowings postdating the shift of *-VN to *-an but predating the shift of *-an to *-ai:

Early pre-BSC
Late pre-BSC
(N/A since these are loans)
thai (tshai?)
*kæŋ > *kaɲ > *kan

The th of thai may be a typo for tshai in the 小學堂 Xiaoxuetang database, as it lists no other examples of BSC th- from *tsh-, and thai sounds even less like a jangling noise than tshai does.

鏘 and 黌 are low-frequency characters, so their BSC readings thai (tshai?) and xai may be literary borrowings without colloquial (i.e., native) equivalents *tshoi and *xoi.

<HACÑ(Ī)>

And now for a Thai-Islamic detour that will bring me back to 白水村 Baishuicun 'White Water Village' ...

Last night I found the Thai Wikipedia article for hajji which is titled หัจญี <hacñī> [hàtjiː]. It lists another form ฮัจญี <ɦacñī> [hátjiː]. According to thai-language.com, the 1982 edition of the Royal Dictionary lists two more forms:

หะยี <haḥyī> [hàʔjiː]

หัจญี <hacñī> [hàtjiː]

Although Thailand has a large Muslim population which is two-thirds Malay, I never looked at any Thai terminology for Islam or Thai transcriptions of Malay until now. The above forms made me wonder:

1. Is Thai terminology for Islam based on Malay: e.g., are [hàtjiː] etc. loans from Malay haji rather than directly from Arabic ḥajjī?

2. Thai has no consonant like Malay and Arabic j [dʒ]. I am accustomed to seeing English [dʒ] rendered as [j] or [tɕ]: e.g.,

เอเจนท์ <ʔecend̽> [ʔeːtɕên]

เอเจนต์ <ʔecent̽> [ʔeːtɕên]

เอเยนต์ <ʔeyent̽> [ʔeːjên]

(The falling tone of the second syllable is unwritten in those spellings. I assume the tone is falling on the basis of alternate spellings with a first tonal marker*, though I would expect a high tone in a final syllable that originally ended in a nasal-stop cluster in English.)

But I have never seen a foreign [dʒ] rendered as Thai จญ <cñ> [tj] before. Another instance of <cñ> is

ฮัจญ์ <ɦacñ̽> [hát] 'hajj' (with a silencer over <ñ>; syllable-final <c> is [t])

ญ <ñ> normally represents [j] from an earlier *ɲ. Why is it in transcriptions for a nonnasal consonant?

3. What principles underlie the choice of tones for Islamic/Malay loans in Thai? The first closed syllable of 'hajji' has both high and low tones (the two most common possibilities for closed syllables with short vowels), and the second open syllable has an unmarked mid tone (like some but not all English loans).

4. Who devised these Thai spellings? Were they Thai-Malay bilinguals? Did they know the Jawi script for Malay (which I briefly mentioned here)?

The Malay spoken in the Thai-Malaysia border region has a number of interesting phonetic characteristics that are apparently not reflected in Jawi spelling, which seems to be historical. The fronting of *aN to /ɛː/ reminds me of the *-an > *-ai shift from parts 2 and 5 of my series on Baishuicun; in both cases, a final nasal conditions the fronting of a preceding *a.

*The first tonal marker indicates a falling tone in a sonorant-final syllable when it is atop a *voiced consonant symbol such as ญ <ñ> or ย <y>. It indicates a low tone in such a syllable when it is atop an *implosive or *voiceless consonant symbol. The starred (i.e., reconstructed) qualities are not necessarily retained in modern Thai. *Voiced obstruents have devoiced, *voiceless sonorants have voiced, and the *implosives are no longer implosive: e.g.,

*ban¹ > [pʰân]

*an¹ > [màn]

*ɓan¹ > [bàn]

On the other hand, *voiced sonorants and *voiceless obstruents retain their original voicing qualities:

*man¹ > [mân]

*pan¹ > [pàn]

*an¹ > [àn]
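The rule in this footnote is mechanical enough to write down as a lookup. Here is a minimal Python sketch of my reading of it: the tone comes from the reconstructed class of the initial, while the pronunciation comes from its modern reflex. The class labels, the dictionary names, and the function name are hypothetical, and the inventory is limited to a few initials whose reflexes are stated in this footnote.

```python
# Tone assigned by the first tonal marker in a sonorant-final syllable,
# keyed by the *reconstructed* class of the initial consonant.
TONE_BY_CLASS = {
    "voiced": "falling",
    "voiceless": "low",
    "implosive": "low",
}

# (reconstructed class, modern pronunciation) for a few initials.
# Assumption: this small inventory is illustrative, not a full table.
INITIALS = {
    "*b": ("voiced", "pʰ"),    # voiced obstruent, now devoiced and aspirated
    "*m": ("voiced", "m"),     # voiced sonorant, voicing retained
    "*p": ("voiceless", "p"),  # voiceless obstruent, voicing retained
    "*ɓ": ("implosive", "b"),  # implosive, now a plain voiced stop
}


def reflex(initial):
    """Return (modern initial, tone) for a syllable of the shape *Can¹."""
    initial_class, modern = INITIALS[initial]
    return modern, TONE_BY_CLASS[initial_class]


if __name__ == "__main__":
    for initial in INITIALS:
        print(initial, "->", reflex(initial))
```

The point of keeping two separate tables is that the tone depends only on the older class, so a modern [b] can carry a low tone even though it is now voiced.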

A tonal split (*¹ > falling/low) compensates for the loss of voiced obstruents and voiceless sonorants. The implosives *ɓ and *ɗ have moved into the space vacated by original *b and *d (which have become [pʰ] and [tʰ]), but vowels following implosives still bear tones associated with *implosives.

HAJI, HAZHE, HAZHI

Sorry, I'm on another Sino-Islamic detour.

The Chinese Wikipedia article for hajji has three types of transcriptions (readings here are in Mandarin unless stated otherwise and tones are not included):

1. *velar-initial second syllable:

哈吉 haji [xatɕi], 阿吉 aji

These must postdate the recent palatalization of *k in Mandarin.

2. affricate-initial* open second syllable:

哈只 hazhi [xatʂr̩] (the transcription in the article title), 哈芝 / 哈指 / 哈治 / 哈志 hazhi

The second syllables of these transcriptions have different tones:

'yin level': 芝

'rising': 只指

'departing': 治志

3. affricate-initial *closed second syllable:

哈哲 hazhe [xatʂɤ] (Cantonese haazit [haːtsiːt])

Mandarin 哲 zhe has lost the *-t retained in Cantonese and has a 'yang level' tone in the standard language.

I have several questions:

1. What are forms for 'hajji' in the Chinese varieties spoken by the 回 Hui people?

2. Is there a standard tone class for the second syllable in 回 Hui speech (which is not to be confused with 徽州 Huizhou Chinese)?

3. How is 'hajji' written in the toneless Xiao'erjing script?

4. Do the spoken and written forms in the Hui community match the transcriptions of the non-Hui Chinese world?

5. What is the oldest known Chinese character transcription of 'hajji'? My guess is that the earliest transcriptions were of the hazhi type.

6. Wikipedia states that 哈哲 hazhe (Cantonese haazit) is the transcription used in Hong Kong and Macao. If this transcription was devised by a Cantonese speaker, why does its Cantonese reading have a -t corresponding to nothing in Arabic? If I set the Chinese Wikipedia page to display in Hong Kong or Macao complex characters, the title of the article is still 哈只 hazhi (Cantonese haazi [haːtsiː]) which is a better phonetic match in Cantonese.

*These affricates are not original either, but their affrication predates Islam and is not relevant.

AFANTI

I'm going to take a northern detour away from 白水村 Baishuicun 'White Water Village' to look at Mandarin 阿凡提 afanti 'effendi' (< Arabic afandī or the like). I've long had the impression that such Islamic loanwords were borrowed into Mandarin in recent centuries. However, afanti has an aspirated -t- [tʰ] which is a weak match for foreign -d-. The t- [tʰ] of Mandarin 提 is from *d-. Was afandī borrowed into a Chinese language that retained *d-? I doubt that for two reasons.

First, I would expect Islamic loans to be from the northwest, and Tangut transcription evidence indicates that *d- had become *tʰ- in the northwest by the early second millennium AD.

Second, a Chinese language retaining *d- in 提 would probably also have retained *v- in 凡. Was afandī borrowed as *avandi? I suppose one could try to evade this problem by proposing that this Chinese language devoiced *v- before *d-, so afandī was borrowed as *afandi. I have thought that the Chinese variety underlying Sino-Vietnamese (John Phan's 'Annamese Middle Chinese'; AMC) might have devoiced *v- before *d-*, but I am not sure. In any case, AMC could not have been the source of afanti for geographical and phonological reasons. 凡 had a final *-m in AMC that does not match the -n- of afandī.

By coincidence the ultimate Greek source of afandī is αὐθέντης <authéntēs> with a -t- corresponding to the -t- [tʰ] of Mandarin afanti. Obviously afandī postdates several changes in Greek:

- the shift of au to aβ

- the shift of aspirates to fricatives (tʰ > θ)

- the devoicing of β to -f- before voiceless consonants like θ

- the simplification of -fθ- to -f-

- the voicing of -t- to -d- after -n-

- the raising of -ē- to -i-
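The changes listed above can be run as an ordered series of rewrite rules. Here is a minimal Python sketch, not a serious model of Greek historical phonology: the romanization is simplified, the rule list and function name are hypothetical, and I am assuming the first change is au > aβ because the third change presupposes a β.

```python
import re
import unicodedata

# Ordered (pattern, replacement) pairs, one per change in the list above.
# Assumption: the first change is read as au > aβ; the set of voiceless
# consonants in the third rule is a simplification for illustration.
GREEK_CHANGES = [
    ("au", "aβ"),            # au > aβ (assumed reading)
    ("tʰ", "θ"),             # aspirate > fricative
    (r"β(?=[θptks])", "f"),  # β devoices to f before a voiceless consonant
    ("fθ", "f"),             # -fθ- simplifies to -f-
    ("nt", "nd"),            # -t- voices after -n-
    ("ē", "i"),              # ē raises to i
]


def derive(form):
    """Return the form after each change, starting with the input itself."""
    # Normalize so that precomposed and decomposed spellings of ē match.
    form = unicodedata.normalize("NFC", form)
    stages = [form]
    for pattern, replacement in GREEK_CHANGES:
        pattern = unicodedata.normalize("NFC", pattern)
        form = re.sub(pattern, replacement, form)
        stages.append(form)
    return stages


if __name__ == "__main__":
    print(" > ".join(derive("autʰentēs")))
```

Starting from autʰentēs, the last stage is afendis, which has the -f- and -nd- of afandī but not its second -a-, so the sketch covers only the Greek side of the story.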

So I'm back to where I started: why does afandī correspond to Mandarin afanti?

*My logic was that *v- patterned like *f- in Sino-Vietnamese (SV):

AMC | SV stage 1 | SV stage 2 | SV stage 3 | Modern SV
*pʰ- | *pʰ- | *pʰ- | *pʰ- | ph- [f]
*v- > *f- | *pʰ- | *pʰ- | *pʰ- | ph- [f]
*b- | *b- | *p- | *ɓ- | b- [ɓ]
*p- | *p- | *p- | *ɓ- | b- [ɓ]

Early Vietnamese had no *f-, so *pʰ- was the closest equivalent of AMC *f-.

If AMC still had *v-, I would expect it to correspond to SV b- < *b-. (I assume early Vietnamese had no *v-, and that modern v- is from *w-. There are no cases of Chinese *v- corresponding to Vietnamese v-, which leads me to believe that Vietnamese had no *v- at the time of borrowing and that the shift of *w- to v- postdates borrowing.)

I now think SV reflects a stage of AMC in which all voiced obstruents had been devoiced:

AMC | SV stage 1 | SV stage 2 | Modern SV
*pʰ- | *pʰ- | *pʰ- | ph- [f]
*v- > *f- | *pʰ- | *pʰ- | ph- [f]
*b- > *p- | *p- | *ɓ- | b- [ɓ]

Vietnamese spelling reflects the *pʰ-/ɓ-stage of the 17th century.

A DIP INTO WHITE WATERS (PART 6): AI-XCEPTION CLASS 1

About one-seventh of 白水村 Baishuicun (BSC) 'White Water Village' -ai forms cannot be traced back to rhymes with non-*a-vowels plus nasals at the left end of this chain from part 5:

*-VN > *-an > *-ai > *-oi > -o

I have classified those remaining forms into eight categories.

The first category consists of -ai forms from Old Chinese *-əˁ:

tai (borrowing layer 2), to (borrowing layer 1) < *dˁəˁ < *Cʌ-dəʔ or *Nʌ-təʔ

mai (borrowing layer 2) < *mˁəˁʔ < *Cʌ-məʔ

tai (borrowing layer 2), lo < *CV-tai (borrowing layer 1) < *Nʌ-tˁəˁsˁ < *Nʌ-təs

the tone of tai (but not lo!) indicates a voiced initial *d- which may be from *Nʌ-t-

pai (borrowing layer 2), (native) < *bˁəˁ < *Cʌ-bə or *Nʌ-pə

I think there are at least three layers in these forms.

is native and may directly reflect Old Chinese *-əˁ. BSC borrowed from prestige dialects whose *-əˁ developed a glide:

*-əˁ > *-əɰ > *-əj > *-aj

The first layer of borrowings predates the *-ai > -o shift in BSC and the second layer postdates it.
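The difference between the two borrowing layers can be sketched as a question of which BSC changes a loan was still around to undergo. Here is a minimal Python sketch; the function name and parameters are hypothetical, the input is a prestige-dialect form in -ai, and the t- > l- step after a presyllable is my reading of the derivation lo < *CV-tai above.

```python
def bsc_reading(form, layer, presyllable=False):
    """Derive a BSC reading from a borrowed prestige-dialect -ai form.

    layer 1: borrowed before the BSC shift of *-ai to -o
    layer 2: borrowed after that shift, so the form is unchanged
    presyllable: True if the loan carried a presyllable (*CV-) when borrowed
    """
    if layer == 2:
        return form
    if layer != 1:
        raise ValueError("layer must be 1 or 2")
    # Assumption: an initial t- lenites to l- after a presyllable,
    # and the presyllable itself is then lost.
    if presyllable and form.startswith("t"):
        form = "l" + form[1:]
    # Layer 1 loans undergo the shift of *-ai to -o.
    if form.endswith("ai"):
        form = form[:-2] + "o"
    return form


if __name__ == "__main__":
    print(bsc_reading("tai", layer=2))                    # stays tai
    print(bsc_reading("tai", layer=1))                    # becomes to
    print(bsc_reading("tai", layer=1, presyllable=True))  # becomes lo
```

This does not model tones, so it cannot capture the point that the tone of tai (but not lo) indicates a voiced initial.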

The first layer of borrowings also predates lenition and the loss of presyllables in BSC.

A DIP INTO WHITE WATERS (PART 5): A CH-*AI-N SHIFT?

The 小學堂 Xiaoxuetang database is back, so I can add a new link (in bold) to my 白水村 Baishuicun (BSC) 'White Water Village' chain shift from part 2:

*-VN > *-an > *-ai > *-oi > -o

Here are some sample words with composites of prestigious Early and Late Middle Chinese cognates for comparison:

Sinograph | Early Middle Chinese | Late Middle Chinese | Pre-BSC | BSC
 | *dəm | *dam | *tan | tai
 | *len | *lien | *lan | lai
 | *ʂɤan | *ʂæn | *san | sai
 | *khwan | *khwan | *khan | khai
 | *təŋ | *təŋ | *tan | tai
 | *lɨəŋ | *lɨəŋ | *lan | lai
 | *neŋ | *nieŋ | *nan | lai ~ nai
 | *tshɨm | *tshim | *tshan | tshai
 | *mun | *vun | *man | mai ~ uai
 | *mon | *mon | *man | mai
 | *kən | *kən | *kan | kai
 | *sin | *sin | *san | sai
 | *touŋ | *toŋ | *(CV-)tan | lai
 | *luoŋ | *lyoŋ | *lan | lai

Rhymes with non-*a-vowels plus nasals merged into *-an, which shifted to -ai after an earlier *-an shifted to *-oi and an earlier *-oi shifted to -o. Here is how that merger might have taken place:

Stage 1 | Stage 2 | Stage 3
*-in | *-en | *-an
*-ɨm | *-en or *-on or *-ən | *-an
*-on | *-on | *-an

In stage 1, pre-BSC had a vowel system resembling prestige EMC.

In stage 2, pre-BSC front vowels merged into *e before nasals and back vowels merged into *o before nasals. Central vowels could have merged into *e, *o, or *ə before nasals.

In stage 3, pre-BSC *-en and *-on merged into *-an. The *-n conditioned an *-i- that remained after the nasal was lost:

*-an > *-ain > -ai
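The three stages and the final breaking can be sketched as a small pipeline. This is a minimal Python sketch under several assumptions of mine: the sorting of vowels into front, back, and central sets, the merger of all nasal codas into -n at stage 2, and all function names are hypothetical. Since the outcome of central vowels at stage 2 is open, stage 2 returns a set of candidates rather than a single rhyme.

```python
FRONT = {"i", "e"}    # assumption: vowels treated as front
BACK = {"u", "o"}     # assumption: vowels treated as back
CENTRAL = {"ɨ", "ə"}  # stage-2 outcome left open in the text
NASALS = {"m", "n", "ŋ"}


def stage2(rhyme):
    """Return the set of possible stage-2 rhymes for a stage-1 rhyme."""
    vowel, coda = rhyme[:-1], rhyme[-1]
    if coda not in NASALS:
        raise ValueError("rhyme must end in a nasal")
    if vowel in FRONT:
        return {"en"}
    if vowel in BACK:
        return {"on"}
    if vowel in CENTRAL:
        return {"en", "on", "ən"}
    raise ValueError("only non-a vowels take part in this merger")


def stage3(candidates):
    """All stage-2 candidates merge into *-an."""
    return "an" if candidates else None


def breaking(rhyme):
    """Return the steps *-an > *-ain > -ai for an *-an rhyme."""
    if rhyme != "an":
        raise ValueError("breaking applies to *-an only")
    return ["an", "ain", "ai"]


def bsc_rhyme(rhyme):
    """Run a stage-1 rhyme through all stages and return the BSC rhyme."""
    return breaking(stage3(stage2(rhyme)))[-1]


if __name__ == "__main__":
    for stage1 in ("in", "ɨm", "on"):
        print(stage1, sorted(stage2(stage1)), bsc_rhyme(stage1))
```

Whatever the stage-2 candidate is, the pipeline ends in -ai, which is why the choice among *e, *o, and *ə for central vowels cannot be settled from the BSC forms alone.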

There are two forms in the first table that would not be in the second:

*san > sai

*khan > khai

I think those forms were borrowed after earlier *-an became *-ai which then became -oi. Hypothetical native cognates would be *soi and *khoi.

mai may be native, whereas 文 uai is a borrowing from some late Tang or newer form resembling Sino-Vietnamese văn. (It is geographically impossible for BSC to have borrowed from Sino-Vietnamese, but a local neighboring language could have had a similar form.) uai tells us that the *-an to -ai shift postdates the *m- > v- shift in the source of uai.

The l- of 冬 loi and 東 loi is due to lenition after a prefix *CV- that was lost at some unknown point.

I have kept 龍 loi⁴¹ separate from the homophones 冬 loi⁴⁴ and 東 loi⁴⁴ since they have different tones: a mid-high falling tone from a *voiced initial (*l-) and a mid-high level tone from a *voiceless initial (*-t-).

As has been the case so far, a single rule cannot account for all forms with the same rhyme. I will write about other sources of *-ai in part 6.

