Last week, Jim Shooter blogged about revolutionary comics artist Bill Sienkiewicz, whose name got me thinking about palatalization in Polish. Polish consonants before *e were palatalized: e.g., in the first and second syllables of Sienkiewicz' surname:

*se > *sʲe > sie [ɕɛ]

*ke > kie [kʲɛ]

Foreign words borrowed after these changes were not subject to palatalization: e.g.,

sens 'sense' (not *siens)

kenozoik 'Cenozoic' (not *kienozoik)

(See their paradigms at this dictionary hosted in Shooter's hometown of Pittsburgh!)

However, there are native se-, sę-, kę-words without palatalization. Are they really exceptions?

Polish e can be from *ь and *ъ in strong positions as well as *e:

*Cь > palatal(ized) C + e: e.g., dzień < *dьnь 'day'

note also how ь in weak position at the end was lost but left a trace in the palatality of ń [ɲ]

cf. Russian день (palatalized C + e; not дэ [de])

*Cъ > nonpalatal(ized) C + e: e.g., sen < *sъnъ 'sleep'

note also how ъ in weak position at the end was lost

cf. Russian сон (nonpalatalized C + o)
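The correspondences above can be sketched as a small lookup table. This is only an illustration under my own assumptions: the names `PALATALIZES` and `polish_reflex_is_palatal` are hypothetical, and the table covers only the three strong-position sources of Polish e discussed here.

```python
# Hypothetical lookup: Proto-Slavic source vowel -> does it leave a
# palatal(ized) consonant before Polish e? Covers only *e, *ь, *ъ.
PALATALIZES = {"e": True, "ь": True, "ъ": False}


def polish_reflex_is_palatal(source_vowel: str) -> bool:
    """Return True if the source vowel palatalizes the preceding consonant."""
    return PALATALIZES[source_vowel]


# Example words from the text, keyed to the yer in their first syllable.
examples = {"dzień": "ь", "sen": "ъ"}
for word, vowel in examples.items():
    kind = "palatal(ized)" if polish_reflex_is_palatal(vowel) else "nonpalatal(ized)"
    print(f"{word}: *{vowel} > {kind} C + e")
```

The table deliberately ignores weak-position yers, which were lost rather than vocalized.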

I presume that back *ъ might have fronted to schwa (cf. how ъ is schwa in Bulgarian today) before completely fronting to e in Polish.

The nonpalatalized s- of Polish serce 'heart' implies *sъ- and I'd expect its Russian cognate to have со-, yet the actual Russian cognate is сердце with a palatalized с- from Old Russian сьрдьце (not сърдьце!) ... and the Old Polish form that Vasmer lists is sierce with the expected palatal si-. Did sierce irregularly become serce? Or are the old and modern Polish words 'aunt' and 'niece' rather than 'mother' and 'daughter'? Vasmer lists the Old Church Slavonic word for 'heart' as сърдьце. OCS is not ancestral to Polish or Russian, but it and the modern Polish form could imply vocalic variation at the Proto-Slavic level not unlike the e ~ o variation in 'heart' at a higher level: e.g., Greek κῆρ but Latin cord- 'heart'. Vasmer reconstructed Proto-Slavic *sьrdьko.

Like Polish oral e, Polish nasal ę has front and back sources: *ę and *ǫ. The frontness of the original vowel is indicated by the (non)palatality of the modern initial consonant:

będę (not *biędię) 'I will be'; cf. OCS bǫdǫ, Russian буду

pięć (not *pęć) 'five'; cf. OCS pętь, Russian пять
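The inference can also be run backwards, from modern spelling to the presumed source vowel. The following is a rough sketch under stated assumptions: the function name `first_nasal_source` is hypothetical, and it treats only the spelling "Ci" + ę as a sign of palatality, ignoring consonants that are palatal by themselves (ć, ń, ś, etc.).

```python
def first_nasal_source(word: str) -> str:
    """Guess the source of the first ę in a Polish word from its spelling.

    Assumption: a consonant is palatal(ized) only if written with a
    following i before ę (as in "pię-"); otherwise it counts as plain.
    """
    idx = word.find("ę")
    if idx < 0:
        raise ValueError(f"no ę in {word!r}")
    palatal = idx > 0 and word[idx - 1] == "i"
    return "*ę" if palatal else "*ǫ"


for word in ["będę", "pięć", "sęp", "kęs"]:
    print(word, "<", first_nasal_source(word))
```

Only the first ę of each word is inspected, so the final -ę of będę is not classified.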

So I presume Polish sę-, kę- are from *sǫ-, *kǫ-, perhaps with an intermediate stage [sə̃ kə̃]. Czech seems to confirm this, since Czech u < *ǫ corresponds to Polish ę after s-, k-:

Cz sup : P sęp 'vulture' < *?sǫp (no Russian cognate?)

Cz kus 'piece' : P kęs 'mouthful' < *?kǫs (cf. Rus кусок 'piece')

BEYOND MY K-HEN

Khmer is full of initial consonant clusters: e.g., the khm- in its name. These clusters are not necessarily original. Some result from affixes plus root initials: e.g.,

ពាន <baan> [pien] 'to walk on'

with infix -n-:

ភ្នាន <bhnaan> [phnien] 'plan' < 'means of crossing'

the aspiration is predictable: /pn/ = [phn]

with prefix s-:

ស្ពាន <sbaan> [spien] 'footbridge'

What appear to be unit initials and are written as single letters (e.g., ខ <kh> and ធ <dh>) in the Khmer script may be stop + h sequences: e.g.,

ហំ <haṃ> [hɑm] 'strong':

with prefix k-:

ខំ <khaṃ> [khɑm] 'to strain'

with prefix *d- (? - see below):

ធំ <dhaṃ> [thom] 'large'

According to Jenner's (1980-81) analysis of Khmer morphology, Khmer h-roots can be preceded by both

earlier voiceless prefixes: <k c t p>

earlier voiced prefixes: <g j d b> (e.g., <d> in 'big' above)

yet Jenner's list of prefixes in his introduction (p. xxvii) only includes the voiceless series. I initially proposed that the voicing of the stop was conditioned by the earlier voicing of the root: e.g.,

voiceless stop prefix + root-initial *h = voiceless aspirate

voiceless stop prefix + root-initial *ɦ = voiced aspirate

However, there is no evidence that Khmer ever had an *h : *ɦ distinction. Moreover, this hypothesis predicts that a Khmer h-root will always be preceded by either voiceless stops or voiced stops but never both, but <haṃ> above can be preceded by both voiceless <k> and voiced <d>.
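The prediction can be checked mechanically against whatever prefix + h-root combinations are attested. Here is a minimal sketch; the function name `roots_with_mixed_series` is hypothetical, and the tiny data set contains only the two roots mentioned in this post (in transliteration), not Jenner's full list.

```python
# Prefix series in transliteration, as listed above.
VOICELESS = {"k", "c", "t", "p"}
VOICED = {"g", "j", "d", "b"}

# Transliterated h-roots -> prefixes they occur with (from this post only).
attested = {
    "haṃ": {"k", "d"},   # <khaṃ> 'to strain', <dhaṃ> 'large'
    "həəɲ": {"g"},       # <ghəəɲ> 'to see, understand'
}


def roots_with_mixed_series(prefixes_by_root):
    """Return roots taking both voiceless and voiced prefixes.

    Any root returned here contradicts the hypothesis that the voicing
    of the prefix was conditioned by the root initial.
    """
    return [
        root
        for root, prefixes in prefixes_by_root.items()
        if prefixes & VOICELESS and prefixes & VOICED
    ]


print(roots_with_mixed_series(attested))
```

A single root in the result is enough to falsify the conditioning hypothesis, which is why <haṃ> is decisive.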

Jenner analyzed <ghəəɲ> [khəəɲ] 'to see, understand' as <g> plus a root <həəɲ> 'see' that is not attested independently or with any other prefix. He linked this root to Thai เห็น <hen> [hěn] 'to see'. I presume he thought that the Thai word was borrowed from the Khmer root. Khmer [ɲ] never comes from Thai [n], so it's not possible for K <həəɲ> to be from T <hen>. However, the reverse is also unlikely since K <əə> should have been borrowed as T <əə>, not T <e>. Moreover, the Thai word has cognates in Tai languages without Khmer loans and can be reconstructed at the Proto-Tai level. Li Fang-Kuei (1977: 121) reconstructed its Proto-Tai initial as *thr- to account for initial th- and ɣ- as well as h- in modern Tai languages. Note that Jenner's Khmer root <həəɲ> doesn't appear with a <t> prefix (or any prefixes other than <g>!). I wonder how Pittayaporn would reconstruct the initial of the Proto-Tai word for 'to see'. I assume the Khmer and Thai words are unrelated lookalikes from earlier *ghəəɲ and *thren.

W-E TWO

David Boxenhorn pointed out that Persian صد <ṣd> sad 'hundred' is nearly homophonous with the Persian name صاد <ṣaad> saad of the letter ص <ṣ>. I had noticed that before, but I dismissed it as coincidence until now. Could 'hundred' have been spelled (respelled?*) to nearly match the familiar name of the letter? If so, the spelling must have been coined by Persians who could not distinguish between plain [s] and pharyngeal [sˁ]. A Persian able to pronounce Arabic properly would not have perceived [sad] 'hundred'** and the letter name [sˁaad] as being homophonous. Even their vowels would have differed. Arabic long [aa] could have been a back, pharyngealized [ɑˁɑˁ] after pharyngeal [sˁ] whereas the Persian short [a] was not pharyngealized (and might have been mid or even front rather than back).

The near-homophony of Persian sad 'hundred' with the name of its first letter reminds me of the Japanese use of the Roman letter W (Japanese daburyuu) to represent the nearly homophonous English loanword daburu 'double'. How far back does this practice go? My guess is that it began after WWII. I may have first seen it in the name of Takara's 1982 Microrobot W toy line. Two 'W' robots were able to combine into one. More recent examples are the TV series Kamen Rider W (2009) and the play W ~ Double.

*Was 'hundred' first attested in Arabic script as سد <sad> with plain س <s>? Is 'hundred' the sole survivor of a wave of Persian words that were once written with nonetymological pharyngeals and other Arabic-only letters? I could not find any native words with initial ص <ṣ> in Steingass, but I have no way to quickly search for native words with medial and final ص <ṣ>. Could there even have been a time when Persians deliberately Arabized native words in pronunciation as well as spelling: e.g., pronouncing 'hundred' as [sˁad], a true homophone of the name of the letter ص <ṣ>?

**I am assuming that Persian short a was [a] rather than [æ] in the past.

