Last night's Aiāx : <Aíās> mismatch was minor compared to Greek Odysseús : Latin Ulixēs 'Ulysses':

Greek Ὀ δ υ σσ εύς
Transliteration O d y ss eús
Latin U l i x ēs

Lewis and Short regarded Latin Ulyssēs as an error based on the Greek form. It looks like a compromise between the two extremes of <Odysseús> and Ulixēs:

Greek transliteration: Odysseús O d y ss eús
Latin compromise: Ulyssēs U l ēs
Latin: Ulixēs i x

Liddell and Scott list eight other Greek forms (B-I below in transliteration only for ease of comparison). I is an exact match of the Latin form since Greek <ou> = [u]:

Greek A: Odysseús O d y ss eús
Greek B: Ōlysseús l
Greek C: Olysseús Ō
Greek D: Olytteús O tt
Greek E: Olyteús t
Greek F: Olyseús s
Greek G: Oliseús i
Greek H: Oulixeús Ou x
Greek I: Oulixēs í ēs
Latin Ulixēs U i
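
The claim that only form I matches Latin once Greek <ou> is read as [u] can be checked mechanically. The following is just a rough sketch: the forms are copied from the list above, the helper name `respell_ou` is made up for illustration, and the comparison ignores capitalization.

```python
import unicodedata

# Transliterated Greek variants A-I, copied from the list above.
GREEK_FORMS = {
    "A": "Odysseús",
    "B": "Ōlysseús",
    "C": "Olysseús",
    "D": "Olytteús",
    "E": "Olyteús",
    "F": "Olyseús",
    "G": "Oliseús",
    "H": "Oulixeús",
    "I": "Oulixēs",
}
LATIN = "Ulixēs"


def nfc_lower(text):
    """Normalize to NFC and lowercase so that accented letters compare reliably."""
    return unicodedata.normalize("NFC", text).lower()


def respell_ou(form):
    """Hypothetical helper: rewrite Greek <ou> as u, since <ou> = [u]."""
    return nfc_lower(form).replace("ou", "u")


# Labels of the Greek forms that equal the Latin form after respelling.
exact_matches = [label for label, form in GREEK_FORMS.items()
                 if respell_ou(form) == nfc_lower(LATIN)]
print(exact_matches)
```

Form H also begins with <ou>, but its ending -eús keeps it from matching; only the ending -ēs of form I lines up with Latin.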

According to Mommsen, Ulysses was Utuze or Uthuze in Etruscan.

Can Greek dialect variation and Etruscan phonotactics account for all this variation?

5.27.00:50: My completely uninformed guess is that part of the variation might reflect different attempts to borrow a non-Indo-European name: e.g., <d> and <l> for a foreign ɮ (cf. the Zulu spelling dl for [ɮ]).

AJA_

I normally expect Latin borrowings from Greek to resemble the originals: e.g.,

Latin phoenīx < Greek φοῖνιξ <phoĩniks> 'phoenix'
By analogy I would expect

Latin Aiāx < Greek Αἴᾱξ <Aíāks> 'Ajax'

But the Greek source of Latin Aiāx is in fact Greek Αἴᾱς <Aíās>, not *Αἴᾱξ <Aíāks>. Although the nominative singular of Greek Αἴᾱς <Aíās> ends in -ς <s>, its stem actually ends in an -nt that is lost before the suffix -s: *-nts > -s.

I would have expected Greek Αἴᾱς <Aíās> (gen. sg. Αἴαντ-ος <Aíant-os>) 'Ajax' to have been borrowed into Latin as *Aiās (gen. sg. *Aiant-is), just as Greek Ἄτλᾱς <Átlās> (gen. sg. Ἄτλαντ-ος <Átlant-os>) 'Atlas' was borrowed into Latin as Atlās (gen. sg. Atlant-is).

So why does Latin Aiāx 'Ajax' (gen. sg. Aiāc-is) have a final -x /-k-s/ implying that it came from a Greek -k stem rather than a Greek -nt stem?

5.26.2:16: Addendum: Comparison of Latin and Greek -k and -nt stems

stem and case/number | Latin borrowings from Greek | Greek
k-stem nominative singular | -x = /-k-s/ | -k-s
k-stem genitive singular | -c-is = /-k-is/ | -k-os
nt-stem nominative singular | -s | -s < -nt-s
nt-stem genitive singular | -nt-is | -nt-os
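
The endings in this comparison can be turned into a small sketch that predicts the Latin nominative and genitive of a borrowing from the Greek stem. The function name `borrow_into_latin` is invented for illustration, the stem *Aiāk- is hypothetical (no such Greek stem is attested), and the long vowel in the nt-stem nominative is an assumption read off the pair Atlant- : Atlās rather than a rule stated above.

```python
# Assumption for this sketch: the vowel before a lost -nt- is written long
# in the nominative singular (Atlant- : Atlās). Only a is handled.
LONG = {"a": "ā"}


def borrow_into_latin(greek_stem, stem_type):
    """Return (nominative singular, genitive singular) of a Latin borrowing."""
    if stem_type == "k":
        base = greek_stem[:-1]            # strip the stem-final k
        return base + "x", base + "cis"   # -x = /-k-s/, -c-is = /-k-is/
    if stem_type == "nt":
        base = greek_stem[:-2]            # strip the stem-final nt
        nominative = base[:-1] + LONG.get(base[-1], base[-1]) + "s"
        return nominative, greek_stem + "is"
    raise ValueError("stem_type must be 'k' or 'nt'")


print(borrow_into_latin("Atlant", "nt"))  # attested pattern
print(borrow_into_latin("Aiant", "nt"))   # expected but unattested *Aiās
print(borrow_into_latin("Aiāk", "k"))     # hypothetical k-stem source
```

Only the hypothetical k-stem yields the attested Aiāx, Aiāc-is, which is exactly the puzzle.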

5.26.3:26: Native Latin -nt stems have nominative singulars in -ns: e.g., dēns < *dent-s 'tooth' (cf. gen. sg. dent-is).

5.26.3:44: Schwering (2010) might be able to answer my question, but that article costs $42!

Some may be primary: i.e., they were inherited as such from Proto-Mon-Khmer. I'll look into one possible instance of a primary cluster tomorrow.

Others may be secondary: i.e., due to developments after Proto-Mon-Khmer. Secondary Cʔ-lusters in turn have two subtypes:

Cʔ- from prefix C- + root-initial ʔ-: e.g.,

lʔɑɑ 'good' (the first Khmer Cʔ-word I ever learned) < l- + √ʔɑɑ 'mount, swell, rejoice' (Jenner and Pou 1980-1981: 413)

Jenner and Pou (1980-1981: xlii) defined the prefix L- (with allomorphs l- and r-) as "perfective and generalized frequentative" (emphasis theirs), but neither of these uses seems to be, uh, a good match for the l- of lʔɑɑ 'good':

'having mounted' or 'mounting repeatedly' > 'good'?

'having swollen' or 'swelling repeatedly' > 'good'?

'having rejoiced' or 'rejoicing repeatedly' > 'good'?

5.25.00:49: Shorto (2006: 193) reconstructed Proto-Mon-Khmer  *ləʔ 'good' with a word-final glottal stop that metathesized in Khmer. Khmu ʔ 'good' and Mal ʔ 'good' preserve the original structure.

Kui ʔɑ 'good' and Bru ʔɑ̃ 'good' are listed as loans from Khmer in SEAlang's Mon-Khmer database. Why do they lack l-? Do they reflect the bare Khmer root √ʔɑɑ (which should then be reconstructed at the PMK level contra Shorto), or are they unrelated soundalikes? The nasality in Bru has no Khmer source. Moreover, as far as I know, Bru isn't even spoken in Cambodia.

If the Kui and Bru forms are loans, they must postdate the ɔ > ɑ shift in Khmer, whereas Thai laʔɔɔ 'eye-pleasing, fair' predates it. (A later loan into Thai would be *laʔaa.)

PMK *ləʔ 'good' may be related somehow to PMK *[d1]lak 'good' (what does the subscript 1 represent?) and PMK *r[a]k 'to love' (borrowed from Khmer into Thai as rak?). The MK database includes Riang _rak under PMK *[d1]lak 'good' and Lamet (Lampang) lɑk 'good' under *ləʔ 'good', implying that all three roots are related.

Cʔ- from root-initial C- + infix ʔ-: e.g.,

tʔɨŋ 'to be reluctant' < tɨŋ 'to be taut, unyielding' + -ʔ- 'iterative' (Jenner and Pou 1980-1981: l)

5.25.0:07: I guess 'unyielding' became 'unwilling' and then 'reluctant', but I don't see how being 'reluctant' is iterative, unless it refers to a personality trait (being constantly unyielding) rather than a momentary unwillingness.

5.25.1:09: Neither Huffman nor Ehrman mentions this infix in their Khmer grammars.

Last night I initially posted incorrect statistics for Khmer consonant-glottal stop clusters. I caught my error when I looked through Franklin Huffman's 1967 PhD dissertation for examples of cʔ- and realized I had forgotten the word ឆ្អឹង cʔəŋ 'bone'! I had overlooked several possible spellings of consonant-glottal stop clusters, including <chʔ>. Here is a list of all typeable <ʔ>-clusters in SEAsite's Khmer dictionary. Hypothetical pronunciations are in parentheses. When two frequencies are listed per cluster, the first includes compounds and the second does not.

voiceless voiced voiceless
unaspirated aspirated unaspirated aspirated nasal approximant fricative
glottals Khmer script អ្អ ហ្អ
transliteration <ʔʔ> <hʔ>
modern Khmer pronunciation ([ʔə̆ʔ]) [hə̆ʔ]
frequency 0 3
velars Khmer script ក្អ ខ្អ គ្អ ឃ្អ ង្អ
transliteration <kʔ> <khʔ> <gʔ> <ghʔ> <ṅʔ>
modern Khmer pronunciation [kə̆ʔ] ([kə̆ʔ]) ([kə̆ʔ]) ([kə̆ʔ]) ([ŋə̆ʔ])
frequency 45 0 0 0 0
palatals Khmer script ច្អ ឆ្អ ជ្អ ឈ្អ ញ្អ យ្អ ឝ្អ
transliteration <cʔ> <chʔ> <jʔ> <jhʔ> <ñʔ> <yʔ> <śʔ>*
modern Khmer pronunciation ([cə̆ʔ]) [cə̆ʔ] ([cə̆ʔ]) ([cə̆ʔ]) ([ɲə̆ʔ]) ([jə̆ʔ]) ([sə̆ʔ])
frequency 0 82 ~ 20 0 0 0 0 0
'retroflexes'** Khmer script ដ្អ ឋ្អ ឌ្អ ឍ្អ ណ្អ រ្អ, ឡ្អ ឞ្អ
transliteration <ṭʔ> <ṭhʔ> <ḍʔ> <ḍhʔ> <ṇʔ> <rʔ>, <ḷʔ> <ṣʔ>*
modern Khmer pronunciation ([ɗə̆ʔ]) ([tə̆ʔ]) ([ɗə̆ʔ]) ([tə̆ʔ]) ([nə̆ʔ]) ([rə̆ʔ], [lə̆ʔ]) ([sə̆ʔ])
frequency 0 0 0 0 0 0***, 0 0
dentals Khmer script ត្អ ថ្អ ទ្អ ធ្អ ន្អ ល្អ ស្អ
transliteration <tʔ> <thʔ> <dʔ> <dhʔ> <nʔ> <lʔ> <sʔ>
modern Khmer pronunciation [tə̆ʔ] ([tə̆ʔ]) ([tə̆ʔ]) ([tə̆ʔ]) [nə̆ʔ] [lə̆ʔ] [sə̆ʔ]
frequency 12 0 0 0 2 96 ~ 30 61
labials Khmer script ប្អ ផ្អ ព្អ ភ្អ ម្អ វ្អ [f] is only in loanwords
transliteration <pʔ> <phʔ> <bʔ> <bhʔ> <mʔ> <vʔ>
modern Khmer pronunciation [pə̆ʔ] [pə̆ʔ] [pə̆ʔ] ([pə̆ʔ]) [mə̆ʔ] ([və̆ʔ])
frequency 10 58 1 0 3 0

Khmer spelling is generally historical; it preserves voicing distinctions that were later lost. Taken at face value, the spelling frequencies indicate that in earlier Khmer,

- only voiceless unaspirated velar and dental stops could precede glottal stops

- only voiceless aspirated palatal stops could precede glottal stops

- all labial stops other than bh could precede glottal stops
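
These three observations can be read straight off the stop columns of the frequency table. The sketch below is only an illustration: the counts are the first figure for each cluster in the table above (i.e., including compounds), the all-zero 'retroflex' row is omitted, and the names `STOP_FREQ` and `attested_series` are invented here.

```python
# Column order of the four stop series in the table above.
STOP_SERIES = [
    "voiceless unaspirated",
    "voiceless aspirated",
    "voiced unaspirated",
    "voiced aspirated",
]

# Spelling frequencies of stop + <ʔ> clusters, copied from the table above
# (first figure per cell, i.e. including compounds).
STOP_FREQ = {
    "velars":   [45, 0, 0, 0],   # <kʔ> <khʔ> <gʔ> <ghʔ>
    "palatals": [0, 82, 0, 0],   # <cʔ> <chʔ> <jʔ> <jhʔ>
    "dentals":  [12, 0, 0, 0],   # <tʔ> <thʔ> <dʔ> <dhʔ>
    "labials":  [10, 58, 1, 0],  # <pʔ> <phʔ> <bʔ> <bhʔ>
}


def attested_series(freqs):
    """Return the names of the stop series with at least one spelling."""
    return [name for name, count in zip(STOP_SERIES, freqs) if count > 0]


for place, freqs in STOP_FREQ.items():
    print(place, attested_series(freqs))
```

Note that the labial voiced unaspirated series rests on a single spelling, so "all labial stops other than bh" is a generous reading of the data.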

Can this skewed distribution of clusters be projected back into earlier Khmer? To what degree do the spellings reflect arbitrary conventions? How many clusters were respelled after the four-way neutralization of initial stops in clusters?

*5.24.3:31: Obsolete letter for Indic loans.

**5.24.3:31: Khmer has never had retroflexes. Indic retroflex letters represent (a) Khmer dentals and alveolars corresponding to Indic retroflexes and (b) Khmer implosive [ɗ].

***5.24.4:13: There are modern Khmer words with initial <raʔ> [rɔʔ]. See Jenner and Pou (1980-1981: 504-505) for a list. Perhaps these words once had initial *rʔ-, but none are in J&P's Old and Middle Khmer index.

BRUGMANN'S LAW: F-RʔE-QUENCY

In "A Glottal Hurdle", I had to syllabify Proto-Indo-European *swesorʔe 'two sisters' (nom./acc./voc. dual) as

*swe.so.rʔe

with an unusual final syllable *rʔe to make Brugmann's Law work.

Such a syllable isn't impossible. The one language I know that has initial consonant-glottal stop sequences is Khmer. I looked up every sequence I could think of at SEAsite's Khmer dictionary and found the following frequencies:

totals glottals: 3 velars: 45 palatals: 82 ~ 20 alveolars/dentals: 171 ~ 105 labials: 71
stops: 67 ʔʔ-: 0 kʔ-: 45 cʔ-: 82 ~ 20 tʔ-: 12 pʔ-: 69
nasals: 4 ŋʔ-: 0 ɲʔ-: 0 nʔ-: 2 mʔ-: 2
fricatives: 64 hʔ-: 3 sʔ-: 61
approximants: 96 ~ 30 jʔ-: 0 rʔ-: 0, lʔ-: 96 ~ 30 vʔ-: 0

(5.23.4:59: The above numbers include multiple listings for the same words in isolation and/or in compounds; the actual numbers must be smaller.)

The number of lʔ- depends on whether one counts compounds or not.

The distribution is not random:

- no vowellike first consonants (j- and v- are like i and u)

- few tʔ- relative to the other initial stops other than glottal stop

- almost no nasals; no back nasals

I'd like to derive these Cʔ-clusters from earlier *CVʔ-syllables. Did some Cʔ-clusters merge with others after losing their vowels: e.g.,

*nʔ-, *rʔ-  > lʔ-?

(5.23.4:50: It's also possible that some *CVʔ- were reduced to C- without a glottal stop: e.g., *tʔ- > ɗ-.)

I looked for rʔ-s in the 213 languages of the SEAlang Mon-Khmer Etymological Dictionary and I only found two examples in the Bahnaric language Stieng:

rʔiː '(kind of) basket'

rʔiːu 'to become rancid'

(5.23.0:08: For comparison, that dictionary has 701 examples of rV-syllables. The dictionary is not comprehensive, so that is only a small fraction of the rV-syllables in the vocabularies of those languages.)

If rʔ-syllables are infrequent even in Mon-Khmer languages - some of which abound in clusters like the Khm- of Khmer - is it likely that they existed in faraway Proto-Indo-European?

5.23.4:26: I can't find any examples of initial Cʔ-clusters in Haupers' (1969) "Stieng Phonemes", though his formula for Stieng syllable structure does not rule out rʔ-:


P = presyllable initial

Ơ = presyllable vowel

S = 'syllabic'; nasals or liquids

C = initial consonant of main syllable

H = /h w l r/

W = /w l/

V = main syllable vowel

F = main syllable coda

rʔiːu would be S-CVF:

S = presyllabic r-

C = initial ʔ- of main syllable

V = main syllable vowel

F = main syllable coda u (= Haupers' /w/)

I thought presyllabic r- might be pronounced as syllabic r̩, but according to Haupers (1969: 135),

Word-initially a non-distinctive voicoid [which he symbolized as /a/] is often heard preceding the onset of the trill.

So perhaps rʔiːu is phonetically [arʔiːw].

I should look at the scans of Franklin Huffman's Stieng notebook for another perspective.

BRUGMANN'S LAW: A GLOTTAL HURDLE

Brugmann's Law can be hard to grasp because syllabic boundaries (indicated with periods) don't match morphemic boundaries (indicated with hyphens). Here are all the instances of the unexpected long ā of the strong stem of Sanskrit svasār- 'sister' with mātar- 'mother' for comparison:

Vowel before PIE *r | Case | Proto-Indo-European | Sanskrit
PIE *o in open syllables (subject to Brugmann's Law) > Skt long ā | accusative singular | *swe.so.r-m̥ | sva.sā.r-am
(same) | nominative/accusative/vocative dual | *swe.so.r-ʔe (!?) | sva.sā.r-ā (later > -au)
(same) | nominative/vocative plural | *swe.so.r-es | sva.sā.r-as
PIE *e in open syllables (not subject to Brugmann's Law) > Skt short a | accusative singular | *meʕ.te.r-m̥ | mā.ta.r-am
(same) | nominative/accusative/vocative dual | *meʕ.te.r-ʔe (!?) | mā.ta.r-ā (later > -au)
(same) | nominative/vocative plural | *meʕ.te.r-es | mā.ta.r-as

To make Brugmann's Law work while using the PIE dual ending *-ʔe from Beekes (1995: 194), I had to syllabify the duals in an unnatural manner: *.r-ʔe instead of *r-.ʔe. Initial consonant-glottal stop clusters are unusual; the only language I can think of with them is Khmer: e.g., ល្អ lʔɑɑ 'good'.

I'd like to say that Brugmann's Law applied after PIE had developed long vowels from laryngeals:

PIE *swe.sor-.ʔe > *swe.so.r-ē (lengthening with resyllabification) > Skt sva.sā.r-ā

But the trouble is that according to Beekes (1995: 142), *ʔe is supposed to become Skt short a, not Skt long ā. Earlier today I considered reconstructing the dual ending as *eʔ which

- has a final glottal stop like the other dual endings *-ʔ and *-iʔ

- would develop into Skt long ā as expected

- and most importantly for this post, begins with a vowel that links with the preceding *r to form a syllable after *o:

PIE *swe.so.reʔ > Skt sva.sā.r-ā

However, the old* Lithuanian perfect participle dual ending (if I understand Beekes 1995: 195 correctly - I don't know Lithuanian) is short -e < *ʔe, not a long vowel < *eʔ. Did the ancestor of Proto-Indo-Iranian undergo metathesis in its dual endings to avoid awkward consonant + glottal stop clusters and match the other dual endings?

PIE *-ʔe > *-eʔ > *-ē > PII *-ā > Skt and Avestan -ā

Such a metathesis must have predated Brugmann's Law (which I now see as a two-step process - 00:28):

Early Proto-Indo-European *o in closed syllable | *swe.sor-.ʔe
metathesis and resyllabification | *swe.so.r-eʔ
lengthening before glottal stop | *swe.so.r-ēʔ
loss of glottal stop | *swe.so.r-ēØ
lengthening of *o in open syllables (Brugmann's Law part 1) | *swe.sō.r-ē
vowel bleaching and lowering (*o and *e lose labial and palatal qualities and become *a; Brugmann's Law part 2) | *swa.sā.r-ā
Sanskrit (-v- [ʋ] < *-w- is cosmetic) | sva.sā.r-ā
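
The stages above amount to an ordered list of rewrite rules, so they can be sketched as a tiny pipeline. This is only a toy under stated assumptions: periods mark syllable boundaries, morpheme hyphens and the Ø placeholder are left out, the rules are written just broadly enough for this one word, and all function names are invented here.

```python
import re
import unicodedata


def metathesize(form):
    # Word-final C.ʔe > .Ceʔ: glottal stop and vowel swap places, and the
    # preceding consonant becomes the onset of the final syllable.
    return re.sub(r"([^.aeo])\.ʔe$", r".\1eʔ", form)


def lengthen_before_glottal(form):
    return form.replace("eʔ", "ēʔ")


def drop_glottal(form):
    return form.replace("ʔ", "")


def brugmann_part1(form):
    # *o > *ō when it ends its syllable (before a period or the word end).
    return re.sub(r"o(?=\.|$)", "ō", form)


def brugmann_part2(form):
    # *o, *e > *a and *ō, *ē > *ā.
    return form.translate(str.maketrans("oeōē", "aaāā"))


def cosmetic(form):
    return form.replace("w", "v")


STAGES = [metathesize, lengthen_before_glottal, drop_glottal,
          brugmann_part1, brugmann_part2, cosmetic]


def derive(form):
    """Apply the stages in order and return every intermediate form."""
    form = unicodedata.normalize("NFC", form)
    history = [form]
    for stage in STAGES:
        form = stage(form)
        history.append(form)
    return history


print(" > ".join(derive("swe.sor.ʔe")))
```

The order matters: if `brugmann_part1` ran before `metathesize`, the *o of *sor would still be in a closed syllable and would never lengthen.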

5.22.00:48: Greek also has short -e instead of a long vowel corresponding to Skt long -ā: e.g., meter-e 'mothers' (dual nom./acc./voc.)

*5.22.00:36: The dual is almost extinct in Lithuanian.

BRUGMANN'S LAW: AN ABSENCE OF Ā-BUNDANCE? (logonote*)

In "Mātur Day", I tried to explain the unexpected long ā of the strong stem of Sanskrit svasār- 'sister' by deriving it from *oH with a laryngeal conditioning length. However, while looking at Beekes' (1995: 119) section on the Proto-Indo-European *-on (= Skt -an) declension class, I was reminded of Brugmann's Law:

PIE *o in open syllables > Skt long ā

(whereas PIE *o in closed syllables and *e everywhere > Skt short a)
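
As a rough illustration, the law as stated here can be written as a function over a pre-syllabified form. The sketch assumes that the input is a list of syllables containing only plain *o and *e (no laryngeals), that a syllable is open when it ends in a vowel, and it leaves *w unchanged; the name `brugmann` is just a label for this toy.

```python
VOWELS = set("aeiouāēīōū")


def is_open(syllable):
    """A syllable is open if it ends in a vowel."""
    return syllable[-1] in VOWELS


def brugmann(syllables):
    """Map PIE *o and *e to Sanskrit a or ā, syllable by syllable."""
    result = []
    for syllable in syllables:
        # *o > long ā in open syllables, short a in closed syllables.
        syllable = syllable.replace("o", "ā" if is_open(syllable) else "a")
        # *e > short a everywhere.
        syllable = syllable.replace("e", "a")
        result.append(syllable)
    return result


print(".".join(brugmann(["swe", "so", "res"])))  # cf. Skt sva.sā.r-as
print(".".join(brugmann(["swe", "sor"])))        # *o in a closed syllable
```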

Although Brugmann's Law nicely accounts for vowel length in the r-declension,

Gloss | PIE | Latin (preserves *o, *e when not colored by laryngeals) | Sanskrit (. = syllable boundary)
sister | *swesor- | soror | sva.sā.r-V...
mother | *meʕter- | māter | mā.ta.r-V...
father | *pʕter- | pater | pi.ta.r-V...

it predicts an ā-bundance of long ā. PIE had a lot of open syllables ending in short o, so its daughter Sanskrit should therefore have a lot of open syllables ending in long ā. Yet that may not be the case, as short a is still "2.4 times more common than its long counterpart ā".

Wikipedia mentions two potentially problematic workarounds:

- The law doesn't apply to non-alternating *o.

Morphologically insensitive laws are simpler and preferable to morphologically sensitive laws. But maybe there's no choice.

- PIE *o that did not become Skt long ā came from PIE *ʕʷe (which presumably didn't become o before Indo-Iranian split off):

PIE *ʕʷe Proto-Indo-Iranian *a (not < PIE *ʕe without labialization in open syllables)
Other IE *o

Will a lot of *ʕʷe have to be reconstructed in PIE to make this work? *ʕʷ is an unusual consonant to begin with. No languages in UPSID have it.

5.21.1:02: Two modern languages with ʕʷ are Abaza and Lillooet.

*5.21.1:15: I wanted to title this entry "Ā-n Ā-bsence of Ā-bund-ā-nce", but I decided to place a macron atop the only a in an open syllable to symbolize Brugmann's Law: "Ā-bundance".

