Hungarian has a large consonant phoneme inventory unlike those of its best-known relatives Finnish and Estonian - or their common ancestor Proto-Uralic:


Consonants absent from Finnish or Estonian native words are in red.

The only Finnish consonant phoneme absent from Hungarian is the unwritten glottal stop /ʔ/.

The only Estonian consonant phonemes absent from Hungarian are palatalized alveolars whose palatalization is not written. Are these only in Russian loanwords?

Ly is still present in Hungarian spelling, but it is now pronounced [j] in the standard language; its original pronunciation [ʎ] is preserved in northern dialects. Is it a retention from Proto-Uralic (whose consonant inventory otherwise looks more like those of Finnish and Estonian)?

CSE(H) ÉS OLA(H)

Yesterday I wrote about the ambiguity of ly [j] ~ [li] in Hungarian proper nouns. Today while looking through the phonology section of Kenesei et al. (1998), I discovered that h is ambiguous in Hungarian common nouns. Until now I assumed that h was always pronounced as a fricative:

[h] initially

[ɦ] between vowels

[x] as a coda after a back vowel

[ç] as a coda after a front vowel

Rounds (2001: 4) wrote that "there are no silent letters" in Hungarian. (I presume she was excluding names: e.g., the silent h of the surname Tóth [toːt]).


[i]n a small set of nouns stem-final /h/ is deleted in syllable coda position, i.e., wordfinally, including at compound boundary, and before consonant initial suffixes: e.g., cseh [tʃɛ] 'Czech', Csehország [tʃɛorsaːg] 'Czech Republic', cseh-nek [tʃɛnɛk] 'Czech-DAT'. The following nouns also provide evidence for /h/-deletion: méh [meː] 'bee', juh [ju] 'sheep', düh [dy] 'fury', céh [tseː] 'guild', pléh [pleː] 'tin', oláh [olaː] 'Wallach', rüh [ry] 'scab', keh [kɛ] 'cough'.

Before vowel-initial suffixes* /h/ is syllabified into the onset and escapes deletion. (Kenesei et al. 1998: 450; phonetic notation converted into IPA and IPA added for examples other than 'Czech')

Hence the plurals of cseh and oláh are cseh-ek [tʃɛɦɛk] and oláh-ak [olaːɦɒk]. [tʃɛɛk] ([tʃɛːk]?) and [olaːɒk] without [ɦ] are also possible in fast speech.
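Here is a minimal Python sketch of how I read that rule. The function name, the stem set, and the flag for vowel-initial suffixes that pattern with consonant-initial ones (like -ért in the footnote) are my own labels, not anything from Kenesei et al. I assume any stem outside the listed set keeps its /h/, and I treat a compound boundary like the end of a word (pass no suffix).

```python
# Sketch of coda /h/-deletion as described in the quoted passage.
# Assumption: only the nouns listed there undergo deletion.
VOWELS = set("aáeéiíoóöőuúüű")

H_DELETING_STEMS = {
    "cseh", "méh", "juh", "düh", "céh", "pléh", "oláh", "rüh", "keh",
}


def final_h_survives(stem, suffix="", patterns_as_c_initial=False):
    """Return True if the stem-final /h/ is pronounced in stem + suffix.

    suffix="" covers word-final position (and, by assumption, the first
    member of a compound). patterns_as_c_initial marks vowel-initial
    suffixes that behave like consonant-initial ones, such as -ért.
    """
    if stem not in H_DELETING_STEMS:
        return True  # assumed to keep /h/, like the exception moh
    if suffix == "":
        return False  # /h/ would be a word-final coda, so it is deleted
    starts_with_vowel = suffix[0].lower() in VOWELS
    # /h/ escapes deletion only when it can become an onset.
    return starts_with_vowel and not patterns_as_c_initial


for stem, suffix, flag in [("cseh", "", False), ("cseh", "nek", False),
                           ("cseh", "ek", False), ("cseh", "ért", True)]:
    print(stem, suffix or "(none)", final_h_survives(stem, suffix, flag))
```

The sketch says nothing about the optional fast-speech loss of [ɦ] between vowels.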

Those two words are of obvious foreign origin; another loanword is céh < Middle High German zeche. Méh and méz 'honey' may be early loans from Indo-European (cf. Proto-Indo-European *medhu- 'honey') into Proto-Finno-Ugric (or Proto-Uralic if one rejects Proto-Finno-Ugric?).

I don't know the etymologies of the other words. Keh 'cough' looks onomatopoetic.

I conclude that h-loss must have occurred early enough to affect early loanwords but not before Hungarian was first written (unless final -h was restored on paper by analogy with forms with intervocalic [ɦ]).

Was moh 'mustiness', the only /h/-retaining exception listed by Kenesei et al., borrowed after /h/-loss? Or was its final /h/ restored because it seems to be a low-frequency word? What are other exceptions to /h/-loss?

*Kenesei et al. (1998: 448) treat vowel-initial suffixes that behave like consonant-initial suffixes as 'C-initial suffixes'. /h/ is lost before C-initial suffixes: e.g., cseh-ért [tʃɛeːrt] 'Czech-CAU' and oláh-ért [olaːeːrt] 'Wallach-CAU'.

Did C-initial suffixes originally have an initial consonant: e.g., (which is not a phoneme in modern Hungarian)?

Why doesn't the causalis suffix -ért 'for' have a variant -árt? Are such invariable case suffixes (see Rounds 2001: 111 for a list) newer than the rest of the case system?

EL-I VAGY ELLY?

Today I discovered Victor E. Hanzeli (Hanzéli Győző*)'s translation of Gyarmathi Sámuel's Affinitas linguae hungaricae cum linguis fennicae originis grammatice demonstrata (Grammatical Proof of the Affinity of the Hungarian Language with Languages of Fennic Origin). The surname Hanzéli looks like a variant of Hanzély, which is much more common: Hanzéli has 167 Google results on .hu sites versus 11,300 for Hanzély, and most of the 167 refer to Victor E. Hanzeli.

At first I thought Hanzély was pronounced [hɒnzeːj] since Hungarian ly (the digraph elly [ɛjː]) is normally [j]. (Of course it was historically [ʎ] which is still preserved in northern dialects according to Wikipedia. When did it weaken to [j] in eastern Hungarian and merge with [l] in western Hungarian?) And I wondered if Hanzéli was also pronounced [hɒnzeːj] with only two syllables in spite of its final -i.

However, I had forgotten about this table of unusual spellings in Hungarian names which indicates that ly can also be pronounced as [li]. Ny, normally [ɲ], can similarly also be pronounced as [ɲi] (not [ni]!).

Has anyone compiled a list of names in which y is [i]?

Hungarian spelling - including even most unusual spellings - is so close to being unambiguous, and yet ...

*A Magyarization of Victor. Győz is 'he/she/it wins' and -ő is an active participle (and by extension, agent) suffix, so Győző is literally 'winner, victor'.

I SCANNED, YOU ... ?

Yesterday while looking for a Latvian grammar, I found Defective Paradigms: Missing Forms and What They Tell Us. You can read the introduction here. That book came to mind this evening when I saw that hjp.novi-liber.hr only listed a first person singular aorist for Croatian skenirati 'to scan' (starred forms are what I would expect on the basis of Brown and Alt 2004: 53):

Person \ Number   Singular     Plural
1st               skenirah     *skenirasmo?
2nd               *skenira?    *skeniraste?
3rd               *skenira?    *skeniraše?

Is it really impossible to say 'we/you/he/she/it/they scanned' with the aorist? (Croatian has other past tenses.) According to Brown and Alt (2004: 44), the aorist "serves to narrate events and express surprising perceived events". So has it become obsolete except for first person narrators who might express surprise at what they perceive?

Brown and Alt (2004: 50-54) list complete aorist paradigms for tresti 'to shake', čuti 'to hear', čitati 'to read', and moliti 'to pray'.

However, hjp.novi-liber.hr only lists a first person singular aorist for čuti and no aorist forms for tresti, čitati, or moliti. On the other hand, Wiktionary lists a complete paradigm for čuti but not for any of the other three verbs. Are different verbs losing the aorist at different speeds?

Skenirati looks like a recent coinage. Did it ever have a full aorist paradigm? It's hard to Google an answer for that question since all the aorist forms I expect are homophonous with other forms:

1s aorist skenirah = 1s imperfect

23s aorist skenira = 3s present (!)

1p aorist skenirasmo = 1p imperfect

2p aorist skeniraste = 2p imperfect

3p aorist skeniraše = 23s (not 3p!) imperfect

The only form in the imperfect paradigm without a homophone in the aorist paradigm is 3p skenirahu.
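To double-check the overlaps above, here is a small Python sketch. The forms are the ones I expect, as listed in this post, not attested data, and the dictionary and function names are just my own labels.

```python
# Expected (not attested) forms of skenirati, keyed by person + number.
AORIST = {
    "1s": "skenirah", "2s": "skenira", "3s": "skenira",
    "1p": "skenirasmo", "2p": "skeniraste", "3p": "skeniraše",
}
IMPERFECT = {
    "1s": "skenirah", "2s": "skeniraše", "3s": "skeniraše",
    "1p": "skenirasmo", "2p": "skeniraste", "3p": "skenirahu",
}
PRESENT = {"3s": "skenira"}  # only the slot mentioned in the post


def homophones(form, paradigms):
    """Return (tense, slot) pairs whose form is identical to `form`."""
    return [
        (tense, slot)
        for tense, cells in paradigms.items()
        for slot, other in cells.items()
        if other == form
    ]


others = {"imperfect": IMPERFECT, "present": PRESENT}

# For each aorist slot, which non-aorist forms is it identical to?
ambiguous_aorist = {slot: homophones(form, others) for slot, form in AORIST.items()}

# Imperfect slots with no identical form anywhere in the aorist.
unique_imperfect = [
    slot for slot, form in IMPERFECT.items() if form not in AORIST.values()
]

for slot, matches in ambiguous_aorist.items():
    print(slot, AORIST[slot], matches)
print("imperfect-only:", unique_imperfect)
```

If my expected forms are right, every aorist slot has at least one match, which is why searching for the bare forms cannot settle the question.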

Am I misunderstanding what's happening?

BR-UO-KEN VOWELS IN LATVIAN AND MANDARIN

Latvian has an unbalanced vowel system with three front vowels but only one back vowel in native words (disregarding length; [ɔ] and [ɔː] are only in loanwords). This is like a mirror image of back-heavy systems like Manchu and Middle Korean:


Latvian

i - u
ɛ -
æ a

Manchu

i - u
- ʊ
ə o
a -

Middle Korean

i - ɯ u
- ə - o
- ʌ -
a -

The o-lessness of Latvian is not original. Its remote ancestor Proto-Indo-European had *o, and its close relative Lithuanian has o [oː] ([ɔ] in loanwords). Did *o break to [uɔ] (spelled o) in Latvian?

According to the Britannica (which spells Latvian [uɔ] as uo),

The differences between Lithuanian and Latvian can be summarized in very broad terms by saying that Lithuanian is far more archaic than Latvian and that modern written Lithuanian could in many instances serve as a 'protolanguage' for it. For example, Lithuanian has quite faithfully preserved the old sound combinations an, en, in, un (the same is true of Old Prussian, Curonian, Selonian, and, possibly, Semigallian), while they have passed in every case to uo, ie, ī, ū in Latvian; thus, Lithuanian rankà (Old Prussian rancko) = Latvian rùoka “hand,” Lithuanian peñktas (Old Prussian penckts) = Latvian piekt(ai)s “fifth,” Lithuanian pìnti = Latvian pīt “to weave, to twine,” and Lithuanian jùngas = Latvian jūgs “yoke.”

All old *VN sequences have Latvian reflexes with high vowels (in bold):

*in > *ĩ > ī [iː]        *un > *ũ > ū [uː]
*en > *ẽ > ie [iɛ]       *an > *ã > *õ > o [uɔ]

The intermediate nasal vowel stages are my guesses. Lithuanian has long vowels from old nasal vowels; their nasality is still indicated with a subscript nosinė: ą [aː], ę [æː], į [iː], and ų [uː].
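As a rough check, the four correspondences can be applied mechanically to the Lithuanian forms in the Britannica quote. This is only a Python sketch under my own assumptions: input is spelled without accent marks, the change applies only when n is followed by a consonant letter, and nothing else that separates the two languages (endings, for instance) is modeled. I use Britannica's spelling uo rather than standard Latvian o.

```python
import re

# Old *VN sequences and their Latvian reflexes, as listed in the post.
VN_REFLEXES = {"an": "uo", "en": "ie", "in": "ī", "un": "ū"}

# Assumption: the nasal must close a syllable, approximated here as
# "n followed by a consonant letter".
_VN_BEFORE_CONSONANT = re.compile(r"(an|en|in|un)(?=[bcdfghjklmprstvzčšž])")


def latvian_reflex(word):
    """Replace each vowel + n before a consonant with its Latvian reflex."""
    return _VN_BEFORE_CONSONANT.sub(lambda m: VN_REFLEXES[m.group(1)], word)


for lithuanian in ["ranka", "penktas", "pinti", "jungas"]:
    print(lithuanian, "->", latvian_reflex(lithuanian))
```

The outputs keep Lithuanian endings, so they match the Latvian forms in the quote only up to the root vowel.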

The shift of *ã to *õ reminds me of the shift of *aŋ to o in Tangut and its northwestern Chinese neighbor.

The partial raising of *ẽ and *õ contrasts with the lowering of French nasal vowels: e.g.,

fin [fɛ̃] < Latin finis

un [œ̃] < Latin unus

Unlike my pre-Latvian, French has no high nasal vowels. How could nasal vowels develop in two different directions?

The Latvian shift must have occurred after the borrowing of sods [suɔts] 'penalty' from some Slavic source similar to Polish sąd [sɔnt] 'court' (< Proto-Slavic *sǫdъ [sõdʊ]). I'd like to see more early loanwords with [uɔ].

Are there any nonnasal sources of [uɔ]?

Standard Mandarin has an [uɔ] that is the result of breaking after labial and coronal but not velar initials:

*pɔ > bo [puɔ] 'wave'

*tɔ > duo [tuɔ] 'many'

*kɔ > ge [kɤ] (not *go [kuɔ]!) 'song'
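The conditioning in those three examples can be written as a tiny Python rule. The sets of initials are my own illustrative assumption, since only p, t, and k are actually shown above; the function name is made up too.

```python
# Assumed classes of initials; only p, t, and k come from the examples.
LABIALS = {"p", "pʰ", "m"}
CORONALS = {"t", "tʰ", "n", "l"}
VELARS = {"k", "kʰ", "x"}


def reflex_of_open_o(initial):
    """Return the modern syllable for earlier initial + *ɔ."""
    if initial in LABIALS or initial in CORONALS:
        return initial + "uɔ"  # breaking: *ɔ > [uɔ]
    if initial in VELARS:
        return initial + "ɤ"   # no breaking: *ɔ > [ɤ]
    raise ValueError(f"initial not covered by this sketch: {initial!r}")


for initial in ["p", "t", "k"]:
    print("*" + initial + "ɔ", ">", reflex_of_open_o(initial))
```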

I can understand why a labial [u] would develop after a labial initial, but not why it would develop after a coronal initial.

THE THIRD PERSON AND ZERO MARKING

Jacques (2009: 4-6) drew parallels between the distribution of his Proto-Tangut third person patient suffix *-w and Northern Qiang -w 'id.' but also noted where the parallels break down:

In Qiang, the -w suffix is present on all transitive forms with 3rd person patient, whereas in Tangut, only the 1Sg>3 and 2Sg>3 forms have stem 2, not 1Pl>3, 2Pl>3 and 3>3 as would be expected if the distribution of *-w in proto-Tangut were identical to Northern Qiang -w.

However, the restriction of a suffix originally found on all forms to only 1Sg>3 and 2Sg>3 is not undocumented in languages of the Qiangic branch. In Rgyalrongic languages, the past tense -s suffix was originally found on all verbs (transitive and intransitive) and on all forms, as in modern-day Situ Rgyalrong. However, in Japhug Rgyalrong (where it is realized -t or -s depending on the subdialect), it only occurs in the 1Sg>3 and 2Sg>3 forms of open-syllable transitive verbs (Jacques 2004:337). Therefore, it does not seem absurd that the -w suffix of proto-Tangut would have undergone a restriction from all transitive 3rd person patient forms to only 1Sg>3 and 2Sg>3, follow a path of evolution typologically similar to the past tense suffix of Japhug.

This got me thinking about how much of Slavic lost *-tь in the third person singular present but retained other person/number suffixes:

Old Church Slavonic (i.e., old South Slavic) -tъ

West and modern South Slavic -Ø

Belarusian and Ukrainian -Ø ~ -tʲ (depending on conjugation)

Russian -t < -tъ

My understanding is that Hungarian usually has zero marking for the third person singular present except for -ik verbs which have the suffix -ik (Rounds 2001: 26). So I was not surprised to see zero marking for the third person singular present in Proto-Uralic according to verbix.com. (How did the third person singular present develop a final long vowel in Finnish? Is that a trace of a suffix cognate to Estonian -b which must be a post-Proto-Uralic innovation?)

I was going to propose that the third person singular might be more prone to phonetic erosion (or the absence of affixes) because it might be the most frequently used*. I assume we talk about third parties more often than we talk about ourselves or those listening to us. However, English is a counterexample - the present is unmarked except in the third person singular!

I'd like to see a survey of zero marking. Where is it most and least likely? I doubt there is a language in which all verb forms have affixes except for the second person dual subjunctive.

And what about subtractive marking: e.g., Russian knig, the genitive plural of kniga 'book'? In that particular case, I know that the genitive plural used to end in the short vowel ъ which was lost (e.g., in the third person singular present ending -tъ above), but does that explanation apply to all other cases? (I don't know, since I've never seen the phenomenon outside Slavic.)

*Could frequency be the reason for the third person being called prathamapuruṣa- 'first person' in traditional Sanskrit grammar?

