
18.5.5.23:59: THE GENDER OF 'DATE' IN BALTO-SLAVIC AND ROMANCE

On the same Wiktionary page as Dutch datum 'date' (masculine despite its Latin neuter ending -um!) are

Czech datum (neuter); cf. Slovak dátum (masculine; why a long á that doesn't match Czech or Latin?; its neighbor Hungarian dátum also has a long vowel)

Serbo-Croatian and Slovene datum (masculine)

Macedonian <datum> is also masculine. The shift to masculine in Slavic is understandable since consonant-final nouns are generally masculine, and Latin -um is not a Slavic suffix and hence prone to reinterpretation as the ending of a stem.

Leaving Slavic, Latvian has no neuter, and its feminine stems generally end in vowels, so masculine datums is also understandable.

However, Latvian's sister Lithuanian has feminine data (which looks like the Latin plural!) rather than masculine †datumas (see Wikipedia on Lithuanian declension).

And going back to Slavic, Polish also has feminine data, and Bulgarian, Macedonian, Belarusian, Russian, and Ukrainian have feminine <data>. Romance languages have feminine data (French date and Romanian dată) too. Wiktionary derives the Romance forms from a Late Latin data. fdb explains:

Italian, Spanish, Portuguese (etc.) data, and French date (whence English date) are all taken from Mediaeval Latin data, the plural of classical Latin datum, but reinterpreted in these languages as a singular noun. German and Dutch use the classical singular form datum.

All of these are bookish borrowings from Mediaeval or Classical Latin (so-called cultisms) and not organic descendants of the Latin words.

[Someone asks what organic descendants would look like.]

In that case one would expect *dada in Spanish, Portuguese and Italian.

Are the -um forms in Slavic and Latvian borrowings from German Datum?

5.6.0:01: English date then got borrowed into German as das Date which is presumably neuter by analogy with Datum.

5.6.0:09: Added quotation from fdb.

5.6.0:28: Danish date from English has common gender (cf. German above).

5.6.0:32: Added Romanian dată.


18.5.4.23:59: THE GENDER OF DUTCH '-ISM'S AND 'DATE'

Not in time for May Day ...

French communisme is masculine, as is its Latinized German equivalent Kommunismus with a restored Latin masculine nominative singular ending -us. So why is Dutch communisme (and other -isme words like socialisme) neuter?

Conversely, datum has a Latin neuter nominative singular ending -um, and Datum is still neuter in German. So why is Dutch datum masculine unlike, say, neuter museum which is still neuter in Dutch?

Are the genders by analogy with semantically similar words? Was there ever a time when de communisme and het datum were acceptable?

5.5.0:33: Google Books has examples of het datum from the 18th and 19th centuries. But I can't find any examples of de communisme in Dutch (as opposed to French where that is a preposition-noun sequence rather than a definite article-noun sequence).

Treffers-Daller (1994: 140) discusses French-Dutch gender mismatches and mentions Van Marle's hypothesis that French borrowings are marked and may receive the marked gender: the less frequent neuter gender (only 25% of Dutch nouns are neuter according to Tuinman 1967).

She also writes,

According to Volland (1986), many French loans obtain neuter gender when borrowed into German. About 60 percent of the borrowings keep the original gender in German, and 40 percent are allocated another gender. In most cases it is the masculine nouns who become neuter in German. It is remarkable that the same tendency for masculine words to become neuter exists in German and in Dutch.

Obviously Kommunismus is not one of those masculine words (though its -us may have made it resistant to gender shift).


18.5.3.23:59: CZECH VOWEL ASYMMETRY AGAIN

Judging from the IPA for Czech at Wikipedia, Czech vowels are phonetically as well as distributionally asymmetrical:

/iː/ [iː]                    /u uː/ [u uː]
/i/ [ɪ]
                             /o oː/ [o oː]
/e eː/ [ɛ ɛː]
              /a aː/ [a aː]

The front part of the system 'tilts downward' with the exception of /iː/ which is high.

Short /i/ is lower than long /iː/ and has no back counterpart at the same height.

/e eː/ are lower than /o oː/.
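The asymmetry can be checked mechanically. Here is a minimal sketch, assuming a made-up numeric height scale (larger = higher) that only encodes the relative order of the phonetic values in the chart above; the names HEIGHT, CZECH, and unmatched_front are mine, not from any library.

```python
# Relative heights of the phonetic values in the Czech chart.
# The numbers are an illustrative assumption; only their order matters.
HEIGHT = {"i": 4, "u": 4, "ɪ": 3, "o": 2, "ɛ": 1, "a": 0}

# phoneme -> (front/back, phonetic quality without length)
CZECH = {
    "/i/": ("front", "ɪ"),
    "/iː/": ("front", "i"),
    "/e/": ("front", "ɛ"),
    "/eː/": ("front", "ɛ"),
    "/u/": ("back", "u"),
    "/uː/": ("back", "u"),
    "/o/": ("back", "o"),
    "/oː/": ("back", "o"),
}


def unmatched_front(system):
    """Return front phonemes with no back vowel at the same height."""
    back_heights = {HEIGHT[q] for pos, q in system.values() if pos == "back"}
    return sorted(
        phoneme
        for phoneme, (pos, q) in system.items()
        if pos == "front" and HEIGHT[q] not in back_heights
    )


print(unmatched_front(CZECH))
```

Only /iː/ lines up with a back vowel; short /i/ and both /e eː/ have no back partner at their height.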

How did this system come about? /i iː/ are from earlier front *i *iː and central *ɨ *ɨː.

Was there a Ukrainian-like phase in which the central high vowels became *ɪ *ɪː? (Ukrainian has no phonemic vowel length, though.) The four front vowels in stage 2 then merged into an English-like subsystem with a higher long vowel and a lower short vowel in stage 3:

          short        long
Stage 1   *i   *ɨ      *iː   *ɨː
Stage 2   *i   *ɪ      *iː   *ɪː
Stage 3   [ɪ]          [iː]
Unlike Czech, Slovak is next door to Ukrainian, and according to the IPA at Wikipedia it has no [ɪ]; its vowel system is truly symmetrical on the phonetic level if one ignores the increasingly marginal vowel [æ]:

[i iː]             [u uː]
[e eː]             [o oː]
         [a aː]

The Slovak phonology article at Wikipedia, however, paints a more complex picture: e.g., /e eː/ [e̞ e̞ː] may be phonetically higher than /o oː/ [ɔ̝ ɔ̝ː] - the reverse of Czech. (Did the presence of low [æ] - a vowel absent from Czech - incentivize speakers to raise /e eː/ for greater contrast during its heyday in the past?) Nonetheless it seems that length is not correlated with height differences unlike Czech where short and long /i/ have different heights.

Like Czech /i iː/, Slovak /i iː/ are from earlier front *i *iː and central *ɨ *ɨː. So I suspect Slovak also had a Ukrainian-like phase in which the central high vowels became *ɪ *ɪː.

But maybe at some earlier point Czech and/or Slovak had a Rusyn-like stage in which central *ɨ *ɨː coexisted with front *ɪ *ɪː. I still don't understand how Rusyn can have both central /ɨ/ and front /ɪ/ since I assume both are from *ɨ. Are they in complementary distribution? Is one native and one borrowed?

5.4.0:40: Are Czech /e eː/ lower mid because they merged with */ě/ *[ɛː]? */ě/ was historically long, but its reflexes in Czech are both long and short for reasons I don't understand:

*bělъjь > bílý /biːliː/ 'white'

*světъ > svět /svjet/ 'world'

The short reflex is /e/ which may be preceded by a secondary palatal consonant: e.g., /j/ in the case of /svjet/.


18.5.2.23:55: CZECH VOWEL ASYMMETRY

Having written about Slavic and vowels in my last two entries, I'm going to combine the two topics.

The standard Czech vowel system appears symmetrical if one only looks at vowels in isolation. Each short vowel has a long counterpart:

short   /a/    /i/    /u/    /e/    /o/
long    /aː/   /iː/   /uː/   /eː/   /oː/

And the diphthongs form a triangle:

/eu/          /ou/
       /au/

But distribution tells a more complex story.

Original *uː became /ou/ except "chiefly in noun prefixes" (Short 1993: 456): e.g., úraz 'injury' but urazit 'to injure'. Why was the prefix *u lengthened to an *uː later preserved in nouns? I still don't understand the backstory of length in Slavic.

Original *oː became uo and then a new /uː/ written <ů> (which I think of as <o> atop <u>); cf. Polish <ó> /u/ and Slovak <ô> /uo/ from earlier *oː. (I'd like to see a chronology of *oː-shifts in West Slavic.)

Loanwords supplied a new /oː/ and /au eu/ to balance /ou/.

Those back vowel developments did not have exact front vowel parallels. *iː did not become †/ei/ (though Short 1993: 464 reports ý /ɨː/ > /ej/ in colloquial Czech), and *eː only sometimes became /iː/ (Short 1993: 464).


18.5.1.23:59: INDEPENDENT VOWEL SYMBOLS IN THE INDIC SCRIPTS OF THE PHILIPPINES

Indic scripts typically have two kinds of vowel symbols:

- dependent vowel symbols attached to/in 'orbit' around consonant symbols

- independent vowel symbols

Depending on the script, vowels may be written with dependent vowel symbols plus a carrier <°a>, independent vowel symbols, or a mix of the two.

The Indic scripts of the Philippines generally only have three independent vowel symbols each, and on closer observation, some of those symbols are derived from others:

Baybayin for Tagalog on central Luzon in the north: three truly independent symbols <°a °i °u>

Hanunoo on southern Mindoro in the center: independent <°a °u>; <°i> looks like <°a> plus a stroke on the bottom right (unlike either the dependent vowel <i> on the top or the dependent vowel <u> on the bottom)

Buhid on southern Mindoro in the center: independent <°a °u>; <°i> looks like <°a> plus a stroke on the bottom (like the dependent vowel <u> rather than the dependent vowel <i> on the top)

Tagbanwa on Palawan in the southwest: <°a °i> have the same basic shape with different extra strokes (one on the bottom for <°a> and another on top for <°i>; neither stroke matches the dependent vowel <u> on the bottom or the dependent vowel <i> on the top); only <°u> is not derived from another symbol

Kulitan for Kapampangan on central Luzon in the north: independent <°i °u>; <°a> looks like <°u> plus an extra stroke on the bottom left (unlike the dependent vowel <u> on the bottom); <°e °o> look like <°a°i> and <°a°u>, reflecting their apparent origin as "monophthongized diphthongs".

Tagalog is the most conservative; it alone preserves three completely different vowel symbols that still resemble their Indic prototypes.

It is not surprising that the Mindoro scripts have the same innovation (replacing <°i> with a <°a>-derivative).

Tagbanwa and Kulitan seem to have each gone their own way. Tagbanwa is isolated by the sea, but Kulitan is next door to Baybayin.


18.4.30.23:59: WHAT HAPPENED TO UKRAINIAN NOMINATIVE PLURAL ADJECTIVES?

I almost 'corrected' Ukrainian <zorjani> 'stellar (nom. pl.)' to †<zorjany> with a <y> ending that I expected by analogy with Russian <ye> and Belarusian <yja> < *-ye after 'hard' (nonpalatalized) stems. But the nominative plural ending is <i> regardless of stem type. Compare:

stem type            'hard'                      'soft' (palatalized)
gloss                'new'                       'autumn'
gender/case/number   m. nom. sg.   nom. pl.      m. nom. sg.   nom. pl.
Russian              <novyj>       <novye>       <osennij>     <osennie>
Belarusian           <novy>        <novyja>      <asenni>      <asennija>
Ukrainian            <novyj>       <novi>        <osinnij>     <osinni>

Did <i> spread by analogy through all adjective paradigms despite the fact that hard stems outnumber soft stems (which would have led me to guess that <y> would win out)? Did the higher frequency and lower markedness of <i> in Ukrainian help it to defeat its less palatal competitor <y>?
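The contrast is easy to verify by stripping the stems off the nominative plural forms in the table. This is a minimal sketch; the stem segmentation (nov-, osenn-, asenn-, osinn-) is my own assumption for illustration, and the names NOM_PL, ending, and languages_with_one_ending are hypothetical.

```python
# (stem, nominative plural) pairs; the forms are copied from the table,
# the stem boundaries are an assumed segmentation.
NOM_PL = {
    "Russian": {"hard": ("nov", "novye"), "soft": ("osenn", "osennie")},
    "Belarusian": {"hard": ("nov", "novyja"), "soft": ("asenn", "asennija")},
    "Ukrainian": {"hard": ("nov", "novi"), "soft": ("osinn", "osinni")},
}


def ending(stem, form):
    """Return whatever follows the stem in the transliterated form."""
    if not form.startswith(stem):
        raise ValueError(f"{form!r} does not start with {stem!r}")
    return form[len(stem):]


def languages_with_one_ending(data):
    """List languages whose hard and soft stems share one nom. pl. ending."""
    return [
        lang
        for lang, types in data.items()
        if ending(*types["hard"]) == ending(*types["soft"])
    ]


print(languages_with_one_ending(NOM_PL))
```

Russian and Belarusian keep distinct <y>-initial and <i>-initial endings; only Ukrainian has a single ending for both stem types.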

5.1.0:07: Added table.

5.1.22:22: Maybe Ukrainian shares an areal feature with Polish which has soft nowi 'new (m. pers. nom. pl.)' instead of †nowy. (But the non-m. pers. nom. pl. is still hard nowe rather than soft †nowie.)

Slovak, another neighbor of Ukrainian, has a mixed pattern like Polish: soft noví 'new (m. anim. nom. pl.)' ~ hard nové (other nom. pl.). A consistently hard paradigm would have †nový ~ nové and a consistently soft paradigm would have noví ~ †novie. (Both í and ý are /iː/, but in the past I assume ý was something like /ɨː/. No long /ieː/ exists.)

So does Czech: noví 'new (m. anim. nom. pl.)' instead of †nový. (As in Slovak, both í and ý are /iː/, but in the past I assume ý was something like /ɨː/.) Unlike any of the above languages, Czech has three types of nominative plurals:

1. soft noví 'new (m. anim. nom. pl.)'

2. hard nové 'new (m. inanim. + fem. nom. pl.)' instead of soft †noví < *-ie

3. hard nová 'new (neut. nom. pl.)' instead of soft †noví < *-ie < *-a̋

Interslavic doesn't have a 'soft' e, so the non-m. anim. nom. pl. has to be hard:

soft novi 'new (m. anim. nom. pl.)'

hard nove 'new (other nom. pl.)'

This two-way distinction is hard for me to grasp since I'm accustomed to Russian having a single form for both categories.


18.4.29.23:59: STAR WARS IN SLAVIC

Having just linked to the Belarusian Wikipedia's entry on Star Wars, I was surprised by how Star was translated as <Zornyja> which isn't cognate to the 'star' word in most of the other Slavic titles for the movie:

South Slavic

Bosnian zvijezda 'star'

Croatian Zvjezdani 'stellar'

Serbian zvezda 'star'

Slovenian zvezd 'of the stars'

Bulgarian <Mežduzvezdni> 'interstellar'

Macedonian <zvezdite> 'the stars'

West Slavic

Polish Gwiezdne 'stellar'

Silesian Gwjezdne 'stellar' (did an author of this article translate the title?)

Slovak Hviezdne 'stellar'

East Slavic

Russian <zvëzdnye> 'stellar'

The exceptions are Ukrainian <Zorjani> 'stellar' and Czech Star (Wars) (untranslated!).

I was expecting a Belarusian adjective derived from <zvjazda> 'star' (the name of this newspaper that I've seen online) - something like Interslavic zvězdne. <Zvjazdnyja>?

Wiktionary derives Belarusian <zorka> 'star' from Proto-Slavic *zorja. But is the word attested outside East Slavic? The only cognate I know of is Ukrainian <zirka> 'star' whose <i> is unexpected; normally *o > <i> before *ь or *ъ, not *a. (The Ukrainian adjective <Zorjani> 'stellar' preserves *o.)

4.30.1:30: Filled out the list of equivalents of Star and added the final note about <Zorjani>.

4.30.21:21: I might as well survey the second half of the title in Slavic as well. I'm going to guess that it's some cognate of Belarusian <vojny> 'wars' almost everywhere: cf. Interslavic vojny 'wars'. I seem to recall an exception other than the untranslated Wars in Czech - ah, it was Serbo-Croatian!

South Slavic

Serbo-Croatian ratovi 'wars' (but would vojne be theoretically possible?)

Slovenian vojna 'war'

Bulgarian <vojni> 'wars'

Macedonian <vojna> 'war'

West Slavic

Polish and Silesian wojny 'wars'

Slovak vojny 'wars'

East Slavic

Ukrainian <vijny> 'wars' (nom. pl. of <vijna>; as with <zirka>, why did *o become <i> even without a following *ь or *ъ?)

4.30.23:23: Duh, the word was *vojьna in Proto-Slavic. And I suppose <zirka> 'star' is from an earlier *zorьka or *zorъka.

Russian <vojny> 'wars'
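The *o > <i> pattern from the notes above can be sketched as a toy rewrite rule. This assumes the simplified condition stated there (*o becomes <i> when the next vowel-like segment is *ь or *ъ) and then deletes every jer, which ignores jers that survive as full vowels; ukrainian_reflex is a hypothetical helper, not a real sound-change engine.

```python
import re

JERS = "ьъ"
VOWELS = "aeiou" + JERS


def ukrainian_reflex(proto):
    """Apply the toy rule to a transliterated Proto-Slavic form."""
    form = proto.lstrip("*")
    # *o > i if only consonants separate it from a following jer.
    form = re.sub(rf"o(?=[^{VOWELS}]+[{JERS}])", "i", form)
    # Simplification: drop all jers afterwards.
    return re.sub(rf"[{JERS}]", "", form)


for word in ["*vojьna", "*zorьka", "*zorъka", "*zorja"]:
    print(word, ">", ukrainian_reflex(word))
```

Under this rule *vojьna and *zorьka/*zorъka come out with <i>, while *zorja keeps its *o, matching <vijna>, <zirka>, and the *o preserved in <Zorjani>.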

Serbo-Croatian rat turns out to be the cognate of Ancient Greek ἔρις éris 'strife' ... and English earnest!? I see the word is in East Slavic as well, but not West Slavic, so vojna is the best choice for Interslavic since it's understood across the entire family.


Tangut Yinchuan font copyright © Prof. 景永时 Jing Yongshi
Tangut character image fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2018 Amritavision