This entry concludes my trilogy on the Hindi song title "Āp jaisā koī" (Someone Like You).

Hindi koī 'someone' is from a combination of two interrogative pronouns:

Gloss | 'who' | 'what'
Proto-Indo-European | *kʷo (> Eng who) | *kʷid (> Latin quid)
Sanskrit | kaś- < ka-s before c- | -cid (a suffix 'some'; 'who' + 'some' = 'someone')
Pali | ko < *kaw < *kaẓ < *kas before voiced consonants | -ci
Hindi | ko- |


1. *kʷi- also has an unexpected Skt reflex ki- without palatalization: e.g., ki-m 'what'. Perhaps this k- is not a partial retention of original *kʷ-; it could have been restored by analogy with other k-forms like ka-s 'who'. (Start of the paradigm of kas, kim, etc. on p. 194 of Whitney 1896; the rest is like that of ta- 'he' on p. 189 of Whitney 1896.)

2. Pali is only given as an example of a Middle Indo-Aryan language. It is not the ancestor of Hindi. See below.

3. In Pali, ko, originally the most common sandhi variant of Skt kas, became the only nominative singular form of 'who', and replaced kaś- before -ci < Skt -cid. (Full paradigm of ko from Minayeff 1872.)

4. Hindi presumably lost medial -c- with compensatory lengthening of -ī.

5. The Hindi word for 'who' is kaun < Skt kaḥ punar < kas punar 'who again'. It is not from Pali ko pana.

6.3.0:36: Hindi o in koī 'someone' is a long [oː] and is not a retention of the short *o in Proto-Indo-European *kʷo 'who'. The vowel has gone nearly full circle: PIE *o > Skt a-s > *aẓ > *aw > o [oː].

LIKE A DRAGON'S EYE

Since I traced the origin of the first word in Hindi "Āp jaisā koī" (Someone Like You) in my last entry, I might as well cover the other two words.

Today I opened my copy of the new edition of Watkins' (2011) The American Heritage Dictionary of Indo-European Roots to look for the Proto-Indo-European roots of Hindi jaisā 'like':

Sanskrit (Turner #10458) | | Proto-Indo-European (Watkins 2011) | English cognates (Watkins 2011)
yādṛśa- 'which like' | ya-d 'which' | *yo-d* (relative stem) < *i- (pronominal stem) | yon(der); yes, yet; id, identical, identity, ibid. (with neuter -d also in Skt); ilk (with -lk cognate to like), if, iterate, item (p. 37)
 | dṛś 'see' | *dr̥k 'see' | dragon, drake, rankle, tarragon < Greek drakōn 'dragon' < 'monster with the evil eye' (p. 17)
 | -a- | *-o- | "Ultimately appears in the English combining vowel -o-, from Latin and Greek combining vowel -o- (used to join the members of a compound)" (p. 61)


1. If Skt *y- became Hindi j-, does Hindi y- have any sources other than y-initial loans (e.g., from Sanskrit)?

I wonder if Vietnamese *y- ever had a *j-stage along the way to becoming Middle Vietnamese d ~ [dy] which both later became Hanoi [z] (still spelled d). (See Gregerson 1969: 156-157.)

2. Is the first long ā of Skt yādṛśa- compensating for the loss of -d-? (The word should theoretically be *yad-dṛśa- with a short first a and a double -d-d-.)

3. Why do the shorter Skt form yādṛś- and its root noun dṛś- 'seeing' have Vedic nominative singulars yādṛṅ and dṛṅ with a final nasal -ṅ [ŋ]? They later have the expected nom. sg. -k in Classical Sanskrit.

4. Is the final long ā of Hindi jaisā a contraction of some longer ending added to Skt yādṛśa-?

*Watkins reconstructed *yo-. I added the neuter suffix *-d.

I prefer to think of the root as *y:

zero grade: *y-Ø-d = *id > Latin id 'it', Skt id-am 'this' (neuter), i-va 'like' (see below)

(*i was the vowel counterpart of *y)

*o-grade: *y-o-d > Skt yad

6.2.00:16: English it looks like a cognate of Latin id 'it' but is actually from Proto-Indo-European *kid. Old English hit 'it' (cf. Dutch het 'it') has an h- < PIE *k- that was lost later. Its masculine counterpart English he (cf. Dutch hij 'he') retains initial h-.

6.2.1:44: -is- is all that remains of Skt -dṛś- in Hindi jaisā. So I doubt many Hindi speakers realize that -is- is related to dekh-nā 'see' < *dek-ṣ- (somehow from Skt √dṛś 'see'; Turner #6507).

Perhaps Sanskrit iva 'like' (see above) has no descendants in modern Indo-Aryan (judging from its absence from Turner) because the extreme changes that reduced -dṛś- to -is- (would have?) reduced iva to almost nothing, and a longer word was more easily understood than *i.

YOUR OWN FEATURAL AVERAGING

I have long been puzzled by the Hindi polite second person pronoun आप āp which doesn't resemble any other Indo-European second person pronouns I can think of. It looks like Skt āp 'obtain' or āp- 'water'. Today I heard it in the song "Āp jaisā koī" (Someone Like You) and learned that āp is from Sanskrit ātman 'self' (Turner #1135).

(6.1.00:22: ātman is the source of Gandhi's epithet Mahātmā < mahā- 'great' + ātmā, nominative singular of ātman 'self'.)

The shift of āp from 'self' to 'you' reminds me of Japanese onore 'self' which became an impolite second person pronoun (and once was also a first person pronoun*).

The -p- of āp is like a featural average of -tm-:

- it's an oral stop like -t-

- but it's also labial like -m-
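The two bullet points above can be restated as a toy feature lookup. This is only a minimal sketch: the two-feature description (manner and place), the feature labels, and the helper name `featural_average` are illustrative assumptions, not a claim about how the change from -tm- to -p- actually proceeded.

```python
# Toy feature table; the labels are simplified assumptions for illustration.
FEATURES = {
    "t": {"manner": "oral stop", "place": "dental"},
    "m": {"manner": "nasal", "place": "labial"},
    "p": {"manner": "oral stop", "place": "labial"},
}


def featural_average(first, second):
    """Hypothetical helper: combine the manner of `first` with the place of
    `second` and return the consonant in FEATURES that matches, if any."""
    target = {
        "manner": FEATURES[first]["manner"],
        "place": FEATURES[second]["place"],
    }
    for sound, feats in FEATURES.items():
        if feats == target:
            return sound
    return None


print(featural_average("t", "m"))  # prints p
```

Taking the manner from the first consonant and the place from the second is one arbitrary way to "average"; the reverse choice (manner of m, place of t) would give a dental nasal, which is not in this toy table.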

I would have expected the polite second person pronoun to be from Skt yuṣmad- 'you' which turns out to be the source of the Hindi familiar second person plural pronoun तुम tum 'you' (Turner #5889). The initial t-, absent in Sanskrit, is presumably by analogy with the Hindi second person singular pronoun तू tū** < Skt tvam 'thou'.

*6.1.00:10: Descendants of Skt ātman 'self' also include first person pronouns:

Old Mārwāṛī āpa 'we (inclusive)'

Mālwāī Punjabi āpã 'we'

Gujarati āpaṇ 'we (inclusive)'

Kacchī Sindhi ̃ ʻwe, usʼ, pāṇ ʻweʼ

Unlike Jpn onore, none of these forms are singular. Why are they all plural?

**6.1.00:11: Why does Google Translate convert English you into the Hindi nonword तु tu with a short vowel instead of the correct तू with a long vowel?

_____ = OSCAN

Last night, Schwering (2010) made me realize something obvious: of course Latin would have Greek loans through other languages like Oscan as well as loans directly from Greek. Geography necessitates this: the original Latin-speaking area was isolated and far from speakers of Greek.

The situation reminds me a bit of how Chinese loanwords entered Japanese through the Korean peninsula:

Greek > Oscan (and others?) > Latin

Chinese > Paekche (and others?) > Japanese

How do we know that Latin Aiāx 'Ajax' is from Oscan? Here's my understanding of the process - any errors are mine and not Schwering's:

1. Greek Αἴᾱς <Aíās> is borrowed into Oscan with a final -s.

2. In Oscan, the old *-k-s nominative singular became -s(s). However, the final *-k of the stem remains in oblique cases: e.g.,

meddíss < *-k(e)s (an Oscan title; nom. sg.; Latin meddix may have been borrowed from Oscan before the *-k-s > -ss shift.)

(5.31.4:17: This word was written in the Greek alphabet as μεδδειξ <meddeix> (Buck 1904: 23), a spelling that may reflect the earlier *-k-s.)

medíkeís (gen. sg.)

3. Therefore Oscan speakers declined Aias as a -k stem by analogy with other -s(s) nouns:

*Aias(s) (nom. sg.; cf. meddíss)

*Aiakeís (gen. sg.; cf. medíkeís)

4. Latin speakers then borrowed the Oscan form as a -k stem and changed the final consonant of the nominative singular to match that of other -k stems including earlier Oscan loans like meddix:

Oscan *Aias(s) : Latin Aiāx (nom. sg.)

Oscan *Aiakeís : Latin Aiāc-is (gen. sg.)
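Steps 3 and 4 amount to a proportional analogy (meddíss : medíkeís = *Aias(s) : X) followed by a Latin adaptation. The sketch below is a hedged illustration only: the function names are hypothetical, the string operations are a simplification, and vowel length (Latin ā) and other spelling details are ignored.

```python
def oscan_oblique_stem(nominative):
    """Strip the nominative -s(s) and restore a stem-final -k,
    on the model of meddíss beside oblique medík-."""
    return nominative.rstrip("s") + "k"


def oscan_genitive(nominative):
    """Add the genitive singular ending -eís to the analogical -k stem."""
    return oscan_oblique_stem(nominative) + "eís"


def latin_forms(k_stem):
    """Latin -k stems: nominative in -x (= k + s), genitive in -c-is.
    Vowel length is not modeled here."""
    base = k_stem[:-1]
    return base + "x", base + "cis"


print(oscan_genitive("Aias"))                   # Aiakeís
print(latin_forms(oscan_oblique_stem("Aias")))  # ('Aiax', 'Aiacis')
```

The same result comes out whether the Oscan nominative is written Aias or Aiass, since both spellings of the final sibilant are stripped before -k is restored.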

5.31.4:14: The Oscan letter transliterated as í with an acute accent "is used to indicate an open i-sound, representing etymologically a short i, a [long] ē, a short e in hiatus, and occurring regularly in i-diphthongs and in the combination representing [long] ī" (Buck 1904: 22).

AJA_ FROM _____?

Thanks to a generous dōnātor, I got to see a copy of Schwering (2010) which reveals a solution to the mystery of Ajax. I'm embarrassed to admit that I didn't guess it myself. One clue is the number of underscores after "from" in the title of this post. More soon.

A TOL TALE (PART 1)

At the end of "Pre-Greek: The Pre-Greek loans in Greek" (2007), Robert SP Beekes wrote,

The more we know about Indo-European, the less is possible. As our reconstructions become more and more precise, they have to conform to all the rules we have established by now. This holds for all etymological work: in a way, then, it becomes more difficult.

Similarly, the more we know about the 'Altaic' languages, the less is possible. The 'Altaic' family itself may not be possible.

But let's suppose it existed. How far can I go to reconstruct an 'Altaic' word for stone? On the 19th I came up with a new (?) take on an old etymology which I've been refining over the past couple of days. It happens to involve l and d which were the subjects of my last two posts on Ulysses and Tangut ld-.

Proto-Altaic | *dʸ | ɔɔ | λ | ɔ | gun
Proto-Turkic | d | aa | λ | |
Proto-Mongolic | c | i | l | a | ɣun
Proto-Tungusic | j | o | l | o |
Proto-Koreanic | t | o | r | a | k
Proto-Japonic | | i | s | o | -i


Starostin reconstructed *ty-, but a voiced consonant is needed to account for voiced initials in Turkic and Tungusic, and a cluster *ty- or *dy- should correspond to a Proto-Koreanic cluster *ty-, not simple *t-. (A workaround: PA *tiy- > PK *ty- but PA *ty- > PK *t-.)

*ɔ(ɔ) is a compromise between *a(a) and *o. It's unlike PM and PJ *i, but see below.

5.29.20:21: *-gun could be dropped from the PA reconstruction and regarded as a Proto-Mongolic suffix, but a maximalist who wants to relate everything would see a connection with PK *-k. 


*λ is the source of Chuvash l and non-Chuvash Turkic š/s.


The *ɔɔ became a palatal vowel to assimilate to the preceding palatalized consonant and the following palatal consonant:

*dʸɔɔλɔgun > *dʸiλagun > *cilaɣun

Can you spot what is improbable about this derivation? Answer next time.

(5.29.23:25: Also next time: a 'deeper' PM reconstruction.)


PK has no initial voiced obstruents, so PA *dʸ- devoiced to PK *t-.

(5.29.20:26: Only the beginning of the word survives in modern Korean 돌 tol.)

For more on Koreanic words for 'stone', see "Precious Petros" (part 1 / part 2)


Japanese ishi 'stone' looks nothing like any of the above forms. It has a very complicated derivation involving a PJ suffix *-i in place of the original *-gun:

*dʸɔɔλɔ(gun) > *dʸɔɔiλɔ > *jɔiɕɔ > *yoiso-i > *yɨsɨ > *yisi > Old Japanese isi > modern ishi

5.29.20:42: One could even have *-ɔgun be the end of the PJ stem:

*-ɔgun > *-ɔgũ > *-ɔgu > *-ɔɣu > *-ɔu > *-ɔ > *-o
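The chain above can be replayed as a list of ordered string rewrites. This is a rough sketch under stated assumptions: the (old, new) pairs are read off the adjacent stages of the chain, the comments describing each step are glosses added here rather than the author's labels, and `derive` is a hypothetical helper.

```python
# Ordered rewrite steps inferred from adjacent stages of the chain.
STEPS = [
    ("un", "ũ"),  # final nasal becomes nasalization on the vowel
    ("ũ", "u"),   # nasalization is lost
    ("g", "ɣ"),   # stop weakens to a fricative
    ("ɣ", ""),    # fricative is lost
    ("ɔu", "ɔ"),  # vowel sequence simplifies
    ("ɔ", "o"),   # final vowel quality
]


def derive(form, steps):
    """Apply each rewrite in order and record every intermediate stage."""
    stages = [form]
    for old, new in steps:
        form = form.replace(old, new)
        stages.append(form)
    return stages


print(" > ".join("*-" + stage for stage in derive("ɔgun", STEPS)))
```

Because the rules are applied strictly in order, reordering STEPS would change the intermediate stages, which is the point of writing a derivation as a sequence rather than as an unordered set of correspondences.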


In "Pre-Greek: The Pre-Greek loans in Greek" (2007), Robert SP Beekes listed "variations which are not found in inherited [Greek] words" and which "are due to the adaptation of words of a foreign language [his non-Indo-European 'Pre-Greek' substratum] to Greek." One of those variations was <l> ~ <d> which was in last night's list of variants of 'Ulysses'. In his section 5.7, Beekes speculated that <l> ~ <d> "might point to a [Pre-Greek] dental fricative, ƛ." I am not sure what ƛ represents. It is not an IPA symbol; in Americanist usage it stands for a voiceless alveolar lateral affricate [tɬ]. I would rather guess that <l> ~ <d> might point to a Pre-Greek voiced alveolar lateral affricate [dɮ] (= λ in Americanist notation).

Now I wonder if Tangut also had [dɮ]. For years, I've followed Gong's (1997) reconstruction of five class IX consonants in Tangut with phonetic speculations of my own and have been reluctant to adopt more elaborate reconstructions like Tai's six-consonant set including clusters (2008):

Gong 1997 l- lh- r- z- ź-
My phonetic speculation l- ɬ- r- ɮ- ʐ-
Tai 2008 l- ld- lh- r- ʁz- ʁź-
Most common Tibetan transcriptions (Tai 2008) l- ld-, zl- lh- r- gz- gzh-

I was skeptical about ld- because that is the only lC-cluster in Tai's reconstruction. Are there other languages which have ld- as their only lC-cluster? However, now I think it could be reinterpreted as

voiced [dɮ] corresponding to Gong's voiced l-

one instance was transcribed in Tibetan as c- possibly implying an affricate

voiceless [tɬ] corresponding to Gong's voiceless lh- (my [ɬ])

The -d- in the Tibetan transcriptions reflects the stop halves of the affricates whereas the z- reflects the fricative halves of the affricates.

Alas, there are no Tibetan transcriptions* like

*lj- and *ldz- for [dɮ]

*sl-, *lc(h)-, and *lts(h)- for [tɬ]

which would make me feel more comfortable about my lateral affricate hypothesis.

My new set of class IX initials could be diagrammed as

L- or R- | voicing | approximant | fricative | affricate
lateral | voiceless | (hl-?) | ɬ- | tɬ-
lateral | voiced | l- | ɮ- | dɮ-
retroflex | voiced | r- | ʐ- | (VII: dʐ-)

For parallelism, I would like a voiceless approximant hl- corresponding to the voiceless fricative ɬ-, but I do not know of any evidence for a distinction between the two. If Tangut ever had both, they might have merged.

The affricate dʐ- corresponding to the fricative ʐ- is a class VII initial and is therefore in parentheses.

Perhaps there had once been a voiceless *hr- corresponding to the class VII voiceless fricative ʂ- and voiceless affricate tʂ-. Maybe hr- merged into ʂ- just as *hl- might have merged into ɬ-.

I remain hesitant to reconstruct ʁZ-clusters instead of Z-fricatives. Is there any language that has ʁZ-clusters but lacks simple ʁ- and Z-fricatives? Simple z- and zh- Tibetan transcriptions also exist. The common preinitial g- may be a tonal letter rather than an attempt to transcribe Tangut ʁ-.

*The hypothetical transcriptions *lch-, *lts-, *ltsh-, and *ldz- are un-Tibetan. The transcribers were not hesitant to write unusual consonant combinations for Tangut consonants that lacked counterparts in their dialect(s) of Tibetan.

Tangut fonts by Mojikyo.org
Tangut radical and Khitan fonts by Andrew West
Jurchen font by Jason Glavy
All other content copyright © 2002-2012 Amritavision