Indo-European Cognate Words

There is a new resource called IE-CoR:

https://iecor.clld.org/

It is a cognate database built on a basic word list of 170 meanings called 'Jena 170', similar to the Swadesh 100 list. I didn't find any documentation of the list, but that does not matter here.

 

I have always wondered whether there are words that remain cognate across the major Indo-European languages of the modern world. With this resource, I did a little analysis.

 

My approach was to go through all 170 "meanings" and check whether the modern English, Hindi, and Spanish words for the same meaning are all cognates. English, Hindi, and Spanish are the most widely spoken Indo-European languages today, so I started there. My findings are below:

 

1. There are 19 meanings out of the Jena 170 list for which the English, Hindi, and Spanish words are cognates (11%). They are:

A) one, two, three, four, five;

B) nail, nose, tongue, tooth, foot, eye;   

C) name, new;

D) sun, star, day;

E) full, sew, horn.
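
The check behind finding 1 could be automated along the following lines. This is only a minimal sketch, assuming the data is available as two CSV tables in the CLDF style: one table of word forms and one of cognate judgements. The column names (ID, Language_ID, Parameter_ID, Form_ID, Cognateset_ID), the use of language names as IDs, and the cognate-set labels in the toy rows are assumptions for illustration, not the actual IE-CoR export.

```python
import csv
import io
from collections import defaultdict


def shared_meanings(forms_csv, cognates_csv, languages):
    """Return the meanings whose words in every listed language share a cognate set."""
    languages = set(languages)

    # Map each form ID to the cognate set(s) it has been assigned to.
    sets_of_form = defaultdict(set)
    for row in csv.DictReader(cognates_csv):
        sets_of_form[row["Form_ID"]].add(row["Cognateset_ID"])

    # meaning -> language -> union of cognate sets of that language's words
    by_meaning = defaultdict(lambda: defaultdict(set))
    for row in csv.DictReader(forms_csv):
        if row["Language_ID"] in languages:
            found = sets_of_form.get(row["ID"], set())
            by_meaning[row["Parameter_ID"]][row["Language_ID"]] |= found

    shared = []
    for meaning, per_language in by_meaning.items():
        if set(per_language) != languages:
            continue  # at least one language has no word recorded
        if set.intersection(*per_language.values()):
            shared.append(meaning)
    return sorted(shared)


# Toy rows only; the IDs and cognate-set labels are hypothetical placeholders.
FORMS = """ID,Language_ID,Parameter_ID,Form
f1,English,two,two
f2,Spanish,two,dos
f3,Hindi,two,do
f4,English,eat,eat
f5,Spanish,eat,comer
f6,Hindi,eat,khānā
"""

COGNATES = """Form_ID,Cognateset_ID
f1,set-two
f2,set-two
f3,set-two
f4,set-eat-1
f5,set-eat-1
f6,set-eat-2
"""

result = shared_meanings(
    io.StringIO(FORMS), io.StringIO(COGNATES), ["English", "Hindi", "Spanish"]
)
print(result)
```

With real data, the two StringIO objects would be replaced by open file handles, and a meaning counts as shared as soon as any one cognate set is common to all three languages.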

 

2. So, at least by the criterion of "cognate retention" in English, Hindi, and Spanish, the words that stay cognate are A) basic numbers and B) body parts, mostly on the face; the rest are mostly nouns. The words in C) both start with "n", and those in D) loosely form a group.

 

3. Now look at this list of 19 from a different angle. The major modern Indo-European languages beyond English, Hindi, and Spanish, based on the Ethnologue (2023) figures for total L1+L2 speakers as given on Wikipedia, include French, Bengali, Portuguese, Russian, Urdu, German, Marathi, Persian, and Italian. Do the above words have cognates in these other 9 modern I-E languages? I expected the Indo-Aryan, Romance, and Germanic languages to work, so the real test is Russian, and maybe Persian. The results are:

- Cognates in all 12 modern I-E languages: one, two, three, four, five, nail, name, new, sun. For this list, I also checked whether cognates exist in the ancient languages Vedic Sanskrit, Pali, Latin, and Greek (the real test is Greek); only "one" lacks a cognate in Greek.

- Cognates in 11 modern I-E: foot, star, tooth (not cognate in Russian); nose (not cognate in Tehran Persian)

- Cognates in 10 modern I-E: full (not in Bengali, Marathi), day, sew (not in Persian, German); tongue (not in Urdu, Persian)

- Cognates in 9 modern I-E: horn (not in Russian, Persian)

- Cognates in 8 modern I-E: eye (not cognate in Russian, Persian, Bengali, Marathi)
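
The tier labels in this list amount to 12 minus the number of languages named as non-cognate. A small sketch of that arithmetic for a handful of the entries, with the exclusion lists copied from the bullets above; the variable and function names are only illustrative.

```python
from collections import defaultdict

# The 12 modern languages considered in this post.
LANGUAGES = [
    "English", "Hindi", "Spanish", "French", "Bengali", "Portuguese",
    "Russian", "Urdu", "German", "Marathi", "Persian", "Italian",
]

# Languages whose word is NOT cognate, for a few of the entries listed above.
NOT_COGNATE = {
    "two": [],
    "foot": ["Russian"],
    "nose": ["Persian"],
    "full": ["Bengali", "Marathi"],
    "tongue": ["Urdu", "Persian"],
    "eye": ["Russian", "Persian", "Bengali", "Marathi"],
}


def coverage(word):
    """Number of the 12 languages whose word for this meaning is cognate."""
    return len(LANGUAGES) - len(set(NOT_COGNATE[word]))


# Group the words by how many languages keep the cognate.
tiers = defaultdict(list)
for word in NOT_COGNATE:
    tiers[coverage(word)].append(word)

for count in sorted(tiers, reverse=True):
    print(count, ", ".join(tiers[count]))
```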

 

To repeat, the true top winners, with cognates in all 12 modern languages and all 4 ancient ones, are: two, three, four, five, nail, name, new, sun

High honorable mentions: one, nose, tooth, foot, star    

 

3. Returning to cognates among just English, Hindi, and Spanish, there are also 7 other words that may or may not be counted:

F) eat, flower, right

G) navel

H) give, smoke, snake. 

 

4. Eat, flower, and right might not ultimately be cognates, according to the IE-CoR database:

Eat is from PIE *h₁ed-, but Hindi khānā is from PIE *kʰād-. I don't know enough linguistics to tell whether these two are irreconcilable or whether they could reflect a dialectal difference within PIE.

Flower is from PIE *bʰleh₃-, but Hindi phūla is from Sanskrit phulla-. Is the Sanskrit form ultimately derivable from *bʰleh₃-? I can't tell, so this is also a "maybe".

Right is from PIE *h₃reg̑-, as is Spanish derecho; Hindi dāyām̐ is from PIE *dek̑s-. The PIE roots look quite different; it is just that the Spanish and Hindi words both start with d, which I find a bit confusing.

Note that in all 3 cases, it is the Hindi word that is not cognate with the English and Spanish ones.

 

5. Navel is a strange case. Navel is the English gloss for the meaning, but the IE-CoR database uses "belly button" as the English lexeme for that gloss. For me, this is the word most likely to join the group of 19 above: it is also a body part and also starts with n. Navel has cognates in 10 modern I-E languages, including English, but excluding Russian and Marathi.

 

6. Give is the gloss for Spanish dar, which is ultimately from PIE *deh₃-. English give is not a cognate, but English donate is, with essentially the same (if more specialized) meaning. I can understand, though, that donate is less basic than give.

Smoke is the gloss for Spanish humo, ultimately from PIE *dʰu̯eh₂-. English fume is a cognate.

Snake is the gloss for Spanish serpiente, and it is easy to see that English serpent is a cognate.

In these 3 cases, English has a non-Germanic word that is cognate with the Hindi and Spanish words, but the most basic English word is not.

 

7. Of the final list of 20 (including navel), all have cognates in Latin, Sanskrit, and Pali, but 6 do not have a cognate in Ancient Greek (day, eye, nose, one, sew, tongue).

 

The above took a lot of time to compile. The results below are easy to read directly from the database.

 

8. Let's look at the most "universal and stable" I-E words:

a. Same cognates in all 13 clades of I-E (the database counts Nuristani as a 3rd clade within Indo-Iranian): two, three, name (in Baltic, though, only Old Prussian has the cognate; Latvian, Latgalian, and Lithuanian use another root)

b. Same cognates in 12 clades: four (different in Anatolian, the earliest clade), five (no Anatolian entry in the database, possibly not attested), nail, new (different in Albanian)

c. Same cognates in 9 clades: sun (different in Anatolian, Tocharian, Armenian, and Albanian)

 

9. So the final ranking:

Tied 1st: two, three - for each, all lexemes in the database are cognate (for two, from PIE *du̯o-, *du̯i-)

Tied 3rd: five - all lexemes, but no Anatolian entry; name - 152/161 lexemes from the same I-E root, across 13 clades

5th: four - 156/159 lexemes in the database, covering 12 clades (all except Anatolian)

6th: nail - 146/152

7th: new - 139/158

8th: sun
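
One way to make the ordering of the middle ranks explicit is to sort first by the number of clades covered and then by the share of lexemes that belong to the main cognate set. A minimal sketch, using only the counts quoted above for the four words that have them; the sort key is an assumption, chosen because it reproduces the order given here, and is not something the database itself provides.

```python
# (word, lexemes in the main cognate set, total lexemes, clades covered)
# Counts are the ones quoted in the ranking and in point 8 above.
ENTRIES = [
    ("new", 139, 158, 12),
    ("nail", 146, 152, 12),
    ("four", 156, 159, 12),
    ("name", 152, 161, 13),
]


def sort_key(entry):
    """Assumed rule: more clades first, then a higher share of cognate lexemes."""
    _word, in_set, total, clades = entry
    return (-clades, -(in_set / total))


ranked = sorted(ENTRIES, key=sort_key)

for word, in_set, total, clades in ranked:
    print(f"{word}: {clades} clades, {in_set / total:.1%} of lexemes")
```

Sorting by lexeme share alone would put four ahead of name, so the clade count has to come first for the order above to come out.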