Project about Hindu Kush languages completed

The research project Language contact and relatedness in the Hindukush region was carried out from 2015 to 2020. The project systematically compared languages spoken in this distinctive and linguistically diverse region. One tangible outcome of the project is the online database Hindu Kush Areal Typology.

Colourful houses on a mountainside in Kargil, Ladakh, one of the data collection sites, in May 2018. Photo: Henrik Liljegren

The Hindu Kush (northeastern Afghanistan, northern Pakistan, and Indian Kashmir) is a distinctive region with large elevation differences and some of the world's highest mountain peaks. Its languages belong to six linguistic phyla (or families): Indo-Aryan, Iranian, Nuristani, Sino-Tibetan, Turkic, and the language isolate Burushaski.

Project about language contact and relatedness

The research project Language contact and relatedness in the Hindukush region systematically compared the languages spoken in the region, with the aim of finding out how similar or different they are in their structures (grammar, sound systems, etc.). The focus was on investigating whether there is evidence that the languages have gradually become more similar to one another through contact between geographically adjacent speaker groups, while at the same time becoming more different from their closest linguistic relatives outside the region. An alternative possibility, for example, is that the physical environment has favoured isolation and conservation, or even the development of unusual linguistic properties. Henrik Liljegren was the principal investigator for the project.

The following conclusions can be drawn from the completed project:

  1. There is a clear link between geography and language structure in the Hindu Kush that often cuts across family boundaries. Contact between adjacent communities has made their languages similar to each other. This is particularly clear at the local level. There is, for example, an area in the western Hindu Kush where many characteristics are shared across language boundaries. It clearly overlaps with an area that remained relatively isolated and whose population converted to Islam as recently as 150–200 years ago.
  2. The various language domains show partly different contact patterns. The features that especially characterize the languages of the Hindu Kush, regardless of relatedness, have mainly to do with phonology and lexical organization. In terms of word order and sentence structure, these languages often form part of larger areal constellations: in these respects they are similar to the languages of South Asia in general or to the languages of large parts of Eurasia.
  3. The Hindu Kush and the entire contiguous Himalayan highlands probably formed a multilingual reservoir in prehistoric times, with representatives of several now extinct language families, of which the language isolate Burushaski is the sole contemporary remnant. This diversity has gradually diminished, first through Indo-European expansion, beginning about 4,000 years ago, and then through long periods of cultural and political influence from the surrounding lowland cultures.

In addition to direct research results, the interaction with native speakers, several of whom are language activists, has encouraged and contributed to the documentation of low-resource and endangered languages in the region.

Henrik Liljegren in Afghanistan, with a mountain landscape in the background. Photo: Sani Marzban

More about the project

Language contact and relatedness in the Hindukush region, with Henrik Liljegren as its principal investigator, was carried out from 2015 to 2020. The project is now completed, and a final report has been submitted to the project funder Vetenskapsrådet, the Swedish Research Council (421-2014-631).

How the study was conducted

A total of 79 speakers of 59 languages were recruited to participate in the study. In collaboration with three institutions in the region, interactive workshops lasting four to five days were arranged, with speakers of 5–10 languages at a time. Audio and video recordings were made of wordlists, one longer questionnaire, a text translated from a major language (Urdu, Dari, Pashto), and a couple of experimental/interactive elicitation sessions. The material was transcribed and processed in order to categorize and analyse the languages on the basis of 80 structural properties within five domains: phonology (sound systems), lexical organization, word order, grammatical categories and sentence structure. A comparative basic wordlist was also established for the purpose of confirming or revising previously proposed classifications.
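To illustrate what comparing languages on a set of structural properties can look like in practice, here is a minimal sketch in Python. It is not the project's own method, which is not described here. It assumes each property is coded as present (1), absent (0) or unknown (None), and it measures the share of jointly coded properties on which two languages differ. The language labels, the feature values and the function name `structural_distance` are invented placeholders, not project data.

```python
from itertools import combinations

# Hypothetical codings: 1 = property present, 0 = absent, None = no data.
# Labels and values are invented for illustration only.
codings = {
    "lang_a": [1, 0, 1, 1, None],
    "lang_b": [1, 0, 0, 1, 1],
    "lang_c": [0, 1, 0, None, 1],
}


def structural_distance(a, b):
    """Share of properties coded for both languages on which they differ."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None  # no overlapping data, so no distance can be computed
    return sum(x != y for x, y in shared) / len(shared)


# Distance for every pair of languages, keyed by the sorted pair of labels.
distances = {
    (l1, l2): structural_distance(codings[l1], codings[l2])
    for l1, l2 in combinations(sorted(codings), 2)
}

for pair, distance in distances.items():
    print(pair, distance)
```

A value of 0 means that two languages agree on every property coded for both, and 1 means that they disagree on all of them. Plotting such distances against geographic distance or family membership is one way to ask whether neighbours resemble each other more than relatives do.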

Online database 

One tangible outcome of the project is the online database Hindu Kush Areal Typology. It has been established to make processed project data and analysis available in the form of wordlists with linked audio files and descriptions of 80 structural linguistic features, with their distributions displayed in tables and interactive maps. The design, which allows for regular instalments in the future, is a collaborative effort carried out with the Max Planck Institute for the Science of Human History in Jena, within the Cross-Linguistic Linked Data (CLLD) framework.

The database is now openly available:

Hindu Kush Areal Typology

Map of the Hindu Kush-Karakorum, target area of the project.