Stockholm University

Henrik Liljegren, Professor

About me

As a linguist, I am particularly interested in the languages of the Hindu Kush-Karakorum region, i.e. the mountainous area covering northern Pakistan, northeastern Afghanistan and the disputed territory of Kashmir. Many of these languages are scantily described, particularly vulnerable and under-resourced. I lived in northern Pakistan for a ten-year period, and both during that time and since I have carried out fieldwork on individual languages as well as areal-typological research, in close collaboration with native speakers of a large number of Hindu Kush languages.

In addition to research, I am engaged in language revitalization (orthography and local literature development, mother-tongue education, etc.), in mentoring and advising language activists as they document their own languages, and in building networks between local communities and organizations.



My teaching at Stockholm University is mainly in general linguistics and language documentation. I am also involved in thesis and doctoral supervision.


My current research focus is documenting and describing Gawarbati, one of many scantily documented and under-resourced languages of the Hindu Kush region in the high-altitude area of Asia. The aim of the project is to produce a lasting and representative collection of data in the form of annotated audio and video recordings and a lexical database. The work is carried out in close collaboration with the local language community and with a regional language resource centre.

In a recently concluded research project (2015-2020), a linguistic profile of the Hindu Kush-Karakorum region was produced, based on primary data collected from 59 language varieties within the project. One concrete outcome of the project is the online database Hindu Kush Areal Typology.



A selection from Stockholm University's publication database

  • Language trees with sampled ancestors support a hybrid model for the origin of Indo-European languages

    2023. Paul Heggarty (et al.). Science 381 (6656)


    Languages of the Indo-European family are spoken by almost half of the world’s population, but their origins and patterns of spread are disputed. Heggarty et al. present a database of 109 modern and 52 time-calibrated historical Indo-European languages, which they analyzed with models of Bayesian phylogenetic inference. Their results suggest an emergence of Indo-European languages around 8000 years before present. This is a deeper root date than previously thought, and it fits with an initial origin south of the Caucasus followed by a branch northward into the Steppe region. These findings lead to a “hybrid hypothesis” that reconciles current linguistic and ancient DNA evidence from both the eastern Fertile Crescent (as a primary source) and the steppe (as a secondary homeland).

  • The Languages of Peristan through the Lens of Areal Typology

    2023. Henrik Liljegren. Roots of Peristan, 391-431


    In this study, the mountainous region of High Asia referred to as Peristan is outlined and discussed from an areal-linguistic perspective, based on first-hand data from more than fifty separate language varieties. While a comparison of the basic lexicon largely confirms established phylogenetic classification, many structural properties tend to cluster together geographically and often display convergence across phylogenetic boundaries. However, the analysis does not lend support to a simplistic description of the region as a single linguistic area with clear boundaries. Instead, it suggests that a set of languages, situated at the inner parts of Peristan, forms a hard core displaying a significant degree of structural similarity, with a gradual decrease in the number of shared properties towards its fluid outer boundaries. Another significant finding is the identification of three distinct micro-areas within Peristan. These convergence areas map convincingly—but not perfectly—with the Peristan geo-cultural regions suggested in earlier studies, the latter based on (mainly pre-Islamic) shared cultural, social, political and religious identities. The areal-linguistic patterns emerging are likely to be of considerable time-depth and appear to be the result of long-standing cross-community interaction on a sub-regional level.

  • Nuristani in its areal and typological context

    2022. Henrik Liljegren. International Journal of Diachronic Linguistics and Linguistic Reconstruction 19, 201-265


    This study presents and details Nuristani, a phylogenetically distinct group of languages spoken in a remote area of northeastern Afghanistan. Largely based on a recently collected data set from six Nuristani varieties, a large number of structural properties, representing several linguistic domains (phonology, grammar and lexico-semantics), were analysed and systematically compared with world-wide typologies as well as with a tight and representative 53-language sample from the surrounding Hindu Kush region. Nuristani emerges as an integral part of region-wide areal patterns, shared to a varying extent with languages belonging to six distinct phylogenies. In a majority of structural domains, Nuristani clusters with a Hindu Kush core, including many Indo-Aryan languages, some Iranian, Tibeto-Burman and the isolate Burushaski. While Nuristani generally comes out as internally homogeneous, one of the languages, Prasun, deviates from that pattern; in certain respects, particularly morpho-syntactically, it clusters more closely with languages other than its closest Nuristani kin, possibly as the result of substratal influence. Only a small number of structural properties can be termed typically Nuristani: the presence of a retroflex approximant, linguistic coding of complex spatial distinctions, kinship suffixes, and a set of fine-tuned discourse markers. Nuristani appears to be the source of subareal patterns detectable also in some neighbouring non-Nuristani communities, most likely related to a shared pre-Muslim context.

  • Kinship terminologies reveal ancient contact zone in the Hindu Kush

    2022. Henrik Liljegren. Linguistic typology 26 (2), 211-245


    The Hindu Kush, or the mountain region of northern Pakistan, north-eastern Afghanistan and the northern-most part of the Indian-administered Kashmir region, is home to approximately 50 languages belonging to six different genera: Indo-Aryan, Iranian, Nuristani, Sino-Tibetan, Turkic and the isolate Burushaski. Areality research on this region is only in its early stages, and while its significance as a convergence area has been suggested by several scholars, only a few, primarily phonological and grammatical, features have been studied in a more systematic fashion. Cross-linguistic research in the realms of semantics and lexical organization has been given considerably less attention. However, preliminary findings indicate that features are geographically bundled with one another, across genera, in significant ways, displaying semantic areality on multiple levels throughout the region or in one or more of its sub-regions. The present study is an areal-typological investigation of kinship terms in the region, in which particular attention is paid to a few notable polysemy patterns and what appears to be a significant geographical clustering of these. Comparisons are made between the geographical distribution of such patterns and those of some other linguistic features as well as with relevant non-linguistic factors related to shared cultural values or identities and a long history of small-scale cross-community interaction in different parts of the region.

  • The Hindu Kush–Karakorum and linguistic areality

    2020. Henrik Liljegren. Journal of South Asian languages and linguistics 7 (2), 187-233


    The high-altitude Hindu Kush–Karakoram region is home to more than 50 language communities, belonging to six phylogenies. The significance of this region as a linguistic area has been discussed in the past, but the tendency has been to focus on individual features and phenomena, and more seldom have there been attempts at applying a higher degree of feature aggregation with tight sampling. In the present study, comparable first-hand data from as many as 59 Hindu Kush–Karakoram language varieties was collected and analyzed. The data allowed for setting up a basic word list as well as for classifying each variety according to 80 binary structural features (phonology, lexico-semantics, grammatical categories, clause structure and word order properties). While a comparison of the basic lexicon across the varieties lines up very closely with the established phylogenetic classification, structural similarity clustering gives results clearly related to geographical proximity within the region and often cuts across phylogenetic boundaries. The strongest evidence of areality tied to the region itself (vis-à-vis South Asia in general on the one hand and Central/West Asia on the other) relates to phonology and lexical structure, whereas morphosyntactic properties mostly place the region’s languages within a larger areal or macro-areal distribution. The overall structural analysis also lends itself to recognizing six distinct micro-areas within the region, lining up with geo-cultural regions identified in previous ethno-historical studies. The present study interprets the domain-specific distributions as layers of areality that are each linked to a distinct historical period, and that taken together paint a picture of a region developing from high phylogenetic diversity, through massive Indo-Aryan penetration and language shifts, to today’s dramatically shrinking diversity and structural stream-lining propelled by the dominance of a few lingua francas.

  • Emerging epistemic marking in Indo-Aryan Palula

    2020. Henrik Liljegren. Evidentiality, egophoricity and engagement, 141-163


    While evidentiality is neither systematically nor obligatorily signaled in Indo-Aryan Palula [phl; phal1254] (Pakistan), it can be observed in so-called scattered coding. It is most obviously reflected in three sub-systems of the language: a) as a secondary effect of tense-aspect differentiation, most clearly seen in the use of the perfect for indirect evidence vis-à-vis the use of the simple past for direct evidence; b) by a set of utterance-final mood markers, involving an emerging three-way paradigmatic contrast: thaní as quotative, maní as hearsay and ɡa as inferred knowledge; and c) by (at least) one member of a set of second-position discourse particles, xu, marking surprise. Although evidentiality contrasts akin to the perfect vs. simple past were indeed part of the ancestral Indo-Aryan tense system, there are plenty of parallels in adjacent languages to the epistemic contrasts noted for Palula, suggesting that more recent language contact must have contributed to, or largely facilitated, the emergence of epistemic marking in the language.

  • Gender typology and gender (in)stability in Hindu Kush Indo-Aryan languages

    2019. Henrik Liljegren. Grammatical gender and linguistic complexity, 279-328


    This paper investigates the phenomenon of gender as it appears in 25 Indo-Aryan languages (sometimes referred to as “Dardic”) spoken in the Hindu Kush-Karakorum region – the mountainous areas of northeastern Afghanistan, northern Pakistan and the disputed territory of Kashmir. Looking at each language in terms of the number of genders present, to what extent these are sex-based or non-sex-based, how gender relates to declensional differences, and what systems of assignment are applied, we arrive at a micro-typology of gender in Hindu Kush Indo-Aryan, including a characterization of these systems in terms of their general complexity. Considering the relatively close genealogical ties, the languages display a number of unexpected and significant differences. While the inherited sex-based gender system is clearly preserved in most of the languages, and perhaps even strengthened in some, it is curiously missing altogether in others (such as in Kalasha and Khowar) or seems to be subject to considerable erosion (e.g. in Dameli). That the languages of the latter kind are all found at the northwestern outskirts of the Indo-Aryan world suggests non-trivial interaction with neighbouring languages without gender or with markedly different assignment systems. In terms of complexity, the southwestern-most corner of the region stands out; here we find a few languages (primarily belonging to the Pashai group) that combine inherited sex-based gender differentiation with animacy-related distinctions resulting in highly complex agreement patterns. The findings are discussed in the light of earlier observations of linguistic areality or substratal influence in the region, involving Indo-Aryan, Iranian, Nuristani, Tibeto-Burman, Turkic languages and Burushaski. The present study draws from the analysis of earlier publications as well as from entirely novel field data.

  • Supporting and sustaining language vitality in northern Pakistan

    2018. Henrik Liljegren. The Routledge Handbook of Language Revitalization, 427-437


    Northern Pakistan is linguistically and culturally very diverse. Nearly 30 languages—representing a wide span, numerically and vitality-wise—are spoken in this mountainous region, sharing ties with adjacent areas of neighboring countries. Although most of these languages have received little outside recognition, there have been few restrictions for those wanting to promote their languages. Therefore, a number of sustaining efforts have been made in recent years, exemplified throughout the chapter: collaborative fieldwork, the formation of language organizations, training in documentation, the development of orthographies, publications, the introduction of mother-tongue schools, and lobbying for the region’s languages. Evaluating some of those activities and their effectiveness in terms of language maintenance and revitalization, some key factors stand out: community ownership, institutional support, pooling of resources, and multi-community collaboration. The observations and subsequent analysis are informed by the author’s own long-term involvement in the development of the Forum for Language Initiatives.

  • Geomorphic coding in Palula and Kalasha

    2018. Jan Heegård, Henrik Liljegren. Acta Linguistica Hafniensia. International Journal of Structural Linguistics 50 (2), 129-160


    The article describes the geomorphic systems of spatial reference in the two Indo-Aryan languages Palula and Kalasha, spoken in adjacent areas of an alpine region in Northwestern Pakistan. Palula and Kalasha encode the inclination of the mountain slope as well as the flow of the river, in systematic and similar ways, and by use of distinct sets of nominal lexemes that may function adverbially. In their verbal systems, only Palula encodes landscape features in a systematic way, but both languages make use of a number of verbal sets that in different ways emphasise boundary-crossing. The article relates the analysis to Palmer's Topographic Correspondence Hypothesis that predicts that the linguistic system of spatial reference will reflect the topography of the surrounding landscape. The analysis of the geomorphic systems in Palula and Kalasha supports this hypothesis. However, data from a survey of spatial strategies in neighbouring languages, i.e., languages spoken in a similar alpine landscape, reveal another system that does not to the same extent or in a similar way encode typical landscape features such as the mountain slope and the flow of the river. This calls for a revision of Palmer's hypothesis that also takes language contact into consideration.

  • Bisyndetic Contrast Marking in the Hindukush

    2017. Henrik Liljegren, Erik Svärd. Journal of Language Contact 10 (3), 450-484


    A contrastive (or antithetical) construction which makes simultaneous use of two separate particles is identified through a mainly corpus-based study as a typical feature of a number of lesser-described languages spoken in the Afghanistan-Pakistan borderland in the high Hindukush. The feature encompasses Nuristani languages (Waigali, Kati) as well as the Indo-Aryan languages found in their close vicinity (Palula, Kalasha, Dameli, Gawri), while it is not shared by more closely related Indo-Aryan languages spoken outside of this geographically delimited area. Due to a striking (although not complete) overlap with at least two other (unrelated) structural features, pronominal kinship suffixes and retroflex vowels, we suggest that a linguistic and cultural diffusion zone of considerable age is centred in the mountainous Nuristan-Kunar-Panjkora area.

  • Khowar

    2017. Henrik Liljegren, Afsar Ali Khan. Journal of the International Phonetic Association 47 (2), 219-229


    Khowar (ISO 639-3: khw) is an Indo-Aryan language spoken by 200,000–300,000 (Decker 1992: 31–32; Bashir 2003: 843) people in Pakistan's Khyber Pakhtunkhwa Province (formerly North-West Frontier Province). The majority of the speakers are found in Chitral (a district and erstwhile princely state bordering Afghanistan, see Figure 1), where the language is used as a lingua franca, but there are also important pockets of speaker groups in adjacent areas of Gilgit-Baltistan and Swat District as well as a considerable number of recent migrants to larger cities such as Peshawar and Rawalpindi (Decker 1992: 25–26). Its closest linguistic relative is Kalasha, a much smaller language spoken in a few villages in southern Chitral (Morgenstierne 1961: 138; Strand 1973: 302, 2001: 252). While Khowar has preserved a number of features (phonological, morphological as well as lexical) now lost in other Indo-Aryan languages of the surrounding Hindukush-Karakoram mountain region, it has, over time, incorporated a massive amount of lexical material from neighbouring or influential Iranian languages (Morgenstierne 1936) – and with it, new phonological distinctions. Certain features might also be attributable to formerly dominant languages (e.g. Turkic), or to linguistic substrates, either in the form of, or related to, the language isolate Burushaski, or other, now extinct, languages previously spoken in the area (Morgenstierne 1932: 48, 1947: 6; Bashir 2007: 208–214). There is relatively little dialectal variation among the speakers in Chitral itself, probably attributable to the relative recency of the present expansion of the language (Morgenstierne 1932: 50).

  • Profiling Indo-Aryan in the Hindukush-Karakoram: A preliminary study of micro-typological patterns

    2017. Henrik Liljegren. Journal of South Asian languages and linguistics 4 (1), 107-156


    The study is a typological profile of 31 Indo-Aryan (IA) languages in the Hindukush-Karakoram-Western Himalayan region (covering NE Afghanistan, N Pakistan, and parts of Kashmir). Native speakers were recruited to provide comparative data. This data, supplemented by reputable descriptions or field notes, was evaluated against a number of WALS- or WALS-like features, enabling a fine-tuned characterization of each language, taking different linguistic domains into account (phonology, morphology, syntax, lexicon). The emerging patterns were compared with global distributions as well as with characteristic IA features and well-known areal patterns. Some features, mainly syntactic, turned out to be shared with IA in general, whereas others do have scattered reflexes in IA outside of the region but are especially prevalent in the region: large consonant inventories, tripartite pronominal case alignment, a high frequency of left-branching constructions, and multi-degree deictic systems. Yet other features display a high degree of diversity, often bundling subareally. Finally, there was a significant clustering of features that are not characterizing IA in general: tripartite affricate differentiation, retroflexion across several subsets, aspiration contrasts involving voiceless consonants only, tonal contrasts and 20-based numerals. This clustering forms a “hard core” at the centre of the region, gradually fading out toward its peripheries.

  • A grammar of Palula

    2016. Henrik Liljegren.


    This grammar provides a grammatical description of Palula, an Indo-Aryan language of the Shina group. The language is spoken by about 10,000 people in the Chitral district in Pakistan’s Khyber Pakhtunkhwa Province. This is the first extensive description of the formerly little-documented Palula language, and is one of only a few in-depth studies available for languages in the extremely multilingual Hindukush-Karakoram region. The grammar is based on original fieldwork data, collected over the course of about ten years, commencing in 1998. It is primarily in the form of recorded, mainly narrative, texts, but supplemented by targeted elicitation as well as notes of observed language use. All fieldwork was conducted in close collaboration with the Palula-speaking community, and a number of native speakers took active part in the process of data gathering, annotation and data management. The main areas covered are phonology, morphology and syntax, illustrated with a large number of example items and utterances, but also a few selected lexical topics of some prominence have received a more detailed treatment as part of the morphosyntactic structure. Suggestions for further research that should be undertaken are given throughout the grammar. The approach is theory-informed rather than theory-driven, but an underlying functional-typological framework is assumed. Diachronic development is taken into account, particularly in the area of morphology, and comparisons with other languages and references to areal phenomena are included insofar as they are motivated and available. The description also provides a brief introduction to the speaker community and their immediate environment.

