Selected publications

Selected books, scientific articles, conference papers, and other publications from the Natural Language Processing Research Group, 2009–2025.

Dinh, T. and H. Dalianis. 2025. Evaluating Privacy and Utility in Synthetic EHR Data Generation for Adverse Drug Event Detection. In Proceedings of the European Federation Medical Informatics, Special Topic Conference, EFMI STC 2025, Osnabrück, 20–22 October 2025.

Kopacheva, E., Henriksson, A., Dalianis, H., Hammar, T. and A. Lincke. 2025. Identifying Adverse Drug Events in Clinical Text Using Fine-Tuned Clinical Language Models: Machine Learning Study. JMIR Formative Research, 9(1), e71949.

Kiefer, L., Vakili, T., Alabi, O. A., Dalianis, H. and D. Klakow. 2025. Instruction-Tuning LLaMA for Synthetic Medical Note Generation in Swedish and English. In Proceedings of RANLP, International Conference “Recent Advances in NLP”, Varna, Bulgaria, 8–10 September 2025.

Ngo, P., Tejedor Hernández, M., Chomutare, T., Budrionis, A., Olsen Svenning, T., Torsvik, T., Lamproudis, A. and H. Dalianis. 2025. Domain-Specific Pretraining and Evaluation of NorDeClin-BERT for ICD-10 Code Prediction in Norwegian Clinical Texts. JMIR AI, Journal of Medical Internet Research AI.

Chomutare, T., Babic, A., Peltonen, L. M., Elunurm, S., Lundberg, P., Jönsson, A., Eneling, E., Gerstenberger, C-V., Siggaard, T., Kolde, R., Jerdhaf, O., Hansson, M., Makhlysheva, A., Muzny, M., Ylipää, E., Brunak, S. and H. Dalianis. 2025. Implementing a Nordic-Baltic Federated Health Data Network: a case report. In Proceedings of the 20th World Congress on Medical and Health Informatics, MedInfo2025. Taipei, Taiwan, 9–13 August 2025.

Randl, K., Pavlopoulos, J., Henriksson, A., Lindgren, T. and J. Bakagianni. 2025. SemEval-2025 Task 9: The Food Hazard Detection Challenge. In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval 2025).

van der Werff, S.D., van Rooden, S.M., Henriksson, A., Behnke, M., Aghdassi, S.J.S., van Mourik, M.S.M. and P. Naucler. 2025. The future of healthcare-associated infection surveillance: Automated surveillance and using the potential of artificial intelligence. Journal of Internal Medicine, 298, pp. 54–77.

Randl, K., Pavlopoulos, J., Henriksson, A. and T. Lindgren. 2025. Mind the Gap: From Plausible to Valid Self-Explanations in Large Language Models. Machine Learning, 114:120.

Vakili, T., Henriksson, A. and H. Dalianis. 2025. Data-Constrained Synthesis of Training Data for De-Identification. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025). Vienna, July 27–August 1, 2025.

Chomutare, T., Olsen Svenning, T., Hernández, M. Á. T., Ngo, P. D., Budrionis, A., Markljung, K., Hind, L.I., Torsvik, T., Mikalsen, K.Ø., Babic, A. and H. Dalianis. 2025. Artificial intelligence to improve clinical coding practice in Scandinavia: a crossover randomized controlled trial. Journal of Medical Internet Research, 27, e71904.

Vakili, T., Hansson, M. and A. Henriksson. 2025. SweClinEval: A Benchmark for Swedish Clinical Natural Language Processing. In Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025), pp. 767-775.

Dürlich, L., Bergman, E., Larsson, M., Dalianis, H., Doyle, S., Westman, G. and J. Nivre. 2025. Explainability for NLP in Pharmacovigilance: A Study on Adverse Event Report Triage in Swedish. In Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health), pp. 46-68.

Vakili, T., Henriksson, A. and H. Dalianis. 2024. End-to-end pseudonymization of fine-tuned clinical BERT models – Privacy preservation with maintained data utility. Special Issue on Health information privacy and security, BMC Medical Informatics and Decision Making, Volume 24, Number 1, pp. 1-15.

Lundmark, L., Kaati, L. and A. Shrestha. 2024. Visions of Violence: Threatful Communication in Incel Communities. IEEE International Conference on Big Data (BigData), pp. 2772-2778.

Berggren, M., Kaati, L., Pelzer, B., Stiff, H., Lundmark, L. and N. Akrami. 2024. The generalizability of machine learning models of personality across two text domains. Personality and Individual Differences 217, 112465.

Dunstan, J., Vakili, T., Miranda, L., Villena, F., Aracena, C., Quiroga, T., Vera, P., Viteri Valenzuela, S. and V. Rocco. 2024. A Pseudonymized Corpus of Occupational Health Narratives for Clinical Entity Recognition in Spanish. BMC Medical Informatics and Decision Making, special issue on Health information privacy and security.

Aracena, C., Miranda, L., Vakili, T., Villena, F., Quiroga, T., Núñez-Torres, F., Rocco, V. and J. Dunstan. 2024. A Privacy-Preserving Corpus for Occupational Health in Spanish: Evaluation for NER and Classification Tasks. In Proceedings of the 6th Clinical Natural Language Processing Workshop @ NAACL 2024.

Vakili, T., Hullmann, T., Henriksson, A. and H. Dalianis. 2024. When Is a Name Sensitive? Eponyms in Clinical Text and Implications for De-Identification. In Proceedings of the CALD-pseudo Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024, Malta.

Ngo, P., Tejedor, M., Olsen Svenning, T., Chomutare, T., Budrionis, A. and H. Dalianis. 2024. Deidentifying a Norwegian clinical corpus – An effort to create a privacy-preserving Norwegian large clinical language model. In Proceedings of the CALD-pseudo Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024, Malta.

Sneiders, E. and A. Henriksson. 2024. Text Retrieval in Restricted Domains by Pairwise Term Co-occurrence. Complex Systems Informatics and Modeling Quarterly, (41), 80-111.

Randl, K., Pavlopoulos, J., Henriksson, A. and T. Lindgren. 2024. Evaluating the Reliability of Self-Explanations in Large Language Models. In International Conference on Discovery Science (pp. 36-51). Cham: Springer Nature Switzerland.

Wu, Y. and A. Henriksson. 2024. Selecting from Multiple Strategies Improves the Foreseeable Reasoning of Tool-Augmented Large Language Models. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 197-212). Cham: Springer Nature Switzerland.

Randl, K., Pavlopoulos, I., Henriksson, A. and T. Lindgren. 2024. CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification. In Findings of the Association for Computational Linguistics ACL 2024, pp. 7695–7715.

Li, X., Henriksson, A., Duneld, M., Nouri, J. and Y. Wu. 2024. Supporting Teaching-to-the-Curriculum by Linking Diagnostic Tests to Curriculum Goals: Using Textbook Content as Context for Retrieval-Augmented Generation with Large Language Models. In Proceedings of the International Conference on AI in Education, pp. 118-132.

Berg, N. and H. Dalianis. 2024. Using BART to Automatically Generate Discharge Summaries from Swedish Clinical Text. In Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @LREC-COLING 2024, pp. 246-252.

Kaati, L., Shrestha, A. and N. Akrami. 2023. Harmful Communication. Detection of Toxic Language and Threats on Swedish. In Proceedings of the 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '23).

Lamproudis, A., Olsen Svenning, T., Torsvik, T., Chomutare, T., Budrionis, A., Dinh Ngo, P., Vakili, T. and H. Dalianis. 2023. Using a Large Open Clinical Corpus for Improved ICD-10 Diagnosis Coding. Proceedings of AMIA 2023, Annual Symposium, November 11-15. New Orleans, LA, USA.

Henriksson, A., Pawar, Y., Hedberg, P. and P. Nauclér. 2023. Multimodal fine-tuning of clinical language models for predicting COVID-19 outcomes. In Artificial Intelligence in Medicine, 146.

Wu, Y., Henriksson, A., Duneld, M. and J. Nouri. 2023. Towards Improving the Reliability and Transparency of ChatGPT for Educational Question Answering. In Proceedings of the Eighteenth European Conference on Technology Enhanced Learning (ECTEL).

Vakili, T. and H. Dalianis. 2023. Using Membership Inference Attacks to Evaluate Privacy-Preserving Language Modeling Fails for Pseudonymizing Data. Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa 2023).

Foteini, V. and E. Sneiders. 2022. Communication between Citizens and Public Organizations as a means of Public Value Co-creation. In: DG.O 2022: The 23rd Annual International Conference on Digital Government, Republic of Korea, pp. 1-12. ACM, ISBN: 978-1-4503-9749-0

Kaati, L., Shrestha, A. and N. Akrami. 2022. Predicting Targeted Violence from Social Media Communication. In Proceedings of the 2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '22).

Lamproudis, A., Henriksson, A. and H. Dalianis. 2022. Evaluating Pretraining Strategies for Clinical BERT Models. In Proceedings of the 13th International Conference on Language Resources and Evaluation, LREC 2022, pp. 410–416.

Vakili, T., Lamproudis, A., Henriksson, A. and H. Dalianis. 2022. Downstream Task Performance of BERT Models Pre-Trained Using Automatically De-Identified Clinical Data. In Proceedings of the 13th International Conference on Language Resources and Evaluation, LREC 2022, pp. 4245–4252.

Wu, Y., Henriksson, A., Nouri, J., Duneld, M. and X. Li. 2022. Retrieving Key Topical Sentences With Topic-aware BERT when Conducting Automated Essay Scoring. In Proceedings of the 12th International Conference on Methodologies and Intelligent Systems for Technology Enhanced Learning (MIS4TEL).

Vakili, T. and H. Dalianis. 2021. Are Clinical BERT Models Privacy Preserving? The Difficulty of Extracting Patient-Condition Associations. In Proceedings of the Association for the Advancement of Artificial Intelligence AAAI Fall 2021 Symposium in HUman partnership with Medical Artificial iNtelligence (HUMAN.AI), November 4-6, 2021.

Remmer, S., Lamproudis, A. and H. Dalianis. 2021. Multi-label Diagnosis Classification of Swedish Discharge Summaries – ICD-10 Code Assignment Using KB-BERT. In Proceedings of RANLP 2021: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria.

Shrestha, A., Akrami, N., Kaati, L., Kupper, J. and M. R. Schumacher. 2021. Words of Suicide: Identifying Suicidal Risk in Written Communications. In 2021 IEEE International Conference on Big Data (Big Data) (pp. 2144-2150). IEEE.

Lamproudis, A., Henriksson, A. and H. Dalianis. 2021. Developing a Clinical Language Model for Swedish: Continued Pretraining of Generic BERT with In-Domain Data. In Proceedings of RANLP 2021: Recent Advances in Natural Language Processing, 1-3 Sept 2021, Varna, Bulgaria.

Berg, H., A. Henriksson and H. Dalianis. 2020. The Impact of De-identification on Downstream Named Entity Recognition in Clinical Text. In Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis, Louhi 2020, in conjunction with EMNLP 2020, (pp. 1-11).

Dalianis, H. 2019. Pseudonymisation of Swedish Electronic Patient Records Using a Rule-based Approach. Proceedings of the Workshop on NLP and Pseudonymisation, in conjunction with the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), Turku, Finland, September 30, 2019.

Dalianis, H. 2018. Clinical Text Mining: Secondary Use of Electronic Patient Records. 181 pages, Springer, Open Access.

Sneiders, E. 2016. Text retrieval by term co-occurrences in a query-based vector space. In Proceedings of COLING 2016: the 26th International Conference on Computational Linguistics.

Henriksson, A., Kvist, M., Dalianis, H. and M. Duneld. 2015. Identifying adverse drug event information in clinical notes with distributional semantic representations of context. Journal of Biomedical Informatics, 57: 333-349.

Dalianis, H., Henriksson, A., Kvist, M., Velupillai, S. and R. Weegar. 2015. HEALTH BANK – A Workbench for Data Science Applications in Healthcare. In Proceedings of CAiSE’15 – Industry Track, Stockholm, Sweden.

Skeppstedt, M., M. Kvist, H. Dalianis and G.H. Nilsson. 2014. Automatic recognition of disorders, findings, pharmaceuticals and body structures from clinical text: An annotation and machine learning study. Journal of Biomedical Informatics, Vol 49, pp. 148-158.

Velupillai, S., H. Dalianis, M. Hassel and G. H. Nilsson. 2009. Developing a standard for de-identifying electronic patient records written in Swedish: precision, recall and F-measure in a manual and computerized annotation trial. International Journal of Medical Informatics, Volume 78, Issue 12.

Last updated: 2025-11-21

Page owner: Department of Computer and Systems Sciences, DSV