Open access to research data and automatic pseudonymization: Two years with the Mormor Karl project
Welcome to a research seminar at the Department of Computer and Systems Sciences (DSV). We are happy to have Professor Elena Volodina as our guest, presenting her work on automatic pseudonymization.
Seminar
Date: Tuesday 3 June 2025
Time: 14.00 – 15.00
Location: Lilla Hörsalen, DSV, Borgarfjordsgatan 12, Kista
Elena Volodina is a professor at the Department of Swedish, multilingualism, and language technology, University of Gothenburg.
She says the following about her talk at DSV on June 3:
“
This talk will be devoted to the challenges of working with data that contains personal information. I will describe a set of experiments with automatic pseudonymization that we have performed within the Mormor Karl project. These include, among others, experiments with detection and labeling of personal categories using BERT models (Szawerna et al. 2024, 2025), attempts at using LLMs to “fill in the blanks” when substituting personal information with pseudonyms (as yet unpublished), and a study on whether pseudonyms can provoke biased automated classifications (Muñoz Sánchez et al. 2024).
The choice of models for our experiments is currently dictated by the sensitive nature of our data. To extend the choice from open source to proprietary models, we are currently collecting a “pseudo-corpus” with fictitious personal information that we will be able to share freely for future research.
You are welcome to contribute to the pseudo-corpus collection as well.
Finally, in this talk I will name several strategies to unify the research on automatic pseudonymization, and outline further challenges, needs for standardization and a proposal of a shared task.
”
References
Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez and Elena Volodina, 2025:
The Devil’s in the Details: the Detailedness of Classes Influences Personal Information Detection and Labeling
Maria Irena Szawerna, Simon Dobnik, Ricardo Muñoz Sánchez, Therese Lindström Tiedemann and Elena Volodina, 2024:
Detecting Personal Identifiable Information in Swedish Learner Essays
Ricardo Muñoz Sánchez, Simon Dobnik, Maria Irena Szawerna, Therese Lindström Tiedemann and Elena Volodina, 2024:
Did the Names I Used within My Essay Affect My Score? Diagnosing Name Biases in Automated Essay Scoring
More information about Elena Volodina
More information about the Mormor Karl project
This seminar is organised at DSV, Campus Kista. If you are a researcher at another Stockholm University department or centre and wish to participate, please contact Hercules Dalianis.
Last updated: 2025-05-09
Source: Department of Computer and Systems Sciences, DSV