Beata Megyesi Professor

About me

I am a Professor in Computational Linguistics. My main research areas are natural language processing and digital philology, and my work centers on cross-disciplinary research that facilitates quantitative, AI-based studies in the humanities and social sciences. Currently, I am working on historical cryptology, analyzing and breaking ciphers and codes.

Throughout the years, I have actively taken on a range of academic roles:

  • Chair of the Linguistics Review Panel at the Swedish Research Council 2024-2025 and Member 2021-2025;
  • Member of the board of the National Research School in Digital Philology (DigPhil), Sweden, 2023-;
  • Member of the board at the Center for Digital Humanities, Uppsala University, Sweden, 2020-2023;
  • President of the Northern European Association for Language Technology (NEALT), 2020-2021 and vice president 2018-2019;
  • Head of Department of Linguistics and Philology, 2009-2018;
  • Director of the English Park Campus, Uppsala University, 2017-2018.

For additional insights into my research and teaching endeavors, please refer to the details provided below.

I teach regularly at the undergraduate and advanced levels, primarily in computational linguistics. I am responsible for the international master's program in AI and Language. I am the main supervisor of two PhD students and co-supervisor of one PhD student.

Throughout the years, I have taught courses at three universities: the Dept. of Linguistics at Stockholm University (SU), the Dept. of Linguistics and Philology at Uppsala University (UU), and the Dept. of Speech, Music and Hearing at KTH. I have given various courses in computational linguistics (CL) and general linguistics (GL) from basic to advanced levels, as well as some PhD courses.

Basic level courses:

  • Corpus linguistics, 7.5 ECTS: 2023-2025 (SU)
  • BA thesis supervision, 15 ECTS: 2000- (SU, KTH, UU)
  • Languages, computers, and text processing, 7.5 ECTS: 2012-2020  (UU)
  • Introduction to Language Technology, 7.5 ECTS: 2015  (UU)
  • Advisor for Language Technology Project, 7.5 ECTS: 2011-2016  (UU)
  • Techniques for large scale parsing (parts): 2009  (UU)
  • Corpus linguistics, 7.5 ECTS: 2005, 2006, 2007 (UU)
  • Computational grammar II, 7.5 ECTS: 2004 (UU)

Advanced level courses:

  • Master's thesis course in AI and Language, 30 ECTS: 2026 (SU)
  • Project course in AI and Language, 15 ECTS: 2025 (SU)
  • The structure of language, 7.5 ECTS: 2024-2025 (SU)
  • Digital philology, 7.5 ECTS: 2018-2024 (UU)
  • Corpus-based methods, 7.5 ECTS: 2023 (SU)
  • Master's thesis supervision, 30 ECTS: 2004-2023 (UU)
  • Research and development, 15 ECTS: 2021 (UU)
  • Computer-based tools for research in humanities, 7.5 ECTS: 2007-2013 (UU)
  • Thesis work in language technology, 30 ECTS: 2005-2007 (UU)
  • Advanced course in corpus linguistics, 7.5 ECTS: 2005 (UU)
  • Advisor for Language Technology Project, 7.5 ECTS: 2011-2016 (UU)

PhD education:

  • (Subs.) Director of the National Graduate School of Early Languages (DigPhil) 2025-2026
  • I am the main supervisor of Micaella Bruton (SU) and Crina Tudor (SU), and co-supervisor of Oreen Yousuf (UU)
  • I was co-supervisor of Eva Petterson and Mojgan Seraji
  • Course in Digital Philology II, 7.5 ECTS (SU/UU)
  • Course in Digital Philology I, 7.5 ECTS: 2024 (UU)
  • Natural Language Processing, GSLT, 2008
  • Infrastructural tools for the study of linguistic variation: PhD course at Oslo University, June 2009

I have always been interested in how human language is processed by humans, and how it can be processed by machines. My research focuses on the automatic analysis of historical handwritten documents on the one hand, and large-scale text analysis for research within the humanities and social sciences on the other. I collaborate with researchers both nationally in Sweden and internationally in Germany, Hungary, Norway, Spain, and the USA. Over the past 10 years, my research has received external funding exceeding 10 million euros, and my scientific work has resulted in over 100 scientific articles published in international fora.

Some projects that I have led or contributed to:

  • DESCRYPT: Echoes of History: Analysis and Decipherment of Historical Writings: PI, Riksbankens Jubileumsfond, 2025-2032
  • DECRYPT: Decryption of Historical Manuscripts: PI, Swedish Research Council, 2018-2024 
  • DECODE: Automatic Decoding of Historical Manuscript: PI, Swedish Research Council, 2015-2017
  • HistoCrypt: A scientific forum for historical cryptology 2018-
  • HistCorp: A collection of historical texts for 17 European languages 2015-
  • SWEGRAM: Automatic Annotation and Analysis of Swedish texts, PI; part of the Swe-CLARIN project, Swedish Research Council, 2014-2024
  • SWeLL: Research Infrastructure for Swedish as a second language: co-applicant, RJ, 2017-2019
  • Multilingual Parallel Corpora, Swedish Research Council: member, 2006-2010
  • Methods and Tools for Automatic Grammar Extraction: Swedish Research Council: member, 2005-2007
  • An Infrastructure for Swedish language technology: member,  Swedish Research Council, 2007-2008

My work has also been featured in the media, for example:

  • Vetenskapsradion Historia (Radio Sweden, Science, History) 8/2 2025. Interview with me by Urban Björstadius.
  • The computational linguist who cracks historical riddles 2024. Article and film by Stockholm University

You can find details about my research in my publications. 

I have also served on numerous committees for doctoral theses and mid-term evaluations, regularly act as a reviewer for conferences and workshops, and have undertaken numerous expert assignments for appointments in both Sweden and abroad. Additionally, I have served as an assessor for projects funded by the Swedish Research Council and the Wallenberg Foundation.

Beáta Megyesi's publications per year (pdf, 324.7 kB) and per type (pdf, 288.1 kB).