Predoc seminar: Xiu Li

Seminar

Date: Monday 29 April 2024

Time: 13.00 – 15.00

Venue: M10, DSV, Borgarfjordsgatan 12, Kista.

Welcome to a predoc seminar on adaptive learning systems and how intelligent textbooks can be created! Xiu Li, PhD student at DSV, is the respondent.

On 29 April 2024, PhD student Xiu Li presents ongoing thesis work titled "Exploring Natural Language Processing for Linking Digital Learning Materials: Towards Intelligent and Adaptive Learning Systems". The seminar takes place at the Department of Computer and Systems Sciences (DSV), Stockholm University.

Respondent: Xiu Li, DSV
Opponent: Asad Sayeed, University of Gothenburg
Main supervisor: Aron Henriksson, DSV
Supervisors: Martin Duneld and Jalal Nouri, DSV
Professor most closely concerned: Hercules Dalianis, DSV

Contact details for Xiu Li

How to find DSV


Abstract in English

The digital transformation of education has brought many benefits, but it has also introduced challenges in navigating the ever-expanding repository of diverse learning materials. The vast array of resources can overwhelm both educators and learners, making it difficult to identify and use the most relevant materials effectively.

In light of this, there is a critical demand for systems that can effectively connect these varied learning materials and thereby support teaching and learning. One promising way to achieve this is to leverage natural language processing techniques. This thesis investigates the use of natural language processing to develop educational content recommendation systems that automatically link learning resources such as textbooks, quizzes, and curriculum goals.

We propose two natural language processing approaches for connecting and recommending pedagogically relevant materials, focusing on solving the linking task in an unsupervised manner. One approach is ontology-based: we perform named entity recognition for concept extraction, aiming to use the extracted concepts to connect learning materials. The other approach uses language models, both semantic textual similarity-based models and prompt-based large language models. Furthermore, we explore combinations of these approaches through embedding ensembles, as well as retrieval-augmented generation, where large language models are given externally retrieved knowledge as context to enhance performance.
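
To make the semantic textual similarity idea concrete, here is a minimal sketch, assuming the sentence-transformers library; the model name and example texts are illustrative and not taken from the thesis. Candidate textbook passages are ranked against a curriculum goal by the cosine similarity of their embeddings.

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical inputs: one curriculum goal and a few candidate textbook passages.
curriculum_goal = "The student can solve simple linear equations with one unknown."
textbook_passages = [
    "This chapter shows how to solve equations of the form x + a = b.",
    "Photosynthesis converts carbon dioxide and water into glucose and oxygen.",
    "Quadratic equations can be solved using the quadratic formula.",
]

# Any sentence-embedding model could be plugged in; this choice is illustrative.
model = SentenceTransformer("all-MiniLM-L6-v2")
goal_embedding = model.encode(curriculum_goal, convert_to_tensor=True)
passage_embeddings = model.encode(textbook_passages, convert_to_tensor=True)

# Rank passages by cosine similarity to the goal; the top-ranked passage is the
# unsupervised recommendation for this goal.
scores = util.cos_sim(goal_embedding, passage_embeddings)[0]
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.2f}  {textbook_passages[idx]}")
```

One common reading of an embedding ensemble in this setting is to combine the scores or embeddings of several such models before ranking, though the exact combination used in the thesis is not specified here.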

The contributions of this work are manifold, presenting empirical evidence that supports the effectiveness of these natural language processing techniques in the Swedish educational context. The findings highlight the superiority of the unsupervised approaches, which achieve high performance on the content recommendation tasks.

The results show that contextual embeddings from large language models outperform traditional methods, that prompt-based large language models outperform semantic textual similarity-based models, and that retrieval-augmented generation can further improve the performance of large language models given the right scope of retrieved context. Moreover, the practical applicability of the proposed approaches to the automatic linking of digital learning materials paves the way for the development of intelligent and adaptive learning systems and the creation of intelligent textbooks.
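
As a rough illustration of how the scope of retrieved context enters a retrieval-augmented generation setup, the sketch below (a hypothetical example, not the thesis implementation) assembles a prompt from the k highest-ranked passages produced by a retriever such as the one sketched earlier; the function name, prompt wording, and the parameter k are all assumptions.

```python
def build_rag_prompt(quiz_question: str, ranked_passages: list[str], k: int = 3) -> str:
    """Assemble an LLM prompt grounded in the k highest-ranked retrieved passages."""
    context = "\n".join(f"- {passage}" for passage in ranked_passages[:k])
    return (
        "You are linking digital learning materials.\n"
        f"Context retrieved from the course material:\n{context}\n\n"
        f"Quiz question: {quiz_question}\n"
        "State which curriculum goal this question addresses, and explain briefly."
    )

# The resulting prompt would then be sent to a prompt-based large language model;
# k controls how much retrieved context the model sees.
prompt = build_rag_prompt(
    "Solve the equation x + 3 = 7.",
    ["This chapter shows how to solve equations of the form x + a = b.",
     "Photosynthesis converts carbon dioxide and water into glucose and oxygen."],
    k=1,
)
print(prompt)
```

Too small a k risks omitting the relevant passage, while too large a k dilutes the context, which is one way to read the observation above about the right scope of retrieved context.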