PhD defence: Zed Lee


Date: Friday 24 November 2023

Time: 13.00 – 17.00

Location: Room L50, DSV, Borgarfjordsgatan 12, Kista

Welcome to a PhD defence at DSV! Zed Lee presents a thesis on how useful data can be extracted from large and complex data sources.

On 24 November 2023, Zed Lee presents the doctoral thesis at the Department of Computer and Systems Sciences (DSV) at Stockholm University. The title is “Z-Series: Mining and learning from complex sequential data”.

PhD student: Zed Lee, DSV
Opponent: Professor Toon Calders, University of Antwerp, Belgium
Main supervisor: Panagiotis Papapetrou, DSV
Supervisor: Tony Lindgren, DSV

The defence takes place at DSV's premises, starting at 13.00.

Find your way to DSV
Contact details for Zed Lee
The thesis can be downloaded from DiVA

Zed Lee nails the thesis at DSV, Stockholm University. Photo: Luis Quintero.
In keeping with tradition, Zed Lee nailed the thesis to the wall at DSV well ahead of the defence. Photo: Luis Quintero.

Abstract

The amount and complexity of sequential data collected across various domains have grown rapidly, posing significant challenges for extracting useful knowledge from such data sources. The complexity arises from diverse sequence representations with varying granularities, such as multivariate time series, histogram snapshots, and heterogeneous health records, which often describe a single data instance with multiple sequences. Due to this complexity, the underlying temporal relations between sequences may not be clear and can change over time, making knowledge discovery even more challenging.

To address these challenges, this thesis proposes event intervals as a unified representation for complex sequential data. Event intervals capture the underlying temporal relations between sequences by comparing the relative locations of event intervals in both the time and value dimensions, making them suitable for describing diverse sequential data. The proposed artifacts aim to efficiently and effectively discover patterns of interest, transform sequential data in different application domains through temporal abstraction, and provide interpretable features for machine learning tasks without compromising performance. The effectiveness of the proposed artifacts is evaluated through empirical experiments and practical evaluations, which demonstrate their applicability and performance.

The thesis is structured into three parts. First, it introduces state-of-the-art frameworks for mining event interval sequences, including frequent arrangement mining, classification, and clustering. The utility of these frameworks is demonstrated through comparative empirical evaluations against other frameworks. Second, the thesis applies temporal abstraction to complex sequential data in different application domains, showcasing its applicability through tasks such as disproportionality analysis and local grouping detection for time series. Lastly, event intervals are used as interpretable features for learning tasks, outperforming competitive algorithms using different feature representations. This part focuses on univariate and multivariate time series, and extensive experiments are performed on the publicly available benchmark datasets with statistical tests.
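
To make the event-interval idea in the abstract more concrete, the following is a minimal Python sketch and not the method from the thesis. It assumes a simple threshold-based temporal abstraction (each sample is labelled "high" or "low") and names the relation between two intervals in the style of Allen's interval algebra. The names `EventInterval`, `to_event_intervals`, and `temporal_relation`, as well as the sample values and thresholds, are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Sequence


class EventInterval(NamedTuple):
    label: str
    start: int  # index of the first sample in the interval (inclusive)
    end: int    # index one past the last sample (exclusive)


def to_event_intervals(
    values: Sequence[float], threshold: float, name: str
) -> List[EventInterval]:
    """Abstract a series into maximal runs of 'high' or 'low' samples.

    Assumption: a single fixed threshold; real abstractions may differ.
    """
    intervals: List[EventInterval] = []
    for i, v in enumerate(values):
        label = f"{name}:{'high' if v >= threshold else 'low'}"
        if intervals and intervals[-1].label == label:
            # Same label as the previous sample: extend the current run.
            intervals[-1] = intervals[-1]._replace(end=i + 1)
        else:
            # Label changed: open a new interval.
            intervals.append(EventInterval(label, i, i + 1))
    return intervals


def temporal_relation(a: EventInterval, b: EventInterval) -> str:
    """Name the relation between two intervals, earlier interval first."""
    # Order the pair by (start, end) so only seven cases remain.
    if (b.start, b.end) < (a.start, a.end):
        a, b = b, a
    if a.start == b.start:
        return "equals" if a.end == b.end else "starts"
    if a.end < b.start:
        return "before"
    if a.end == b.start:
        return "meets"
    if a.end < b.end:
        return "overlaps"
    if a.end == b.end:
        return "finished-by"
    return "contains"


if __name__ == "__main__":
    # Two made-up variables describing the same data instance.
    heart_rate = [60, 62, 95, 98, 97, 61]
    temperature = [36.5, 36.6, 36.7, 38.2, 38.4, 38.3]

    hr = to_event_intervals(heart_rate, threshold=90, name="hr")
    temp = to_event_intervals(temperature, threshold=38.0, name="temp")

    hr_high = next(iv for iv in hr if iv.label == "hr:high")
    temp_high = next(iv for iv in temp if iv.label == "temp:high")
    print(hr_high, temp_high, temporal_relation(hr_high, temp_high))
```

In this toy setting, both series end up in one shared representation (labelled intervals), and the relation between intervals from different variables becomes an explicit, readable feature. That is the general intuition behind using event intervals as a unified representation; the actual algorithms are described in the thesis itself.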