Stockholm University

Isak Samsten
Senior Lecturer

About me

I am a senior lecturer in data science and machine learning, holding a bachelor's degree (2010), a master's degree (2012), and a PhD (2017) in computer science. My main research area is explainable and interpretable machine learning (explainable AI), with a particular focus on temporal data, such as time series. Within this area, I study methods and systems that can answer questions such as "Why was this prediction made?" (explanations) and "What changes are required to obtain a different outcome?" (counterfactual explanations).

Teaching

I currently teach the following courses:

  • Research topics in Data science (RTDS)
  • Programming 2 (PROG2)
  • Parallel and distributed programming (PARADIS)
  • Data mining (DATM)

I also supervise bachelor's and master's theses.

Research projects

Publications

A selection from the Stockholm University publication database

  • COMET: Constrained Counterfactual Explanations for Patient Glucose Multivariate Forecasting

    2024. Zhendong Wang (et al.). Annual IEEE Symposium on Computer-Based Medical Systems, 502-507

    Conference

    Deep learning models have been widely adopted for healthcare-related forecasting applications, such as leveraging glucose monitoring data of diabetes patients to predict hyperglycaemic or hypoglycaemic events. However, most deep learning models are considered black boxes; hence, the model predictions are not interpretable and may not offer actionable insights into medical practitioners’ decisions. Previous work has shown that counterfactual explanations can be applied in forecasting tasks by suggesting counterfactual changes in time series inputs to achieve the desired forecasting outcome. This study proposes a generalized multivariate forecasting setup for counterfactual generation by introducing a novel approach, COMET, which imposes three domain-specific constraint mechanisms to provide counterfactual explanations for glucose forecasting. Moreover, we conduct an experimental evaluation using two diabetes patient datasets to demonstrate the effectiveness of the proposed approach in generating realistic counterfactual changes, in comparison with a baseline approach. Our qualitative analysis evaluates examples to validate that the counterfactual samples are clinically relevant and can effectively lead patients to achieve a normal range of predicted glucose levels by suggesting changes to the treatment variables.

    Read more about COMET
  • Counterfactual Explanations for Time Series Forecasting

    2024. Zhendong Wang (et al.). 2023 IEEE International Conference on Data Mining (ICDM), 1391-1396

    Conference

    Among recent developments in time series forecasting methods, deep forecasting models have gained popularity as they can utilize hidden feature patterns in time series to improve forecasting performance. Nevertheless, the majority of current deep forecasting models are opaque, making it challenging to interpret their results. While counterfactual explanations have been extensively employed as a post-hoc approach for explaining classification models, their application to forecasting models remains underexplored. In this paper, we formulate the novel problem of counterfactual generation for time series forecasting, and propose an algorithm, called ForecastCF, that solves the problem by applying gradient-based perturbations to the original time series. The perturbations are further guided by imposing constraints on the forecasted values. We experimentally evaluate ForecastCF using four state-of-the-art deep model architectures and compare it to two baselines. ForecastCF outperforms the baselines in terms of counterfactual validity and data manifold closeness, while generating meaningful and relevant counterfactuals for various forecasting tasks.
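
    The gradient-search idea can be sketched in a few lines. This is a minimal illustration, assuming a toy differentiable (here linear) forecaster; the function forecast_cf and all parameters are illustrative, not the paper's API:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy differentiable forecaster: H future values as a linear map of a window of length T.
    T, H = 24, 6
    W = rng.normal(scale=0.1, size=(H, T))
    forecast = lambda x: W @ x

    def forecast_cf(x, lower, upper, lr=0.05, steps=2000):
        """Perturb x until every forecasted value lies inside [lower, upper]."""
        x_cf = x.copy()
        for _ in range(steps):
            y = forecast(x_cf)
            # Gradient of 0.5 * sum(relu(lower - y)**2 + relu(y - upper)**2) w.r.t. y.
            residual = np.maximum(y - upper, 0.0) - np.maximum(lower - y, 0.0)
            if not residual.any():          # all bound constraints satisfied: valid counterfactual
                break
            x_cf -= lr * (W.T @ residual)   # chain rule through the linear forecaster
        return x_cf

    x = rng.normal(size=T)
    y0 = forecast(x)
    x_cf = forecast_cf(x, lower=y0 + 0.5, upper=y0 + 1.0)  # desired: a higher trajectory
    print(np.round(forecast(x_cf) - y0, 2))                # shifts should now lie in [0.5, 1.0]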

    Read more about Counterfactual Explanations for Time Series Forecasting
  • Glacier: guided locally constrained counterfactual explanations for time series classification

    2024. Zhendong Wang (et al.). Machine Learning 113, 4639-4669

    Article

    In machine learning applications, there is a need to obtain predictive models of high performance and, most importantly, to allow end-users and practitioners to understand and act on their predictions. One way to obtain such understanding is via counterfactuals, which provide sample-based explanations in the form of recommendations on which features of a test example need to be modified so that the classification outcome of a given classifier changes from an undesired outcome to a desired one. This paper focuses on the domain of time series classification, more specifically, on defining counterfactual explanations for univariate time series. We propose Glacier, a model-agnostic method for generating locally-constrained counterfactual explanations for time series classification using gradient search, either in the original space or in a latent space learned through an auto-encoder. An additional flexibility of our method is the inclusion of constraints on the counterfactual generation process that favour applying changes to particular time series points or segments while discouraging changes to others. The main purpose of these constraints is to ensure more reliable counterfactuals while increasing the efficiency of the counterfactual generation process. Two particular types of constraints are considered, i.e., example-specific constraints and global constraints. We conduct extensive experiments on 40 datasets from the UCR archive, comparing different instantiations of Glacier against three competitors. Our findings suggest that Glacier outperforms the three competitors in terms of two common metrics for counterfactuals, i.e., proximity and compactness. Moreover, Glacier obtains counterfactual validity comparable to the best of the three competitors. Finally, when comparing the unconstrained variant of Glacier to the constraint-based variants, we conclude that the inclusion of example-specific and global constraints yields good performance while demonstrating the trade-off between the different metrics.
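
    To make the constraint mechanism concrete, here is a minimal sketch assuming a toy linear classifier on the raw series; the mask plays the role of Glacier's example-specific constraints, and every name and parameter (glacier_like_cf, mask, target) is illustrative rather than the paper's interface:

    import numpy as np

    rng = np.random.default_rng(1)
    T = 50
    w, b = rng.normal(size=T), 0.0                 # toy linear classifier on the raw series
    predict_proba = lambda x: 1 / (1 + np.exp(-(w @ x + b)))

    def glacier_like_cf(x, mask, target=0.95, lr=0.1, steps=1000):
        """Gradient search for a counterfactual, changing only time points where mask == 1."""
        x_cf = x.copy()
        for _ in range(steps):
            p = predict_proba(x_cf)
            if p >= target:                        # desired class reached: valid counterfactual
                break
            # Gradient of log p w.r.t. x is (1 - p) * w for a logistic model; the mask
            # zeroes it outside the segment that is allowed to change.
            x_cf += lr * mask * (1 - p) * w
        return x_cf

    x = rng.normal(size=T)
    mask = np.zeros(T)
    mask[20:30] = 1.0                              # only this segment may be modified
    x_cf = glacier_like_cf(x, mask)
    print(f"p(desired) before={predict_proba(x):.2f} after={predict_proba(x_cf):.2f}")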

    Read more about Glacier
  • Predictive Machine Learning in Assessing Materiality: The Global Reporting Initiative Standard and Beyond

    2024. Jan Svanberg (et al.). Artificial Intelligence for Sustainability, 105-131

    Chapter

    Sustainability reporting standards state that material information should be disclosed, but materiality is neither easily nor consistently defined across companies and sectors. Research finds that materiality assessments by reporting companies and sustainability auditors are uncertain, discretionary, and subjective. This chapter investigates a machine learning approach to sustainability reporting materiality assessments that has predictive validity. The investigated assessment methodology provides materiality assessments of disclosed as well as non-disclosed sustainability items, consistent with impact materiality under the GRI (Global Reporting Initiative) reporting standard. Our machine learning model estimates the likelihood that a company fully complies with environmental responsibilities. We then explore how a state-of-the-art model interpretation method, SHAP (SHapley Additive exPlanations), developed by Lundberg and Lee (A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 2017-December, pp 4766–4775, 2017), can be used to estimate impact materiality.
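
    As an illustration of the SHAP step, here is a minimal sketch using the shap package, with a toy random forest standing in for the compliance model; the synthetic indicators and all names are placeholders, not the chapter's data or setup:

    import numpy as np
    import shap
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 8))                  # stand-in sustainability indicators
    y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=500) > 0).astype(int)

    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

    # TreeExplainer computes the SHAP values of Lundberg and Lee for tree ensembles.
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)
    # Older shap versions return a list per class; newer ones a single 3-D array.
    sv = shap_values[1] if isinstance(shap_values, list) else shap_values[..., 1]

    # Mean absolute SHAP value per indicator ~ its contribution to predicted compliance,
    # which is the quantity read as an impact materiality estimate.
    print(np.abs(sv).mean(axis=0).round(3))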

    Read more about Predictive Machine Learning in Assessing Materiality
  • Assessment of double materiality: The development of predictively valid materiality assessments with artificial intelligence

    2023. Peter Öhman, Jan Svanberg, Isak Samsten. Auditing Transformation, 205-226

    Chapter

    Sustainability reporting standards, e.g. the Global Reporting Initiative, require a broader definition of materiality than is traditionally used in financial reporting. Double materiality expands the material information concept to include information about companies' environmental and social impact relevant to society at large. A problem for reporting companies as well as auditors (even though accounting firms invest resources in establishing themselves as reliable service providers) is that the assessment of double materiality is uncertain. The chapter applies machine learning methods to suggest an approach for determining double materiality in sustainability reporting, by examining what type of information can predict environmental issues resulting from companies' operations. It represents a proposal for a structured and quantitative approach by which sustainability auditors can determine double materiality, thereby potentially facilitating sustainability reporting and assurance in accordance with future regulation.

    Read more about Assessment of double materiality
  • Must social performance ratings be idiosyncratic? An exploration of social performance ratings with predictive validity

    2023. Jan Svanberg (et al.). Sustainability Accounting, Management and Policy Journal 14 (7), 313-348

    Article

    The purpose of this study is to develop a method for assessing social performance. Traditionally, providers of environmental, social and governance (ESG) ratings use subjectively weighted arithmetic averages to combine a set of social performance (SP) indicators into a single rating. To overcome this problem, this study investigates the preconditions for a new methodology for rating the SP component of ESG by applying machine learning (ML) and artificial intelligence (AI) anchored in social controversies.

    The study proposes the use of a data-driven rating methodology that derives the relative importance of SP features from their contribution to the prediction of social controversies. The authors use the proposed method to address the weighting problem of overall ESG ratings and further investigate whether prediction is possible.

    The authors find that ML models can predict controversies with high predictive performance and validity. The results indicate that the weighting problem of ESG ratings can be solved with a data-driven approach. The crucial precondition for the proposed rating methodology, however, is that social controversies are predicted by a broad set of SP indicators. The results also suggest that predictively valid ratings can be developed with this ML-based AI method.

    Practical implications

    This study offers practical solutions to ESG rating problems, with implications for investors, ESG raters, and socially responsible investing.

    The proposed ML-based AI method can help achieve better ESG ratings, which in turn will help improve SP, with implications for organizations and societies through sustainable development.

    To the best of the authors' knowledge, this research is one of the first studies to offer a unique method for addressing the ESG rating problem and improving sustainability by focusing on SP indicators.

    Read more about Must social performance ratings be idiosyncratic? An exploration of social performance ratings with predictive validity
  • Prediction of Controversies and Estimation of ESG Performance: An Experimental Investigation Using Machine Learning

    2023. Jan Svanberg (et al.). Handbook of Big Data and Analytics in Accounting and Auditing, 65-87

    Chapter

    We develop a new methodology for computing environmental, social, and governance (ESG) ratings using a mode of artificial intelligence (AI) called machine learning (ML) to make ESG more transparent. The ML algorithms anchor our rating methodology in controversies related to non-compliance with corporate social responsibility (CSR). This methodology is consistent with the information needs of institutional investors and is the first ESG methodology with predictive validity. Our best model predicts which companies are likely to experience controversies. It has a precision of 70–84 per cent and high predictive performance on several measures. It also provides evidence of which indicators contribute the most to the predicted likelihood of experiencing an ESG controversy. Furthermore, while the common approach to rating companies is to aggregate indicators using the arithmetic average, a simple explanatory model designed to describe an average company, the proposed rating methodology uses state-of-the-art AI technology to aggregate ESG indicators into holistic ratings for the predictive modelling of individual company performance.

    Predictive modelling using ML enables our models to aggregate the information contained in ESG indicators with far less information loss than with the predominant aggregation method.
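
    A minimal sketch of the prediction step under the same framing, with synthetic stand-ins for the ESG indicators and controversy labels; the model choice and the precision check are illustrative only:

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.metrics import precision_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 20))                # stand-in ESG indicators
    y = (X[:, :5].sum(axis=1) + rng.normal(size=2000) > 1).astype(int)  # "controversy"

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
    print(f"precision = {precision_score(y_te, model.predict(X_te)):.2f}")

    # The predicted controversy probability can then be inverted into a holistic rating.
    rating = 1 - model.predict_proba(X_te)[:, 1]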

    Read more about Prediction of Controversies and Estimation of ESG Performance
  • Style-transfer counterfactual explanations: An application to mortality prevention of ICU patients

    2023. Zhendong Wang (et al.). Artificial Intelligence in Medicine 135

    Article

    In recent years, machine learning methods have been rapidly adopted in the medical domain. However, current state-of-the-art medical mining methods usually produce opaque, black-box models. To address the lack of model transparency, substantial attention has been given to developing interpretable machine learning models. In the medical domain, counterfactuals can provide example-based explanations for predictions, and show practitioners the modifications required to change a prediction from an undesired to a desired state. In this paper, we propose a counterfactual solution, MedSeqCF, for preventing the mortality of three cohorts of ICU patients, by representing their electronic health records as medical event sequences and generating counterfactuals by adapting a text style-transfer technique. We propose three model augmentations for MedSeqCF to integrate additional medical knowledge for generating more trustworthy counterfactuals. Experimental results on the MIMIC-III dataset strongly suggest that augmented style-transfer methods can be effectively adapted for the problem of counterfactual explanations in healthcare applications and can further improve the model performance in terms of validity, BLEU-4, local outlier factor, and edit distance. In addition, our qualitative analysis of the results, in consultation with medical experts, suggests that our style-transfer solutions can generate clinically relevant and actionable counterfactual explanations.

    Read more about Style-transfer counterfactual explanations
  • Surveillance of communicable diseases using social media: A systematic review

    2023. Patrick Pilipiec, Isak Samsten, András Bota. PLOS ONE 18 (2)

    Article

    Background

    Communicable diseases pose a severe threat to public health and economic growth. The traditional methods that are used for public health surveillance, however, involve many drawbacks, such as being labor intensive to operate and resulting in a lag between data collection and reporting. To effectively address the limitations of these traditional methods and to mitigate the adverse effects of these diseases, a proactive and real-time public health surveillance system is needed. Previous studies have indicated the usefulness of performing text mining on social media.

    Objective

    To conduct a systematic review of the literature that used textual content published to social media for the purpose of the surveillance and prediction of communicable diseases.

    Methodology

    Broad search queries were formulated and performed in four databases. Both journal articles and conference materials were included. The quality of the studies, operationalized as reliability and validity, was assessed. This qualitative systematic review was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.

    Results

    Twenty-three publications were included in this systematic review. All studies reported positive results for using textual social media content to monitor communicable diseases. Most studies used Twitter as a source for these data. Influenza was studied most frequently, while other communicable diseases received far less attention. Journal articles had a higher quality (reliability and validity) than conference papers. However, studies often failed to provide important information about procedures and implementation.

    Conclusion

    Text mining of health-related content published on social media can serve as a novel and powerful tool for the automated, real-time, and remote monitoring of public health and for the surveillance and prediction of communicable diseases in particular. This tool can address limitations related to traditional surveillance methods, and it has the potential to supplement traditional methods for public health surveillance.

    Read more about Surveillance of communicable diseases using social media
  • Corporate governance performance ratings with machine learning

    2022. Jan Svanberg (et al.). International Journal of Intelligent Systems in Accounting, Finance & Management 29 (1), 50-68

    Article

    We use machine learning with a cross-sectional research design to predict governance controversies and to develop a measure of the governance component of the environmental, social, and governance (ESG) metrics. Based on comprehensive governance data from 2,517 companies over a period of 10 years and investigating nine machine-learning algorithms, we find that governance controversies can be predicted with high predictive performance. Our proposed governance rating methodology has two unique advantages compared with traditional ESG ratings: it rates companies' compliance with governance responsibilities, and it has predictive validity. Our study demonstrates a solution to what is likely the greatest challenge for the finance industry today: how to assess a company's sustainability with validity and accuracy. Prior to this study, the ESG rating industry and the literature have not provided evidence that widely adopted governance ratings are valid. This study describes the only methodology for developing governance performance ratings based on companies' compliance with governance responsibilities and for which there is evidence of predictive validity.

    Read more about Corporate governance performance ratings with machine learning
  • Post Hoc Explainability for Time Series Classification. Toward a signal processing perspective

    2022. Rami Mochaourab (et al.). IEEE signal processing magazine (Print) 39 (4), 119-129

    Article

    Time series data correspond to observations of phenomena that are recorded over time [1]. Such data are encountered regularly in a wide range of applications, such as speech and music recognition, monitoring health and medical diagnosis, financial analysis, motion tracking, and shape identification, to name a few. With such a diversity of applications and the large variations in their characteristics, time series classification is a complex and challenging task. One of the fundamental steps in the design of time series classifiers is that of defining or constructing the discriminant features that help differentiate between classes. This is typically achieved by designing novel representation techniques [2] that transform the raw time series data to a new data domain, where subsequently a classifier is trained on the transformed data, such as one-nearest neighbors [3] or random forests [4]. In recent time series classification approaches, deep neural network models have been employed that are able to jointly learn a representation of time series and perform classification [5]. In many of these sophisticated approaches, the discriminant features tend to be complicated to analyze and interpret, given the high degree of nonlinearity.

    Read more about Post Hoc Explainability for Time Series Classification. Toward a signal processing perspective
  • Prediction of environmental controversies and development of a corporate environmental performance rating methodology

    2022. Jan Svanberg (et al.). Journal of Cleaner Production 344

    Article

    Institutional investors seek to make environmentally sustainable investments using environment, social, governance (ESG) ratings. Current ESG ratings have limited validity because they are based on idiosyncratic scores derived using subjective, discretionary methodologies. We discuss a new direction for developing corporate environmental performance (CEP) ratings and propose a solution to the limited validity problem by anchoring such ratings in environmental controversies. The study uses a novel machine learning approach to make the ratings more comprehensive and transparent, based on a set of algorithmic approaches that handle nonlinearity when aggregating ESG indicators. This approach minimizes the rater subjectivity and preferences inherent in traditional ESG indicators. The findings indicate that controversies as proxies for non-compliance with environmental responsibilities can be predicted well. We conclude that environmental performance ratings developed using our machine learning framework offer predictive validity consistent with institutional investors' demand for socially responsible investment screening.

    Read more about Prediction of environmental controversies and development of a corporate environmental performance rating methodology
  • Assessing the Clinical Validity of Attention-based and SHAP Temporal Explanations for Adverse Drug Event Predictions

    2021. Jonathan Rebane (et al.). 2021 IEEE 34th International Symposium on Computer-Based Medical Systems, 235-240

    Conference

    Attention mechanisms form the basis of providing temporal explanations for a variety of state-of-the-art recurrent neural network (RNN) based architectures. However, evidence is lacking that attention mechanisms are capable of providing sufficiently valid medical explanations. In this study we focus on the quality of temporal explanations for the medical problem of adverse drug event (ADE) prediction by comparing explanations, globally and locally, provided by an attention-based RNN architecture against those provided by a more basic RNN using the post-hoc SHAP framework, a popular alternative that adheres to several desirable explainability properties. The validity of this comparison is supported by medical expert knowledge gathered for the purpose of this study. This investigation has uncovered that both explanation methods are appropriate for ADE explanations and may be used complementarily, with SHAP providing more clinically appropriate global explanations and attention mechanisms capturing more clinically appropriate local explanations. Additional feedback from medical experts reveals that SHAP may be more applicable to real-time clinical encounters, in which efficiency must be prioritised, whereas attention explanations possess properties more appropriate for offline analyses.

    Read more about Assessing the Clinical Validity of Attention-based and SHAP Temporal Explanations for Adverse Drug Event Predictions
  • Counterfactual Explanations for Survival Prediction of Cardiovascular ICU Patients

    2021. Zhendong Wang, Isak Samsten, Panagiotis Papapetrou. Artificial Intelligence in Medicine, 338-348

    Conference

    In recent years, machine learning methods have been rapidly implemented in the medical domain. However, current state-of-the-art methods usually produce opaque, black-box models. To address the lack of model transparency, substantial attention has been given to developing interpretable machine learning methods. In the medical domain, counterfactuals can provide example-based explanations for predictions, and show practitioners the modifications required to change a prediction from an undesired to a desired state. In this paper, we propose a counterfactual explanation solution for predicting the survival of cardiovascular ICU patients, by representing their electronic health records as sequences of medical events and generating counterfactuals by adapting a text style-transfer technique. Experimental results on the MIMIC-III dataset strongly suggest that text style-transfer methods can be effectively adapted for the problem of counterfactual explanations in healthcare applications and can achieve competitive performance in terms of counterfactual validity, BLEU-4, and local outlier metrics.

    Read more about Counterfactual Explanations for Survival Prediction of Cardiovascular ICU Patients
  • Learning Time Series Counterfactuals via Latent Space Representations

    2021. Zhendong Wang (et al.). Discovery Science, 369-384

    Conference

    Counterfactual explanations provide sample-based explanations indicating which features of the original sample need to be modified to change the classification result from an undesired to a desired state, and hence make the model more interpretable. Previous work on LatentCF presents an algorithm for image data that employs auto-encoder models to transform original samples directly into counterfactuals in a latent space representation. In our paper, we adapt the approach to time series classification and propose an improved algorithm named LatentCF++, which introduces additional constraints in the counterfactual generation process. We conduct an extensive experiment on a total of 40 datasets from the UCR archive, comparing to current state-of-the-art methods. Based on our evaluation metrics, we show that the LatentCF++ framework can, with high probability, generate valid counterfactuals and achieve explanations comparable to the current state-of-the-art. Our proposed approach can also generate counterfactuals that are considerably closer to the decision boundary in terms of margin difference.
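
    A minimal sketch of the encode-perturb-decode idea, with a linear (PCA) auto-encoder and a linear classifier standing in for the learned models; the probability threshold mirrors the validity constraint LatentCF++ adds, but every name and parameter here is illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    n, T, k = 200, 60, 8
    X = rng.normal(size=(n, T)).cumsum(axis=1)     # toy time series dataset

    # Linear auto-encoder (PCA) standing in for a learned auto-encoder.
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:k].T                                   # T x k
    encode = lambda x: (x - mu) @ V
    decode = lambda z: z @ V.T + mu

    w = rng.normal(size=T)                         # toy linear classifier on the series
    proba = lambda x: 1 / (1 + np.exp(-x @ w))

    def latent_cf(x, target=0.5, lr=0.01, steps=2000):
        """Perturb in latent space until the decoded series crosses the decision threshold."""
        z = encode(x)
        for _ in range(steps):
            p = proba(decode(z))
            if p > target:                         # validity constraint satisfied
                break
            z += lr * (1 - p) * (V.T @ w)          # gradient of log p through the decoder
        return decode(z)

    x = min(X, key=proba)                          # the sample furthest from the desired class
    x_cf = latent_cf(x)
    print(f"before={proba(x):.2f} after={proba(x_cf):.2f}")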

    Read more about Learning Time Series Counterfactuals via Latent Space Representations
  • SMILE: A feature-based temporal abstraction framework for event-interval sequence classification

    2021. Jonathan Rebane (et al.). Data mining and knowledge discovery 35 (1), 372-399

    Article

    In this paper, we study the problem of classification of sequences of temporal intervals. Our main contribution is a novel framework, which we call SMILE, for extracting relevant features from interval sequences to construct classifiers. SMILE introduces the notion of random temporal abstraction features, which we define as e-lets, as a means to capture information pertaining to class-discriminatory events that occur across the span of complete interval sequences. Our empirical evaluation is applied to a wide array of benchmark data sets and fourteen novel datasets for adverse drug event detection. We demonstrate how the introduction of simple sequential features, followed by progressively more complex features, each improves classification performance. Importantly, this investigation demonstrates that SMILE significantly improves AUC performance over the current state-of-the-art. The investigation also reveals that the selection of the underlying classification algorithm is important for achieving superior predictive performance, and how the number of features influences the performance of our framework.

    Read more about SMILE
  • Exploiting complex medical data with interpretable deep learning for adverse drug event prediction

    2020. Jonathan Rebane, Isak Samsten, Panagiotis Papapetrou. Artificial Intelligence in Medicine 109

    Article

    A variety of deep learning architectures have been developed for the goal of predictive modelling and knowledge extraction from medical records. Several models have placed strong emphasis on temporal attention mechanisms and decay factors as a means to include highly temporally relevant information regarding the recency of medical event occurrence, while facilitating medical code-level interpretability. In this study we utilise such models with a large Electronic Patient Record (EPR) data set consisting of diagnoses, medication, and clinical text data for the purpose of adverse drug event (ADE) prediction. The first contribution of this work is an empirical evaluation of two state-of-the-art medical-code based models in terms of objective performance metrics for ADE prediction on diagnosis and medication data. Secondly, as an extension of previous work, we augment an interpretable deep learning architecture to permit numerical risk and clinical text features and demonstrate how this approach yields improved predictive performance compared to the other baselines. Finally, we assess the importance of attention mechanisms with regard to their usefulness for medical code-level and text-level interpretability, which may facilitate novel insights pertaining to the nature of ADE occurrence within the health care domain.

    Read more about Exploiting complex medical data with interpretable deep learning for adverse drug event prediction
  • Locally and globally explainable time series tweaking

    2020. Isak Karlsson (et al.). Knowledge and Information Systems 62 (5), 1671-1700

    Article

    Time series classification has received great attention over the past decade with a wide range of methods focusing on predictive performance by exploiting various types of temporal features. Nonetheless, little emphasis has been placed on interpretability and explainability. In this paper, we formulate the novel problem of explainable time series tweaking, where, given a time series and an opaque classifier that provides a particular classification decision for the time series, we want to find the changes to be performed to the given time series so that the classifier changes its decision to another class. We show that the problem is NP-hard, and focus on three instantiations of the problem using global and local transformations. In the former case, we investigate the k-nearest neighbor classifier and provide an algorithmic solution to the global time series tweaking problem. In the latter case, we investigate the random shapelet forest classifier and focus on two instantiations of the local time series tweaking problem, which we refer to as reversible and irreversible time series tweaking, and propose two algorithmic solutions for the two problems along with simple optimizations. An extensive experimental evaluation on a variety of real datasets demonstrates the usefulness and effectiveness of our problem formulation and solutions.
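
    A minimal sketch of the local transformation at the heart of reversible tweaking, assuming a single shapelet condition dist(x, s) <= theta taken from one tree; the functions and parameters are illustrative, not the paper's implementation:

    import numpy as np

    def best_match(x, s):
        """Start index and distance of the subsequence of x closest (Euclidean) to s."""
        d = [np.linalg.norm(x[i:i + len(s)] - s) for i in range(len(x) - len(s) + 1)]
        return int(np.argmin(d)), min(d)

    def tweak_towards(x, s, theta):
        """Move the closest subsequence of x just inside distance theta of shapelet s."""
        i, d = best_match(x, s)
        if d <= theta:
            return x                               # the condition is already satisfied
        x_new = x.copy()
        seg = x_new[i:i + len(s)]
        alpha = 1 - (0.99 * theta) / d             # fraction of the way toward s needed
        x_new[i:i + len(s)] = seg + alpha * (s - seg)
        return x_new

    rng = np.random.default_rng(0)
    x = rng.normal(size=100)
    s = np.sin(np.linspace(0, np.pi, 10))          # a toy shapelet
    x_t = tweak_towards(x, s, theta=1.0)
    print(best_match(x, s)[1], "->", best_match(x_t, s)[1])   # distance drops below theta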

    Read more about Locally and globally explainable time series tweaking
  • A classification framework for exploiting sparse multi-variate temporal features with application to adverse drug event detection in medical records

    2019. Francesco Bagattini (et al.). BMC Medical Informatics and Decision Making 19

    Article

    Background: Adverse drug events (ADEs), as well as other preventable adverse events in the hospital setting, incur a yearly monetary cost of approximately $3.5 billion in the United States alone. It is therefore of paramount importance to reduce the impact and prevalence of ADEs within the healthcare sector, not only because this will reduce human suffering, but also as a means to substantially reduce economic strains on the healthcare system. One approach to mitigate this problem is to employ predictive models. While existing methods have been focusing on the exploitation of static features, limited attention has been given to temporal features.

    Methods: In this paper, we present a novel classification framework for detecting ADEs in complex Electronic health records (EHRs) by exploiting the temporality and sparsity of the underlying features. The proposed framework consists of three phases for transforming sparse and multi-variate time series features into a single-valued feature representation, which can then be used by any classifier. Moreover, we propose and evaluate three different strategies for leveraging feature sparsity by incorporating it into the new representation.

    Results: A large-scale evaluation on 15 ADE datasets extracted from a real-world EHR system shows that the proposed framework achieves significantly improved predictive performance compared to state-of-the-art. Moreover, our framework can reveal features that are clinically consistent with medical findings on ADE detection.

    Conclusions: Our study and experimental findings demonstrate that temporal multi-variate features of variable length and with high sparsity can be effectively utilized to predict ADEs from EHRs. Two key advantages of our framework are that it is method agnostic, i.e., versatile, and of low computational cost, i.e., fast; hence providing an important building block for future exploitation within the domain of machine learning from EHRs.

    Read more about A classification framework for exploiting sparse multi-variate temporal features with application to adverse drug event detection in medical records
  • Example-Based Feature Tweaking Using Random Forests

    2019. Tony Lindgren (et al.). 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science, 53-60

    Conference

    In certain application areas when using predictive models, it is not enough to make an accurate prediction for an example; instead, it might be more important to change a prediction from an undesired class into a desired class. In this paper we investigate methods for changing predictions of examples. To this end, we introduce a novel algorithm for changing predictions of examples and we compare this novel method to an existing method and a baseline method. In an empirical evaluation we compare the three methods on a total of 22 datasets. The results show that the novel method and the baseline method can change an example from an undesired class into a desired class in more cases than the competitor method (and in some cases this difference is statistically significant). We also show that the distance, as measured by the Euclidean norm, is higher for the novel and baseline methods (and in some cases this difference is statistically significant) than for the state-of-the-art. The methods and their proposed changes are also evaluated subjectively in a medical domain, with interesting results.

    Read more about Example-Based Feature Tweaking Using Random Forests
  • Explainable time series tweaking via irreversible and reversible temporal transformations

    2018. Isak Karlsson (et al.). 2018 IEEE International Conference on Data Mining (ICDM), 207-216

    Conference

    Time series classification has received great attention over the past decade with a wide range of methods focusing on predictive performance by exploiting various types of temporal features. Nonetheless, little emphasis has been placed on interpretability and explainability. In this paper, we formulate the novel problem of explainable time series tweaking, where, given a time series and an opaque classifier that provides a particular classification decision for the time series, we want to find the minimum number of changes to be performed to the given time series so that the classifier changes its decision to another class. We show that the problem is NP-hard, and focus on two instantiations of the problem, which we refer to as reversible and irreversible time series tweaking. The classifier under investigation is the random shapelet forest classifier. Moreover, we propose two algorithmic solutions for the two problems along with simple optimizations, as well as a baseline solution using the nearest neighbor classifier. An extensive experimental evaluation on a variety of real datasets demonstrates the usefulness and effectiveness of our problem formulation and solutions.

    Read more about Explainable time series tweaking via irreversible and reversible temporal transformations
  • Exploring epistaxis as an adverse effect of anti-thrombotic drugs and outdoor temperature

    2018. Jaakko Hollmén (et al.). Proceedings of the 11th PErvasive Technologies Related to Assistive Environments Conference (PETRA), 1-4

    Conference

    Electronic health records contain a wealth of epidemiological information about diseases at the population level. Using a database of medical diagnoses and drug prescriptions in electronic health records, we investigate the correlation between outdoor temperature and the incidence of epistaxis over time for two groups of patients. One group consists of patients that had been diagnosed with epistaxis and also been prescribed at least one of the three anti-thrombotic agents: Warfarin, Apixaban, or Rivaroxaban. The other group consists of patients that had been diagnosed with epistaxis and not been prescribed any of the three anti-thrombotic drugs. We find a strong negative correlation between the incidence of epistaxis and outdoor temperature for the group that had not been prescribed any of the three anti-thrombotic drugs, while there is a weaker correlation between incidence of epistaxis and outdoor temperature for the other group. It is, however, clear that both groups are affected in a similar way, such that the incidence of epistaxis increases with colder temperatures.

    Read more about Exploring epistaxis as an adverse effect of anti-thrombotic drugs and outdoor temperature
  • Seq2Seq RNNs and ARIMA models for Cryptocurrency Prediction: A Comparative Study

    2018. Jonathan Rebane (et al.). Proceedings of SIGKDD Workshop on Fintech (SIGKDD Fintech’18)

    Conference

    Cryptocurrency price prediction has recently become an alluring topic, attracting massive media and investor interest. Traditional models, such as Autoregressive Integrated Moving Average (ARIMA) models, and more recently popular models, such as Recurrent Neural Networks (RNNs), can be considered candidates for such financial prediction problems, with RNNs being capable of utilizing various endogenous and exogenous input sources. This study compares the performance of ARIMA to that of a seq2seq recurrent deep multi-layer neural network (seq2seq) utilizing a varied selection of input types. The results demonstrate superior performance of seq2seq over ARIMA for models generated throughout most of bitcoin price history, with additional data sources leading to better performance during less volatile price periods.
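
    For reference, a minimal sketch of the classical baseline using statsmodels on a synthetic stand-in for the price series; the order (2, 1, 2) and the horizon are illustrative, not the study's configuration:

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(0)
    # Synthetic stand-in for a daily price series (the study used bitcoin price history).
    price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.02, 500))))

    train, test = price[:-30], price[-30:]
    model = ARIMA(train, order=(2, 1, 2)).fit()    # (p, d, q) chosen for illustration
    forecast = model.forecast(steps=30)

    mae = np.mean(np.abs(forecast.values - test.values))
    print(f"ARIMA(2,1,2) 30-step MAE: {mae:.2f}")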

    Read more about Seq2Seq RNNs and ARIMA models for Cryptocurrency Prediction
  • Conformal prediction using random survival forests

    2017. Henrik Boström (et al.). 16th IEEE International Conference on Machine Learning and Applications, 812-817

    Conference

    Random survival forests constitute a robust approach to survival modeling, i.e., predicting the probability that an event will occur before or on a given point in time. As with most standard predictive models, no guarantee on the prediction error is provided for this model; the error is instead typically evaluated empirically. Conformal prediction is a rather recent framework, which allows the error of a model to be determined by a user-specified confidence level, something which is achieved by considering set rather than point predictions. The framework, which has been applied to some of the most popular classification and regression techniques, is here for the first time applied to survival modeling, through random survival forests. An empirical investigation is presented where the technique is evaluated on datasets from two real-world applications: predicting component failure in trucks using operational data, and predicting survival and treatment of heart failure patients from administrative healthcare data. The experimental results show that the error levels indeed are very close to the provided confidence levels, as guaranteed by the conformal prediction framework, and that the error for predicting each outcome, i.e., event or no-event, can be controlled separately. The latter may, however, lead to less informative predictions, i.e., larger prediction sets, in case the class distribution is heavily imbalanced.
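
    The paper applies the framework to random survival forests; as a simpler, self-contained illustration of split (inductive) conformal prediction itself, here is a classification sketch on synthetic event/no-event data, with all names illustrative:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(3000, 10))
    y = (X[:, 0] + rng.normal(scale=0.8, size=3000) > 0).astype(int)   # event / no-event

    X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.33, random_state=0)
    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    # Nonconformity of a calibration example: 1 - predicted probability of its true class.
    proba_cal = model.predict_proba(X_cal)
    alpha_cal = 1 - proba_cal[np.arange(len(y_cal)), y_cal]

    def prediction_set(x, confidence=0.9):
        """All labels whose conformal p-value exceeds 1 - confidence."""
        p = model.predict_proba(x.reshape(1, -1))[0]
        return [c for c in (0, 1)
                if (np.sum(alpha_cal >= 1 - p[c]) + 1) / (len(alpha_cal) + 1) > 1 - confidence]

    print(prediction_set(X[0]))    # a single confident label, or both when ambiguous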

    Read more about Conformal prediction using random survival forests
  • KAPMiner: Mining Ordered Association Rules with Constraints

    2017. Isak Karlsson, Panagiotis Papapetrou, Lars Asker. Advances in Intelligent Data Analysis XVI, 149-161

    Conference

    We study the problem of mining ordered association rules from event sequences. Ordered association rules differ from regular association rules in that the events occurring in the antecedent (left hand side) of the rule are temporally constrained to occur strictly before the events in the consequent (right hand side). We argue that such constraints can provide more meaningful rules in particular application domains, such as health care. The importance and interestingness of the extracted rules are quantified by adapting existing rule mining metrics. Our experimental evaluation on real data sets demonstrates the descriptive power of ordered association rules against ordinary association rules.
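
    A minimal sketch of the support computation for an ordered rule, under one reasonable reading of the temporal constraint (the first occurrence of every antecedent event lies strictly before the first occurrence of every consequent event); the data and the supports function are illustrative:

    # Each sequence is a list of (timestamp, event) pairs.
    sequences = [
        [(1, "a"), (2, "b"), (5, "c")],
        [(1, "a"), (3, "c"), (4, "b")],
        [(2, "b"), (6, "c")],
    ]

    def supports(seq, antecedent, consequent):
        """True if every antecedent event first occurs strictly before every consequent event."""
        first = {}
        for t, e in sorted(seq):
            first.setdefault(e, t)                 # first occurrence time of each event
        if not (antecedent | consequent).issubset(first):
            return False
        return max(first[e] for e in antecedent) < min(first[e] for e in consequent)

    ante, cons = {"a"}, {"c"}
    support = sum(supports(s, ante, cons) for s in sequences) / len(sequences)
    print(f"support({ante} -> {cons}) = {support:.2f}")   # 0.67: holds in two of three sequences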

    Read more about KAPMiner: Mining Ordered Association Rules with Constraints
  • Learning from Administrative Health Registries

    2017. Jonathan Rebane (et al.). SoGood 2017: Data Science for Social Good

    Conference

    Over the last decades, the healthcare domain has seen a tremendous increase in interest in methods for making inferences about patient care using large quantities of medical data. Such data are often stored in electronic health records and administrative health registries. As these data sources have grown increasingly complex, with millions of patients represented by thousands of attributes, static or time-evolving, finding relevant and accurate patterns that can be used for predictive or descriptive modelling is impractical for human experts. In this paper, we concentrate our review on Swedish Administrative Health Registries (AHRs) and Electronic Health Records (EHRs) and provide an overview of recent and ongoing work in the area, with a focus on adverse drug events (ADEs) and heart failure.

    Read more about Learning from Administrative Health Registries
  • Mining disproportional itemsets for characterizing groups of heart failure patients from administrative health records

    2017. Isak Karlsson (et al.). Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments, 394-398

    Conference

    Heart failure is a serious medical condition involving decreased quality of life and an increased risk of premature death. A recent evaluation by the Swedish National Board of Health and Welfare shows that Swedish heart failure patients are often undertreated and do not receive basic medication as recommended by the national guidelines for treatment of heart failure. The objective of this paper is to use registry data to characterize groups of heart failure patients, with an emphasis on basic treatment. Towards this end, we explore the applicability of frequent itemset mining and disproportionality analysis for finding interesting and distinctive characterizations of a target group of patients, e.g., those who have received basic treatment, against a control group, e.g., those who have not received basic treatment. Our empirical evaluation is performed on data extracted from administrative health records from Stockholm County covering the years 2010–2016. Our findings suggest that frequency is not always the most appropriate measure of importance for frequent itemsets, while itemset disproportionality against a control group provides alternative rankings of the extracted itemsets, leading to some medically intuitive characterizations of the target groups.
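
    A minimal sketch of the disproportionality idea: score an itemset by how much more frequent it is in the target group than in the control group. The records, smoothing, and names below are illustrative, not the paper's data or exact measure:

    from itertools import combinations

    # Toy records: sets of items (e.g., drugs/diagnoses) for target and control patients.
    target = [{"ace", "beta", "statin"}, {"ace", "beta"}, {"beta", "statin"}]
    control = [{"ace"}, {"statin"}, {"ace", "statin"}, {"beta"}]

    def freq(itemset, records):
        # Add-one smoothing keeps itemsets that are absent from one group finite.
        return (sum(itemset <= r for r in records) + 1) / (len(records) + 1)

    def disproportionality(itemset):
        """Ratio of itemset frequency in the target group vs. the control group."""
        return freq(itemset, target) / freq(itemset, control)

    items = sorted(set().union(*target, *control))
    for pair in combinations(items, 2):
        print(set(pair), round(disproportionality(frozenset(pair)), 2))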

    Read more about Mining disproportional itemsets for characterizing groups of heart failure patients from administrative health records
  • Early Random Shapelet Forest

    2016. Isak Karlsson, Panagiotis Papapetrou, Henrik Boström. Discovery Science, 261-276

    Conference

    Early classification of time series has emerged as an increasingly important and challenging problem within signal processing, especially in domains where timely decisions are critical, such as medical diagnosis in healthcare. Shapelets, i.e., discriminative sub-sequences, have been proposed for time series classification as a means to capture local and phase-independent information. Recently, forests of randomized shapelet trees have been shown to produce state-of-the-art predictive performance at a low computational cost. In this work, they are extended to allow for early classification of time series. An extensive empirical investigation is presented, showing that the proposed algorithm is superior to alternative state-of-the-art approaches when predictive performance is considered more important than earliness. The algorithm allows for tuning the trade-off between accuracy and earliness, thereby supporting the generation of early classifiers that can be dynamically adapted to specific needs at low computational cost.

    Read more about Early Random Shapelet Forest
  • Generalized random shapelet forests

    2016. Isak Karlsson, Panagiotis Papapetrou, Henrik Boström. Data mining and knowledge discovery 30 (5), 1053-1085

    Article

    Shapelets are discriminative subsequences of time series, usually embedded in shapelet-based decision trees. The enumeration of time series shapelets is, however, computationally costly, which in addition to the inherent difficulty of the decision tree learning algorithm to effectively handle high-dimensional data, severely limits the applicability of shapelet-based decision tree learning from large (multivariate) time series databases. This paper introduces a novel tree-based ensemble method for univariate and multivariate time series classification using shapelets, called the generalized random shapelet forest algorithm. The algorithm generates a set of shapelet-based decision trees, where both the choice of instances used for building a tree and the choice of shapelets are randomized. For univariate time series, it is demonstrated through an extensive empirical investigation that the proposed algorithm yields predictive performance comparable to the current state-of-the-art and significantly outperforms several alternative algorithms, while being at least an order of magnitude faster. Similarly for multivariate time series, it is shown that the algorithm is significantly less computationally costly and more accurate than the current state-of-the-art.
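
    A minimal sketch of the two randomized ingredients: sampling shapelets at random positions and lengths, and the minimum-distance feature each tree node splits on. The tree induction itself is omitted, and all names are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)

    def shapelet_distance(x, s):
        """Minimum Euclidean distance between shapelet s and any window of series x."""
        return min(np.linalg.norm(x[i:i + len(s)] - s) for i in range(len(x) - len(s) + 1))

    def random_shapelets(X, n_shapelets=10, min_len=5, max_len=20):
        """Sample subsequences at random series, positions, and lengths."""
        out = []
        for _ in range(n_shapelets):
            row = X[rng.integers(len(X))]
            length = rng.integers(min_len, max_len + 1)
            start = rng.integers(len(row) - length + 1)
            out.append(row[start:start + length])
        return out

    X = rng.normal(size=(20, 100)).cumsum(axis=1)  # toy univariate series
    shapelets = random_shapelets(X)
    # Each tree node would pick the shapelet/threshold pair that best splits the data;
    # these distances are the features such a split is computed on.
    D = np.array([[shapelet_distance(x, s) for s in shapelets] for x in X])
    print(D.shape)                                 # (20, 10)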

    Read more about Generalized random shapelet forests
  • Predicting Adverse Drug Events using Heterogeneous Event Sequences

    2016. Isak Karlsson, Henrik Boström. 2016 IEEE International Conference on Healthcare Informatics (ICHI), 356-362

    Conference

    Adverse drug events (ADEs) are known to be severely under-reported in electronic health record (EHR) systems. One approach to mitigate this problem is to employ machine learning methods to detect and signal for potentially missing ADEs, with the aim of increasing reporting rates. There are, however, many challenges involved in constructing prediction models for this task, since data present in health care records is heterogeneous, high-dimensional, sparse, and temporal. Previous approaches typically employ bag-of-items representations of clinical events that are present in a record, ignoring the temporal aspects. In this paper, we study the problem of classifying heterogeneous and multivariate event sequences using a novel algorithm building on the well-known concept of ensemble learning. The proposed approach is empirically evaluated using 27 datasets extracted from a real EHR database with different ADEs present. The results indicate that the proposed approach, which explicitly models the temporal nature of clinical data, can be expected to outperform, in terms of the trade-off between precision and specificity, models that do not consider the temporal aspects.

    Read more about Predicting Adverse Drug Events using Heterogeneous Event Sequences
  • Semigeometric Tiling of Event Sequences

    2016. Andreas Henelius (et al.). Machine Learning and Knowledge Discovery in Databases, 329-344

    Conference

    Event sequences are ubiquitous, e.g., in finance, medicine, and social media. Often the same underlying phenomenon, such as television advertisements during the Super Bowl, is reflected in independent event sequences, like different Twitter users. It is hence of interest to find combinations of temporal segments and subsets of sequences where an event of interest, like a particular hashtag, has an increased occurrence probability. Such patterns allow exploration of the event sequences in terms of their evolving temporal dynamics, and provide more fine-grained insights into the data than, for example, straightforward clustering can reveal. We formulate the task of finding such patterns as a novel matrix tiling problem, and propose two algorithms for solving it. Our first algorithm is a greedy set-cover heuristic, while in the second approach we view the problem as time-series segmentation. We apply the algorithms on real and artificial datasets and obtain promising results. The software related to this paper is available at https://github.com/bwrc/semigeom-r.

    Read more about Semigeometric Tiling of Event Sequences
  • Embedding-based subsequence matching with gaps-range-tolerances: a Query-By-Humming application

    2015. Alexios Kotsifakos (et al.). The VLDB journal 24 (4), 519-536

    Article

    We present a subsequence matching framework that allows for gaps in both query and target sequences, employs variable matching tolerance efficiently tuned for each query and target sequence, and constrains the maximum matching range. Using this framework, a dynamic programming method is proposed, called SMBGT, that, given a short query sequence Q and a large database, identifies in quadratic time the subsequence of the database that best matches Q. SMBGT is highly applicable to music retrieval. However, in Query-By-Humming applications, runtime is critical. Hence, we propose a novel embedding-based approach, called ISMBGT, for speeding up search under SMBGT. Using a set of reference sequences, ISMBGT maps both Q and each position of each database sequence into vectors. The database vectors closest to the query vector are identified, and SMBGT is then applied between Q and the subsequences that correspond to those database vectors. The key novelties of ISMBGT are that it does not require training, it is query sensitive, and it exploits the flexibility of SMBGT. We present an extensive experimental evaluation using synthetic and hummed queries on a large music database. Our findings show that ISMBGT can achieve speedups of up to an order of magnitude against brute-force search and over an order of magnitude against cDTW, while maintaining a retrieval accuracy very close to that of brute-force search.

    Read more about Embedding-based subsequence matching with gaps-range-tolerances
  • Forests of Randomized Shapelet Trees

    2015. Isak Karlsson, Panagiotis Papapetrou, Henrik Boström. Statistical Learning and Data Sciences, 126-136

    Conference

    Shapelets have recently been proposed for data series classification, due to their ability to capture phase-independent and local information. Decision trees based on shapelets have been shown to provide not only interpretable models, but also, in many cases, state-of-the-art predictive performance. Shapelet discovery is however computationally costly, and although several techniques for speeding it up have been proposed, the computational cost is still in many cases prohibitive. In this work, an ensemble-based method, referred to as the Random Shapelet Forest (RSF), is proposed, which builds on the success of the random forest algorithm, and which is shown to have a lower computational complexity than the original shapelet tree learning algorithm. An extensive empirical investigation shows that the algorithm provides competitive predictive performance and that a proposed way of calculating importance scores can be used to successfully identify influential regions.

    Read more about Forests of Randomized Shapelet Trees
  • Multi-channel ECG classification using forests of randomized shapelet trees

    2015. Isak Karlsson, Panagiotis Papapetrou, Lars Asker. Proceedings of the 8th ACM International Conference on PErvasive Technologies Related to Assistive Environments

    Conference

    Data series of multiple channels occur at high rates and in massive quantities in several application domains, such as healthcare. In this paper, we study the problem of multi-channel ECG classification. We map this problem to multivariate data series classification and propose five methods for solving it, using a split-and-combine approach. The proposed framework is evaluated using three base-classifiers on real-world data for detecting Myocardial Infarction. Extensive experiments are performed on real ECG data extracted from the Physiobank data repository. Our findings emphasize the importance of selecting an appropriate base-classifier for multivariate data series classification, while demonstrating the superiority of the Random Shapelet Forest (0.825 accuracy) against competitor methods (0.664 accuracy for 1-NN under cDTW).
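
    A minimal sketch of one split-and-combine strategy: one base classifier per channel, with averaged probabilities as the combination rule. The data, base classifier, and combination rule are illustrative stand-ins, not the paper's exact setup:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n, channels, T = 200, 12, 150                  # e.g., windows from a 12-lead ECG
    X = rng.normal(size=(n, channels, T))
    y = (X[:, 0].mean(axis=1) > 0).astype(int)     # toy label carried by channel 0

    # Split: fit one base classifier per channel; combine: average their probabilities.
    models = [LogisticRegression(max_iter=1000).fit(X[:, c], y) for c in range(channels)]
    proba = np.mean([m.predict_proba(X[:, c])[:, 1] for c, m in enumerate(models)], axis=0)
    print("train accuracy:", ((proba > 0.5) == y).mean())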

    Read more about Multi-channel ECG classification using forests of randomized shapelet trees
  • Dimensionality Reduction with Random Indexing: An Application on Adverse Drug Event Detection using Electronic Health Records

    2014. Isak Karlsson, Jing Zhao. IEEE 27th International Symposium on Computer-Based Medical Systems, 304-307

    Conference

    Although electronic health records (EHRs) have recently become an important data source for detecting drug safety signals, which are usually evaluated in clinical trials, the use of such data is often prohibited by high dimensionality and limited computing resources. Several methods for reducing dimensionality have been developed, used, and evaluated within the medical domain. While these methods perform well, their computational cost tends to increase with growing dimensionality. An alternative solution is random indexing, a technique commonly employed in text classification to reduce the dimensionality of large and sparse documents. This study aims to explore how the predictive performance of random forests is affected by dimensionality reduction through random indexing for predicting adverse drug events (ADEs). Data are extracted from EHRs, and the task is to predict whether or not a patient should be assigned an ADE-related diagnosis code. Four different dimensionality settings are investigated and their sensitivity, specificity, and area under the ROC curve are reported for 14 datasets. The results show that for the investigated datasets, predictive performance is not negatively affected by dimensionality reduction, while the computational cost is significantly reduced. The study therefore concludes that applying random indexing to EHR data reduces the computational cost, while retaining the predictive performance.
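
    A minimal sketch of random indexing as a dimensionality reduction step: each original feature receives a sparse ternary index vector, and a record becomes the sum of the index vectors of its active features. Dimensions, sparsity, and names are illustrative:

    import numpy as np

    rng = np.random.default_rng(0)

    def random_index_vectors(n_features, dim=256, nnz=8):
        """One sparse ternary index vector per original feature (+1/-1 at nnz positions)."""
        R = np.zeros((n_features, dim))
        for row in R:
            idx = rng.choice(dim, size=nnz, replace=False)
            row[idx] = rng.choice([-1.0, 1.0], size=nnz)
        return R

    n_patients, n_codes = 1000, 5000               # sparse drug/diagnosis indicators
    X = (rng.random((n_patients, n_codes)) < 0.001).astype(float)

    R = random_index_vectors(n_codes)
    X_reduced = X @ R                              # sum of index vectors of active codes
    print(X.shape, "->", X_reduced.shape)          # (1000, 5000) -> (1000, 256)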

    Read more about Dimensionality Reduction with Random Indexing
  • Handling Sparsity with Random Forests when Predicting Adverse Drug Events from Electronic Health Records

    2014. Isak Karlsson, Henrik Boström. IEEE International Conference on Healthcare Informatics (ICHI), 17-22

    Conference

    When using electronic health record (EHR) data to build models for predicting adverse drug events (ADEs), one typically faces the problem of data sparsity, i.e., drugs and diagnosis codes that could be used for predicting a certain ADE are absent for most observations. For such tasks, the ability of the employed machine learning technique to effectively handle sparsity is crucial. The state-of-the-art random forest algorithm is frequently employed to handle this type of data. It has however recently been demonstrated that the algorithm is biased towards the majority class, which may result in low predictive performance on EHR data with large numbers of sparse features. In this study, approaches to handle this problem are empirically evaluated using 14 ADE datasets and three performance metrics: F1-score, AUC, and Brier score. Two resampling-based techniques are investigated and compared to two baseline approaches. The experimental results indicate that, for larger forests, the resampling methods outperform the baseline approaches when considering F1-score, which is consistent with the metric being affected by class bias. The approaches perform on a similar level with respect to AUC, which can be explained by the metric not being sensitive to class bias. Finally, when considering the squared error (Brier score) of individual predictions, one of the baseline approaches turns out to be ahead of the others. A bias-variance analysis shows that this is an effect of the individual trees being more correct on average for the baseline approach, and that this outweighs the expected loss from a lower variance. The main conclusion is that the suggested choice of approach to handle sparsity is highly dependent on the performance metric, or the task, of interest. If the task is to accurately assign an ADE to a patient record, a sampling-based approach is recommended. If the task is to rank patients according to risk of a certain ADE, the choice of approach is of minor importance. Finally, if the task is to accurately assign probabilities for a certain ADE, then one of the baseline approaches is recommended.

    Read more about Handling Sparsity with Random Forests when Predicting Adverse Drug Events from Electronic Health Records
  • Mining Candidates for Adverse Drug Interactions in Electronic Patient Records

    2014. Lars Asker (et al.). PETRA '14: Proceedings of the 7th International Conference on Pervasive Technologies Related to Assistive Environments

    Conference

    Electronic patient records provide a valuable source of information for detecting adverse drug events. In this paper, we explore two different but complementary approaches to extracting useful information from electronic patient records, with the goal of identifying candidate drugs, or combinations of drugs, to be further investigated for suspected adverse drug events. We propose a novel filter-and-refine approach that combines sequential pattern mining and disproportionality analysis. The proposed method is expected to identify groups of possibly interacting drugs suspected of causing certain adverse drug events. We perform an empirical investigation of the proposed method using a subset of the Stockholm electronic patient record corpus. The data used in this study consist of all diagnoses and medications for a group of patients diagnosed with at least one heart-related diagnosis during the period 2008–2010. The study shows that the method indeed is able to detect combinations of drugs that occur more frequently for patients with cardiovascular diseases than for patients in a control group, providing opportunities for finding candidate drugs that cause adverse drug effects through interaction.

    Read more about Mining Candidates for Adverse Drug Interactions in Electronic Patient Records
  • Predicting Adverse Drug Events by Analyzing Electronic Patient Records

    2013. Isak Karlsson (et al.). Artificial Intelligence in Medicine, 125-129

    Conference

    Diagnosis codes for adverse drug events (ADEs) are sometimes missing from electronic patient records (EPRs). In the worst case, this may not only affect patient safety but also the number of reported ADEs, resulting in incorrect risk estimates of prescribed drugs. Large databases of EPRs are potentially valuable sources of information to support the identification of ADEs. This study investigates the use of machine learning for predicting one specific ADE based on information extracted from EPRs, including age, gender, diagnoses, and drugs. Several predictive models are developed and evaluated using different learning algorithms and feature sets. The highest observed AUC is 0.87, obtained by the random forest algorithm. The resulting model can be used for screening EPRs that are not, but possibly should be, assigned a diagnosis code for the ADE under consideration. Preliminary results from using the model are presented.

    Read more about Predicting Adverse Drug Events by Analyzing Electronic Patient Records

View all publications by Isak Samsten at Stockholm University
