Isak Samsten, Senior Lecturer
About me
I am a senior lecturer in data science and machine learning, holding a bachelor's degree (2010), a master's degree (2012), and a PhD (2017) in computer science. My main research area is explainable and interpretable machine learning (explainable AI), with a particular focus on temporal data such as time series. Within this area, I investigate methods and systems that can answer questions such as "Why was this prediction made?" (explanations) and "What changes are required to obtain a different outcome?" (counterfactual explanations).
Teaching
Research projects
Publications
A selection from Stockholm University's publication database
-
COMET: Constrained Counterfactual Explanations for Patient Glucose Multivariate Forecasting
2024. Zhendong Wang (et al.). Annual IEEE Symposium on Computer-Based Medical Systems, 502-507
Conference. Applying deep learning models for healthcare-related forecasting applications has been widely adopted, such as leveraging glucose monitoring data of diabetes patients to predict hyperglycaemic or hypoglycaemic events. However, most deep learning models are considered black-boxes; hence, the model predictions are not interpretable and may not offer actionable insights into medical practitioners’ decisions. Previous work has shown that counterfactual explanations can be applied in forecasting tasks by suggesting counterfactual changes in time series inputs to achieve the desired forecasting outcome. This study proposes a generalized multivariate forecasting setup of counterfactual generation by introducing a novel approach, COMET, which imposes three domain-specific constraint mechanisms to provide counterfactual explanations for glucose forecasting. Moreover, we conduct the experimental evaluation using two diabetes patient datasets to demonstrate the effectiveness of our proposed approach in generating realistic counterfactual changes in comparison with a baseline approach. Our qualitative analysis evaluates examples to validate that the counterfactual samples are clinically relevant and can effectively lead the patients to achieve a normal range of predicted glucose levels by suggesting changes to the treatment variables.
-
Counterfactual Explanations for Time Series Forecasting
2024. Zhendong Wang (et al.). 2023 IEEE International Conference on Data Mining (ICDM), 1391-1396
Conference. Among recent developments in time series forecasting methods, deep forecasting models have gained popularity as they can utilize hidden feature patterns in time series to improve forecasting performance. Nevertheless, the majority of current deep forecasting models are opaque, hence making it challenging to interpret the results. While counterfactual explanations have been extensively employed as a post-hoc approach for explaining classification models, their application to forecasting models still remains underexplored. In this paper, we formulate the novel problem of counterfactual generation for time series forecasting, and propose an algorithm, called ForecastCF, that solves the problem by applying gradient-based perturbations to the original time series. The perturbations are further guided by imposing constraints on the forecasted values. We experimentally evaluate ForecastCF using four state-of-the-art deep model architectures and compare it to two baselines. ForecastCF outperforms the baselines in terms of counterfactual validity and data manifold closeness, while generating meaningful and relevant counterfactuals for various forecasting tasks.
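To illustrate the kind of gradient-guided counterfactual search described in this abstract, the following sketch perturbs an input series until a differentiable forecaster's output enters a desired band. It is a minimal, hypothetical illustration, not the paper's implementation: the model, the bound construction, and all parameter values are placeholders, assuming a trained PyTorch forecasting model.

```python
import torch

def perturb_towards_band(model, x, lower, upper, steps=200, lr=0.01, reg=0.1):
    """Gradient-based perturbation of a time series so the forecast enters a band.

    model: a differentiable forecaster mapping an input series to a forecast tensor.
    x: the original input series (torch.Tensor).
    lower, upper: tensors with the desired bounds for each forecasted value.
    """
    x_cf = x.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([x_cf], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        forecast = model(x_cf)
        # Hinge penalties are zero once the forecast lies inside [lower, upper];
        # the L2 term keeps the counterfactual close to the original series.
        violation = torch.relu(lower - forecast) + torch.relu(forecast - upper)
        loss = violation.sum() + reg * torch.norm(x_cf - x)
        loss.backward()
        optimizer.step()
    return x_cf.detach()
```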
-
Glacier: guided locally constrained counterfactual explanations for time series classification
2024. Zhendong Wang (et al.). Machine Learning 113, 4639-4669
Article. In machine learning applications, there is a need to obtain predictive models of high performance and, most importantly, to allow end-users and practitioners to understand and act on their predictions. One way to obtain such understanding is via counterfactuals, which provide sample-based explanations in the form of recommendations on which features need to be modified from a test example so that the classification outcome of a given classifier changes from an undesired outcome to a desired one. This paper focuses on the domain of time series classification, more specifically, on defining counterfactual explanations for univariate time series. We propose Glacier, a model-agnostic method for generating locally-constrained counterfactual explanations for time series classification using gradient search either on the original space or on a latent space that is learned through an auto-encoder. An additional flexibility of our method is the inclusion of constraints on the counterfactual generation process that favour applying changes to particular time series points or segments while discouraging changing others. The main purpose of these constraints is to ensure more reliable counterfactuals, while increasing the efficiency of the counterfactual generation process. Two particular types of constraints are considered, i.e., example-specific constraints and global constraints. We conduct extensive experiments on 40 datasets from the UCR archive, comparing different instantiations of Glacier against three competitors. Our findings suggest that Glacier outperforms the three competitors in terms of two common metrics for counterfactuals, i.e., proximity and compactness. Moreover, Glacier obtains counterfactual validity comparable to the best of the three competitors. Finally, when comparing the unconstrained variant of Glacier to the constraint-based variants, we conclude that the inclusion of example-specific and global constraints yields a good performance while demonstrating the trade-off between the different metrics.
-
Predictive Machine Learning in Assessing Materiality: The Global Reporting Initiative Standard and Beyond
2024. Jan Svanberg (et al.). Artificial Intelligence for Sustainability, 105-131
Chapter. Sustainability reporting standards state that material information should be disclosed, but materiality is neither easily nor consistently defined across companies and sectors. Research finds that materiality assessments by reporting companies and sustainability auditors are uncertain, discretionary, and subjective. This chapter investigates a machine learning approach to sustainability reporting materiality assessments that has predictive validity. The investigated assessment methodology provides materiality assessments of disclosed as well as non-disclosed sustainability items consistent with the impact materiality of the GRI (Global Reporting Initiative) reporting standard. Our machine learning model estimates the likelihood that a company fully complies with environmental responsibilities. We then explore how a state-of-the-art model interpretation method, SHAP (SHapley Additive exPlanations), developed by Lundberg and Lee (A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems, 2017-December, pp 4766–4775, 2017), can be used to estimate impact materiality.
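As a rough illustration of how SHAP attributions from a tree-based compliance model can be read as materiality signals, consider the sketch below. It uses the open-source shap package with a scikit-learn forest on entirely synthetic placeholder data; it is not the chapter's dataset or model, and the interpretation in the comments is only the general idea under those assumptions.

```python
import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in: rows are companies, columns are sustainability indicators,
# and the target approximates a compliance likelihood score (placeholder data only).
X, y = make_regression(n_samples=500, n_features=12, noise=0.1, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# TreeExplainer attributes each company's predicted score to the individual
# indicators; indicators with a large mean |SHAP| value are read here as the most
# material ones under this impact-materiality proxy.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)      # shape: (n_companies, n_indicators)
materiality = np.abs(shap_values).mean(axis=0)
print(np.argsort(materiality)[::-1][:5])    # five most material indicators
```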
-
Assessment of double materiality: The development of predictively valid materiality assessments with artificial intelligence
2023. Peter Öhman, Jan Svanberg, Isak Samsten. Auditing Transformation, 205-226
Chapter. Sustainability reporting standards, e.g. the Global Reporting Initiative, require a broader definition of materiality than is traditionally used in financial reporting. Double materiality expands the material information concept to include information about companies' environmental and social impact relevant to society at large. A problem for reporting companies as well as auditors (even though accounting firms invest resources in establishing themselves as reliable service providers) is that the assessment of double materiality is uncertain. The chapter utilises machine learning methods to suggest a method to determine double materiality in sustainability reporting by examining what type of information can predict environmental issues resulting from companies' operations. It represents a proposal to use a structured and quantitative approach for sustainability auditors to determine double materiality, thereby potentially facilitating sustainability reporting and assurance in accordance with future regulation.
-
Must social performance ratings be idiosyncratic? An exploration of social performance ratings with predictive validity
2023. Jan Svanberg (et al.). Sustainability Accounting, Management and Policy Journal 14 (7), 313-348
Article. The purpose of this study is to develop a method for assessing social performance. Traditionally, providers of environmental, social and governance (ESG) ratings use subjectively weighted arithmetic averages to combine a set of social performance (SP) indicators into a single rating. To overcome this problem, this study examines the preconditions for a new methodology for rating the SP component of ESG by applying machine learning (ML) and artificial intelligence (AI) anchored in social controversies.
The study proposes the use of a data-driven rating methodology that derives the relative importance of SP features from their contribution to the prediction of social controversies. The authors use the proposed method to resolve the weighting problem of overall ESG ratings and to further examine whether prediction is possible.
The authors find that ML models can predict controversies with high predictive performance and validity. The results suggest that the weighting problem of ESG ratings can be solved with a data-driven approach. The crucial precondition for the proposed rating methodology, however, is that social controversies are predicted by a broad set of SP indicators. The results also suggest that predictively valid ratings can be developed with this ML-based AI method.
Practical implications
This study offers practical solutions to ESG rating problems, with implications for investors, ESG raters and socially responsible investing.
The proposed ML-based AI method can contribute to better ESG ratings, which in turn will help improve SP, with consequences for organisations and societies through sustainable development.
To the best of the authors' knowledge, this research is one of the first studies to offer a unique method for addressing the ESG rating problem and improving sustainability by focusing on SP indicators.
-
Prediction of Controversies and Estimation of ESG Performance: An Experimental Investigation Using Machine Learning
2023. Jan Svanberg (et al.). Handbook of Big Data and Analytics in Accounting and Auditing, 65-87
Chapter. We develop a new methodology for computing environmental, social, and governance (ESG) ratings using a mode of artificial intelligence (AI) called machine learning (ML) to make ESG more transparent. The ML algorithms anchor our rating methodology in controversies related to non-compliance with corporate social responsibility (CSR). This methodology is consistent with the information needs of institutional investors and is the first ESG methodology with predictive validity. Our best model predicts which companies are likely to experience controversies. It has a precision of 70–84 per cent and high predictive performance on several measures. It also provides evidence of which indicators contribute the most to the predicted likelihood of experiencing an ESG controversy. Furthermore, while the common approach of rating companies is to aggregate indicators using the arithmetic average, which is a simple explanatory model designed to describe an average company, the proposed rating methodology uses state-of-the-art AI technology to aggregate ESG indicators into holistic ratings for the predictive modelling of individual company performance.
Predictive modelling using ML enables our models to aggregate the information contained in ESG indicators with far less information loss than with the predominant aggregation method.
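The general idea of anchoring a rating in predicted controversy likelihood can be sketched in a few lines. The code below is a hypothetical toy version on synthetic data, not the chapter's methodology or dataset: the features, labels, model choice, and the 0–100 rating transform are all placeholder assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in: ESG indicators as features, past controversy as the label.
X, y = make_classification(n_samples=2000, n_features=30, weights=[0.8], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

model = GradientBoostingClassifier(random_state=1).fit(X_tr, y_tr)
print("precision:", precision_score(y_te, model.predict(X_te)))

# Invert the predicted controversy probability into a 0-100 score, so companies
# unlikely to experience controversies receive high ratings.
rating = 100 * (1 - model.predict_proba(X_te)[:, 1])
```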
-
Style-transfer counterfactual explanations: An application to mortality prevention of ICU patients
2023. Zhendong Wang (et al.). Artificial Intelligence in Medicine 135
Article. In recent years, machine learning methods have been rapidly adopted in the medical domain. However, current state-of-the-art medical mining methods usually produce opaque, black-box models. To address the lack of model transparency, substantial attention has been given to developing interpretable machine learning models. In the medical domain, counterfactuals can provide example-based explanations for predictions, and show practitioners the modifications required to change a prediction from an undesired to a desired state. In this paper, we propose a counterfactual solution, MedSeqCF, for preventing the mortality of three cohorts of ICU patients, by representing their electronic health records as medical event sequences, and generating counterfactuals by adopting and employing a text style-transfer technique. We propose three model augmentations for MedSeqCF to integrate additional medical knowledge for generating more trustworthy counterfactuals. Experimental results on the MIMIC-III dataset strongly suggest that augmented style-transfer methods can be effectively adapted for the problem of counterfactual explanations in healthcare applications and can further improve the model performance in terms of validity, BLEU-4, local outlier factor, and edit distance. In addition, our qualitative analysis of the results by consultation with medical experts suggests that our style-transfer solutions can generate clinically relevant and actionable counterfactual explanations.
-
Surveillance of communicable diseases using social media: A systematic review
2023. Patrick Pilipiec, Isak Samsten, András Bota. PLOS ONE 18 (2)
Article
Background
Communicable diseases pose a severe threat to public health and economic growth. The traditional methods that are used for public health surveillance, however, involve many drawbacks, such as being labor intensive to operate and resulting in a lag between data collection and reporting. To effectively address the limitations of these traditional methods and to mitigate the adverse effects of these diseases, a proactive and real-time public health surveillance system is needed. Previous studies have indicated the usefulness of performing text mining on social media.
Objective
To conduct a systematic review of the literature that used textual content published to social media for the purpose of the surveillance and prediction of communicable diseases.
Methodology
Broad search queries were formulated and performed in four databases. Both journal articles and conference materials were included. The quality of the studies, operationalized as reliability and validity, was assessed. This qualitative systematic review was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Results
Twenty-three publications were included in this systematic review. All studies reported positive results for using textual social media content for the surveillance of communicable diseases. Most studies used Twitter as a source for these data. Influenza was studied most frequently, while other communicable diseases received far less attention. Journal articles had a higher quality (reliability and validity) than conference papers. However, studies often failed to provide important information about procedures and implementation.
Conclusion
Text mining of health-related content published on social media can serve as a novel and powerful tool for the automated, real-time, and remote monitoring of public health and for the surveillance and prediction of communicable diseases in particular. This tool can address limitations related to traditional surveillance methods, and it has the potential to supplement traditional methods for public health surveillance.
-
Corporate governance performance ratings with machine learning
2022. Jan Svanberg (et al.). International Journal of Intelligent Systems in Accounting, Finance & Management 29 (1), 50-68
Article. We use machine learning with a cross-sectional research design to predict governance controversies and to develop a measure of the governance component of the environmental, social, and governance (ESG) metrics. Based on comprehensive governance data from 2,517 companies over a period of 10 years and investigating nine machine-learning algorithms, we find that governance controversies can be predicted with high predictive performance. Our proposed governance rating methodology has two unique advantages compared with traditional ESG ratings: it rates companies' compliance with governance responsibilities and it has predictive validity. Our study demonstrates a solution to what is likely the greatest challenge for the finance industry today: how to assess a company's sustainability with validity and accuracy. Prior to this study, the ESG rating industry and the literature have not provided evidence that widely adopted governance ratings are valid. This study describes the only methodology for developing governance performance ratings based on companies' compliance with governance responsibilities and for which there is evidence of predictive validity.
-
Post Hoc Explainability for Time Series Classification. Toward a signal processing perspective
2022. Rami Mochaourab (et al.). IEEE signal processing magazine (Print) 39 (4), 119-129
Article. Time series data correspond to observations of phenomena that are recorded over time [1]. Such data are encountered regularly in a wide range of applications, such as speech and music recognition, monitoring health and medical diagnosis, financial analysis, motion tracking, and shape identification, to name a few. With such a diversity of applications and the large variations in their characteristics, time series classification is a complex and challenging task. One of the fundamental steps in the design of time series classifiers is that of defining or constructing the discriminant features that help differentiate between classes. This is typically achieved by designing novel representation techniques [2] that transform the raw time series data to a new data domain, where subsequently a classifier is trained on the transformed data, such as one-nearest neighbors [3] or random forests [4]. In recent time series classification approaches, deep neural network models have been employed that are able to jointly learn a representation of time series and perform classification [5]. In many of these sophisticated approaches, the discriminant features tend to be complicated to analyze and interpret, given the high degree of nonlinearity.
-
Prediction of environmental controversies and development of a corporate environmental performance rating methodology
2022. Jan Svanberg (et al.). Journal of Cleaner Production 344
Article. Institutional investors seek to make environmentally sustainable investments using environmental, social, and governance (ESG) ratings. Current ESG ratings have limited validity because they are based on idiosyncratic scores derived using subjective, discretionary methodologies. We discuss a new direction for developing corporate environmental performance (CEP) ratings and propose a solution to the limited validity problem by anchoring such ratings in environmental controversies. The study uses a novel machine learning approach to make the ratings more comprehensive and transparent, based on a set of algorithmic approaches that handle nonlinearity when aggregating ESG indicators. This approach minimizes the rater subjectivity and preferences inherent in traditional ESG indicators. The findings indicate that controversies as proxies for non-compliance with environmental responsibilities can be predicted well. We conclude that environmental performance ratings developed using our machine learning framework offer predictive validity consistent with institutional investors' demand for socially responsible investment screening.
-
Assessing the Clinical Validity of Attention-based and SHAP Temporal Explanations for Adverse Drug Event Predictions
2021. Jonathan Rebane (et al.). 2021 IEEE 34th International Symposium on Computer-Based Medical Systems, 235-240
Conference. Attention mechanisms form the basis of providing temporal explanations for a variety of state-of-the-art recurrent neural network (RNN) based architectures. However, evidence is lacking that attention mechanisms are capable of providing sufficiently valid medical explanations. In this study we focus on the quality of temporal explanations for the medical problem of adverse drug event (ADE) prediction by comparing explanations globally and locally provided by an attention-based RNN architecture against those provided by a more basic RNN using the post-hoc SHAP framework, a popular alternative option which adheres to several desirable explainability properties. The validity of this comparison is supported by medical expert knowledge gathered for the purpose of this study. This investigation has uncovered that these explanation methods both possess appropriateness for ADE explanations and may be used complementarily, due to SHAP providing more clinically appropriate global explanations and attention mechanisms capturing more clinically appropriate local explanations. Additional feedback from medical experts reveals that SHAP may be more applicable to real-time clinical encounters, in which efficiency must be prioritised, over attention explanations, which possess properties more appropriate for offline analyses.
-
Counterfactual Explanations for Survival Prediction of Cardiovascular ICU Patients
2021. Zhendong Wang, Isak Samsten, Panagiotis Papapetrou. Artificial Intelligence in Medicine, 338-348
Conference. In recent years, machine learning methods have been rapidly implemented in the medical domain. However, current state-of-the-art methods usually produce opaque, black-box models. To address the lack of model transparency, substantial attention has been given to developing interpretable machine learning methods. In the medical domain, counterfactuals can provide example-based explanations for predictions, and show practitioners the modifications required to change a prediction from an undesired to a desired state. In this paper, we propose a counterfactual explanation solution for predicting the survival of cardiovascular ICU patients, by representing their electronic health record as a sequence of medical events, and generating counterfactuals by adopting and employing a text style-transfer technique. Experimental results on the MIMIC-III dataset strongly suggest that text style-transfer methods can be effectively adapted for the problem of counterfactual explanations in healthcare applications and can achieve competitive performance in terms of counterfactual validity, BLEU-4 and local outlier metrics.
-
Learning Time Series Counterfactuals via Latent Space Representations
2021. Zhendong Wang (et al.). Discovery Science, 369-384
Conference. Counterfactual explanations provide sample-based explanations of the features that need to be modified in the original sample to change the classification result from an undesired state to a desired one; hence, they provide interpretability of the model. Previous work, LatentCF, presents an algorithm for image data that employs auto-encoder models to directly transform original samples into counterfactuals in a latent space representation. In our paper, we adapt the approach to time series classification and propose an improved algorithm named LatentCF++ which introduces additional constraints in the counterfactual generation process. We conduct an extensive experiment on a total of 40 datasets from the UCR archive, comparing to current state-of-the-art methods. Based on our evaluation metrics, we show that the LatentCF++ framework can with high probability generate valid counterfactuals and achieve comparable explanations to current state-of-the-art. Our proposed approach can also generate counterfactuals that are considerably closer to the decision boundary in terms of margin difference.
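The latent-space search idea behind this line of work can be sketched compactly: encode the sample, take gradient steps in the latent space until the decoded series crosses the classifier's decision boundary, then decode. The code below is only a hypothetical sketch under the assumption of pre-trained PyTorch encoder, decoder, and classifier modules; names, the validity threshold, and the stopping rule are placeholders, not the paper's constraints.

```python
import torch

def latent_counterfactual(encoder, decoder, classifier, x,
                          steps=200, lr=0.01, threshold=0.5):
    """Search in an auto-encoder's latent space for a counterfactual of x.

    encoder/decoder: trained auto-encoder halves; classifier: differentiable model
    returning the probability of the desired class for a decoded series.
    """
    z = encoder(x).detach().requires_grad_(True)
    optimizer = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        x_cf = decoder(z)
        prob = classifier(x_cf)
        # Push the decoded series across the decision boundary for the desired class.
        loss = torch.relu(threshold - prob).sum()
        if loss.item() == 0:          # validity reached; stop early
            break
        loss.backward()
        optimizer.step()
    return decoder(z).detach()
```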
-
SMILE: A feature-based temporal abstraction framework for event-interval sequence classification
2021. Jonathan Rebane (et al.). Data mining and knowledge discovery 35 (1), 372-399
Article. In this paper, we study the problem of classification of sequences of temporal intervals. Our main contribution is a novel framework, which we call SMILE, for extracting relevant features from interval sequences to construct classifiers. SMILE introduces the notion of utilizing random temporal abstraction features, which we define as e-lets, as a means to capture information pertaining to class-discriminatory events which occur across the span of complete interval sequences. Our empirical evaluation is applied to a wide array of benchmark datasets and fourteen novel datasets for adverse drug event detection. We demonstrate how the introduction of simple sequential features, followed by progressively more complex features, each improve classification performance. Importantly, this investigation demonstrates that SMILE significantly improves AUC performance over the current state-of-the-art. The investigation also reveals that the selection of the underlying classification algorithm is important to achieve superior predictive performance, and how the number of features influences the performance of our framework.
-
Exploiting complex medical data with interpretable deep learning for adverse drug event prediction
2020. Jonathan Rebane, Isak Samsten, Panagiotis Papapetrou. Artificial Intelligence in Medicine 109
Article. A variety of deep learning architectures have been developed for the goal of predictive modelling and knowledge extraction from medical records. Several models have placed strong emphasis on temporal attention mechanisms and decay factors as a means to include highly temporally relevant information regarding the recency of medical event occurrence while facilitating medical code-level interpretability. In this study we utilise such models with a large Electronic Patient Record (EPR) data set consisting of diagnoses, medication, and clinical text data for the purpose of adverse drug event (ADE) prediction. The first contribution of this work is an empirical evaluation of two state-of-the-art medical-code based models in terms of objective performance metrics for ADE prediction on diagnosis and medication data. Secondly, as an extension of previous work, we augment an interpretable deep learning architecture to permit numerical risk and clinical text features and demonstrate how this approach yields improved predictive performance compared to the other baselines. Finally, we assess the importance of attention mechanisms with regard to their usefulness for medical code-level and text-level interpretability, which may facilitate novel insights pertaining to the nature of ADE occurrence within the health care domain.
-
Locally and globally explainable time series tweaking
2020. Isak Karlsson (et al.). Knowledge and Information Systems 62 (5), 1671-1700
Article. Time series classification has received great attention over the past decade with a wide range of methods focusing on predictive performance by exploiting various types of temporal features. Nonetheless, little emphasis has been placed on interpretability and explainability. In this paper, we formulate the novel problem of explainable time series tweaking, where, given a time series and an opaque classifier that provides a particular classification decision for the time series, we want to find the changes to be performed to the given time series so that the classifier changes its decision to another class. We show that the problem is NP-hard, and focus on three instantiations of the problem using global and local transformations. In the former case, we investigate the k-nearest neighbor classifier and provide an algorithmic solution to the global time series tweaking problem. In the latter case, we investigate the random shapelet forest classifier and focus on two instantiations of the local time series tweaking problem, which we refer to as reversible and irreversible time series tweaking, and propose two algorithmic solutions for the two problems along with simple optimizations. An extensive experimental evaluation on a variety of real datasets demonstrates the usefulness and effectiveness of our problem formulation and solutions.
-
A classification framework for exploiting sparse multi-variate temporal features with application to adverse drug event detection in medical records
2019. Francesco Bagattini (et al.). BMC Medical Informatics and Decision Making 19
Article. Background: Adverse drug events (ADEs), as well as other preventable adverse events in the hospital setting, incur a yearly monetary cost of approximately $3.5 billion in the United States alone. Therefore, it is of paramount importance to reduce the impact and prevalence of ADEs within the healthcare sector, not only since it will result in reducing human suffering, but also as a means to substantially reduce economical strains on the healthcare system. One approach to mitigate this problem is to employ predictive models. While existing methods have been focusing on the exploitation of static features, limited attention has been given to temporal features.
Methods: In this paper, we present a novel classification framework for detecting ADEs in complex Electronic health records (EHRs) by exploiting the temporality and sparsity of the underlying features. The proposed framework consists of three phases for transforming sparse and multi-variate time series features into a single-valued feature representation, which can then be used by any classifier. Moreover, we propose and evaluate three different strategies for leveraging feature sparsity by incorporating it into the new representation.
Results: A large-scale evaluation on 15 ADE datasets extracted from a real-world EHR system shows that the proposed framework achieves significantly improved predictive performance compared to state-of-the-art. Moreover, our framework can reveal features that are clinically consistent with medical findings on ADE detection.
Conclusions: Our study and experimental findings demonstrate that temporal multi-variate features of variable length and with high sparsity can be effectively utilized to predict ADEs from EHRs. Two key advantages of our framework are that it is method agnostic, i.e., versatile, and of low computational cost, i.e., fast; hence providing an important building block for future exploitation within the domain of machine learning from EHRs.
-
Example-Based Feature Tweaking Using Random Forests
2019. Tony Lindgren (et al.). 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science, 53-60
Conference. In certain application areas when using predictive models, it is not enough to make an accurate prediction for an example; instead it might be more important to change a prediction from an undesired class into a desired class. In this paper we investigate methods for changing predictions of examples. To this end, we introduce a novel algorithm for changing predictions of examples and we compare this novel method to an existing method and a baseline method. In an empirical evaluation we compare the three methods on a total of 22 datasets. The results show that the novel method and the baseline method can change an example from an undesired class into a desired class in more cases than the competitor method (and in some cases this difference is statistically significant). We also show that the distance, as measured by the Euclidean norm, is higher for the novel and baseline methods (and in some cases this difference is statistically significant) than for the state-of-the-art method. The methods and their proposed changes are also evaluated subjectively in a medical domain with interesting results.
-
Explainable time series tweaking via irreversible and reversible temporal transformations
2018. Isak Karlsson (et al.). 2018 IEEE International Conference on Data Mining (ICDM), 207-216
Conference. Time series classification has received great attention over the past decade with a wide range of methods focusing on predictive performance by exploiting various types of temporal features. Nonetheless, little emphasis has been placed on interpretability and explainability. In this paper, we formulate the novel problem of explainable time series tweaking, where, given a time series and an opaque classifier that provides a particular classification decision for the time series, we want to find the minimum number of changes to be performed to the given time series so that the classifier changes its decision to another class. We show that the problem is NP-hard, and focus on two instantiations of the problem, which we refer to as reversible and irreversible time series tweaking. The classifier under investigation is the random shapelet forest classifier. Moreover, we propose two algorithmic solutions for the two problems along with simple optimizations, as well as a baseline solution using the nearest neighbor classifier. An extensive experimental evaluation on a variety of real datasets demonstrates the usefulness and effectiveness of our problem formulation and solutions.
-
Exploring epistaxis as an adverse effect of anti-thrombotic drugs and outdoor temperature
2018. Jaakko Hollmén (et al.). Proceedings of the 11th PErvasive Technologies Related to Assistive Environments Conference (PETRA), 1-4
Conference. Electronic health records contain a wealth of epidemiological information about diseases at the population level. Using a database of medical diagnoses and drug prescriptions in electronic health records, we investigate the correlation between outdoor temperature and the incidence of epistaxis over time for two groups of patients. One group consists of patients that had been diagnosed with epistaxis and also been prescribed at least one of the three anti-thrombotic agents: Warfarin, Apixaban, or Rivaroxaban. The other group consists of patients that had been diagnosed with epistaxis and not been prescribed any of the three anti-thrombotic drugs. We find a strong negative correlation between the incidence of epistaxis and outdoor temperature for the group that had not been prescribed any of the three anti-thrombotic drugs, while there is a weaker correlation between incidence of epistaxis and outdoor temperature for the other group. It is, however, clear that both groups are affected in a similar way, such that the incidence of epistaxis increases with colder temperatures.
-
Seq2Seq RNNs and ARIMA models for Cryptocurrency Prediction: A Comparative Study
2018. Jonathan Rebane (et al.). Proceedings of SIGKDD Workshop on Fintech (SIGKDD Fintech’18)
Conference. Cryptocurrency price prediction has recently become an alluring topic, attracting massive media and investor interest. Traditional models, such as Autoregressive Integrated Moving Average (ARIMA) models, and more recently popular models, such as Recurrent Neural Networks (RNNs), can be considered candidates for such financial prediction problems, with RNNs being capable of utilizing various endogenous and exogenous input sources. This study compares the model performance of ARIMA to that of a seq2seq recurrent deep multi-layer neural network (seq2seq) utilizing a varied selection of input types. The results demonstrate superior performance of seq2seq over ARIMA, for models generated throughout most of bitcoin price history, with additional data sources leading to better performance during less volatile price periods.
-
Conformal prediction using random survival forests
2017. Henrik Boström (et al.). 16th IEEE International Conference on Machine Learning and Applications, 812-817
Conference. Random survival forests constitute a robust approach to survival modeling, i.e., predicting the probability that an event will occur before or on a given point in time. Similar to most standard predictive models, no guarantee for the prediction error is provided for this model, which instead typically is empirically evaluated. Conformal prediction is a rather recent framework, which allows the error of a model to be determined by a user-specified confidence level, something which is achieved by considering set rather than point predictions. The framework, which has been applied to some of the most popular classification and regression techniques, is here for the first time applied to survival modeling, through random survival forests. An empirical investigation is presented where the technique is evaluated on datasets from two real-world applications: predicting component failure in trucks using operational data and predicting survival and treatment of heart failure patients from administrative healthcare data. The experimental results show that the error levels indeed are very close to the provided confidence levels, as guaranteed by the conformal prediction framework, and that the error for predicting each outcome, i.e., event or no-event, can be controlled separately. The latter may, however, lead to less informative predictions, i.e., larger prediction sets, in case the class distribution is heavily imbalanced.
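The conformal mechanism referred to in this abstract can be illustrated with a generic split-conformal classification sketch: calibrate nonconformity scores on held-out data, then include in the prediction set every class whose score falls below the calibration quantile. This toy example uses scikit-learn on synthetic data and omits all survival-specific details of the paper; it only shows the set-prediction and coverage idea.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Split-conformal classification: nonconformity = 1 - predicted probability of the
# true class; a class enters the prediction set if its nonconformity is at most the
# calibration quantile for the chosen confidence level.
X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_te, y_cal, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
alpha = 0.1                                              # target 90% coverage
cal_scores = 1 - model.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]
q = np.quantile(cal_scores, np.ceil((len(y_cal) + 1) * (1 - alpha)) / len(y_cal))

test_scores = 1 - model.predict_proba(X_te)              # per-class nonconformity
prediction_sets = test_scores <= q                       # boolean (n_samples, n_classes)
coverage = prediction_sets[np.arange(len(y_te)), y_te].mean()
print("empirical coverage:", coverage)
```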
-
KAPMiner: Mining Ordered Association Rules with Constraints
2017. Isak Karlsson, Panagiotis Papapetrou, Lars Asker. Advances in Intelligent Data Analysis XVI, 149-161
Conference. We study the problem of mining ordered association rules from event sequences. Ordered association rules differ from regular association rules in that the events occurring in the antecedent (left-hand side) of the rule are temporally constrained to occur strictly before the events in the consequent (right-hand side). We argue that such constraints can provide more meaningful rules in particular application domains, such as health care. The importance and interestingness of the extracted rules are quantified by adapting existing rule mining metrics. Our experimental evaluation on real data sets demonstrates the descriptive power of ordered association rules against ordinary association rules.
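The temporal constraint that distinguishes ordered rules can be made concrete with a small check over a timestamped sequence. The function below encodes one possible reading of the constraint, not necessarily KAPMiner's exact definition or support counting: the antecedent must be complete before any counted occurrence of the consequent.

```python
def satisfies_ordered_rule(sequence, antecedent, consequent):
    """One reading of an ordered rule X -> Y on a timestamped event sequence:
    every antecedent event has occurred by some time t, and every consequent
    event occurs strictly after t.

    sequence: iterable of (timestamp, event) pairs.
    antecedent, consequent: sets of event labels forming the rule X -> Y.
    """
    first = {}
    occurrences = {}
    for t, e in sequence:
        first.setdefault(e, t)
        occurrences.setdefault(e, []).append(t)
    if not antecedent <= first.keys() or not consequent <= first.keys():
        return False                                   # an event of the rule never occurs
    t_antecedent = max(first[e] for e in antecedent)   # antecedent completed here
    return all(any(t > t_antecedent for t in occurrences[e]) for e in consequent)

# Example: diagnosis A and drug B strictly before outcome C (labels are placeholders).
events = [(1, "A"), (2, "B"), (5, "C")]
print(satisfies_ordered_rule(events, {"A", "B"}, {"C"}))   # True
```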
-
Learning from Administrative Health Registries
2017. Jonathan Rebane (et al.). SoGood 2017: Data Science for Social Good
Conference. Over the last decades, the healthcare domain has seen a tremendous increase and interest in methods for making inference about patient care using large quantities of medical data. Such data is often stored in electronic health records and administrative health registries. As these data sources have grown increasingly complex, with millions of patients represented by thousands of attributes, static or time-evolving, finding relevant and accurate patterns that can be used for predictive or descriptive modelling is impractical for human experts. In this paper, we concentrate our review on Swedish Administrative Health Registries (AHRs) and Electronic Health Records (EHRs) and provide an overview of recent and ongoing work in the area with focus on adverse drug events (ADEs) and heart failure.
-
Mining disproportional itemsets for characterizing groups of heart failure patients from administrative health records
2017. Isak Karlsson (et al.). Proceedings of the 10th International Conference on PErvasive Technologies Related to Assistive Environments, 394-398
Conference. Heart failure is a serious medical condition involving decreased quality of life and an increased risk of premature death. A recent evaluation by the Swedish National Board of Health and Welfare shows that Swedish heart failure patients are often undertreated and do not receive basic medication as recommended by the national guidelines for treatment of heart failure. The objective of this paper is to use registry data to characterize groups of heart failure patients, with an emphasis on basic treatment. Towards this end, we explore the applicability of frequent itemset mining and disproportionality analysis for finding interesting and distinctive characterizations of a target group of patients, e.g., those who have received basic treatment, against a control group, e.g., those who have not received basic treatment. Our empirical evaluation is performed on data extracted from administrative health records from Stockholm County covering the years 2010–2016. Our findings suggest that frequency is not always the most appropriate measure of importance for frequent itemsets, while itemset disproportionality against a control group provides alternative rankings of the extracted itemsets, leading to some medically intuitive characterizations of the target groups.
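The contrast between plain frequency and disproportionality can be illustrated with a minimal ratio of itemset supports in the target group versus a control group. This is a simplified sketch of the general idea, not the exact measure used in the paper, and the item codes below are invented placeholders.

```python
def disproportionality(itemset, target_records, control_records):
    """Relative support of an itemset in the target group versus the control group.

    itemset: a set of items (e.g. diagnosis or drug codes).
    *_records: lists of sets, one set of items per patient.
    A ratio well above 1 marks the itemset as over-represented in the target group.
    """
    def support(records):
        return sum(itemset <= r for r in records) / len(records)

    eps = 1e-9                       # avoid division by zero in small samples
    return (support(target_records) + eps) / (support(control_records) + eps)

# Toy example: a drug pair more common among patients who received basic treatment.
target = [{"ACEI", "BB"}, {"ACEI", "BB", "statin"}, {"ACEI"}]
control = [{"statin"}, {"ACEI", "BB"}, {"ACEI"}]
print(disproportionality({"ACEI", "BB"}, target, control))   # roughly 2.0
```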
-
Early Random Shapelet Forest
2016. Isak Karlsson, Panagiotis Papapetrou, Henrik Boström. Discovery Science, 261-276
Conference. Early classification of time series has emerged as an increasingly important and challenging problem within signal processing, especially in domains where timely decisions are critical, such as medical diagnosis in health care. Shapelets, i.e., discriminative sub-sequences, have been proposed for time series classification as a means to capture local and phase-independent information. Recently, forests of randomized shapelet trees have been shown to produce state-of-the-art predictive performance at a low computational cost. In this work, they are extended to allow for early classification of time series. An extensive empirical investigation is presented, showing that the proposed algorithm is superior to alternative state-of-the-art approaches, in case predictive performance is considered to be more important than earliness. The algorithm allows for tuning the trade-off between accuracy and earliness, thereby supporting the generation of early classifiers that can be dynamically adapted to specific needs at low computational cost.
-
Generalized random shapelet forests
2016. Isak Karlsson, Panagiotis Papapetrou, Henrik Boström. Data mining and knowledge discovery 30 (5), 1053-1085
Article. Shapelets are discriminative subsequences of time series, usually embedded in shapelet-based decision trees. The enumeration of time series shapelets is, however, computationally costly, which, in addition to the inherent difficulty of the decision tree learning algorithm to effectively handle high-dimensional data, severely limits the applicability of shapelet-based decision tree learning from large (multivariate) time series databases. This paper introduces a novel tree-based ensemble method for univariate and multivariate time series classification using shapelets, called the generalized random shapelet forest algorithm. The algorithm generates a set of shapelet-based decision trees, where both the choice of instances used for building a tree and the choice of shapelets are randomized. For univariate time series, it is demonstrated through an extensive empirical investigation that the proposed algorithm yields predictive performance comparable to the current state-of-the-art and significantly outperforms several alternative algorithms, while being at least an order of magnitude faster. Similarly, for multivariate time series, it is shown that the algorithm is significantly less computationally costly and more accurate than the current state-of-the-art.
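Two building blocks recur throughout this shapelet line of work: the minimum distance between a candidate shapelet and a series (the feature used for splitting), and random sampling of shapelet candidates instead of exhaustive enumeration. The toy sketch below illustrates both on synthetic data; it is not the authors' implementation, omits normalization and tree construction, and all lengths and data are placeholders.

```python
import numpy as np

def shapelet_distance(shapelet, series):
    """Minimum Euclidean distance between a shapelet and any window of the series
    (no normalization here, for brevity); the core shapelet feature."""
    m = len(shapelet)
    return min(np.linalg.norm(series[i:i + m] - shapelet)
               for i in range(len(series) - m + 1))

def sample_random_shapelet(series_db, rng, min_len=5, max_len=20):
    """Draw a random subsequence from a random series, as done (conceptually)
    at every node of a randomized shapelet tree."""
    series = series_db[rng.integers(len(series_db))]
    length = rng.integers(min_len, max_len + 1)
    start = rng.integers(len(series) - length + 1)
    return series[start:start + length]

rng = np.random.default_rng(0)
db = [np.sin(np.linspace(0, 6, 100)) + 0.1 * rng.standard_normal(100) for _ in range(10)]
s = sample_random_shapelet(db, rng)
print(shapelet_distance(s, db[0]))
```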
-
Predicting Adverse Drug Events using Heterogeneous Event Sequences
2016. Isak Karlsson, Henrik Boström. 2016 IEEE International Conference on Healthcare Informatics (ICHI), 356-362
Conference. Adverse drug events (ADEs) are known to be severely under-reported in electronic health record (EHR) systems. One approach to mitigate this problem is to employ machine learning methods to detect and signal for potentially missing ADEs, with the aim of increasing reporting rates. There are, however, many challenges involved in constructing prediction models for this task, since data present in health care records is heterogeneous, high dimensional, sparse and temporal. Previous approaches typically employ bag-of-items representations of clinical events that are present in a record, ignoring the temporal aspects. In this paper, we study the problem of classifying heterogeneous and multivariate event sequences using a novel algorithm building on the well-known concept of ensemble learning. The proposed approach is empirically evaluated using 27 datasets extracted from a real EHR database with different ADEs present. The results indicate that the proposed approach, which explicitly models the temporal nature of clinical data, can be expected to outperform, in terms of the trade-off between precision and specificity, models that do not consider the temporal aspects.
-
Semigeometric Tiling of Event Sequences
2016. Andreas Henelius (et al.). Machine Learning and Knowledge Discovery in Databases, 329-344
Conference. Event sequences are ubiquitous, e.g., in finance, medicine, and social media. Often the same underlying phenomenon, such as television advertisements during the Super Bowl, is reflected in independent event sequences, like different Twitter users. It is hence of interest to find combinations of temporal segments and subsets of sequences where an event of interest, like a particular hashtag, has an increased occurrence probability. Such patterns allow exploration of the event sequences in terms of their evolving temporal dynamics, and provide more fine-grained insights into the data than what, for example, straightforward clustering can reveal. We formulate the task of finding such patterns as a novel matrix tiling problem, and propose two algorithms for solving it. Our first algorithm is a greedy set-cover heuristic, while in the second approach we view the problem as time-series segmentation. We apply the algorithms on real and artificial datasets and obtain promising results. The software related to this paper is available at https://github.com/bwrc/semigeom-r.
-
Embedding-based subsequence matching with gaps-range-tolerances: a Query-By-Humming application
2015. Alexios Kotsifakos (et al.). The VLDB journal 24 (4), 519-536
Article. We present a subsequence matching framework that allows for gaps in both query and target sequences, employs variable matching tolerance efficiently tuned for each query and target sequence, and constrains the maximum matching range. Using this framework, a dynamic programming method is proposed, called SMBGT, that, given a short query sequence Q and a large database, identifies in quadratic time the subsequence of the database that best matches Q. SMBGT is highly applicable to music retrieval. However, in Query-By-Humming applications, runtime is critical. Hence, we propose a novel embedding-based approach, called ISMBGT, for speeding up search under SMBGT. Using a set of reference sequences, ISMBGT maps both Q and each position of each database sequence into vectors. The database vectors closest to the query vector are identified, and SMBGT is then applied between Q and the subsequences that correspond to those database vectors. The key novelties of ISMBGT are that it does not require training, it is query-sensitive, and it exploits the flexibility of SMBGT. We present an extensive experimental evaluation using synthetic and hummed queries on a large music database. Our findings show that ISMBGT can achieve speedups of up to an order of magnitude against brute-force search and over an order of magnitude against cDTW, while maintaining a retrieval accuracy very close to that of brute-force search.
-
Forests of Randomized Shapelet Trees
2015. Isak Karlsson, Panagiotis Papapetrou, Henrik Boström. Statistical Learning and Data Sciences, 126-136
Conference. Shapelets have recently been proposed for data series classification, due to their ability to capture phase-independent and local information. Decision trees based on shapelets have been shown to provide not only interpretable models, but also, in many cases, state-of-the-art predictive performance. Shapelet discovery is however computationally costly, and although several techniques for speeding up the technique have been proposed, the computational cost is still in many cases prohibitive. In this work, an ensemble-based method, referred to as the Random Shapelet Forest (RSF), is proposed, which builds on the success of the random forest algorithm, and which is shown to have a lower computational complexity than the original shapelet tree learning algorithm. An extensive empirical investigation shows that the algorithm provides competitive predictive performance and that a proposed way of calculating importance scores can be used to successfully identify influential regions.
-
Multi-channel ECG classification using forests of randomized shapelet trees
2015. Isak Karlsson, Panagiotis Papapetrou, Lars Asker. Proceedings of the 8th ACM International Conference on PErvasive Technologies Related to Assistive Environments
Conference. Data series of multiple channels occur at high rates and in massive quantities in several application domains, such as healthcare. In this paper, we study the problem of multi-channel ECG classification. We map this problem to multivariate data series classification and propose five methods for solving it, using a split-and-combine approach. The proposed framework is evaluated using three base-classifiers on real-world data for detecting Myocardial Infarction. Extensive experiments are performed on real ECG data extracted from the Physiobank data repository. Our findings emphasize the importance of selecting an appropriate base-classifier for multivariate data series classification, while demonstrating the superiority of the Random Shapelet Forest (0.825 accuracy) against competitor methods (0.664 accuracy for 1-NN under cDTW).
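One simple instance of a split-and-combine strategy is to train a separate base-classifier per channel and combine the per-channel predictions by majority vote. The sketch below shows only that generic idea on random placeholder data with a plain decision tree; the paper evaluates five strategies and different base-classifiers, so this should not be read as its specific method.

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def split_and_combine_predict(channel_classifiers, multichannel_series):
    """Predict a label for one multivariate series: one classifier per channel,
    combined by majority vote over the per-channel predictions.

    channel_classifiers: list of fitted classifiers, one per channel.
    multichannel_series: array of shape (n_channels, series_length).
    """
    votes = [clf.predict(channel.reshape(1, -1))[0]
             for clf, channel in zip(channel_classifiers, multichannel_series)]
    return Counter(votes).most_common(1)[0][0]

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 3, 50))           # 40 examples, 3 channels, length 50
y = rng.integers(0, 2, 40)
clfs = [DecisionTreeClassifier(random_state=0).fit(X[:, c, :], y) for c in range(3)]
print(split_and_combine_predict(clfs, X[0]))
```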
-
Dimensionality Reduction with Random Indexing: An Application on Adverse Drug Event Detection using Electronic Health Records
2014. Isak Karlsson, Jing Zhao. IEEE 27th International Symposium on Computer-Based Medical Systems, 304-307
Conference. Although electronic health records (EHRs) have recently become an important data source for drug safety signal detection, which is usually evaluated in clinical trials, the use of such data is often prohibited by dimensionality and available computer resources. Currently, several methods for reducing dimensionality are developed, used and evaluated within the medical domain. While these methods perform well, the computational cost tends to increase with growing dimensionality. An alternative solution is random indexing, a technique commonly employed in text classification to reduce the dimensionality of large and sparse documents. This study aims to explore how the predictive performance of random forest is affected by dimensionality reduction through random indexing to predict adverse drug events (ADEs). Data are extracted from EHRs and the task is to predict whether or not a patient should be assigned an ADE-related diagnosis code. Four different dimensionality settings are investigated and their sensitivity, specificity and area under the ROC curve are reported for 14 data sets. The results show that for the investigated data sets, the predictive performance is not negatively affected by dimensionality reduction; however, the computational cost is significantly reduced. Therefore, this study concludes that applying random indexing on EHR data reduces the computational cost, while retaining the predictive performance.
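Random indexing itself is easy to sketch: each original feature is assigned a sparse ternary index vector, and a record's reduced representation is the sum of the index vectors of its present features. The code below is a toy illustration under that general description; the dimensions, sparsity, and synthetic EHR-like matrix are placeholders, not the paper's settings.

```python
import numpy as np

def random_index_vectors(n_features, dim, n_nonzero=10, seed=0):
    """Assign each original feature a sparse index vector with a few +1/-1 entries."""
    rng = np.random.default_rng(seed)
    vectors = np.zeros((n_features, dim))
    for f in range(n_features):
        positions = rng.choice(dim, size=n_nonzero, replace=False)
        vectors[f, positions] = rng.choice([-1.0, 1.0], size=n_nonzero)
    return vectors

def reduce_records(records, index_vectors):
    """Project sparse records (n_samples, n_features) down to the index dimension
    by summing the index vectors of the features present in each record."""
    return records @ index_vectors

rng = np.random.default_rng(1)
records = (rng.random((100, 5000)) < 0.01).astype(float)   # sparse binary EHR-like matrix
reduced = reduce_records(records, random_index_vectors(5000, 256))
print(reduced.shape)                                        # (100, 256)
```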
-
Handling Sparsity with Random Forests when Predicting Adverse Drug Events from Electronic Health Records
2014. Isak Karlsson, Henrik Boström. IEEE International Conference on Healthcare Informatics (ICHI), 17-22
Conference. When using electronic health record (EHR) data to build models for predicting adverse drug effects (ADEs), one is typically facing the problem of data sparsity, i.e., drugs and diagnosis codes that could be used for predicting a certain ADE are absent for most observations. For such tasks, the ability of the employed machine learning technique to effectively handle sparsity is crucial. The state-of-the-art random forest algorithm is frequently employed to handle this type of data. It has however recently been demonstrated that the algorithm is biased towards the majority class, which may result in a low predictive performance on EHR data with large numbers of sparse features. In this study, approaches to handle this problem are empirically evaluated using 14 ADE datasets and three performance metrics: F1-score, AUC and Brier score. Two resampling-based techniques are investigated and compared to two baseline approaches. The experimental results indicate that, for larger forests, the resampling methods outperform the baseline approaches when considering F1-score, which is consistent with the metric being affected by class bias. The approaches perform on a similar level with respect to AUC, which can be explained by the metric not being sensitive to class bias. Finally, when considering the squared error (Brier score) of individual predictions, one of the baseline approaches turns out to be ahead of the others. A bias-variance analysis shows that this is an effect of the individual trees being more correct on average for the baseline approach and that this outweighs the expected loss from a lower variance. The main conclusion is that the suggested choice of approach to handle sparsity is highly dependent on the performance metric, or the task, of interest. If the task is to accurately assign an ADE to a patient record, a sampling-based approach is recommended. If the task is to rank patients according to risk of a certain ADE, the choice of approach is of minor importance. Finally, if the task is to accurately assign probabilities for a certain ADE, then one of the baseline approaches is recommended.
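A minimal way to see the effect of resampling on an imbalanced, sparse task is to downsample the majority class before fitting and compare F1 against a plain forest. The sketch below does exactly that on synthetic data; it is not one of the specific resampling variants evaluated in the paper, only an illustration of the general mechanism.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Imbalanced placeholder task: 5% positives, many features.
X, y = make_classification(n_samples=4000, n_features=200, weights=[0.95], random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)

# Downsample the majority class to the size of the minority class.
idx_min = np.where(y_tr == 1)[0]
idx_maj = np.random.default_rng(2).choice(np.where(y_tr == 0)[0],
                                          size=len(idx_min), replace=False)
idx = np.concatenate([idx_min, idx_maj])

plain = RandomForestClassifier(n_estimators=200, random_state=2).fit(X_tr, y_tr)
resampled = RandomForestClassifier(n_estimators=200, random_state=2).fit(X_tr[idx], y_tr[idx])

print("plain F1:    ", f1_score(y_te, plain.predict(X_te)))
print("resampled F1:", f1_score(y_te, resampled.predict(X_te)))
```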
-
Mining Candidates for Adverse Drug Interactions in Electronic Patient Records
2014. Lars Asker (et al.). PETRA '14 Proceedings of the 7th International Conference on Pervasive Technologies Related to Assistive Environments, PETRA’14
Conference. Electronic patient records provide a valuable source of information for detecting adverse drug events. In this paper, we explore two different but complementary approaches to extracting useful information from electronic patient records with the goal of identifying candidate drugs, or combinations of drugs, to be further investigated for suspected adverse drug events. We propose a novel filter-and-refine approach that combines sequential pattern mining and disproportionality analysis. The proposed method is expected to identify groups of possibly interacting drugs suspected of causing certain adverse drug events. We perform an empirical investigation of the proposed method using a subset of the Stockholm electronic patient record corpus. The data used in this study consist of all diagnoses and medications for a group of patients diagnosed with at least one heart-related diagnosis during the period 2008–2010. The study shows that the method indeed is able to detect combinations of drugs that occur more frequently for patients with cardiovascular diseases than for patients in a control group, providing opportunities for finding candidate drugs that cause adverse drug effects through interaction.
-
Predicting Adverse Drug Events by Analyzing Electronic Patient Records
2013. Isak Karlsson (et al.). Artificial Intelligence in Medicine, 125-129
Conference. Diagnosis codes for adverse drug events (ADEs) are sometimes missing from electronic patient records (EPRs). This may not only affect patient safety in the worst case, but also the number of reported ADEs, resulting in incorrect risk estimates of prescribed drugs. Large databases of EPRs are potentially valuable sources of information to support the identification of ADEs. This study investigates the use of machine learning for predicting one specific ADE based on information extracted from EPRs, including age, gender, diagnoses and drugs. Several predictive models are developed and evaluated using different learning algorithms and feature sets. The highest observed AUC is 0.87, obtained by the random forest algorithm. The resulting model can be used for screening EPRs that are not, but possibly should be, assigned a diagnosis code for the ADE under consideration. Preliminary results from using the model are presented.
Show all publications by Isak Samsten at Stockholm University