Tony Lindgren
Senior Lecturer, Docent, Head of Unit SAS
About me
Research interests:
- Machine learning
- Constraint programming
- Logic programming
Research
Research projects
Publications
A selection from Stockholm University's publication database
-
SCANIA Component X dataset: a real-world multivariate time series dataset for predictive maintenance
2025. Zahra Kharazian (et al.). Scientific Data 12
Article: Predicting failures and maintenance time in predictive maintenance is challenging due to the scarcity of comprehensive real-world datasets, and among those available, few are in time series format. This paper introduces a real-world, multivariate time series dataset collected exclusively from a single anonymized engine component (Component X) across a fleet of SCANIA trucks. The dataset includes operational data, repair records, and specifications related to Component X while maintaining confidentiality through anonymization. It is well-suited for a range of machine learning applications, including classification, regression, survival analysis, and anomaly detection, particularly in predictive maintenance scenarios. The dataset's large population size, diverse features (in the form of histograms and numerical counters), and temporal information make it a unique resource in the field. The objective of releasing this dataset is to give a broad range of researchers the possibility of working with real-world data from an internationally well-known company and to introduce a standard benchmark to the predictive maintenance field, fostering reproducible research.
-
Mind the gap: from plausible to valid self-explanations in large language models
2025. Korbinian Robert Randl (et al.). Machine Learning 114 (10)
Article: This paper investigates the reliability of explanations generated by large language models (LLMs) when prompted to explain their previous output. We evaluate two kinds of such self-explanations (SE), extractive and counterfactual, using state-of-the-art LLMs (1B to 70B parameters) on three different classification tasks (both objective and subjective). In line with Agarwal et al. (Faithfulness versus plausibility: On the (Un)reliability of explanations from large language models. 2024. https://doi.org/10.48550/arXiv.2402.04614), our findings indicate a gap between perceived and actual model reasoning: while SE largely correlate with human judgment (i.e. are plausible), they do not fully and accurately follow the model's decision process (i.e. are not faithful). Additionally, we show that counterfactual SE are not even necessarily valid in the sense of actually changing the LLM's prediction. Our results suggest that extractive SE provide the LLM's "guess" at an explanation based on training data. Conversely, counterfactual SE can help understand the LLM's reasoning: we show that the issue of validity can be resolved by sampling counterfactual candidates at high temperature, followed by a validity check, and by introducing a formula to estimate the number of tries needed to generate valid explanations. This simple method produces plausible and valid explanations that offer a 16 times faster alternative to SHAP on average in our experiments.
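The validity-check loop described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: classify and generate_counterfactual are hypothetical wrappers around an LLM, and the geometric 1/p estimate is one simple way to reason about the expected number of tries when each sampled candidate is valid with probability p.

import math

def sample_valid_counterfactual(text, classify, generate_counterfactual,
                                temperature=1.2, max_tries=20):
    """Repeatedly sample counterfactual candidates at high temperature and keep
    the first one that actually flips the classifier's prediction.
    `classify(text) -> label` and `generate_counterfactual(text, temperature) -> str`
    are hypothetical placeholders for LLM calls, not a real API."""
    original_label = classify(text)
    for _ in range(max_tries):
        candidate = generate_counterfactual(text, temperature=temperature)
        if classify(candidate) != original_label:   # validity check
            return candidate
    return None

def expected_tries(validity_rate):
    """If each candidate is valid independently with probability p, the number
    of tries until the first valid one is geometric with mean 1/p."""
    return 1.0 / validity_rate

# e.g. with a 25% per-sample validity rate we expect about 4 tries on average
print(expected_tries(0.25))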
-
Crowding distance and IGD-driven grey wolf reinforcement learning approach for multi-objective agile earth observation satellite scheduling
2025. He Wang (et al.). International Journal of Digital Earth 18 (1)
Article: With the rise of low-cost launches, miniaturized space technology, and commercialization, the cost of space missions has dropped, leading to a surge in flexible Earth observation satellites. This increased demand for complex and diverse imaging products requires addressing multi-objective optimization in practice. To this end, we propose a multi-objective agile Earth observation satellite scheduling problem (MOAEOSSP) model and introduce a reinforcement learning-based multi-objective grey wolf optimization (RLMOGWO) algorithm. It aims to maximize observation efficiency while minimizing energy consumption. During population initialization, the algorithm uses chaos mapping and opposition-based learning to enhance diversity and global search, reducing the risk of local optima. It integrates Q-learning into an improved multi-objective grey wolf optimization framework, designing state-action combinations that balance exploration and exploitation. Dynamic parameter adjustments guide position updates, boosting adaptability across different optimization stages. Moreover, the algorithm introduces a reward mechanism based on the crowding distance and inverted generational distance (IGD) to maintain Pareto front diversity and distribution, ensuring strong multi-objective optimization performance. The experimental results show that the algorithm excels at solving the MOAEOSSP, outperforming competing algorithms across several metrics and demonstrating its effectiveness for complex optimization problems.
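The two quantities driving the reward mechanism mentioned above are standard multi-objective measures. The sketch below shows a generic computation of crowding distance (as used in NSGA-II) and IGD; the function names and toy fronts are illustrative and not taken from the paper.

import numpy as np

def crowding_distance(front):
    """NSGA-II-style crowding distance for a set of objective vectors
    (rows = solutions, columns = objectives). Boundary solutions get inf."""
    front = np.asarray(front, dtype=float)
    n, m = front.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(front[:, j])
        dist[order[0]] = dist[order[-1]] = np.inf
        span = front[order[-1], j] - front[order[0], j]
        if span == 0:
            continue
        dist[order[1:-1]] += (front[order[2:], j] - front[order[:-2], j]) / span
    return dist

def igd(reference_front, obtained_front):
    """Inverted generational distance: average distance from each reference
    point to its nearest point in the obtained front (lower is better)."""
    ref = np.asarray(reference_front, dtype=float)
    obt = np.asarray(obtained_front, dtype=float)
    d = np.linalg.norm(ref[:, None, :] - obt[None, :, :], axis=2)
    return d.min(axis=1).mean()

front = [[1.0, 4.0], [2.0, 2.5], [3.0, 1.0]]
print(crowding_distance(front), igd(front, [[1.5, 3.0], [2.5, 1.5]]))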
-
A Strategy Fusion-Based Multiobjective Optimization Approach for Agile Earth Observation Satellite Scheduling Problem
2024. He Wang (et al.). IEEE Transactions on Geoscience and Remote Sensing 62, 1-14
Article: Agile satellite imaging scheduling plays a vital role in improving emergency response, urban planning, national defense, and resource management. With the rise in the number of in-orbit satellites and observation windows, the need for diverse agile Earth observation satellite (AEOS) scheduling has surged. However, current research seldom addresses multiple optimization objectives, which are crucial in many engineering practices. This article tackles a multiobjective AEOS scheduling problem (MOAEOSSP) that aims to optimize total observation task profit, satellite energy consumption, and load balancing. To address this intricate problem, we propose a strategy-fused multiobjective dung beetle optimization (SFMODBO) algorithm. This novel algorithm harnesses the position update characteristics of various dung beetle populations and integrates multiple high-adaptability strategies. Consequently, it strikes a better balance between global search capability and local exploitation accuracy, making it more effective at exploring the solution space and avoiding local optima. The SFMODBO algorithm enhances global search capabilities through diverse strategies, ensuring thorough coverage of the search space. Simultaneously, it significantly improves local optimization precision by fine-tuning solutions in promising regions. This dual approach enables more robust and efficient problem-solving. Simulation experiments confirm the effectiveness and efficiency of the SFMODBO algorithm. Results indicate that it significantly outperforms competitors across multiple metrics, achieving superior scheduling schemes. In addition to these enhanced metrics, the proposed algorithm also exhibits advantages in computation time and resource utilization. This not only demonstrates the algorithm's robustness but also underscores its efficiency and speed in solving the MOAEOSSP.
-
Automotive fault nowcasting with machine learning and natural language processing
2024. Ioannis Pavlopoulos (et al.). Machine Learning 113 (2), 843-861
Article: Automated fault diagnosis can facilitate diagnostics assistance, speedier troubleshooting, and better-organised logistics. Currently, most AI-based prognostics and health management in the automotive industry ignore textual descriptions of the experienced problems or symptoms. With this study, however, we propose an ML-assisted workflow for automotive fault nowcasting that improves on current industry standards. We show that a multilingual pre-trained Transformer model can effectively classify the textual symptom claims from a large company with vehicle fleets, despite the task's challenging nature due to the 38 languages and 1357 classes involved. Overall, we report an accuracy of more than 80% for high-frequency classes and above 60% for classes with reasonable minimum support, bringing novel evidence that automotive troubleshooting management can benefit from multilingual symptom text classification.
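A workflow of this kind typically amounts to fine-tuning a multilingual encoder for sequence classification. The sketch below uses Hugging Face transformers with XLM-RoBERTa and a three-example toy corpus purely for illustration; the abstract does not name the exact model, hyperparameters, or preprocessing used in the study, so treat these as assumptions.

from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Tiny illustrative corpus; the real symptom claims span 38 languages and 1357 classes.
data = Dataset.from_dict({
    "text": ["Motor startet nicht", "engine overheating", "frein qui grince"],
    "label": [0, 1, 2],
})

model_name = "xlm-roberta-base"   # assumption: any multilingual encoder could be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

encoded = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="fault-nowcasting", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=encoded,
)
trainer.train()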
-
CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification
2024. Korbinian Robert Randl (et al.). Findings of the Association for Computational Linguistics, 7695-7715
Conference: Contaminated or adulterated food poses a substantial risk to human health. Given sets of labeled web texts for training, Machine Learning and Natural Language Processing can be applied to automatically detect such risks. We publish a dataset of 7,546 short texts describing public food recall announcements. Each text is manually labeled, on two granularity levels (coarse and fine), for food products and hazards that the recall corresponds to. We describe the dataset and benchmark naive, traditional, and Transformer models. Based on our analysis, Logistic Regression based on a tf-idf representation outperforms RoBERTa and XLM-R on classes with low support. Finally, we discuss different prompting strategies and present an LLM-in-the-loop framework, based on Conformal Prediction, which boosts the performance of the base classifier while reducing energy consumption compared to normal prompting.
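The strong low-support baseline mentioned above, Logistic Regression over tf-idf features, is easy to reproduce in outline. The snippet is a minimal scikit-learn sketch with invented toy texts and labels; the real dataset has 7,546 recall announcements and two label granularities.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for recall announcements and coarse hazard labels.
texts = ["salmonella found in chicken salad",
         "undeclared peanuts in chocolate bars",
         "glass fragments in pasta sauce jars"]
hazards = ["biological", "allergen", "foreign body"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=1),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, hazards)
print(clf.predict(["traces of milk not listed on the label"]))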
-
CoPAL: Conformal Prediction for Active Learning with Application to Remaining Useful Life Estimation in Predictive Maintenance
2024. Zahra Kharazian (et al.). Proceedings of Machine Learning Research, 195-217
Conference: Active learning has received considerable attention as an approach to obtain high predictive performance while minimizing the labeling effort. A central component of the active learning framework concerns the selection of objects for labeling, which are used for iteratively updating the underlying model. In this work, an algorithm called CoPAL (Conformal Prediction for Active Learning) is proposed, which bases the selection of objects within active learning on the uncertainty as quantified by conformal prediction. The efficacy of CoPAL is investigated by considering the task of estimating the remaining useful life (RUL) of assets in the domain of predictive maintenance (PdM). Experimental results are presented, encompassing diverse setups, including different models, sample selection criteria, conformal predictors, and datasets, using root mean squared error (RMSE) as the primary evaluation metric while also reporting prediction interval sizes over the iterations. The comprehensive analysis confirms the positive effect of using CoPAL for improving predictive performance.
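The selection principle described above, picking the unlabeled objects whose conformal prediction intervals are widest, can be sketched as follows. This is a generic split-conformal illustration and not the CoPAL algorithm itself: the normalization by a per-tree spread estimate, the function names, and the synthetic data are assumptions made here so that interval widths differ between objects.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

def per_tree_std(forest, X):
    """Difficulty estimate: standard deviation of the individual tree predictions."""
    per_tree = np.stack([t.predict(X) for t in forest.estimators_], axis=0)
    return per_tree.std(axis=0) + 1e-8

def widest_interval_indices(model, X_cal, y_cal, X_pool, batch_size, alpha=0.1):
    """Normalized split-conformal intervals on the unlabeled pool; return the
    indices of the `batch_size` objects with the widest intervals."""
    scores = np.abs(y_cal - model.predict(X_cal)) / per_tree_std(model, X_cal)
    q = np.quantile(scores, 1 - alpha)
    widths = 2 * q * per_tree_std(model, X_pool)
    return np.argsort(widths)[-batch_size:]

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5)); y = X[:, 0] ** 2 + rng.normal(scale=0.1, size=300)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X[:150], y[:150])
picked = widest_interval_indices(model, X[150:200], y[150:200], X[200:], batch_size=10)
print(picked)   # pool indices that would be sent for labeling next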
-
CounterFair: Group Counterfactuals for Bias Detection, Mitigation and Subgroup Identification
2024. Alejandro Kuratomi Hernandez (et al.). Proceedings 24th IEEE International Conference on Data Mining, 181-190
Conference: Counterfactual explanations can be used as a means to explain a model's decision process and to provide recommendations to users on how to improve their current status. The difficulty of applying these counterfactual recommendations from the user's perspective, also known as burden, may be used to assess the model's algorithmic fairness and to provide fair recommendations among different sensitive feature groups. We propose a novel model-agnostic, mathematical programming-based, group counterfactual algorithm that can: (1) detect biases via group counterfactual burden, (2) produce fair recommendations among sensitive groups and (3) identify relevant subgroups of instances through shared counterfactuals. We analyze these capabilities from the perspective of recourse fairness, and empirically compare our proposed method with the state-of-the-art algorithms for group counterfactual generation in order to assess the bias identification and the capabilities in group counterfactual effectiveness and burden minimization.
-
Ijuice: integer JUstIfied counterfactual explanations
2024. Alejandro Kuratomi Hernandez (et al.). Machine Learning 113, 5731-5771
Article: Counterfactual explanations modify the feature values of an instance in order to alter its prediction from an undesired to a desired label. As such, they are highly useful for providing trustworthy interpretations of decision-making in domains where complex and opaque machine learning algorithms are utilized. To guarantee their quality and promote user trust, they need to satisfy the faithfulness desideratum, when supported by the data distribution. We hereby propose a counterfactual generation algorithm for mixed-feature spaces that prioritizes faithfulness through k-justification, a novel counterfactual property introduced in this paper. The proposed algorithm employs a graph representation of the search space and provides counterfactuals by solving an integer program. In addition, the algorithm is classifier-agnostic and is not dependent on the order in which the feature space is explored. In our empirical evaluation, we demonstrate that it guarantees k-justification while showing comparable performance to state-of-the-art methods in feasibility, sparsity, and proximity.
-
Z-Time: efficient and effective interpretable multivariate time series classification
2024. Zed Lee, Tony Lindgren, Panagiotis Papapetrou. Data mining and knowledge discovery 38 (1), 206-236
Article: Multivariate time series classification has become popular due to its prevalence in many real-world applications. However, most state-of-the-art methods focus on improving classification performance, with the best-performing models typically being opaque. Interpretable multivariate time series classifiers have recently been introduced, but none can maintain sufficient levels of efficiency and effectiveness together with interpretability. We introduce Z-Time, a novel algorithm for effective and efficient interpretable multivariate time series classification. Z-Time employs temporal abstraction and temporal relations of event intervals to create interpretable features across multiple time series dimensions. In our experimental evaluation on the UEA multivariate time series datasets, Z-Time achieves comparable effectiveness to state-of-the-art non-interpretable multivariate classifiers while being faster than all interpretable multivariate classifiers. We also demonstrate that Z-Time is more robust to missing values and inter-dimensional orders, compared to its interpretable competitors.
-
SHAP-Driven Explainability in Survival Analysis for Predictive Maintenance Applications
2024. Monireh Kargar-Sharif-Abad (et al.). HAII5.0 2024 Embracing Human-Aware AI in Industry 2024
Conference: In the dynamic landscape of industrial operations, ensuring machines operate without interruption is crucial for maintaining optimal productivity levels. Estimating the Remaining Useful Life within Predictive Maintenance is vital for minimizing downtime, improving operational efficiency, and preventing unexpected equipment failures. Survival analysis is a beneficial approach in this context due to its power of handling censored data (here referring to industrial assets that have not experienced a failure during the study period). However, the black-box nature of survival analysis models necessitates the use of explainable AI for greater transparency and interpretability. This study evaluates three Machine Learning-based Survival Analysis models and a traditional Survival Analysis model using real-world data from SCANIA AB, which includes over 90% censored data. Results indicate that Random Survival Forest outperforms the Cox Proportional Hazards model, Gradient Boosting Survival Analysis, and the Survival Support Vector Machine. Additionally, we employ SHAP analysis to provide global and local explanations, highlighting the importance and interaction of features in our best-performing model. To overcome the limitation of applying SHAP to survival output, we utilize a surrogate model. Finally, SHAP identifies specific influential features, shedding light on their effects and interactions. This comprehensive methodology tackles the inherent opacity of machine learning-based survival analysis models, providing valuable insights into their predictive mechanisms. The findings from our SHAP analysis underscore the pivotal role of these identified features and their interactions, thereby enriching our comprehension of the factors influencing Remaining Useful Life predictions.
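The surrogate trick mentioned above, fitting a standard regressor to the survival model's risk score so that ordinary SHAP machinery applies, can be sketched with scikit-survival and shap. Everything below is an illustrative pipeline on a public breast-cancer dataset; the model choices and settings are assumptions rather than the study's configuration, and the SCANIA data itself is confidential.

import shap
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sksurv.datasets import load_gbsg2
from sksurv.ensemble import RandomSurvivalForest
from sksurv.preprocessing import OneHotEncoder

# Public breast-cancer survival data stands in for the confidential fleet data.
X, y = load_gbsg2()
X = OneHotEncoder().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rsf = RandomSurvivalForest(n_estimators=100, random_state=0).fit(X_train, y_train)

# Surrogate regressor mimics the RSF risk score so that standard tree SHAP applies.
surrogate = GradientBoostingRegressor(random_state=0).fit(X_train, rsf.predict(X_train))
explainer = shap.TreeExplainer(surrogate)
shap_values = explainer.shap_values(X_test)
print(shap_values.shape)   # (n_test_samples, n_features)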
-
AID4HAI: Automatic Idea Detection for Healthcare-Associated Infections from Twitter, A Framework based on Active Learning and Transfer Learning
2023. Zahra Kharazian (et al.). 35th Annual Workshop of the Swedish Artificial Intelligence Society SAIS 2023
Conference: This study is a collaboration between data scientists, innovation management researchers from academia, and experts from a hygiene and health company. The study aims to develop an automatic idea detection package to control and prevent healthcare-associated infections (HAI) by extracting informative ideas from social media using Active Learning and Transfer Learning. The proposed package includes a dataset collected from Twitter, expert-created labels, and an annotation framework. Transfer Learning has been used to build a two-step deep neural network model that gradually extracts the semantic representation of the text data using the BERTweet language model in the first step. In the second step, the model classifies the extracted representations as informative or non-informative using a multi-layer perceptron (MLP). The package is named AID4HAI (Automatic Idea Detection for controlling and preventing Healthcare-Associated Infections) and is publicly available on GitHub.
-
Hierarchical Bayesian modeling for knowledge transfer across engineering fleets via multitask learning
2023. L. A. Bull (et al.). Computer-Aided Civil and Infrastructure Engineering 38 (7), 821-848
Article: A population-level analysis is proposed to address data sparsity when building predictive models for engineering infrastructure. Utilizing an interpretable hierarchical Bayesian approach and operational fleet data, domain expertise is naturally encoded (and appropriately shared) between different subgroups, representing (1) use-type, (2) component, or (3) operating condition. Specifically, domain expertise is exploited to constrain the model via assumptions (and prior distributions), allowing the methodology to automatically share information between similar assets, improving the survival analysis of a truck fleet (15% and 13% increases in predictive log-likelihood of hazard) and power prediction in a wind farm (up to 82% reduction in the standard deviation of maximum output prediction). In each asset management example, a set of correlated functions is learnt over the fleet, in a combined inference, to learn a population model. Parameter estimation is improved when subfleets are allowed to share correlated information at different levels in the hierarchy; the (averaged) reduction in standard deviation for interpretable parameters in the survival analysis is 70%, alongside 32% in wind farm power models. In turn, groups with incomplete data automatically borrow statistical strength from those that are data-rich. The statistical correlations enable knowledge transfer via Bayesian transfer learning, and the correlations can be inspected to inform which assets share information for which effect (i.e., parameter). Successes in both case studies demonstrate the wide applicability in practical infrastructure monitoring, since the approach is naturally adapted between interpretable fleet models of different in situ examples.
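The partial-pooling structure behind this kind of multitask model can be written compactly. The equations below are a generic Gaussian sketch of hierarchical shrinkage across subgroups g (for example use-types or operating conditions), not the paper's exact survival or power-prediction likelihoods; the symbols and hyperpriors are illustrative only.

\begin{align}
  y_{gi} &\sim \mathcal{N}\big(f(x_{gi};\,\theta_g),\ \sigma^2\big) && \text{observation } i \text{ in subgroup } g, \\
  \theta_g &\sim \mathcal{N}(\mu,\ \tau^2) && \text{subgroup parameters tied through a shared population level,} \\
  \mu &\sim \mathcal{N}(m_0,\ s_0^2), \quad \tau \sim \mathrm{HalfNormal}(s_1) && \text{hyperpriors encoding domain expertise.}
\end{align}

Subgroups with little data shrink towards the population mean, which is the "borrowing of statistical strength" the abstract refers to.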
-
Robust Contrastive Learning and Multi-shot Voting for High-dimensional Multivariate Data-driven Prognostics
2023. Kaiji Sun (et al.). 2023 IEEE International Conference on Prognostics and Health Management (ICPHM), 53-60
Conference: The availability of data gathered from industrial sensors has increased expeditiously in recent years. These data are valuable assets in delivering exceptional services for manufacturing enterprises. We see growing interest and expectations from manufacturers in deploying artificial intelligence for predictive maintenance. The paper has adopted and transferred a state-of-the-art method from few-shot learning to failure prognostics using Siamese neural network based contrastive learning. On top of the highest performance, a sensitivity of 98.4% for capturing Scania trucks' air pressure system failures compared to the methods proposed in previous related research, the method has three main characteristics: prediction stability, deployment flexibility, and robust multi-shot diagnosis based on selected historical reference samples.
-
Measuring the Burden of (Un)fairness Using Counterfactuals
2023. Alejandro Kuratomi Hernandez (et al.). Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 402-417
Conference: In this paper, we use counterfactual explanations to offer a new perspective on fairness that, besides accuracy, also accounts for the difficulty or burden to achieve fairness. We first gather a set of fairness-related datasets and implement a classifier to extract the set of false negative test instances to generate different counterfactual explanations on them. We subsequently calculate two measures: the false negative ratio of the set of test instances, and the distance (also called burden) from these instances to their corresponding counterfactuals, aggregated by sensitive feature groups. The first measure is an accuracy-based estimation of the classifier's biases against sensitive groups, whilst the second is a counterfactual-based assessment of the difficulty each of these groups has in reaching their corresponding desired ground truth label. We promote the idea that a counterfactual and an accuracy-based fairness measure may assess fairness in a more holistic manner, whilst also providing interpretability. We then propose and evaluate, on these datasets, a measure called Normalized Accuracy Weighted Burden, which is more consistent than its accuracy or its counterfactual components alone, considering both false negative ratios and counterfactual distance per sensitive feature. We believe this measure would be more adequate to assess classifier fairness and promote the design of better performing algorithms.
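The two ingredients of the proposed measure, the per-group false negative ratio and the per-group mean counterfactual distance (burden), can be computed directly once counterfactuals exist for each false negative. The snippet is a minimal sketch with made-up numbers; it does not implement the paper's Normalized Accuracy Weighted Burden formula, only its building blocks.

import numpy as np

def group_unfairness_measures(distances, false_negative, group):
    """Per sensitive group: false-negative ratio and mean counterfactual
    distance ('burden') of the false-negative instances."""
    out = {}
    for g in np.unique(group):
        mask = group == g
        fnr = false_negative[mask].mean()
        hit = mask & false_negative
        burden = distances[hit].mean() if hit.any() else 0.0
        out[g] = {"FNR": fnr, "burden": burden}
    return out

# Toy example: counterfactual distance per test instance, whether it is a
# false negative, and its sensitive-group membership.
d = np.array([0.4, 1.2, 0.9, 0.3, 2.0, 0.7])
fn = np.array([True, True, False, True, True, False])
grp = np.array(["A", "A", "A", "B", "B", "B"])
print(group_unfairness_measures(d, fn, grp))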
-
ORANGE: Opposite-label soRting for tANGent Explanations in heterogeneous spaces
2023. Alejandro Kuratomi Hernandez (et al.). 2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA), 1-10
Conference: Most real-world datasets have a heterogeneous feature space composed of binary, categorical, ordinal, and continuous features. However, the currently available local surrogate explainability algorithms do not consider this aspect, generating infeasible neighborhood centers which may provide erroneous explanations. To overcome this issue, we propose ORANGE, a local surrogate explainability algorithm that generates high-accuracy and high-fidelity explanations in heterogeneous spaces. ORANGE has three main components: (1) it searches for the closest feasible counterfactual point to a given instance of interest by considering feasible values in the features, to ensure that the explanation is built around the closest feasible instance and not an arbitrary, potentially non-existent instance in space; (2) it generates a set of neighboring points around this close feasible point based on the correlations among features, to ensure that the relationship among features is preserved inside the neighborhood; and (3) the generated instances are weighted, firstly based on their distance to the decision boundary, and secondly based on the disagreement between the predicted labels of the global model and a surrogate model trained on the neighborhood. Our extensive experiments on synthetic and public datasets show that the performance achieved by ORANGE is best-in-class in both explanation accuracy and fidelity.
-
Weibull recurrent neural networks for failure prognosis using histogram data
2023. Maharshi Dhada (et al.). Neural Computing & Applications 35 (4), 3011-3024
Article: Weibull time-to-event recurrent neural networks (WTTE-RNN) is a simple and versatile prognosis algorithm that works by optimising a Weibull survival function using a recurrent neural network. It offers the combined benefits of the sequential nature of the recurrent neural network and the ability of the Weibull loss function to incorporate censored data. The goal of this paper is to present the first industrial use case of WTTE-RNN for prognosis. Prognosis of turbocharger conditions in a fleet of heavy-duty trucks is presented here, where the condition data used in the case study were recorded as a time series of sparsely sampled histograms. The experiments include a comparison of prediction models trained using data from the entire fleet of trucks versus data from clustered sub-fleets, where it is concluded that clustering is only beneficial as long as the training dataset is large enough for the model to not overfit. Moreover, the censored data from assets that did not fail are also shown to be incorporated when optimising the Weibull loss function, improving prediction performance. Overall, this paper concludes that WTTE-RNN-based failure predictions enable predictive maintenance policies, which are enhanced by identifying sub-fleets of similar trucks.
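The loss at the heart of WTTE-RNN is the censored Weibull log-likelihood; the network only has to output the two Weibull parameters per time step. Below is a plain NumPy sketch of the continuous-time version of that likelihood (the original WTTE-RNN work also defines a discrete variant); the example values are invented.

import numpy as np

def weibull_censored_loglik(t, event, alpha, beta, eps=1e-9):
    """Per-sample log-likelihood of a Weibull(alpha, beta) time-to-event model
    with right censoring:
      event = 1: log f(t) = log(beta/alpha) + (beta-1)*log(t/alpha) - (t/alpha)**beta
      event = 0: log S(t) = -(t/alpha)**beta
    In WTTE-RNN, alpha and beta come from the recurrent network and the
    negative mean of this quantity is the training loss."""
    t = np.asarray(t, dtype=float)
    event = np.asarray(event, dtype=float)
    z = (t + eps) / alpha
    log_hazard_part = np.log(beta / alpha) + (beta - 1.0) * np.log(z)
    return event * log_hazard_part - z ** beta

t = np.array([10.0, 35.0, 50.0])   # observed failure or censoring times
event = np.array([1, 0, 1])        # 1 = failure observed, 0 = censored
print(-weibull_censored_loglik(t, event, alpha=40.0, beta=1.5).mean())  # loss value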
-
Low dimensional synthetic data generation for improving data driven prognostic models
2022. Tony Lindgren, Olof Steinert. 2022 IEEE International Conference on Prognostics and Health Management (ICPHM), 173-182
Conference: Data driven prognostic models are becoming more prevalent in many areas, ranging from heavy trucks to gas turbines. One aspect of certain prognostic models is the need for labeled failures, which then can be used as positive examples when modelling the prognostic problem. Unfortunately, standard algorithms for creating prognostic models can suffer when labeled data is unbalanced w.r.t. class distribution, leading to prognostic models with poor performance. In this paper we present a methodology for creating synthetic data that can be used to augment the underrepresented class and hence dramatically increase the performance of the data driven predictive model. In our study we utilize data collected from heavy trucks and focus on predicting failure of one engine component that is crucial for the operation of heavy trucks. We examine different ways of generating synthetic examples in a low dimensional setting; it is found that three of the six methods studied do not improve performance compared to using only the original data. The other three methods, which are based on interpolation, are superior to only using the original data, with SMOTE outperforming the two other interpolation methods. SMOTE lowers the estimated cost on test data by 67%, compared to using a model trained on the original data set only.
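The best-performing interpolation method in the study, SMOTE, creates new minority examples on the line segments between a minority example and one of its nearest minority neighbours. The snippet below is a bare-bones illustration of that idea on random data; in practice the full SMOTE implementation in the imbalanced-learn package would normally be used.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote(X_minority, n_new, k=5, random_state=0):
    """Generate n_new synthetic minority examples by interpolating between a
    minority example and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(random_state)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_minority)
    _, idx = nn.kneighbors(X_minority)          # idx[:, 0] is the point itself
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_minority))
        j = idx[i, rng.integers(1, k + 1)]      # one of the k true neighbours
        lam = rng.random()
        synthetic.append(X_minority[i] + lam * (X_minority[j] - X_minority[i]))
    return np.vstack(synthetic)

X_min = np.random.default_rng(1).normal(size=(20, 3))   # stand-in for the rare failure class
print(smote(X_min, n_new=40).shape)   # (40, 3)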
-
JUICE: JUstIfied Counterfactual Explanations
2022. Alejandro Kuratomi Hernandez (et al.). Discovery Science, 493-508
Conference: Complex, highly accurate machine learning algorithms support decision-making processes with large and intricate datasets. However, these models have low explainability. Counterfactual explanation is a technique that tries to find a set of feature changes on a given instance to modify the model's prediction output from an undesired to a desired class. To obtain better explanations, it is crucial to generate faithful counterfactuals, supported by and connected to observations and the knowledge constructed on them. In this study, we propose a novel counterfactual generation algorithm that provides faithfulness by justification, which may increase developers' and users' trust in the explanations by supporting the counterfactuals with a known observation. The proposed algorithm guarantees justification for mixed-feature spaces and we show it performs similarly with respect to state-of-the-art algorithms across other metrics such as proximity, sparsity, and feasibility. Finally, we introduce the first model-agnostic algorithm to verify counterfactual justification in mixed-feature spaces.
-
Hybrid feature tweaking
2021. Tony Mattias Lindgren. ICCDE 2021: 2021 7th International Conference on Computing and Data Engineering, 20-26
Conference: When using prediction models created from data, it is in certain cases not sufficient for the users to only get a prediction, sometimes accompanied by a probability of the predicted outcome. Instead, a more elaborate answer is required: given the predicted outcome, how can this outcome be changed into a desired outcome, i.e., feature tweaking. In this paper we introduce a novel hybrid method for performing feature tweaking that builds upon Random Forest Similarity Tweaking and utilizes a Constraint Logic Programming solver for the Finite Domain (CLPFD). This hybrid method is compared to only using a CLPFD solver and to using a previously known feature tweaking algorithm, Actionable Feature Tweaking. The results show that, compared to the other methods, the hybrid method provides a good balance between distance (comparing the original example and the tweaked example) and completeness (the number of successfully tweaked examples). Another benefit of the novel method is that the user can specify a prediction threshold for feature tweaking and adjust weights of features to mimic the real-world cost of changing feature values.
-
Prediction of Global Navigation Satellite System Positioning Errors with Guarantees
2021. Alejandro Kuratomi Hernandez, Tony Lindgren, Panagiotis Papapetrou. Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track, 562-578
Conference: Intelligent Transportation Systems employ different localization technologies, such as the Global Navigation Satellite System. This system transmits signals between satellites and receiver devices on the ground, which can estimate their position on Earth's surface. The accuracy of this positioning estimate, or the positioning error estimation, is of utmost importance for the efficient and safe operation of autonomous vehicles, which require not only the position estimate but also an estimation of their operation margin. This paper proposes a workflow for positioning error estimation using a random forest regressor along with a post-hoc conformal prediction framework. The latter is calibrated on the random forest out-of-bag samples to transform the obtained positioning error estimates into predicted integrity intervals, which are confidence intervals on the positioning error prediction with at least 99.999% confidence. The performance is measured as the number of ground truth positioning errors inside the predicted integrity intervals. An extensive experimental evaluation is performed on real-world and synthetic data in terms of root mean square error between predicted and ground truth positioning errors. Our solution results in an improvement of 73% compared to earlier research, while providing statistical guarantees on the predictions.
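Calibrating a conformal predictor on the random forest's out-of-bag residuals, as described above, takes only a few lines with scikit-learn. The sketch below uses synthetic data and a plain absolute-residual score; it illustrates the mechanism rather than the paper's exact workflow, and at the 99.999% level a calibration set of this toy size simply yields the maximum out-of-bag residual.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))                                        # stand-in receiver features
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.3, size=2000)   # stand-in positioning error

rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0).fit(X, y)

# Calibrate on out-of-bag residuals, then widen point predictions into
# intervals targeting coverage of at least 1 - alpha.
alpha = 1e-5                                       # the 99.999% target mentioned in the abstract
oob_residuals = np.abs(y - rf.oob_prediction_)
level = min(1.0, (1 - alpha) * (len(y) + 1) / len(y))
q = np.quantile(oob_residuals, level)              # here: effectively the max OOB residual

X_new = rng.normal(size=(5, 6))
pred = rf.predict(X_new)
intervals = np.column_stack([pred - q, pred + q])
print(intervals)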
-
Z-Hist
2021. Zed Lee (et al.). Advances in Intelligent Data Analysis XIX, 376-388
Conference: Multivariate histogram snapshots are complex data structures that frequently occur in predictive maintenance. Histogram snapshots store large amounts of data in devices with small memory capacity, though it remains a challenge to analyze them effectively. In this paper, we propose Z-Hist, a novel framework for representing and temporally abstracting histogram snapshots by converting them into a set of temporal intervals. This conversion enables the exploitation of frequent arrangement mining techniques for extracting disproportionally frequent patterns of such complex structures. Our experiments on a turbo failure dataset from a truck Original Equipment Manufacturer (OEM) demonstrate a promising use-case of Z-Hist. We also benchmark Z-Hist on six synthetic datasets for studying the relationship between distribution changes over time and disproportionality values.
-
An Interactive Visual Tool to Enhance Understanding of Random Forest Prediction
2020. Ram B. Gurung, Tony Lindgren, Henrik Boström. Archives of Data Science, Series A 6 (1)
Article: Random forests are known to provide accurate predictions, but the predictions are not easy to understand. In order to provide support for understanding such predictions, an interactive visual tool has been developed. The tool can be used to manipulate selected features to explore what-if scenarios. It exploits the internal structure of decision trees in a trained forest model and presents this information as interactive plots and charts. In addition, the tool presents a simple decision rule as an explanation for the prediction. It also presents recommendations for reassignments of feature values of the example that lead to a change in the prediction to a preferred class. An evaluation of the tool was undertaken in a large truck manufacturing company, targeting fault prediction for a selected component in trucks. A set of domain experts were invited to use the tool and provide feedback in post-task interviews. The result of this investigation suggests that the tool indeed may aid in understanding the predictions of random forests, and also allows for gaining new insights.
-
Evaluation of Dimensionality Reduction Techniques
2020. Michael Mammo, Tony Lindgren. ICCDE 2020, 75-79
Conference: One of the commonly observed phenomena in text classification problems is sparsity of the generated feature set. So far, different dimensionality reduction techniques have been developed to reduce feature spaces to a convenient size that a learning algorithm can handle. Among these, Principal Component Analysis (PCA) is one of the well-established techniques, capable of generating an undistorted view of the data. As a result, variants of the algorithm have been developed and applied in several domains, including text mining. However, PCA does not provide backward traceability to the original features once it has projected the initial features onto a new space. It also needs a relatively large computational space since it uses all features when generating the final features. These drawbacks especially pose a problem in text classification, where high dimensionality and sparsity are common phenomena. This paper presents a modified version of PCA, Principal Feature Analysis (PFA), which enables backward traceability by choosing a subset of optimal features in the original space using the same criteria PCA uses, without involving the initial features in the final computation. The proposed technique is tested against benchmark corpora and produces results comparable to PCA while maintaining traceability to the original feature space.
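One common way to realize Principal Feature Analysis is to cluster the rows of the PCA loading matrix (one row per original feature) and keep, from each cluster, the feature closest to the cluster centre, which yields a subset of original, traceable features. The sketch below follows that formulation, which may differ in details from the variant proposed in the paper, and uses a random matrix as a stand-in for a tf-idf feature set.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

def principal_feature_analysis(X, n_features, n_components=None):
    """Select `n_features` original features using PCA structure: cluster the
    loading rows (one per feature) and keep the feature closest to each
    cluster centre, preserving backward traceability."""
    pca = PCA(n_components=n_components).fit(X)
    loadings = pca.components_.T                  # shape: (n_original_features, n_components)
    km = KMeans(n_clusters=n_features, n_init=10, random_state=0).fit(loadings)
    selected = []
    for c in range(n_features):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(loadings[members] - km.cluster_centers_[c], axis=1)
        selected.append(members[np.argmin(dists)])
    return sorted(selected)

X = np.random.default_rng(0).normal(size=(200, 30))   # stand-in for a tf-idf matrix
print(principal_feature_analysis(X, n_features=5, n_components=10))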
-
Z-Miner
2020. Zed Lee, Tony Lindgren, Panagiotis Papapetrou. KDD '20, 524-534
Conference: Mining frequent patterns of event intervals from a large collection of interval sequences is a problem that appears in several application domains. In this paper, we propose Z-Miner, a novel algorithm for solving this problem that addresses the deficiencies of existing competitors by employing two novel data structures: Z-Table, a hierarchical hash-based data structure for time-efficient candidate generation and support count, and Z-Arrangement, a data structure for efficient memory consumption. The proposed algorithm is able to handle patterns with repetitions of the same event label, allowing for gap and error tolerance constraints, as well as keeping track of the exact occurrences of the extracted frequent patterns. Our experimental evaluation on eight real-world and six synthetic datasets demonstrates the superiority of Z-Miner against four state-of-the-art competitors in terms of runtime efficiency and memory footprint.
-
A Methodology for Prognostics Under the Conditions of Limited Failure Data Availability
2019. Gishan D. Ranasinghe (et al.). IEEE Access 7, 183996-184007
Article: When failure data are limited, data-driven prognostics solutions underperform since the number of failure data samples is insufficient for training prognostics models effectively. In order to address this problem, we present a novel methodology for generating failure data which allows training datasets to be augmented so that the number of failure data samples is increased. In contrast to existing data generation techniques which duplicate or randomly generate data, the proposed methodology is capable of generating new and realistic failure data samples. The methodology utilises a conditional generative adversarial network and auxiliary information pertaining to failure modes to control and direct the failure data generation process. The theoretical foundation of the methodology in a non-parametric setting is presented and we show that it holds in practice using empirical results. The methodology is evaluated in a real-world case study involving the prediction of air purge valve failures in heavy trucks. Two prognostics models are developed using the gradient boosting machine and random forest classifiers. When these models are trained on the augmented training dataset, they outperform the best solution previously proposed in the literature for the case study by a large margin. More specifically, costs due to breakdowns and false alarms are reduced by 44%.
-
Example-Based Feature Tweaking Using Random Forests
2019. Tony Lindgren (et al.). 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science
Conference: In certain application areas, when using predictive models, it is not enough to make an accurate prediction for an example; instead it might be more important to change a prediction from an undesired class into a desired class. In this paper we investigate methods for changing predictions of examples. To this end, we introduce a novel algorithm for changing predictions of examples and we compare this novel method to an existing method and a baseline method. In an empirical evaluation we compare the three methods on a total of 22 datasets. The results show that the novel method and the baseline method can change an example from an undesired class into a desired class in more cases than the competitor method (and in some cases this difference is statistically significant). We also show that the distance, as measured by the Euclidean norm, is higher for the novel and baseline methods (and in some cases this difference is statistically significant) than for the state-of-the-art method. The methods and their proposed changes are also evaluated subjectively in a medical domain, with interesting results.
-
On Data Driven Organizations and the Necessity of Interpretable Models
2019. Tony Lindgren. Smart Grid and Internet of Things, 121-130
Conference: In this paper we investigate data driven organizations in the context of predictive models, and we also reflect on the need for interpretability of the predictive models in such a context. By investigating a specific use-case, the maintenance offer from a heavy truck manufacturer, we explore their current situation, trying to identify areas that need change in order to go from the current situation towards a more data driven and agile maintenance offer. The suggestions for improvements are captured in a proposed data driven framework for this type of business. The aim of the paper is that the suggested framework can inspire and start further discussions and investigations into the best practices for creating a data driven organization, in businesses facing similar challenges as in the presented use-case.
-
Learning Random Forest from Histogram Data Using Split Specific Axis Rotation
2018. Ram B. Gurung, Tony Lindgren, Henrik Boström. International Journal of Machine Learning and Computing 8 (1), 74-79
Article: Machine learning algorithms for data containing histogram variables have not been explored to any major extent. In this paper, an adapted version of the random forest algorithm is proposed to handle variables of this type, assuming identical structure of the histograms across observations, i.e., the histograms for a variable all use the same number and width of bins. The standard approach of representing bins as separate variables may lead the learning algorithm to overlook the underlying dependencies. In contrast, the proposed algorithm handles each histogram as a unit. When performing split evaluation of a histogram variable during tree growth, a sliding window of fixed size is employed by the proposed algorithm to constrain the sets of bins that are considered together. A small number of all possible sets of bins are randomly selected and principal component analysis (PCA) is applied locally on all examples in a node. Split evaluation is then performed on each principal component. Results from applying the algorithm to both synthetic and real world data are presented, showing that the proposed algorithm outperforms the standard approach of using random forests together with bins represented as separate variables, with respect to both AUC and accuracy. In addition to introducing the new algorithm, we elaborate on how real world data for predicting NOx sensor failure in heavy duty trucks was prepared, demonstrating that predictive performance can be further improved by adding variables that represent changes of the histograms over time.
-
Random Rule Sets - Combining Random Covering with the Random Subspace Method
2018. Tony Lindgren. International Journal of Machine Learning and Computing 8 (1), 8-13
Article: Ensembles of classifiers have proven to be among the best methods for creating highly accurate prediction models. In this paper we combine the random coverage method, which facilitates additional diversity when inducing rules using the covering algorithm, with the random subspace selection method, which has been used successfully by, for example, the random forest algorithm. We compare three different covering methods with the random forest algorithm: the first using random subspace selection and random covering; the second using bagging and random subspace selection; and the third using bagging, random subspace selection, and random covering. The results show that all three covering algorithms perform better than the random forest algorithm. The covering algorithm using random subspace selection and random covering performs best of all methods. The results are not significant according to adjusted p-values, but they are for unadjusted p-values, indicating that the novel method introduced in this paper warrants further attention.
-
Conformal prediction using random survival forests
2017. Henrik Boström (et al.). 16th IEEE International Conference on Machine Learning and Applications, 812-817
Conference: Random survival forests constitute a robust approach to survival modeling, i.e., predicting the probability that an event will occur before or on a given point in time. Similar to most standard predictive models, no guarantee for the prediction error is provided for this model, which instead typically is empirically evaluated. Conformal prediction is a rather recent framework, which allows the error of a model to be determined by a user-specified confidence level, something which is achieved by considering set rather than point predictions. The framework, which has been applied to some of the most popular classification and regression techniques, is here for the first time applied to survival modeling, through random survival forests. An empirical investigation is presented where the technique is evaluated on datasets from two real-world applications: predicting component failure in trucks using operational data, and predicting survival and treatment of heart failure patients from administrative healthcare data. The experimental results show that the error levels indeed are very close to the provided confidence levels, as guaranteed by the conformal prediction framework, and that the error for predicting each outcome, i.e., event or no-event, can be controlled separately. The latter may, however, lead to less informative predictions, i.e., larger prediction sets, in case the class distribution is heavily imbalanced.
-
Planning Flexible Maintenance for Heavy Trucks using Machine Learning Models, Constraint Programming, and Route Optimization
2017. Jonas Biteus, Tony Lindgren. SAE International Journal of Materials & Manufacturing 10 (3), 306-315
Article: Maintenance planning of trucks at Scania has previously been done using static cyclic plans with fixed sets of maintenance tasks, determined by mileage, calendar time, and some data driven physical models. Flexible maintenance has improved the maintenance program with the addition of general data driven expert rules and the ability to move sub-sets of maintenance tasks between maintenance occasions. Meanwhile, successful modelling with machine learning on big data, automatic planning using constraint programming, and route optimization hint at the ability to achieve even higher fleet utilization by further improvements of the flexible maintenance. The maintenance program has therefore been partitioned into its smallest parts and formulated as individual constraint rules. The overall goal is to maximize the utilization of a fleet, i.e. maximize the ability to perform transport assignments, with respect to maintenance. A sub-goal is to minimize costs for vehicle breakdowns and the costs for maintenance actions. The maintenance planner takes as input customer preferences and maintenance task deadlines, where the existing expert rule for the component has been replaced by a predictive model. Using machine learning, operational data have been used to train a predictive random forest model that can estimate the probability that a vehicle will have a breakdown given its operational data as input. The route optimization takes predicted vehicle health into consideration when optimizing routes and assignment allocations. The random forest model satisfactorily predicts failures, the maintenance planner successfully computes consistent and good maintenance plans, and the route optimizer gives optimal routes within tens of seconds of operation time. The model, the maintenance planner, and the route optimizer have been integrated into a demonstrator able to highlight the usability and feasibility of the suggested approach.
-
Predicting NOx sensor failure in heavy duty trucks using histogram-based random forests
2017. Ram B. Gurung, Tony Lindgren, Henrik Boström. International Journal of Prognostics and Health Management 8 (1)
Article: Being able to accurately predict the impending failures of truck components is often associated with significant cost savings, customer satisfaction, and flexibility in maintenance service plans. However, because of the diversity in the way trucks typically are configured and their usage under different conditions, the creation of accurate prediction models is not an easy task. This paper describes an effort in creating such a prediction model for the NOx sensor, i.e., a component measuring the emitted level of nitrogen oxide in the exhaust of the engine. This component was chosen because it is vital for the truck to function properly, while at the same time being very fragile and costly to repair. As input to the model, technical specifications of trucks and their operational data are used. The process of collecting the data and making it ready for training the model via a slightly modified random forest learning algorithm is described, along with various challenges encountered during this process. The operational data consist of features represented as histograms, posing an additional challenge for the data analysis task. In the study, a modified version of the random forest algorithm is employed, which exploits the fact that the individual bins in the histograms are related, in contrast to the standard approach that would consider the bins as independent features. Experiments are conducted using the updated random forest algorithm, and they clearly show that the modified version is indeed beneficial when compared to the standard random forest algorithm. The performance of the resulting prediction model for the NOx sensor is promising and may be adopted for the benefit of operators of heavy trucks.
-
Evaluating the Reliability of Self-Explanations in Large Language Models
2025. Korbinian Robert Randl (et al.). Discovery Science, 36-51
Conference: This paper investigates the reliability of explanations generated by large language models (LLMs) when prompted to explain their previous output. We evaluate two kinds of such self-explanations, extractive and counterfactual, using three state-of-the-art LLMs (2B to 8B parameters) on two different classification tasks (objective and subjective). Our findings reveal that, while these self-explanations can correlate with human judgement, they do not fully and accurately follow the model's decision process, indicating a gap between perceived and actual model reasoning. We show that this gap can be bridged, because prompting LLMs for counterfactual explanations can produce faithful, informative, and easy-to-verify results. These counterfactuals offer a promising alternative to traditional explainability methods (e.g. SHAP, LIME), provided that prompts are tailored to specific tasks and checked for validity.
-
Randomized Separate and Conquer Rule induction
2017. Tony Lindgren. Proceedings of the International Conference on Compute and Data Analysis, 207-214
Conference: Rule learning comes in many forms; here we investigate a modified version of Separate and Conquer (SAC) learning to see if it improves the predictive performance of the induced predictive models. Our modified version of SAC has a hyperparameter which is used to specify the amount of examples that should not be removed from the induction. This selection is done at random and, as a consequence, the SAC algorithm will produce more, and more diverse, rules, given the hyperparameter setting. The modified algorithm has been implemented both in an unordered single rule set setting and in an ensemble rule set setting. Both of these settings have been evaluated empirically on a number of datasets. The results show that in the single rule set setting, the modified version significantly improves the predictive performance, at the cost of more rules, which was expected. In the ensemble setting, the combined method of bagging and the modified SAC algorithm did not perform as well as expected, while using only the modified SAC algorithm in an ensemble setting performed better than expected.
-
Indexing Rules in Rule Sets for Fast Classification
2016. Tony Lindgren. Proceedings of the International Conference on Artificial Intelligence and Robotics and the International Conference on Automation, Control and Robotics Engineering
Conference: Using sets of rules for classification of examples usually involves checking a number of conditions to see if they hold or not. If the rule set is large, the time to make the classification can be lengthy. In this paper we propose an indexing algorithm to decrease the classification time when dealing with large rule sets. Unordered rule sets have a high time complexity when conducting classification; we hence conduct experiments comparing our novel indexing algorithm with the standard way of classifying ensembles of unordered rule sets. The results of the experiment show decreased classification times for the novel method, ranging from 0.6 to 0.8 of that of the standard approach averaged over all experimental datasets. This time gain is obtained while retaining an accuracy ranging from 0.84 to 0.99 with regard to the standard classification method. The index bit size used with the indexing algorithm influences both the classification accuracy and the time needed for conducting the classification task.
-
Learning Decision Trees from Histogram Data Using Multiple Subsets of Bins
2016. Ram B. Gurung, Tony Lindgren, Henrik Boström. Proceedings of the Twenty-Ninth International Florida Artificial Intelligence Research Society Conference, 430-435
Conference: The standard approach of learning decision trees from histogram data is to treat the bins as independent variables. However, as the underlying dependencies among the bins might not be completely exploited by this approach, an algorithm has been proposed for learning decision trees from histogram data by considering all bins simultaneously while partitioning examples at each node of the tree. Although the algorithm has been demonstrated to improve predictive performance, its computational complexity has turned out to be a major bottleneck, in particular for histograms with a large number of bins. In this paper, we propose instead a sliding window approach to select subsets of the bins to be considered simultaneously while partitioning examples. This significantly reduces the number of possible splits to consider, allowing for substantially larger histograms to be handled. We also propose to evaluate the original bins independently, in addition to evaluating the subsets of bins when performing splits. This ensures that the information obtained by treating bins simultaneously is an additional gain compared to what is considered by the standard approach. Results of experiments on applying the new algorithm to both synthetic and real world datasets demonstrate positive results in terms of predictive performance without excessive computational cost.
-
Open government ideologies in post-soviet countries
2016. Karin Hansson (et al.). International Journal of Electronic Governance 8 (3), 244-264
Article: Most research in areas like e-government, e-participation and open government assumes a democratic norm. The open government (OG) concept is commonly based on a general liberal and deliberative ideology emphasising transparency, access, participation and collaboration, but where also innovation and accountability are promoted. In this paper, we outline a terminology and suggest a method for how to investigate the concept more systematically in different policy documents, with a special emphasis on post-Soviet countries. The result shows that the main focus in this region's OG policy documents is on freedom of information and accountability, and to a lesser extent on collaboration, while other aspects, such as diversity and innovation, are more rarely mentioned, if at all.
-
Learning Decision Trees from Histogram Data
2015. Ram B. Gurung, Tony Lindgren, Henrik Boström. Proceedings of the 2015 International Conference on Data Mining, 139-145
Conference: When applying learning algorithms to histogram data, bins of such variables are normally treated as separate independent variables. However, this may lead to a loss of information as the underlying dependencies may not be fully exploited. In this paper, we adapt the standard decision tree learning algorithm to handle histogram data by proposing a novel method for partitioning examples using binned variables. Results from applying the algorithm to both synthetic and real-world data sets demonstrate that exploiting dependencies in histogram data may have positive effects on both predictive performance and model size, as measured by the number of nodes in the decision tree. These gains are however associated with an increased computational cost and more complex split conditions. To address the former issue, an approximate method is proposed, which speeds up the learning process substantially while retaining the predictive performance.
-
Model Based Sampling - Fitting an Ensemble of Models into a Single Model
2015. Tony Lindgren. Proceedings of 2015 International Conference on Computational Science and Computational Intelligence, 186-191
Conference: Large ensembles of classifiers usually outperform single classifiers. Unfortunately, ensembles have two major drawbacks compared to single classifiers: interpretability and classification time. Using the Combined Multiple Models (CMM) framework to compress an ensemble of classifiers into a single classifier, the problems associated with ensembles can be avoided while retaining almost the same classification power as the original ensemble. One open question when using CMM concerns how to generate the values that constitute a synthetic example. In this paper we present a novel method for generating synthetic examples by utilizing the structure of the ensemble. This novel method is compared with other methods for generating synthetic examples using the CMM framework. From the comparison it is concluded that the novel method outperforms the other methods.
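The CMM idea itself, generate synthetic inputs, label them with the ensemble, and fit one compact model on the enlarged dataset, can be sketched with scikit-learn. The generator below simply permutes feature columns independently, which is a deliberately naive stand-in; the paper's contribution is precisely a smarter generator that exploits the ensemble's structure.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
ensemble = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Generate synthetic inputs (naively, by permuting each feature column
# independently), then label them with the ensemble's predictions.
rng = np.random.default_rng(0)
X_syn = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
y_syn = ensemble.predict(X_syn)

# Fit a single interpretable model on original plus synthetic data (the CMM idea).
X_aug = np.vstack([X, X_syn])
y_aug = np.concatenate([ensemble.predict(X), y_syn])
single = DecisionTreeClassifier(max_depth=6, random_state=0).fit(X_aug, y_aug)
print("agreement with ensemble:", (single.predict(X) == ensemble.predict(X)).mean())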
Show all publications by Tony Lindgren at Stockholm University