Stockholm University

Zahra Kharazian, PhD Student

About me

I am a Ph.D. student at the Department of Computer and Systems Sciences (DSV), Stockholm University. My current research focuses on applying human-in-the-loop machine learning models to predictive maintenance. I work on the RAPIDS project, supervised by Prof. Tony Lindgren and Prof. Sindri Magnússon.

My research background is in applying machine learning algorithms to real-world problems like predictive maintenance, human activity recognition, and detecting informative texts using natural language processing.

 

Research

Research projects

Publications

Links to my publications:

Increasing safety at smart elderly homes by Human fall detection from video using transfer Learning approaches

Modeling turbocharger failures using Markov process for predictive maintenance

 

A selection from Stockholm University publication database

  • AID4HAI: Automatic Idea Detection for Healthcare-Associated Infections from Twitter, A Framework based on Active Learning and Transfer Learning

    2023. Zahra Kharazian (et al.). Advances in Intelligent Data Analysis XXI, 195-207

    Conference

    This research is an interdisciplinary work between data scientists, innovation management researchers and experts from Swedish academia and a hygiene and health company. Based on this collaboration, we have developed a novel package for automatic idea detection with the motivation of controlling and preventing healthcare-associated infections (HAI). The principal idea of this study is to use machine learning methods to extract informative ideas from social media to assist healthcare professionals in reducing the rate of HAI. Therefore, the proposed package offers a corpus of data collected from Twitter, associated expert-created labels, and software implementation of an annotation framework based on the Active Learning paradigm. We employed Transfer Learning and built a two-step deep neural network model that incrementally extracts the semantic representation of the collected text data using the BERTweet language model in the first step and classifies these representations as informative or non-informative using a multi-layer perceptron (MLP) in the second step. The package is called AID4HAI (Automatic Idea Detection for controlling and preventing Healthcare-Associated Infections) and is made fully available (software code and the collected data) through a public GitHub repository. We believe that sharing our ideas and releasing these ready-to-use tools contributes to the development of the field and inspires future research.

    Read more about AID4HAI
  • AID4HAI: Automatic Idea Detection for Healthcare-Associated Infections from Twitter, A Framework based on Active Learning and Transfer Learning

    2023. Zahra Kharazian (et al.). 35th Annual Workshop of the Swedish Artificial Intelligence Society SAIS 2023

    Conference

    This study is a collaboration between data scientists, innovation management researchers from academia, and experts from a hygiene and health company. The study aims to develop an automatic idea detection package to control and prevent healthcare-associated infections (HAI) by extracting informative ideas from social media using Active Learning and Transfer Learning. The proposed package includes a dataset collected from Twitter, expert-created labels, and an annotation framework. Transfer Learning has been used to build a two-step deep neural network model that gradually extracts the semantic representation of the text data using the BERTweet language model in the first step. In the second step, the model classifies the extracted representations as informative or non-informative using a multi-layer perceptron (MLP). The package is named AID4HAI (Automatic Idea Detection for controlling and preventing Healthcare-Associated Infections) and is publicly available on GitHub.

    Read more about AID4HAI
  • Bridging the Gap: A Comparative Analysis of Regressive Remaining Useful Life Prediction and Survival Analysis Methods for Predictive Maintenance

    2023. Mahmoud Rahat (et al.). Vol. 4 No. 1 (2023): Proceedings of the Asia Pacific Conference of the PHM Society 2023

    Conference

    Regressive Remaining Useful Life Prediction and Survival Analysis are two lines of research with similar goals but different origins; one from engineering and the other from survival study in clinical research. Although the two research paths share a common objective of predicting the time to an event, researchers from each path typically do not compare their methods with methods from the other direction. Given the mentioned gap, we propose a framework to compare methods from the two lines of research using run-to-failure datasets. Then by utilizing the proposed framework, we compare six models incorporating three widely recognized degradation models along with two learning algorithms. The first dataset used in this study is C-MAPSS which includes simulation data from aircraft turbofan engines. The second dataset is real-world data from streamed condition monitoring of turbocharger devices installed on a fleet of Volvo trucks.

    Read more about Bridging the Gap: A Comparative Analysis of Regressive Remaining Useful Life Prediction and Survival Analysis Methods for Predictive Maintenance
  • CoPAL: Conformal Prediction for Active Learning with Application to Remaining Useful Life Estimation in Predictive Maintenance

    2024. Zahra Kharazian (et al.). Proceedings of Machine Learning Research, 195-217

    Conference

    Active learning has received considerable attention as an approach to obtain high predictive performance while minimizing the labeling effort. A central component of the active learning framework concerns the selection of objects for labeling, which are used for iteratively updating the underlying model. In this work, an algorithm called CoPAL (Conformal Prediction for Active Learning) is proposed, which makes the selection of objects within active learning based on the uncertainty as quantified by conformal prediction. The efficacy of CoPAL is investigated by considering the task of estimating the remaining useful life (RUL) of assets in the domain of predictive maintenance (PdM). Experimental results are presented, encompassing diverse setups, including different models, sample selection criteria, conformal predictors, and datasets, using root mean squared error (RMSE) as the primary evaluation metric while also reporting prediction interval sizes over the iterations. The comprehensive analysis confirms the positive effect of using CoPAL for improving predictive performance.

    Read more about CoPAL
  • SHAP-Driven Explainability in Survival Analysis for Predictive Maintenance Applications

    2024. Monireh Kargar-Sharif-Abad (et al.). HAII5.0 2024 Embracing Human-Aware AI in Industry 2024

    Conference

    In the dynamic landscape of industrial operations, ensuring machines operate without interruption is crucial for maintaining optimal productivity levels. Estimating the Remaining Useful Life within Predictive Maintenance is vital for minimizing downtime, improving operational efficiency, and preventing unexpected equipment failures. Survival analysis is a beneficial approach in this context because it can handle censored data (here, industrial assets that have not experienced a failure during the study period). However, the black-box nature of survival analysis models necessitates the use of explainable AI for greater transparency and interpretability. This study evaluates three Machine Learning-based Survival Analysis models and a traditional Survival Analysis model using real-world data from SCANIA AB, which includes over 90% censored data. Results indicate that Random Survival Forest outperforms the Cox Proportional Hazards model, Gradient Boosting Survival Analysis, and the Survival Support Vector Machine. Additionally, we employ SHAP analysis to provide global and local explanations, highlighting the importance and interaction of features in our best-performing model. To overcome the limitation of applying SHAP on survival output, we utilize a surrogate model. Finally, SHAP identifies specific influential features, shedding light on their effects and interactions. This comprehensive methodology tackles the inherent opacity of machine learning-based survival analysis models, providing valuable insights into their predictive mechanisms. The findings from our SHAP analysis underscore the pivotal role of these identified features and their interactions, thereby enriching our comprehension of the factors influencing Remaining Useful Life predictions.

    Read more about SHAP-Driven Explainability in Survival Analysis for Predictive Maintenance Applications
  • SurvLoss: A New Survival Loss Function for Neural Networks to Process Censored Data

    2024. Mahmoud Rahat, Zahra Kharazian. Proceedings of the European Conference of the PHM Society 2024

    Conference

    This paper presents SurvLoss, a novel asymmetric partial loss and error calculation function for survival analysis and regression, enabling the inclusion of censored samples. An observation in a dataset for which the complete information regarding an event of interest is not available is called censored. Censored samples are ubiquitous in the industry and play a crucial role in Prognostics and Health Management (PHM) by providing a realistic representation of data, improving the accuracy of analyses, and supporting better decision-making in various industries and the healthcare sector. The proposed approach can effectively equip the conventional regression loss functions such as Mean Absolute Error (MAE), Mean Squared Error (MSE), or Root Mean Squared Error (RMSE) with the ability to process censored samples. This can impact the field hugely by providing a more accessible usage of neural network models in survival analysis. The proposed survival loss incorporates censored samples by penalizing predictions outside the censoring region and skipping them otherwise. Then, it uses weighted averaging to aggregate the loss from censored samples with the loss from event samples.

    Unlike many other methods in the field, the proposed model distinguishes itself by avoiding superficial assumptions and exclusively relies on the available information, considering the entirety of the data.

    We compared the proposed loss function with its baseline on two publicly available datasets. The first dataset, called C-MAPSS, is from NASA Turbofan Jet Engines simulation, and the second is a recently published real-world dataset from SCANIA trucks. The goal of both datasets is to predict the remaining useful life (RUL) of the machines. The experimental results show that optimization algorithms for training deep neural networks like Adam can effectively utilize the proposed loss function to calculate gradients, update the model’s weights, and reduce training and test errors. Moreover, the proposed model outperformed the baseline by taking advantage of the censored samples. The proposed loss function paves the way for the employment of advanced architectures of neural networks with bigger training sizes in survival analysis.

    Read more about SurvLoss
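
The SurvLoss abstract above describes penalizing predictions that fall outside the censoring region and then combining the censored and event losses by weighted averaging. The following is only a minimal sketch of that general idea, not the published implementation. It assumes right-censoring (the true time is at least the recorded time), an MAE-style error, and a hypothetical `censored_weight` parameter; the function name `censored_mae` is also invented for illustration. Censored samples whose prediction lies inside the censoring region contribute zero error here, which is one possible reading of "skipping".

```python
from typing import List, Sequence


def censored_mae(
    predictions: Sequence[float],
    times: Sequence[float],
    event_observed: Sequence[bool],
    censored_weight: float = 0.5,  # hypothetical weighting, not taken from the paper
) -> float:
    """MAE-style loss that also uses right-censored samples (illustrative sketch)."""
    event_errors: List[float] = []
    censored_errors: List[float] = []

    for pred, time, observed in zip(predictions, times, event_observed):
        if observed:
            # Event sample: the true time is known, so use the ordinary absolute error.
            event_errors.append(abs(pred - time))
        else:
            # Censored sample: we only know the true time is >= `time`.
            # Penalize predictions below that bound; otherwise the error is zero.
            censored_errors.append(max(0.0, time - pred))

    def mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else 0.0

    # Fall back to a plain mean when only one kind of sample is present.
    if not censored_errors:
        return mean(event_errors)
    if not event_errors:
        return mean(censored_errors)

    # Weighted average of the two partial losses.
    return (1.0 - censored_weight) * mean(event_errors) + censored_weight * mean(censored_errors)


# One event sample and two censored samples.
example_loss = censored_mae(
    predictions=[10.0, 20.0, 5.0],
    times=[12.0, 15.0, 8.0],
    event_observed=[True, False, False],
)
```

In this example the event sample contributes an error of 2, the first censored sample contributes 0 because its prediction is beyond the censoring time, and the second contributes 3 because its prediction is too early.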

Show all publications by Zahra Kharazian at Stockholm University
