Independent thesis Advanced level (degree of Master (One Year)), 12 credits / 18 HE credits
Background: SCADA (Supervisory Control and Data Acquisition) systems are fundamental to the operation and stability of critical power infrastructure, such as electrical grids. However, fault localization in SCADA systems is challenging because the systems are heterogeneous, combining tightly integrated hardware and software components, and because they generate large volumes of data during operation. The interplay between diverse system elements, including sensors, communication protocols, control algorithms, and monitoring software, further complicates fault identification and resolution. Traditional fault localization methods, which rely on manual log analysis, code reviews, and bug triaging, are often inefficient and struggle to scale with the increasing volume and complexity of SCADA environments. While Artificial Intelligence (AI)-based approaches have demonstrated potential in other domains, their application in the power industry, particularly in SCADA systems, remains underexplored.
Objectives: This thesis aims to design, implement, and evaluate an AI-based fault localization approach tailored for SCADA systems, focusing on improving fault localization and reducing the number of bugs that propagate to production environments. The key innovation lies in guiding pre-trained AI models with domain-specific knowledge derived from SCADA-specific data sources, such as industry-specific bug reports, system logs, and work item histories.
Methods: The research follows the Design Science Research Process (DSRP). It begins with problem identification through literature review and expert consultation to understand the limitations of traditional methods and identify opportunities for AI. In the solution design phase, pre-trained AI models are adapted to process SCADA-specific data using techniques such as Retrieval-Augmented Generation (RAG). By integrating historical and operational knowledge, the models are equipped to generate actionable insights tailored to the SCADA domain. The prototype is then empirically evaluated within a SCADA development environment, focusing on metrics such as accuracy, efficiency, and feedback from industry professionals.
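The retrieval step of a RAG pipeline, as described in the Methods, can be illustrated with a minimal sketch. This is not the thesis's implementation: the work items, their Area Paths, and the function names are hypothetical, a simple term-frequency cosine similarity stands in for a learned embedding model, and no language model is called. The sketch only shows how historical records might be ranked against a new fault report and placed into a prompt.

```python
import math
import re
from collections import Counter

# Hypothetical historical work items; texts and Area Paths are invented for illustration.
HISTORY = [
    {"area_path": "SCADA/Communication",
     "text": "RTU polling timeout after protocol driver restart"},
    {"area_path": "SCADA/Alarms",
     "text": "Alarm list shows duplicate alarm events after failover"},
    {"area_path": "SCADA/HMI",
     "text": "Single line diagram display freezes when zooming"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, items, k=2):
    """Return up to k items most similar to the query, dropping zero-score items."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(it["text"]))), it) for it in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [it for score, it in scored[:k] if score > 0]


def build_prompt(query, retrieved):
    """Assemble the retrieved context and the new report into one prompt string."""
    context = "\n".join(f"- [{it['area_path']}] {it['text']}" for it in retrieved)
    return (
        "Use the historical work items below to suggest a likely Area Path.\n"
        f"Historical work items:\n{context}\n"
        f"New fault report: {query}\n"
    )
```

In a full pipeline the string returned by `build_prompt` would be passed to a pre-trained model; that step is omitted here because the abstract does not specify the model or its interface.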
Results: The Power-RAG prototype, designed for fault localization in SCADA systems, was evaluated across two iterations. In the first iteration, the open-source PrivateGPT achieved 100% fault localization accuracy but was slow, averaging 88 seconds per query. To address this, a custom UI was developed, achieving 95% accuracy while reducing query time to 12 seconds, compared with the 343 seconds required by traditional manual methods. The prototype efficiently provided Area Path suggestions and actionable solution insights that could lead to improved operational efficiency. Five industry professionals praised the user-friendliness, adaptability, and speed of the custom UI, while highlighting areas for improvement, including query sensitivity and robustness in handling diverse fault scenarios. These results underscore the balance achieved between speed and accuracy, making Power-RAG a usable initial prototype for SCADA fault localization.
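The reported query times can be put on a common scale with a short calculation. The sketch below uses only the three timings stated in the Results (343, 88, and 12 seconds); the function and constant names are illustrative. By this arithmetic the custom UI is roughly 28.6 times faster than the manual method and about 7.3 times faster than PrivateGPT, which corresponds to a reduction of about 96.5% in time per query relative to manual localization.

```python
# Timings in seconds per query, as reported in the Results.
MANUAL_S = 343.0
PRIVATEGPT_S = 88.0
CUSTOM_UI_S = 12.0


def speedup(baseline_s, new_s):
    """How many times faster the new method is than the baseline."""
    return baseline_s / new_s


def time_reduction_pct(baseline_s, new_s):
    """Percentage of the baseline time saved by the new method."""
    return 100.0 * (baseline_s - new_s) / baseline_s


if __name__ == "__main__":
    print(f"Custom UI vs manual:     {speedup(MANUAL_S, CUSTOM_UI_S):.1f}x")
    print(f"Custom UI vs PrivateGPT: {speedup(PRIVATEGPT_S, CUSTOM_UI_S):.1f}x")
    print(f"Time saved vs manual:    {time_reduction_pct(MANUAL_S, CUSTOM_UI_S):.1f}%")
```

These ratios concern speed only; the accuracy difference between the two iterations (100% versus 95%) has to be weighed separately.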
Conclusions: This thesis explores the application of AI-driven fault localization methods within SCADA systems, an area that remains largely underexplored. By leveraging pre-trained AI models guided by SCADA-specific knowledge, this research demonstrates how these tools can effectively process complex datasets, such as system logs and bug reports, to identify and localize faults. The results show clear improvements in fault detection efficiency, accuracy, and overall system reliability compared to traditional manual approaches.
While the findings highlight the feasibility and potential benefits of AI in enhancing fault localization workflows, it is important to acknowledge that this work represents an initial step rather than a comprehensive solution. The prototype developed in this thesis provides a foundation for further refinement and adaptation.
2025, p. 42
SCADA Systems, Fault Localization, Artificial Intelligence, Retrieval-Augmented Generation, Software Quality
PA2592 Research Methods and Master's Thesis (60 credits) in Software Engineering for Professionals