Digitala Vetenskapliga Arkivet

Hierarchical Methods for Self-Monitoring Systems: Theory and Application
Halmstad University, School of Information Technology. ORCID iD: 0000-0001-5395-5482
2022 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Self-monitoring solutions first appeared to avoid catastrophic breakdowns in safety-critical mechanisms. The design behind these solutions relied heavily on the physical knowledge of the mechanism and its fault. They usually involved installing specialized sensors to monitor the state of the mechanism and statistical modeling of the recorded data. Mainly, these solutions focused on specific components of a machine and rarely considered more than one type of fault.

In our work, on the other hand, we focus on self-monitoring of complex machines, systems composed of multiple components performing heterogeneous tasks and interacting with each other: systems with many possible faults. Today, the data available to monitor these machines is vast but usually lacks the design and specificity to monitor each possible fault in the system accurately. Some faults will show distinctive symptoms in the data; some faults will not; more interestingly, there will be groups of faults with common symptoms in the recorded data.

The thesis of this manuscript is that we can exploit the similarities between faults to train machine learning models that significantly outperform self-monitoring solutions for complex systems that overlook these similarities. We encode these similarity relationships as hierarchies of faults, which we use to train hierarchical supervised models. We use both real-life problems and standard benchmarks to demonstrate the adequacy of our approach on tasks such as fault diagnosis and fault prediction.

We also demonstrate that models trained on different hierarchies perform significantly differently. We analyze what makes a good hierarchy and which practices are best for developing methods that extract hierarchies of classes from the data. We advance the state of the art by defining the concept of heterogeneity of decision boundaries and studying how it affects the performance of different class decompositions.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2022, p. 66
Series
Halmstad University Dissertations ; 93
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-48138
ISBN: 978-91-88749-98-7 (print)
ISBN: 978-91-88749-97-0 (electronic)
OAI: oai:DiVA.org:hh-48138
DiVA, id: diva2:1698371
Public defence
2022-10-14, Wigforssalen, Hus J (Visionen), Kristian IV:s väg 3, Halmstad, 10:00 (English)
Opponent
Supervisors
Available from: 2022-09-23 Created: 2022-09-23 Last updated: 2023-03-07 Bibliographically approved
List of papers
1. Filtering Misleading Repair Log Labels to Improve Predictive Maintenance Models
2022 (English)
In: Proceedings of the 7th European Conference of the Prognostics and Health Management Society 2022 / [ed] Phuc Do; Gabriel Michau; Cordelia Ezhilarasu, State College, PA: PHM Society, 2022, Vol. 7 (1), p. 110-117
Conference paper, Published paper (Refereed)
Abstract [en]

One of the main challenges for predictive maintenance in real applications is the quality of the data, especially the labels. In this paper, we propose a methodology to filter out the misleading labels that harm the performance of machine learning models. Ideally, predictive maintenance would be based on knowing when a fault occurred in a machine and what specific type of fault it was. Then, we could train machine learning models to identify the symptoms of such a fault before it leads to a breakdown. However, in many industrial applications, this information is not available. Instead, we approximate it using a log of component replacements, usually coming from the sales or maintenance departments. The repair history provides reliable labels for fault prediction models only if the replaced component was indeed faulty, its symptoms were captured by the collected data, and it was going to lead to a breakdown.

However, very often, at least for complex equipment, this assumption does not hold. Models trained using unreliable labels will then, necessarily, fail. We demonstrate that filtering misleading labels leads to improved results. Our central claim is that the same fault, happening several times, should have similar symptoms in the data; thus, we can train a model to predict them. On the contrary, replacements of the same component that do not exhibit similar symptoms will be confusing and harm the ML models. Therefore, we aim to filter the maintenance operations, keeping only those that can be used to predict each other. Suppose we can train a successful model using the data before a component replacement to predict another component replacement. In that case, those maintenance operations must be motivated by the same, or a very similar, type of fault.

We test this approach on a real scenario using data from a fleet of sterilizers deployed in hospitals. The data includes sensor readings from the machines describing their operations and the service logs indicating the replacement of components when the manufacturing company performs the service. Since sterilizers are complex machines consisting of many components and systems interacting with each other, there is the possibility of faults happening simultaneously.
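The filtering idea in this abstract (keep only maintenance operations that can be used to predict each other) could be sketched as follows. This is a minimal illustration, not the paper's procedure: `filter_replacements`, `cross_score`, the threshold, and the toy events are all hypothetical, and `toy_cross_score` is only a stand-in for training a model on the data before one replacement and evaluating it on the data before another.

```python
def filter_replacements(events, cross_score, threshold):
    """Keep events that are mutually predictive with at least one other event.

    cross_score(a, b) is assumed to return how well a model built from the
    data before replacement `a` predicts replacement `b` (higher is better).
    """
    kept = []
    for a in events:
        for b in events:
            if a is b:
                continue
            if cross_score(a, b) >= threshold and cross_score(b, a) >= threshold:
                kept.append(a)
                break  # one mutually predictive partner is enough
    return kept


def toy_cross_score(a, b):
    """Placeholder score: closer pre-replacement signals score higher."""
    return 1.0 / (1.0 + abs(a["signal"] - b["signal"]))


# Hypothetical replacements of the same component, one summary value each.
events = [
    {"id": 1, "signal": 0.9},
    {"id": 2, "signal": 1.0},
    {"id": 3, "signal": 5.0},  # no similar symptoms: treated as misleading
]
kept = filter_replacements(events, toy_cross_score, threshold=0.8)
print([e["id"] for e in kept])
```

In this toy data, events 1 and 2 score about 0.91 against each other and are kept, while event 3 scores at most 0.2 against any other event and is dropped.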

Place, publisher, year, edition, pages
State College, PA: PHM Society, 2022
Series
Proceedings of the European Conference of the Prognostics and Health Management Society (PHME), E-ISSN 2325-016X
Keywords
Predictive maintenance, misleading labels, Machine Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-48120 (URN)
10.36001/phme.2022.v7i1.3360 (DOI)
978-1-936263-36-3 (ISBN)
Conference
7th European Conference of the Prognostics and Health Management (PHM) Society, Turin, Italy, July 6-8, 2022
Funder
Knowledge Foundation; Swedish Research Council
Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2023-03-21 Bibliographically approved
2. Hierarchical Multi-class Classification for Fault Diagnosis
2021 (English)
In: Proceedings of the 31st European Safety and Reliability Conference (ESREL 2021) / [ed] Bruno Castanier; Marko Cepin; David Bigaud; Christophe Berenguer, Singapore: Research Publishing Services, 2021, p. 2457-2464
Conference paper, Published paper (Refereed)
Abstract [en]

This paper formulates the problem of predictive maintenance for complex systems as a hierarchical multi-class classification task. This formulation is useful for equipment with multiple sub-systems and components performing heterogeneous tasks. Often, the data available describes the whole system's operation and is not ideal for accurate condition monitoring. In this setup, specialized predictive models analyzing one component at a time rarely perform much better than random. However, using machine learning and hierarchical approaches, we can still exploit the data to build a fault isolation system that provides measurable benefits for technicians in the field. We propose a method for creating a taxonomy of components to train hierarchical classifiers that aim to identify the faulty component. The output of this model is a structured set of predictions with different probabilities for each component. In this setup, traditional machine learning metrics fail to capture the relationship between the performance of the models and their usefulness in the field. We introduce a new metric to evaluate our approach's benefits; it measures the number of tests a technician needs to perform before pinpointing the faulty component. Using a dataset from a real-case problem in the automotive industry, we demonstrate how traditional machine learning performance metrics, like accuracy, fail to capture practical benefits. Our proposed hierarchical approach succeeds in exploiting the information in the data and outperforms non-hierarchical machine learning solutions. In addition, we can identify the weakest link of our fault isolation model, allowing us to improve it efficiently.

Place, publisher, year, edition, pages
Singapore: Research Publishing Services, 2021
Keywords
Fault diagnosis, Multi-class classification, Hierarchical classification, Automotive industry, Integral fault diagnosis, Structure prediction
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-46343 (URN)
10.3850/978-981-18-2016-8_524-cd (DOI)
978-981-18-2016-8 (ISBN)
Conference
European Safety and Reliability Conference (ESREL 2021), 19-23 September, 2021
Available from: 2022-02-14 Created: 2022-02-14 Last updated: 2022-11-16 Bibliographically approved
3. Hierarchical multi-fault prognostics for complex systems
(English)
Manuscript (preprint) (Other academic)
Abstract [en]

The field of predictive maintenance for complex machinery with multiple possible faults is an important but largely unexplored area. In general, one assumes, often implicitly, the existence of monitoring data specific enough to capture every possible fault independently from all the others.

In this paper, we focus on the problem of predicting time-to-failure, or remaining useful life, in situations where the above assumption does not hold. Specifically, we ask what happens when the data is not good enough to uniquely predict every fault and, more importantly, what happens when different faults share the same symptoms in the recorded data.

We demonstrate that prognostics approaches learning independent models for each fault are inadequate. In particular, in the presence of faults that produce similar failure patterns, they produce false alarms disproportionately often or miss the majority of failures. 

We propose the HMP framework (Hierarchical Multi-fault Prognosis) to solve this problem by extracting a hierarchy of faults based on the similarity of the data they produce. At each node of the hierarchy, we train a regression model to predict the time-to-failure for any of the faults contained in this node. The intuition is that while it might be impossible to predict individual time-to-failure in the presence of similar faults, a model trained on aggregated data can still provide useful information. We demonstrate through experiments the validity of our approach.
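The core step of the framework described above, one regression model per node trained on the pooled data of the faults under that node, could look roughly like the sketch below. The hierarchy is written by hand here, whereas the paper extracts it from data similarity. The single-feature least-squares model, the function names, and the toy samples are all assumptions for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def train_hierarchy(nodes, samples):
    """Fit one time-to-failure model per node on the data of its faults.

    nodes: node name -> set of fault names grouped under that node.
    samples: list of (fault, feature, time_to_failure) tuples.
    """
    models = {}
    for name, faults in nodes.items():
        pooled = [(x, y) for fault, x, y in samples if fault in faults]
        xs, ys = zip(*pooled)
        models[name] = fit_line(xs, ys)
    return models


# Hypothetical hierarchy: faults A and B share symptoms, C is distinct.
nodes = {"root": {"A", "B", "C"}, "A+B": {"A", "B"}, "C": {"C"}}
samples = [
    ("A", 1.0, 9.0), ("A", 3.0, 7.0),
    ("B", 2.0, 8.0), ("B", 4.0, 6.0),
    ("C", 1.0, 20.0), ("C", 2.0, 18.0),
]
models = train_hierarchy(nodes, samples)
slope, intercept = models["A+B"]
prediction = slope * 5.0 + intercept  # time-to-failure at feature value 5.0
print(prediction)
```

The "A+B" node cannot say which of the two faults is coming, but its pooled model still estimates when one of them will occur, which matches the intuition stated in the abstract.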

Keywords
Predictive Maintenance, Complex System, Multi-fault Prognosis
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-48118 (URN)
Funder
Knowledge Foundation; Swedish Research Council
Note

As manuscript in thesis

Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2022-09-23 Bibliographically approved
4. Pitfalls of Assessing Extracted Hierarchies for Multi-Class Classification
(English)
Manuscript (preprint) (Other academic)
Abstract [en]

Using hierarchies of classes is one of the standard methods to solve multi-class classification problems. In the literature, selecting the right hierarchy is considered to play a key role in improving classification performance. Although different methods have been proposed, there is still a lack of understanding of what makes a hierarchy good and what makes a method to extract hierarchies perform better or worse.

To this end, we analyze and compare some of the most popular approaches to extracting hierarchies. We identify some common pitfalls that may lead practitioners to draw misleading conclusions about their methods. To address some of these problems, we demonstrate that using random hierarchies is an appropriate benchmark to assess how the hierarchy's quality affects the classification performance.

In particular, we show how the hierarchy's quality can become irrelevant depending on the experimental setup: when using powerful enough classifiers, the final performance is not affected by the quality of the hierarchy. We also show how comparing the effect of the hierarchies against non-hierarchical approaches might incorrectly indicate their superiority.

Our results confirm that datasets with a high number of classes generally present complex structures in how these classes relate to each other. In these datasets, the right hierarchy can dramatically improve classification performance.
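A random-hierarchy baseline of the kind this abstract recommends could be generated as in the sketch below. The abstract does not say how the random hierarchies are drawn, so the recursive random binary split, the function names, and the class labels are assumptions for illustration.

```python
import random


def random_hierarchy(classes, rng):
    """Recursively split a list of classes into a random binary tree.

    Leaves are single class labels; internal nodes are (left, right) tuples.
    """
    if len(classes) == 1:
        return classes[0]
    shuffled = classes[:]
    rng.shuffle(shuffled)
    cut = rng.randint(1, len(shuffled) - 1)  # both sides stay non-empty
    return (random_hierarchy(shuffled[:cut], rng),
            random_hierarchy(shuffled[cut:], rng))


def leaves(tree):
    """Collect the class labels at the leaves of a hierarchy."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


labels = ["a", "b", "c", "d", "e"]
rng = random.Random(0)  # fixed seed so the baseline is reproducible
baseline = [random_hierarchy(labels, rng) for _ in range(10)]
print(baseline[0])
```

Training the same classifier on each tree in `baseline` gives a distribution of scores. An extracted hierarchy is informative only if it clearly beats that distribution.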

Keywords
Hierarchical Multi-class Classification, Multi-class Classification, Class Hierarchies
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-48117 (URN)
Funder
Knowledge Foundation
Note

As manuscript in thesis

Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2023-12-13 Bibliographically approved
5. Why Is Multiclass Classification Hard?
2022 (English)
In: IEEE Access, E-ISSN 2169-3536, Vol. 10, p. 80448-80462
Article in journal (Refereed) Published
Abstract [en]

In classification problems, as the number of classes increases, correctly classifying a new instance into one of them is assumed to be more challenging than making the same decision in the presence of fewer classes. The essence of the problem is that using the learning algorithm on each decision boundary individually is better than using the same learning algorithm on several of them simultaneously. However, why and when this happens is still not well understood. This work's main contribution is to introduce the concept of heterogeneity of decision boundaries as an explanation of this phenomenon. Based on the definition of heterogeneity of decision boundaries, we analyze and explain the differences in the performance of state-of-the-art approaches to multi-class classification. We demonstrate that as the heterogeneity increases, the performance of all approaches except one-vs-one decreases. We show that by correctly encoding the knowledge of the heterogeneity of decision boundaries in a decomposition of the multi-class problem, we can obtain better results than state-of-the-art decompositions. The benefits can be an increase in classification performance or a decrease in the time it takes to train and evaluate the models. We first provide intuitions and illustrate the effects of the heterogeneity of decision boundaries using synthetic datasets and a simplistic classifier. Then, we demonstrate how a real dataset exhibits these same principles, also under realistic learning algorithms. In this setting, we devise a method to quantify the heterogeneity of different decision boundaries, and use it to decompose the multi-class problem. The results show significant improvements over state-of-the-art decompositions that do not take the heterogeneity of decision boundaries into account. © 2013 IEEE.
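As background for the decompositions this abstract compares, the sketch below lists the binary subproblems produced by two standard schemes, one-vs-one and one-vs-rest. It does not compute the paper's heterogeneity measure, and the function names and class labels are hypothetical.

```python
from itertools import combinations


def one_vs_one(classes):
    """One binary problem per pair of classes: K * (K - 1) / 2 in total."""
    return [({a}, {b}) for a, b in combinations(classes, 2)]


def one_vs_rest(classes):
    """One binary problem per class against all the others: K in total."""
    return [({c}, set(classes) - {c}) for c in classes]


labels = ["a", "b", "c", "d"]
print(len(one_vs_one(labels)), len(one_vs_rest(labels)))
```

Each one-vs-one model faces a single pairwise boundary, whereas each one-vs-rest model has to separate one class from several others at once. This is the setting in which the abstract argues that learning several boundaries simultaneously becomes harder.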

Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2022
Keywords
Classification complexity, heterogeneity of decision boundaries, multi-class classification
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:hh:diva-48116 (URN)10.1109/access.2022.3192514 (DOI)000838670500001 ()2-s2.0-85135735284 (Scopus ID)
Funder
Knowledge Foundation
Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2022-09-23 Bibliographically approved

Open Access in DiVA

fulltext (3456 kB), 283 downloads
File information
File name: FULLTEXT01.pdf
File size: 3456 kB
Checksum (SHA-512):
ecf6b5d1ddc2a8605b80162a45408f4acd6eaf2ca1d3132ef5a822d8377c159b4cc26ef586f89dd63916c2b57852d6a73a0ec99a823769b981877a4bcf5234bc
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Del Moral Pastor, Pablo José
By organisation
School of Information Technology
Computer Sciences
