Digitala Vetenskapliga Arkivet

Hierarchical Methods for Self-Monitoring Systems: Theory and Application
Halmstad University (Högskolan i Halmstad), School of Information Technology. ORCID iD: 0000-0001-5395-5482
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Self-monitoring solutions first appeared to avoid catastrophic breakdowns in safety-critical mechanisms. The design behind these solutions relied heavily on the physical knowledge of the mechanism and its fault. They usually involved installing specialized sensors to monitor the state of the mechanism and statistical modeling of the recorded data. Mainly, these solutions focused on specific components of a machine and rarely considered more than one type of fault.

In our work, on the other hand, we focus on self-monitoring of complex machines, systems composed of multiple components performing heterogeneous tasks and interacting with each other: systems with many possible faults. Today, the data available to monitor these machines is vast but usually lacks the design and specificity to monitor each possible fault in the system accurately. Some faults will show distinctive symptoms in the data; some faults will not; more interestingly, there will be groups of faults with common symptoms in the recorded data.

The thesis of this manuscript is that we can exploit the similarities between faults to train machine learning models that significantly improve on the performance of self-monitoring solutions for complex systems that overlook these similarities. We encode these similarity relationships as hierarchies of faults, which we use to train hierarchical supervised models. We use both real-life problems and standard benchmarks to demonstrate the adequacy of our approach on tasks such as fault diagnosis and fault prediction.

We also demonstrate that models trained on different hierarchies result in significantly different performances. We analyze what makes a good hierarchy and what are the best practices to develop methods to extract hierarchies of classes from the data. We advance the state-of-the-art by defining the concept of heterogeneity of decision boundaries and studying how it affects the performance of different class decompositions. 

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2022, p. 66
Series
Halmstad University Dissertations ; 93
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:hh:diva-48138
ISBN: 978-91-88749-98-7 (print)
ISBN: 978-91-88749-97-0 (digital)
OAI: oai:DiVA.org:hh-48138
DiVA, id: diva2:1698371
Public defence
2022-10-14, Wigforssalen, Hus J (Visionen), Kristian IV:s väg 3, Halmstad, 10:00 (English)
Available from: 2022-09-23 Created: 2022-09-23 Last updated: 2025-10-01 Bibliographically approved
List of papers
1. Filtering Misleading Repair Log Labels to Improve Predictive Maintenance Models
2022 (English). In: Proceedings of the 7th European Conference of the Prognostics and Health Management Society 2022 / [ed] Phuc Do; Gabriel Michau; Cordelia Ezhilarasu, State College, PA: PHM Society, 2022, Vol. 7 (1), p. 110-117. Conference paper, Published paper (Refereed)
Abstract [en]

One of the main challenges for predictive maintenance in real applications is the quality of the data, especially the labels. In this paper, we propose a methodology to filter out the misleading labels that harm the performance of Machine Learning models. Ideally, predictive maintenance would be based on the information of when a fault has occurred in a machine and what specific type of fault it was. Then, we could train machine learning models to identify the symptoms of such fault before it leads to a breakdown. However, in many industrial applications, this information is not available. Instead, we approximate it using a log of component replacements, usually coming from the sales or maintenance departments. The repair history provides reliable labels for fault prediction models only if the replaced component was indeed faulty, with symptoms captured by collected data, and it was going to lead to a breakdown.

However, very often, at least for complex equipment, this assumption does not hold. Models trained using unreliable labels will then, necessarily, fail. We demonstrate that filtering misleading labels leads to improved results. Our central claim is that the same fault, happening several times, should have similar symptoms in the data; thus, we can train a model to predict them. On the contrary, replacements of the same component that do not exhibit similar symptoms will be confusing and harm the ML models. Therefore, we aim to filter the maintenance operations, keeping only those that can be used to predict each other. Suppose we can train a successful model using the data before a component replacement to predict another component replacement. In that case, those maintenance operations must be motivated by the same, or a very similar, type of fault.
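The filtering idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the paper: the function names, the 1-nearest-neighbour stand-in model, the `threshold` value, and the toy data are all hypothetical. Two replacements are kept only if a model built from the data before one of them detects the other, in both directions.

```python
import math
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def detection_rate(train_fault: List[Vector],
                   healthy: List[Vector],
                   test_fault: List[Vector]) -> float:
    """Fraction of test windows that a 1-NN model labels as faulty.

    The 1-NN model is an assumed stand-in for whatever learner is used.
    """
    hits = 0
    for x in test_fault:
        d_fault = min(math.dist(x, f) for f in train_fault)
        d_healthy = min(math.dist(x, h) for h in healthy)
        hits += d_fault < d_healthy
    return hits / len(test_fault)


def filter_replacements(events: Dict[str, List[Vector]],
                        healthy: List[Vector],
                        threshold: float = 0.8) -> Set[str]:
    """Keep replacements that predict at least one other replacement,
    and are predicted by it in return (threshold is an assumption)."""
    kept: Set[str] = set()
    names = list(events)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if (detection_rate(events[a], healthy, events[b]) >= threshold
                    and detection_rate(events[b], healthy, events[a]) >= threshold):
                kept.update((a, b))
    return kept


# Toy data: sensor windows recorded before each replacement (hypothetical).
healthy = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
events = {
    "r1": [(1.0, 1.0), (0.9, 1.1)],
    "r2": [(1.1, 0.9), (1.0, 1.2)],
    "r3": [(0.05, 0.05), (0.1, 0.1)],  # replacement without visible symptoms
}
kept = filter_replacements(events, healthy)
```

In this toy setting `r1` and `r2` share symptoms and are kept, while `r3` looks like healthy operation and is filtered out as a misleading label.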

We test this approach on a real scenario using data from a fleet of sterilizers deployed in hospitals. The data includes sensor readings from the machines describing their operations and the service logs indicating the replacement of components when the manufacturing company performs the service. Since sterilizers are complex machines consisting of many components and systems interacting with each other, there is the possibility of faults happening simultaneously.

Place, publisher, year, edition, pages
State College, PA: PHM Society, 2022
Series
Proceedings of the European Conference of the Prognostics and Health Management Society (PHME), E-ISSN 2325-016X
Keywords
Predictive maintenance, misleading labels, Machine Learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-48120 (URN)
10.36001/phme.2022.v7i1.3360 (DOI)
978-1-936263-36-3 (ISBN)
Conference
7th European Conference of the Prognostics and Health Management (PHM) Society, Turin, Italy, July 6-8, 2022
Funder
Knowledge Foundation (KK-stiftelsen); Swedish Research Council (Vetenskapsrådet)
Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2025-10-01 Bibliographically approved
2. Hierarchical Multi-class Classification for Fault Diagnosis
2021 (English). In: Proceedings of the 31st European Safety and Reliability Conference (ESREL 2021) / [ed] Bruno Castanier; Marko Cepin; David Bigaud; Christophe Berenguer, Singapore: Research Publishing Services, 2021, p. 2457-2464. Conference paper, Published paper (Refereed)
Abstract [en]

This paper formulates the problem of predictive maintenance for complex systems as a hierarchical multi-class classification task. This formulation is useful for equipment with multiple sub-systems and components performing heterogeneous tasks. Often, the data available describes the whole system's operation and is not ideal for accurate condition monitoring. In this setup, specialized predictive models analyzing one component at a time rarely perform much better than random. However, using machine learning and hierarchical approaches, we can still exploit the data to build a fault isolation system that provides measurable benefits for technicians in the field. We propose a method for creating a taxonomy of components to train hierarchical classifiers that aim to identify the faulty component. The output of this model is a structured set of predictions with different probabilities for each component. In this setup, traditional machine learning metrics fail to capture the relationship between the performance of the models and their usefulness in the field. We introduce a new metric to evaluate our approach's benefits; it measures the number of tests a technician needs to perform before pinpointing the faulty component. Using a dataset from a real-case problem coming from the automotive industry, we demonstrate how traditional machine learning performance metrics, like accuracy, fail to capture practical benefits. Our proposed hierarchical approach succeeds in exploiting the information in the data and outperforms non-hierarchical machine learning solutions. In addition, we can identify the weakest link of our fault isolation model, allowing us to improve it efficiently.

Place, publisher, year, edition, pages
Singapore: Research Publishing Services, 2021
Keywords
Fault diagnosis, Multi-class classification, Hierarchical classification, Automotive industry, Integral fault diagnosis, Structure prediction
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-46343 (URN)
10.3850/978-981-18-2016-8_524-cd (DOI)
978-981-18-2016-8 (ISBN)
Conference
European Safety and Reliability Conference (ESREL 2021), 19-23 September, 2021
Available from: 2022-02-14 Created: 2022-02-14 Last updated: 2025-10-01 Bibliographically approved
3. Hierarchical multi-fault prognostics for complex systems
(English). Manuscript (preprint) (Other academic)
Abstract [en]

The field of predictive maintenance for complex machinery with multiple possible faults is an important but largely unexplored area. In general, one assumes, often implicitly, the existence of monitoring data specific enough to capture every possible fault independently from all the others.

In this paper, we focus on the problem of predicting time-to-failure, or remaining useful life, in situations where the above assumption does not hold. Specifically, we ask what happens when the data is not good enough to uniquely predict every fault and, more importantly, when different faults share the same symptoms in the recorded data.

We demonstrate that prognostics approaches learning independent models for each fault are inadequate. In particular, in the presence of faults that produce similar failure patterns, they produce false alarms disproportionately often or miss the majority of failures. 

We propose the HMP framework (Hierarchical Multi-fault Prognosis) to solve this problem by extracting a hierarchy of faults based on the similarity of the data they produce. At each node of the hierarchy, we train a regression model to predict the time-to-failure for any of the faults contained in this node. The intuition is that while it might be impossible to predict individual time-to-failure in the presence of similar faults, a model trained on aggregated data can still provide useful information. We demonstrate through experiments the validity of our approach.
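The per-node training step can be illustrated with a small sketch. It is not the HMP implementation: the hierarchy is assumed to be given (the similarity-based extraction step is omitted), the regression model is a one-feature least-squares line chosen for brevity, and all names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (health indicator, time-to-failure)
Line = Tuple[float, float]    # (slope, intercept)


def fit_line(points: List[Sample]) -> Line:
    """Ordinary least squares for a single feature (closed form)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def train_node_models(data: Dict[str, List[Sample]],
                      hierarchy: Dict[str, List[str]]) -> Dict[str, Line]:
    """Train one regressor per node on the pooled data of its faults."""
    models = {}
    for node, faults in hierarchy.items():
        pooled = [sample for fault in faults for sample in data[fault]]
        models[node] = fit_line(pooled)
    return models


def predict(model: Line, x: float) -> float:
    slope, intercept = model
    return slope * x + intercept


# Toy data: faults A and B produce the same degradation pattern.
data = {
    "A": [(0.0, 10.0), (1.0, 8.0), (2.0, 6.0)],
    "B": [(0.0, 10.0), (2.0, 6.0), (4.0, 2.0)],
    "C": [(0.0, 20.0), (1.0, 15.0)],
}
# Assumed hierarchy: each node lists the faults it contains.
hierarchy = {"root": ["A", "B", "C"], "A+B": ["A", "B"], "C": ["C"]}

models = train_node_models(data, hierarchy)
ttf_estimate = predict(models["A+B"], 3.0)
```

The node `A+B` answers the question "how long until either A or B fails", which remains predictable even when the two faults cannot be told apart from the data.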

Keywords
Predictive Maintenance, Complex System, Multi-fault Prognosis
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-48118 (URN)
Funder
Knowledge Foundation (KK-stiftelsen); Swedish Research Council (Vetenskapsrådet)
Note

As manuscript in thesis

Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2025-10-01 Bibliographically approved
4. Pitfalls of Assessing Extracted Hierarchies for Multi-Class Classification
(English). Manuscript (preprint) (Other academic)
Abstract [en]

Using hierarchies of classes is one of the standard methods to solve multi-class classification problems. In the literature, selecting the right hierarchy is considered to play a key role in improving classification performance. Although different methods have been proposed, there is still a lack of understanding of what makes a hierarchy good and what makes a method to extract hierarchies perform better or worse.

To this effect, we analyze and compare some of the most popular approaches to extracting hierarchies. We identify some common pitfalls that may lead practitioners to make misleading conclusions about their methods. To address some of these problems, we demonstrate that using random hierarchies is an appropriate benchmark to assess how the hierarchy's quality affects the classification performance.
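A random-hierarchy baseline of the kind described above could be generated as follows. This is a minimal sketch assuming binary hierarchies represented as nested tuples; the representation, the function names, and the number of sampled hierarchies are illustrative choices, not details taken from the manuscript.

```python
import random
from typing import List, Sequence, Tuple, Union

Tree = Union[str, Tuple["Tree", "Tree"]]


def random_hierarchy(classes: Sequence[str], rng: random.Random) -> Tree:
    """Recursively split the classes into two random, non-empty groups."""
    classes = list(classes)  # copy so the caller's sequence is not shuffled
    if len(classes) == 1:
        return classes[0]
    rng.shuffle(classes)
    cut = rng.randint(1, len(classes) - 1)  # both ends inclusive
    return (random_hierarchy(classes[:cut], rng),
            random_hierarchy(classes[cut:], rng))


def leaves(tree: Tree) -> List[str]:
    """Class labels at the leaves of a hierarchy, left to right."""
    if isinstance(tree, tuple):
        return leaves(tree[0]) + leaves(tree[1])
    return [tree]


rng = random.Random(0)  # fixed seed so the baseline is reproducible
classes = ["c1", "c2", "c3", "c4", "c5"]
baseline = [random_hierarchy(classes, rng) for _ in range(10)]
```

Training the same hierarchical classifier on each tree in `baseline` gives a distribution of scores; an extracted hierarchy is only informative if its score stands out from that distribution.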

In particular, we show how the hierarchy's quality can become irrelevant depending on the experimental setup: when using powerful enough classifiers, the final performance is not affected by the quality of the hierarchy. We also show how comparing the effect of the hierarchies against non-hierarchical approaches might incorrectly indicate their superiority.

Our results confirm that datasets with a high number of classes generally present complex structures in how these classes relate to each other. In these datasets, the right hierarchy can dramatically improve classification performance.

Keywords
Hierarchical Multi-class Classification, Multi-class Classification, Class Hierarchies
National Category
Computer Sciences
Identifiers
urn:nbn:se:hh:diva-48117 (URN)
Funder
Knowledge Foundation (KK-stiftelsen)
Note

As manuscript in thesis

Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2025-10-01 Bibliographically approved
5. Why Is Multiclass Classification Hard?
2022 (English). In: IEEE Access, E-ISSN 2169-3536, Vol. 10, p. 80448-80462. Article in journal (Refereed). Published
Abstract [en]

In classification problems, as the number of classes increases, correctly classifying a new instance into one of them is assumed to be more challenging than making the same decision in the presence of fewer classes. The essence of the problem is that using the learning algorithm on each decision boundary individually is better than using the same learning algorithm on several of them simultaneously. However, why and when this happens is still not well understood. This work's main contribution is to introduce the concept of heterogeneity of decision boundaries as an explanation of this phenomenon. Based on the definition of heterogeneity of decision boundaries, we analyze and explain the differences in the performance of state-of-the-art approaches to multi-class classification. We demonstrate that as the heterogeneity increases, the performance of all approaches, except one-vs-one, decreases. We show that by correctly encoding the knowledge of the heterogeneity of decision boundaries in a decomposition of the multi-class problem, we can obtain better results than state-of-the-art decompositions. The benefits can be an increase in classification performance or a decrease in the time it takes to train and evaluate the models. We first provide intuitions and illustrate the effects of the heterogeneity of decision boundaries using synthetic datasets and a simplistic classifier. Then, we demonstrate how a real dataset exhibits these same principles, also under realistic learning algorithms. In this setting, we devise a method to quantify the heterogeneity of different decision boundaries, and use it to decompose the multi-class problem. The results show significant improvements over state-of-the-art decompositions that do not take the heterogeneity of decision boundaries into account. © 2013 IEEE.
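To make the notion of a decomposition concrete, the sketch below enumerates the binary sub-problems of the two standard decompositions mentioned in this context, one-vs-rest and one-vs-one. It does not implement the heterogeneity measure or the decomposition proposed in the article, which the abstract does not specify; the names and the four-class example are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List, Sequence, Tuple

# A binary sub-problem: (classes labelled positive, classes labelled negative).
BinaryTask = Tuple[FrozenSet[str], FrozenSet[str]]


def one_vs_rest(classes: Sequence[str]) -> List[BinaryTask]:
    """One sub-problem per class; each merges several boundaries."""
    all_classes = frozenset(classes)
    return [(frozenset([c]), all_classes - {c}) for c in classes]


def one_vs_one(classes: Sequence[str]) -> List[BinaryTask]:
    """One sub-problem per pair; each boundary is learned individually."""
    return [(frozenset([a]), frozenset([b])) for a, b in combinations(classes, 2)]


classes = ["a", "b", "c", "d"]
ovr = one_vs_rest(classes)  # K sub-problems
ovo = one_vs_one(classes)   # K * (K - 1) / 2 sub-problems
```

One-vs-one isolates every pairwise boundary at the cost of more models to train and evaluate, which is the trade-off between classification performance and training time that the abstract refers to.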

Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2022
Keywords
Classification complexity, heterogeneity of decision boundaries, multi-class classification
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:hh:diva-48116 (URN)
10.1109/access.2022.3192514 (DOI)
000838670500001 ()
2-s2.0-85135735284 (Scopus ID)
Funder
Knowledge Foundation (KK-stiftelsen)
Available from: 2022-09-22 Created: 2022-09-22 Last updated: 2025-10-01 Bibliographically approved

Author
Del Moral Pastor, Pablo José
