Digitala Vetenskapliga Arkivet

Nonconformity Measures and Ensemble Strategies: An Analysis of Conformal Predictor Efficiency and Validity
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences. Department of Information Technology, University of Borås.
2021 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Conformal predictors are a family of predictive models that associate with each of their predictions a measure of confidence, enabling them to provide quantitative information about their own trustworthiness. In risk-laden machine learning applications, where bad predictions may lead to economic loss, personal injury, or worse, such inherent quality control appears highly beneficial, if not required. While the foundations of conformal prediction were initially published some twenty years ago, their use and further development are still (at the time of writing this thesis) not widespread in the machine learning community, and several open questions remain regarding the proper design and use of conformal prediction systems. In this thesis, we attempt to tackle some of these questions, focusing our attention on three specific characteristics of conformal predictors. First, conformal predictors rely on so-called nonconformity functions, which are mappings from the object space onto the real line, typically based on traditional classification or regression models; here, we investigate properties of the underlying learning algorithm and characteristics of the resulting conformal predictor. Second, conformal predictors output predictions in a form that is distinct from traditional prediction methods, by supplying multi-valued prediction regions with a statistically valid coverage probability; we propose two procedures for post-processing the output from conformal classification models that provide interpretations more closely related to traditional predictive models, while still retaining meaningful confidence information. Finally, we provide contributions relating to the construction of conformal predictor ensembles, illustrating potential issues with existing ensemble procedures, as well as proposing and evaluating an alternative ensemble method.
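
As a rough illustration of how a nonconformity function leads to a multi-valued prediction region, the following minimal sketch follows the common inductive (split) recipe. It is not taken from the thesis: the nonconformity function `1 - P(label | x)`, the calibration scores, and the probability estimates are all assumptions chosen for the example.

```python
def p_value(calibration_scores, test_score):
    """Share of scores (calibration plus the test score itself) that are
    at least as large, i.e. at least as nonconforming, as the test score."""
    n_at_least = sum(1 for a in calibration_scores if a >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_region(calibration_scores, class_probabilities, significance):
    """Map each label whose p-value exceeds `significance` to that p-value.

    Assumed nonconformity function: 1 - P(label | x), with the probability
    estimate supplied by some underlying classifier.
    """
    region = {}
    for label, probability in class_probabilities.items():
        p = p_value(calibration_scores, 1.0 - probability)
        if p > significance:
            region[label] = p
    return region


# Hypothetical calibration scores: 1 - P(true label | x) on held-out data.
calibration_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.40, 0.55, 0.70, 0.90]
# Hypothetical probability estimates for a single test object.
test_probabilities = {"a": 0.75, "b": 0.20, "c": 0.05}

region = prediction_region(calibration_scores, test_probabilities, 0.15)
print(region)  # labels "a" and "b" remain; "c" is excluded
```

Lowering the significance level can only add labels to the region, which is the usual trade-off between confidence and region size.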

Abstract [sv]

The thesis deals with the field of conformal prediction, which describes a family of predictive models whose predictions are associated with a confidence measure that lets the models themselves express their own reliability. In high-risk applications, where bad predictions can have serious economic consequences or lead to personal injury, such a built-in safety check appears highly valuable, if not necessary. While the theoretical foundation of conformal prediction was laid about 20 years ago, the research field is still relatively young, and many open questions remain regarding the design and use of conformal prediction systems. The thesis addresses some of these open questions, focusing on three specific characteristics of conformal predictors. First, the so-called nonconformity functions that underlie conformal prediction are treated, and the relationship between properties of the nonconformity functions and the resulting predictors is explored. Properties of the predictions produced by a conformal predictor are also examined, and two post-processing methods are presented in an attempt to provide a more intuitively comprehensible interpretation of these predictions. Finally, strategies for constructing ensembles of conformal prediction models are explored, where weaknesses in established strategies are illustrated, followed by a presentation of a new ensemble strategy that aims to address these weaknesses.

Place, publisher, year, edition, pages
Stockholm: Department of Computer and Systems Sciences, Stockholm University, 2021, p. 62
Series
Report Series / Department of Computer & Systems Sciences, ISSN 1101-8526 ; 21-001
Keywords [en]
Data Science, Machine Learning, Conformal Prediction, Classification, Regression
National Category
Computer Sciences
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-192613; ISBN: 978-91-7911-502-9 (print); ISBN: 978-91-7911-503-6 (digital); OAI: oai:DiVA.org:su-192613; DiVA, id: diva2:1547120
Public defence
2021-06-14, online via Zoom, public link is available at the department website, Stockholm, 13:00 (English)
Opponent
Supervisors
Available from: 2021-05-20 Created: 2021-04-25 Last updated: 2022-02-25 Bibliographically approved
List of papers
1. Signed-Error Conformal Regression
2014 (English) In: Advances in Knowledge Discovery and Data Mining: 18th Pacific-Asia Conference, PAKDD 2014, Tainan, Taiwan, May 13-16, 2014. Proceedings, Part I / [ed] Vincent S. Tseng, Tu Bao Ho, Zhi-Hua Zhou, Arbee L. P. Chen, Hung-Yu Kao, Cham: Springer, 2014, p. 224-236. Conference paper, Published paper (Refereed)
Abstract [en]

This paper suggests a modification of the Conformal Prediction framework for regression that will strengthen the associated guarantee of validity. We motivate the need for this modification and argue that our conformal regressors are more closely tied to the actual error distribution of the underlying model, thus allowing for more natural interpretations of the prediction intervals. In the experimentation, we provide an empirical comparison of our conformal regressors to traditional conformal regressors and show that the proposed modification results in more robust two-tailed predictions, and more efficient one-tailed predictions.
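
The idea of tying intervals to the signed error distribution can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's exact procedure: the helper names, the quantile rule, and the calibration errors are hypothetical, and each tail is calibrated separately on signed errors `y - y_hat` instead of on absolute errors.

```python
import math


def upper_quantile(values, significance):
    """Calibration value exceeded by at most a `significance` share of
    n + 1 exchangeable values (infinite if the sample is too small)."""
    n = len(values)
    rank = math.ceil((1.0 - significance) * (n + 1))  # 1-based rank
    return sorted(values)[rank - 1] if rank <= n else math.inf


def two_tailed_interval(y_hat, signed_errors, significance):
    """Calibrate each tail separately at significance / 2."""
    half = significance / 2.0
    upper = y_hat + upper_quantile(signed_errors, half)
    lower = y_hat - upper_quantile([-e for e in signed_errors], half)
    return lower, upper


def one_tailed_upper_bound(y_hat, signed_errors, significance):
    """Upper bound that spends the whole error budget on one tail."""
    return y_hat + upper_quantile(signed_errors, significance)


# Hypothetical signed calibration errors (y - y_hat); note the long lower tail.
signed_errors = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.8, -0.6, -0.4, -0.2, 0.0,
                 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1.0]

lower, upper = two_tailed_interval(10.0, signed_errors, 0.2)
bound = one_tailed_upper_bound(10.0, signed_errors, 0.2)
print(lower, upper, bound)
```

With these skewed errors the two-tailed interval is asymmetric around the point prediction, and the one-tailed bound is tighter than the upper end of the two-tailed interval at the same significance level.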

Place, publisher, year, edition, pages
Cham: Springer, 2014
Series
Lecture Notes in Artificial Intelligence, ISSN 0302-9743, E-ISSN 1611-3349 ; 8443
Keywords
Conformal Prediction, prediction intervals, regression
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:su:diva-192610 (URN); 10.1007/978-3-319-06608-0_19 (DOI); 978-3-319-06607-3 (ISBN); 978-3-319-06608-0 (ISBN)
Conference
18th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Tainan, Taiwan, May 13-16, 2014
Available from: 2021-04-25 Created: 2021-04-25 Last updated: 2022-02-25 Bibliographically approved
2. Efficiency Comparison of Unstable Transductive and Inductive Conformal Classifiers
2014 (English) In: Artificial Intelligence Applications and Innovations: AIAI 2014 Workshops: CoPA, MHDW, IIVC, and MT4BD, Rhodes, Greece, September 19-21, 2014. Proceedings / [ed] Lazaros Iliadis, Berlin: Springer, 2014, p. 261-270. Conference paper, Published paper (Refereed)
Place, publisher, year, edition, pages
Berlin: Springer, 2014
Series
IFIP Advances in Information and Communication Technology, ISSN 1868-4238 ; 437
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-110979 (URN); 10.1007/978-3-662-44722-2_28 (DOI); 978-3-662-44721-5 (ISBN)
Conference
AIAI 2014 Workshops: CoPA, MHDW, IIVC, and MT4BD, Rhodes, Greece, September 19-21, 2014
Available from: 2014-12-19 Created: 2014-12-19 Last updated: 2022-02-23 Bibliographically approved
3. Reliable Confidence Predictions Using Conformal Prediction
2016 (English) In: Advances in Knowledge Discovery and Data Mining: 20th Pacific-Asia Conference, PAKDD 2016, Auckland, New Zealand, April 19-22, 2016, Proceedings, Part I / [ed] James Bailey, Latifur Khan, Takashi Washio, Gill Dobbie, Joshua Zhexue Huang, Ruili Wang, Springer, 2016, p. 77-88. Conference paper, Published paper (Refereed)
Abstract [en]

Conformal classifiers output confidence prediction regions, i.e., multi-valued predictions that are guaranteed to contain the true output value of each test pattern with some predefined probability. In order to fully utilize the predictions provided by a conformal classifier, it is essential that those predictions are reliable, i.e., that a user is able to assess the quality of the predictions made. Although conformal classifiers are statistically valid by default, the error probability of the output prediction regions depends on their size in such a way that smaller, and thus potentially more interesting, predictions are more likely to be incorrect. This paper proposes, and evaluates, a method for producing refined error probability estimates of prediction regions that takes their size into account. The end result is a binary conformal confidence predictor that is able to provide accurate error probability estimates for those prediction regions containing only a single class label.
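
The dependence of error probability on region size can be seen with a short piece of arithmetic for the binary case. This is an illustration of the general effect, not necessarily the estimator proposed in the paper; it assumes that the overall error rate equals the significance level \epsilon exactly, that an empty region is always an error, and that a region containing both labels is never an error.

```latex
\begin{align*}
\epsilon &= P(\text{empty}) \cdot 1
          + P(\text{singleton}) \cdot P(\text{error} \mid \text{singleton})
          + P(\text{double}) \cdot 0 \\
\Rightarrow\quad
P(\text{error} \mid \text{singleton})
         &= \frac{\epsilon - P(\text{empty})}{P(\text{singleton})} \\[4pt]
\text{Example: } \epsilon = 0.1,\; P(\text{empty}) = 0,\;
P(\text{singleton}) = 0.5
         &\;\Rightarrow\; P(\text{error} \mid \text{singleton}) = 0.2
\end{align*}
```

In this hypothetical example the single-label predictions are wrong twice as often as the nominal significance level suggests, because all errors are concentrated in the half of the predictions that are singletons.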

Place, publisher, year, edition, pages
Springer, 2016
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 9651
National Category
Information Systems
Research subject
Computer and Systems Sciences
Identifiers
urn:nbn:se:su:diva-137489 (URN); 10.1007/978-3-319-31753-3_7 (DOI); 000389019500007; 978-3-319-31752-6 (ISBN); 978-3-319-31753-3 (ISBN)
Conference
20th Pacific-Asia Conference, PAKDD 2016, Auckland, New Zealand, April 19-22, 2016
Available from: 2017-01-08 Created: 2017-01-08 Last updated: 2022-02-28 Bibliographically approved
4. On the Calibration of Aggregated Conformal Predictors
2017 (English) In: Proceedings of Machine Learning Research: Volume 60: Conformal and Probabilistic Prediction and Applications, 13-16 June 2017, Stockholm, Sweden / [ed] Alex Gammerman, Vladimir Vovk, Zhiyuan Luo, Harris Papadopoulos, 2017, Vol. 60, p. 154-173. Conference paper, Published paper (Refereed)
Abstract [en]

Conformal prediction is a learning framework that produces models that associate with each of their predictions a measure of statistically valid confidence. These models are typically constructed on top of traditional machine learning algorithms. An important result of conformal prediction theory is that the models produced are provably valid under relatively weak assumptions—in particular, their validity is independent of the specific underlying learning algorithm on which they are based. Since validity is automatic, much research on conformal predictors has been focused on improving their informational and computational efficiency. As part of the efforts in constructing efficient conformal predictors, aggregated conformal predictors were developed, drawing inspiration from the field of classification and regression ensembles. Unlike early definitions of conformal prediction procedures, the validity of aggregated conformal predictors is not fully understood—while it has been shown that they might attain empirical exact validity under certain circumstances, their theoretical validity is conditional on additional assumptions that require further clarification. In this paper, we show why validity is not automatic for aggregated conformal predictors, and provide a revised definition of aggregated conformal predictors that gains approximate validity conditional on properties of the underlying learning algorithm.
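
Why validity is not automatic after aggregation can be illustrated with a small simulation. The sketch below makes strong assumptions that are not from the paper: component p-values are combined by their arithmetic mean, and instead of running real conformal predictors it draws idealised true-label p-values that are uniform on [0, 1], either fully shared between components or fully independent.

```python
import random


def aggregate(p_values):
    """Combine component p-values by taking their arithmetic mean."""
    return sum(p_values) / len(p_values)


def empirical_error_rate(n_trials, n_models, significance, shared, seed=0):
    """Share of trials in which the aggregated true-label p-value falls at
    or below the significance level, i.e. the true label is rejected."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_trials):
        if shared:
            # Every component reports the same uniform p-value.
            u = rng.random()
            p_values = [u] * n_models
        else:
            # Every component reports its own independent uniform p-value.
            p_values = [rng.random() for _ in range(n_models)]
        if aggregate(p_values) <= significance:
            errors += 1
    return errors / n_trials


rate_shared = empirical_error_rate(20000, 10, 0.1, shared=True)
rate_independent = empirical_error_rate(20000, 10, 0.1, shared=False)
print(rate_shared, rate_independent)
```

Each component is individually valid in both settings, yet the averaged p-value is only close to calibrated when the components agree; with independent components it becomes far too conservative. Real ensembles fall somewhere between these two extremes, so their calibration depends on how the component p-values relate to each other.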

Series
Proceedings of Machine Learning Research, ISSN 2640-3498 ; 60
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:su:diva-192606 (URN)
Conference
Conformal and Probabilistic Prediction and Applications, 13-16 June 2017, Stockholm, Sweden
Available from: 2021-04-25 Created: 2021-04-25 Last updated: 2022-02-25 Bibliographically approved
5. Classification With Reject Option Using Conformal Prediction
2018 (English) In: Advances in Knowledge Discovery and Data Mining: 22nd Pacific-Asia Conference, PAKDD 2018, Melbourne, VIC, Australia, June 3-6, 2018, Proceedings, Part I / [ed] Dinh Phung; Vincent S. Tseng; Geoffrey I. Webb; Bao Ho; Mohadeseh Ganji; Lida Rashidi, Cham: Springer Nature, 2018, p. 94-105. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we propose a practically useful means of interpreting the predictions produced by a conformal classifier. The proposed interpretation leads to a classifier with a reject option, which allows the user to limit the number of erroneous predictions made on the test set, without any need to reveal the true labels of the test objects. The method described in this paper works by estimating the cumulative error count on a set of predictions provided by a conformal classifier, ordered by their confidence. Given a test set and a user-specified parameter k, the proposed classification procedure outputs the largest possible number of predictions containing on average at most k errors, while refusing to make predictions for test objects where it is too uncertain. We conduct an empirical evaluation using benchmark datasets, and show that we are able to provide accurate estimates for the error rate on the test set.
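
The selection step described above (order by confidence, accumulate the expected error count, stop at k) can be sketched as follows. How the per-prediction error probabilities are obtained from the conformal classifier is the paper's contribution and is not reproduced here; in this sketch they are simply assumed to be given, and the function name and numbers are hypothetical.

```python
def select_predictions(error_probabilities, k):
    """Return the indices of the largest confidence-ordered prefix whose
    expected number of errors is at most k, plus that expected count."""
    # Most confident first: smallest estimated error probability.
    order = sorted(range(len(error_probabilities)),
                   key=lambda i: error_probabilities[i])
    accepted = []
    expected_errors = 0.0
    for i in order:
        if expected_errors + error_probabilities[i] > k:
            break  # this and every less confident prediction is rejected
        expected_errors += error_probabilities[i]
        accepted.append(i)
    return accepted, expected_errors


# Hypothetical error probability estimates for six test predictions.
error_probabilities = [0.02, 0.40, 0.05, 0.10, 0.30, 0.01]

accepted, expected_errors = select_predictions(error_probabilities, k=0.2)
print(accepted, expected_errors)
```

Here the four most confident predictions are accepted and the two least confident ones are rejected, since accepting the next prediction would push the expected error count above k.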

Place, publisher, year, edition, pages
Cham: Springer Nature, 2018
Series
Lecture Notes in Artificial Intelligence, ISSN 0302-9743, E-ISSN 1611-3349 ; 10937
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:su:diva-192611 (URN); 10.1007/978-3-319-93034-3_8 (DOI); 000443224400008; 2-s2.0-85049360232 (Scopus ID); 978-3-319-93033-6 (ISBN); 978-3-319-93034-3 (ISBN)
Conference
22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2018), Melbourne, Australia, June 3-6, 2018
Available from: 2021-04-25 Created: 2021-04-25 Last updated: 2023-10-31 Bibliographically approved
6. Efficient conformal predictor ensembles
2020 (English) In: Neurocomputing, ISSN 0925-2312, E-ISSN 1872-8286, Vol. 397, p. 266-278. Article in journal (Refereed) Published
Abstract [en]

In this paper, we study a generalization of a recently developed strategy for generating conformal predictor ensembles: out-of-bag calibration. The ensemble strategy is evaluated, both theoretically and empirically, against a commonly used alternative ensemble strategy, bootstrap conformal prediction, as well as common non-ensemble strategies. A thorough analysis is provided of out-of-bag calibration, with respect to theoretical validity, empirical validity (error rate), efficiency (prediction region size) and p-value stability (the degree of variance observed over multiple predictions for the same object). Empirical results show that out-of-bag calibration displays favorable characteristics with regard to these criteria, and we propose that out-of-bag calibration be adopted as a standard method for constructing conformal predictor ensembles.
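
The basic idea behind out-of-bag calibration can be sketched in a few lines: each training example receives its calibration score only from the ensemble members whose bootstrap sample did not contain it, so no separate calibration set has to be held out. The sketch below is a toy illustration, not the generalized procedure studied in the paper; the class-mean base learner, the distance-based nonconformity function, and the data are all assumptions.

```python
import random


def fit_class_means(sample):
    """Toy base learner: remember the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in sample:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nonconformity(models, x, y):
    """Mean distance from x to the class-y mean across the given models."""
    distances = [abs(x - m[y]) for m in models if y in m]
    return sum(distances) / len(distances)


def fit_with_oob_calibration(train, n_models, seed=0):
    """Bag the base learner and calibrate on out-of-bag examples only."""
    rng = random.Random(seed)
    n = len(train)
    models, bags = [], []
    for _ in range(n_models):
        bag = [rng.randrange(n) for _ in range(n)]
        models.append(fit_class_means([train[i] for i in bag]))
        bags.append(set(bag))
    scores = []
    for i, (x, y) in enumerate(train):
        # Score example i using only the models that never trained on it.
        unseen = [m for m, bag in zip(models, bags) if i not in bag and y in m]
        if unseen:
            scores.append(nonconformity(unseen, x, y))
    return models, scores


def p_value(models, scores, x, y):
    """Test objects are scored by the full ensemble."""
    a = nonconformity(models, x, y)
    return (sum(1 for s in scores if s >= a) + 1) / (len(scores) + 1)


# Hypothetical one-dimensional, two-class training data.
train = [(0.1 * i, 0) for i in range(20)] + [(5.0 + 0.1 * i, 1) for i in range(20)]
models, scores = fit_with_oob_calibration(train, n_models=25)

p0 = p_value(models, scores, 1.0, 0)
p1 = p_value(models, scores, 1.0, 1)
print(p0, p1)
```

For the test object at 1.0, label 0 receives a high p-value and label 1 a low one. Note the asymmetry that motivates a careful validity analysis: calibration scores come from partial ensembles, while test scores come from the full ensemble.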

Keywords
Conformal prediction, Classification, Ensembles
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:su:diva-192612 (URN); 10.1016/j.neucom.2019.07.113 (DOI)
Available from: 2021-04-25 Created: 2021-04-25 Last updated: 2022-02-25 Bibliographically approved

Open Access in DiVA

Nonconformity Measures and Ensemble Strategies (1528 kB), 913 downloads
File information
File name: FULLTEXT01.pdf; File size: 1528 kB; Checksum: SHA-512
19f865850d8e7f3550a161b76f7c39f68150b54be559ea4f949804c7fa63802128b2f04c30556e29118639a5f422cd84fdc04f116d739bdb24a72374194b9e51
Type: fulltext; Mimetype: application/pdf

Search for more in DiVA

By author/editor
Linusson, Henrik
By organisation
Department of Computer and Systems Sciences
Computer Sciences
