Accelerating difficulty estimation for conformal regression forests
Stockholm University, Faculty of Social Sciences, Department of Computer and Systems Sciences.
Number of authors: 4
2017 (English). In: Annals of Mathematics and Artificial Intelligence, ISSN 1012-2443, E-ISSN 1573-7470, Vol. 81, no. 1-2, pp. 125-144. Journal article (Refereed). Published
Abstract [en]

The conformal prediction framework allows for specifying the probability of making incorrect predictions by a user-provided confidence level. In addition to a learning algorithm, the framework requires a real-valued function, called nonconformity measure, to be specified. The nonconformity measure does not affect the error rate, but the resulting efficiency, i.e., the size of output prediction regions, may vary substantially. A recent large-scale empirical evaluation of conformal regression approaches showed that using random forests as the learning algorithm together with a nonconformity measure based on out-of-bag errors normalized using a nearest-neighbor-based difficulty estimate, resulted in state-of-the-art performance with respect to efficiency. However, the nearest-neighbor procedure incurs a significant computational cost. In this study, a more straightforward nonconformity measure is investigated, where the difficulty estimate employed for normalization is based on the variance of the predictions made by the trees in a forest. A large-scale empirical evaluation is presented, showing that both the nearest-neighbor-based and the variance-based measures significantly outperform a standard (non-normalized) nonconformity measure, while no significant difference in efficiency between the two normalized approaches is observed. The evaluation moreover shows that the computational cost of the variance-based measure is several orders of magnitude lower than when employing the nearest-neighbor-based nonconformity measure. The use of out-of-bag instances for calibration does, however, result in nonconformity scores that are distributed differently from those obtained from test instances, questioning the validity of the approach. An adjustment of the variance-based measure is presented, which is shown to be valid and also to have a significant positive effect on the efficiency. 
For conformal regression forests, the variance-based nonconformity measure is hence a computationally efficient and theoretically well-founded alternative to the nearest-neighbor procedure.
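
The variance-based normalization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a separate calibration set instead of out-of-bag instances, it takes the per-tree predictions as plain lists, and the function names and the smoothing parameter `beta` are hypothetical choices made for this illustration only.

```python
import math
from statistics import mean, pvariance


def difficulty(tree_preds, beta=0.01):
    """Difficulty estimate: variance of the per-tree predictions.

    beta is an assumed smoothing term that keeps the estimate positive.
    """
    return pvariance(tree_preds) + beta


def calibrate(cal_tree_preds, cal_targets, confidence=0.9, beta=0.01):
    """Return the normalized nonconformity score used as the threshold."""
    scores = sorted(
        abs(y - mean(preds)) / difficulty(preds, beta)
        for preds, y in zip(cal_tree_preds, cal_targets)
    )
    n = len(scores)
    # Rank of the score that covers the requested confidence level.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        # Too few calibration instances: only an infinite interval is valid.
        return math.inf
    return scores[k - 1]


def predict_interval(tree_preds, threshold, beta=0.01):
    """Interval centred on the forest mean, scaled by the difficulty."""
    centre = mean(tree_preds)
    half_width = threshold * difficulty(tree_preds, beta)
    return centre - half_width, centre + half_width
```

With this sketch, a test instance on which the trees disagree receives a wider interval than one on which they agree, while the threshold itself is fixed once by the calibration step. No nearest-neighbor search is needed, since the difficulty estimate is computed from the predictions the forest already produces.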

Place, publisher, year, edition, pages
2017. Vol. 81, no. 1-2, pp. 125-144.
Keywords [en]
Conformal prediction, Nonconformity measures, Regression, Random forests
National subject category
Computer and Information Sciences; Mathematics
Identifiers
URN: urn:nbn:se:su:diva-146954
DOI: 10.1007/s10472-017-9539-9
ISI: 000407425000008
OAI: oai:DiVA.org:su-146954
DiVA: diva2:1142587
Conference
5th Symposium on Conformal and Probabilistic Prediction with Applications (COPA), Madrid, Spain, April 20-22, 2016.
Available from: 2017-09-19. Created: 2017-09-19. Last updated: 2017-09-19. Bibliographically approved.

Search further in DiVA

By author/editor
Boström, Henrik