Estimating Class Probabilities in Random Forests
Stockholms universitet, Samhällsvetenskapliga fakulteten, Institutionen för data- och systemvetenskap.
2007 (English). In: Proceedings of the Sixth International Conference on Machine Learning and Applications, IEEE, 2007, pp. 211-216. Conference paper, published paper (peer-reviewed)
Abstract [en]

For both single probability estimation trees (PETs) and ensembles of such trees, commonly employed class probability estimates correct the observed relative class frequencies in each leaf to avoid anomalies caused by small sample sizes. The effect of such corrections in random forests of PETs is investigated, and the use of the relative class frequency is compared to using two corrected estimates, the Laplace estimate and the m-estimate. An experiment with 34 datasets from the UCI repository shows that estimating class probabilities using relative class frequency clearly outperforms both using the Laplace estimate and the m-estimate with respect to accuracy, area under the ROC curve (AUC) and Brier score. Hence, in contrast to what is commonly employed for PETs and ensembles of PETs, these results strongly suggest that a non-corrected probability estimate should be used in random forests of PETs. The experiment further shows that learning random forests of PETs using relative class frequency significantly outperforms learning random forests of classification trees (i.e., trees for which only an unweighted vote on the most probable class is counted) with respect to both accuracy and AUC, but that the latter is clearly ahead of the former with respect to Brier score.
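
The three leaf-level estimates compared in the abstract, and the Brier score used to evaluate them, can be sketched as follows. This is a minimal illustration using common textbook definitions, not the paper's implementation: the abstract does not give the formulas, so the Laplace form (n_c + 1) / (n + k), the m-estimate form (n_c + m * prior_c) / (n + m), the multiclass Brier score summed over classes, and all function names here are assumptions.

```python
from typing import Dict, List, Sequence

Probs = Dict[str, float]


def _total(counts: Dict[str, int], classes: Sequence[str]) -> int:
    """Number of training examples that reached the leaf."""
    return sum(counts.get(c, 0) for c in classes)


def relative_frequency(counts: Dict[str, int], classes: Sequence[str]) -> Probs:
    """Uncorrected estimate: observed class frequency in the leaf."""
    n = _total(counts, classes)
    if n == 0:  # empty leaf: fall back to a uniform distribution (assumption)
        return {c: 1.0 / len(classes) for c in classes}
    return {c: counts.get(c, 0) / n for c in classes}


def laplace_estimate(counts: Dict[str, int], classes: Sequence[str]) -> Probs:
    """Laplace correction: add one pseudo-count per class."""
    n, k = _total(counts, classes), len(classes)
    return {c: (counts.get(c, 0) + 1) / (n + k) for c in classes}


def m_estimate(counts: Dict[str, int], classes: Sequence[str],
               priors: Probs, m: float) -> Probs:
    """m-estimate: shrink towards class priors with weight m."""
    n = _total(counts, classes)
    return {c: (counts.get(c, 0) + m * priors[c]) / (n + m) for c in classes}


def forest_estimate(per_tree: List[Probs]) -> Probs:
    """Average the leaf estimates of all trees (one common way to combine PETs)."""
    return {c: sum(p[c] for p in per_tree) / len(per_tree) for c in per_tree[0]}


def brier_score(predictions: List[Probs], labels: Sequence[str]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 targets."""
    total = 0.0
    for probs, label in zip(predictions, labels):
        total += sum((p - (1.0 if c == label else 0.0)) ** 2
                     for c, p in probs.items())
    return total / len(labels)


if __name__ == "__main__":
    classes = ["pos", "neg"]
    leaf = {"pos": 3, "neg": 0}  # a small, pure leaf
    print(relative_frequency(leaf, classes))
    print(laplace_estimate(leaf, classes))
    print(m_estimate(leaf, classes, {"pos": 0.25, "neg": 0.75}, m=2))
```

For a small pure leaf such as the one in the usage example, the relative frequency assigns all probability mass to the observed class, while both corrected estimates pull the prediction towards the uniform distribution or the class priors. That shrinkage is the correction whose effect in random forests the paper investigates.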

Place, publisher, year, edition, pages
IEEE, 2007. pp. 211-216
Identifiers
URN: urn:nbn:se:su:diva-37838
DOI: 10.1109/ICMLA.2007.64
ISBN: 978-0-7695-3069-7 (print)
OAI: oai:DiVA.org:su-37838
DiVA id: diva2:305369
Conference
Sixth International Conference on Machine Learning and Applications (ICMLA 2007), 13-15 Dec. 2007
Available from: 2010-03-23. Created: 2010-03-23. Last updated: 2018-01-12. Bibliographically approved.

Author: Boström, Henrik
