Accurate and Interpretable Regression Trees using Oracle Coaching
Högskolan i Borås, Institutionen Handels- och IT-högskolan. ORCID iD: 0000-0003-0412-6199
Högskolan i Borås, Institutionen Handels- och IT-högskolan.
Högskolan i Borås, Institutionen Handels- och IT-högskolan.
2014 (English). Conference paper, published paper (peer-reviewed)
Abstract [en]

In many real-world scenarios, predictive models need to be interpretable, thus ruling out many machine learning techniques known to produce very accurate models, e.g., neural networks, support vector machines and all ensemble schemes. Most often, tree models or rule sets are used instead, typically resulting in significantly lower predictive performance. The overall purpose of oracle coaching is to reduce this accuracy vs. comprehensibility trade-off by producing interpretable models optimized for the specific production set at hand. The method requires production set inputs to be present when generating the predictive model, a demand fulfilled in most, but not all, predictive modeling scenarios. In oracle coaching, a highly accurate, but opaque, model is first induced from the training data. This model (“the oracle”) is then used to label both the training instances and the production instances. Finally, interpretable models are trained using different combinations of the resulting data sets. In this paper, the oracle coaching produces regression trees, using neural networks and random forests as oracles. The experiments, using 32 publicly available data sets, show that the oracle coaching leads to significantly improved predictive performance, compared to standard induction. In addition, it is also shown that a highly accurate opaque model can be successfully used as a preprocessing step to reduce the noise typically present in data, even in situations where production inputs are not available. In fact, just augmenting or replacing training data with another copy of the training set, but with the predictions from the opaque model as targets, produced significantly more accurate and/or more compact regression trees.
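The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the paper uses neural networks and random forests as oracles, whereas this sketch stands in a small bagged ensemble of trees written with the Python standard library only. All function names (`fit_tree`, `fit_oracle`, `oracle_coaching`), the tree depth, the ensemble size, the synthetic data and the four data set combinations are assumptions chosen for illustration.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, depth=2, min_leaf=2):
    """Greedy regression tree. A leaf is a number; a split is
    (feature, threshold, left_subtree, right_subtree)."""
    if depth == 0 or len(y) < 2 * min_leaf:
        return mean(y)
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # no split satisfies min_leaf
        return mean(y)
    _, f, t, left, right = best
    return (f, t,
            fit_tree([X[i] for i in left], [y[i] for i in left], depth - 1, min_leaf),
            fit_tree([X[i] for i in right], [y[i] for i in right], depth - 1, min_leaf))


def predict(tree, row):
    """Follow splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def fit_oracle(X, y, n_trees=25, depth=6, seed=0):
    """Assumed stand-in for the opaque oracle: bagged deep trees."""
    rng = random.Random(seed)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], depth, min_leaf=1))
    return trees


def predict_oracle(trees, row):
    return mean([predict(t, row) for t in trees])


def oracle_coaching(X_train, y_train, X_prod, depth=2):
    """Train shallow trees on several (illustrative) data set combinations."""
    oracle = fit_oracle(X_train, y_train)
    y_train_o = [predict_oracle(oracle, r) for r in X_train]  # oracle labels, training inputs
    y_prod_o = [predict_oracle(oracle, r) for r in X_prod]    # oracle labels, production inputs
    return {
        "standard": fit_tree(X_train, y_train, depth),                       # true targets only
        "train_oracle": fit_tree(X_train, y_train_o, depth),                 # targets replaced
        "train_plus_oracle": fit_tree(X_train + X_train, y_train + y_train_o, depth),  # augmented
        "oracle_coached": fit_tree(X_train + X_prod, y_train_o + y_prod_o, depth),
    }


# Synthetic data, purely for demonstration: y = 2x + Gaussian noise.
rng = random.Random(1)


def make(n):
    X = [[rng.uniform(0, 10)] for _ in range(n)]
    y = [2.0 * row[0] + rng.gauss(0, 1) for row in X]
    return X, y


X_train, y_train = make(60)
X_prod, _ = make(20)  # production targets are unknown when the model is built
trees = oracle_coaching(X_train, y_train, X_prod)
for name, tree in trees.items():
    print(name, round(predict(tree, X_prod[0]), 2))
```

Only the production inputs are passed to `oracle_coaching`; their true targets are never used, which matches the requirement stated in the abstract. The "train_oracle" and "train_plus_oracle" variants correspond to replacing or augmenting the training targets with oracle predictions, which needs no production data at all.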

Place, publisher, year, edition, pages
IEEE, 2014.
Keywords [en]
Oracle coaching, Regression trees, Predictive modeling, Interpretable models, Machine learning, Data mining
Identifiers
URN: urn:nbn:se:hj:diva-38081
ISBN: 978-1-4799-4518-4 (print)
OAI: oai:DiVA.org:hj-38081
DiVA: diva2:1163354
Conference
5th IEEE Symposium Computational Intelligence and Data Mining, 9-12 December, Orlando, FL, USA
Note

Sponsorship:

This work was supported by the Swedish Foundation for Strategic Research through the project High-Performance Data Mining for Drug Effect Detection (IIS11-0053), the Swedish Retail and Wholesale Development Council through the project Innovative Business Intelligence Tools (2013:5) and the Knowledge Foundation through the project Big Data Analytics by Online Ensemble Learning (20120192).

Available from: 2017-12-06. Created: 2017-12-06. Last updated: 2017-12-06. Bibliographically approved.

Open Access in DiVA

Full text (96 kB), 2 downloads
File information
File: FULLTEXT01.pdf. File size: 96 kB. Checksum (SHA-512):
1c0ec44b4db7a64aafaff0aba2d94ef9ccb500bae9fd4da60b0d55ea09b33dcd7cead2bf12c6b407567b43e8397e900593092c45f220c9486c9cb7fca2b6a171
Type: fulltext. MIME type: application/pdf

Search in DiVA

By author/editor
Johansson, Ulf
