Regression Trees for Streaming Data with Local Performance Guarantees
Högskolan i Borås, Institutionen Handels- och IT-högskolan. ORCID iD: 0000-0003-0412-6199
Högskolan i Borås, Institutionen Handels- och IT-högskolan.
Högskolan i Borås, Institutionen Handels- och IT-högskolan.
Dept. of Computer and Systems Sciences Stockholm University, Sweden.
2014 (English). Conference paper, published paper (peer-reviewed)
Abstract [en]

Online predictive modeling of streaming data is a key task for big data analytics. In this paper, a novel approach for efficient online learning of regression trees is proposed, which continuously updates, rather than retrains, the tree as more labeled data become available. A conformal predictor outputs prediction sets instead of point predictions, which for regression translates into prediction intervals. The key property of a conformal predictor is that it is always valid, i.e., the error rate on novel data is bounded by a preset significance level. Here, we suggest applying Mondrian conformal prediction on top of the resulting models, in order to obtain regression trees where not only the tree as a whole, but also each individual rule, corresponding to a path from the root node to a leaf, is valid. Using Mondrian conformal prediction, it becomes possible to analyze and explore the different rules separately, knowing that the error rate of each rule, in the long run, will not exceed the preset significance level. An empirical investigation, using 17 publicly available data sets, confirms that the resulting rules are independently valid, and also shows that the prediction intervals are smaller, on average, than when only the global model is required to be valid. All in all, the suggested method provides a data miner or a decision maker with highly informative predictive models of streaming data.
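
The per-rule validity described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a batch (inductive) setting with a held-out calibration set rather than the online updating proposed in the paper, it assumes the absolute residual as nonconformity score, and the function names `mondrian_half_widths` and `predict_interval` are hypothetical. The leaf identifier of each example is taken as given, so any tree learner could supply it.

```python
import math
from collections import defaultdict


def mondrian_half_widths(leaf_ids, residuals, significance):
    """Compute one interval half-width per leaf (rule) from calibration data.

    leaf_ids[i]  -- identifier of the leaf that calibration example i falls into
    residuals[i] -- y_i minus the tree's point prediction for example i
    significance -- target error rate, e.g. 0.2 for 80% intervals
    """
    # Mondrian step: calibrate each leaf separately, using only its own examples.
    scores_by_leaf = defaultdict(list)
    for leaf, residual in zip(leaf_ids, residuals):
        scores_by_leaf[leaf].append(abs(residual))

    half_widths = {}
    for leaf, scores in scores_by_leaf.items():
        scores.sort()
        n = len(scores)
        # Rank of the calibration score needed for the interval to be valid.
        # The small tolerance guards against floating-point round-up.
        k = math.ceil((1 - significance) * (n + 1) - 1e-9)
        # With too few calibration examples, only an infinite interval is valid.
        half_widths[leaf] = scores[k - 1] if k <= n else math.inf
    return half_widths


def predict_interval(point_prediction, leaf, half_widths):
    """Turn a point prediction into an interval using the leaf's half-width."""
    width = half_widths.get(leaf, math.inf)  # unseen leaf: no guarantee possible
    return point_prediction - width, point_prediction + width


# Toy calibration data: nine examples in leaf "A", two in leaf "B".
leaves = ["A"] * 9 + ["B"] * 2
errors = [5, 1, 3, 2, 4, 9, 6, 8, 7, 1, 2]
widths = mondrian_half_widths(leaves, errors, significance=0.2)
print(widths)
print(predict_interval(10.0, "A", widths))
```

Because every leaf is calibrated on its own residuals, a leaf with small errors gets a narrow interval and a leaf with large errors gets a wide one, which is one way to read the abstract's remark that intervals can be smaller on average than under a single global calibration. Leaves with very few calibration examples fall back to an infinite interval in this sketch.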

Place, publisher, year, edition, pages
IEEE, 2014.
Keywords [en]
Conformal Prediction, Streaming data, Regression trees, Interpretable models, Machine learning, Data mining
Identifiers
URN: urn:nbn:se:hj:diva-38085
DOI: 10.1109/BigData.2014.7004263
ISBN: 978-1-4799-5666-1 (print)
OAI: oai:DiVA.org:hj-38085
DiVA, id: diva2:1163344
Conference
IEEE International Conference on Big Data, 27-30 October, 2014, Washington, DC, USA
Note

Sponsorship:

This work was supported by the Swedish Foundation for Strategic Research through the project High-Performance Data Mining for Drug Effect Detection (IIS11-0053), the Swedish Retail and Wholesale Development Council through the project Innovative Business Intelligence Tools (2013:5) and the Knowledge Foundation through the project Big Data Analytics by Online Ensemble Learning (20120192).

Available from: 2017-12-06. Created: 2017-12-06. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

fulltext (927 kB), 7 downloads
File information
File name: FULLTEXT01.pdf
File size: 927 kB
Checksum (SHA-512): 2807b050139a83c7037c350d7bdb8b95cf0adaa97debc65db04f3084da17a1198c86913134371578ed56908c32ecee42938df85deb2019a22e003ff621a08699
Type: fulltext
Mimetype: application/pdf

By author/editor
Johansson, Ulf