The Problem with Ranking Ensembles Based on Training or Validation Performance
University of Borås, School of Business and IT.
University of Borås, School of Business and IT.
2008 (English). In: Proceedings of the International Joint Conference on Neural Networks, IEEE Press, 2008. Conference paper, Published paper (Refereed).
Abstract [en]

The main purpose of this study was to determine whether results on training or validation data can be used to estimate ensemble performance on novel data. For the specific setup evaluated, i.e., ensembles built from a pool of independently trained neural networks and targeting diversity only implicitly, the answer is a resounding no. Experimentation on 13 UCI datasets shows that there is, in general, nothing to gain in performance on novel data by choosing an ensemble based on any of the training measures evaluated here. This holds even though the measures evaluated include all of the most frequently used ones, i.e., ensemble training and validation accuracy, base classifier training and validation accuracy, ensemble training and validation AUC, and two diversity measures. The main reason is that all ensembles tend to have quite similar performance unless the accuracy of the base classifiers is deliberately lowered. The key consequence is that a data miner can do no better than picking an ensemble at random. In addition, the results indicate that it is futile to look for an algorithm aimed at optimizing ensemble performance by selecting a subset of the available base classifiers.
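
The kind of check described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' experimental setup: it uses no neural networks and no UCI data. Instead it assumes hypothetical base classifiers that are each correct with a fixed probability between 0.70 and 0.80, draws random ensembles from that pool, and compares each ensemble's validation accuracy with its test accuracy. All function names, pool sizes, and accuracy ranges are illustrative assumptions.

```python
import random


def simulate_votes(labels, skill, rng):
    """Predictions of a hypothetical base classifier that is right with probability `skill`."""
    return [y if rng.random() < skill else 1 - y for y in labels]


def majority_vote(member_preds):
    """Combine binary predictions by simple majority (use an odd number of members)."""
    n = len(member_preds)
    return [1 if 2 * sum(col) > n else 0 for col in zip(*member_preds)]


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_experiment(n_models=30, ensemble_size=11, n_ensembles=50,
                    n_val=200, n_test=200, seed=0):
    rng = random.Random(seed)
    y_val = [rng.randint(0, 1) for _ in range(n_val)]
    y_test = [rng.randint(0, 1) for _ in range(n_test)]
    # Assumption: base classifiers have similar, moderately high accuracy.
    skills = [rng.uniform(0.70, 0.80) for _ in range(n_models)]
    val_preds = [simulate_votes(y_val, s, rng) for s in skills]
    test_preds = [simulate_votes(y_test, s, rng) for s in skills]

    val_acc, test_acc = [], []
    for _ in range(n_ensembles):
        members = rng.sample(range(n_models), ensemble_size)
        val_acc.append(accuracy(majority_vote([val_preds[m] for m in members]), y_val))
        test_acc.append(accuracy(majority_vote([test_preds[m] for m in members]), y_test))

    # Pick the ensemble that looks best on validation data and see how it does on test data.
    best = max(range(n_ensembles), key=val_acc.__getitem__)
    return {
        "correlation": pearson(val_acc, test_acc),
        "test_acc_of_val_best": test_acc[best],
        "mean_test_acc": sum(test_acc) / n_ensembles,
    }


if __name__ == "__main__":
    print(rank_experiment())
```

Comparing `test_acc_of_val_best` with `mean_test_acc` over several seeds gives a rough sense of whether selecting by validation accuracy beats picking an ensemble at random in this toy setting. Any conclusion drawn from it applies only to the simulation, not to the paper's results.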

Place, publisher, year, edition, pages
IEEE Press, 2008.
Keyword [en]
ensembles, diversity, Computer Science
Keyword [sv]
data mining
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:hb:diva-5946
DOI: 10.1109/IJCNN.2008.4634255
Local ID: 2320/3973
ISBN: 978-1-4244-1821-3 (print)
OAI: oai:DiVA.org:hb-5946
DiVA: diva2:886629
Conference
IJCNN 2008, Hong Kong, June 1-6, 2008
Note

Sponsorship:

This work was supported by the Information Fusion Research Program (University of Skövde, Sweden) in partnership with the Swedish Knowledge Foundation under grant 2003/0104.

Available from: 2015-12-22 Created: 2015-12-22 Last updated: 2018-01-10

Authors
Löfström, Tuve; Johansson, Ulf