Bayesian optimization for selecting training and validation data for supervised machine learning: using Gaussian processes both to learn the relationship between sets of training data and model performance, and to estimate model performance over the entire problem domain
Linköping University, Department of Computer and Information Science, Artificial Intelligence and Integrated Computer Systems.
2019 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Alternative title
Bayesiansk optimering för val av träning- och valideringsdata för övervakad maskininlärning (Swedish)
Abstract [en]

Validation and verification in machine learning is an open problem that becomes increasingly important as its applications become more critical. Among these applications are autonomous vehicles and medical diagnostics. All such systems need to be validated before being put into use; otherwise the consequences might be fatal.

This master’s thesis focuses on improving both the training and the validation of machine learning models in cases where data can be either generated or collected at a chosen position. Examples include taking and labeling photos at that position, or running a simulation that generates data from the chosen positions.

The approach is twofold. The first part concerns modeling the relationship between any fixed-size set of positions and some real-valued performance measure. The second part involves calculating such a performance measure by estimating the performance over a region of positions.
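As a rough illustration of the second part, the sketch below estimates the average performance over a one-dimensional region from a handful of evaluations by fitting a Gaussian process and averaging its posterior mean over a dense grid. It is not the thesis's algorithm: the `performance` function, the kernel, the length scale, and the evenly spaced evaluation positions are all assumptions made for this example, whereas the thesis chooses positions with Bayesian optimization.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two arrays of 1-D positions."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior_mean(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean of a GP fitted to observations centred on their mean."""
    offset = y_train.mean()
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train - offset)
    return rbf_kernel(x_query, x_train) @ alpha + offset

def performance(x):
    # Hypothetical stand-in for "model performance at position x".
    return 0.8 + 0.1 * np.sin(2 * np.pi * x)

# A few expensive evaluations at assumed, evenly spaced positions.
x_train = np.linspace(0.0, 1.0, 8)
y_train = performance(x_train)

# Average the cheap GP surrogate over a dense grid covering the region [0, 1].
grid = np.linspace(0.0, 1.0, 201)
estimate = gp_posterior_mean(x_train, y_train, grid).mean()
print(f"estimated mean performance over the region: {estimate:.3f}")
```

The point of the surrogate is that the dense grid is only ever passed to the Gaussian process, never to the expensive performance function itself.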

The result is two different algorithms, both variations of Bayesian optimization. The first algorithm models the relationship between a set of points and some performance measure while also optimizing the function and thus finding the set of points which yields the highest performance. The second algorithm uses Bayesian optimization to approximate the integral of performance over the region of interest. The resulting algorithms are validated in two different simulated environments.
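To make the first algorithm more concrete, here is a minimal sketch of Bayesian optimization over fixed-size sets of positions. It is only an illustration under stated assumptions, not the method from the thesis: the `objective` function is hypothetical, each set is a sorted pair of positions in [0, 1] so that ordering does not matter, the candidate sets are sampled at random, and the acquisition rule is a plain upper confidence bound.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(A, B, length_scale=0.3):
    """Squared-exponential kernel between rows of A and rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq / length_scale**2)

def gp_posterior(X, y, Xq, noise=1e-6):
    """Posterior mean and standard deviation of a unit-variance GP."""
    offset = y.mean()
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Kq = rbf_kernel(Xq, X)
    mean = Kq @ np.linalg.solve(K, y - offset) + offset
    var = 1.0 - np.einsum("ij,ij->i", Kq, np.linalg.solve(K, Kq.T).T)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def objective(point_set):
    # Hypothetical expensive function: performance obtained from a set of
    # two positions, highest when they are near 0.25 and 0.75.
    a, b = point_set
    return -((a - 0.25) ** 2 + (b - 0.75) ** 2)

# Candidate sets: sorted pairs, so {a, b} and {b, a} map to the same input.
candidates = np.sort(rng.uniform(0.0, 1.0, size=(300, 2)), axis=1)
evaluated = [int(i) for i in rng.choice(len(candidates), size=3, replace=False)]
scores = [objective(candidates[i]) for i in evaluated]

for _ in range(15):
    mean, std = gp_posterior(candidates[evaluated], np.array(scores), candidates)
    ucb = mean + 2.0 * std          # optimistic estimate of performance
    ucb[evaluated] = -np.inf        # never re-evaluate a known set
    nxt = int(np.argmax(ucb))
    evaluated.append(nxt)
    scores.append(objective(candidates[nxt]))

best_set = candidates[evaluated[int(np.argmax(scores))]]
print("best set found:", best_set, "score:", max(scores))
```

Each iteration spends one expensive evaluation on the set the surrogate considers most promising or most uncertain, which is why this style of search suits objectives that are costly to evaluate.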

The resulting algorithms are not limited to machine learning: they can be used to optimize any function that takes a set of positions and returns a value, and they are most suitable when the function is expensive to evaluate.

Place, publisher, year, edition, pages
2019. p. 39
Keywords [en]
Bayesian optimization, AutoML, supervised learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-157327
ISRN: LIU-IDA/LITH-EX-A--19/016--SE
OAI: oai:DiVA.org:liu-157327
DiVA, id: diva2:1321271
Subject / course
Computer science
Presentation
2019-04-30, Alan Turing, E-huset, Linköpings universitet, Linköping, 10:15 (English)
Supervisors
Examiners
Available from: 2019-10-17 Created: 2019-06-07 Last updated: 2019-10-17 Bibliographically approved

Open Access in DiVA

david-bergstrom-masters-thesis (4495 kB), 8 downloads
File information
File name: FULLTEXT01.pdf
File size: 4495 kB
Checksum SHA-512:
ef000c5c506fad7087bce06f9cecfe41a8013cabed2ef3b0bc93eb42eb675f66893a15994b0c1daa918c52e9323ef5a21e9af990590fb45f7f94b7f700258bb2
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Bergström, David
By organisation
Artificial Intelligence and Integrated Computer Systems
Computer Sciences
