A Comparative Study of Black-box Optimization Algorithms for Tuning of Hyper-parameters in Deep Neural Networks
Luleå University of Technology, Department of Engineering Sciences and Mathematics.
2018 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Deep neural networks (DNNs) have been applied successfully across a range of data-intensive applications, from computer vision and language modeling to bioinformatics and search engines. Hyper-parameters of a DNN are defined as parameters that remain fixed during model training and heavily influence the DNN's performance. Hence, regardless of application, the design phase of constructing a DNN model is critical. By framing the selection and tuning of hyper-parameters as an expensive black-box optimization (BBO) problem, the obstacles encountered in manual, by-hand tuning can instead be addressed with an automated algorithmic approach.

In this work, the following BBO algorithms are evaluated side by side on two hyper-parameter optimization problem instances: the Nelder-Mead Algorithm (NM), Particle Swarm Optimization (PSO), Bayesian Optimization with Gaussian Processes (BO-GP), and the Tree-structured Parzen Estimator (TPE). The instances are Problem 1, incorporating a convolutional neural network, and Problem 2, incorporating a recurrent neural network. A simple Random Search (RS) algorithm, acting as a baseline for performance comparison, is also included in the experiments. The results show that the TPE algorithm achieves the overall highest performance with respect to mean solution quality and speed of improvement, with comparatively low trial-to-trial variability, on both Problem 1 and Problem 2. The NM, PSO, and BO-GP algorithms are shown to be capable of outperforming the RS baseline on Problem 1, but fail to do so on Problem 2.
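To make the Random Search baseline described above concrete, the sketch below shows a minimal random-search loop over a hypothetical two-dimensional hyper-parameter space. The search space, the `toy_objective` stand-in (used in place of actual DNN training), and all names are illustrative assumptions, not taken from the thesis.

```python
import math
import random

# Hypothetical search space for two DNN hyper-parameters (illustrative only;
# the thesis's actual search spaces are not reproduced here).
SPACE = {
    "learning_rate": (1e-4, 1e-1),     # continuous, sampled log-uniformly
    "batch_size": [32, 64, 128, 256],  # discrete choices
}

def sample_config(rng):
    """Draw one configuration uniformly at random from SPACE."""
    lo, hi = SPACE["learning_rate"]
    lr = math.exp(rng.uniform(math.log(lo), math.log(hi)))
    return {"learning_rate": lr, "batch_size": rng.choice(SPACE["batch_size"])}

def random_search(objective, n_trials=20, seed=0):
    """Evaluate n_trials random configurations; return (best_loss, best_config)."""
    rng = random.Random(seed)
    best_loss, best_cfg = float("inf"), None
    for _ in range(n_trials):
        cfg = sample_config(rng)
        loss = objective(cfg)  # in practice an expensive black-box evaluation
        if loss < best_loss:
            best_loss, best_cfg = loss, cfg
    return best_loss, best_cfg

# Cheap synthetic objective standing in for validation loss after training.
def toy_objective(cfg):
    return (cfg["learning_rate"] - 0.01) ** 2 + cfg["batch_size"] / 1e4

best_loss, best_cfg = random_search(toy_objective, n_trials=50)
```

The model-based methods compared in the thesis (BO-GP, TPE) differ from this loop only in how the next configuration is proposed: instead of sampling uniformly, they use the history of evaluated configurations to bias sampling toward promising regions.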

Place, publisher, year, edition, pages
2018.
National Category
Computational Mathematics
Identifiers
URN: urn:nbn:se:ltu:diva-69865
OAI: oai:DiVA.org:ltu-69865
DiVA, id: diva2:1223709
External cooperation
Fraunhofer-Chalmers centrum för industrimatematik
Educational program
Engineering Physics and Electrical Engineering, master's level
Supervisors
Examiners
Available from: 2018-06-29. Created: 2018-06-25. Last updated: 2018-06-29. Bibliographically approved.

Open Access in DiVA

fulltext (2035 kB), 63 downloads
File information
File name: FULLTEXT01.pdf
File size: 2035 kB
Checksum (SHA-512): bae4ab4cb277b21f4528e5cfd6832b709beb4550ed2c60e3fef46326e22804320226f4b558d50a9af7c881dead1eab67d5351eb919e81002954858e182387bf9
Type: fulltext
Mimetype: application/pdf

By organisation
Department of Engineering Sciences and Mathematics
Computational Mathematics

Total: 63 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are now no longer available.

Total: 85 hits