Digitala Vetenskapliga Arkivet

Gain estimation of linear dynamical systems using Thompson Sampling
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0002-6322-7857
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Decision and Control Systems (Automatic Control). ORCID iD: 0000-0003-0355-2663
2019 (English). In: Proceedings of Machine Learning Research / [ed] Kamalika Chaudhuri, Masashi Sugiyama, 2019, Vol. 89, p. 1535-1543. Conference paper, Published paper (Refereed)
Abstract [en]

We present the gain estimation problem for linear dynamical systems as a multi-armed bandit. This is an important engineering problem in control design, where performance guarantees are cast in terms of the largest gain of the frequency response of the system. The dynamical system is unknown and only noisy input-output data is available. In a more general setup, the noise perturbing the data is non-white and the variance at each frequency band is unknown, resulting in a two-dimensional Gaussian bandit model with unknown mean and scaled-identity covariance matrix. This model corresponds to a two-parameter exponential family. Within a bandit framework, the set of means is given by the frequency response of the system and, unlike traditional bandit problems, the goal here is to maximize the probability of choosing the arm whose samples have the mean with the largest norm. A problem-dependent lower bound for the expected cumulative regret is derived and a matching upper bound is obtained for a Thompson Sampling algorithm under a uniform prior over the variances and the two-dimensional means.
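
The bandit model described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `thompson_gain_estimation`, its parameters, and the three initial pulls per arm are assumptions made here for illustration. Each arm stands for one frequency and returns a two-dimensional Gaussian sample whose mean has norm equal to the gain at that frequency. Assuming a flat prior on the mean and on the variance, a standard conjugate calculation gives an inverse-gamma posterior for the variance (shape n - 2, scale S/2, where S is the sum of squared deviations from the sample mean) and a Gaussian posterior for the mean given the variance; the paper's exact posterior and initialization may differ.

```python
import numpy as np

def thompson_gain_estimation(means, sigmas, horizon, seed=0):
    """Thompson Sampling sketch for a 2-D Gaussian bandit (illustrative only).

    means  : (K, 2) true arm means, e.g. real/imaginary parts of a
             frequency response (hypothetical input for this sketch)
    sigmas : (K,) true noise standard deviations, unknown to the learner
    Returns the number of times each arm was pulled.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    K = len(means)
    n = np.zeros(K, dtype=int)   # pull counts
    s1 = np.zeros((K, 2))        # running sum of samples
    s2 = np.zeros(K)             # running sum of squared sample norms

    def pull(k):
        x = means[k] + sigmas[k] * rng.standard_normal(2)
        n[k] += 1
        s1[k] += x
        s2[k] += x @ x

    # Assumption: three pulls per arm, so that the posterior under the
    # flat prior is proper (inverse-gamma shape n - 2 must be positive).
    for k in range(K):
        for _ in range(3):
            pull(k)

    for _ in range(horizon - 3 * K):
        xbar = s1 / n[:, None]
        # S = sum_i ||x_i - xbar||^2, guarded against round-off
        S = np.maximum(s2 - n * np.sum(xbar**2, axis=1), 1e-12)
        # sigma^2 | data ~ Inverse-Gamma(n - 2, S / 2)
        sigma2 = (S / 2) / rng.gamma(shape=n - 2)
        # mu | sigma^2, data ~ N(xbar, sigma^2 / n * I)
        mu = xbar + np.sqrt(sigma2 / n)[:, None] * rng.standard_normal((K, 2))
        # Play the arm whose sampled mean has the largest norm
        pull(int(np.argmax(np.sum(mu**2, axis=1))))
    return n
```

Calling, for example, `thompson_gain_estimation([[1.0, 0.0], [0.0, 0.2], [0.3, 0.3]], [0.5, 0.5, 0.5], horizon=2000)` returns the pull counts per arm; with gaps this large, most pulls should concentrate on the first arm, whose mean has the largest norm.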

Place, publisher, year, edition, pages
2019. Vol. 89, p. 1535-1543
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
URN: urn:nbn:se:kth:diva-256045
ISI: 000509687901060
Scopus ID: 2-s2.0-85084995878
OAI: oai:DiVA.org:kth-256045
DiVA, id: diva2:1343465
Conference
22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019; LOISIR Hotel Naha, Naha, Japan; 16 April 2019 through 18 April 2019
Note

QC 20211012

Available from: 2019-08-16. Created: 2019-08-16. Last updated: 2022-06-26. Bibliographically approved.

Open Access in DiVA

fulltext (892 kB), 143 downloads
File information
File name: FULLTEXT01.pdf; File size: 892 kB; Checksum SHA-512:
19bb6e408cdf7eb539a6d180410fc9e82eca4a6f0a88392de39f6b07af97782f931fbb2e07c7a38855c5aaf687d091e9d22a9d294e927adcc10252c3f5407005
Type: fulltext; Mimetype: application/pdf

By author/editor
Müller, Matias I.; Rojas, Cristian R.