Change search
ReferencesLink to record
Permanent link

Direct link
Structured Stochastic Bandits
KTH, School of Electrical Engineering (EES), Automatic Control.ORCID iD: 0000-0003-2988-8464
2016 (English)Licentiate thesis, monograph (Other academic)
Abstract [en]

In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. Particularly, we investigate the case when the expected rewards are a Lipschitz function of the arm, and the learning to rank problem, as viewed from a MAB perspective. For the former, we derive a problem specific lower bound and propose both an asymptotically optimal algorithm (OSLB) and a (pareto)optimal, algorithm (POSLB). For the latter, we construct the regret lower bound and determine its closed form for some particular settings, as well as propose two asymptotically optimal algorithms PIE and PIE-C. For all algorithms mentioned above, we present performance analysis in the form of theoretical regret guarantees as well as numerical evaluation on artificial datasets as well as real-world datasets, in the case of PIE and PIE-C.

Place, publisher, year, edition, pages
Stockholm: KTH Royal Institute of Technology, 2016. , vii, 94 p.
TRITA-EE, ISSN 1653-5146 ; 2016:021
Keyword [en]
Multi-armed bandits, Learning to rank, reinforcement learning, Lipschitz Bandits
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
URN: urn:nbn:se:kth:diva-182816ISBN: 978-91-7595-880-4OAI: diva2:905873
2016-03-15, Q2, Osquldasväg 10, KTH, Stockholm, 10:00 (English)
EU, European Research Council

QC 20160223

Available from: 2016-02-23 Created: 2016-02-23 Last updated: 2016-02-23Bibliographically approved

Open Access in DiVA

Thesis(931 kB)167 downloads
File information
File name FULLTEXT01.pdfFile size 931 kBChecksum SHA-512
Type fulltextMimetype application/pdf

Search in DiVA

By author/editor
Magureanu, Stefan
By organisation
Automatic Control
Other Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 167 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

Total: 345 hits
ReferencesLink to record
Permanent link

Direct link