Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Efficient Online Learning under Bandit Feedback
KTH, School of Electrical Engineering and Computer Science (EECS), Automatic Control.ORCID iD: 0000-0003-2988-8464
2018 (English)Doctoral thesis, monograph (Other academic)
Abstract [en]

In this thesis we address the multi-armed bandit (MAB) problem with stochastic rewards and correlated arms. Particularly, we investigate the case when the expected rewards are a Lipschitz function of the arm and extend these results to bandits with arbitrary structure that is known to the decision maker. In these settings, we derive problem specific regret lower bounds and propose both an asymptotically optimal algorithm (OSLB and OSSB respectively) and (pareto) optimal algorithms (POSLB and POSB, in the generic setting). We further examine the \emph{learning to rank} problem, as viewed from a MAB perspective. We construct the regret lower bound and determine its closed form for some particular settings, as well as propose two asymptotically optimal algorithms PIE and PIE-C. We further present a mathematical model of the learning to rank problem where the need for diversity appears naturally and devise an order optimal, numerically competitive algorithm, LDR. For all algorithms mentioned above, we present performance analysis in the form of theoretical regret guarantees as well as numerical evaluation on artificial as well as real-world datasets.

Place, publisher, year, edition, pages
KTH Royal Institute of Technology, 2018. , p. 152
Series
TRITA-EE, ISSN 1653-5146 ; 2017:180
Keyword [en]
multi-armed bandits, reinforcement learning, learning to rank
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Electrical Engineering
Identifiers
URN: urn:nbn:se:kth:diva-222004ISBN: 978-91-7729-653-9 (print)OAI: oai:DiVA.org:kth-222004DiVA, id: diva2:1178410
Public defence
2018-02-20, F3, Lindstedtsvägen 26, Sing-Sing, floor 2, KTH Campus, Stockholm, 09:00
Opponent
Supervisors
Note

QC 20180130

Available from: 2018-01-30 Created: 2018-01-29 Last updated: 2018-01-30Bibliographically approved

Open Access in DiVA

fulltext(1735 kB)88 downloads
File information
File name FULLTEXT01.pdfFile size 1735 kBChecksum SHA-512
d5aca9361f1e05cc99ae8bf0e7900c0416618de07064c14f1d1968c14c838724ab9de628ac0a497b83e81138cd325691d128850fc75f58f0ad8e210ec3701b65
Type fulltextMimetype application/pdf

Search in DiVA

By author/editor
Magureanu, Stefan
By organisation
Automatic Control
Other Electrical Engineering, Electronic Engineering, Information Engineering

Search outside of DiVA

GoogleGoogle Scholar
Total: 88 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

isbn
urn-nbn

Altmetric score

isbn
urn-nbn
Total: 1239 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf