User Plane Selection for Core Networks using Deep Reinforcement Learning
KTH, School of Electrical Engineering and Computer Science (EECS).
2019 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Allocating service functions to a core network upon users' various demands is of importance in 5G networks. In this thesis work, we have studied reinforcement learning models to solve this allocation problem. More precisely, 1) we build a simple version of an MDP model for allocation in 5G core networks, and 2) we train an agent using a family of deep Q-learning (DQN) algorithms.

When the number of nodes in the core network is large, one critical challenge is overcoming the sampling inefficiency caused by the high-dimensional action space, i.e., most of the exploratory allocations made by the agent give zero reward. To deal with such reward sparsity, we applied two techniques: prioritized experience replay (PER) and hindsight experience replay (HER).

Our study shows that a DQN agent trained with both HER and PER provides a reasonable allocation in larger networks, whereas a vanilla DQN agent works only in the very limited case where the number of nodes is small.
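
The record links only the thesis PDF, not its code, so the sketch below is purely illustrative of the three ingredients the abstract names: a sparse-reward allocation MDP, HER goal relabeling, and PER sampling. Every name and number in it (AllocationEnv, her_relabel, per_sample, an 8-node/3-function instance, the 0/1 reward rule) is a hypothetical assumption, not the authors' implementation.

```python
# Illustrative sketch only -- all names, sizes, and the reward rule here are
# assumptions; the thesis's actual environment and agent are not reproduced
# in this record.
import random
from collections import namedtuple

Transition = namedtuple("Transition", "state action reward next_state done goal")


class AllocationEnv:
    """Toy MDP: place n_functions service functions, one per step, onto
    n_nodes core-network nodes.

    State:  nodes chosen so far, padded with -1 for unplaced functions.
    Action: index of the node that hosts the next service function.
    Reward: sparse -- 1.0 only if the finished allocation matches the
            demanded one (goal), else 0.0. With n_nodes ** n_functions
            terminal allocations, random exploration almost never hits
            the rewarding one, which is the sparsity the abstract describes.
    """

    def __init__(self, n_nodes=8, n_functions=3):
        self.n_nodes, self.n_functions = n_nodes, n_functions

    def reset(self):
        self.goal = tuple(random.randrange(self.n_nodes)
                          for _ in range(self.n_functions))
        self.placed = []
        return self._state()

    def _state(self):
        return tuple(self.placed + [-1] * (self.n_functions - len(self.placed)))

    def step(self, action):
        self.placed.append(action)
        done = len(self.placed) == self.n_functions
        reward = 1.0 if done and tuple(self.placed) == self.goal else 0.0
        return self._state(), reward, done


def her_relabel(episode):
    """Hindsight experience replay: relabel a (typically failed) episode as
    if the allocation actually reached had been the demanded one, so the
    trajectory still contains a reward-1 transition to learn from."""
    achieved = episode[-1].next_state  # the terminal allocation we ended up with
    return [t._replace(goal=achieved, reward=1.0 if t.done else 0.0)
            for t in episode]


def per_sample(buffer, priorities, batch_size, alpha=0.6):
    """Prioritized experience replay: draw transitions with probability
    proportional to priority ** alpha, where each priority would normally
    be |TD error| + epsilon, refreshed after every learning step."""
    weights = [p ** alpha for p in priorities]
    return random.choices(buffer, weights=weights, k=batch_size)


if __name__ == "__main__":
    env, episode, done = AllocationEnv(), [], False
    state = env.reset()
    while not done:
        action = random.randrange(env.n_nodes)  # purely exploratory policy
        next_state, reward, done = env.step(action)
        episode.append(Transition(state, action, reward, next_state, done, env.goal))
        state = next_state
    print("raw rewards      :", [t.reward for t in episode])               # almost surely all 0.0
    print("relabeled rewards:", [t.reward for t in her_relabel(episode)])  # ends in 1.0
```

Running the demo shows the point of HER in this setting: the random episode almost always earns all-zero rewards, while the relabeled copy of the same episode ends with reward 1.0, giving a DQN something to learn from before it ever satisfies a real demand. PER then concentrates updates on exactly those rare informative transitions.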

Abstract [sv]

Allocating service functions to a core network to meet users' varied demands is of great importance in 5G networks. In this master's project, we have studied reinforcement learning models to solve this allocation problem. More specifically: 1) we build a simple version of an MDP model for allocation in a 5G core network, and 2) we train an agent using a family of deep Q-learning (DQN) algorithms.

When the number of nodes in the core network is large, one of the biggest problems is the sampling inefficiency that arises from the high dimensionality of the action space, meaning that most exploratory actions give the agent zero reward. To handle this sparse-reward problem, we applied two techniques: prioritized experience replay (PER) and hindsight experience replay (HER).

Our studies show that a DQN agent trained with both HER and PER solves the allocation problem for larger networks, whereas a vanilla DQN agent only solves it for networks with a limited number of nodes.

Place, publisher, year, edition, pages
2019, p. 47
Series
TRITA-EECS-EX ; 2019:604
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:kth:diva-265780
OAI: oai:DiVA.org:kth-265780
DiVA, id: diva2:1380828
External cooperation
Ericsson
Educational program
Master of Science - Systems, Control and Robotics
Available from: 2019-12-19. Created: 2019-12-19. Last updated: 2019-12-19. Bibliographically approved.

Open Access in DiVA

fulltext (1879 kB)
File information
File name: FULLTEXT01.pdf
File size: 1879 kB
Checksum (SHA-512): b62db4e22b0c16d8cf8858565665c29f032b7113c5b314afa6dcd15ac3cc8fbc234f068a954b48e6fbdd19535de9b78876cd74f5dff2d80be6df7d19953235ad
Type: fulltext
Mimetype: application/pdf
