Concurrent Markov decision processes for robot team learning
Institute for Aerospace Studies, University of Toronto.
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Space Technology. ORCID iD: 0000-0003-4977-6339
2015 (English). In: Engineering applications of artificial intelligence, ISSN 0952-1976, E-ISSN 1873-6769, Vol. 39, p. 223-234, article id 12. Article in journal (Refereed). Published
Abstract [en]

Multi-agent learning, in a decision theoretic sense, may run into deficiencies if a single Markov decision process (MDP) is used to model agent behaviour. This paper discusses an approach to overcoming such deficiencies by considering a multi-agent learning problem as a concurrence between individual learning and task allocation MDPs. This approach, called Concurrent MDP (CMDP), is contrasted with other MDP models, including decentralized MDP. The individual MDP problem is solved by a Q-Learning algorithm, guaranteed to settle on a locally optimal reward maximization policy. For the task allocation MDP, several different concurrent individual and social learning solutions are considered. Through a heterogeneous team foraging case study, it is shown that the CMDP-based learning mechanisms reduce both simulation time and total agent learning effort.

Place, publisher, year, edition, pages
2015. Vol. 39, p. 223-234, article id 12
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering
Research subject
Onboard space systems
Identifiers
URN: urn:nbn:se:ltu:diva-9269
DOI: 10.1016/j.engappai.2014.12.007
ISI: 000349878400019
Scopus ID: 2-s2.0-84921775228
Local ID: 7dba5c06-d6bf-41af-918d-7cdb44312c4f
OAI: oai:DiVA.org:ltu-9269
DiVA, id: diva2:982207
Note
Validated; 2015; Level 2; 20150120 (andbra)
Available from: 2016-09-29 Created: 2016-09-29 Last updated: 2018-07-10 Bibliographically approved

Open Access in DiVA

No full text in DiVA

By author/editor
Emami, Reza