Digitala Vetenskapliga Arkivet

Simulation-Aided Policy Tuning for Black-Box Robot Learning
Hangzhou City Univ, Sch Informat & Elect Engn, Hangzhou 310015, Peoples R China.
Rhein Westfal TH Aachen, Inst Data Sci Mech Engn, Biointerface Lab, D-52062 Aachen, Germany; Tech Univ Munich, TUM Sch Computat Informat & Technol, Dept Comp Engn, Learning Syst & Robot Lab, D-80333 Munich, Germany; Munich Inst Robot & Machine Intelligence, D-80333 Munich, Germany.
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control; Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Artificial Intelligence; Aalto Univ, Dept Elect Engn & Automat, Espoo 02150, Finland. ORCID iD: 0000-0001-7340-2180
Zhejiang Univ, Coll Elect Engn, Hangzhou 310027, Peoples R China; Zhejiang Univ, Huzhou Inst, Huzhou 313000, Peoples R China.
2025 (English). In: IEEE Transactions on Robotics, ISSN 1552-3098, E-ISSN 1941-0468, Vol. 41, p. 2533-2548. Article in journal (Refereed). Published.
Abstract [en]

How can robots learn and adapt to new tasks and situations with little data? Systematic exploration and simulation are crucial tools for efficient robot learning. We present a novel black-box policy search algorithm focused on data-efficient policy improvements. The algorithm learns directly on the robot and treats simulation as an additional information source to speed up the learning process. At the core of the algorithm, a probabilistic model learns the dependence between the policy parameters and the robot learning objective not only by performing experiments on the robot, but also by leveraging data from a simulator. This substantially reduces interaction time with the robot. Using the model, we can guarantee improvements with high probability for each policy update, thereby facilitating fast, goal-oriented learning. We evaluate our algorithm on simulated fine-tuning tasks and demonstrate the data-efficiency of the proposed dual-information source optimization algorithm. In a real robot learning experiment, we show fast and successful task learning on a robot manipulator with the aid of an imperfect simulator.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2025. Vol. 41, p. 2533-2548
Keywords [en]
Robots, Robot learning, Closed box, Trajectory, Search problems, Optimization, Bayes methods, Tuning, Probabilistic logic, Hardware, Bayesian optimization (BO), sim-to-real
National Category
Robotics and automation, Control Engineering, Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-555387
DOI: 10.1109/TRO.2025.3539192
ISI: 001463453000003
Scopus ID: 2-s2.0-105002557810
OAI: oai:DiVA.org:uu-555387
DiVA, id: diva2:1954843
Available from: 2025-04-28. Created: 2025-04-28. Last updated: 2025-04-28. Bibliographically approved.

Open Access in DiVA

fulltext (2307 kB), 26 downloads
File information
File name: FULLTEXT01.pdf
File size: 2307 kB
Checksum (SHA-512):
c02cf93bc41be0fdbee65262f1335a526b431af1c2ac252dad02736f5c220dbef6392aed5c3b0bc115b1ded903a2d489fa463b9cfc1aa7b28e0ac06d28022bc4
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Baumann, Dominik