Unsupervised construction of 4D semantic maps in a long-term autonomy scenario
Ambrus, Rares (ORCID iD: 0000-0002-3111-3812)
KTH, School of Computer Science and Communication (CSC), Centre for Autonomous Systems (CAS); Robotics, Perception and Learning (RPL)
2017 (English) Doctoral thesis, monograph (Other academic)
Abstract [en]

Robots are operating for longer periods and collecting far more data than just a few years ago. In this setting we are interested in exploring ways of modeling the environment, segmenting out areas of interest and tracking those segmentations over time, with the purpose of building 4D models (i.e., 3D space plus time) of the relevant parts of the environment.

Our approach relies on repeatedly observing the environment and creating local maps at specific locations. The first question we address is how to choose where to build these local maps. Traditionally, an operator defines a set of waypoints on a pre-built map of the environment, which the robot then visits autonomously. Instead, we propose a method to automatically extract semantically meaningful regions from a point cloud representation of the environment. The resulting segmentation is purely geometric, and in the context of mobile robots operating in human environments, the semantic label associated with each segment (e.g., kitchen, office) can be of interest for a variety of applications. We therefore also look at how to obtain per-pixel semantic labels given the geometric segmentation, by fusing probabilistic distributions over scene and object types in a Conditional Random Field.
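
To make the fusion step concrete, here is a minimal sketch of how per-pixel object-type distributions could be combined with a per-segment scene-type distribution. It computes only the unary (per-pixel) potentials such a model would start from; the label sets, the compatibility table, and all names below are illustrative assumptions, not the thesis's actual CRF formulation, which would also include pairwise smoothness terms.

```python
import numpy as np

# Hypothetical label sets; the thesis's actual categories are not
# listed in the abstract.
SCENE_TYPES = ["kitchen", "office"]
OBJECT_TYPES = ["mug", "monitor", "wall"]

# Assumed compatibility table P(object | scene): how plausible each
# object class is inside a segment of a given scene type (rows: scenes).
P_OBJ_GIVEN_SCENE = np.array([
    [0.6, 0.1, 0.3],   # kitchen
    [0.1, 0.6, 0.3],   # office
])

def fuse_unaries(p_scene, p_obj_pixels):
    """Fuse a per-segment scene distribution with per-pixel object
    distributions into per-pixel posteriors, i.e. the unary potentials
    a CRF would start from (pairwise terms omitted here).

    p_scene:      (n_scenes,) distribution over scene types for the segment
    p_obj_pixels: (n_pixels, n_objects) per-pixel classifier outputs
    """
    prior = p_scene @ P_OBJ_GIVEN_SCENE        # (n_objects,) scene-aware prior
    fused = p_obj_pixels * prior[None, :]      # elementwise reweighting
    return fused / fused.sum(axis=1, keepdims=True)

# Example: a segment the scene classifier believes is probably a kitchen.
p_scene = np.array([0.8, 0.2])
p_obj = np.array([[0.45, 0.45, 0.10],    # pixel ambiguous between mug/monitor
                  [0.10, 0.20, 0.70]])   # pixel that is likely wall
print(fuse_unaries(p_scene, p_obj))
# The ambiguous pixel now leans toward "mug", consistent with "kitchen".
```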

For most robotic systems, the elements of interest in the environment are those which exhibit some dynamic properties (such as people, chairs, and cups), and the ability to detect and segment such elements provides a very useful initial segmentation of the scene. We propose a method to iteratively build a static map from observations of the same scene acquired at different points in time. Dynamic elements are obtained by computing the difference between the static map and new observations. We address the problem of clustering together dynamic elements which correspond to the same physical object, observed at different points in time and in significantly different circumstances. To address some of the inherent limitations of the sensors used, we autonomously plan, navigate around, and obtain additional views of the segmented dynamic elements. We look at methods of fusing the additional data and show that both a combined point cloud model and a fused mesh representation can be used to more robustly recognize the dynamic object in future observations. In the case of the mesh representation, we also show how a Convolutional Neural Network can be trained for recognition using mesh renderings.
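
As an illustration of the differencing idea, the sketch below flags the points of a newly registered observation that the static map does not explain, and groups them into candidate objects by Euclidean clustering. The thresholds, the toy data, and the use of a nearest-neighbour distance test with DBSCAN are assumptions made for this example, not the thesis's actual pipeline.

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.cluster import DBSCAN

def extract_dynamic(static_map, observation, dist_thresh=0.05):
    """Points of a registered observation that the static map does not
    explain: nearest static neighbour farther than dist_thresh (metres).
    Both inputs are (N, 3) arrays in a common frame; registration against
    the static map is assumed to have been done already."""
    dists, _ = cKDTree(static_map).query(observation, k=1)
    return observation[dists > dist_thresh]

def segment_dynamic(dynamic_points, eps=0.1, min_points=30):
    """Group leftover points into per-object clusters by Euclidean
    proximity; DBSCAN's label -1 marks noise and is discarded."""
    labels = DBSCAN(eps=eps, min_samples=min_points).fit_predict(dynamic_points)
    return [dynamic_points[labels == k] for k in set(labels) if k != -1]

# Toy example: a flat "wall" as the static map, plus a small blob
# (a newly appeared object) in the observation.
rng = np.random.default_rng(0)
wall = np.column_stack([rng.uniform(0, 2, 2000),
                        rng.uniform(0, 2, 2000),
                        np.zeros(2000)])
blob = rng.normal([1.0, 1.0, 0.3], 0.02, size=(200, 3))
observation = np.vstack([wall + rng.normal(0, 0.005, wall.shape), blob])

objects = segment_dynamic(extract_dynamic(wall, observation))
print(f"{len(objects)} dynamic object(s) found")  # expect 1: the blob
```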

Finally, we present a number of methods to analyse the data acquired by the mobile robot autonomously and over extended periods of time. First, we look at how the dynamic segmentations can be used to derive a probabilistic prior which can be applied in the mapping process to further improve and reinforce segmentation accuracy. We also investigate how to leverage spatial-temporal constraints in order to cluster dynamic elements observed at different points in time and under different circumstances. We show that by making a few simple assumptions we can increase the clustering accuracy even when the object's appearance varies significantly between observations. The result of the clustering is a spatial-temporal footprint of the dynamic object, defining an area where the object is likely to be observed spatially, as well as a set of timestamps corresponding to when the object was previously observed. Using this data, predictive models can be created and used to infer future times at which the object is more likely to be observed. In an object search scenario, such a model can be used to decrease the search time when looking for specific objects.
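
As a toy illustration of such a predictive model, the sketch below fits a smoothed hour-of-day frequency model to the timestamps of past sightings and ranks the hours at which a search is most likely to succeed. The abstract does not specify the actual temporal model used in the thesis, so the model form, names, and data here are purely assumptions.

```python
import numpy as np

def hourly_model(observation_hours, n_bins=24, alpha=1.0):
    """Estimate P(object observed | hour of day) from past sightings,
    with Laplace smoothing so unseen hours keep a small probability."""
    counts = np.bincount(np.asarray(observation_hours) % n_bins,
                         minlength=n_bins).astype(float)
    return (counts + alpha) / (counts.sum() + alpha * n_bins)

def best_search_times(model, k=3):
    """Hours at which a search for the object is most likely to succeed."""
    return np.argsort(model)[::-1][:k]

# Hypothetical sightings of a coffee mug, mostly around morning meetings.
sightings = [9, 9, 10, 10, 10, 11, 14, 9, 10]
model = hourly_model(sightings)
print(best_search_times(model))  # [10  9 14]: mid-morning ranks highest
```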

Place, publisher, year, edition, pages
KTH Royal Institute of Technology: Universitetsservice US AB, 2017, p. 160
Series
TRITA-CSC-A, ISSN 1653-5723; 22
Keywords [en]
Mobile robotics, autonomous systems, perception, computer vision, RGB-D object segmentation, modelling and recognition, semantic segmentation, long-term autonomy, mapping, temporal modeling
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-215323
ISBN: 978-91-7729-570-9 (print)
OAI: oai:DiVA.org:kth-215323
DiVA id: diva2:1147678
Public defence
2017-11-14, F3, Lindstedtsvägen 26, Stockholm, 13:00 (English)
Funder
EU, FP7, Seventh Framework Programme, 600623
Swedish Foundation for Strategic Research, C0475401
Note

QC 20171009

Available from: 2017-10-09. Created: 2017-10-06. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

Rares_Ambrus_PhD_Thesis (38814 kB)
File name: FULLTEXT01.pdf
File size: 38814 kB
Checksum (SHA-512): 7c0d63668752aca3f438429fc627b3be5a9dcab3dd7773a103894ba6b8a522c21ecd352f79a792a683ac6fbf6ed939ff916d8127d511c1ac6f16d30f522e233a
Type: fulltext
Mimetype: application/pdf

