Estimating 6D Pose from Depth and Color Images
2025 (English) Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Estimating the 6D pose of an object, i.e. its location and orientation, is a central problem in many applications of computer vision. One of those is robotic bin-picking, an automated process where a robot picks an object from one location and moves it to another. As factories become more automated, capable robotic bin-picking systems are increasingly in demand, and robust pose estimation is a vital basis for such systems.
There are many different approaches to the problem of 6D pose estimation, and this study assesses some of them. A large part of the study is an extensive search for state-of-the-art methods, with the BOP challenge as an initial reference point. Numerous recent methods build upon machine learning approaches, and though their potential seems great, classical approaches should not be overlooked. Therefore, a classical point cloud registration method, TEASER, is compared to a machine learning-based algorithm, SAM-6D.
To evaluate these two methods, large amounts of data are needed. Real RGBD (Red, Green, Blue, Depth) images are captured with a stereo camera that utilizes structured light to enhance the triangulation. To increase the size of the dataset, synthetic data are created using BlenderProc2, a procedural Blender pipeline, developed specifically for generating data for 6D pose estimation. In BlenderProc2, a physics engine is used to create realistic bin-picking scenes.
Both methods (TEASER and SAM-6D) proved able to accurately estimate the 6D pose of some objects within the scope of this project. However, both methods exhibited weaknesses, struggling to robustly estimate the pose of certain objects. Objects in heavily cluttered scenes, with large amounts of occlusion, proved more difficult to localize. Both methods also proved to be slow, requiring an impractical amount of time to process a bin-picking scene and locate the objects in it.
Place, publisher, year, edition, pages
2025. p. 94
Keywords [en]
6D Pose, Pose Estimation, Point Cloud, Robotic bin-picking, RGBD, Color image, Depth image, Computer Vision, Industrial Automation, Point Cloud Registration, Machine Learning, Synthetic Data Generation
National Category
Computer Vision and Learning Systems
Identifiers
URN: urn:nbn:se:liu:diva-212239
ISRN: LiTH-ISY-EX--25/5731--SE
OAI: oai:DiVA.org:liu-212239
DiVA, id: diva2:1944398
Subject / course
Computer Vision Laboratory
Presentation
2025-03-10, Systemet, B-huset, Linköpings universitet, Linköping, 13:15 (English)
Supervisors
Examiners
2025-03-19 2025-03-13 2025-03-19 Bibliographically approved