Automatic Semantic Segmentation of Indoor Datasets
2024 (English) Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]
Background: In recent years, computer vision has undergone significant advancements, revolutionizing fields such as robotics, augmented reality, and autonomous systems. Key to this transformation is Simultaneous Localization and Mapping (SLAM), a fundamental technology that allows machines to navigate and interact intelligently with their surroundings. Challenges persist in harmonizing spatial and semantic understanding, as conventional methods often treat these tasks separately, limiting comprehensive evaluations with shared datasets. As applications continue to evolve, the demand for accurate and efficient image segmentation ground truth becomes paramount. Manual annotation, a traditional approach, proves to be both costly and resource-intensive, hindering the scalability of computer vision systems. This thesis addresses the urgent need for a cost-effective and scalable solution by focusing on the creation of accurate and efficient image segmentation ground truth, bridging the gap between spatial and semantic tasks.
Objective: This thesis addresses the challenge of creating an efficient image segmentation ground truth to complement datasets with spatial ground truth. The primary objective is to reduce the time and effort taken for annotation of datasets.
Method: Our methodology adopts a systematic approach to evaluate and combine existing annotation techniques, focusing on precise object detection and robust segmentation. By merging these approaches, we aim to enhance annotation accuracy while streamlining the annotation process. This approach is systematically applied and evaluated across multiple datasets, including the NYU V2 dataset (which consists of 1449 labeled images), ARID (a real-world sequential dataset), and Italian flats (a sequential dataset created in Blender).
Results: The developed pipeline demonstrates promising outcomes, showcasing a substantial reduction in annotation time compared to manual annotation, thereby addressing the challenges posed by the cost and resource intensiveness of the traditional approach. We observe that although not initially optimized for SLAM datasets, the pipeline performs exceptionally well on both ARID and Italian flats datasets, highlighting its adaptability to real-world scenarios.
Conclusion: In conclusion, this research introduces an innovative annotation pipeline, offering a systematic and efficient approach to annotation. It tries to bridge the gap between spatial and semantic tasks, addressing the pressing need for comprehensive annotation tools in this domain.
Place, publisher, year, edition, pages
2024, p. 61
Keywords [en]
Semantic Segmentation, Annotation, SLAM, Indoor datasets, YOLO V8, DETIC, Segment Anything Model.
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:bth-26006
OAI: oai:DiVA.org:bth-26006
DiVA, id: diva2:1842045
External cooperation
Ericsson AB, Lund
Subject / course
DV2572 Master's Thesis in Computer Science
Educational program
DVADA Master Qualification Plan in Computer Science
Presentation
2024-01-23, J3208 Claude Shannon, Valhallavägen 1, Karlskrona, Blekinge, 13:00 (English)
Supervisors
Examiners
2024-03-12 2024-03-02 2024-03-12 Bibliographically approved