Combining RGB and Depth Images for Robust Object Detection using Convolutional Neural Networks
Independent thesis, advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Alternative title
Kombinera RGB- och djupbilder för robust objektdetektering med neurala faltningsnätverk (Swedish)
We investigated the advantage of combining RGB images with depth data to obtain more robust object classification and detection using pre-trained deep convolutional neural networks. We relied on the raw images from publicly available datasets captured with Microsoft Kinect cameras. The raw images varied in size and therefore had to be resized to fit our network. We designed a resizing method called "bleeding edge" to avoid distorting the objects in the images. We present a novel method that interpolates missing depth pixel values by comparing against similar RGB values. This method proved superior to the other methods tested. We showed that a simple colormap transformation of the depth image can provide close to state-of-the-art performance. Using our methods, we present state-of-the-art performance on the Washington Object dataset and provide some results on the Washington Scenes (V1) dataset. For detection, we used contours at different thresholds to find the likely object locations in the images. For the classification task, we report state-of-the-art results using RGB-only and RGB-D images; depth data alone gave close to state-of-the-art results. For the detection task, we found the RGB-only detector to be superior to the other detectors.
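The colormap transformation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which colormap or normalization was used, so everything here is an assumption made for illustration: the function name `depth_to_colormap`, the encoding of missing depth pixels as 0, the per-image min/max normalization, and the jet-like piecewise-linear ramps. The idea is only to show how a single-channel depth image might become a three-channel image that a network pre-trained on RGB data can accept.

```python
import numpy as np


def depth_to_colormap(depth, missing_value=0):
    """Map a single-channel depth image to a 3-channel uint8 image.

    Assumptions (not taken from the thesis): missing pixels are encoded
    as `missing_value`, depth is normalized per image, and a jet-like
    piecewise-linear colormap is used.
    """
    depth = np.asarray(depth, dtype=np.float64)
    valid = depth != missing_value
    out = np.zeros(depth.shape + (3,), dtype=np.uint8)
    if not valid.any():
        return out  # no valid depth readings: return an all-black image

    # Normalize valid depths to [0, 1]: 0 = nearest, 1 = farthest.
    lo, hi = depth[valid].min(), depth[valid].max()
    span = hi - lo if hi > lo else 1.0
    t = (depth - lo) / span

    # Jet-like ramps: blue dominates near, green in the middle, red far.
    r = np.clip(1.5 - np.abs(4.0 * t - 3.0), 0.0, 1.0)
    g = np.clip(1.5 - np.abs(4.0 * t - 2.0), 0.0, 1.0)
    b = np.clip(1.5 - np.abs(4.0 * t - 1.0), 0.0, 1.0)

    rgb = np.stack([r, g, b], axis=-1)
    rgb[~valid] = 0.0  # keep missing pixels black
    return (rgb * 255).round().astype(np.uint8)


# Hypothetical 2x2 depth image in millimetres; 0 marks a missing reading.
example = depth_to_colormap(np.array([[0, 500], [1000, 1500]]))
```

In this sketch missing pixels are simply left black; filling them in first, for example with an interpolation scheme such as the one the abstract describes, would be a separate preprocessing step.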
Keywords: CNN, Convolutional Neural Network, SVM, Support Vector Machine, RGB-D, Depth
Identifiers
URN: urn:nbn:se:kth:diva-174137
OAI: oai:DiVA.org:kth-174137
DiVA: diva2:858100