Deep Convolutional Neural Networks for Real-Time Single Frame Monocular Depth Estimation
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control.
2017 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Vision-based active safety systems have become increasingly common in modern vehicles, where they estimate the depth of objects ahead for autonomous driving (AD) and advanced driver-assistance systems (ADAS). In this thesis a lightweight deep convolutional neural network performing real-time depth estimation on single monocular images is implemented and evaluated. Many of the vision-based automatic braking systems in modern vehicles detect only pre-trained object types, such as pedestrians and vehicles, and fail to detect general objects such as road debris and roadside obstacles. Stereo vision systems resolve this problem by calculating a disparity image from the stereo image pair to extract depth information; the distance to an object can also be determined using radar and LiDAR systems. Using this depth information, the system performs the actions necessary to avoid collisions with objects determined to be too close. However, these systems are more expensive than a regular mono camera system and are therefore uncommon in the average consumer car. By implementing robust depth estimation in mono vision systems, the benefits of active safety systems could be extended to a much larger segment of the vehicle fleet. This could drastically reduce traffic accidents caused by human error and possibly save many lives.
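The stereo disparity-to-depth relation mentioned above can be sketched as follows. This is a minimal illustration, not code from the thesis; the focal length and baseline values are hypothetical (loosely KITTI-like), and zero disparity is treated as a missing measurement.

```python
import numpy as np

# Stereo depth from disparity: Z = f * B / d, where f is the focal length
# in pixels, B the baseline between the two cameras in metres, and d the
# per-pixel disparity in pixels. Values below are illustrative only.
focal_length_px = 721.5   # hypothetical focal length (pixels)
baseline_m = 0.54         # hypothetical stereo baseline (metres)

disparity = np.array([[10.0, 20.0],
                      [40.0, 80.0]])  # toy disparity map (pixels)

# Guard against division by zero where no disparity was found (d == 0).
valid = disparity > 0
depth = np.zeros_like(disparity)
depth[valid] = focal_length_px * baseline_m / disparity[valid]
# Larger disparity -> closer object: depth[1, 1] is the smallest value here.
```

Note the inverse relationship: disparity shrinks with distance, which is why stereo depth estimates degrade for far-away objects.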

The network architecture evaluated in this thesis is more lightweight than other CNN architectures previously used for monocular depth estimation, which makes it preferable for computationally constrained systems. During training the network solves a supervised regression problem in order to produce a pixel-wise depth estimation map. It was trained on sparse ground truth images with spatially incoherent and discontinuous data, yet outputs a dense, spatially coherent and continuous depth map prediction. The discontinuity introduced by the spatially incoherent ground truth was addressed by a masked loss function with regularization. The network was able to predict dense depth estimates on the KITTI dataset with close to state-of-the-art performance.
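The masking idea described above can be illustrated with a toy masked regression loss. This is a hedged sketch of the general technique, not the thesis's actual loss function: it assumes missing LiDAR returns are encoded as zeros in the ground truth, and it omits the regularization term.

```python
import numpy as np

def masked_mse_loss(prediction, ground_truth):
    """Mean squared error computed only over pixels where the sparse
    ground truth is defined (missing returns encoded here as 0)."""
    mask = ground_truth > 0
    if not mask.any():
        return 0.0
    diff = prediction[mask] - ground_truth[mask]
    return float(np.mean(diff ** 2))

# Toy example: a 2x2 prediction against a sparse target where only two
# pixels carry valid depth values; the masked pixels contribute nothing.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
gt   = np.array([[0.0, 2.5], [0.0, 3.0]])  # zeros mark missing returns
loss = masked_mse_loss(pred, gt)  # ((2.0-2.5)**2 + (4.0-3.0)**2) / 2 = 0.625
```

Masking the loss this way prevents the sparse, discontinuous ground truth from penalizing the network's dense predictions at pixels where no measurement exists.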

Place, publisher, year, edition, pages
2017, p. 70
Series
UPTEC F, ISSN 1401-5757 ; 17060
Keywords [en]
deep learning, machine learning, mono vision system, lightweight, CNN, convolutional neural network, depth estimation, lidar, kitti, vehicle camera, mono camera, camera, real-time, real time, ad, autonomous driving, adas, advanced driver assistance systems, mono depth, computer vision, regression, pixel-wise, pixel wise, object detection, general object detection, pedestrian detection, vehicle detection, supervised learning, supervised, tensorflow, python, keras, opencv, autoliv
National Category
Computer Vision and Robotics (Autonomous Systems)
Identifiers
URN: urn:nbn:se:uu:diva-336923
OAI: oai:DiVA.org:uu-336923
DiVA, id: diva2:1167554
External cooperation
Autoliv AB
Subject / course
Computer Systems Sciences
Educational program
Master Programme in Engineering Physics
Presentation
2017-11-27, Å2004, Lägerhyddsvägen 1, Uppsala, 16:00 (Swedish)
Available from: 2017-12-19. Created: 2017-12-19. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

fulltext (31058 kB), 998 downloads
File information
File name: FULLTEXT01.pdf
File size: 31058 kB
Checksum (SHA-512): 39a44c95641fb985fdb454b7eee2bf4140081bff7d481289d59720b9694286ba362972a50161c362e890fc469ef5ee56e8845e79ae656e28aa065910de835b6e
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Schennings, Jacob
By organisation
Division of Systems and Control
Computer Vision and Robotics (Autonomous Systems)

Total: 998 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
