Fast morphological image processing open-source extensions for GPU processing with CUDA
Luleå University of Technology, Department of Computer Science, Electrical and Space Engineering, Signals and Systems.
Luleå University of Technology.
2012 (English)
In: IEEE Journal on Selected Topics in Signal Processing, ISSN 1932-4553, E-ISSN 1941-0484, Vol. 6, no 7, 849-855 p.
Article in journal (Refereed) Published
Abstract [en]

GPU architectures offer a significant opportunity for faster morphological image processing, and the NVIDIA CUDA architecture offers a relatively inexpensive and powerful framework for performing these operations. However, the generic morphological erosion and dilation operations in the CUDA NPP library are relatively naive, and their running time scales poorly with increasing structuring element size. The objective of this work is to produce a freely available GPU capability for morphological operations so that fast GPU processing is readily available to the morphological image processing community. Open-source extensions to CUDA (hereafter referred to as LTU-CUDA) have been produced for erosion and dilation using a number of structuring elements for both 8-bit and 32-bit images. Support for 32-bit image data is a specific objective of the work, in order to facilitate fast processing of image data from 3D range sensors with high depth precision. Furthermore, the implementation specifically allows image size and structuring element size to scale for processing of large image sets. Images up to 4096 by 4096 pixels with 32-bit precision were tested. This scalability has been achieved by forgoing the use of shared memory in CUDA multiprocessors. The vHGW algorithm, which performs erosion and dilation in time independent of structuring element size, has been implemented for horizontal, vertical, and 45-degree line structuring elements, with significant performance improvements over NPP. However, memory handling limitations hinder performance in the vertical line case, so that its running time is not independent of structuring element size; this poses an interesting challenge for further optimisation. For larger structuring elements, this performance limitation is mitigated by transposing the image with an optimised transpose function, which is not default in NPP, and then applying the horizontal structuring element. LTU-CUDA is an ongoing project and the code is freely available at
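As an illustration of the vHGW (van Herk/Gil-Werman) idea mentioned in the abstract, the following is a minimal CPU sketch in Python, not the LTU-CUDA code. The function names (`erode_1d_vhgw`, `erode_1d_naive`, `erode_rows`, `erode_columns`, `transpose`) are hypothetical and chosen for this example only. It assumes a flat, centred line structuring element of odd length `p` and treats samples outside the image as +infinity. The per-pixel work is a constant number of comparisons regardless of `p`; the last helper shows the vertical case expressed as transpose, horizontal erosion, transpose.

```python
import math


def erode_1d_vhgw(signal, p):
    """Sliding minimum over a centred window of odd length p (vHGW scheme)."""
    if p < 1 or p % 2 == 0:
        raise ValueError("p must be a positive odd integer")
    n = len(signal)
    r = p // 2
    # Pad with +inf so border windows ignore out-of-range samples,
    # then extend the buffer to a whole number of blocks of length p.
    total = -(-(n + 2 * r) // p) * p
    f = [math.inf] * r + list(signal) + [math.inf] * (total - n - r)

    # g: running minimum from the start of each block, left to right.
    g = f[:]
    for i in range(total):
        if i % p != 0:
            g[i] = min(g[i - 1], f[i])

    # h: running minimum from the end of each block, right to left.
    h = f[:]
    for i in range(total - 1, -1, -1):
        if i % p != p - 1:
            h[i] = min(h[i + 1], f[i])

    # The window for output i covers padded indices i .. i + p - 1, which
    # touch at most two blocks: h covers the left part, g the right part.
    return [min(h[i], g[i + p - 1]) for i in range(n)]


def erode_1d_naive(signal, p):
    """Reference implementation whose cost grows with p."""
    r = p // 2
    return [min(signal[max(0, i - r): i + r + 1]) for i in range(len(signal))]


def transpose(image):
    """Transpose a list-of-rows image."""
    return [list(col) for col in zip(*image)]


def erode_rows(image, p):
    """Erosion with a horizontal line of length p."""
    return [erode_1d_vhgw(row, p) for row in image]


def erode_columns(image, p):
    """Erosion with a vertical line, done as transpose + horizontal pass."""
    return transpose(erode_rows(transpose(image), p))
```

Dilation follows the same pattern with `max` in place of `min` and negative infinity as the padding value. For example, `erode_1d_vhgw([5, 3, 8, 1, 9, 7, 2, 6], 3)` gives `[3, 3, 1, 1, 1, 2, 2, 2]`, matching the naive version.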

Place, publisher, year, edition, pages
2012. Vol. 6, no 7, 849-855 p.
Keyword [en]
Information technology - Signal processing
Keyword [sv]
Informationsteknik - Signalbehandling
Research subject
Signal Processing
URN: urn:nbn:se:ltu:diva-8288
DOI: 10.1109/JSTSP.2012.2204857
Local ID: 6c7190b8-d2f4-484a-8c66-97886560718e
OAI: diva2:981180
VSB - Vision Systems Business Development Platform
Validated; 2012; 20120603 (mjt)
Available from: 2016-09-29
Created: 2016-09-29
Bibliographically approved

Author
Thurley, Matthew