On Feature Extraction and Classification in Speech and Image Processing
2007 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

The natural world is home to innumerable patterns in various forms, which humans are able to locate and interpret by means of the senses. This thesis presents and explores different techniques that mimic such behavior through the use of artificial sensors and computational power, i.e. aspects of machine learning with particular emphasis on pattern recognition. Theory and practical issues are explored with respect to two main operations: feature extraction and classification. On the topic of feature extraction, this thesis introduces a new signal processing transform, denoted the Successive Mean Quantization Transform (SMQT). The relevant theory, extensions and numerical transformations are presented, along with possible usage of this transform in various situations. Two different classifiers are investigated: the hidden Markov model and the sparse network of winnows. The hidden Markov model is a stochastic model which has been used successfully in the context of various pattern recognition applications. During the implementation of a complete system using the hidden Markov model, a number of possible numerical issues can arise. The relevant theory behind these numerical issues is presented, as are a number of possible solutions. The sparse network of winnows is a general-purpose classifier. In the context of this thesis, it is tailored for the task of fast binary classification using lookup tables. Further, a scheme is proposed to split up this classifier in order to perform faster classification. This scheme is denoted the split up sparse network of winnows. The sections of this thesis dedicated to feature extraction and classification present a number of tools which are utilized further in three applications. The first application is concerned with the enhancement of noise-degraded speech. Specifically, this application addresses the task of reducing non-stationary noise in speech using the hidden Markov model. The second application addresses the task of automatic image enhancement. For this task, the Successive Mean Quantization Transform is investigated. The final application is concerned with face detection. For this task, illumination problems and speed issues are discussed, along with proposed solutions.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Institute of Technology, 2007. 166 p.
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 15
National Category
Signal Processing
URN: urn:nbn:se:bth-00380
Local ID: 978-91-7295-123-5
OAI: diva2:836630
Available from: 2012-09-18. Created: 2007-11-13. Last updated: 2015-06-30. Bibliographically approved.

By author/editor
Nilsson, Mikael