Source Localization and Speech Enhancement for Speech Recognition for Real time Environment
Blekinge Institute of Technology, School of Engineering.
2012 (English). Independent thesis, Advanced level (degree of Master (Two Years)). Student thesis.
Abstract [en]

The popularity of speech communication is increasing rapidly in contexts such as conferencing systems, mobile and fixed electronic devices, and laptops, leading to heightened demand for new services and improved speech quality. Dictaphones used for dictation usually have a single microphone, which does not provide enough degrees of freedom to estimate the location of the source. A microphone array uses multiple microphones for spatial filtering, suppressing background noise. This report aims at speech enhancement, using the benefits of microphone arrays to find the direction of the desired speaker and focus the listening beam in that direction. Generalized Cross Correlation (GCC) methods are compared for locating the source in a real office environment. Beamforming is implemented to make the microphone array listen in the desired direction, reducing interference from other sources. The Minimum Variance Distortionless Response (MVDR) approach is shown to give better results than simpler techniques. A perceptually based eigenfilter, incorporating human hearing models in the subspace domain, is included in the suppressor to eliminate the residual noise. Objective system performance is evaluated by estimating Signal-to-Noise Ratio Improvement (SNRI), segmental SNR, signal degradation and noise suppression. Perceptual Evaluation of Speech Quality (PESQ) gives a Mean Opinion Score for subjective evaluation.
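As a rough illustration of the GCC family mentioned in the abstract, the sketch below estimates the time delay between two microphone signals with the PHAT weighting and converts it to a direction of arrival. It is not taken from the thesis: the function names `gcc_phat` and `delay_to_angle`, the two-microphone far-field geometry, and the speed of sound of 343 m/s are assumptions made for this example only.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref`, in seconds (GCC-PHAT).

    A positive result means `sig` lags behind `ref`.
    """
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)           # cross-power spectrum
    cross /= np.abs(cross) + 1e-12       # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)        # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs

def delay_to_angle(tau, mic_distance, c=343.0):
    """Far-field direction of arrival in degrees for a two-microphone pair (assumed geometry)."""
    return float(np.degrees(np.arcsin(np.clip(c * tau / mic_distance, -1.0, 1.0))))
```

For example, with a hypothetical pair of microphones 0.1 m apart sampled at 16 kHz, `delay_to_angle(gcc_phat(x2, x1, 16000, max_tau=0.1 / 343.0), 0.1)` would give an angle estimate relative to broadside. Limiting `max_tau` to the physically possible delay keeps spurious correlation peaks from reverberation out of the search range.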

Place, publisher, year, edition, pages
2012. 58 p.
Keyword [en]
Beamforming, Localization, Lapped Transform, SRP-PHAT, MVDR, Subspace Suppression, PESQ
National Category
Computer Science; Signal Processing
URN: urn:nbn:se:bth-4130
Local ID: diva2:831453
Note: akbarali45@gmail.com
Available from: 2015-04-22 Created: 2013-01-09 Last updated: 2015-06-30 Bibliographically approved

Open Access in DiVA

fulltext (1218 kB), 116 downloads
File information
File name: FULLTEXT01.pdf
File size: 1218 kB
Checksum SHA-512
Type: fulltext
Mimetype: application/pdf
