Direction of Arrival Estimation and Localization of Multiple Speech Sources in Enclosed Environments
Blekinge Institute of Technology, School of Engineering, Department of Electrical Engineering.
2012 (English)
Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Speech communication is gaining in popularity in many different contexts as technology evolves. With the introduction of mobile electronic devices such as cell phones and laptops, and fixed electronic devices such as video and teleconferencing systems, more people are communicating, which leads to an increasing demand for new services and better speech quality. Methods to enhance speech recorded by microphones often operate blindly, without prior knowledge of the signals. With the addition of multiple microphones to allow for spatial filtering, many blind speech enhancement methods must also operate blindly in the spatial domain. When attempting to improve the quality of spoken communication, it is often necessary to reliably determine the location of the speakers. A dedicated source localization method, layered on top of the speech enhancement methods, can assist them by providing spatial information about the sources. This thesis addresses the problem of speech-source localization, with a focus on localization in the presence of multiple concurrent speech sources. The primary work consists of methods to estimate the direction of arrival of multiple concurrent speech sources from an array of sensors, and a method to correct the ambiguities that arise when estimating the spatial locations of multiple speech sources from multiple arrays of sensors. The thesis also improves the well-known SRP-based methods with higher-order statistics, and presents an analysis of how SRP-PHAT performs when the sensor array geometry is not fully calibrated. The thesis concludes with two envelope-domain-based methods for tonal pattern detection and for tonal disturbance detection and cancellation, which can be useful to further increase the usability of the proposed localization methods. The main contribution of the thesis is a complete methodology to spatially locate multiple speech sources in enclosed environments.
New methods and improvements to the combined solution are presented for the direction-of-arrival estimation, the location estimation, and the location ambiguity correction, together with a sensor array calibration sensitivity analysis.

Place, publisher, year, edition, pages
Karlskrona: Blekinge Institute of Technology, 2012.
Blekinge Institute of Technology Doctoral Dissertation Series, ISSN 1653-2090 ; 3
Keyword [en]
Beamforming, Detection and classification, Speech enhancement, Source localization
National Category
Signal Processing
URN: urn:nbn:se:bth-00520
Local ID: 978-91-7295-226-3
OAI: diva2:835040
Available from: 2012-09-18 Created: 2011-12-12 Last updated: 2016-09-06
Bibliographically approved


Author
Swartling, Mikael