Improvements of the voice activity detector in AMR-WB
2007 (English)
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]

In speech coding, periods of speech inactivity can be exploited to reduce the average bit rate of the encoded signal. This requires a process commonly referred to as Voice Activity Detection (VAD), which separates the speech frames from the frames that contain only background noise. The purpose of the VAD is to tell the speech encoder to stop or reduce the data flow when no speech is present. The goal of such a process is to lower the average bit rate without affecting the perceived speech quality. This work investigates and evaluates possible improvements to the voice activity detector in the Adaptive Multi-Rate Wideband (AMR-WB) speech coder. The purpose of the work was to reduce the sensitivity to babble background noise and to improve the detection of music. The report gives a brief introduction to the theory of speech coding and VAD, followed by an outline of the AMR-WB speech coder. The main part of the thesis discusses possible improvements to the detector, starting with recent findings for the Adaptive Multi-Rate Narrowband (AMR-NB) algorithm. Based on the limited material used for evaluation in this work, the modifications proposed for the AMR-NB VAD also showed good results for AMR-WB. It turned out, however, that additional modifications are needed to ensure reliable detection of high-level non-stationary noise. A music hangover solution is also suggested for better handling of music when the suggested modifications are implemented. The solution suggested for reducing the sensitivity to babble noise offers a compromise between voice activity and speech clipping that can be tuned to the desired performance. The results and conclusions in this thesis are based on objective tests of limited material and include no formal subjective testing. The conclusions should therefore be treated as guidance for further studies, but they indicate that the proposed solutions will reduce the AMR-WB VAD's sensitivity to non-stationary background noise.

Keyword [en]
Technology, Signal processing, AMR-WB, Voice Activity Detection, Speech, Coding
URN: urn:nbn:se:ltu:diva-51486
ISRN: LTU-EX--07/262--SE
Local ID: 8b1fd497-9535-420e-907d-d1fd1a1aa35c
OAI: diva2:1024847
Subject / course
Student thesis, at least 30 credits
Educational program
Arenaprogrammes (2002-2014)
Validated; 20101217 (root)
Available from: 2016-10-04
Created: 2016-10-04
Bibliographically approved
