Improvements of the voice activity detector in AMR-WB
2007 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

In speech coding one can make use of speech inactivity to reduce the average bit-rate of the encoded signal. This requires a process commonly referred to as Voice Activity Detection (VAD), which separates the speech frames from the frames that contain only background noise. The purpose of the VAD is to tell the speech encoder to stop or reduce the data flow when no speech is present. The goal of such a process is to lower the average bit-rate without affecting the perceived speech quality. This work is an investigation and evaluation of possible improvements of the voice activity detector in the Adaptive Multirate Wideband (AMR-WB) speech coder. The purpose of the work was to reduce the sensitivity to babble background noise and to improve the detection of music. The report gives a brief introduction to the theory of speech coding and VAD, followed by an outline of the AMR-WB speech coder. The main part of the thesis discusses possible improvements of the detector, starting with recent findings in the Adaptive Multirate Narrowband (AMR-NB) algorithm. Based on the limited material used for evaluation in this work, the modifications proposed for the AMR-NB VAD also showed good results for AMR-WB. It turned out, however, that additional modifications are needed to ensure reliable detection of high-level non-stationary noises. A music hangover solution was also suggested for better handling of music when the suggested modifications are implemented. The solution suggested for reducing the sensitivity to babble noise offers a compromise between voice activity and speech clipping that can be tuned to the desired performance. The results and conclusions in this thesis are based on objective tests of limited material and contain no formal subjective testing. The conclusions should therefore be treated as guidance for further studies, but they indicate that the proposed solutions will reduce the AMR-WB VAD's sensitivity to non-stationary background noises.
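The trade-off between voice activity and speech clipping mentioned in the abstract can be illustrated with a minimal sketch. This is not the AMR-WB detector and not the method proposed in the thesis: it is a plain frame-energy threshold with a fixed hangover, and the names `frame_energy_db`, `simple_vad`, `threshold_db` and `hangover_frames`, as well as the default values, are assumptions chosen for illustration only.

```python
import math


def frame_energy_db(frame):
    """Mean power of one frame in dB (a small floor avoids log of zero)."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def simple_vad(frames, threshold_db=-40.0, hangover_frames=2):
    """Return one True/False activity flag per frame.

    A frame is flagged active when its energy exceeds threshold_db
    (hypothetical fixed threshold; real detectors adapt to the noise level).
    After the last frame above the threshold, the flag is held for
    hangover_frames further frames to reduce clipping of weak speech endings.
    """
    flags = []
    hang = 0
    for frame in frames:
        if frame_energy_db(frame) > threshold_db:
            hang = hangover_frames  # restart the hangover counter
            flags.append(True)
        elif hang > 0:
            hang -= 1  # below threshold, but still inside the hangover
            flags.append(True)
        else:
            flags.append(False)
    return flags


# Synthetic frames: one loud frame surrounded by near-silent ones.
loud = [0.5] * 160
quiet = [0.0001] * 160
frames = [quiet, loud, quiet, quiet, quiet, quiet]

with_hangover = simple_vad(frames, hangover_frames=2)
without_hangover = simple_vad(frames, hangover_frames=0)
```

In this sketch a longer hangover marks more frames as active, which raises the voice activity (and hence the average bit-rate) but lowers the risk of clipping speech endings; a shorter hangover does the opposite.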

Place, publisher, year, edition, pages
2007.
Keyword [en]
Technology, Signal processing, AMR-WB, Voice Activity Detection, Speech, Coding
Keyword [sv]
Teknik
Identifiers
URN: urn:nbn:se:ltu:diva-51486
ISRN: LTU-EX--07/262--SE
Local ID: 8b1fd497-9535-420e-907d-d1fd1a1aa35c
OAI: oai:DiVA.org:ltu-51486
DiVA: diva2:1024847
Subject / course
Student thesis, at least 30 credits
Educational program
Arenaprogrammes (2002-2014)
Examiners
Note
Validated; 20101217 (root)
Available from: 2016-10-04
Created: 2016-10-04
Bibliographically approved

Open Access in DiVA

fulltext (581 kB), 13 downloads
File information
File name: FULLTEXT01.pdf
File size: 581 kB
Checksum (SHA-512): ec6a9d02d25b18919d7b25b3b2f4575e39bba4bf47d9bc43daf2b29a8abe3988485c2250c58ec1bee6f522eef5db3f20058ae60329d969a9ac7517b983c64507
Type: fulltext
Mimetype: application/pdf
