Bandwidth Extension of Telephony Speech
Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, Department of Electronics and Telecommunications.
2009 (English). Master's thesis (student thesis)
Abstract [en]

The public switched telephone network (PSTN) restricts the acoustic bandwidth of telephony speech to less than 4 kHz. For compatibility with analog telephone networks, a 0.3–3.4 kHz pass band is common. This bandwidth reduction has a significant impact on perceived quality, and is especially noticeable and even distracting when PSTN users call into, e.g., video conferencing systems in which the other participants may use wideband (50 Hz–7 kHz) speech codecs. To reduce the gap in quality, one may attempt to resynthesize the missing spectrum. Techniques for this are referred to as bandwidth extension (BWE). For this thesis, two systems for BWE of speech into the high band (f ≥ 3.4 kHz) were implemented in Matlab, based on systems proposed in the literature. The extension was done according to the linear source-filter model for speech, meaning that the excitation and the spectral envelope were estimated separately from the narrowband (0.3–3.4 kHz) signal. BWE System 1 made use of linear prediction (LP) analysis in combination with modulation for extension of the excitation. Its wideband spectral envelope estimation was primarily based on linear prediction cepstral coefficients (LPCC) and artificial neural networks (ANN). BWE System 2 made use of bandpass-modulation of Gaussian noise (BP-MGN) for extension of the excitation. Its wideband spectral envelope estimation was based on Mel-frequency cepstral coefficients (MFCC) and Gaussian mixture modelling (GMM), the more complex of the two estimation methods. Objective analysis of the two systems' spectral envelope estimation and informal listening tests were carried out. These analyses showed that BWE System 1 performed best, though both systems improved the perceived quality. BWE systems based on LP analysis therefore seem to be preferable due to the superior excitation and the efficient computation of the cepstrum.
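
The closing remark about the efficient computation of the cepstrum from LP analysis can be illustrated with a minimal sketch. It is not the thesis's Matlab code: the function names (`autocorrelation`, `levinson_durbin`, `lpc_to_cepstrum`), the LP order and the test frame are assumptions chosen for illustration. The sketch computes LP coefficients with the Levinson-Durbin recursion and then obtains the LPCC through the standard recursion on the LP coefficients, without any Fourier transform or logarithm.

```python
import math

def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r.

    Returns (a, err), where a[0] = 1 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a, err

def lpc_to_cepstrum(a, num_ceps):
    """Cepstral coefficients c[1..num_ceps] of the all-pole model 1/A(z).

    Uses c[n] = -a[n] - (1/n) * sum_k k * c[k] * a[n-k]; the gain term
    c[0] is not computed in this sketch.
    """
    p = len(a) - 1
    c = [0.0] * (num_ceps + 1)
    for n in range(1, num_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]

# Hypothetical narrowband frame: two sinusoids, 160 samples (assumed values).
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(160)]
r = autocorrelation(frame, 4)
a, err = levinson_durbin(r, 4)
lpcc = lpc_to_cepstrum(a, 8)
```

In a system like BWE System 1, a vector such as `lpcc` would serve as the narrowband feature from which the wideband envelope is estimated. The estimator itself (the ANN) is outside the scope of this sketch.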

Place, publisher, year, edition, pages
Institutt for elektronikk og telekommunikasjon, 2009. 79 p.
URN: urn:nbn:no:ntnu:diva-25084; Local ID: ntnudaim:4707; OAI: diva2:730486
Available from: 2014-06-27; Created: 2014-06-27; Last updated: 2014-06-27; Bibliographically approved

Open Access in DiVA

fulltext (1377 kB), 479 downloads
File information
File name: FULLTEXT01.pdf; File size: 1377 kB; Checksum: SHA-512
Type: fulltext; Mimetype: application/pdf
cover (46 kB), 20 downloads
File information
File name: COVER01.pdf; File size: 46 kB; Checksum: SHA-512
Type: cover; Mimetype: application/pdf
attachment (33371 kB), 69 downloads
File information
File name: ATTACHMENT01.zip; File size: 33371 kB; Checksum: SHA-512
Type: attachment; Mimetype: application/zip

Total: 479 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 2439 hits