Stereo coding for the ITU-T G.719 codec
Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences, Signals and Systems Group.
2011 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This thesis presents a stereo coding architecture for the ITU-T G.719 fullband mono codec. G.719 is suited to teleconferencing applications and offers competitive audio quality for speech and audio signals encoded at 32, 48 and 64 kbps. The proposed architecture uses parametric stereo coding: the spatial properties of the stereo channels are modeled by parameters, which are encoded and transmitted to the decoder together with an encoded downmix of the stereo channels. The architecture has been implemented in MATLAB, with the mono coding performed externally by a floating-point ANSI C implementation of the ITU-T G.719 codec.

Two parametric stereo models have been implemented in a framework operating in the complex-valued Modified Discrete Fourier Transform (MDFT) domain. The first model is based on inter-channel cues that represent the level differences, time differences and coherence between the stereo channels. These cues approximate the corresponding interaural cues that characterize how humans localize sound in space. The second model is based on the Karhunen-Loève Transform (KLT), with its associated rotation angles, the inter-channel time differences and residual scaling parameters. Both stereo models use an improved MDFT-domain extraction of the inter-channel time difference between the stereo channels. The extracted stereo parameters have been quantized non-uniformly, based on the spatial accuracy and frequency dependency of the human auditory system.
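To make the two parameter sets concrete, the following is a minimal sketch of how a level difference, a coherence and a KLT rotation angle could be computed for one frequency band of complex spectral bins. It uses common textbook definitions and is not taken from the thesis: the function names, the `eps` regularizer and the band layout are illustrative assumptions, and the time-difference extraction and the quantization described above are omitted.

```python
import math

def band_stats(left, right):
    """Return (energy_left, energy_right, cross_spectrum) for one band.

    `left` and `right` are equal-length sequences of complex spectral bins
    (hypothetical input layout, assumed for illustration).
    """
    e_l = sum(abs(x) ** 2 for x in left)
    e_r = sum(abs(x) ** 2 for x in right)
    cross = sum(l * r.conjugate() for l, r in zip(left, right))
    return e_l, e_r, cross

def inter_channel_cues(left, right, eps=1e-12):
    """Level difference in dB and coherence in [0, 1] for one band."""
    e_l, e_r, cross = band_stats(left, right)
    # eps (an assumption) guards against division by zero in silent bands.
    ild_db = 10.0 * math.log10((e_l + eps) / (e_r + eps))
    icc = abs(cross) / math.sqrt((e_l + eps) * (e_r + eps))
    return ild_db, icc

def klt_rotation_angle(left, right):
    """Angle of the principal axis of the 2x2 channel covariance matrix."""
    e_l, e_r, cross = band_stats(left, right)
    # For covariance [[a, b], [b, c]] the principal axis lies at
    # 0.5 * atan2(2b, a - c); here b is the real part of the cross term.
    return 0.5 * math.atan2(2.0 * cross.real, e_l - e_r)

# Usage: the right channel is the left channel attenuated by a factor of 2.
left = [1 + 1j, 2 - 1j, 0.5j]
right = [0.5 * x for x in left]
ild_db, icc = inter_channel_cues(left, right)
angle = klt_rotation_angle(left, right)
print(round(ild_db, 2), round(icc, 3), round(angle, 3))
```

In this toy case the channels are perfectly coherent, so the coherence is 1 and the rotation angle alone describes how the signal is distributed between the channels.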

The data rate of the stereo parameters has been estimated at around 4 kbps for each model. G.719 has therefore been used as the core codec at 44 and 60 kbps, so that the fullband stereo codec could be evaluated subjectively at total bitrates of 48 and 64 kbps. Compared with G.719 dual mono coding, i.e. independent mono coding of the stereo channels, the evaluation showed higher performance for the proposed stereo models on complex clean and reverberant speech signals. However, parametric stereo coding showed no consistent gain for noisy speech, mixed content and music signals. In addition, the first stereo model performed slightly but consistently better than the second in the subjective evaluation, although the difference was not significant.
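The bitrate split described above is simple arithmetic, sketched here for reference. The 4 kbps parameter rate and the 44 and 60 kbps core rates come from the abstract. The 20 ms frame length is an assumption (the G.719 frame size, which the abstract does not state), and the helper names are hypothetical.

```python
def total_bitrate_kbps(core_kbps, side_info_kbps):
    """Total stereo bitrate: mono core codec plus stereo parameters."""
    return core_kbps + side_info_kbps

def bits_per_frame(rate_kbps, frame_ms):
    """Bit budget per frame; kbit/s multiplied by ms gives bits."""
    return rate_kbps * frame_ms

STEREO_PARAM_KBPS = 4  # "around 4 kbps" per the abstract
FRAME_MS = 20          # assumption: G.719 frame length, not stated here

for core_kbps in (44, 60):
    print(core_kbps, "->", total_bitrate_kbps(core_kbps, STEREO_PARAM_KBPS))

# Approximate number of bits available for stereo parameters in each frame.
print(bits_per_frame(STEREO_PARAM_KBPS, FRAME_MS))
```

Under these assumptions the stereo parameters would have a budget of roughly 80 bits per frame, which illustrates why the non-uniform, perceptually motivated quantization matters.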

The results indicate a high potential for parametric stereo coding using the ITU-T G.719 codec. At an equal bitrate of 48 kbps, the average performance was better than that of the existing stereo codecs 3GPP AMR-WB+ and 3GPP eAAC+.

Place, publisher, year, edition, pages
2011. 164 p.
Series
UPTEC F, ISSN 1401-5757 ; 11 034
Keyword [en]
stereo coding, ITU-T G.719, audio, stereo models
Keyword [sv]
stereokodning, ITU-T G.719, ljud, stereomodeller
Identifiers
URN: urn:nbn:se:uu:diva-153636
OAI: oai:DiVA.org:uu-153636
DiVA: diva2:417362
Uppsok
Technology
Available from: 2011-05-17. Created: 2011-05-16. Last updated: 2011-05-17. Bibliographically approved.

Open Access in DiVA

fulltext (2492 kB), 1212 downloads
File information
File name: FULLTEXT01.pdf
File size: 2492 kB
Checksum (SHA-512):
7a8ee76d0d1dfe70f8e0848e6f89632d15a3c018ec3a358db0fe2103446c5151f2be160bb6c397a98cda0e433231e0822d0226bef0b1973856f74d356066816f
Type: fulltext
Mimetype: application/pdf

Total: 1212 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 589 hits