WIDEBAND CODING

Chair: Karlheinz Brandenburg, IIG

Optimizing High Quality Audio Coding: Advantages of Full System Observability

Authors:

Anibal J. S. Ferreira, INESC (PORTUGAL)

Volume 5, Page 3063

Abstract:

Perceptual audio coders rely on the efficient reduction of perceptually irrelevant components of the audio signal as well as on the removal of statistical signal redundancies to achieve good coding gains. In order to reach high compression ratios without reducing the subjective quality of the encoded audio signal, it is necessary to identify critically interdependent functional units of the encoding algorithm and to jointly optimize their performance. A flexible and interactive simulation and analysis environment has been programmed to assist the development and optimization of a new perceptual coder. The main features of this environment will be explained and the most relevant aspects that were found to limit the encoding performance will be presented.

300dpi TIFF Images of pages:

3063 3064 3065 3066

Acrobat PDF file of whole paper:

ic953063.pdf

High Quality Audio Coding Using Multipulse LPC and Wavelet Decomposition

Authors:

S. Boland, Queensland University of Technology (AUSTRALIA)
M. Deriche, Queensland University of Technology (AUSTRALIA)

Volume 5, Page 3067

Abstract:

Most current work in high quality audio coding falls into one of two categories: transform coding or sub-band coding. LPC coders, being based on models of the human voice production system, are found to be inappropriate for modelling music and other non-speech sounds. The Multipulse LPC model is shown to be a better model for such signals. In this paper we propose to improve the quality of the Multipulse model by first passing the signal of interest through a filter bank and then extracting the Multipulse parameters from each of the bandpass filter outputs. The filter bank is designed using the wavelet decomposition. Both the Multipulse model and the wavelet decomposition are well known, but the combination of the two has not yet been exploited. This combination is expected to lead to a new approach to high quality, low bit rate audio coding.

300dpi TIFF Images of pages:

3067 3068 3069 3070

Acrobat PDF file of whole paper:

ic953067.pdf

Audio Coding with Signal Adaptive Filterbanks

Authors:

J. Princen, AT&T Bell Laboratories (USA)
J.D. Johnston, AT&T Bell Laboratories (USA)

Volume 5, Page 3071

Abstract:

In this paper we present a high quality audio coding system based on a novel nonuniform modulated filterbank coupled with time-varying cosine modulated filterbanks in a cascade architecture. The system makes use of psychoacoustic thresholds in a natural way to adapt the resolution of the filterbank to achieve high coding gain on a wide range of signal types. Results show that the system provides excellent quality at 64 kb/s and good quality at 48 kb/s for monophonic coding.

300dpi TIFF Images of pages:

3071 3072 3073 3074

Acrobat PDF file of whole paper:

ic953071.pdf

Computationally Efficient Wavelet Packet Coding of Wide-Band Stereo Audio Signals

Authors:

Mark Black, The University of Western Ontario
Mehmet Zeytinoglu, Ryerson Polytechnic University (CANADA)

Volume 5, Page 3075

Abstract:

This paper presents a new audio compressor based on the wavelet packet (WP) decomposition. The major drawback of the present wideband multichannel audio compressors is the large computational effort associated with the subband decomposition and the psychoacoustic model. We integrate the psychoacoustic model with the design of the decomposition filterbank which separates the wideband signal into 28 subbands closely approximating the critical bands. The psychoacoustic model exploits noise masking and joint stereo coding to compress the subband signals. We demonstrate that the WP decomposition provides sufficient resolution to extract the time-frequency characteristics of the wideband input signal. The WP based audio compressor provides transparent sound quality at compression rates comparable to the MPEG compressor with less than one third of the computational effort.

300dpi TIFF Images of pages:

3075 3076 3077 3078

Acrobat PDF file of whole paper:

ic953075.pdf

Incorporation of Biorthogonality into Lapped Transforms for Audio Compression

Authors:

Shiufun Cheung, Massachusetts Institute of Technology (USA)
Jae S. Lim, Massachusetts Institute of Technology (USA)

Volume 5, Page 3079

Abstract:

Acoustic signal representations used in current audio coding algorithms can be improved by the incorporation of biorthogonality into Malvar's Extended Lapped Transform (ELT). Biorthogonality allows more flexibility in the design of the analysis and synthesis windows by increasing the number of degrees of freedom. This paper examines this increase for two special cases and demonstrates the importance of the additional flexibility to the proper implementation of psychoacoustic modeling, a feature central to all modern audio compression schemes.

300dpi TIFF Images of pages:

3079 3080 3081 3082

Acrobat PDF file of whole paper:

ic953079.pdf

Comparison of the Wavelet Decomposition and the Fourier Transform in TCX Encoding of Wideband Speech and Audio

Authors:

J-M. LeRoux, Matra Communication
R. Lefebvre, University of Sherbrooke (CANADA)
J-P. Adoul, University of Sherbrooke (CANADA)

Volume 5, Page 3083

Abstract:

This paper reports on the specific contribution of the Wavelet Transform (WT) in the TCX coding model for audio signals. TCX, or Transform Coded eXcitation, is a frame-based coding algorithm that uses both time domain (linear prediction) and frequency domain (transform coding) approaches to exploit signal redundancies as well as frequency masking. While previous work on TCX used the Discrete Fourier Transform (DFT), the quality for highly non-stationary signals such as percussion was less than satisfactory. The WT has therefore been investigated as a compromise between time and frequency resolution.

300dpi TIFF Images of pages:

3083 3084 3085 3086

Acrobat PDF file of whole paper:

ic953083.pdf

On the Performance of Wavelets for Low Bit Rate Coding of Audio Signals

Authors:

P.E. Kudumakis, King's College London (UK)
M.B. Sandler, King's College London (UK)

Volume 5, Page 3087

Abstract:

The performance of several wavelet families, together with a well-known family of QMFs for comparison, is investigated for low bit rate coding of audio signals. To assess the coding gain of these wavelets, both octave and uniform subband coding schemes have been evaluated, using both constant and dynamic bit allocation, with and without noiseless (Huffman) entropy coding. The influence of the complexity of these wavelets, in terms of the number of filter coefficients, on the quality of the decompressed audio signals, in terms of segmental SNR (dB), is presented at different bit rates. In addition, this evaluation suggests that perceptually transparent quality of monophonic signals can be achieved at 24 kbit/s (Fs = 8 kHz, 3 bits/sample) for speech applications and at 64 kbit/s (Fs = 48 kHz, 1.33 bits/sample) for music-related applications such as digital audio transmission and storage.
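
The two operating points quoted in the abstract follow from multiplying the sampling rate by the average number of bits per sample. A quick check, as a minimal sketch (the function name is illustrative and not taken from the paper):

```python
def bit_rate_kbps(sample_rate_hz: float, bits_per_sample: float) -> float:
    """Bit rate in kbit/s of a monophonic signal."""
    return sample_rate_hz * bits_per_sample / 1000.0


# Speech operating point: 8 kHz sampling, 3 bits per sample.
speech_kbps = bit_rate_kbps(8_000, 3)

# Music operating point: 48 kHz sampling, 1.33 bits per sample.
# This gives slightly under 64 kbit/s, which the abstract rounds to 64.
music_kbps = bit_rate_kbps(48_000, 1.33)

print(speech_kbps, music_kbps)
```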

300dpi TIFF Images of pages:

3087 3088 3089 3090

Acrobat PDF file of whole paper:

ic953087.pdf

Lossless Compression of High Fidelity Audio and Adaptive Linear Prediction

Authors:

Andrew L. Adams, Harris RF Communications
Steven W. McLaughlin, Rochester Institute of Technology (USA)

Volume 5, Page 3091

Abstract:

We consider the lossless compression of high fidelity (e.g. 16-bit) digital audio using adaptive linear prediction. Both linear predictive coding (LPC) and least mean squares (LMS) predictors are considered. Preliminary results are presented for the compression of industry standard Sound Quality Assessment Material (SQAM) [1] samples from 16 bits to 1.5 to 3 bits per sample. Previous results by others on the same audio source were in the 8-bit range.
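
The idea of lossless coding with an adaptive predictor can be sketched as follows. The abstract does not give the predictor structure, so this is an illustration under stated assumptions: it uses a normalized LMS variant (chosen here for a well-behaved step size), and the class name, filter order and step size are all hypothetical. The encoder sends only the integer prediction residual; the decoder runs the same predictor on the samples it has already reconstructed, so the round trip is exact.

```python
import math


class NlmsPredictor:
    """Normalized LMS predictor over the last `order` integer samples.

    Assumption: normalized LMS stands in for the LMS predictor named in
    the abstract; order and mu are illustrative values.
    """

    def __init__(self, order=8, mu=0.5):
        self.w = [0.0] * order
        self.hist = [0] * order  # most recent sample first
        self.mu = mu

    def _raw(self):
        return sum(w * x for w, x in zip(self.w, self.hist))

    def predict(self):
        # Integer prediction, so the residual is an integer as well.
        return int(round(self._raw()))

    def update(self, sample):
        # Adapt on the true sample, which both encoder and decoder know
        # at this point, then shift it into the history.
        err = sample - self._raw()
        energy = sum(x * x for x in self.hist) + 1.0
        step = self.mu * err / energy
        self.w = [w + step * x for w, x in zip(self.w, self.hist)]
        self.hist = [sample] + self.hist[:-1]


def encode(samples, **kw):
    p = NlmsPredictor(**kw)
    residuals = []
    for s in samples:
        residuals.append(s - p.predict())
        p.update(s)
    return residuals


def decode(residuals, **kw):
    p = NlmsPredictor(**kw)
    out = []
    for r in residuals:
        s = r + p.predict()
        out.append(s)
        p.update(s)
    return out


# Usage: a synthetic tone survives the round trip bit-exactly.
tone = [int(round(1000 * math.sin(0.05 * n))) for n in range(500)]
assert decode(encode(tone)) == tone
```

In a complete coder the residuals would then be entropy coded; that stage is omitted here.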

300dpi TIFF Images of pages:

3091 3092 3093 3094

Acrobat PDF file of whole paper:

ic953091.pdf

High-Quality Audio-Coding at Less than 64Kbit/s by Using Transform-Domain Weighted Interleave Vector Quantization (TwinVQ)

Authors:

Naoki Iwakami, NTT Human Interface Labs. (JAPAN)
Takehiro Moriya, NTT Human Interface Labs. (JAPAN)
Satoshi Miki, NTT Human Interface Labs. (JAPAN)

Volume 5, Page 3095

Abstract:

A new audio-coding method is proposed. The method, called transform-domain weighted interleave vector quantization (TwinVQ), achieves high-quality reproduction at less than 64 kbit/s. It is a transform coder based on the modified discrete cosine transform (MDCT). There are three novel techniques in this method: flattening of the MDCT coefficients by the spectrum of linear predictive coding (LPC) coefficients; interframe backward prediction for flattening the MDCT coefficients; and weighted interleave vector quantization. Evaluation tests showed that the reproduction quality of TwinVQ exceeded that of an MPEG Layer II coder at the same bitrate.
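
The "interleave" in the method's name refers to how transform coefficients are grouped into subvectors before vector quantization. The abstract does not state the interleaving rule, so the sketch below assumes a plain stride-based split, in which subvector j takes coefficients j, j + M, j + 2M, and so on. The function names are hypothetical, and the perceptual weighting and the quantizer itself are left out.

```python
def interleave_split(coeffs, num_subvectors):
    """Split a coefficient list into `num_subvectors` interleaved subvectors.

    Assumed rule: subvector j holds coeffs[j], coeffs[j + M], ... (M = num_subvectors).
    """
    return [coeffs[j::num_subvectors] for j in range(num_subvectors)]


def interleave_merge(subvectors):
    """Inverse of interleave_split: restore the original coefficient order."""
    m = len(subvectors)
    n = sum(len(v) for v in subvectors)
    out = [None] * n
    for j, v in enumerate(subvectors):
        out[j::m] = v
    return out


# Usage: ten coefficients split into four subvectors and merged back.
parts = interleave_split(list(range(10)), 4)
restored = interleave_merge(parts)
```

With such a split each subvector spans the whole frequency range rather than one contiguous band.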

300dpi TIFF Images of pages:

3095 3096 3097 3098

Acrobat PDF file of whole paper:

ic953095.pdf

Adaptive Filtering Algorithms for Stereophonic Acoustic Echo Cancellation

Authors:

J. Benesty, Telecom Paris
F. Amand, CNET LAA/TSS/CMC
A. Gilloire, CNET LAA/TSS/CMC
Y. Grenier, Telecom Paris (FRANCE)

Volume 5, Page 3099

Abstract:

It is likely that stereophonic (and more generally, multi-channel) sound pick-up, transmission and diffusion will be implemented in future teleconference systems to provide the users with enhanced quality. Therefore, adequate solutions must be found to solve the problem of stereophonic acoustic echo which will occur in such systems. We explain in this paper the difference between the mono and two-channel systems and the behavior of the two-channel classical adaptive algorithms in comparison with the same algorithms in the mono-channel case. Also, we outline a new NLMS-like algorithm derived from the two-channel RLS algorithm as a first member of a family of improved two-channel adaptive filters.
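
For orientation, a classical two-channel NLMS echo canceller can be sketched as below. This is not the new algorithm outlined in the paper, whose details the abstract does not give; it is only the baseline kind of two-channel adaptive filter that the abstract compares against. All names, the filter order and the step size are illustrative assumptions.

```python
import random


def fir(h, x):
    """Causal FIR filtering of sequence x with impulse response h."""
    return [sum(h[k] * x[i - k] for k in range(len(h)) if i - k >= 0)
            for i in range(len(x))]


def two_channel_nlms(x1, x2, d, order=4, mu=0.5, eps=1e-6):
    """Jointly adapt two FIR filters so that their summed output tracks d.

    x1, x2: the two loudspeaker signals; d: the microphone signal.
    Both filters share one error signal and one joint normalization.
    """
    w1 = [0.0] * order
    w2 = [0.0] * order
    b1 = [0.0] * order  # recent samples of channel 1, newest first
    b2 = [0.0] * order  # recent samples of channel 2, newest first
    errors = []
    for s1, s2, mic in zip(x1, x2, d):
        b1 = [s1] + b1[:-1]
        b2 = [s2] + b2[:-1]
        y = (sum(w * x for w, x in zip(w1, b1))
             + sum(w * x for w, x in zip(w2, b2)))
        e = mic - y
        norm = sum(x * x for x in b1) + sum(x * x for x in b2) + eps
        g = mu * e / norm
        w1 = [w + g * x for w, x in zip(w1, b1)]
        w2 = [w + g * x for w, x in zip(w2, b2)]
        errors.append(e)
    return w1, w2, errors


# Usage with synthetic, mutually independent channels (an idealization).
rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(3000)]
x2 = [rng.gauss(0, 1) for _ in range(3000)]
h1 = [0.5, -0.2, 0.1, 0.05]   # hypothetical echo path, channel 1
h2 = [0.3, 0.4, -0.1, 0.02]   # hypothetical echo path, channel 2
d = [a + b for a, b in zip(fir(h1, x1), fir(h2, x2))]
w1, w2, errors = two_channel_nlms(x1, x2, d)
```

The usage example feeds the two channels with independent noise, which is an idealization. In a teleconference both channels usually originate from the same talker and are therefore strongly correlated, and that correlation is what makes the two-channel case behave differently from the mono case.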

300dpi TIFF Images of pages:

3099 3100 3101 3102

Acrobat PDF file of whole paper:

ic953099.pdf