ICASSP99 Coding

Coding
On The Utilization Of Overshoot Effects In Low-Delay Audio Coding
Authors: Aki Härmä, Unto K. Laine, Matti Karjalainen
Paper number 1254
Abstract: In low-delay audio coding (coding delay < 5 ms) there is no time for detailed spectral modeling of brief percussive sounds, e.g., castanets, or of the onsets of music or speech sounds. On the other hand, it is known from psychoacoustic experiments that the ear is not accurate near the onset of a wideband sound. In this paper, we study the audibility of coding errors near the onsets of musical sounds in a simulated low-delay audio codec based on frequency-warped linear prediction. It is suggested that for many musical transients it is sufficient to reproduce a rough temporal and spectral envelope of the original signal during the first 5-10 ms. Preliminary listening tests support this idea. It is proposed that the overshoot effect of hearing could be utilized efficiently to enhance the performance of a low-delay audio coding scheme.

Scalable Audio Coder Based On Quantizer Units Of MDCT Coefficients
Authors: Akio Jin, Takehiro Moriya, Takeshi Norimatsu, Mineo Tsushima, Tomokazu Ishikawa
Paper number 1605
Abstract: A scalable codec has been constructed by using transform coding and basic modules for the scalable encoder and decoder. It allows users to choose among a variety of scalable configurations in the frequency domain. The basic module is a quantizer that can quantize MDCT (Modified DCT) coefficients transformed from a variety of frequency regions. This module mainly works at bit rates of more than 8 kbit/s. We can also change the target frequency regions of the basic module's input-output signals in each transform frame; i.e., we can change the scalable structure according to the nature of the input signals. In the scalable codec described here, the input-output signals are monaural and the sampling frequency is 24 kHz.
The total bit rate of this scalable codec is more than 8 kbit/s. Subjective quality evaluation tests, mainly for musical sound sources, showed that its sound quality is better than that of an MPEG2-layer3 codec at 8, 16, and 24 kbit/s when our scalable codec is constructed of 8-kbit/s basic modules. In combination with AAC (Advanced Audio Coding), our scalable codec will be chosen as an international standard in ISO/IEC-MPEG-4/Audio.

An Algorithm For Compression of Wideband and Diverse Speech and Audio Signals
Authors: Trevor R Trinkaus, Mark A Clements
Paper number 2026
Abstract: A compression scheme for diverse speech and audio signals is proposed. In this scheme, signals are analyzed with a 2-band QMF filterbank, and a Modulated Lapped Biorthogonal Transform (MLBT) is then applied to each of the filterbank channels. The transform coefficients are subsequently encoded using Laplacian-optimized scalar and vector quantizers, whose rates are determined by an estimated noise threshold, i.e., the masking threshold. Listening tests show that the coder achieves a quality at 32 kbit/s that is preferred over the ITU G.722 coder at 64 kbit/s, for speech, music, and more diverse signals consisting of speech in the presence of eventful background sounds. Both the delay of the coder, at 40 ms, and the level of complexity are moderate.

A New Forward Masking Model and Its Application to Perceptual Audio Coding
Authors: Yuan-Hao Huang (Room 333, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan R.O.C.), Tzi-Dar Chiueh (Room 511, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan R.O.C.)
Paper number 1363
Abstract: This paper presents a new forward masking model for perceptual audio coding.
This model exploits adaptation of the peripheral sensory and neural elements in the auditory system, which is often regarded as the cause of forward masking. Nonlinearity of the ear is modeled by a nonlinear analog circuit with difference equations. We incorporate this model in the MPEG Layer III audio coding scheme and construct a masking plane in the frequency-time space. With some extra computations, the new audio coding scheme can improve the sound quality of the decoded audio signals. In our experiments, subjective and objective sound quality measurements show that, to achieve the same reconstructed sound quality, the new scheme requires 12% to 23% fewer bits than the original MPEG Layer III scheme.

Best Wavelet-Packet Bases for Audio Coding Using Perceptual and Rate-Distortion Criteria
Authors: Markus Erne, George Moschytz, Christof Faller
Paper number 1442
Abstract: This paper presents a new approach to the adaptation of a wavelet filterbank based on perceptual and rate-distortion criteria. The system makes use of a wavelet-packet transform where each subband can have an individual time-segmentation. Boundary effects can be avoided by using overlapping blocks of samples, and therefore switching bases is possible at every tree-level without affecting other subbands. A modified psychoacoustic model using perceptual entropy can control the switching of the wavelet filterbank, and the individual time-segmentation of every subband makes it possible to take advantage of temporal masking. Additionally, a rate-distortion measure can control the filterbank for lossless audio coding applications or in cases where large coding gains can be achieved without using perceptual criteria. The weight of the perceptual measure as well as the weight of the rate-distortion measure can be selected individually, enabling a trade-off between lossless coding and perceptual coding.
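The best-basis idea behind a wavelet-packet coder of this kind can be illustrated with a minimal sketch. It is not the authors' method: it assumes a Haar filter pair, non-overlapping blocks, and a plain l1-norm cost as a stand-in for the perceptual-entropy and rate-distortion measures named in the abstract. The function names `haar_split`, `l1_cost`, and `best_basis` are hypothetical.

```python
import math

def haar_split(x):
    """One orthonormal Haar analysis step: returns (lowpass, highpass) halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def l1_cost(coeffs):
    """Additive stand-in cost (assumption): smaller means a sparser block."""
    return sum(abs(c) for c in coeffs)

def best_basis(x, max_depth, cost=l1_cost):
    """Bottom-up best-basis search over a wavelet-packet tree.

    Returns (total_cost, leaves); each leaf is (path, coefficients), where
    the path is a string of 'L'/'H' steps from the root ('' = unsplit).
    """
    own = cost(x)
    if max_depth == 0 or len(x) < 2 or len(x) % 2:
        return own, [("", x)]
    low, high = haar_split(x)
    c_lo, leaves_lo = best_basis(low, max_depth - 1, cost)
    c_hi, leaves_hi = best_basis(high, max_depth - 1, cost)
    # Keep the split only if the children are strictly cheaper than the parent.
    if c_lo + c_hi < own:
        leaves = [("L" + p, c) for p, c in leaves_lo]
        leaves += [("H" + p, c) for p, c in leaves_hi]
        return c_lo + c_hi, leaves
    return own, [("", x)]

if __name__ == "__main__":
    for name, sig in [("constant", [1.0] * 8), ("impulse", [1.0] + [0.0] * 7)]:
        total, leaves = best_basis(sig, max_depth=3)
        print(name, round(total, 4), [path for path, _ in leaves])
```

With this cost, a constant block is split repeatedly along its lowpass branch, while a single impulse is left unsplit in the time domain. A perceptual or rate-distortion measure could be passed through the `cost` argument, provided it is additive over subbands.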
Improving Perceptual Coding of Narrowband Audio Signals at Low Rates
Authors: Hossein Najafzadeh-Azghandi, Peter Kabal
Paper number 1779
Abstract: This paper discusses perceptual coding of narrowband audio signals at low rates. In particular, it proposes a new error measure which shapes the noise inside the critical bands, a window switching criterion based on the temporal masking effect of the hearing system, a more accurate model of the simultaneous masking effect of the hearing system, perceptually-based bit allocation algorithms based on two different approaches towards quantization noise shaping, and a predictive vector quantization scheme to code the scale factors. The resulting coding scheme outperforms existing low rate speech coders for non-speech signals.

Subband-Domain Filtering of MPEG Audio Signals
Authors: Chris A Lanciani, Ronald W Schafer
Paper number 1994
Abstract: The cosine modulated filter bank is commonly used for the time-frequency decomposition of audio signals. For example, it is a basic element of the MPEG-1 and MPEG-2 audio coding standards. While this filter bank is not perfectly reconstructing, it does provide for the cancelation of aliasing components that are introduced during the analysis decomposition. If the subband signals are to be processed, care must be taken to preserve the properties of the subband signals such that the aliased terms will be canceled successfully in the synthesis filter bank despite the modification of the subband signals. In this paper, a framework is provided for the generation and application of arbitrary FIR filters to signals that have been decomposed using the MPEG filter bank.
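The warning about modifying subband signals can be made concrete with a small sketch. It assumes a two-band Haar filter bank with decimation by 2 as a stand-in for the 32-band MPEG filter bank, and it does not reproduce the framework of the paper. The helper names (`analyze`, `synthesize`, `process`, `is_shift_invariant`) are hypothetical. The check relies on the fact that a system free of aliasing is time-invariant, so delaying the input by one sample must delay the output by one sample.

```python
import math

S = 1.0 / math.sqrt(2.0)

def analyze(x):
    """Two-band Haar analysis with decimation by 2 (stand-in bank, assumption)."""
    low = [S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def synthesize(low, high):
    """Matching Haar synthesis: interleaves the reconstructed sample pairs."""
    out = []
    for l, h in zip(low, high):
        out += [S * (l + h), S * (l - h)]
    return out

def process(x, g_low, g_high):
    """Scale each subband by its own gain, then resynthesize."""
    low, high = analyze(x)
    return synthesize([g_low * v for v in low], [g_high * v for v in high])

def shift(x):
    """Circular delay by one sample."""
    return x[-1:] + x[:-1]

def is_shift_invariant(x, g_low, g_high, tol=1e-9):
    """True if delaying the input just delays the output (no alias residue)."""
    a = process(shift(x), g_low, g_high)
    b = shift(process(x, g_low, g_high))
    return all(abs(p - q) <= tol for p, q in zip(a, b))

if __name__ == "__main__":
    x = [1.0, 4.0, -2.0, 0.5, 3.0, -1.0, 2.0, 0.0]
    print(is_shift_invariant(x, 1.0, 1.0))  # unmodified subbands
    print(is_shift_invariant(x, 1.0, 0.0))  # high band zeroed out
```

With equal gains the aliasing introduced by decimation cancels in synthesis and the overall system behaves like a plain gain. With unequal gains the cancelation is incomplete, which shows up here as a loss of shift invariance.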