Coding


On The Utilization Of Overshoot Effects In Low-Delay Audio Coding

Authors:

Aki Härmä,
Unto K. Laine,
Matti Karjalainen,

Page (NA) Paper number 1254

Abstract:

In low-delay audio coding (coding delay < 5 ms) there is no time for detailed spectral modeling in the case of brief percussive sounds, e.g., the castanets, and onsets of music or speech sounds. On the other hand, it is known from psychoacoustic experiments that the ear is not accurate near the onset of a wideband sound. In this paper, we study the audibility of coding errors near the onsets of musical sounds in a simulated low-delay audio codec based on frequency-warped linear prediction. It is suggested that for many musical transients it is sufficient to reproduce a rough temporal and spectral envelope of the original signal during the first 5-10 ms. Preliminary listening tests support this idea. It is proposed that the overshoot effect of hearing could be utilized efficiently in enhancing the performance of a low-delay audio coding scheme.

IC991254.PDF (From Author) IC991254.PDF (Rasterized)
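
The abstract above mentions frequency-warped linear prediction but does not give details. As a rough illustration only, the sketch below assumes the common textbook formulation in which each unit delay is replaced by a first-order allpass section with coefficient `lam`, and computes the resulting frequency mapping. The function name and the value `LAMBDA = 0.7` are hypothetical choices for the example, not values taken from the paper.

```python
import math

def warped_frequency(omega, lam):
    """Map a normalized angular frequency omega (rad/sample, 0..pi)
    through a first-order allpass with coefficient lam (|lam| < 1).

    This is the negated phase of D(z) = (z^-1 - lam) / (1 - lam z^-1).
    """
    return omega + 2.0 * math.atan2(lam * math.sin(omega),
                                    1.0 - lam * math.cos(omega))

# Hypothetical warping coefficient; the abstract does not state one.
LAMBDA = 0.7

if __name__ == "__main__":
    for frac in (0.1, 0.25, 0.5, 0.9):
        omega = frac * math.pi
        nu = warped_frequency(omega, LAMBDA)
        print(f"omega = {frac:.2f}*pi  ->  warped = {nu / math.pi:.3f}*pi")
```

With a positive coefficient the mapping stretches low frequencies and compresses high ones, which is why warped predictors can spend more of their order on the range where hearing is most selective.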


Scalable Audio Coder Based On Quantizer Units Of MDCT Coefficients

Authors:

Akio Jin,
Takehiro Moriya,
Takeshi Norimatsu,
Mineo Tsushima,
Tomokazu Ishikawa,

Page (NA) Paper number 1605

Abstract:

A scalable codec has been constructed by using transform coding and basic modules for the scalable encoder and decoder. It allows users to choose from a variety of scalable configurations in the frequency domain. The basic module is a quantizer that can quantize MDCT (Modified DCT) coefficients transformed from a variety of frequency regions. This module mainly works at bit rates of more than 8 kbit/s. We can also change the target frequency regions of the basic module's input-output signals in each transform frame; i.e., we can change the scalable structure according to the nature of input signals. In the scalable codec described here, the input-output signals are monaural and the sampling frequency is 24 kHz. The total bit rate of this scalable codec is more than 8 kbit/s. Subjective quality evaluation tests, mainly for musical sound sources, showed that its sound quality is better than that of an MPEG2-layer3 codec at 8, 16, and 24 kbit/s when our scalable codec is constructed of 8-kbit/s basic modules. In combination with AAC (Advanced Audio Coding), our scalable codec will be chosen as an international standard in ISO/IEC-MPEG-4/Audio.

IC991605.PDF (From Author) IC991605.PDF (Rasterized)
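
The codec described above operates on MDCT coefficients. The following is a minimal sketch of a windowed MDCT/IMDCT pair with 50% overlap-add, shown only to make the transform concrete. It assumes a sine window and a direct O(N^2) evaluation; the paper's actual window, frame length, and quantizer modules are not given in the abstract, and all function names here are hypothetical.

```python
import math

def sine_window(n_half):
    """Sine window of length 2*n_half; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half))
            for n in range(2 * n_half)]

def mdct(block, window):
    """Windowed MDCT: 2N input samples -> N coefficients."""
    N = len(block) // 2
    return [sum(window[n] * block[n] *
                math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, window):
    """Windowed inverse MDCT: N coefficients -> 2N aliased samples."""
    N = len(coeffs)
    return [window[n] * (2.0 / N) *
            sum(coeffs[k] *
                math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for k in range(N))
            for n in range(2 * N)]

def analysis_synthesis(signal, n_half):
    """Run MDCT then IMDCT with overlap-add; len(signal) must be a
    multiple of n_half. Zero padding lets every sample be covered twice."""
    w = sine_window(n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        frame = imdct(mdct(padded[start:start + 2 * n_half], w), w)
        for n, value in enumerate(frame):
            out[start + n] += value  # time-domain aliasing cancels here
    return out[n_half:n_half + len(signal)]

if __name__ == "__main__":
    x = [math.sin(0.3 * n) for n in range(32)]
    y = analysis_synthesis(x, 8)
    print("max reconstruction error:", max(abs(a - b) for a, b in zip(x, y)))
```

In a real coder the coefficients would be quantized between `mdct` and `imdct`; a scalable scheme of the kind described could apply separate quantizer modules to different index ranges of the coefficient vector, but that step is left out here because the abstract does not specify it.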


An Algorithm For Compression of Wideband and Diverse Speech and Audio Signals

Authors:

Trevor R Trinkaus,
Mark A Clements,

Page (NA) Paper number 2026

Abstract:

A compression scheme for diverse speech and audio signals is proposed. In this scheme, signals are analyzed with a 2-band QMF filter bank followed by the application of a Modulated Lapped Biorthogonal Transform (MLBT) to each of the filter bank channels. Subsequent encoding of transform coefficients is performed using Laplacian-optimized scalar and vector quantizers, whose rates are determined by an estimated noise threshold, i.e., masking threshold. Listening tests show that the coder achieves a quality at 32 kbit/s that is preferred over the ITU G.722 coder at 64 kbit/s, for speech, music, and more diverse signals consisting of speech in the presence of eventful background sounds. Both the delay of the coder, at 40 ms, and the level of complexity are moderate.

IC992026.PDF (From Author) IC992026.PDF (Rasterized)
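
The first stage of the scheme above is a 2-band QMF split. The sketch below illustrates the idea with the shortest possible QMF pair (the two-tap Haar filters), which is an assumption made purely for brevity; the abstract does not say which prototype filter the authors use, and practical coders use much longer filters for better band separation. Function names are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)

def qmf_analysis(x):
    """Split an even-length signal into low and high bands, each
    decimated by 2, using H0 = (1 + z^-1)/sqrt(2) and H1(z) = H0(-z)."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    low = [_S * (x[2 * i] + x[2 * i + 1]) for i in range(len(x) // 2)]
    high = [_S * (x[2 * i] - x[2 * i + 1]) for i in range(len(x) // 2)]
    return low, high

def qmf_synthesis(low, high):
    """Recombine the two decimated bands into the full-rate signal."""
    x = []
    for lo, hi in zip(low, high):
        x.append(_S * (lo + hi))
        x.append(_S * (lo - hi))
    return x

if __name__ == "__main__":
    signal = [math.sin(0.2 * n) for n in range(16)]
    low, high = qmf_analysis(signal)
    rebuilt = qmf_synthesis(low, high)
    print("max error:", max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

Each band runs at half the input rate, so a per-band transform such as the MLBT mentioned in the abstract would be applied to `low` and `high` separately.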


A New Forward Masking Model and Its Application to Perceptual Audio Coding

Authors:

Yuan-Hao Huang, Room 333, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan R.O.C (Taiwan)
Tzi-Dar Chiueh, Room 511, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan R.O.C (Taiwan)

Page (NA) Paper number 1363

Abstract:

This paper presents a new forward masking model for perceptual audio coding. This model exploits adaptation of the peripheral sensory and neural elements in the auditory system, which is often regarded as the cause of forward masking. Nonlinearity of the ear is modeled by a nonlinear analog circuit with difference equations. We incorporate this model in the MPEG Layer III audio coding scheme and construct a masking plane in the frequency-time space. With some extra computations, the new audio coding scheme can improve the sound quality of the decoded audio signals. In our experiments, subjective and objective sound quality measurements show that, to achieve the same reconstructed sound quality, the new scheme requires 12% to 23% fewer bits than the original MPEG Layer III scheme.

IC991363.PDF (From Author) IC991363.PDF (Rasterized)


Best Wavelet-Packet Bases for Audio Coding Using Perceptual and Rate-Distortion Criteria

Authors:

Markus Erne,
George Moschytz,
Christof Faller,

Page (NA) Paper number 1442

Abstract:

This paper presents a new approach to the adaptation of a wavelet filterbank based on perceptual and rate-distortion criteria. The system makes use of a wavelet-packet transform where each subband can have an individual time-segmentation. Boundary effects can be avoided by using overlapping blocks of samples, so switching bases is possible at every tree level without affecting other subbands. A modified psychoacoustic model using perceptual entropy can control the switching of the wavelet filterbank, and the individual time-segmentation of every subband makes it possible to take advantage of temporal masking. Additionally, a rate-distortion measure can control the filterbank for lossless audio coding applications or in cases where large coding gains can be achieved without using perceptual criteria. The weight of the perceptual measure as well as the weight of the rate-distortion measure can be selected individually, enabling a trade-off between lossless coding and perceptual coding.

IC991442.PDF (From Author) IC991442.PDF (Rasterized)


Improving Perceptual Coding of Narrowband Audio Signals at Low Rates

Authors:

Hossein Najafzadeh-Azghandi,
Peter Kabal,

Page (NA) Paper number 1779

Abstract:

This paper discusses perceptual coding of narrowband audio signals at low rates. In particular, it proposes a new error measure which shapes the noise inside the critical bands, a window switching criterion based on the temporal masking effect of the hearing system, a more accurate model of the simultaneous masking effect of the hearing system, perceptually based bit allocation algorithms built on two different approaches to quantization noise shaping, and a predictive vector quantization scheme to code the scale factors. The resulting coding scheme outperforms existing low rate speech coders for non-speech signals.

IC991779.PDF (From Author) IC991779.PDF (Rasterized)


Subband-Domain Filtering of MPEG Audio Signals

Authors:

Chris A Lanciani,
Ronald W Schafer,

Page (NA) Paper number 1994

Abstract:

The cosine-modulated filter bank is commonly used for the time-frequency decomposition of audio signals. For example, it is a basic element of the MPEG-1 and MPEG-2 audio coding standards. While this filter bank is not perfectly reconstructing, it does provide for the cancellation of aliasing components that are introduced during the analysis decomposition. If the subband signals are to be processed, care must be taken to preserve the properties of the subband signals so that the aliased terms are still canceled in the synthesis filter bank despite the modification of the subband signals. In this paper, a framework is provided for the generation and application of arbitrary FIR filters to signals that have been decomposed using the MPEG filter bank.

IC991994.PDF (From Author) IC991994.PDF (Rasterized)
