Low Bit Rate Speech Coding I

Low Delay Multi-level Decomposition and Quantisation Techniques for WI Coding

Authors:

Nicola R Chong
Ian S Burnett
Joe F Chicharo

Page (NA) Paper number 1419

Abstract:

For efficient coding of speech, it is desirable to separate the slowly and rapidly evolving spectral components to take advantage of their different perceptual qualities. In this paper, we present a multi-level wavelet decomposition mechanism, using low-delay FIR filters, applied to Waveform Interpolation coding. The technique overcomes the substantial delay problems of [2] and identifies a preferred technique for the quantisation of the decomposed surfaces. Phase is shown to be particularly sensitive to the compounding of quantisation errors within the tree-structured transform. The proposed solution involves the use of VDVQ on separately decomposed magnitude/phase surfaces. This approach provides for coarse or no phase quantisation while maintaining high speech quality. The techniques discussed may also be applied to other transforms and to the quantisation of surfaces in the standard Waveform Interpolation coder.
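The tree-structured, multi-level decomposition mentioned in the abstract can be illustrated with a minimal sketch. It uses the two-tap Haar filter pair as a stand-in for the paper's low-delay FIR filters, which are not specified here; the function names are hypothetical. The input would be, for example, the track of one spectral component along the evolution (time) axis.

```python
import math

def haar_step(x):
    """One analysis level: split an even-length sequence into low and high bands."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def haar_decompose(x, levels):
    """Return [high_1, ..., high_L, low_L]; the low band is split again at each level.

    len(x) must be divisible by 2**levels.
    """
    bands = []
    low = list(x)
    for _ in range(levels):
        low, high = haar_step(low)
        bands.append(high)
    bands.append(low)
    return bands

def haar_reconstruct(bands):
    """Invert haar_decompose by walking the tree from the coarsest level up."""
    s = math.sqrt(2.0)
    low = bands[-1]
    for high in reversed(bands[:-1]):
        merged = []
        for l, h in zip(low, high):
            merged.append((l + h) / s)
            merged.append((l - h) / s)
        low = merged
    return low
```

Because each reconstruction level reuses the output of the level below it, an error introduced in a coarse band propagates through every later synthesis stage. This is one way to picture the compounding of quantisation errors within the tree that the abstract describes.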

IC991419.PDF (From Author) IC991419.PDF (Rasterized)


An Improved Mixed Excitation Linear Prediction (MELP) Coder

Authors:

Takahiro Unno
Thomas P Barnwell III
Kwan Truong

Page (NA) Paper number 1764

Abstract:

This paper presents an improved Mixed Excitation Linear Prediction (MELP) coder. MELP is the linear-prediction-based speech coder that was recently chosen as the new 2400 bps U.S. Federal Standard. Even though MELP is quite good, there are still some perceivable distortions, particularly around non-stationary speech segments and for some low-pitch male speakers. The key features of our new coder include a robust pitch detection algorithm, a new plosive analysis/synthesis method, and a post processor for the Fourier magnitude model. Formal quality tests are used to show that the new MELP coder improves on the quality of the U.S. Federal Standard MELP coder while requiring only a small increase in algorithmic delay and retaining compatibility with the Federal Standard MELP bit-stream specification.

IC991764.PDF (From Author) IC991764.PDF (Rasterized)


Split Band LPC Based Adaptive Multi-Rate GSM Candidate

Authors:

Stephane Villette, CCSR, University of Surrey, UK
Milos Stefanovic, CCSR, University of Surrey, UK
Ahmet Kondoz, CCSR, University of Surrey, UK

Page (NA) Paper number 1798

Abstract:

The European Telecommunications Standards Institute (ETSI) has launched a competition for a new mobile communications standard designed to provide better performance than the current GSM standard. This standard is to be called AMR, for Adaptive Multi-Rate: the source and channel coding rates can be adapted depending on the state of the channel, thus providing an optimal balance between them at any time. The University of Surrey has submitted a candidate for this competition through the Mobile VCE. This candidate was the only one of eleven to use a vocoder in the half-rate GSM channel instead of a CELP-based coder. The testing that took place as part of the first stage of the competition has shown that this candidate was among the best. This paper presents the system submitted for the half-rate channel as well as the results of the testing.

IC991798.PDF (From Author) IC991798.PDF (Rasterized)


Frequency-Domain Spectral Envelope Estimation for Low Rate Coding of Speech

Authors:

Milan Jelinek
Jean-Pierre Adoul

Page (NA) Paper number 1818

Abstract:

Estimating the spectral envelope in the frequency domain avoids some problems that Linear Prediction (LP) algorithms have with voiced speech. We present a low-complexity method of estimating the spectral envelope from harmonics for low rate coding. The method consists of computing the harmonic amplitude spectrum using a pitch-synchronous DFT whose length depends on voicing, modifying this spectrum outside the telephone bandwidth to simplify modeling of the useful bandwidth, and interpolating it with a frequency-domain low-pass filter. An all-pole model is then fitted to this modified, smoothed version of the harmonic spectrum. The method was implemented in the Harmonic-Stochastic Excitation (HSX) vocoder, and its performance was compared with an LP algorithm similar to that used in the G.729 speech coding standard. A-B comparative tests show a substantial increase in perceptual quality.
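The last two steps of the method (smoothing the harmonic spectrum, then fitting an all-pole model) can be sketched roughly as follows. This is not the authors' algorithm: linear interpolation stands in for their frequency-domain low-pass filter, the out-of-band modification is omitted, and all function names, the bin count, and the model order are assumptions. The fit itself uses the standard route of converting the power spectrum to autocorrelation lags and running the Levinson-Durbin recursion.

```python
import math

def interpolate_envelope(amps, w0, n_bins):
    """Linearly interpolate harmonic amplitudes (at w0, 2*w0, ...) onto [0, pi]."""
    freqs = [w0 * (k + 1) for k in range(len(amps))]
    env = []
    for n in range(n_bins):
        w = math.pi * n / (n_bins - 1)
        if w <= freqs[0]:
            env.append(amps[0])
        elif w >= freqs[-1]:
            env.append(amps[-1])
        else:
            j = min(int(w / w0) - 1, len(amps) - 2)  # harmonic just below w
            t = (w - freqs[j]) / w0
            env.append((1.0 - t) * amps[j] + t * amps[j + 1])
    return env

def autocorrelation(env, order):
    """Lags 0..order of the power spectrum env**2, by the trapezoid rule on [0, pi]."""
    m_max = len(env) - 1
    r = []
    for m in range(order + 1):
        acc = 0.0
        for n, e in enumerate(env):
            weight = 0.5 if n in (0, m_max) else 1.0
            acc += weight * e * e * math.cos(math.pi * n * m / m_max)
        r.append(acc / m_max)
    return r

def levinson(r, order):
    """Levinson-Durbin: return (a, err) with A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def fit_all_pole(amps, w0, order, n_bins=129):
    """Fit an all-pole envelope sqrt(err) / |A(e^jw)| to the harmonic amplitudes."""
    env = interpolate_envelope(amps, w0, n_bins)
    return levinson(autocorrelation(env, order), order)
```

For a perfectly flat set of harmonic amplitudes the fitted predictor coefficients are numerically zero, which is a quick sanity check that the spectrum-to-autocorrelation step behaves as expected.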

IC991818.PDF (From Author) IC991818.PDF (Rasterized)


Robust Closed-Loop Pitch Estimation for Harmonic Coders by Time Scale Modification

Authors:

Chunyan Li
Vladimir Cuperman
Allen Gersho

Page (NA) Paper number 1855

Abstract:

Harmonic coders that synthesize speech without transmitting phase information abandon the benefits of closed-loop parameter estimation via waveform matching. In this paper, we show that effective closed-loop parameter estimation can be achieved when a suitable time-scale modification is applied to the speech LP residual in harmonic coders. The concept is demonstrated here specifically for pitch estimation, but is more broadly applicable. For each of a set of pitch candidates generated by a time-domain pitch estimator, the residual is modified to match the pitch contour derived from that candidate. The best candidate is selected by evaluating, for each candidate, the match between the modified residual and the synthesized residual. The new pitch estimation algorithm significantly reduces gross pitch errors compared to a conventional time-domain pitch estimator and enhances the perceptual performance of a 4 kbps harmonic coder.

IC991855.PDF (From Author) IC991855.PDF (Rasterized)


Phase Adjustment In Waveform Interpolation

Authors:

Hong-Goo Kang
D. Sen

Page (NA) Paper number 2043

Abstract:

This paper describes a method of improving the quality of the Waveform Interpolation (WI) speech coder by adjustment of the phase information. In WI, a slowly-evolving waveform (SEW) and a rapidly-evolving waveform (REW) represent the periodic and the non-periodic part of the signal. The phase of the synthesized signal is determined by the SEW and REW, and thus the correct quantization of these parameters is important for producing natural speech quality. A method is described whereby the phase of the synthesized signal is adjusted by modifying the quantized REW spectrum as a function of the fundamental frequency. This essentially attempts to correct the discrepancies in phase that arise due to variation in pitch and also accounts for the difference in noise sensitivity between female and male speech. The overall effect would be the same if multiple codebooks (depending on pitch) were used to code the REW spectrum. Experimental results confirm that the new method results in significantly improved performance.
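As background for the SEW/REW split described above, here is a minimal sketch of how a surface of characteristic waveforms might be separated into slowly and rapidly evolving parts. It assumes the waveforms have already been extracted, aligned, and normalised to a common length, and it uses a simple moving average along the time axis as the low-pass filter; the function name and the `span` parameter are hypothetical, and the paper's REW phase adjustment itself is not shown.

```python
def split_sew_rew(surface, span=5):
    """Split surface[t][k] into (sew, rew) with sew + rew == surface.

    surface[t][k] is sample k of the characteristic waveform extracted at
    time index t. The SEW is a moving average over `span` neighbouring
    waveforms (shortened at the edges); the REW is the remainder.
    """
    n_t = len(surface)
    n_k = len(surface[0])
    half = span // 2
    sew, rew = [], []
    for t in range(n_t):
        lo = max(0, t - half)
        hi = min(n_t, t + half + 1)
        avg = [sum(surface[u][k] for u in range(lo, hi)) / (hi - lo)
               for k in range(n_k)]
        sew.append(avg)
        rew.append([surface[t][k] - avg[k] for k in range(n_k)])
    return sew, rew
```

A surface that does not change over time (perfectly periodic speech) yields an all-zero REW, which matches the description of the SEW as the periodic part and the REW as the non-periodic part.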

IC992043.PDF (From Author) IC992043.PDF (Rasterized)


A Low Resolution Pulse Position Coding Method for Improved Excitation Modeling of Speech Transition

Authors:

Jongseo Sohn
Wonyong Sung

Page (NA) Paper number 2269

Abstract:

We propose a new excitation model for transitional speech that reduces the distortion caused by the traditional two-source (voiced/unvoiced) excitation model. The proposed low resolution pulse position coding (LRPPC) algorithm detects pulses in frames of weak periodicity, which are classified as unvoiced, and transmits their approximate positions. In the decoder, dispersed pulses that have a flat magnitude spectrum are synthesized at the decoded positions to form the excitation signal. A subjective quality test shows that the vocoder employing the LRPPC algorithm produces better speech quality and is very robust to mode decision errors.
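Two ideas in this abstract lend themselves to a small illustration: a pulse that is spread out in time yet has a flat magnitude spectrum, and a coarse position grid. The sketch below is an assumption-laden example, not the LRPPC algorithm: the random phase choice, the grid step, and both function names are hypothetical. The pulse is built by giving every DFT bin unit magnitude and taking a (slow, direct) inverse DFT.

```python
import cmath
import math
import random

def dispersed_pulse(n, seed=0):
    """Real length-n pulse whose DFT has magnitude 1 in every bin (n must be even)."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    rng = random.Random(seed)
    spec = [0j] * n
    spec[0] = 1.0 + 0j        # DC and Nyquist bins must be real
    spec[n // 2] = 1.0 + 0j
    for k in range(1, n // 2):
        phase = rng.uniform(-math.pi, math.pi)
        spec[k] = cmath.exp(1j * phase)
        spec[n - k] = spec[k].conjugate()  # conjugate symmetry gives a real signal
    # Direct inverse DFT; O(n^2), adequate for short pulses.
    return [
        sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def quantize_position(pos, step):
    """Map a sample position to the centre of its cell on a grid of width `step`."""
    return step * (pos // step) + step // 2
```

With all phases set to zero the same construction would return a single unit impulse; non-zero phases disperse the energy over time while leaving the magnitude spectrum flat.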

IC992269.PDF (From Author) IC992269.PDF (Rasterized)


Dispersion Phase Vector Quantization For Enhancement Of Waveform Interpolative Coder

Authors:

Oded Gottesman, Signal Compression Laboratory, Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106, USA

Page (NA) Paper number 1834

Abstract:

This paper presents an efficient analysis-by-synthesis vector quantizer for the dispersion phase of the excitation signal, which was used to enhance a waveform-interpolative coder. The scheme can be used to enhance other harmonic coders, such as the sinusoidal-transform coder and the multiband-excitation coder. The scheme incorporates perceptual weighting and does not require any phase unwrapping. The proposed quantizer achieves a segmental signal-to-noise ratio of up to 14 dB with as few as 6 bits. Subjective testing shows that using the quantized phase improves synthesized speech quality over using a phase extracted from a male speaker. The improvement was larger for female speakers.
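For readers unfamiliar with the segmental signal-to-noise ratio quoted above, the following is a minimal sketch of the usual definition: the SNR in dB is computed per frame and then averaged. The frame length, the handling of silent or error-free frames, and the function name are assumptions, and no perceptual weighting is applied here, unlike in the paper's quantizer.

```python
import math

def segmental_snr(reference, decoded, frame_len):
    """Average per-frame SNR in dB between a reference and a decoded signal.

    Frames with zero signal energy or zero error are skipped so that the
    logarithm stays finite; a trailing partial frame is ignored.
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(x * x for x in ref)
        error = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal == 0.0 or error == 0.0:
            continue
        snrs.append(10.0 * math.log10(signal / error))
    if not snrs:
        raise ValueError("no frame with non-zero signal and error")
    return sum(snrs) / len(snrs)
```

For example, a constant signal of amplitude 1.0 decoded as 0.9 has an error energy 100 times smaller than the signal energy in every frame, so the function returns 20 dB.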

IC991834.PDF (From Author) IC991834.PDF (Rasterized)
