Abstract: Session SP-21

SP-21.1  

Linguistic Mapping in LSF Space for Low-Bit Rate Coding
John J Parry, Ian S Burnett, Joe F Chicharo (University of Wollongong)

In this paper we investigate the spectral density of Line Spectral Frequency (LSF) content in languages. The results show that the phonetic variation of languages is reflected in the LSF space. This leads to an alternative approach to the design of LSF quantisers. A trained LSF codebook, like the phonetic inventory of a language, is a static description of spectral behaviour of speech. As clear relationships exist between phonetic segments and LSFs, the structure of an LSF codebook can be analysed in terms of the phonetic segments. The new approach incorporates phonetic information into the structure of LSF codebooks through combining individual phonetic codebooks. The investigation leads to the conclusion that phonetic information can be usefully employed in codebook training in terms of perceptual performance and bit-rate reductions.


SP-21.2  

Predictive Multiple-Scale Lattice VQ for LSF Quantization
Adriana Vasilache (Tampere University of Technology), Marcel Vasilache (Nokia Research Center), Ioan Tabus (Tampere University of Technology)

This paper introduces a new lattice quantization scheme, the multiple-scale lattice vector quantization (MSLVQ), based on the truncation of the D10+ lattice. The codebook is composed of several copies of the truncated lattice scaled with different scaling factors. A fast nearest neighbor search is introduced. We compare the performance of predictive MSLVQ for quantization of LSF coefficients with the quantization technique used in the G.729 codec and show that our method performs better in terms of spectral distortion. The MSLVQ scheme achieves transparent quality at 21 bits/frame.
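The idea of a codebook built from scaled copies of a lattice can be illustrated with a minimal sketch. It assumes the standard rounding-based nearest-point search for D_n (integer vectors with an even coordinate sum) and treats D_n+ as the union of D_n and its copy shifted by 1/2 in every coordinate. The function names and the `scales` argument are hypothetical, and the sketch ignores the truncation, the prediction, and the fast search described in the abstract; it is not the authors' algorithm.

```python
import math

def sq_dist(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def nearest_dn(x):
    """Nearest point of D_n: integer vectors whose coordinates sum to an even number."""
    f = [math.floor(v + 0.5) for v in x]  # round every coordinate to the nearest integer
    if sum(f) % 2 == 0:
        return f
    # Wrong parity: re-round the coordinate with the largest rounding error the other way.
    k = max(range(len(x)), key=lambda i: abs(x[i] - f[i]))
    f[k] += 1 if x[k] > f[k] else -1
    return f

def nearest_dn_plus(x):
    """Nearest point of D_n+ = D_n united with D_n + (1/2, ..., 1/2)."""
    a = nearest_dn(x)
    b = [v + 0.5 for v in nearest_dn([v - 0.5 for v in x])]
    return min((a, b), key=lambda p: sq_dist(x, p))

def quantize_multiscale(x, scales):
    """Try every scaled copy of the lattice and keep the closest point.

    Returns (index of the chosen scale, reconstructed vector).
    Assumption: the lattice is not truncated, so every scale is always usable.
    """
    best = None
    for idx, s in enumerate(scales):
        y = [s * v for v in nearest_dn_plus([v / s for v in x])]
        d = sq_dist(x, y)
        if best is None or d < best[0]:
            best = (d, idx, y)
    return best[1], best[2]
```

For example, `quantize_multiscale([0.26] * 4, [1.0, 0.5])` selects the finer scale, because the point `[0.25] * 4` of the half-scale copy lies closer to the input than any point of the unit-scale copy.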


SP-21.3  

A Rootfinding Algorithm for Line Spectral Frequencies
Joseph H Rothweiler (Sanders, A Lockheed Martin Company)

Published techniques for computing line spectral frequencies generally avoid rootfinding methods because of concerns about convergence and complexity. However, this paper shows that stable predictor polynomials have properties that make rootfinding an attractive approach. It is well known that the problem of finding the LSFs for an Nth-order predictor polynomial can be reduced to the problem of finding the roots of a pair of polynomials of order N/2 with real roots. I extend this result by showing that these polynomials have the following properties:
- It is possible to select starting points for Newton's rootfinding method such that the iteration will converge monotonically to the largest root.
- The Newton iteration can be modified to speed up the process while still maintaining good convergence properties.
In this paper, I present the rootfinding procedures with proofs of their good convergence properties. Finally, I present experimental results showing that this procedure performs well on speech signals, and that it can be implemented on fixed-point DSPs.
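The monotone convergence property can be illustrated with a short sketch for a generic polynomial whose roots are all real. Started anywhere above the largest root, plain Newton iteration decreases toward that root without overshooting it. The sketch below assumes a Cauchy bound as the starting point and uses deflation to obtain the remaining roots. Both choices, and all function names, are illustrative assumptions; they are not the starting-point selection or the accelerated iteration proposed in the paper.

```python
def poly_eval_with_derivative(coeffs, x):
    """Horner evaluation of p(x) and p'(x); coeffs are ordered highest degree first."""
    p, dp = 0.0, 0.0
    for c in coeffs:
        dp = dp * x + p
        p = p * x + c
    return p, dp

def largest_real_root(coeffs, tol=1e-12, max_iter=200):
    """Newton iteration started above every root of a polynomial with real roots."""
    lead = coeffs[0]
    # Cauchy bound: every root satisfies |x| <= 1 + max |a_i / a_n|.
    x = 1.0 + max(abs(c / lead) for c in coeffs[1:])
    for _ in range(max_iter):
        p, dp = poly_eval_with_derivative(coeffs, x)
        if dp == 0.0:
            break
        step = p / dp
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            break
    return x

def deflate(coeffs, root):
    """Divide p(x) by (x - root) using synthetic division, discarding the remainder."""
    out = [coeffs[0]]
    for c in coeffs[1:-1]:
        out.append(c + out[-1] * root)
    return out

def all_real_roots(coeffs):
    """Return the roots in decreasing order; assumes every root is real."""
    roots = []
    work = [float(c) for c in coeffs]
    while len(work) > 1:
        r = largest_real_root(work)
        roots.append(r)
        work = deflate(work, r)
    return roots
```

As a usage example, `all_real_roots([1, -6, 11, -6])` processes x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3) and yields values close to 3, 2 and 1, in that order.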


SP-21.4  

Incorporation of Temporal Masking Effects into Bark Spectral Distortion Measure
Bob Novorita (Motorola and University of Illinois - Chicago)

The aim of this paper is to extend a promising objective speech distortion measurement method, the Bark Spectral Distance (BSD) measure, with the auditory concepts of forward and backward temporal masking to improve its measurement accuracy. The results of this investigation show that automatic BSD-based speech quality ratings may be made to correlate better with existing MOS ratings by removing perceptually irrelevant areas of speech from the distance measure. The correlation between the objective BSD measure and the subjective MOS measure increases from 0.91 to 0.98. The best results were found with a window duration of 128 samples, use of exponential-slope filter characteristics for both forward and backward masking effects, forward masking delays up to 100 msec, and a backward masking time advance of 40 msec.


SP-21.5  

MVDR Based All-Pole Models for Spectral Coding of Speech
Manohar N Murthi, Bhaskar D Rao (Dept. ECE, University of California, San Diego)

We present several analytical properties of Minimum Variance Distortionless Response (MVDR) based all-pole models that demonstrate the advantages and usefulness of these models for speech spectral coding. In particular, we show that a sufficient order MVDR all-pole model provides a spectral envelope that fits a set of spectral samples exactly with a parameterization convenient for quantization purposes. In addition, we show that MVDR all-pole filters provide a monotonically decreasing spectral distortion with increasing filter order. Furthermore, we show that the MVDR all-pole filter possesses the flexibility to be obtained from correlations based upon either spectral samples or conventional time-domain correlations. Finally, exploiting the insight gained from MVDR modeling, we introduce a novel class of constrained all-pole models for efficient spectral coding. In this approach, a subset of the Line Spectral Frequency (LSF) parameters associated with the all-pole model are judiciously fixed, leading to a simpler model parameterization.


SP-21.6  

Improvement of MBSD by Scaling Noise Masking Threshold and Correlation Analysis with MOS Difference Instead of MOS
Wonho Yang, Robert Yantorno (Electrical & Computer Engineering Department, College of Engineering, Temple University)

The Modified Bark Spectral Distortion (MBSD), used as an objective speech quality measure, was presented previously [1][2]. The MBSD measure estimates speech distortion in the loudness domain, taking into account the noise masking threshold in order to include only audible distortions in the calculation of the distortion measure. Preliminary simulation results have shown improvement of the MBSD over the conventional BSD. In this paper, the performance of the MBSD is improved by scaling the noise masking threshold, and the MBSD is compared with the ITU-T Recommendation P.861 [3] and MNB [4] measures. Correlation analysis with MOS difference instead of MOS has been examined in order to evaluate objective speech quality measures.


SP-21.7  

Performance Bounds for LPC Spectrum Quantization
Per Hedelin (Information Theory, Chalmers University of Technology), Jan Skoglund, Jonas Samuelsson (Information Theory, Chalmers University of Technolo)

This paper presents a method for obtaining numerical estimates of high rate vector quantization (VQ) performance suitable for sources for which the pdf is not analytically available. In the proposed method, the VQ point density is described from a Gaussian mixture model optimized for the data. Employing this method for LPC spectrum quantization, we obtain high rate expressions for both the average spectral distortion (SD) and the distribution function of the SD. We estimate the minimum bits required for a quantizer to obtain an average SD of 1 dB and the outlier statistics for that quantizer. We find that approximately 3 bits can be saved as compared to a 2-split LSF-based vector quantizer.
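For readers unfamiliar with the spectral distortion figure quoted above, the following minimal sketch computes the commonly used root-mean-square log-spectral difference, in dB, between two LPC spectra for a single frame. It assumes the convention A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..., unit gain, and integration over the full band on a uniform frequency grid. The function names and the grid size are illustrative assumptions, and the paper may use a different band or normalization. The average SD is then the mean of this per-frame value over all frames.

```python
import cmath
import math

def lpc_log_spectrum_db(a, omega):
    """Return 10*log10(1 / |A(e^{j*omega})|^2) for A(z) = 1 + sum_k a[k] z^-(k+1)."""
    A = 1.0 + sum(c * cmath.exp(-1j * omega * (k + 1)) for k, c in enumerate(a))
    return -20.0 * math.log10(abs(A))

def spectral_distortion_db(a_ref, a_quant, n_points=256):
    """RMS difference in dB between two LPC log spectra on a grid over (0, pi)."""
    total = 0.0
    for i in range(n_points):
        omega = math.pi * (i + 0.5) / n_points  # midpoint of each frequency bin
        diff = lpc_log_spectrum_db(a_ref, omega) - lpc_log_spectrum_db(a_quant, omega)
        total += diff * diff
    return math.sqrt(total / n_points)
```

Identical coefficient sets give a distortion of zero, and the value grows as the quantized coefficients move away from the reference ones.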


SP-21.8  

Channel Optimized Predictive VQ
Jan Lindén (Chalmers University of Technology)

In this paper, combined source-channel coding is considered for the case of predictive vector quantization. A design algorithm for channel optimized predictive vector quantizers is proposed. Under reasonable assumptions, the optimal encoder is presented and a sample iterative design method that simultaneously optimizes the predictor and the codebook is derived. We also demonstrate that this design method can be used to obtain index assignments that are superior to those obtained by post-processing index assignment algorithms. Results are presented for a correlated Gauss-Markov process and for speech LSF parameters.


Last Update:  February 4, 1999         Ingo Höntsch