SPEECH CODING BELOW 4 KB/S

Chair: Thomas E. Tremain, U.S. Department of Defense (USA)


NATO STANAG 4479: A Standard for an 800 bps Vocoder and Channel Coding in HF-ECCM System

Authors:

B. Mouy, Thomson CSF-RGS (FRANCE)
P. de La Noue, Thomson CSF-RGS (FRANCE)
G. Goudezeune, Thomson CSF-RGS (FRANCE)

Volume 1, Page 480

Abstract:

This paper presents a new voice coder for very low bit rate communication systems, standardized by NATO as STANAG 4479. The standard is unusual in that it describes both the source coding and the channel coding. It is the natural successor to the well-known LPC10e 2400 bps voice coder standardized as STANAG 4198. The analysis and synthesis are the same as in the LPC10e vocoder, but the quantization process is specific to the new coder. The main innovations are presented. An associated error-correcting scheme raises the bit rate from the 800 bps source rate to 2400 bps. It has been optimized for an HF-ECCM (HF Electronic Counter-Counter Measure) system, taking into account all possible channel perturbations as well as system constraints, especially the minimization of delay and turnaround time.
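The bit-rate figures in the abstract imply an overall code rate, which the short sketch below works out. It is only the arithmetic that follows from the two quoted rates; the abstract does not describe the structure of the error-correcting code, and the function name `code_rate` is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction


def code_rate(source_bps: int, channel_bps: int) -> Fraction:
    """Overall code rate: information bits per transmitted bit.

    Hypothetical helper; it only divides the two rates and says nothing
    about how the actual STANAG 4479 channel code is built.
    """
    if source_bps <= 0 or channel_bps < source_bps:
        raise ValueError("expected 0 < source_bps <= channel_bps")
    return Fraction(source_bps, channel_bps)


# Rates quoted in the abstract.
SOURCE_BPS = 800
CHANNEL_BPS = 2400

rate = code_rate(SOURCE_BPS, CHANNEL_BPS)
redundancy_bps = CHANNEL_BPS - SOURCE_BPS

print(f"overall code rate: {rate}")
print(f"redundancy: {redundancy_bps} bps")
```

In other words, two of every three transmitted bits are available for protection and other overhead, assuming the 2400 bps figure is the total rate on the channel.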

300dpi TIFF Images of pages:

480 481 482 483

Acrobat PDF file of whole paper:

ic950480.pdf


Harmonic and Noise Coding of LPC Residuals with Classified Vector Quantization

Authors:

Masayuki Nishiguchi, Sony Corporation (JAPAN)
Jun Matsumoto, Sony Corporation (JAPAN)

Volume 1, Page 484

Abstract:

An efficient coding scheme for Linear Predictive Coding (LPC) residuals is proposed based on harmonic and noise representation. New features of the scheme include classified vector quantization of the spectral envelope of LPC residuals with a weighted distortion measure. The improvement in performance obtained by classifying codebooks based on a voiced/unvoiced (V/UV) decision is shown. Sequences of the short-term rms power of time domain waveforms are also vector quantized and transmitted for unvoiced signals. A fast synthesis algorithm for voiced signals using an FFT is also presented, which reduces the high complexity of the direct sinusoidal synthesis method with interpolated magnitudes and phases. Informal listening tests indicate that, in combination with a known LSP quantization technique, this residual coding scheme provides good communication quality at a total bit rate of less than 2.0 kbps.

300dpi TIFF Images of pages:

484 485 486 487

Acrobat PDF file of whole paper:

ic950484.pdf


Progress Towards a New Government Standard 2400 bps Voice Coder

Authors:

M.A. Kohler, U.S. Department of Defense (USA)
L.M. Supplee, U.S. Department of Defense (USA)
T.E. Tremain, U.S. Department of Defense (USA)

Volume 1, Page 488

Abstract:

To support the need for higher-quality, low-rate voice communications for government, industry, and military customers, the United States Government is conducting a search for a new voice compression algorithm at 2400 bits per second (bps). The United States Department of Defense Digital Voice Processing Consortium (DDVPC), consisting of members from civilian and military branches of the U.S. government, is directing the testing and evaluation of several candidate 2400 bps algorithms. The goal of the DDVPC is to select, by mid-1996, a new algorithm which meets or exceeds the published requirements. The selected algorithm, to become the new standard, should be implementable in a small, low-power device by 1997. This paper describes the status of the testing and evaluation process from its beginning in early 1993 through the end of 1994.

300dpi TIFF Images of pages:

488 489 490 491

Acrobat PDF file of whole paper:

ic950488.pdf


Variable Dimension Spectral Coding of Speech at 2400 bps and Below with Phonetic Classification

Authors:

Amitava Das, University of California - Santa Barbara (USA)
Allen Gersho, University of California - Santa Barbara (USA)

Volume 1, Page 492

Abstract:

The low bit rate enhanced multiband excitation (EMBE) speech coder adds several important new features to the basic multiband excitation vocoder, including phonetic classification and a novel spectral quantization technique called variable dimension vector quantization (VDVQ). Phonetic classification allows the adaptation of spectral modeling and quantization to the local acoustic-phonetic character of the speech signal, enhancing quality and robustness. The VDVQ scheme quantizes the log-spectrum with relatively few bits while preserving perceptually important features. Both the fixed rate (2.4 kb/s) and the variable rate (1.44 kb/s average) implementations of EMBE deliver speech quality comparable to the 4.8 kb/s Federal Standard 1016 CELP coder and the 4.15 kb/s Inmarsat-M standard IMBE coder.

300dpi TIFF Images of pages:

492 493 494 495

Acrobat PDF file of whole paper:

ic950492.pdf


Spectral Excitation Coding of Speech At 2.4 kb/s

Authors:

V. Cuperman, Simon Fraser University (CANADA)
P. Lupini, Simon Fraser University (CANADA)
B. Bhattacharya, Simon Fraser University (CANADA)

Volume 1, Page 496

Abstract:

In this paper we present Spectral Excitation Coding (SEC), a speech codec based on a sinusoidal model applied to the excitation signal. A phase dispersion algorithm allows the same model to be used for voiced as well as unvoiced and transitional sounds. The phase dispersion algorithm significantly improves the perceived quality resulting in more natural reconstructed speech. A new technique for variable-dimension vector quantization called Non-Square Transform Vector Quantization (NSTVQ) is used for quantization of the harmonic magnitudes. The SEC system at 2.45 kb/s achieved an MOS score 0.8 points higher than the 2.4 kb/s LPC-10 standard. A preliminary 1.85 kb/s SEC system which uses zero-bit magnitude quantization is also presented. Informal listening tests indicate that the quality of the 1.85 kb/s system exceeds that of the LPC-10 standard.

300dpi TIFF Images of pages:

496 497 498 499

Acrobat PDF file of whole paper:

ic950496.pdf


A Robust 2400 bps Subband LPC Vocoder

Authors:

P. A. Laurent, Thomson CSF-RGS (FRANCE)
P. de La Noue, Thomson CSF-RGS (FRANCE)

Volume 1, Page 500

Abstract:

This paper presents a new voice coder for applications in future low bit rate communication systems. The emphasis has been put on speech quality, noise robustness and complexity. The coder performs a multiband+LPC spectral analysis and synthesis of speech. The transmitted information consists of an LPC10 filter, a set of voicing rates, a pitch, energies, the spectral density of the excitation in five subbands, and information about the stationarity of the signal in each half-frame. Depending on this stationarity, the quantization process is adapted to provide more spectral information (stable speech) or more temporal information (transitory speech). To be less sensitive to surrounding noise, pitch and voicing rates are first computed in each subband. The final values of these parameters are obtained from the values in the current frame and its neighbors. The excitation signal used at the synthesis side consists of a mixture of isolated pulses and of periodic and aperiodic signals of adjustable spectral composition. Test results are provided.

300dpi TIFF Images of pages:

500 501 502 503

Acrobat PDF file of whole paper:

ic950500.pdf


Band-Widened Harmonic Vocoder at 2 to 4 kbps

Authors:

Gao Yang, Lernout & Hauspie Speech Products
G. Zanellato, Lernout & Hauspie Speech Products
H. Leich, Faculte Polytechnique de Mons (BELGIUM)

Volume 1, Page 504

Abstract:

For speech coding at bit rates below 4 kbps, attention has been concentrated on sinusoidal-based vocoders during the past decade. Several models such as the MBE [8] have been proposed to synthesize high quality speech while removing the buzzy quality often produced by overly strong periodicity. This paper proposes a new model for voiced speech coding at very low bit rates, referred to as Band-Widened Harmonic coding (BWH). This model is shown to offer some advantages over existing ones in both quality and complexity. A comparison between the BWH and the MBE is given in this paper.

300dpi TIFF Images of pages:

504 505 506 507

Acrobat PDF file of whole paper:

ic950504.pdf


A Speech Coder Based on Decomposition of Characteristic Waveforms

Authors:

W. Bastiaan Kleijn, AT&T Bell Laboratories (USA)
Jesper Haagen, AT&T Bell Laboratories (USA)

Volume 1, Page 508

Abstract:

For low-rate speech coding it is advantageous to represent the speech signal as an evolving characteristic waveform (CW). The CW evolves slowly when the speech signal is clearly voiced and rapidly when the speech signal is clearly unvoiced. The voiced (periodic) and unvoiced (nonperiodic) components of the speech signal can be separated by a simple nonadaptive filter in the CW domain. Because of perceptual effects, a significant increase in coding efficiency is obtained by coding these two components separately. A 2.4 kb/s coder using these principles was developed. In an independent evaluation, the performance of the 2.4 kb/s waveform interpolation (WI) coder was found to be at least equivalent to that of the 4.8 kb/s FS1016 standard in all of the many tests.
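The separation described in this abstract can be illustrated with a minimal sketch. It assumes the characteristic waveforms have already been extracted and aligned to a common length, and it uses a plain moving average along the evolution axis as the "simple nonadaptive filter"; the abstract does not say which filter the authors use, so the moving average, the function name `decompose`, and the `taps` parameter are all assumptions made for illustration.

```python
def decompose(cws, taps=3):
    """Split a sequence of characteristic waveforms into two components.

    cws  : list of equal-length waveforms (lists of floats), one per
           update instant, assumed already aligned.
    taps : odd length of the moving-average filter applied across
           successive waveforms (an assumed, fixed choice).

    Returns (slow, rapid): the slowly evolving part, which tracks the
    periodic (voiced) component, and the remainder, which tracks the
    nonperiodic (unvoiced) component.
    """
    if taps < 1 or taps % 2 == 0:
        raise ValueError("taps must be a positive odd integer")
    n = len(cws)
    half = taps // 2
    slow = []
    for i in range(n):
        # Average each sample position over neighbouring waveforms,
        # shrinking the window at the ends of the sequence.
        window = cws[max(0, i - half):min(n, i + half + 1)]
        slow.append([sum(col) / len(window) for col in zip(*window)])
    # Whatever the low-pass filter removes is the rapidly evolving part.
    rapid = [[c - s for c, s in zip(cw, sw)] for cw, sw in zip(cws, slow)]
    return slow, rapid


# A waveform that never changes is entirely "slowly evolving".
steady = [[1.0, 2.0] for _ in range(4)]
slow, rapid = decompose(steady)
print(slow[0], rapid[0])
```

Because the two outputs sum back to the input, each can be quantized with a scheme suited to it, which is the coding-efficiency argument the abstract makes.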

300dpi TIFF Images of pages:

508 509 510 511

Acrobat PDF file of whole paper:

ic950508.pdf


Speech Compression Using Pitch Synchronous Interpolation

Authors:

R. Taori, Philips Research Laboratories (THE NETHERLANDS)
R.J. Sluijter, Philips Research Laboratories (THE NETHERLANDS)
E. Kathmann, Philips Research Laboratories (THE NETHERLANDS)

Volume 1, Page 512

Abstract:

This paper presents a new time-domain algorithm for compressing speech signals. Using a novel tool which we will refer to as Time Weighted Average (TWA), a periodically extendable pitch cycle is extracted from the voiced regions in the speech signal. This procedure is carried out every x-th pitch period. The discarded x - 1 pitch periods are recovered using pitch synchronous interpolation (PSI). The computational complexity of the resulting decoder is surprisingly modest and shows reasonable potential for implementation on hardware as primitive as the Intel 8088 microprocessor. Simulation results show that the reconstruction quality is comparable to G.721.
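The decoder-side idea of recovering the discarded x - 1 pitch periods can be sketched as follows. This is not the authors' PSI algorithm, whose details are not given in the abstract: it assumes both retained cycles have the same length and fills the gap with a simple linear crossfade. The function name `interpolate_cycles` is hypothetical.

```python
def interpolate_cycles(kept_a, kept_b, x):
    """Reconstruct the x - 1 pitch cycles lying between two retained ones.

    kept_a, kept_b : consecutive retained pitch cycles (lists of floats),
                     assumed here to have equal length.
    x              : one cycle in every x was retained by the encoder.

    Uses a linear crossfade from kept_a to kept_b, an assumed stand-in
    for the paper's pitch synchronous interpolation.
    """
    if x < 1:
        raise ValueError("x must be at least 1")
    if len(kept_a) != len(kept_b):
        raise ValueError("cycles must have equal length in this sketch")
    cycles = []
    for k in range(1, x):
        w = k / x  # weight grows as we move from kept_a towards kept_b
        cycles.append([(1 - w) * a + w * b for a, b in zip(kept_a, kept_b)])
    return cycles


# With x = 4, three intermediate cycles are rebuilt.
for cycle in interpolate_cycles([0.0, 2.0], [4.0, 6.0], 4):
    print(cycle)
```

The per-sample cost is two multiplications and one addition, which is consistent in spirit with the abstract's remark about modest decoder complexity, although the real algorithm must also handle varying pitch period lengths.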

300dpi TIFF Images of pages:

512 513 514 515

Acrobat PDF file of whole paper:

ic950512.pdf


Pitch-Synchronous Multi-Band (PSMB) Speech Coding

Authors:

Haiyun Yang, Nanyang Technological University (SINGAPORE)
Soo-Ngee Koh, Nanyang Technological University (SINGAPORE)
Pratab Sivaprakasapillai, Nanyang Technological University (SINGAPORE)

Volume 1, Page 516

Abstract:

A novel speech coding algorithm, named pitch synchronous multi-band (PSMB), is proposed. It uses the multiband excitation (MBE) model to generate a representative pitch-cycle waveform (PCW) for each frame. The representative PCW of a frame is encoded by two out of three codebooks depending upon whether the frame is related or unrelated to the previous frame. The new speech coder introduces a pitch-period-based coding feature. The PSMB coder operating at 4 kbps outperforms the Inmarsat 4.15 kbps IMBE coder by a clear margin. It is also found to be slightly better than the FS1016 4.8 kbps code excited linear predictive (CELP) coder in terms of perceptual quality. Fast search algorithms for the three codebooks used in PSMB are also developed.

300dpi TIFF Images of pages:

516 517 518 519

Acrobat PDF file of whole paper:

ic950516.pdf
