Abstract: Session SP-1 |
|
SP-1.1
|
Analysis by Synthesis Speech Coding with Generalized Pitch Prediction
Paul Mermelstein,
Yasheng Qian (INRS-Telecommunications, Universite du Quebec)
A new analysis-by-synthesis speech coding structure is presented
for high-quality speech coding in the 4 to 8 kb/s range.
CELP with generalized pitch prediction (GPP-CELP) differs from classical
code-excited linear prediction (CELP) in that for voiced segments it is the
speech signal that is decomposed into a component predictable with the aid
of the adaptive codebook (ACB) and a nonpredictable aperiodic component,
not the LPC residual.
The spectrum of the aperiodic component is estimated
by linear-prediction analysis. An approximation to the aperiodic component
is synthesized from a stochastic codebook of sparse pulse sequences and its
spectrum is shaped by the LPC synthesis filter. The ACB contains samples of
the past reconstructed signal, low-passed to increase the pitch prediction
gain. For voiced segments the new structure yields higher pitch prediction
gain and lower linear-prediction gain than classical CELP.
Subjective and objective comparisons reveal significant advantages for GPP-CELP
over classical CELP.
|
SP-1.2
|
A 16, 24, 32 kbit/s Wideband Speech Codec Based on ATCELP
Pierre Combescure (France Telecom CNET, DIH/DIPS, France),
Juergen Schnitzler (Aachen University of Technology, IND, Germany),
Kyrill Fischer,
Ralf Kirchherr (Deutsche Telekom Berkom, Germany),
Claude Lamblin,
Alain Le Guyader,
Dominique Massaloux,
Catherine Quinquis (France Telecom CNET, DIH/DIPS, France),
Joachim Stegmann (Deutsche Telekom Berkom, Germany),
Peter Vary (Aachen University of Technology, IND, Germany)
This paper describes a combined Adaptive Transform Codec (ATC) and Code-Excited
Linear Prediction (CELP) algorithm, called ATCELP, for the compression of
wideband (7 kHz) signals. The CELP algorithm applies mainly
to speech, whereas the ATC mode is selected for music and noise signals.
We propose a switching scheme between CELP and ATC mode and describe a frame
erasure concealment technique.
Subjective listening tests have shown that the ATCELP codec at bit rates of 16, 24
and 32 kbit/s achieves performance close to that of CCITT G.722 at 48,
56 and 64 kbit/s, respectively, under most operating conditions.
|
SP-1.3
|
A 6.1 to 13.3-kb/s Variable Rate CELP Codec (VR-CELP) for AMR Speech Coding
Stefan Heinen,
Marc Adrat (Institute of Communication Systems and Data Processing, Aachen University of Technology, Germany),
Oliver Steil (),
Peter Vary (Institute of Communication Systems and Data Processing, Aachen University of Technology, Germany),
Wen Xu (Department of Mobile Phone Development, Siemens AG, Hofmannstr. 51, 81359 Munich, Germany)
We propose a new 6.1 to 13.3-kb/s speech codec called
variable rate code-excited linear prediction (VR-CELP) for
Adaptive Multi-Rate (AMR) transmission over mobile radio channels such as GSM
or UMTS.
The AMR concept allows operation with almost wireline
speech quality for poor channel conditions and better quality for good
channel conditions. This is achieved by dynamically splitting
the gross bit rate of the transmission system between source and channel
coding according to the current channel conditions. The source coding
scheme must therefore be designed for seamless switching between rates
without annoying artifacts.
To enhance the transmission quality under very poor channel conditions,
a new powerful error concealment strategy based on estimation theory is
applied.
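
The rate-splitting idea can be illustrated with a minimal sketch. Only the 6.1 and 13.3 kb/s end points come from the abstract. The gross rate of 22.8 kb/s (the GSM full-rate traffic channel), the intermediate codec rates, the C/I thresholds, and the name `select_mode` are illustrative assumptions, not values or identifiers from the paper.

```python
# Illustrative AMR-style mode selection: a fixed gross bit rate is split
# between source coding and channel coding depending on channel quality.
# All numbers except the 6.1 and 13.3 kb/s end points are assumptions.

GROSS_RATE_KBPS = 22.8  # assumed gross rate (GSM full-rate traffic channel)

# (minimum carrier-to-interference ratio in dB, source rate in kb/s),
# ordered from best channel to worst; thresholds are hypothetical.
MODES = [
    (13.0, 13.3),
    (9.0, 9.7),
    (6.0, 8.1),
    (float("-inf"), 6.1),
]


def select_mode(ci_db):
    """Return (source_rate, channel_coding_rate) in kb/s for a C/I estimate."""
    for min_ci, source_rate in MODES:
        if ci_db >= min_ci:
            # Whatever the source coder does not use goes to channel coding.
            return source_rate, round(GROSS_RATE_KBPS - source_rate, 1)
    raise ValueError("C/I estimate must be a number")
```

On a good channel most of the gross rate goes to the speech codec; as the channel degrades, bits are shifted to error protection, which is why the codec must switch rates without audible artifacts.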
|
SP-1.4
|
CELP Speech Coding Based on an Adaptive Pulse Position Codebook
Tadashi Amada,
Kimio Miseki,
Masami Akamine (Toshiba Corporation)
CELP coders that use pulse codebooks for the excitation, such as ACELP, have the advantages of low complexity and high speech quality. At low bit rates, however, the reduced number of pulse position candidates and of pulses degrades the reconstructed speech quality. This paper describes a method for adaptively allocating pulse position candidates. In the proposed method, N efficient pulse position candidates are selected out of all possible positions in a subframe. The amplitude envelope of the adaptive code vector is used to select the N candidates: the larger the amplitude, the more pulse positions are assigned. Because the adaptation is driven by the adaptive code vector, the proposed method requires no additional bits for the adaptation. Experimental results show that the proposed method increases WSNRseg by 0.3 dB and MOS by 0.15.
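
One simple way to realize such an allocation is sketched below. The abstract does not specify how the envelope is computed or how candidates are distributed, so the moving-average envelope, the "keep the N largest" rule, and the function names are all assumptions made for illustration. Because the encoder and decoder both hold the same adaptive code vector, both can run the same selection, which is why no side information is needed.

```python
# Hypothetical sketch: choose N pulse position candidates from the
# amplitude envelope of the adaptive code vector. The envelope estimator
# and the selection rule are assumptions, not the paper's method.

def amplitude_envelope(vector, half_width=1):
    """Moving average of |vector|, a crude stand-in for the envelope."""
    n = len(vector)
    env = []
    for i in range(n):
        lo, hi = max(0, i - half_width), min(n, i + half_width + 1)
        env.append(sum(abs(x) for x in vector[lo:hi]) / (hi - lo))
    return env


def select_pulse_positions(adaptive_vector, num_candidates):
    """Pick the num_candidates positions with the largest envelope values."""
    if not 0 < num_candidates <= len(adaptive_vector):
        raise ValueError("num_candidates must be in 1..len(adaptive_vector)")
    env = amplitude_envelope(adaptive_vector)
    # Sort by descending envelope; ties go to the earlier position.
    ranked = sorted(range(len(env)), key=lambda i: (-env[i], i))
    return sorted(ranked[:num_candidates])
```

With this rule, regions of the subframe where the adaptive code vector is large receive more candidate positions than low-amplitude regions.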
|
SP-1.5
|
A Multistage Search of Algebraic CELP Codebooks
Miguel A Ramírez,
Max Gerken (Electronics Eng. Dept. - Escola Politécnica, University of São Paulo)
A joint amplitude and position search procedure is proposed for
searching algebraic multipulse codebooks. It is implemented within the
reference G.723.1 codec as an example. Over an extensive speech database,
this joint search method is shown to reduce the number of comparisons per
subframe to one third of that required by the focused search. An
efficient implementation of the joint search is derived which
incorporates backward filtering of the residual target vector and
precomputation of autocorrelation elements, bringing about a reduction
in complexity of one third in comparison to the focused search. The
joint search performs about one thirtieth as many comparisons as the
full position search.
|
SP-1.6
|
A Fast Search Method Of Algebraic Codebook By Reordering Search Sequence
Nam Kyu Ha (SK Teletech Co. Ltd.)
This paper proposes a fast search method for the algebraic codebook in CELP coders. In the proposed method, the codebook search sequence is reordered according to the mean-squared weighted error between the target vector and the filtered adaptive codebook vector, and the algebraic codebook is searched until a predefined threshold is satisfied. This method reduces the computation considerably compared with G.729, at the expense of a slight degradation in speech quality. Moreover, it gives better speech quality with a smaller average search space than G.729A.
|
SP-1.7
|
An 8 kbit/s ACELP Coder with Improved Background Noise Performance
Roar Hagen,
Erik Ekudden (Ericsson Radio Systems AB)
This paper describes an 8 kbit/s ACELP speech coder with high performance for
both speech and non-speech signals such as background noise. While the traditional
waveform matching LPAS structure employed in many existing speech coders provides
high quality for speech signals, it has significant performance limitations for
signals such as background noise. The coder presented here employs a novel adaptive gain
coding technique using energy matching in combination with a traditional waveform
matching criterion providing high quality for both speech and background noise.
The coder has a basic structure similar to that of the 7.4 kbit/s D-AMPS EFR
coder, with a 10th order LPC, high resolution adaptive codebook and a 4-pulse
algebraic codebook. The performance for speech signals is equivalent to or better
than that of state-of-the-art 8 kbit/s coders, while for background noise conditions
the performance is significantly improved.
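
The difference between the two gain criteria can be made concrete with a small sketch. The abstract does not give the coder's actual gain computation, so the linear blend and the parameter `noise_weight` below are hypothetical. The two basic gains are standard: the waveform-matching gain minimizes the squared error between target and scaled excitation, while the energy-matching gain makes the scaled excitation carry the same energy as the target.

```python
import math

# Hypothetical sketch of combining waveform matching with energy matching.
# The blend and the name noise_weight are assumptions, not the coder's rule.


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def waveform_matching_gain(target, excitation):
    """Gain g minimizing sum((target - g * excitation)^2)."""
    return _dot(target, excitation) / _dot(excitation, excitation)


def energy_matching_gain(target, excitation):
    """Gain g for which g * excitation has the same energy as target."""
    return math.sqrt(_dot(target, target) / _dot(excitation, excitation))


def blended_gain(target, excitation, noise_weight):
    """Mix the two gains; noise_weight in [0, 1], 1 = pure energy matching."""
    g_w = waveform_matching_gain(target, excitation)
    g_e = energy_matching_gain(target, excitation)
    return (1.0 - noise_weight) * g_w + noise_weight * g_e
```

By the Cauchy-Schwarz inequality the magnitude of the waveform-matching gain never exceeds the energy-matching gain, and it falls well below it when the excitation correlates poorly with the target, as is typical for noise-like signals. That energy loss is one plausible reason for weighting energy matching more heavily in background noise.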
|
SP-1.8
|
On Phase Perception in Speech
Harald Pobloth,
W. Bastiaan Kleijn (Royal Institute of Technology, Stockholm)
In this paper we define perceptual phase capacity as the size of a
codebook of phase spectra necessary to represent all possible phase
spectra in a perceptually accurate manner. We determine the perceptual
phase capacity for voiced speech. To this end, we use an auditory
model that indicates whether phase spectrum changes are audible.
The model was tuned and its performance verified by
listening tests.
The perceptual phase capacity in low pitched speech is found to be
much higher than it is for high pitched speech.
Our results are consistent with the well-known fact that speech coding
schemes which preserve the phase accurately work better for male
voices, while coders which put more weight on the amplitude spectrum
of the speech signal result in better quality for female speech.
|
|