Authors:
Paul Mermelstein,
Yasheng Qian,
Page (NA) Paper number 1137
Abstract:
A new analysis-by-synthesis speech coding structure is presented for
high-quality speech coding in the 4 to 8 kb/s range. CELP with generalized
pitch prediction (GPP-CELP) differs from classical code-excited linear
prediction (CELP) in that for voiced segments it is the speech signal
that is decomposed into a component predictable with the aid of the
adaptive codebook (ACB) and a nonpredictable aperiodic component, not
the LPC residual. The spectrum of the aperiodic component is estimated
by linear-prediction analysis. An approximation to the aperiodic component
is synthesized from a stochastic codebook of sparse pulse sequences
and its spectrum is shaped by the LPC synthesis filter. The ACB contains
samples of the past reconstructed signal, low-passed to increase the
pitch prediction gain. For voiced segments the new structure yields
higher pitch prediction gain and lower linear-prediction gain than
classical CELP. Subjective and objective comparisons reveal significant
advantages for GPP-CELP over classical CELP.
Authors:
Pierre Combescure, France Telecom CNET, DIH/DIPS, France
Jürgen Schnitzler, Aachen University of Technology, IND, Germany
Kyrill Fischer, Deutsche Telekom Berkom, Germany
Ralf Kirchherr, Deutsche Telekom Berkom, Germany
Claude Lamblin, France Telecom CNET, DIH/DIPS, France
Alain Le Guyader, France Telecom CNET, DIH/DIPS, France
Dominique Massaloux, France Telecom CNET, DIH/DIPS, France
Catherine Quinquis, France Telecom CNET, DIH/DIPS, France
Joachim Stegmann, Deutsche Telekom Berkom, Germany
Peter Vary, Aachen University of Technology, IND, Germany
Page (NA) Paper number 1369
Abstract:
This paper describes a combined Adaptive Transform Codec (ATC) and
Code-Excited Linear Prediction (CELP) algorithm, called ATCELP, for
the compression of wideband (7 kHz) signals. The CELP algorithm applies
mainly to speech, whereas the ATC mode is selected for music and noise
signals. We propose a switching scheme between CELP and ATC mode and
describe a frame erasure concealment technique. Subjective listening
tests have shown that the ATCELP codec at bit rates of 16, 24 and 32
kbit/s achieves performance close to that of CCITT G.722 at 48,
56 and 64 kbit/s, respectively, under most operating conditions.
Authors:
Stefan Heinen, Institute of Communication Systems and Data Processing, Aachen University of Technology, Germany
Marc Adrat, Institute of Communication Systems and Data Processing, Aachen University of Technology, Germany
Oliver Steil,
Peter Vary, Institute of Communication Systems and Data Processing, Aachen University of Technology, Germany
Wen Xu, Department of Mobile Phone Development, Siemens AG, Hofmannstr. 51, 81359 Munich, Germany
Page (NA) Paper number 1375
Abstract:
We propose a new 6.1 to 13.3-kb/s speech codec called variable rate
code-excited linear prediction (VR-CELP) for Adaptive Multi-Rate (AMR)
transmission over mobile radio channels such as GSM or UMTS. The AMR
concept allows operation with almost wireline speech quality under poor
channel conditions and better quality under good channel conditions.
This is achieved by dynamically splitting the gross bit rate of the
transmission system between source and channel coding according to
the current channel conditions. Thus the source coding scheme must
be designed for seamless switching between rates without annoying artifacts.
To enhance the transmission quality under very poor channel conditions,
a new powerful error concealment strategy based on estimation theory
is applied.
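
The rate-splitting idea can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the gross rate of 22.8 kb/s, the channel-quality thresholds, and the two intermediate source rates are hypothetical; only the 6.1 and 13.3 kb/s end points come from the abstract. The sketch shows how a poorer channel shifts bits from source coding to channel coding while the gross rate stays fixed.

```python
# Assumed gross channel rate in kb/s (not stated in the abstract).
GROSS_RATE_KBPS = 22.8

# Hypothetical mode table: (minimum channel quality in dB, source rate in kb/s).
# Only the 13.3 and 6.1 kb/s end points are taken from the abstract.
HYPOTHETICAL_MODES = [
    (13.0, 13.3),
    (10.0, 10.2),
    (7.0, 8.1),
    (float("-inf"), 6.1),  # catch-all for very poor channels
]


def split_gross_rate(channel_quality_db, gross_kbps=GROSS_RATE_KBPS,
                     modes=HYPOTHETICAL_MODES):
    """Return (source_rate, channel_coding_rate) in kb/s.

    The first mode whose quality threshold is met is chosen; whatever
    remains of the gross rate is assigned to channel coding.
    """
    for min_quality_db, source_kbps in modes:
        if channel_quality_db >= min_quality_db:
            return source_kbps, round(gross_kbps - source_kbps, 1)
    raise ValueError("mode table has no entry for this channel quality")


if __name__ == "__main__":
    for quality in (15.0, 8.0, -5.0):
        print(quality, split_gross_rate(quality))
```

A real AMR system would also need the seamless rate switching the abstract mentions; this sketch covers only the bit budget.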
Authors:
Tadashi Amada,
Kimio Miseki,
Masami Akamine,
Page (NA) Paper number 1542
Abstract:
CELP coders using pulse codebooks for excitations such as ACELP have
the advantages of low complexity and high speech quality. At low bit
rates, however, the reduction in the number of pulse position candidates
and pulses degrades reconstructed speech quality. This paper describes
a method for adaptively allocating pulse position candidates. In the
proposed method, N efficient candidates of pulse positions are selected
out of all possible positions in a subframe. The amplitude envelope of
an adaptive code vector is used to select the N candidates.
The larger the amplitude is, the more pulse positions are assigned.
Because the adaptation uses the adaptive code vector, which is also
available at the decoder, the proposed method requires no additional
bits. Experimental results show that the proposed method increases
WSNRseg by 0.3 dB and MOS by 0.15.
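
One way to picture the candidate selection is the sketch below. The abstract does not give the exact allocation rule, so this is an assumption: the envelope is a moving average of absolute values (the window size `half_width` is hypothetical), and the N positions with the largest envelope are kept. Regions where the adaptive code vector is strong therefore receive more candidates.

```python
def amplitude_envelope(vector, half_width=2):
    """Moving average of |x| over a window of up to 2*half_width + 1 samples."""
    n = len(vector)
    envelope = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        envelope.append(sum(abs(x) for x in vector[lo:hi]) / (hi - lo))
    return envelope


def select_pulse_candidates(adaptive_vector, num_candidates, half_width=2):
    """Return the num_candidates positions with the largest envelope, sorted.

    Ties are broken in favour of the lower index so that encoder and
    decoder, which both hold the adaptive code vector, pick the same set.
    """
    envelope = amplitude_envelope(adaptive_vector, half_width)
    ranked = sorted(range(len(envelope)), key=lambda i: (-envelope[i], i))
    return sorted(ranked[:num_candidates])


if __name__ == "__main__":
    # Toy adaptive code vector: a single pitch pulse at sample 10.
    vector = [0.0] * 20
    vector[10] = 1.0
    print(select_pulse_candidates(vector, 5))
```

Since the selection depends only on the adaptive code vector, the decoder can repeat it without side information, which matches the abstract's claim of zero additional bits.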
Authors:
Miguel Arjona Ramírez,
Max Gerken,
Page (NA) Paper number 1652
Abstract:
A joint amplitude and position search procedure is proposed for searching
algebraic multipulse codebooks. It is implemented within the reference
G.723.1 codec as an example. Over an extensive speech database, this joint
search method is shown to reduce the number of comparisons per subframe to
one third of that required by the focused search. An efficient
implementation of the joint search is derived which incorporates backward
filtering of the residual target vector and precomputation of autocorrelation
elements, bringing about a reduction in complexity of one third in
comparison to the focused search. The joint search performs about one
thirtieth as many comparisons as the full position search.
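
Backward filtering and autocorrelation precomputation are standard tools in algebraic codebook searches, and a minimal sketch may help. It is not the authors' joint search: it only shows how the backward-filtered target d = H^T x and the matrix Phi = H^T H let the usual criterion (correlation squared over energy) be evaluated for a sparse pulse vector without filtering each candidate. The function names and the toy impulse response are illustrative assumptions.

```python
def backward_filter(target, h):
    """d[i] = sum over n >= i of target[n] * h[n - i], i.e. d = H^T x."""
    length = len(target)
    return [sum(target[n] * h[n - i] for n in range(i, length))
            for i in range(length)]


def autocorrelation_matrix(h, length):
    """phi[i][j] = sum over n >= max(i, j) of h[n - i] * h[n - j]."""
    phi = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(i, length):
            value = sum(h[n - i] * h[n - j] for n in range(j, length))
            phi[i][j] = phi[j][i] = value
    return phi


def search_criterion(pulses, d, phi):
    """Correlation squared over energy for pulses given as (position, sign)."""
    correlation = sum(sign * d[pos] for pos, sign in pulses)
    energy = sum(si * sj * phi[pi][pj]
                 for pi, si in pulses for pj, sj in pulses)
    return correlation * correlation / energy


if __name__ == "__main__":
    h = [1.0, 0.5, 0.25, 0.125]       # toy weighted-synthesis impulse response
    target = [1.0, 0.0, -1.0, 0.5]    # toy target vector
    d = backward_filter(target, h)
    phi = autocorrelation_matrix(h, len(target))
    print(search_criterion([(0, 1), (2, -1)], d, phi))
```

With d and Phi computed once per subframe, each candidate costs only a few additions and table lookups, which is why the number of comparisons becomes the dominant complexity measure.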
Authors:
Nam Kyu Ha,
Page (NA) Paper number 1796
Abstract:
This paper proposes a fast search method for the algebraic codebook in CELP
coders. In the proposed method, the sequence of the codebook search is
reordered according to the criterion of mean-squared weighted error
between the target vector and the filtered adaptive codebook vector, and the
algebraic codebook is searched until a predefined threshold is satisfied.
This method reduces the computations considerably compared with G.729
at the expense of a slight degradation of speech quality. Moreover,
it gives better speech quality with smaller average search space than
G.729A.
Authors:
Roar Hagen,
Erik Ekudden,
Page (NA) Paper number 2336
Abstract:
This paper describes an 8 kbit/s ACELP speech coder with high performance
for both speech and non-speech signals such as background noise. While
the traditional waveform matching LPAS structure employed in many existing
speech coders provides high quality for speech signals, it has significant
performance limitations for signals such as background noise. The coder presented
here employs a novel adaptive gain coding technique using energy matching
in combination with a traditional waveform matching criterion providing
high quality for both speech and background noise. The coder has a
basic structure similar to that of the 7.4 kbit/s D-AMPS EFR coder,
with 10th-order LPC, a high-resolution adaptive codebook and a 4-pulse
algebraic codebook. The performance for speech signals is equivalent
to or better than that of state-of-the-art 8 kbit/s coders, while for
background noise conditions the performance is significantly improved.
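
The difference between the two gain criteria can be shown with a short sketch. The abstract does not describe how the coder combines them, so the linear blend and its weight `alpha` below are purely hypothetical. The waveform-matching gain minimises squared error and collapses when target and excitation are uncorrelated, as is typical for noise, while the energy-matching gain preserves the signal level.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def waveform_matching_gain(target, excitation):
    """Gain minimising the squared error between target and scaled excitation."""
    energy = dot(excitation, excitation)
    if energy == 0.0:
        raise ValueError("excitation has zero energy")
    return dot(target, excitation) / energy


def energy_matching_gain(target, excitation):
    """Gain that makes the scaled excitation match the target energy."""
    energy = dot(excitation, excitation)
    if energy == 0.0:
        raise ValueError("excitation has zero energy")
    return math.sqrt(dot(target, target) / energy)


def blended_gain(target, excitation, alpha):
    """Hypothetical blend: alpha = 0 is pure waveform matching, 1 pure energy matching."""
    return ((1.0 - alpha) * waveform_matching_gain(target, excitation)
            + alpha * energy_matching_gain(target, excitation))


if __name__ == "__main__":
    target = [1.0, -1.0, 1.0, -1.0]      # uncorrelated with the excitation
    excitation = [1.0, 1.0, 1.0, 1.0]
    print(waveform_matching_gain(target, excitation))
    print(energy_matching_gain(target, excitation))
    print(blended_gain(target, excitation, 0.5))
```

In this toy case waveform matching would mute the output entirely, which is one way to understand why a pure waveform criterion performs poorly on background noise.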
Authors:
Harald Pobloth,
W. Bastiaan Kleijn,
Page (NA) Paper number 2338
Abstract:
In this paper we define perceptual phase capacity as the size of a
codebook of phase spectra necessary to represent all possible phase
spectra in a perceptually accurate manner. We determine the perceptual
phase capacity for voiced speech. For this purpose, we use an auditory
model that indicates whether phase spectrum changes are audible.
The model was adjusted and its performance verified by listening
tests. The perceptual phase capacity for low-pitched speech is found
to be much higher than for high-pitched speech. Our results are
consistent with the well-known fact that speech coding schemes which
preserve the phase accurately work better for male voices, while coders
which put more weight on the amplitude spectrum of the speech signal
result in better quality for female speech.