Voice and Media Processing

A Realtime Software MPEG Transcoder Using A Novel Motion Vector Reuse And A SIMD Optimization Techniques

Authors:

Yuzo Senda,
Hidenobu Harasaki,

Page (NA) Paper number 2038

Abstract:

A realtime software MPEG transcoder has been developed. A novel motion vector reuse technique and a SIMD optimization technique are introduced to accelerate the transcoder without any quality degradation. Mean absolute error approximation criteria are employed in the reuse technique to refine scaled motion vectors. Running on a 266 MHz Pentium II, the transcoder operates 2.5 times as fast as realtime when scaling an MPEG-1 bitstream to half size.
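The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of motion vector reuse: an incoming vector is scaled to the reduced resolution and then refined by testing a few neighbouring candidates against a mean absolute error (MAE) criterion. The function names, the block size, the search radius, and the use of an exact MAE (rather than the approximation the authors mention) are all assumptions made for illustration.

```python
def mean_abs_error(cur, ref, bx, by, mvx, mvy, size):
    """Exact MAE between a block in `cur` and the displaced block in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + mvy][bx + x + mvx])
    return total / (size * size)


def reuse_motion_vector(cur, ref, bx, by, mv, size=4, scale=0.5, radius=1):
    """Scale an incoming motion vector, then refine it by a small MAE search.

    All parameters are hypothetical: `mv` is the (x, y) vector taken from the
    input bitstream, `scale` is the resolution ratio, and `radius` bounds the
    refinement window around the scaled vector.
    """
    height, width = len(ref), len(ref[0])
    cx, cy = round(mv[0] * scale), round(mv[1] * scale)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mvx, mvy = cx + dx, cy + dy
            # Skip candidates whose reference block leaves the frame.
            if bx + mvx < 0 or by + mvy < 0:
                continue
            if bx + mvx + size > width or by + mvy + size > height:
                continue
            err = mean_abs_error(cur, ref, bx, by, mvx, mvy, size)
            if best is None or err < best[0]:
                best = (err, (mvx, mvy))
    if best is None:
        raise ValueError("no candidate vector lies inside the reference frame")
    return best[1], best[0]
```

Restricting the search to a small window around the scaled vector is what makes reuse cheaper than a full motion search; the window size trades speed against how far the refined vector can move from the scaled one.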

IC992038.PDF (From Author) IC992038.PDF (Rasterized)



Distributed Processing in the Home using a PC with a Wireless Speech Interface

Authors:

Debashis Chowdhury,
Ser J Chia,

Page (NA) Paper number 3022

Abstract:

The Personal Computer is evolving from a purely personal device to one that can support multiple applications in different locations simultaneously. This paper describes how the connectivity and processing capability of the PC can be used in a distributed manner in the home to provide a variety of services such as speech-activated environmental command and control functions, digital video decoding, Internet telephony and entertainment control. As they are architected today, PCs are challenged when trying to perform intensive signal processing tasks while managing several external connections (e.g. dial-up Internet) and multiple internal connections (e.g. cordless phone interface) at the same time. We describe some of these challenges and what remains to be done to make the PC more capable of taking on such a multifunctional role.

IC993022.PDF (From Author) IC993022.PDF (Rasterized)



Speech Recognition Over The Internet Using Java

Authors:

Zhemin Tu,
Philipos C Loizou,

Page (NA) Paper number 2114

Abstract:

A speech recognition system based on an Internet client-server model is presented in this paper. A Java applet records the voice at the client computer and sends the recorded speech file over the Internet; the server computer recognizes the speech and returns the recognized text for display to the user. Using this structure, an isolated digit recognition application was realized.

IC992114.PDF (From Author) IC992114.PDF (Rasterized)



Implementation of a High-Quality Dolby* Digital Decoder Using MMX Technology

Authors:

James C Abel,
Michael A Julier,

Page (NA) Paper number 3019

Abstract:

Software decoding of Dolby Digital allows it to become a baseline capability on the PC, with greater flexibility than a hardware approach. Intel's MMX technology provides instructions that can significantly speed up the execution of the Dolby Digital decoder, freeing up the processor to perform other tasks such as video decoding and/or audio enhancement. Intel has worked closely with Dolby Laboratories to define an implementation of Dolby Digital based on MMX technology that has achieved Dolby's certification of quality.

IC993019.PDF (From Author) IC993019.PDF (Rasterized)



Joint Source Channel Coding of Images over Frequency Selective Channels Using DCT and Multicarrier BPAM

Authors:

Venceslav Kafedziski,

Page (NA) Paper number 2491

Abstract:

A novel approach to joint source and channel coding for frequency selective channels is presented. Multicarrier modulation is used to obtain a vector channel equivalent to the frequency selective channel and to apply the linear coding procedure of Lee and Petersen. The use of block pulse amplitude transmission results in graceful degradation of the decoded signal at low channel SNR. The new procedure performs very well in the very low channel SNR region for image transmission over frequency selective channels with deep nulls in the frequency response. Both encoder and decoder are computationally very inexpensive to design and implement compared with digital transmission using channel optimized vector quantization. Results for transmission of a Gauss-Markov source and the "Lena" image over several typical channels are presented.
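To illustrate what an "equivalent vector channel" can mean, the sketch below assumes an OFDM-style multicarrier scheme in which a cyclic prefix turns the channel into a circular convolution; the abstract does not say which multicarrier scheme the paper uses, so this is an assumption. Under that assumption the DFT splits the frequency selective channel into independent subchannels, each with a single complex gain. The function names are hypothetical, and the Lee and Petersen linear coding step is not shown.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_convolve(x, h):
    """Circular convolution: the channel as seen after cyclic-prefix removal."""
    n = len(x)
    return [sum(h[m] * x[(t - m) % n] for m in range(len(h))) for t in range(n)]


def subchannel_gains(h, n):
    """Per-subcarrier gains: the n-point DFT of the zero-padded impulse response."""
    return dft(list(h) + [0.0] * (n - len(h)))


# Example: an 8-sample block sent through a 3-tap channel (noise omitted).
x = [1, -1, 3, 2, 0, -2, 1, 1]
h = [1.0, 0.5, -0.2]
Y = dft(circular_convolve(x, h))
X = dft(x)
H = subchannel_gains(h, len(x))
# Each subcarrier k now satisfies Y[k] == H[k] * X[k] up to rounding error.
```

A deep null in the frequency response shows up here as a subchannel whose gain `H[k]` is close to zero, which is where a coding scheme has to decide how much power, if any, to spend.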

IC992491.PDF (Scanned)



Multi-rate speech coding for wireless and Internet applications

Authors:

John E Kleider,
Richard J Pattison,

Page (NA) Paper number 1193

Abstract:

Fixed-rate speech codecs are unable to provide synthesized speech with fixed delay when the channel capacity changes, and cannot dedicate additional forward error correction bits for protection against noisy channels. We propose a multi-rate method for variable bandwidth applications, such as the Internet, and severely degraded wireless channels, such as mobile cellular. The technique uses a multi-rate version of the sinusoidal transform coder (MRSTC), operates at 9.6/4.8/2.4/1.2 kilobits/sec (kb/s), and is switchable "on-the-fly." The algorithm produces high quality speech, even when transitioning between rates. We compare two switching techniques: one uses a "frame-deletion" (FD) technique, and the other uses "parameter-history" (PH) information. PH produces the best speech quality, while FD is attractive because it requires no additional speech memory. Experimental results show a gain of more than 9 dB in receiver C/No operating range using the MRSTC over a fixed-rate system with STC operating at 9.6 kb/s.
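As a rough illustration of why a switchable rate helps, the sketch below picks the highest of the four rates named in the abstract that fits the channel capacity left over after forward error correction. The selection rule, the function name, and the parameters are assumptions for illustration only; the abstract does not describe how the coder decides when to switch, and the FD and PH switching techniques are not modelled.

```python
# The four MRSTC rates from the abstract, in bits per second, highest first.
RATES_BPS = (9600, 4800, 2400, 1200)


def select_rate(capacity_bps, fec_bps=0):
    """Return the highest speech rate that fits the capacity left after FEC.

    Hypothetical rule: `capacity_bps` is the current channel capacity and
    `fec_bps` is the share reserved for forward error correction. Returns
    None when even the lowest rate does not fit.
    """
    budget = capacity_bps - fec_bps
    for rate in RATES_BPS:
        if rate <= budget:
            return rate
    return None
```

For example, on a 9600 bit/s channel, reserving 4800 bit/s for error correction drops the speech coder to 4800 bit/s, a trade that a fixed-rate 9.6 kb/s coder cannot make.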

IC991193.PDF (From Author) IC991193.PDF (Rasterized)
