ICASSP99 Speech Under Stress

Speech Under Stress
Speech Under Stress Conditions:
Overview Of The Effect On Speech Production And On System Performance
Authors: Herman J.M. Steeneken, John H.L. Hansen
Paper number 3033
Abstract: The NATO research study group on "Speech and Language Technology" recently completed a three-year project on the effect of "stress" on speech production and system performance. For this purpose, various speech databases were collected. A definition of various states of stress and the corresponding type of stressor is proposed. Results are reported from analysis and assessment studies performed with the databases collected for this project.
IC993033.PDF (From Author)

The Lombard Effect: A Reflex To Better Communicate With Others In Noise
Authors: Jean-Claude Junqua, Steven C Fincke, Kenneth L Field
Paper number 3034
Abstract: To study the Lombard reflex, more realistic databases representing real-world conditions need to be recorded and analyzed. In this paper we 1) summarize a procedure to record Lombard data which provides a good approximation of realistic conditions, 2) present an analysis per class of sounds for duration and energy of words recorded while subjects are listening to noise through open-ear headphones a) when speakers are in communication with a recognition device and b) when reading a list, and 3) report on the influence of speaking style on speaker-dependent and speaker-independent experiments. This paper extends a previous study aimed at analyzing the influence of the communication factor on the Lombard reflex. We also show evidence that it is difficult to separate the speaker from the environment stressor (in this case the noise) when studying the Lombard reflex. The main conclusion of our pilot study is that the communication factor should not be neglected because it strongly influences the Lombard reflex.
IC993034.PDF (Scanned)

Methods For Stress Classification: Nonlinear TEO And Linear Speech Based Features
Authors: Guojun Zhou, Duke University (U.S.A.); John H.L. Hansen, Duke University (U.S.A.); James F. Kaiser, Duke University (U.S.A.)
Paper number 3035
Abstract: Speech production variations due to perceptually induced stress contribute significantly to reduced speech processing performance. One approach that can improve the robustness of speech processing (e.g., recognition) algorithms against stress is to formulate an objective classification of speaker stress based upon the acoustic speech signal. In this paper, an overview of recent methods for stress classification is presented. First, we review traditional pitch-based methods for stress detection and classification. Second, neural network based stress classifiers with cepstral-based features, as well as wavelet-based classification algorithms, are considered. The effect of stress on linear speech features is discussed, followed by the application of linear features and Teager Energy Operator (TEO) based nonlinear features for effective stress classification. A new evaluation for stress classification and assessment is presented using a critical band frequency partition based TEO feature and the combination of several linear features. Results using NATO databases of actual speech under stress are presented. Finally, we discuss issues relating to stress classification across known and unknown speakers and suggest areas for further research.
IC993035.PDF (From Author), IC993035.PDF (Rasterized)

Analysis Of Mrate, Shimmer, Jitter And F0 Contour Features Across Stress And Speaking Style In The SUSAS Database
Authors: Raymond E. Slyh, W. Todd Nelson, Eric G. Hansen
Paper number 3036
Abstract: This paper highlights the results of an investigation of several features across the style classes of the "simulated" portion of the SUSAS database. The features considered here include a recently introduced measure of speaking rate called mrate, measures of shimmer, measures of jitter, and features derived from fundamental frequency (F0) contours.
The F0 contour features are the means of F0 and Delta F0 over the first, middle, and last thirds of the ordered set of voiced frames for each word. Mrate exhibits differences between the Fast, Neutral, and Slow styles and between the Loud, Neutral, and Soft styles. Shimmer and jitter exhibit differences that are similar to those of mrate; however, the shimmer and jitter differences are less consistent than the mrate differences across the speakers in the database. Several F0 contour features exhibit differences between the Angry, Loud, Lombard, and Question styles and most of the other styles.
IC993036.PDF (From Author), IC993036.PDF (Rasterized)

Some Characteristics Of Speech Produced Under High G-Force And Pressure Breathing
Authors: Allan J South
Paper number 3000
Abstract: The performance of speech recognisers in combat aircraft is degraded seriously by the extreme physical stresses to which the crew are subjected. This paper describes measurements of first and second formant frequencies of nine vowels from one speaker recorded under high levels of acceleration, with and without positive pressure breathing. Under acceleration alone, F2 is reduced for high front vowels, while F1 remains constant, but for back and mid vowels, F1 reduces with little change in F2. When positive pressure breathing is introduced, nearly all vowels are affected, and the "vowel triangle" on the F1-F2 plane collapses inwards, towards the neutral vowel position. If these changes are found to be consistent between speakers, it is hoped to develop techniques of voice transformation to reverse them, and thus improve the performance of speech recognisers in this harsh environment.
IC993000.PDF (From Author), IC993000.PDF (Rasterized)
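
Illustrative note on paper 3036: the abstract describes its F0 contour features as the means of F0 and Delta F0 over the first, middle, and last thirds of the ordered voiced frames of each word. A minimal Python sketch of that computation might look like the following. The function name `f0_third_features`, the definition of Delta F0 as a frame-to-frame difference (with 0.0 for the first frame), and the rounding used to split the frames into thirds are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from statistics import mean


def f0_third_features(f0_voiced):
    """Mean F0 and mean Delta F0 over the first, middle, and last thirds
    of a word's voiced frames (F0 values in Hz, in time order)."""
    n = len(f0_voiced)
    if n < 3:
        raise ValueError("need at least three voiced frames")

    # Assumption: Delta F0 is the difference between consecutive voiced
    # frames; the first frame has no predecessor and is assigned 0.0.
    delta = [0.0] + [b - a for a, b in zip(f0_voiced, f0_voiced[1:])]

    # Assumption: thirds are split by integer division, so the last
    # third absorbs any leftover frames.
    bounds = [0, n // 3, (2 * n) // 3, n]

    features = {}
    for name, lo, hi in zip(("first", "middle", "last"), bounds, bounds[1:]):
        features[f"f0_mean_{name}"] = mean(f0_voiced[lo:hi])
        features[f"delta_f0_mean_{name}"] = mean(delta[lo:hi])
    return features


if __name__ == "__main__":
    # Hypothetical F0 track for one word, voiced frames only.
    example = [100.0, 102.0, 104.0, 110.0, 108.0, 106.0]
    for key, value in f0_third_features(example).items():
        print(f"{key}: {value:.1f}")
```

Splitting the contour into thirds gives a coarse description of its shape (rising, falling, or flat at the start, middle, and end of a word) using only six numbers per word, which is presumably why such features can separate styles like Question from Neutral.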