ICASSP99 Speech Understanding

Speech Understanding
Incorporating Confidence Measures
in the Dutch Train Timetable Information System Developed in the ARISE Project
Authors: Gies Bouwman, Janienke Sturm, Louis Boves
Paper number 1504
Abstract: The use of Confidence Measures (CMs) in Spoken Dialog System (SDS) applications to suppress the number of verification turns for 'reliably correctly recognised utterances' can greatly reduce average dialog length, which enhances usability and increases user satisfaction [1]. This paper gives a brief but clear review of the method of CM assessment, which was presented in [2]. It proceeds by demonstrating how the Dutch ARISE (Automatic Railways Information Systems in Europe) SDS was equipped with this technology and shows in detail how the parameters involved are to be optimised. The evaluation reveals and explains a typical behaviour of this method with train timetable information-alike systems. This results in a set of conclusions that were not foreseen when the method was first developed for a directory information system. The paper ends with an outlook for solutions in new research directions.
IC991504.PDF (From Author) IC991504.PDF (Rasterized)

HMM and Neural Network based Speech Act Detection
Authors: Klaus Ries
Paper number 2173
Abstract: We present an incremental lattice generation approach to speech act detection for spontaneous and overlapping speech in telephone conversations (CallHome Spanish). At each stage of the process it is therefore possible to use different models after the initial HMM models have generated a reasonable set of hypotheses. These lattices can then be processed further by more complex models. This study shows how neural networks can be used very effectively in the classification of speech acts. We find that speech acts can be classified better using the neural net based approach than using the more classical ngram backoff model approach.
The best resulting neural network operates only on unigrams, and the integration of the ngram backoff model as a prior to the model reduces the performance of the model. The neural network is therefore more likely to be robust against errors from an LVCSR system and can potentially be trained from a smaller database.
IC992173.PDF (From Author) IC992173.PDF (Rasterized)

The LIMSI ARISE System for Train Travel Information
Authors: Lori F Lamel, Sophie Rosset, Jean-Luc S Gauvain, Samir K Bennacef
Paper number 2240
Abstract: In the context of the LE-3 ARISE project we have been developing a dialog system for vocal access to rail travel information. The system provides schedule information for the main French intercity connections, as well as simulated fares and reservations, reductions and services. Our goal is to obtain high dialog success rates with a very open structure, where the user is free to ask any question or to provide any information at any point in time. In order to improve performance with such an open dialog strategy, we make use of implicit confirmation using the caller's wording (when possible), and change to a more constrained dialog level when the dialog is not going well. In addition to our own assessment, the prototype system undergoes periodic user evaluations carried out by our partners at the French Railways.
IC992240.PDF (From Author) IC992240.PDF (Rasterized)

Improving The Suitability Of Imperfect Transcriptions For Information Retrieval From Spoken Documents
Authors: Matthew A Siegler, Michael J. Witbrock
Paper number 2442
Abstract: Recently there has been a considerable focus on information retrieval for multimedia databases. When speech is used as the source material for multimedia indexing, the effect of transcriber error on retrieval effectiveness must be considered.
This paper describes a method for measuring the relevance of documents to queries when information about the probability of word transcription error is available. To support the use of this technique, a method is presented for estimating word error probability in speech recognition engines that use word graphs (lattices). An information retrieval experiment using this technique on a large corpus of spoken documents is discussed. The method was able to reduce the difference in retrieval effectiveness between reference texts and hypothesized texts by 13%-38% depending on the size of the document set.
IC992442.PDF (From Author) IC992442.PDF (Rasterized)

Automatic Topic Identification for Two-Level Call Routing
Authors: John A Golden, Owen Kimball, Man-Hung Siu, Herbert Gish
Paper number 2468
Abstract: This paper presents an approach to routing telephone calls automatically, based upon their speech content. Our data consist of a set of calls collected from a customer-service center with a two-level menu, which allows jumping past the second level, and we view the routing of these calls as a topic-identification problem. Our topic identifier employs a multinomial model for keyword occurrences. We describe the call-routing task in detail, discuss the multinomial model, and present experiments which investigate several issues that arise from using the model for this task.
IC992468.PDF (From Author) IC992468.PDF (Rasterized)

Named Entity Tagged Language Models
Authors: Yoshihiko Gotoh, Steve Renals, Gethin Williams
Paper number 1984
Abstract: We introduce Named Entity (NE) Language Modelling, a stochastic finite state machine approach to identifying both words and NE categories from a stream of spoken data. We provide an overview of our approach to NE tagged language model (LM) generation together with results of the application of such a LM to the task of out-of-vocabulary (OOV) word reduction in large vocabulary speech recognition.
Using the Wall Street Journal and Broadcast News corpora, it is shown that the tagged LM was able to reduce the overall word error rate by 14%, detecting up to 70% of previously OOV words. We also describe an example of the direct tagging of spoken data with NE categories.
IC991984.PDF (From Author) IC991984.PDF (Rasterized)

Speech Translation: Coupling of Recognition and Translation
Authors: Hermann Ney, Lehrstuhl fuer Informatik VI, RWTH Aachen, University of Technology, D-52056 Aachen, Germany
Paper number 1675
Abstract: In speech translation, we are faced with the problem of how to couple the speech recognition process and the translation process. Starting from the Bayes decision rule for speech translation, we analyze how the interaction between the recognition process and the translation process can be modelled. In the light of this decision rule, we discuss the already existing approaches to speech translation. None of the existing approaches seems to have addressed this direct interaction. We suggest two new methods, the local averaging approximation and the monotone alignments.
IC991675.PDF (From Author) IC991675.PDF (Rasterized)

Probabilistic Models For Topic Detection And Tracking
Authors: Frederick G Walls, Hubert Jin, Sreenivasa Sista, Richard Schwartz
Paper number 2404
Abstract: We present probabilistic models for use in detecting and tracking topics in broadcast news stories. Our information retrieval (IR) models are formally explained. The Topic Detection and Tracking (TDT) initiative is discussed. The application of probabilistic models to the topic detection and tracking tasks is developed, and enhancements are discussed. We discuss four variations of these models, and we report our preliminary test results from the current TDT corpus.
IC992404.PDF (From Author) IC992404.PDF (Rasterized)