Mark Gales - Cambridge University (U.K.)

Title: Adaptive Training for Robust ASR

Abstract: Adaptive training is a powerful technique for building speech recognition systems on non-homogeneous data. The aim is to separate unwanted variability, such as changes in speaker, channel or acoustic environment, from the desired variability, namely the acoustic differences between words. During training, two sets of models are therefore generated: a canonical model set, normally a collection of HMMs, which represents the desired "true" variability of the speech data, and a set of transforms, which represents the unwanted variability. A canonical model trained in this fashion should represent only the desired variability of the data and should be more amenable to adaptation to a particular target condition. During recognition, a transform to the target domain is estimated; this target-specific transform and the canonical model are then used together in the recognition process. This paper examines the theory and assumptions underlying adaptive training. It also describes the use of adaptive training schemes in current state-of-the-art tasks and discusses how such schemes may be used in the future.
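
The interleaved estimation described in the abstract can be illustrated with a deliberately small sketch. It is not the method of the paper: as simplifying assumptions, each word is modelled by a single scalar mean instead of an HMM, and each speaker's transform is a single additive bias. The names `train_adaptive`, `estimate_bias` and `recognise`, and the toy data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def train_adaptive(data, n_iters=5):
    """Alternate between transform updates and canonical-model updates.

    data: list of (speaker, word, value) observations.
    Returns (means, biases): one canonical mean per word and one
    additive bias (the toy "transform") per training speaker.
    """
    means = {w: 0.0 for _, w, _ in data}
    biases = {s: 0.0 for s, _, _ in data}
    for _ in range(n_iters):
        # Step 1: re-estimate each speaker's bias with the canonical means fixed.
        total, count = defaultdict(float), defaultdict(int)
        for s, w, x in data:
            total[s] += x - means[w]
            count[s] += 1
        biases = {s: total[s] / count[s] for s in biases}
        # Step 2: re-estimate the canonical means with the biases fixed.
        total, count = defaultdict(float), defaultdict(int)
        for s, w, x in data:
            total[w] += x - biases[s]
            count[w] += 1
        means = {w: total[w] / count[w] for w in means}
    return means, biases


def estimate_bias(means, labelled):
    """Estimate a bias for a new speaker from (word, value) adaptation data."""
    return sum(x - means[w] for w, x in labelled) / len(labelled)


def recognise(means, bias, x):
    """Return the word whose adapted mean (canonical mean + bias) is closest to x."""
    return min(means, key=lambda w: abs(x - (means[w] + bias)))


# Toy data (assumed): word "a" sits near 1.0 and "b" near 3.0;
# speaker s1 shifts both by -0.5 and speaker s2 by +0.5.
DATA = [
    ("s1", "a", 0.5), ("s1", "b", 2.5),
    ("s2", "a", 1.5), ("s2", "b", 3.5),
]
means, biases = train_adaptive(DATA)
new_bias = estimate_bias(means, [("a", 2.5)])  # unseen speaker, shifted by +1.5
print(recognise(means, new_bias, 4.5))
```

In this toy setting the canonical means and the biases are determined only up to a shared offset, so the individual values are not meaningful on their own. What the canonical model retains is the spacing between the word means, which is the desired variability, while the per-speaker biases absorb the rest.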

Curriculum: Mark Gales studied for the B.A. in Electrical and Information Sciences at the University of Cambridge from 1985 to 1988. Following graduation he worked as a consultant at Roke Manor Research Ltd. In 1991 he took up a position as a Research Associate in the Speech, Vision and Robotics group in the Engineering Department at Cambridge University. In 1996 he completed his doctoral thesis, "Model-Based Techniques for Robust Speech Recognition", supervised by Professor Steve Young. From 1995 to 1997 he was a Research Fellow at Emmanuel College, Cambridge. He was then a Research Staff Member in the Speech group at the IBM T.J. Watson Research Center until 1999. He is currently a University Lecturer at Cambridge University Engineering Department and a Fellow of Emmanuel College. His research interests are large vocabulary continuous speech recognition, robust speech recognition and machine learning.
