
Session: Audio-video Information Retrieval and Digital Archives - Multilingual and Speech-to-Speech Translation

Title: UNSUPERVISED TRAINING OF ACOUSTIC MODELS FOR LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION

Authors: Frank Wessel, Hermann Ney

Abstract: For speech recognition systems, the amount of acoustic training data is of crucial importance. In the past, large amounts of speech were thus recorded and transcribed manually for training. Since untranscribed speech is available in various forms these days, the unsupervised training of a speech recognizer on recognized transcriptions is studied in this paper. A low-cost recognizer trained with only one hour of manually transcribed speech is used to recognize 72 hours of untranscribed acoustic data. These transcriptions are then used in combination with confidence measures to train an improved recognizer. The effect of confidence measures, which are used to detect possible recognition errors, is studied systematically. Finally, the unsupervised training is applied iteratively. Using this method, the recognizer is trained with very little manual effort while losing only 14.3% relative on the Broadcast News '96 and 18.6% relative on the Broadcast News '98 evaluation test sets.
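The abstract describes an iterative self-training loop: a seed recognizer trained on a small transcribed set labels the untranscribed audio, confidence scores filter out likely recognition errors, and the surviving automatic transcriptions are added to the training pool before retraining. The sketch below illustrates that general loop only; the function names (train, recognize), the confidence threshold, and the utterance-level filtering are illustrative assumptions, not the authors' implementation, which studies confidence measures in more detail.

```python
from typing import Callable, List, Tuple

Utterance = str          # stand-in for an audio segment
Transcript = str         # word sequence, hand-written or hypothesized

def self_training(
    train: Callable[[List[Tuple[Utterance, Transcript]]], object],
    recognize: Callable[[object, Utterance], Tuple[Transcript, float]],
    labeled: List[Tuple[Utterance, Transcript]],     # e.g. ~1 hour, manually transcribed
    unlabeled: List[Utterance],                      # e.g. 72 hours, untranscribed
    confidence_threshold: float = 0.7,               # assumed value, not from the paper
    iterations: int = 3,
) -> object:
    """Iteratively grow the training set with confident automatic transcriptions."""
    model = train(labeled)                           # seed model from the small transcribed set
    for _ in range(iterations):
        auto_labeled = []
        for utt in unlabeled:
            hyp, conf = recognize(model, utt)        # hypothesis plus confidence score
            if conf >= confidence_threshold:         # keep only likely-correct transcriptions
                auto_labeled.append((utt, hyp))
        model = train(labeled + auto_labeled)        # retrain on the enlarged corpus
    return model
```

For simplicity the sketch accepts or rejects whole utterances; confidence measures could equally be applied at the word level to discard individual likely errors, which is closer to the systematic study the abstract mentions.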

a01fw018.ps a01fw018.pdf