Session: Other Topics in ASR Robustness, Adaptation and Language Modeling

Title: Automatic Selection of Transcribed Training Material

Authors: Teresa Kamm, Gerard Meyer

Abstract: Conventional wisdom holds that incorporating more training data is the surest way to reduce the error rate of a speech recognition system. This, in turn, makes speech recognition systems expensive to train, because annotating training data is costly. In this paper, we propose an iterative training algorithm that seeks to reduce the error rate of a speech recognizer without incurring additional transcription cost, by selecting a subset of the already available transcribed training data. We apply the proposed algorithm to an alphadigit recognition problem and reduce the error rate from 10.3% to 9.4% on a particular test set.

a01tk068.ps a01tk068.pdf