Title: Automatic Transcription of Spontaneous Lecture Speech
Authors: Tatsuya Kawahara, Hiroaki Nanjo, Sadaoki Furui
Abstract:
We introduce our extensive projects on spontaneous speech processing and current trials of lecture speech recognition. A large corpus of lecture presentations and talks is being collected in the project. We have trained initial baseline models and confirmed significant difference of real lectures and written notes. In spontaneous lecture speech, the speaking rate is generally faster and changes a lot, which makes it harder to apply fixed segmentation and decoding settings. Therefore, we propose sequential decoding and speaking-rate dependent decoding strategies. The sequential decoder simultaneously performs automatic segmentation and decoding of input utterances. Then, the most adequate acoustic analysis, phone models and decoding parameters are applied according to the current speaking rate. These strategies achieve improvement on automatic transcription of real lecture speech.
|