Session: Other Topics in ASR Robustness, Adaptation and Language Modeling

Title: A Language Model Adaptation Using Multiple Varied Corpora

Authors: Hirofumi Yamamoto, Yoshinori Sagisaka

Abstract: A new language model adaptation scheme is proposed to cope with multiple varied speech recognition tasks. Both topic differences and sentence-style differences resulting from the speaker's role are reflected in the proposed language model adaptation. Adaptation is carried out using two different corpora, one matched to the target only in topic and the other only in the speaker's style.

New word clustering techniques are introduced to extract the topic and style dependencies separately. Word neighboring characteristics in the two adaptation source corpora are regarded as different features in this clustering. All words are classified into commonly used classes and topic- or style-dependent classes. Furthermore, words that depend on the target topic or sentence style, together with their neighboring characteristics, are emphasized according to their frequency in the adaptation target data.
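The idea of treating neighboring characteristics from the two source corpora as different features can be illustrated with a minimal sketch. It is not the authors' algorithm: the function names (`neighbor_features`, `cosine`), the corpus labels (`"topic"`, `"style"`), the toy sentences, and the use of cosine similarity as a clustering criterion are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def neighbor_features(corpora):
    """Count left/right neighbors of each word, keyed by source corpus.

    corpora: dict mapping a corpus label to a list of tokenized sentences.
    Each feature key is (corpus label, side, neighbor word), so the same
    neighbor seen in two corpora yields two distinct features.
    """
    feats = defaultdict(Counter)
    for name, sentences in corpora.items():
        for sent in sentences:
            padded = ["<s>"] + sent + ["</s>"]
            for i in range(1, len(padded) - 1):
                word = padded[i]
                feats[word][(name, "left", padded[i - 1])] += 1
                feats[word][(name, "right", padded[i + 1])] += 1
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(v * b[k] for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy data (hypothetical): one corpus matched in topic, one in style.
corpora = {
    "topic": [["book", "a", "room"], ["reserve", "a", "room"]],
    "style": [["book", "a", "flight"], ["please", "book", "it"]],
}
feats = neighbor_features(corpora)

# Words with similar neighbor profiles are candidates for the same class.
similarity = cosine(feats["book"], feats["reserve"])
```

Because the corpus label is part of every feature key, a word whose neighbors differ between the two corpora ends up with a different profile from a word that behaves the same way in both, which is one plausible way to separate commonly used classes from topic- or style-dependent ones.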

In the evaluation experiment, the proposed method shows a 13% lower perplexity and a 9% lower word error rate compared with the conventional adaptation method.
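For readers unfamiliar with the metric, perplexity and a relative reduction can be computed as in the sketch below. It assumes the reported percentages are relative reductions, which the abstract does not state explicitly. The helper names and all numeric inputs are hypothetical and are not the paper's data.

```python
import math


def perplexity(log_probs):
    """Perplexity from natural-log word probabilities assigned by a model."""
    return math.exp(-sum(log_probs) / len(log_probs))


def relative_reduction(baseline, adapted):
    """Fractional reduction of a metric relative to the baseline value."""
    return (baseline - adapted) / baseline


# Made-up example: a model assigning probability 0.25 to each of four words.
toy_ppl = perplexity([math.log(0.25)] * 4)

# Made-up baseline and adapted perplexities, chosen only to show the arithmetic.
toy_gain = relative_reduction(100.0, 87.0)
```

The same `relative_reduction` helper applies to word error rate, since both figures compare an adapted model against a baseline on a lower-is-better metric.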
