Session: Other Topics in ASR Robustness, Adaptation and Language Modeling

Title: Pseudo 2-Dimensional Hidden Markov Models in Speech Recognition

Authors: Steffen Werner, Gerhard Rigoll

Abstract: This paper discusses the use of pseudo 2-dimensional Hidden Markov Models (HMMs) for speech recognition. The method, which originates in image processing, is intended to model the time-frequency structure of speech signals better than a standard HMM. It computes the emission probability of each state of a standard HMM with an HMM embedded in that state. If a temporal sequence of spectral vectors is viewed as a spectrogram, this amounts to a 2-dimensional warping of the spectrogram. The additional warping of the frequency axis could be useful for speaker-independent recognition and can be considered similar to a vocal tract normalization. The effects of this approach are investigated using the TI-Digits database.
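
The embedded-HMM emission computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar Gaussian emissions per frequency bin, scores both the embedded and the outer HMM with the forward algorithm in the log domain, and uses hypothetical names throughout (`embedded_emission`, `p2dhmm_loglik`, and the dictionary keys `means`, `vars`, `log_init`, `log_trans`).

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def gauss_logpdf(x, mean, var):
    """Log density of a scalar Gaussian (an assumed emission model)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def forward_loglik(emit, log_init, log_trans):
    """Forward algorithm in the log domain.

    emit[t][j] is the log emission score of observation t in state j.
    """
    n = len(log_init)
    alpha = [log_init[j] + emit[0][j] for j in range(n)]
    for t in range(1, len(emit)):
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n)]) + emit[t][j]
            for j in range(n)
        ]
    return logsumexp(alpha)


def embedded_emission(frame, sub):
    """Log emission score of one spectral frame for one outer state.

    The embedded HMM runs along the frequency axis of the frame, so its
    state alignment acts as a warping of that axis.
    """
    emit = [
        [gauss_logpdf(x, m, v) for m, v in zip(sub["means"], sub["vars"])]
        for x in frame
    ]
    return forward_loglik(emit, sub["log_init"], sub["log_trans"])


def p2dhmm_loglik(spectrogram, superstates, log_init, log_trans):
    """Log-likelihood of a spectrogram (list of frames) under the outer HMM.

    Each outer state's emission score comes from its embedded HMM.
    """
    emit = [[embedded_emission(f, s) for s in superstates] for f in spectrogram]
    return forward_loglik(emit, log_init, log_trans)
```

In this sketch the outer HMM aligns frames along the time axis while each embedded HMM aligns frequency bins within a frame, which together give the 2-dimensional warping of the spectrogram mentioned above. Forbidden transitions can be encoded as `-math.inf` entries in the log transition matrices.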