Title: Gaussian Mixture Models of Phonetic Boundaries For Speech Recognition
Authors: Mohamed Kamal Omar, Mark Hasegawa-Johnson, Stephen Levinson
Abstract:
A new approach to representing temporal correlation in an automatic speech recognition system is described.
It introduces an acoustic feature set that captures the dynamics of the speech signal at phoneme boundaries and combines it with the traditional acoustic feature
set, which represents the segments of speech that are assumed to be quasi-stationary. The newly introduced feature set is treated as
an observed random vector associated with the state transitions of the hidden Markov model (HMM). For the same complexity and number of parameters, this
approach improves phoneme recognition accuracy by 3.5% compared to context-independent HMMs. Stop consonant recognition accuracy is increased by 40%.
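
As a rough illustration of the idea, the sketch below scores a fixed state path in which every frame contributes a state-conditional Gaussian mixture term and every change of state contributes an additional mixture term for a boundary feature vector observed at that transition. It is a minimal sketch under stated assumptions, not the authors' implementation: the diagonal covariances, the function names (`log_gauss_diag`, `log_gmm`, `path_log_score`), the data layout, and the way the two log-likelihoods are simply added are all hypothetical choices made for this example.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, weights, means, variances):
    """Log density of a diagonal-covariance Gaussian mixture (log-sum-exp)."""
    terms = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def path_log_score(states, frame_feats, boundary_feats,
                   emit_gmms, trans_gmms, log_trans):
    """Log score of one state path (hypothetical combination rule).

    emit_gmms[s]       -> (weights, means, variances) for frames in state s
    trans_gmms[(a, b)] -> (weights, means, variances) for the boundary
                          feature observed when the path moves from a to b
    log_trans[(a, b)]  -> log transition probability from a to b
    boundary_feats[t] is only read at frames where the state changes.
    """
    score = log_gmm(frame_feats[0], *emit_gmms[states[0]])
    for t in range(1, len(states)):
        prev, cur = states[t - 1], states[t]
        score += log_trans[(prev, cur)]
        if prev != cur:
            # Assumption: the boundary observation is scored only on a
            # change of state, using a GMM tied to that transition.
            score += log_gmm(boundary_feats[t], *trans_gmms[(prev, cur)])
        score += log_gmm(frame_feats[t], *emit_gmms[cur])
    return score
```

In a full recognizer the same per-transition term would be added inside the Viterbi recursion rather than to a single fixed path; the fixed-path version is used here only to keep the sketch short.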