On Training and Evaluation of Grapheme-to-Phoneme Mappings with Limited Data
Abstract:
When scaling to low-resource languages for speech synthesis or speech recognition in an industrial setting, a common challenge is the absence of a readily available pronunciation lexicon. Common alternatives are handwritten letter-to-sound rules and data-driven grapheme-to-phoneme (G2P) models, but without a pronunciation lexicon it is hard even to determine their quality. We identify properties of a good quality metric and note drawbacks of naive estimates of G2P quality when test sets are small. We demonstrate a novel method for reliable evaluation of G2P accuracy with minimal human effort. We also compare the behavior of known state-of-the-art approaches for training with limited data. Finally, we evaluate a new active learning approach for training G2P models in the low-resource setting.
Cite as: Sharma, D. (2018) On Training and Evaluation of Grapheme-to-Phoneme Mappings with Limited Data. Proc. Interspeech 2018, 2858-2862, DOI: 10.21437/Interspeech.2018-1920.
BibTeX Entry:
@inproceedings{Sharma2018,
author={Dravyansh Sharma},
title={On Training and Evaluation of Grapheme-to-Phoneme Mappings with Limited Data},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={2858--2862},
doi={10.21437/Interspeech.2018-1920},
url={http://dx.doi.org/10.21437/Interspeech.2018-1920}
}