Leveraging Translations for Speech Transcription in Low-resource Settings
Antonios Anastasopoulos and David Chiang
Abstract:
Recently proposed data collection frameworks for endangered language documentation aim not only to collect speech in the language of interest, but also to collect translations into a high-resource language that will render the collected resource interpretable. We focus on this scenario and explore whether we can improve transcription quality under these extremely low-resource settings with the assistance of text translations. We present a neural multi-source model and evaluate several variations of it on three low-resource datasets. We find that our multi-source model with shared attention outperforms the baselines, reducing transcription character error rate by up to 12.3%.
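The abstract's key idea, attention shared across multiple source encoders (here, speech and translation) rather than one attention per source, can be illustrated with a minimal sketch. This is a hypothetical simplification for intuition only, not the paper's actual architecture: a single dot-product scorer is applied over the concatenation of all source states to produce one context vector.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def shared_attention(query, sources):
    """One attention mechanism shared across all sources.

    query   : decoder state, a list of floats
    sources : list of encoder-state sequences (e.g. [speech_states,
              translation_states]); all states have the query's dimension.
    Returns a single context vector and the attention weights.
    (Illustrative dot-product scoring; the paper's scorer may differ.)
    """
    # Pool the states of every source into one sequence, so the same
    # scoring function attends over speech and translation jointly.
    states = [h for src in sources for h in src]
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * state[i] for w, state in zip(weights, states))
               for i in range(dim)]
    return context, weights
```

With shared parameters, the model needs no extra attention weights per source, which matters when training data is extremely scarce, as in the low-resource documentation scenario the abstract describes.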
Cite as: Anastasopoulos, A., Chiang, D. (2018) Leveraging Translations for Speech Transcription in Low-resource Settings. Proc. Interspeech 2018, 1279-1283, DOI: 10.21437/Interspeech.2018-2162.
BibTeX Entry:
@inproceedings{Anastasopoulos2018,
  author={Antonios Anastasopoulos and David Chiang},
  title={Leveraging Translations for Speech Transcription in Low-resource Settings},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={1279--1283},
  doi={10.21437/Interspeech.2018-2162},
  url={http://dx.doi.org/10.21437/Interspeech.2018-2162}
}