Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning

Abhinav Jain, Minali Upreti and Preethi Jyothi

Abstract:

One of the major remaining challenges for modern automatic speech recognition (ASR) systems for English is handling speech from users with a diverse set of accents. ASR systems trained on speech from multiple English accents still underperform when confronted with a new accent. In this work, we explore how to use accent embeddings and multi-task learning to improve recognition of accented speech. We propose a multi-task architecture that jointly learns an accent classifier and a multi-accent acoustic model. We also consider augmenting the speech input with accent information in the form of embeddings extracted by a separate network. Together, these techniques give significant relative performance improvements of 15% and 10% over a multi-accent baseline system on test sets containing seen and unseen accents, respectively.
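
The two ideas in the abstract, appending an accent embedding to the speech input and training an acoustic model jointly with an accent classifier, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame-level concatenation, the interpolated form of the loss, and the `accent_weight` value are all assumptions chosen for illustration, and the toy numbers are made up.

```python
import math


def augment_frames(frames, accent_embedding):
    """Append the same utterance-level accent embedding to every acoustic frame.

    Assumption: the embedding is fixed per utterance and simply concatenated.
    """
    return [list(frame) + list(accent_embedding) for frame in frames]


def cross_entropy(probs, target):
    """Negative log-probability assigned to the target class."""
    return -math.log(probs[target])


def multitask_loss(phone_probs, phone_target, accent_probs, accent_target,
                   accent_weight=0.1):
    """Interpolate the acoustic-model loss with the accent-classifier loss.

    Assumption: a simple weighted sum; the weighting scheme is hypothetical.
    """
    primary = cross_entropy(phone_probs, phone_target)      # acoustic model task
    auxiliary = cross_entropy(accent_probs, accent_target)  # accent classifier task
    return (1.0 - accent_weight) * primary + accent_weight * auxiliary


# Toy usage: two 2-dimensional frames and a 3-dimensional accent embedding.
frames = [[0.1, 0.2], [0.3, 0.4]]
embedding = [1.0, 0.0, 0.5]
augmented = augment_frames(frames, embedding)

loss = multitask_loss([0.7, 0.2, 0.1], 0, [0.6, 0.4], 1, accent_weight=0.1)
```

In a real system both output distributions would come from network branches that share lower layers, so the gradient of the auxiliary accent loss also shapes the shared representation used by the acoustic model.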


Cite as: Jain, A., Upreti, M., Jyothi, P. (2018) Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning. Proc. Interspeech 2018, 2454-2458, DOI: 10.21437/Interspeech.2018-1864.


BibTeX Entry:

@inproceedings{Jain2018,
  author={Abhinav Jain and Minali Upreti and Preethi Jyothi},
  title={Improved Accented Speech Recognition Using Accent Embeddings and Multi-task Learning},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={2454--2458},
  doi={10.21437/Interspeech.2018-1864},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1864}
}