Interspeech 2018

Robust Speaker Recognition from Distant Speech under Real Reverberant Environments Using Speaker Embeddings

Mahesh Kumar Nandwana, Julien van Hout, Mitchell McLaren, Allen Stauffer, Colleen Richey, Aaron Lawson and Martin Graciarena

Abstract:

This article focuses on speaker recognition from speech acquired with a single distant (far-field) microphone in an indoor environment. This differs from the majority of speaker recognition research, which focuses either on speech acquired over short distances, such as with a telephone handset or mobile device, or on far-field microphone arrays, for which beamforming can enhance distant speech signals. We use two large-scale corpora collected by retransmitting speech data in reverberant environments with multiple microphones placed at different distances. We first characterize three speaker recognition systems, ranging from a traditional universal background model (UBM) i-vector system to a state-of-the-art deep neural network (DNN) speaker embedding system with a probabilistic linear discriminant analysis (PLDA) back-end. We then assess the impact of microphone distance and placement, background noise, and loudspeaker orientation on the performance of speaker recognition systems for distant speech data. We observe that the recently introduced DNN speaker embedding systems are far more robust than i-vector systems, providing a significant relative improvement of up to 54% over the baseline UBM i-vector system and 45.5% over prior DNN-based speaker recognition technology.
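For readers unfamiliar with how relative improvements such as those quoted above are usually computed, the following is a minimal sketch. It assumes the underlying metric is an error rate where lower is better (for example, equal error rate); the abstract does not name the metric. The function name and the example values are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Return the percentage reduction in an error metric relative to a baseline.

    Assumes a lower-is-better metric (e.g. an error rate), which is an
    assumption here and is not stated in the abstract.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical error rates chosen only to illustrate the arithmetic;
    # they are NOT results reported in the paper.
    baseline = 10.0
    improved = 4.6
    print(f"{relative_improvement(baseline, improved):.1f}% relative improvement")
```

Under this definition, a relative improvement expresses the error reduction as a fraction of the baseline error, so it should not be read as an absolute difference in percentage points.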

Cite as: Nandwana, M.K., van Hout, J., McLaren, M., Stauffer, A., Richey, C., Lawson, A., Graciarena, M. (2018) Robust Speaker Recognition from Distant Speech under Real Reverberant Environments Using Speaker Embeddings. Proc. Interspeech 2018, 1106-1110, DOI: 10.21437/Interspeech.2018-2221.

BibTeX Entry:

@inproceedings{Nandwana2018,
author={Mahesh Kumar Nandwana and Julien {van Hout} and Mitchell McLaren and Allen Stauffer and Colleen Richey and Aaron Lawson and Martin Graciarena},
title={Robust Speaker Recognition from Distant Speech under Real Reverberant Environments Using Speaker Embeddings},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={1106--1110},
doi={10.21437/Interspeech.2018-2221},
url={http://dx.doi.org/10.21437/Interspeech.2018-2221} }