Interspeech 2018

Large Vocabulary Concatenative Resynthesis

Soumi Maiti, Joey Ching and Michael Mandel

Abstract:

Traditional speech enhancement systems reduce noise by modifying the noisy signal, an approach that suffers from two problems: under-suppression of noise and over-suppression of speech. As an alternative, this paper uses the recently introduced concatenative resynthesis approach, in which the noisy speech is replaced with its clean resynthesis. Such a system can produce output speech that is both noise-free and high quality. This paper generalizes our previous small-vocabulary system to a large vocabulary. To do so, we employ efficient decoding techniques based on fast approximate nearest neighbor (ANN) algorithms. First, we apply ANN techniques to the original small-vocabulary task and obtain a 5X speedup. We then apply the techniques to the construction of a large-vocabulary concatenative resynthesis system and scale the system up to a 12X larger dictionary. We perform listening tests with five participants to measure the subjective quality and intelligibility of the output speech.
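As a rough illustration of the decoding idea described in the abstract, the sketch below replaces each noisy chunk with its approximate nearest neighbor from a dictionary of clean chunks. It is not the authors' system: the abstract does not say which ANN algorithm or similarity measure is used, so random-hyperplane hashing and squared Euclidean distance are stand-in assumptions, and all function names (`hash_code`, `build_index`, `ann_query`, `resynthesize`) are hypothetical.

```python
import random


def hash_code(vec, planes):
    """Bucket key: the sign of vec's projection onto each random hyperplane."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) > 0 for plane in planes)


def build_index(dictionary, n_bits=8, seed=0):
    """Group clean dictionary vectors into buckets by their hash code."""
    rng = random.Random(seed)
    dim = len(dictionary[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    buckets = {}
    for idx, vec in enumerate(dictionary):
        buckets.setdefault(hash_code(vec, planes), []).append(idx)
    return planes, buckets


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed stand-in similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ann_query(query, dictionary, planes, buckets):
    """Search only the query's bucket; fall back to a full scan if it is empty."""
    candidates = buckets.get(hash_code(query, planes)) or range(len(dictionary))
    return min(candidates, key=lambda i: sq_dist(dictionary[i], query))


def resynthesize(noisy_chunks, dictionary, planes, buckets):
    """Replace each noisy chunk with its approximate nearest clean chunk."""
    return [dictionary[ann_query(q, dictionary, planes, buckets)]
            for q in noisy_chunks]
```

The speedup in a scheme like this comes from comparing each query against one bucket instead of the whole dictionary, at the cost of occasionally missing the true nearest neighbor when it falls in a different bucket.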

Cite as: Maiti, S., Ching, J., Mandel, M. (2018) Large Vocabulary Concatenative Resynthesis. Proc. Interspeech 2018, 1190-1194, DOI: 10.21437/Interspeech.2018-2383.

BibTeX Entry:

@inproceedings{Maiti2018,
author={Soumi Maiti and Joey Ching and Michael Mandel},
title={Large Vocabulary Concatenative Resynthesis},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={1190--1194},
doi={10.21437/Interspeech.2018-2383},
url={http://dx.doi.org/10.21437/Interspeech.2018-2383} }