Deep Learning for Acoustic Echo Cancellation in Noisy and Double-Talk Scenarios
Abstract:
Traditional acoustic echo cancellation (AEC) works by identifying an acoustic impulse response with adaptive algorithms. We formulate AEC as a supervised speech separation problem that separates the loudspeaker signal and the near-end signal so that only the latter is transmitted to the far end. A recurrent neural network with bidirectional long short-term memory (BLSTM) is trained to estimate the ideal ratio mask from features extracted from the mixtures of near-end and far-end signals. The BLSTM-estimated mask is then applied to separate and suppress the far-end signal, thereby removing the echo. Experimental results show the effectiveness of the proposed method for echo removal in double-talk, background noise, and nonlinear distortion scenarios. In addition, the proposed method generalizes to untrained speakers.
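Illustrative code sketch:
The pipeline described in the abstract (features of the microphone and far-end signals, a BLSTM that estimates a ratio mask, and masked resynthesis of the near-end speech) can be summarized in a short sketch. The code below is not the authors' implementation: the choice of PyTorch, the log-magnitude STFT features, the STFT parameters (320-point frames with a 160-sample hop), the network sizes, and the names BLSTMMaskEstimator and enhance are all illustrative assumptions.

# Minimal sketch (assumed, not the paper's code) of mask-based echo removal:
# a BLSTM maps features of the microphone (near-end speech + echo) and the
# far-end reference to a ratio-mask estimate, which is applied to the
# microphone spectrogram to suppress the echo before resynthesis.
import torch
import torch.nn as nn

class BLSTMMaskEstimator(nn.Module):
    def __init__(self, n_freq=161, hidden=300, layers=2):
        super().__init__()
        # Input: concatenated log-magnitude features of microphone and far-end signals
        self.blstm = nn.LSTM(input_size=2 * n_freq, hidden_size=hidden,
                             num_layers=layers, batch_first=True,
                             bidirectional=True)
        self.out = nn.Sequential(nn.Linear(2 * hidden, n_freq), nn.Sigmoid())

    def forward(self, mic_logmag, far_logmag):
        # mic_logmag, far_logmag: (batch, frames, n_freq)
        x = torch.cat([mic_logmag, far_logmag], dim=-1)
        h, _ = self.blstm(x)
        return self.out(h)  # estimated ratio mask in [0, 1]

def enhance(mic_wave, far_wave, model, n_fft=320, hop=160):
    """Apply the estimated mask to the microphone STFT and resynthesize."""
    win = torch.hann_window(n_fft)
    mic_stft = torch.stft(mic_wave, n_fft, hop, window=win, return_complex=True)
    far_stft = torch.stft(far_wave, n_fft, hop, window=win, return_complex=True)
    mic_feat = torch.log1p(mic_stft.abs()).transpose(1, 2)  # (batch, frames, freq)
    far_feat = torch.log1p(far_stft.abs()).transpose(1, 2)
    mask = model(mic_feat, far_feat).transpose(1, 2)         # (batch, freq, frames)
    est_stft = mask * mic_stft                               # keep microphone phase
    return torch.istft(est_stft, n_fft, hop, window=win, length=mic_wave.shape[-1])

Only the inference path is sketched; per the abstract, training targets the ideal ratio mask computed from the near-end and echo signals, presumably with a regression loss such as mean squared error (the loss choice here is an assumption).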
Cite as: Zhang, H., Wang, D. (2018) Deep Learning for Acoustic Echo Cancellation in Noisy and Double-Talk Scenarios. Proc. Interspeech 2018, 3239-3243, DOI: 10.21437/Interspeech.2018-1484.
BiBTeX Entry:
@inproceedings{Zhang2018,
  author={Hao Zhang and DeLiang Wang},
  title={Deep Learning for Acoustic Echo Cancellation in Noisy and Double-Talk Scenarios},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={3239--3243},
  doi={10.21437/Interspeech.2018-1484},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1484}
}