Training Recurrent Neural Network through Moment Matching for NLP Applications

Yue Deng, Yilin Shen, KaWai Chen and Hongxia Jin

Abstract:

Recurrent neural networks (RNNs) are conventionally trained in a supervised mode but run in a free-running mode when making inferences on test samples. The supervised mode feeds ground-truth tokens to the RNN as inputs, whereas the free-running mode can only feed back self-predicted tokens as surrogate inputs. This inconsistency inevitably results in poor generalization of the RNN on out-of-sample data. We propose a moment matching (MM) training strategy that alleviates the inconsistency by simultaneously taking these two distinct modes and their corresponding dynamics into consideration. Our MM-RNN shows significant performance improvements over existing approaches when tested on practical NLP applications including logic form generation and image captioning.
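
To make the train/test mismatch (often called exposure bias) concrete, the sketch below unrolls a toy decoder in both modes and adds a moment-matching penalty on the resulting hidden-state dynamics. This is a minimal illustrative sketch in PyTorch, not the authors' exact formulation: the decoder architecture, the choice of matching the first and second moments of the hidden-state trajectories over time, and the penalty weight are all assumptions made for exposition.

import torch
import torch.nn as nn
import torch.nn.functional as F

class MMDecoder(nn.Module):
    """Toy RNN decoder that can be unrolled in either mode."""
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.cell = nn.GRUCell(emb_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def unroll(self, tokens, h, free_running):
        # tokens: (batch, T) ground-truth sequence starting with <bos>
        logits, states = [], []
        inp = tokens[:, 0]
        for t in range(1, tokens.size(1)):
            h = self.cell(self.embed(inp), h)
            logit = self.out(h)
            logits.append(logit)
            states.append(h)
            # Supervised mode feeds the ground-truth token; free-running
            # mode can only feed back the self-predicted (argmax) token.
            inp = logit.argmax(dim=-1) if free_running else tokens[:, t]
        return torch.stack(logits, dim=1), torch.stack(states, dim=1)

def moment_matching_loss(h_sup, h_free):
    # Penalize the gap between the first and second moments (over time)
    # of the hidden-state dynamics produced by the two modes.
    mu_gap = (h_sup.mean(dim=1) - h_free.mean(dim=1)).pow(2).sum(dim=-1)
    var_gap = (h_sup.var(dim=1) - h_free.var(dim=1)).pow(2).sum(dim=-1)
    return (mu_gap + var_gap).mean()

# Hypothetical usage: cross-entropy on the teacher-forced run plus the
# moment-matching penalty (the weight 0.1 is an arbitrary assumption).
vocab = 1000
dec = MMDecoder(vocab)
tokens = torch.randint(0, vocab, (4, 12))   # dummy batch of sequences
h0 = torch.zeros(4, 256)
logits, h_sup = dec.unroll(tokens, h0, free_running=False)
_, h_free = dec.unroll(tokens, h0, free_running=True)
ce = F.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
loss = ce + 0.1 * moment_matching_loss(h_sup, h_free)
loss.backward()

In practice the MM term would be computed per batch alongside the task loss; the paper's actual moment definitions and weighting may differ from this sketch.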


Cite as: Deng, Y., Shen, Y., Chen, K., Jin, H. (2018) Training Recurrent Neural Network through Moment Matching for NLP Applications. Proc. Interspeech 2018, 3353-3357, DOI: 10.21437/Interspeech.2018-1369.


BiBTeX Entry:

@inproceedings{Deng2018,
  author={Yue Deng and Yilin Shen and KaWai Chen and Hongxia Jin},
  title={Training Recurrent Neural Network through Moment Matching for NLP Applications},
  year={2018},
  booktitle={Proc. Interspeech 2018},
  pages={3353--3357},
  doi={10.21437/Interspeech.2018-1369},
  url={http://dx.doi.org/10.21437/Interspeech.2018-1369}
}