Detection of Glottal Closure Instants from Speech Signals: A Convolutional Neural Network Based Method
Shuai Yang, Zhiyong Wu, Binbin Shen and Helen Meng
Abstract:
Most conventional methods for detecting glottal closure instants (GCIs) rely on signal processing techniques combined with various GCI candidate selection methods. This paper proposes a classification method that detects glottal closure instants from speech waveforms using a convolutional neural network (CNN). The procedure is divided into two successive steps. First, a low-pass filtered signal is computed, and its negative peaks are taken as candidates for GCI placement. Second, a CNN-based classification model determines, for each peak, whether or not it corresponds to a GCI. The method is compared with three existing GCI detection algorithms on two publicly available databases. The proposed method achieves a detection accuracy, in terms of F1-score, of 98.23%. An additional experiment indicates that the model performs better when it is trained with speech data from the same speakers as those in the test set.
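The first step (candidate selection) can be illustrated with a minimal sketch. The abstract does not specify the filter design, so the moving-average low-pass filter, the window width, and the function names `moving_average`, `negative_peaks`, and `gci_candidates` below are illustrative assumptions, not the authors' implementation. The second step (the CNN classifier that accepts or rejects each candidate) is not shown.

```python
import math


def moving_average(signal, width):
    """Crude low-pass filter (assumed here): centered moving average.

    Windows are truncated at the signal edges.
    """
    half = width // 2
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def negative_peaks(signal):
    """Return indices of strict local minima whose value is below zero."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] < 0
        and signal[i] < signal[i - 1]
        and signal[i] < signal[i + 1]
    ]


def gci_candidates(speech, width=9):
    """Low-pass filter the waveform, then take negative peaks as candidates."""
    return negative_peaks(moving_average(speech, width))


# Toy input (not real speech): four periods of a 100 Hz sine sampled at 8 kHz.
fs = 8000
f0 = 100
speech = [math.sin(2 * math.pi * f0 * n / fs) for n in range(4 * fs // f0)]
candidates = gci_candidates(speech)
print(candidates)
```

On this toy signal the sketch yields one candidate per period. On real speech the candidate list would also contain peaks that are not GCIs, which is why the method passes each candidate to a classifier in the second step.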
Cite as: Yang, S., Wu, Z., Shen, B., Meng, H. (2018) Detection of Glottal Closure Instants from Speech Signals: A Convolutional Neural Network Based Method. Proc. Interspeech 2018, 317-321, DOI: 10.21437/Interspeech.2018-1281.
BibTeX Entry:
@inproceedings{Yang2018,
author={Shuai Yang and Zhiyong Wu and Binbin Shen and Helen Meng},
title={Detection of Glottal Closure Instants from Speech Signals: A Convolutional Neural Network Based Method},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={317--321},
doi={10.21437/Interspeech.2018-1281},
url={http://dx.doi.org/10.21437/Interspeech.2018-1281} }