A Deep Identity Representation for Noise Robust Spoofing Detection
Alejandro Gómez Alanís, Antonio M. Peinado, Jose A. Gonzalez and Angel Gomez
Abstract:
The issue of spoofing attacks that may affect automatic speaker verification (ASV) systems has recently received increased attention, and a number of countermeasures have been developed for detecting high-technology attacks such as speech synthesis and voice conversion. However, the performance of anti-spoofing systems degrades significantly in noisy conditions. To address this issue, we propose a deep learning framework for extracting spoofing identity vectors, as well as the use of soft missing-data masks. The proposed feature extraction employs a convolutional neural network (CNN) plus a recurrent neural network (RNN) in order to provide a single deep feature vector per utterance. The CNN is treated as a convolutional feature extractor that operates at the frame level, and the RNN is employed on top of the CNN outputs to obtain a single spoofing identity representation of the whole utterance. Experimental evaluation is carried out on both a clean and a noisy version of the ASVspoof 2015 corpus. The experimental results show that our proposal clearly outperforms other recently proposed methods, such as the popular CQCC+GMM system or similar deep feature systems, in both seen and unseen noisy conditions.
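As a rough illustration of the architecture the abstract describes (a frame-level CNN feature extractor whose outputs are aggregated by an RNN into a single utterance-level spoofing identity vector), the following PyTorch sketch may be helpful. All layer sizes, the input feature dimensionality, and the choice of a GRU are assumptions for the example and are not taken from the paper.

import torch
import torch.nn as nn

class SpoofingIdentityExtractor(nn.Module):
    """Minimal sketch: CNN over frame-level features, then an RNN that
    summarizes the whole utterance into one identity vector.
    Hyperparameters are illustrative, not the paper's configuration."""

    def __init__(self, n_feats=64, cnn_channels=32, rnn_hidden=128, emb_dim=64):
        super().__init__()
        # Frame-level convolutional feature extractor over a time x frequency
        # representation (e.g. a log spectrogram); pooling only along frequency
        # so the number of frames is preserved for the RNN.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, cnn_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),
            nn.Conv2d(cnn_channels, cnn_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),
        )
        # Recurrent layer that aggregates the CNN outputs over time.
        self.rnn = nn.GRU(cnn_channels * (n_feats // 4), rnn_hidden, batch_first=True)
        # Linear projection to the final spoofing identity vector.
        self.embedding = nn.Linear(rnn_hidden, emb_dim)

    def forward(self, x):
        # x: (batch, time, n_feats) frame-level acoustic features
        b, t, _ = x.shape
        h = self.cnn(x.unsqueeze(1))                  # (batch, C, time, n_feats // 4)
        h = h.permute(0, 2, 1, 3).reshape(b, t, -1)   # (batch, time, C * n_feats // 4)
        _, last = self.rnn(h)                         # final hidden state summarizes the utterance
        return self.embedding(last.squeeze(0))        # (batch, emb_dim) identity vector

if __name__ == "__main__":
    utterance = torch.randn(2, 300, 64)               # 2 utterances, 300 frames, 64 features
    print(SpoofingIdentityExtractor()(utterance).shape)  # torch.Size([2, 64])

The resulting per-utterance vector would then feed a downstream genuine/spoofed classifier; the soft missing-data masks mentioned in the abstract are not modeled in this sketch.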
Cite as: Gómez Alanís, A., Peinado, A.M., Gonzalez, J.A., Gomez, A. (2018) A Deep Identity Representation for Noise Robust Spoofing Detection. Proc. Interspeech 2018, 676-680, DOI: 10.21437/Interspeech.2018-1909.
BibTeX Entry:
@inproceedings{GomezAlanis2018,
author={Alejandro {Gómez Alanís} and Antonio M. Peinado and Jose A. Gonzalez and Angel Gomez},
title={A Deep Identity Representation for Noise Robust Spoofing Detection},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={676--680},
doi={10.21437/Interspeech.2018-1909},
url={http://dx.doi.org/10.21437/Interspeech.2018-1909} }