How vulnerable are prosodic features to professional imitators?

Mireia Farrus, Michael Wagner, Jan Anguita and Javier Hernando

Abstract

Voice imitation is one of the potential threats to security systems that use automatic speaker recognition. Since prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to how vulnerable these features are to voice mimicking. In this study, two experiments are conducted for twelve individual features in order to determine how a prosodic speaker identification system would perform against professionally imitated voices. Analysis of the prosodic parameters shows that the identification error rate increases for most of the features, except for the range of the fundamental frequency, which seems to be relatively robust against voice mimicking. When all twelve features are fused, the identification error rate increases from 5% between the target voices and the imitators’ natural voices to 22% between the target voices and the imitators’ impersonations.


 
