Speaker Recognition

Speaker Verification Performance And The Length Of Test Sentence

Authors:

Jialong He
Li Liu

Page (NA) Paper number 1021

Abstract:

It is known that the performance of a speaker verification system improves with the length of the test sentence. However, little is known about the exact relation between performance and test length, which makes it difficult to compare results from studies that evaluated their systems with different test lengths. In this paper, we propose a method to calculate the verification error rate at any test-sentence length, provided the error rates at two different lengths are given. The accuracy of this calculation method is demonstrated with a speaker verification experiment and with results reported in the literature. Good agreement is shown between the calculated values and those measured experimentally.

IC991021.PDF (From Author) IC991021.PDF (Rasterized)


On The Use Of Some Divergence Measures In Speaker Recognition

Authors:

Rivarol Vergin
Douglas O'Shaughnessy

Page (NA) Paper number 1336

Abstract:

The first motivation for using Gaussian Mixture Models for text-independent speaker identification is the observation that a linear combination of Gaussian basis functions can represent a large class of sample distributions. While this technique generally gives good results, little is known about which specific part of a speech signal best identifies a speaker. This contribution suggests a procedure, based on the Jensen divergence measure, to automatically extract from the input speech signal the part that contributes most to identifying a speaker. Experiments conducted using the Spidre database indicate a significant improvement in the performance of the speaker recognition system.
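
As a rough illustration of the kind of measure mentioned above, the sketch below computes one common form of Jensen divergence (the Jensen difference, also known as the Jensen-Shannon divergence) between two discrete distributions. The abstract does not say which form the paper uses or how it is applied to speech segments, so the function names and the discrete-distribution setting are assumptions made only for this example.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as a list."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def jensen_divergence(p, q):
    """Jensen difference: H((p + q) / 2) - (H(p) + H(q)) / 2.

    Assumes p and q have the same length and each sums to 1.
    The value is 0 for identical distributions and log(2) when
    the two distributions do not overlap at all.
    """
    mixture = [(a + b) / 2.0 for a, b in zip(p, q)]
    return entropy(mixture) - (entropy(p) + entropy(q)) / 2.0


# Hypothetical example distributions, not data from the paper.
same = jensen_divergence([0.2, 0.8], [0.2, 0.8])
disjoint = jensen_divergence([1.0, 0.0], [0.0, 1.0])
```

A larger value indicates that two distributions are easier to tell apart, which is why a measure of this kind could plausibly be used to rank portions of a signal by how well they separate speakers.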

IC991336.PDF (From Author) IC991336.PDF (Rasterized)


Improving a GMM Speaker Verification System by Phonetic Weighting

Authors:

Roland Auckenthaler
Eluned S Parris
Michael J Carey

Page (NA) Paper number 1440

Abstract:

This paper compares two approaches to speaker verification: Gaussian mixture models (GMMs) and hidden Markov models (HMMs). The GMM-based system outperformed the HMM system, mainly because the GMM made better use of the training data. The best-scoring GMM frames were strongly correlated with particular phonemes, e.g. vowels and nasals. Two techniques were used to exploit the different amounts of discrimination provided by the phonemes and so improve the performance of the GMM-based system. Applying linear weighting to the phonemes showed that fewer than half of the phonemes were contributing to the overall system performance. Using an MLP to weight the phonemes provided a significant improvement in performance for male speakers, but no improvement has yet been achieved for female speakers.
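
The idea of weighting frames by phoneme can be sketched as a weighted average of per-frame scores. This is only an illustration under stated assumptions: the abstract does not describe how the weights are estimated or how the score is normalized, and the function name, the phoneme labels, and the weight values below are all hypothetical.

```python
def weighted_utterance_score(frame_scores, frame_phonemes, phoneme_weights):
    """Combine per-frame verification scores using per-phoneme weights.

    frame_scores:    one score per frame (for example a log-likelihood ratio)
    frame_phonemes:  the phoneme label assigned to each frame
    phoneme_weights: dict mapping phoneme label to a non-negative weight;
                     labels missing from the dict get weight 0 (an assumption)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for score, phoneme in zip(frame_scores, frame_phonemes):
        weight = phoneme_weights.get(phoneme, 0.0)
        weighted_sum += weight * score
        weight_total += weight
    # If no frame carries any weight, return a neutral score of 0.
    return weighted_sum / weight_total if weight_total > 0.0 else 0.0


# Hypothetical frames: a vowel, a fricative, and a nasal.
scores = [1.0, -0.5, 2.0]
labels = ["aa", "s", "n"]
weights = {"aa": 1.0, "n": 1.0, "s": 0.0}
utterance_score = weighted_utterance_score(scores, labels, weights)
```

Setting a phoneme's weight to zero removes its frames from the decision, which mirrors the observation that only some phonemes contribute to overall performance.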

IC991440.PDF (From Author) IC991440.PDF (Rasterized)


A Hybrid Score Measurement For HMM-Based Speaker Verification

Authors:

Yong Gu, Vocalis Ltd., UK
Trevor Thomas, Vocalis Ltd., UK

Page (NA) Paper number 1636

Abstract:

In speaker verification, both the world-model-based approach and the cohort-model-based approach have been used to obtain better HMM score measurements for the verification comparison. Theoretical analysis shows that these two approaches represent two different paradigms for the verification decision-making strategy, so the two techniques could be combined for a better solution. In this paper we present a hybrid score measurement that combines the world-model-based technique and the cohort-model-based technique. The method is evaluated on the YOHO database. The results show that the combination can lead to a better score measurement, which improves speaker verification performance. An experimental comparison between the world-model-based and cohort-model-based approaches on the YOHO database is also included in the paper.
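
The two normalization schemes and one possible way of combining them can be sketched as follows. The abstract does not give the combination rule, so the weighted sum, the `alpha` parameter, the use of the cohort average (rather than, say, the best cohort score), and all function names are assumptions for illustration only.

```python
def world_score(target_ll, world_ll):
    """Log-likelihood ratio of the claimed speaker model against a world model."""
    return target_ll - world_ll


def cohort_score(target_ll, cohort_lls):
    """Claimed-speaker log-likelihood minus the average cohort log-likelihood."""
    return target_ll - sum(cohort_lls) / len(cohort_lls)


def hybrid_score(target_ll, world_ll, cohort_lls, alpha=0.5):
    """Hypothetical hybrid: a convex combination of the two normalized scores.

    alpha = 1.0 gives the pure world-model score,
    alpha = 0.0 gives the pure cohort-model score.
    """
    return (alpha * world_score(target_ll, world_ll)
            + (1.0 - alpha) * cohort_score(target_ll, cohort_lls))


# Hypothetical log-likelihoods for one test utterance.
score = hybrid_score(-40.0, -45.0, [-44.0, -46.0, -48.0], alpha=0.5)
accept = score > 0.0  # the threshold of 0.0 is also an assumption
```

In practice the decision threshold and any mixing weight would have to be tuned on held-out data.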

IC991636.PDF (From Author) IC991636.PDF (Rasterized)


Polynomial Classifier Techniques for Speaker Verification

Authors:

William M Campbell
Khaled T Assaleh

Page (NA) Paper number 1735

Abstract:

Modern speaker verification applications require high accuracy at low complexity. We propose the use of a polynomial-based classifier to achieve this objective. We demonstrate a new combination of techniques which makes polynomial classification accurate and powerful for speaker verification. We show that discriminative training of polynomial classifiers can be performed on large data sets. A prior probability compensation method is detailed which increases accuracy and normalizes the output score range. Results are given for the application of the new methods to YOHO.

IC991735.PDF (From Author) IC991735.PDF (Rasterized)


Channel-Robust Speaker Identification using Modified-Mean Cepstral Mean Normalization with Frequency Warping

Authors:

Alvin A Garcia
Richard J Mammone

Page (NA) Paper number 2165

Abstract:

The performance of automatic speaker recognition systems is significantly degraded by acoustic mismatches between training and testing conditions. Such acoustic mismatches are commonly encountered in systems that operate on speech collected over telephone networks, where different handsets and different network routes impose varying convolutional distortions on the speech signal. A new algorithm, the Modified-Mean Cepstral Mean Normalization with Frequency Warping (MMCMNFW) method, which improves upon the commonly-employed Cepstral Mean Subtraction method, has been developed. Experimental results on closed-set speaker identification tasks on a channel-corrupted subset of the TIMIT database and on a subset of the NTIMIT database are presented. The new algorithm is shown to offer improved recognition rates over other existing channel normalization methods on these databases.
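
For context, the baseline Cepstral Mean Subtraction method named above can be sketched in a few lines. This is the standard baseline only, not the MMCMNFW algorithm, whose details are not given in the abstract. The sketch relies on the fact that a fixed convolutional channel appears as a constant additive offset in the cepstral domain; the function name and the example values are made up for illustration.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean taken over all frames of one utterance.

    frames: list of cepstral vectors (lists of floats), one per frame.
    """
    if not frames:
        return []
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(frame[k] for frame in frames) / n_frames
             for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


# Hypothetical cepstra for a clean utterance, and the same utterance after
# a fixed channel has added a constant offset to every frame.
clean = [[1.0, 2.0], [3.0, -2.0], [2.0, 0.0]]
channel = [0.5, -1.5]
distorted = [[c + h for c, h in zip(frame, channel)] for frame in clean]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_distorted = cepstral_mean_subtraction(distorted)
```

After normalization the clean and distorted versions coincide, which is the property that makes mean subtraction useful against handset and network mismatch.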

IC992165.PDF (From Author) IC992165.PDF (Rasterized)


Feature Selection Using Genetics-Based Algorithm and Its Application to Speaker Identification

Authors:

Mubeccel Demirekler
Ali Haydar

Page (NA) Paper number 5026

Abstract:

This paper introduces the use of a genetics-based algorithm to reduce a 24-parameter set (the base set) to a 5-, 6-, 7-, 8- or 10-parameter set for each speaker in text-independent speaker identification. Feature selection is done by finding the features that best discriminate a person from his/her two closest neighbors. The experimental results show an increase of approximately 5% in the recognition rate when the reduced parameter set is used. In addition, speaker recognition with the reduced feature set requires much less computation in the testing phase than recognition with the complete feature set. Hence it is preferable to use the subset of the complete feature set found by the suggested genetic algorithm.

IC995026.PDF (Scanned)
