Abstract: Session SP-10

SP-10.1  

SPEAKER VERIFICATION PERFORMANCE AND THE LENGTH OF TEST SENTENCE
Jialong He, Li Liu (Dept. Speech & Hearing Science, Arizona State University)

It is known that the performance of a speaker verification system improves with the length of the test sentence. However, little is known about the exact relation between performance and test length, which makes it difficult to compare results from studies that used different test lengths to evaluate their systems. In this paper, we propose a method to calculate the verification error rate at any test-sentence length, provided the error rates at two different lengths are given. The accuracy of this calculation method is demonstrated with a speaker verification experiment and with results reported in the literature. Good agreement is shown between the calculated values and those measured experimentally.


SP-10.2  

ON THE USE OF SOME DIVERGENCE MEASURES IN SPEAKER RECOGNITION
Rivarol Vergin (Universite de Moncton), Douglas O'Shaughnessy (INRS-Telecommunications)

The first motivation for using Gaussian Mixture Models for text-independent speaker identification is the observation that a linear combination of Gaussian basis functions is capable of representing a large class of sample distributions. While this technique generally gives good results, little is known about which specific part of a speech signal best identifies a speaker. This contribution suggests a procedure, based on the Jensen divergence measure, to automatically extract from the input speech signal the part that contributes most to identifying a speaker. Experiments conducted using the Spidre database indicate a significant improvement in the performance of the speaker recognition system.
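
As a rough illustration of a Jensen-type divergence, the sketch below computes the Jensen-Shannon divergence between two discrete distributions. It is not the authors' procedure: the abstract does not say which form of the Jensen divergence is used or how speech segments are selected, so the function names and the toy distributions are assumptions made for illustration only.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: H((p + q) / 2) - (H(p) + H(q)) / 2."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2.0

# Toy distributions (hypothetical, not taken from the paper).
same = jensen_shannon([0.5, 0.5], [0.5, 0.5])      # identical distributions
disjoint = jensen_shannon([1.0, 0.0], [0.0, 1.0])  # non-overlapping distributions
```

Identical distributions give a divergence of zero, and fully disjoint ones give the maximum, log 2. A segment-selection scheme could plausibly rank segments by such a value, but that step is not described in the abstract.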


SP-10.3  

Improving a GMM Speaker Verification System by Phonetic Weighting
Roland Auckenthaler, Eluned S Parris, Michael J Carey (Ensigma Ltd)

This paper compares two approaches to speaker verification: Gaussian mixture models (GMMs) and hidden Markov models (HMMs). The GMM-based system outperformed the HMM system, mainly because the GMM made better use of the training data. The best-scoring GMM frames were strongly correlated with particular phonemes, e.g. vowels and nasals. Two techniques were used to try to exploit the different amounts of discrimination provided by the phonemes to improve the performance of the GMM-based system. Applying linear weighting to the phonemes showed that fewer than half of the phonemes were contributing to the overall system performance. Using an MLP to weight the phonemes provided a significant improvement in performance for male speakers, but no improvement has yet been achieved for female speakers.
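
The idea of weighting frames by phoneme can be sketched as a weighted mean of per-frame scores. This is a minimal illustration under assumptions: the function `weighted_score`, the phoneme labels, and the weight values are hypothetical, and the abstract does not describe how the linear or MLP weights were actually obtained.

```python
def weighted_score(frame_scores, frame_phonemes, phoneme_weights):
    """Weighted mean of per-frame log-likelihood ratios.

    Frames whose phoneme has no entry in phoneme_weights get weight 0.
    """
    num = 0.0
    den = 0.0
    for score, phoneme in zip(frame_scores, frame_phonemes):
        w = phoneme_weights.get(phoneme, 0.0)
        num += w * score
        den += w
    return num / den if den > 0.0 else 0.0

# Hypothetical frame scores and phoneme labels for one utterance.
scores = [1.0, 0.2, -0.5, 0.8]
phonemes = ["aa", "s", "t", "m"]
# Assumed weights: only the vowel and the nasal contribute.
weights = {"aa": 1.0, "m": 1.0, "s": 0.0, "t": 0.0}
result = weighted_score(scores, phonemes, weights)
```

With these toy weights only the vowel and nasal frames affect the utterance score, which mirrors the observation that a minority of phonemes carries most of the discrimination.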


SP-10.4  

A HYBRID SCORE MEASUREMENT FOR HMM-BASED SPEAKER VERIFICATION
Yong Gu, Trevor Thomas (Vocalis Ltd., UK)

In speaker verification, the world-model-based approach and the cohort-model-based approach have both been used to obtain better HMM score measurements for the verification comparison. Theoretical analysis shows that these two approaches represent two different paradigms for the verification decision-making strategy. The two techniques could be combined for a better solution. In this paper we present a hybrid score measurement that combines the world-model-based technique and the cohort-model-based technique. The method is evaluated with the YOHO database. The results show that the combination can lead to a better score measurement, which improves speaker verification performance. An experimental comparison between the world-model-based approach and the cohort-model-based approach on the YOHO database is also given in the paper.
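
One simple way to picture a combined score is a linear blend of a world-normalized and a cohort-normalized log-likelihood. The sketch below is an assumption for illustration: the abstract does not give the actual combination rule, and the function names, the blend weight `alpha`, the use of the best cohort score, and the numbers are all hypothetical.

```python
def world_score(claimant_ll, world_ll):
    """Claimant log-likelihood normalized by a world (background) model."""
    return claimant_ll - world_ll

def cohort_score(claimant_ll, cohort_lls):
    """Claimant log-likelihood normalized by the best-scoring cohort model."""
    return claimant_ll - max(cohort_lls)

def hybrid_score(claimant_ll, world_ll, cohort_lls, alpha=0.5):
    """Assumed linear blend of the two normalized scores (0 <= alpha <= 1)."""
    return (alpha * world_score(claimant_ll, world_ll)
            + (1.0 - alpha) * cohort_score(claimant_ll, cohort_lls))

# Hypothetical log-likelihoods for one test utterance.
score = hybrid_score(-40.0, -45.0, [-44.0, -47.0, -50.0], alpha=0.5)
accept = score > 0.0  # the decision threshold of 0 is also an assumption
```

Setting `alpha` to 1 or 0 recovers the pure world-model or pure cohort-model score, so the blend covers both paradigms as special cases.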


SP-10.5  

Polynomial Classifier Techniques for Speaker Verification
William M Campbell (Motorola SSG), Khaled T Assaleh (Rockwell Semiconductor Systems)

Modern speaker verification applications require high accuracy at low complexity. We propose the use of a polynomial-based classifier to achieve this objective. We demonstrate a new combination of techniques which makes polynomial classification accurate and powerful for speaker verification. We show that discriminative training of polynomial classifiers can be performed on large data sets. A prior probability compensation method is detailed which increases accuracy and normalizes the output score range. Results are given for the application of the new methods to YOHO.
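
To make the idea of a polynomial classifier concrete, the sketch below expands each feature vector into its monomials and averages a linear score over the frames of an utterance. It is an illustration under assumptions: the names `poly_expand` and `utterance_score`, the degree, and the weight vector are hypothetical, and neither the discriminative training nor the prior probability compensation mentioned in the abstract is shown.

```python
from itertools import combinations_with_replacement

def poly_expand(x, degree=2):
    """All monomials of the entries of x up to the given degree, with a leading 1."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            prod = 1.0
            for i in idx:
                prod *= x[i]
            terms.append(prod)
    return terms

def utterance_score(frames, w, degree=2):
    """Average of the inner product w . p(x) over all frames."""
    total = 0.0
    for x in frames:
        p = poly_expand(x, degree)
        total += sum(wi * pi for wi, pi in zip(w, p))
    return total / len(frames)

expanded = poly_expand([2.0, 3.0])       # terms: 1, x0, x1, x0^2, x0*x1, x1^2
w = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]       # toy weights that pick out x0^2
avg = utterance_score([[2.0, 3.0], [1.0, 1.0]], w)
```

Because the score is linear in the expanded vector, scoring cost grows with the number of monomials rather than with the size of the training set, which fits the stated goal of low complexity.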


SP-10.6  

Channel-Robust Speaker Identification using Modified-Mean Cepstral Mean Normalization with Frequency Warping
Alvin A Garcia (SpeakEZ/T-NETIX, Inc.), Richard J Mammone (CAIP Center, Rutgers University)

The performance of automatic speaker recognition systems is significantly degraded by acoustic mismatches between training and testing conditions. Such acoustic mismatches are commonly encountered in systems that operate on speech collected over telephone networks, where different handsets and different network routes impose varying convolutional distortions on the speech signal. A new algorithm, the Modified-Mean Cepstral Mean Normalization with Frequency Warping (MMCMNFW) method, which improves upon the commonly-employed Cepstral Mean Subtraction method, has been developed. Experimental results on closed-set speaker identification tasks on a channel-corrupted subset of the TIMIT database and on a subset of the NTIMIT database are presented. The new algorithm is shown to offer improved recognition rates over other existing channel normalization methods on these databases.
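
For context, the baseline that the abstract refers to, Cepstral Mean Subtraction, can be sketched in a few lines. This shows only the standard baseline, not the proposed MMCMNFW method, whose details are not given in the abstract. The function name and the toy cepstral vectors are assumptions.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over the utterance from every frame."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]

# Toy cepstral frames (hypothetical values).
clean = [[1.0, 2.0], [3.0, 6.0]]
# A fixed convolutional channel appears as a constant offset in the cepstral domain.
channel = [0.5, -1.0]
distorted = [[c + h for c, h in zip(f, channel)] for f in clean]

norm_clean = cepstral_mean_subtraction(clean)
norm_distorted = cepstral_mean_subtraction(distorted)
```

Both normalized sequences come out the same, which is why mean subtraction removes a stationary channel. It also removes the speaker's own long-term cepstral mean, a known limitation of the baseline.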


SP-10.7  

Feature Selection Using Genetics-Based Algorithm and Its Application to Speaker Identification
Mubeccel Demirekler (Electrical & Electronics Eng. Dept., Middle East Technical University), Ali Haydar (Electrical & Electronics Eng. Dept., Eastern Mediterranean University)

This paper introduces the use of a genetics-based algorithm to reduce a 24-parameter set (i.e. the base set) to a 5-, 6-, 7-, 8- or 10-parameter set for each speaker in text-independent speaker identification. Feature selection is done by finding the features that best discriminate a person from his/her two closest neighbors. The experimental results show an approximately 5% increase in the recognition rate when the reduced parameter set is used. In addition, the amount of calculation needed in the testing phase with the reduced feature set is much less than that required with the complete feature set. Hence it is more desirable to use the subset of the complete feature set found by the suggested genetic algorithm.
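
A genetic search over fixed-size feature subsets might look like the following sketch. It is not the authors' algorithm: the population size, crossover and mutation scheme, and the toy fitness function are all assumptions. In the paper the fitness would measure how well a subset separates a speaker from his/her two closest neighbors, which is not reproduced here.

```python
import random

def evolve_subset(fitness, n_features=24, k=5, pop_size=20, generations=30, seed=0):
    """Search for a k-feature subset of range(n_features) that maximizes fitness."""
    rng = random.Random(seed)
    pop = [tuple(sorted(rng.sample(range(n_features), k))) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]       # keep the fitter half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = sorted(set(a) | set(b))   # crossover: draw from both parents' features
            child = set(rng.sample(pool, k))
            if rng.random() < 0.3:           # mutation: swap one feature for another
                child.remove(rng.choice(sorted(child)))
                unused = [i for i in range(n_features) if i not in child]
                child.add(rng.choice(unused))
            children.append(tuple(sorted(child)))
        pop = parents + children
    return max(pop, key=fitness)

# Toy fitness (hypothetical): count how many "informative" features are selected.
INFORMATIVE = {0, 3, 7, 11, 19}

def toy_fitness(subset):
    return len(INFORMATIVE.intersection(subset))

best = evolve_subset(toy_fitness, n_features=24, k=5)
```

Running the search once per speaker, each with its own fitness function, would yield speaker-specific subsets, as the abstract describes.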




Last Update:  February 4, 1999         Ingo Höntsch