Estimation of the Vocal Tract Length of Vowel Sounds Based on the Frequency of the Significant Spectral Valley

TV Ananthapadmanabha and Ramakrishnan A G

Abstract:

Estimating the vocal tract length (VTL) from the acoustic signal of a vowel sound is an important problem, with applications in speaker normalization for vowel recognition, in the inversion problem, and in acoustic-phonetic studies. The common approach of using formant data to estimate VTL works for a neutral vowel approximating a uniform tube. However, for natural vowels, the formants shift considerably away from the resonant frequencies of a uniform tube. The proposed method is motivated by these observations: (a) the frequency of a spectral valley, F_v, depends inversely on VTL; (b) across vowels, F_v shifts much less from the corresponding valley frequency of a uniform tube; (c) F_v can be estimated from the spectral envelope itself. VTL has been estimated for the Peterson and Barney (33 male and 28 female speakers) and the TIMIT (326 male and 136 female speakers) databases. When the estimated F_v is used for normalization, the spread in the formant data due to gender differences is considerably reduced. The normalization procedure is vowel and speaker intrinsic. Additionally, we report applications such as Front/Back classification, gender recognition and phonetic feature mapping.
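
The formant-based approach mentioned in the abstract can be illustrated with the textbook model of a uniform tube closed at one end, whose resonances are F_n = (2n - 1) c / (4 L). The sketch below inverts that relation to get one VTL estimate per formant. It is not the valley-based method of the paper; the function names, the speed of sound (35000 cm/s) and the example formant values are assumptions chosen for illustration only.

```python
# Uniform-tube (quarter-wavelength) model: F_n = (2n - 1) * c / (4 * L).
# Illustrative sketch only; not the F_v-based estimator proposed in the paper.

SPEED_OF_SOUND_CM_S = 35000.0  # assumed value for warm, moist air


def uniform_tube_resonances(vtl_cm, n_formants=3, c=SPEED_OF_SOUND_CM_S):
    """Resonant frequencies (Hz) of a uniform tube of length vtl_cm."""
    return [(2 * n - 1) * c / (4.0 * vtl_cm) for n in range(1, n_formants + 1)]


def per_formant_vtl(formants_hz, c=SPEED_OF_SOUND_CM_S):
    """One VTL estimate (cm) per formant, obtained by inverting the model."""
    return [(2 * n - 1) * c / (4.0 * f) for n, f in enumerate(formants_hz, start=1)]


def mean_vtl(formants_hz, c=SPEED_OF_SOUND_CM_S):
    """Average of the per-formant VTL estimates (cm)."""
    estimates = per_formant_vtl(formants_hz, c)
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Neutral vowel: formants at uniform-tube positions give consistent estimates.
    neutral = uniform_tube_resonances(17.5)
    print(neutral, per_formant_vtl(neutral))

    # Hypothetical formants roughly typical of an adult male /i/:
    # the three per-formant estimates now disagree strongly.
    print(per_formant_vtl([270.0, 2290.0, 3010.0]))
```

For formants that sit at the uniform-tube positions, every per-formant estimate agrees on the same length. For a vowel such as /i/ the estimates scatter widely, which is the limitation of formant-based estimation that the abstract describes for natural vowels.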


Cite as: Ananthapadmanabha, T.V., Ramakrishnan, A.G. (2018) Estimation of the Vocal Tract Length of Vowel Sounds Based on the Frequency of the Significant Spectral Valley. Proc. Interspeech 2018, 2102-2106, DOI: 10.21437/Interspeech.2018-1105.


BibTeX Entry:

@inproceedings{Ananthapadmanabha2018,
author={T. V. Ananthapadmanabha and A. G. Ramakrishnan},
title={Estimation of the Vocal Tract Length of Vowel Sounds Based on the Frequency of the Significant Spectral Valley},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={2102--2106},
doi={10.21437/Interspeech.2018-1105},
url={http://dx.doi.org/10.21437/Interspeech.2018-1105} }