A New Frequency Coverage Metric and a New Subband Encoding Model, with an Application in Pitch Estimation
Abstract:
The auditory filterbank is a well-established tool for speech feature extraction. It decomposes the speech signal into subbands, usually on an equivalent rectangular bandwidth (ERB) frequency scale, before further subband analysis and processing such as auto-correlation and cross-correlation. However, the choice of the number of subbands and of the subband center frequencies for a given frequency range has been largely empirical in the literature. Moreover, correlation of subband signals may not produce distinct peaks for feature extraction. This paper proposes a novel frequency coverage metric to calculate the required number of subbands. It also presents a new subband encoding model for correlation processing, inspired by psychoacoustic studies and statistical analysis. The proposed frequency coverage metric and subband encoding model are applied to a pitch estimation method as an example of their possible use in speech feature extraction. Evaluation against state-of-the-art methods demonstrates the benefits of the proposed methods.
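As a rough illustration of the ERB-scale decomposition mentioned in the abstract, the sketch below spaces subband center frequencies uniformly on the ERB-rate scale, using the widely cited Glasberg and Moore approximation. The function names, the frequency range, and the number of subbands are assumptions chosen for illustration only. The sketch shows the conventional setup in which the number of subbands is picked by hand; it is not the frequency coverage metric proposed in the paper.

```python
import math


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg and Moore approximation)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate: map an ERB-rate value back to Hz."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437


def erb_center_frequencies(f_low, f_high, n_bands):
    """Return n_bands center frequencies spaced uniformly on the ERB-rate scale.

    n_bands is a hand-picked (hypothetical) value here, which is exactly the
    empirical choice the abstract describes.
    """
    if n_bands < 2:
        raise ValueError("n_bands must be at least 2")
    if not 0.0 <= f_low < f_high:
        raise ValueError("require 0 <= f_low < f_high")
    e_low = hz_to_erb_rate(f_low)
    e_high = hz_to_erb_rate(f_high)
    step = (e_high - e_low) / (n_bands - 1)
    return [erb_rate_to_hz(e_low + i * step) for i in range(n_bands)]


if __name__ == "__main__":
    # Illustrative values only: 16 subbands between 80 Hz and 4 kHz.
    for fc in erb_center_frequencies(80.0, 4000.0, 16):
        print(f"{fc:8.1f} Hz")
```

Because the spacing is uniform on the ERB-rate scale, the center frequencies are packed densely at low frequencies and spread out toward the upper end of the range.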
Cite as: Lin, S. (2018) A New Frequency Coverage Metric and a New Subband Encoding Model, with an Application in Pitch Estimation. Proc. Interspeech 2018, 2147-2151, DOI: 10.21437/Interspeech.2018-2590.
BibTeX Entry:
@inproceedings{Lin2018,
author={Shoufeng Lin},
title={A New Frequency Coverage Metric and a New Subband Encoding Model, with an Application in Pitch Estimation},
year=2018,
booktitle={Proc. Interspeech 2018},
pages={2147--2151},
doi={10.21437/Interspeech.2018-2590},
url={http://dx.doi.org/10.21437/Interspeech.2018-2590} }