Tracking the frequency components in speech and music signals

Xiaoshu Qian, University of Rhode Island

Abstract

This dissertation addressed the problem of detecting and estimating the frequency components in a time-varying signal. Specifically, we studied the problems of estimating the pitch frequency of a voiced speech signal and extracting the frequency components in a music signal.

Our pitch estimation research clarified some relationships among the existing pitch determination algorithms (PDAs). It also resulted in a few new PDAs. First, a multi-channel PDA was proposed for use with speech data recorded using a microphone array. This algorithm jointly estimates the pitch and the signal's time delay at each microphone. Second, a class of variable-frame PDAs was proposed. In addition, we presented a new maximum likelihood (ML) formulation of the pitch estimation problem. This formulation removes the singularity in the previous formulation caused by the number of unknown model parameters increasing with the unknown pitch period. Two joint pitch and spectrum estimators were derived, one for the white spectrum model and one for the autoregressive spectrum model. Finally, we generalized these two estimators so that they can use multi-channel data to jointly estimate the pitch, spectrum, and time delay of the signal.

As for the problem of estimating the frequency components in a music tone signal, we proposed a phase interpolation algorithm for use in an existing music synthesis algorithm and a new analysis-based music analysis-and-synthesis algorithm. The proposed phase interpolation algorithm uses quadratic spline functions to model the phase track. With respect to perceived sound quality, it is much better than the existing algorithm that uses a quadratic phase and is comparable to the one that uses a cubic phase. However, the latter requires more computation and storage space.

The proposed analysis-based music analysis-and-synthesis algorithm aims to improve on the analysis accuracy and synthesis efficiency of similar algorithms in the literature. The analysis accuracy is improved by using a global waveform fitting approach to estimate the model parameters, in contrast to the frame-based Fourier transform approach used in the existing algorithms. The synthesis efficiency, in both computation time and storage space, is increased by using a quadratic phase model rather than the cubic phase model used in the previous algorithms.
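
As an illustration of what a quadratic phase model means in sinusoidal synthesis, the following is a minimal sketch, not the dissertation's algorithm. It assumes a single partial whose frequency changes linearly across one frame, so the phase is a quadratic function of the sample index. The function names `quadratic_phase_track` and `synthesize_partial` and all parameter names are hypothetical and chosen only for this example.

```python
import math


def quadratic_phase_track(phi0, f_start_hz, f_end_hz, n_samples, sample_rate):
    """Return n_samples + 1 phase values (radians) across one frame.

    Assumption for this sketch: the frequency moves linearly from
    f_start_hz to f_end_hz, so the phase is quadratic in the sample index.
    The last value is the starting phase of the next frame.
    """
    w0 = 2.0 * math.pi * f_start_hz / sample_rate  # radians per sample
    w1 = 2.0 * math.pi * f_end_hz / sample_rate
    sweep = (w1 - w0) / n_samples                  # radians per sample squared
    return [phi0 + w0 * n + 0.5 * sweep * n * n for n in range(n_samples + 1)]


def synthesize_partial(amplitude, phases):
    """Turn a phase track into samples of one constant-amplitude sinusoid."""
    return [amplitude * math.cos(p) for p in phases]


# One 10 ms frame at 8 kHz in which the partial glides from 100 Hz to 200 Hz.
phases = quadratic_phase_track(0.0, 100.0, 200.0, 80, 8000.0)
samples = synthesize_partial(1.0, phases)
```

A quadratic segment has three coefficients per frame, which are fixed here by the starting phase and the two boundary frequencies. A cubic segment has four, which also lets it match a measured phase at the end of the frame, at the cost of an extra coefficient to compute and store per frame.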

Subject Area

Engineering, Electronics and Electrical

Recommended Citation

Xiaoshu Qian, "Tracking the frequency components in speech and music signals" (1996). Dissertations and Master's Theses (Campus Access). Paper AAI9723570.
http://digitalcommons.uri.edu/dissertations/AAI9723570


