Date of Award
Doctor of Philosophy (PhD)
G. Faye Boudreaux-Bartels
In this thesis, we analyze the complexity involved in the production of unvoiced speech signals using measures from nonlinear dynamics and chaos theory. Previous research successfully characterized some speech signals as chaotic. In this dissertation, however, we use multifractal measures to postulate the presence of various fractal regimes in the attractors of unvoiced speech signals. We extend prior work, which used only the correlation dimension D2 and Lyapunov exponents to analyze some speech sounds. We capture the chaotic properties of unvoiced speech signals in the embedded vector space more succinctly by estimating not only the correlation dimension D2 but also the generalized dimensions Dq. The (non-constant) generalized dimensions were estimated from phase-space vectors reconstructed from a single scalar-variable realization of unvoiced speech signals. The largest of those dimensions is an indicator of the minimum dimension required in the phase space of any realistic dynamic model of speech signals.
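The phase-space reconstruction and correlation-based estimate mentioned above can be illustrated with a minimal sketch. It assumes a time-delay embedding and a max-norm correlation sum C(r), whose log-log slope against r is the usual route to the correlation dimension D2. The function names `delay_embed` and `correlation_sum`, the parameters `dim`, `tau`, and `r`, and the choice of norm are illustrative assumptions, not details taken from the dissertation.

```python
def delay_embed(x, dim, tau):
    """Build delay vectors (x[i], x[i + tau], ..., x[i + (dim - 1) * tau]).

    x   : sequence of scalar samples (e.g. one speech waveform)
    dim : embedding dimension (assumed chosen by the user)
    tau : delay in samples (assumed chosen by the user)
    """
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim and tau")
    return [tuple(x[i + k * tau] for k in range(dim)) for i in range(n)]


def correlation_sum(vectors, r):
    """Fraction of distinct vector pairs closer than r in the max norm."""
    n = len(vectors)
    if n < 2:
        raise ValueError("need at least two vectors")
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            dist = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
            if dist < r:
                close += 1
    return close / (n * (n - 1) / 2)
```

In practice, one would evaluate `correlation_sum` over a range of `r` values and fit the slope of log C(r) against log r in the scaling region. The pairwise loop is quadratic in the number of vectors, so it suits only short illustrative series.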
Results of the generalized dimension estimation support the hypothesis that unvoiced speech signals indeed have multifractal measures. The multifractal analysis also reveals that unvoiced speech signals exhibit low-dimensional chaos as well as "soft" turbulence. This is in contrast to the opinion that unvoiced speech signals are generated from what is technically known as "hard" turbulent flow, in which the dimension of a dynamical model is very high. Unvoiced speech signals may actually be generated from "soft" turbulent flow.
In this dissertation, we explore the relationship between the estimated generalized dimension Dq and the singularity spectrum ƒ(α). Existing algorithms for accurately estimating the singularity spectrum ƒ(α) from samples of the generalized dimensions Dq of a multifractal chaotic time series use either (a) linear interpolation of the known, coarsely sampled Dq values or (b) a finely sampled Dq curve obtained at great computational or experimental expense. In conventional techniques, the derivative in the Legendre transform needed to go from Dq to ƒ(α) is also approximated using a first-order centered difference equation. Finely sampling Dq is computationally intensive, and the simple linear approximations to interpolation and differentiation give erroneous end points in the ƒ(α) curve. We propose using standard min-max filter design methods to interpolate more accurately between known samples of the Dq values and to compute the differentiation needed to evaluate the Legendre transform. We use optimum (min-max) interpolators and differentiators designed with the Parks-McClellan algorithm. We have computed the generalized dimensions and singularity spectrum of 20 unvoiced speech sounds from the ISOLET database. The results not only indicate multifractality of certain unvoiced speech sounds but may also lead to nonlinear maps that could be useful in improving the nonlinear dynamical modeling of speech sounds.
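For orientation, the Legendre transform relating Dq to ƒ(α) is commonly written with τ(q) = (q − 1)·Dq, α(q) = dτ/dq, and ƒ(α) = q·α − τ(q). The sketch below implements only the conventional centered-difference version of that step, not the min-max (Parks-McClellan) differentiator proposed in the dissertation. It substitutes a binomial multiplicative cascade, for which τ(q) is known in closed form, for speech data. The names `binomial_tau` and `legendre_spectrum` and the weight `p = 0.3` are illustrative assumptions.

```python
import math


def binomial_tau(q, p=0.3):
    """tau(q) = (q - 1) * D_q for a binomial cascade with weights p and 1 - p.

    Toy model used here only because its tau(q) is known analytically.
    """
    return -math.log2(p ** q + (1.0 - p) ** q)


def legendre_spectrum(qs, taus):
    """Return (alphas, fs) at the interior q samples.

    alpha(q) = d tau / dq is approximated by a centered difference,
    then f(alpha) = q * alpha - tau(q). Assumes qs is uniformly spaced.
    The two end samples are dropped because the centered difference
    needs a neighbour on each side.
    """
    h = qs[1] - qs[0]
    alphas, fs = [], []
    for i in range(1, len(qs) - 1):
        alpha = (taus[i + 1] - taus[i - 1]) / (2.0 * h)
        alphas.append(alpha)
        fs.append(qs[i] * alpha - taus[i])
    return alphas, fs


# Uniform grid of q values from -5 to 5 in steps of 0.1.
qs = [-5.0 + 0.1 * i for i in range(101)]
taus = [binomial_tau(q) for q in qs]
alphas, fs = legendre_spectrum(qs, taus)
```

The need to drop the end samples, together with the truncated range of q, is one simple way to see why end points of the ƒ(α) curve are difficult to recover with this conventional scheme.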
This new approach to calculating the singularity spectrum ƒ(α) reduces computation and improves accuracy. The proposed method also provides estimates of the generalized dimensions D∞ and D-∞, which are almost impossible to obtain from real data with a limited number of samples. In addition, the asymmetric spread of α values, with the corresponding ƒ(α), around the maximum of ƒ(α) reveals the inhomogeneity in the attractors of unvoiced speech signals, just as the variations in the Dq values do. The asymmetric spread of α values may also indicate that the turbulent energy fields generated during unvoiced speech production are made of non-homogeneous fractals.
Adeyemi, Olufemi A., "Multifractal Analysis of Unvoiced Speech Signals" (1997). Open Access Dissertations. Paper 468.