Multifractal analysis of unvoiced speech signals

Olufemi A Adeyemi, University of Rhode Island

Abstract

In this dissertation, we analyze the complexity involved in the production of unvoiced speech signals using measures from nonlinear dynamics and chaos theory. Previous research successfully characterized some speech signals as chaotic. Here, we use multifractal measures to postulate the presence of various fractal regimes in the attractors of unvoiced speech signals. We extend prior work, which used only the correlation dimension $D_2$ and Lyapunov exponents to analyze some speech sounds. We capture the chaotic properties of unvoiced speech signals in the embedded vector space more completely by estimating not only the correlation dimension $D_2$ but also the generalized dimensions $D_q$. The (non-constant) generalized dimensions were estimated from phase-space vectors reconstructed from a single scalar realization of each unvoiced speech signal. The largest of these dimensions indicates the minimum dimension required in the phase space of any realistic dynamical model of speech signals.

Results of the generalized dimension estimation support the hypothesis that unvoiced speech signals indeed have multifractal measures. The multifractal analysis also reveals that unvoiced speech signals exhibit low-dimensional chaos as well as "soft" turbulence. This contrasts with the view that unvoiced speech signals are generated by what is technically known as "hard" turbulent flow, in which the dimension of a dynamical model is very high. Unvoiced speech signals may actually be generated by "soft" turbulent flow.

In this dissertation, we also explore the relationship between the estimated generalized dimensions $D_q$ and the singularity spectrum $f(\alpha)$. Existing algorithms for accurately estimating the singularity spectrum $f(\alpha)$ from samples of the generalized dimensions $D_q$ of a multifractal chaotic time series use either (a) linear interpolation of the known, coarsely sampled $D_q$ values or (b) a finely sampled $D_q$ curve obtained at great computational or experimental expense. In addition, conventional techniques approximate the derivative in the Legendre transform that takes $D_q$ to $f(\alpha)$ with a first-order centered difference. Finely sampling $D_q$ is computationally intensive, and the simple linear approximations to interpolation and differentiation give erroneous end points in the $f(\alpha)$ curve. We propose using standard min-max filter design methods to interpolate more accurately between known samples of $D_q$ and to compute the derivative needed to evaluate the Legendre transform. We use optimum (min-max) interpolators and differentiators designed with the Parks-McClellan algorithm. We have computed the generalized dimensions and singularity spectrum of 20 unvoiced speech sounds from the ISOLET database. The results not only indicate multifractality of certain unvoiced speech sounds but may also lead to nonlinear maps that could be useful in improving the nonlinear dynamical modeling of speech sounds.

This new approach to calculating the singularity spectrum $f(\alpha)$ reduces computation and improves accuracy. The proposed method also provides estimates of the limiting generalized dimensions $D_{\infty}$ and $D_{-\infty}$, which are almost impossible to obtain from real data with a limited number of samples. Also, the asymmetric spread of $\alpha$ values, with the corresponding $f(\alpha)$, around the maximum of $f(\alpha)$ reveals the inhomogeneity in the attractors of unvoiced speech signals, just as the variations in the $D_q$ values do. The asymmetric spread of $\alpha$ values may also indicate that the turbulent energy fields generated during unvoiced speech production are made of non-homogeneous fractals.
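
As a rough illustration of the conventional step the abstract describes (not the Parks-McClellan method proposed in the dissertation), the sketch below applies the standard Legendre-transform relations $\tau(q) = (q-1)D_q$, $\alpha(q) = d\tau/dq$, and $f(\alpha) = q\alpha - \tau(q)$, with the derivative approximated by a centered difference on a uniform grid of $q$ values. The function names `binomial_dq` and `singularity_spectrum` are hypothetical, and the binomial multiplicative cascade with $p = 0.3$ is an assumed toy source of $D_q$ values standing in for dimensions estimated from speech data.

```python
import math


def binomial_dq(q, p=0.3):
    """Analytic D_q of a binomial multiplicative cascade.

    Assumption: this toy measure only stands in for D_q values that
    would otherwise be estimated from a reconstructed attractor.
    """
    if abs(q - 1.0) < 1e-12:
        # Information dimension D_1, the q -> 1 limit of the formula below.
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return math.log2(p ** q + (1 - p) ** q) / (1.0 - q)


def singularity_spectrum(qs, dqs):
    """Legendre transform of D_q using centered differences.

    Returns (alphas, fs) for interior grid points only: the centered
    difference needs a neighbor on each side, so the two end points
    of the q grid get no estimate.
    """
    # Mass exponent tau(q) = (q - 1) * D_q
    taus = [(q - 1.0) * d for q, d in zip(qs, dqs)]
    alphas, fs = [], []
    for i in range(1, len(qs) - 1):
        # alpha(q) = d tau / d q, approximated by a centered difference
        alpha = (taus[i + 1] - taus[i - 1]) / (qs[i + 1] - qs[i - 1])
        alphas.append(alpha)
        # f(alpha) = q * alpha - tau(q)
        fs.append(qs[i] * alpha - taus[i])
    return alphas, fs


# Coarsely sampled q grid from -10 to 10 in steps of 0.5
qs = [-10.0 + 0.5 * i for i in range(41)]
dqs = [binomial_dq(q) for q in qs]
alphas, fs = singularity_spectrum(qs, dqs)

for q, a, f in zip(qs[1:-1:4], alphas[::4], fs[::4]):
    print(f"q={q:6.1f}  alpha={a:.4f}  f(alpha)={f:.4f}")
```

In this toy case the peak of $f(\alpha)$ occurs at $q = 0$ and equals $D_0 = 1$, and $\alpha$ decreases as $q$ increases. The centered difference produces no estimate at the two ends of the $q$ grid, and the limits $D_{\infty}$ and $D_{-\infty}$ are only approached as $|q|$ grows, which is the regime where the abstract reports that simple linear approximations become unreliable.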

Subject Area

Engineering, Electronics and Electrical; Physics, Acoustics

Recommended Citation

Olufemi A Adeyemi, "Multifractal analysis of unvoiced speech signals" (1997). Dissertations and Master's Theses (Campus Access). Paper AAI9805227.
http://digitalcommons.uri.edu/dissertations/AAI9805227


