Date of Award

2015

Degree Type

Dissertation

Degree Name

Doctor of Philosophy in Electrical Engineering

Department

Electrical, Computer, and Biomedical Engineering

First Advisor

Ramdas Kumaresan

Abstract

The mammalian auditory system is a more robust and versatile sound analyzer than any artificial system developed to date. Nature found a simple yet elegant solution for the hearing mechanism. Incorporating key aspects of the functional organization of the mammalian auditory system into artificial signal-processing systems may drastically simplify problems of auditory representation and scene analysis, so that capabilities for acoustic signal separation, detection, classification, recognition and identification can be greatly improved. The objective of the thesis is to mimic the functionality of the mammalian peripheral auditory system in a digital computer by developing a synchrony capture filterbank (SCFB) algorithm. The thesis is primarily inspired by two aspects of the peripheral auditory system: (1) synchrony capture, a phenomenon observed in the auditory nerve in which the discharges in a given frequency region of the cochlea synchronize preferentially to a single dominant frequency component in that region, so that a strong dominant frequency component suppresses any weaker interfering tones; and (2) the spatial arrangement of the mammalian cochlea. The SCFB algorithm is used to track the frequency components of a speech signal and to extract the pitch, or fundamental frequency, of quasi-periodic sounds.

To emulate synchrony capture, the proposed filterbank is designed as a two-step process comprising a coarse and a fine analysis. The first stage is a broad filter, followed by a bank of three adaptively tunable narrower bandpass filters; together these resemble the basilar membrane and the three rows of outer hair cells in the inner ear. Using these adaptive filters, the filterbank attempts to emulate synchrony capture-like behavior by creating a competition among frequency components for the different channels, a competition that not only accurately reflects their relative magnitudes but is also invariant with respect to absolute signal amplitude. The bandpass filters are tuned by a voltage controlled oscillator (VCO) whose frequency is steered by a frequency discriminator loop (FDL). The resulting filterbank is used to process synthetic signals and speech, and it is shown that the VCOs can track the individual low-frequency harmonics and the strongest harmonic present in each formant region.
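The steering idea behind a frequency discriminator loop can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes the loop compares the energies of two lateral filters placed symmetrically around the current VCO frequency and nudges that frequency toward the stronger side. The names `band_energy` and `track_dominant`, the block-wise Hann-windowed single-bin DFT used as a stand-in for each narrow bandpass filter, and all parameter values are hypothetical choices made for illustration.

```python
import cmath
import math


def band_energy(block, freq_hz, fs):
    """Energy of `block` near freq_hz, via a Hann-windowed single-bin DFT.

    Assumption: this stands in for one narrow bandpass filter of the bank.
    """
    n = len(block)
    acc = 0j
    for k, x in enumerate(block):
        w = 0.5 - 0.5 * math.cos(2.0 * math.pi * k / n)  # periodic Hann window
        acc += w * x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
    return abs(acc) ** 2


def track_dominant(signal, fs, f_start, offset_hz=50.0, step_hz=5.0, block=256):
    """Steer a centre frequency toward the dominant component of `signal`.

    Hypothetical discriminator: the normalised energy difference between an
    upper and a lower lateral filter. Because the difference is divided by the
    total energy, the error does not depend on the absolute signal amplitude.
    """
    fc = f_start
    history = []
    for start in range(0, len(signal) - block + 1, block):
        seg = signal[start:start + block]
        e_lo = band_energy(seg, fc - offset_hz, fs)
        e_hi = band_energy(seg, fc + offset_hz, fs)
        total = e_lo + e_hi
        err = (e_hi - e_lo) / total if total > 0.0 else 0.0
        fc += step_hz * err  # move toward the side holding more energy
        history.append(fc)
    return history


if __name__ == "__main__":
    fs = 8000.0
    tone = [math.cos(2.0 * math.pi * 1000.0 * n / fs) for n in range(256 * 60)]
    trajectory = track_dominant(tone, fs, f_start=900.0)
    print("final centre frequency (Hz):", round(trajectory[-1], 2))
```

In this sketch the error signal is a ratio, so scaling the input leaves the tracked trajectory essentially unchanged, which is one simple way to obtain the amplitude invariance the abstract describes. The step size and filter offset interact with the block length, and values much larger than those shown can make the loop oscillate around the target instead of settling.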

Finally, these SCFB outputs are used to compute the fundamental frequency, or pitch, f0, of quasi-periodic sounds present in the signal. Currently, auto-correlation based models are widely used for pitch extraction. Although there is overwhelming neurophysiological evidence for auto-correlation-like representations of sounds in the temporal firing patterns of neurons in the auditory nerve and brainstem, how the central auditory system makes use of these representations is still not well understood. Although neuronal populations that carry out a binaural cross-correlation operation have long been identified in the auditory brainstem, no obvious analogous neural time-delay architectures for monaural auto-correlation have yet been found. This motivates the search for an alternative signal processing strategy. An approach based on the SCFB is proposed as a possible alternative to autocorrelation computation. The outputs of the SCFB are adaptively phase aligned with respect to a common time reference and added to compute a summary phase aligned function (SPAF), from which the fundamental frequency, or pitch, f0, can then be extracted. Results show that the component frequencies and f0 are faithfully tracked.
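A rough sketch of how a summed, phase-aligned function could yield f0 follows. It rests on assumptions not stated in the abstract: that the filterbank supplies a list of tracked component frequencies and amplitudes, and that phase alignment to a common time reference leaves each component as a zero-phase cosine. The functions `spaf` and `estimate_f0`, the search range, and the peak-picking rule are hypothetical and serve only to illustrate the idea.

```python
import math


def spaf(freqs_hz, amps, lags_s):
    """Sum of zero-phase cosines at the tracked frequencies.

    Assumption: after phase alignment to a common time reference, every
    component peaks at lag zero, so the sum peaks again at each common period.
    """
    return [
        sum(a * math.cos(2.0 * math.pi * f * t) for f, a in zip(freqs_hz, amps))
        for t in lags_s
    ]


def estimate_f0(freqs_hz, amps, f_min=60.0, f_max=500.0, fs=48000.0):
    """Estimate f0 as the reciprocal of the earliest major peak of the sum."""
    lo = int(fs / f_max)  # shortest candidate period, in samples
    hi = int(fs / f_min)  # longest candidate period, in samples
    lags = [n / fs for n in range(lo, hi + 1)]
    values = spaf(freqs_hz, amps, lags)
    peak = max(values)
    # Take the earliest lag within 1% of the global peak, then climb to the
    # local maximum, so that later multiples of the period are not chosen.
    i = next(k for k, v in enumerate(values) if v >= 0.99 * peak)
    while i + 1 < len(values) and values[i + 1] > values[i]:
        i += 1
    return 1.0 / lags[i]


if __name__ == "__main__":
    # Harmonics 2 to 4 of a 200 Hz fundamental; the fundamental is absent.
    print(round(estimate_f0([400.0, 600.0, 800.0], [1.0, 1.0, 1.0]), 1))
```

Under these assumptions the summed function peaks at the common period of the components even when the fundamental itself is missing, which is the behaviour expected of a pitch estimator for quasi-periodic sounds.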
