Bayesian probability approach to feature significance for infrared spectra of bacteria

The significance of a spectral feature is defined as the probability that the feature captures the structure of the data set at hand. Specifically, the significance is proportional to the variance of the feature within a particular data set: the larger the variance, the higher the probability that the feature captures the underlying structure. This approach is particularly useful when significance is used to select features that differentiate clusters of samples and to construct self-organizing maps (SOMs) of clusters. A significance spectrum is obtained by plotting significance as a function of wavenumber. After the approach to feature significance was developed, the framework was applied to the construction of SOMs for clustering infrared spectra of bacteria. The significance framework consistently chooses features that make it possible to construct maps from reduced feature sets that are at least as good as the maps constructed from full feature sets. In addition, significance reliably picks features that are consistent with biological interpretations of the spectra. © 2012 Society for Applied Spectroscopy.
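
The variance-based definition above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that significance is proportional to a feature's variance, so normalizing the variances to sum to one, the use of population variance, the function names `significance_spectrum` and `select_features`, and the toy absorbance values are all assumptions made for illustration.

```python
from statistics import pvariance


def significance_spectrum(spectra):
    """Return one significance value per feature (wavenumber).

    spectra: list of samples, each a list of absorbances on a shared
    wavenumber grid. Assumption: significance is the per-feature
    variance divided by the total variance, so the values sum to 1.
    """
    columns = list(zip(*spectra))  # one tuple of values per wavenumber
    variances = [pvariance(col) for col in columns]
    total = sum(variances)
    if total == 0:
        raise ValueError("all features are constant; significance is undefined")
    return [v / total for v in variances]


def select_features(significance, k):
    """Return the indices of the k most significant features, in grid order."""
    ranked = sorted(range(len(significance)),
                    key=lambda i: significance[i], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: 3 spectra sampled at 4 wavenumbers (cm^-1).
wavenumbers = [1000, 1100, 1200, 1300]
spectra = [
    [0.10, 0.50, 0.30, 0.20],
    [0.10, 0.70, 0.30, 0.30],
    [0.10, 0.90, 0.30, 0.40],
]

sig = significance_spectrum(spectra)
selected = select_features(sig, 2)

# Pair each wavenumber with its significance to form the significance spectrum.
for wn, s in zip(wavenumbers, sig):
    print(f"{wn} cm^-1: {s:.3f}")
print("selected wavenumbers:", [wavenumbers[i] for i in selected])
```

In this toy data the features at 1000 and 1200 cm^-1 are constant across samples and receive zero significance, so only the two varying features are retained. A reduced feature set of this kind is what would then be passed to a clustering method such as an SOM.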

Publication Title

Applied Spectroscopy