Bayesian probability approach to feature significance for infrared spectra of bacteria
Document Type
Article
Date of Original Version
2-1-2012
Abstract
The significance of a spectral feature is defined as the probability that the feature captures the structure of the data set at hand. In particular, the significance is equal to a value proportional to the variance of a feature within a particular data set. The larger the variance, the higher the probability that the feature will capture the underlying structure. This approach is particularly useful when significance is used to select features differentiating clusters of samples and for the construction of self-organizing maps (SOMs) of clusters. A significance spectrum is obtained by plotting significance as a function of wavenumber. After developing the approach for feature significance, the significance framework was applied to the construction of SOMs for clustering infrared spectra of bacteria. The significance framework consistently chooses features that make it possible to construct maps with reduced feature sets that are at least as good as the maps constructed on full feature sets. In addition, significance reliably picks features that are consistent with biological interpretations of the spectra. © 2012 Society for Applied Spectroscopy.
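The variance-based definition in the abstract can be illustrated with a minimal sketch. It assumes that the proportionality constant is chosen so that the significances of all features sum to 1; the abstract does not state the constant, so this normalization is an assumption rather than the authors' derivation. The function names `significance_spectrum` and `top_features` and the toy absorbance values are hypothetical and exist only for illustration.

```python
from statistics import pvariance


def significance_spectrum(spectra):
    """Return one significance value per feature (wavenumber column).

    Assumption (not stated in the abstract): significance is the feature's
    variance divided by the total variance over all features, so that the
    values sum to 1 and can be read as probabilities.
    """
    if not spectra:
        raise ValueError("need at least one spectrum")
    columns = list(zip(*spectra))  # transpose: one tuple per feature
    variances = [pvariance(col) for col in columns]
    total = sum(variances)
    if total == 0:
        # Every feature is constant, so no feature is preferred.
        return [1.0 / len(columns)] * len(columns)
    return [v / total for v in variances]


def top_features(significance, k):
    """Return the indices of the k most significant features, highest first."""
    order = sorted(range(len(significance)),
                   key=lambda i: significance[i], reverse=True)
    return order[:k]


# Toy data: 4 samples (rows) x 3 wavenumbers (columns), made-up absorbances.
spectra = [
    [0.10, 0.50, 0.30],
    [0.10, 0.90, 0.35],
    [0.10, 0.10, 0.25],
    [0.10, 0.50, 0.30],
]
sig = significance_spectrum(spectra)  # significance as a function of feature index
best = top_features(sig, 2)           # reduced feature set for a later clustering step
```

Plotting `sig` against the wavenumber axis would give a significance spectrum in the sense described above, and the indices in `best` stand in for the reduced feature set on which a map could then be trained. The constant first column receives zero significance because it carries no variance.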
Publication Title
Applied Spectroscopy
Volume
66
Issue
1
Citation/Publisher Attribution
Hamel, Lutz, and Chris W. Brown. "Bayesian probability approach to feature significance for infrared spectra of bacteria." Applied Spectroscopy 66, 1 (2012): 48-59. doi: 10.1366/10-06155.