Unsupervised learning in spectral genome analysis
Document Type
Conference Proceeding
Date of Original Version
12-1-2007
Abstract
The tree representation as a model for organismal evolution has been in use since before Darwin. However, with the recent unprecedented access to biomolecular data, it has been discovered that, especially in the microbial world, the individual genes making up the genome of an organism give rise to different and sometimes conflicting evolutionary tree topologies. This discovery calls into question the notion of a single evolutionary tree for an organism and gives rise to the notion of an evolutionary consensus tree, based on the evolutionary patterns of the majority of genes in a genome, embedded in a network of gene histories. Here we discuss an approach to the analysis of genomic data from multiple genomes using bipartition spectral analysis and unsupervised learning. An interesting observation is that genes whose evolutionary tree topologies are in significant conflict with the evolutionary consensus tree of an organism point to possible horizontal gene transfer events, which often delineate significant evolutionary events. © 2007 IEEE.
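The idea of comparing each gene's tree against a majority-based consensus can be illustrated with a small sketch. This is not the method of the paper, whose details the abstract does not give. It assumes that each gene tree has already been reduced to its set of non-trivial bipartitions (splits of the taxa into two groups), and it substitutes a plain majority-rule count and a standard split-compatibility test for the spectral analysis and unsupervised learning steps. All names (`normalize`, `compatible`, `majority_splits`, `conflicting_genes`) and the toy data are hypothetical.

```python
from collections import Counter


def normalize(side, taxa):
    """Canonical form of a split: the side that does not contain the smallest taxon."""
    side = frozenset(side)
    return frozenset(taxa) - side if min(taxa) in side else side


def compatible(a, b, taxa):
    """Two splits can share a tree iff one of the four side intersections is empty."""
    taxa = frozenset(taxa)
    return any(not (x & y) for x in (a, taxa - a) for y in (b, taxa - b))


def majority_splits(gene_splits, threshold=0.5):
    """Splits supported by more than `threshold` of the genes (majority-rule consensus)."""
    counts = Counter(s for splits in gene_splits.values() for s in set(splits))
    return {s for s, c in counts.items() if c / len(gene_splits) > threshold}


def conflicting_genes(gene_splits, taxa, threshold=0.5):
    """Genes with at least one split incompatible with a consensus split."""
    consensus = majority_splits(gene_splits, threshold)
    return sorted(
        gene
        for gene, splits in gene_splits.items()
        if any(not compatible(s, c, taxa) for s in splits for c in consensus)
    )


# Toy data (assumed): four taxa, one non-trivial split per gene tree.
taxa = {"A", "B", "C", "D"}
gene_splits = {
    "g1": {normalize({"A", "B"}, taxa)},  # AB|CD
    "g2": {normalize({"A", "B"}, taxa)},  # AB|CD
    "g3": {normalize({"C", "D"}, taxa)},  # AB|CD, given from the other side
    "g4": {normalize({"A", "C"}, taxa)},  # AC|BD, conflicts with the majority
}

flagged = conflicting_genes(gene_splits, taxa)
print(flagged)
```

In this toy input, three of the four genes support the split AB|CD, so it enters the consensus, and `g4` is flagged because its split AC|BD cannot coexist with AB|CD in a single tree. A flagged gene is only a candidate for horizontal gene transfer; conflict can also arise from weak phylogenetic signal or reconstruction error.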
Publication Title
Proceedings of the Frontiers in the Convergence of Bioscience and Information Technologies, FBIT 2007
Citation/Publisher Attribution
Hamel, Lutz, Neha Nahar, Maria S. Poptsova, Olga Zhaxybayeva, and J. P. Gogarten. "Unsupervised learning in spectral genome analysis." Proceedings of the Frontiers in the Convergence of Bioscience and Information Technologies, FBIT 2007 (2007): 317-321. doi: 10.1109/FBIT.2007.81.