Toward protein structure analysis with self-organizing maps

Document Type

Conference Proceeding

Date of Original Version

2005

Establishing structure-function relationships on the proteomic scale is a unique challenge faced by bioinformatics and the molecular biosciences. Large protein families represent natural libraries of analogues of a given catalytic or protein function, which makes them ideal targets for investigating structure-function relationships in proteins. To this end, we have developed a new technique for analyzing large amounts of detailed molecular structure information, focusing on the functional centers of homologous proteins. Our approach uses unsupervised machine learning, in particular self-organizing maps. The information captured by a self-organizing map and stored in its reference models highlights the essential structure of the proteins under investigation and can be used effectively to study detailed structural differences and similarities among homologous proteins. Preliminary results obtained with a prototype based on these techniques demonstrate that we can classify proteins, identify common and unique structures within a family and, more importantly, identify common and unique structural features of different conformations of the same protein. The approach developed here outperforms many of today's structure analysis tools. These tools are usually limited either by the number of proteins they can process at the same time or by the structural resolution they can accommodate; that is, many of the structural analysis tools that can handle multiple proteins at once restrict themselves to secondary structure analysis and therefore miss fine structural nuances within proteins. It is worth noting that the ability of our approach to analyze different conformations of the same protein is beyond the capabilities of multiple residue sequence alignment techniques. © 2005 IEEE.
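
To illustrate the general idea of a self-organizing map and its reference models, the following is a minimal sketch and not the prototype described in the abstract. It assumes each protein's functional center has already been encoded as a fixed-length numeric feature vector; that encoding, the function names (`train_som`, `best_matching_unit`), the grid size, and the learning-rate and neighborhood schedules are all illustrative assumptions.

```python
import math
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(models, x):
    """Return the grid position (row, col) whose reference model is closest to x."""
    return min(models, key=lambda node: squared_distance(models[node], x))


def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): reference_model}.

    `data` is a list of equal-length feature vectors (hypothetical encoding
    of each structure). Parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every reference model as a copy of a randomly chosen sample.
    models = {(r, c): list(rng.choice(data))
              for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            br, bc = best_matching_unit(models, x)
            for (r, c), w in models.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    # Pull the reference model toward the sample.
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return models


if __name__ == "__main__":
    # Synthetic stand-in data: two loose groups of 3-component vectors.
    rng = random.Random(1)
    group_a = [[rng.gauss(0.0, 0.3) for _ in range(3)] for _ in range(10)]
    group_b = [[rng.gauss(5.0, 0.3) for _ in range(3)] for _ in range(10)]
    som = train_som(group_a + group_b, rows=3, cols=3)
    for label, sample in (("a", group_a[0]), ("b", group_b[0])):
        print(label, best_matching_unit(som, sample))
```

After training, samples that land on the same or neighboring map nodes are structurally similar under the chosen encoding, and the reference model stored at a node summarizes the samples mapped to it. How structures are turned into feature vectors is the critical design choice, and it is left open here.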

Publication Title

Proceedings of the 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB '05