Bioinformatics Software Design Projects

2016-01-A3DSRINP-CSC-STA-CMB-522-BPS-542

Raymond Pulver, URI
Neal Buxton, URI
Xiaodong Wang, URI
John Lucci, URI
Jean Yves Hervé, University of Rhode Island
Lenore Martin, University of Rhode Island

Other Title

2016-01-TechnicalReport-Advanced 3D Structural Reconstruction of Independent Nanoparticle Images Obtained from Cryogenic Transmission Electron Microscopy

Document Type

Report

Comments

Technical Documents are zip files containing final deliverables from the URI project-oriented graduate course:

CMB-CSC-STA-522-BPS-542 Bioinformatics I, started in 2000 by Joan Peckham (Computer Science and Statistics) and Lenore M. Martin (Cell and Molecular Biology).

The projects consist of software engineering design and partial implementation for biology research.

Date of Original Version

5-12-2016

Abstract

Cholesterol is transported through the bloodstream by lipoproteins. There are two main types of lipoproteins: low-density lipoprotein (LDL) and high-density lipoprotein (HDL). LDL cholesterol is considered “bad” cholesterol because it can form plaque and hard deposits that clog arteries and make them less flexible. A heart attack or stroke can occur if a hard deposit blocks a narrowed artery. HDL cholesterol helps to carry LDL away from the arteries and back to the liver.

Traditionally, particle counts of LDL and HDL have played an important role in understanding and predicting heart disease risk. Recent research, however, suggests that geometric parameters of lipoprotein macromolecules, such as size and shape, might correlate better with cardiovascular risk than particle counts and serum density alone. Whether the particle is LDL or HDL, particles at the large end of the size spectrum are correlated with better health, while smaller particles are tied to worse health. The smallest particles in the overall LDL size spectrum are considered an important cause of arterial plaque initiation, which can eventually lead to heart attack and stroke. Although HDL is conventionally considered “good” cholesterol, high particle counts of smaller HDL particles have been linked to a 15-fold increase in the risk of heart disease.

These studies all conclude that more factors are at play than particle size alone: shape and other aspects of geometric distribution may be even more important, and more innovative methods are needed to characterize the geometric distribution of LDL and HDL particles.

Cryogenic transmission electron microscopy (CryoTEM) is the technique most frequently used to evaluate the ultrastructure of LDL and HDL particles. It can be used to evaluate the internal structure of the particles and shows them in their native environment. The structure of a 3D solid object is often derived from its surface properties, an approach that is applicable here because of the monolayer membrane of the LDL particle and its cholesterol ester interior.

Silhouette-based shape determination, as credited to Baumgart, requires estimating the visual hull of the object: the maximum volume consistent with the object's silhouettes as seen from a series of angles. The silhouette from each image plane is back-projected as a solid visual cone, and the cones are intersected. The comparison of points across two images is the basis of stereo vision. These contour control vertices of similarity can be used to analyze concave regions captured in the projection contour.
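The visual hull idea can be illustrated with a small voxel-carving sketch. It is not taken from the report: it assumes three axis-aligned orthographic views, so each back-projected "cone" is really a prism, and the names `silhouette`, `visual_hull`, and the toy object are hypothetical. It also shows why the hull is only an upper bound on the shape: a dent that is hidden in every silhouette is filled in.

```python
from itertools import product

def silhouette(volume, axis):
    """Orthographic silhouette: a pixel is set if any voxel along `axis` is filled."""
    n = len(volume)
    sil = [[False] * n for _ in range(n)]
    for x, y, z in product(range(n), repeat=3):
        if volume[x][y][z]:
            u, v = [c for i, c in enumerate((x, y, z)) if i != axis]
            sil[u][v] = True
    return sil

def visual_hull(silhouettes, n):
    """Keep every voxel whose projection lies inside all silhouettes.

    `silhouettes` maps a viewing axis (0, 1 or 2) to its n x n mask.
    """
    hull = [[[True] * n for _ in range(n)] for _ in range(n)]
    for x, y, z in product(range(n), repeat=3):
        for axis, sil in silhouettes.items():
            u, v = [c for i, c in enumerate((x, y, z)) if i != axis]
            if not sil[u][v]:
                hull[x][y][z] = False  # carved away by this view
                break
    return hull

# Toy object (assumption): a 3x3x3 cube in a 5x5x5 grid with one dented voxel.
n = 5
obj = [[[1 <= x <= 3 and 1 <= y <= 3 and 1 <= z <= 3 for z in range(n)]
        for y in range(n)] for x in range(n)]
obj[2][2][3] = False  # concavity in the centre of one face

sils = {axis: silhouette(obj, axis) for axis in range(3)}
hull = visual_hull(sils, n)

object_voxels = sum(v for plane in obj for row in plane for v in row)
hull_voxels = sum(v for plane in hull for row in plane for v in row)
```

In this toy case the hull contains one more voxel than the object, because the dent does not change any of the three silhouettes.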

A major computational method for reconstructing “solid” objects uses multiple images that form a series of 2D “slices” of the target. By adjusting the depth of field and the focal plane of each slice, volumetric rendering can be performed to construct a 3D model from the edges of the 2D slices.

In contrast, electron tomography reconstructs 3D structures from projections taken at regular tilt intervals. The central section of the Fourier transform (FT) is a thin slice of the 2D FT of the orthographic projection image. If the FT is extended toward infinity in length, the central slice in real space collapses to a single line. These slices (the lines) can be used to determine the alignment of the images by finding the angles at which the slices match. With appropriate filtering to limit the oversampled low-frequency contribution, the slices can also be used to reconstruct the base image by creating a new FT from the central slices based on the alignment angles. With many thousands of such projections, a suitable reconstruction can be created.
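The relationship between projections and central slices can be checked numerically in its 2D analogue: the 1D FT of a projection of a 2D image equals the central line of the image's 2D FT (in 3D, the 2D FT of a projection image is a central plane of the object's 3D FT). The sketch below is only an illustration with a naive discrete FT and a made-up 4 x 4 image; `dft1`, `dft2`, and `image` are hypothetical names, not part of the project's code.

```python
import cmath

def dft1(signal):
    """Naive 1D discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
            for k in range(n)]

def dft2(image):
    """Naive 2D DFT of image[x][y], returned as F[kx][ky]."""
    nx, ny = len(image), len(image[0])
    over_y = [dft1(row) for row in image]                    # over_y[x][ky]
    over_x = [dft1([over_y[x][ky] for x in range(nx)])       # over_x[ky][kx]
              for ky in range(ny)]
    return [[over_x[ky][kx] for ky in range(ny)] for kx in range(nx)]

# Made-up density image (assumption), indexed as image[x][y].
image = [[0, 1, 2, 0],
         [1, 4, 3, 1],
         [0, 2, 5, 1],
         [0, 0, 1, 0]]

# Orthographic projection along y: sum the density over y for each x.
projection = [sum(row) for row in image]

ft_of_projection = dft1(projection)
central_slice = [row[0] for row in dft2(image)]  # the ky = 0 line of the 2D FT

max_error = max(abs(a - b) for a, b in zip(ft_of_projection, central_slice))
```

`max_error` is at the level of floating-point round-off, which is the discrete form of the central slice property for this projection direction.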

However, the problem of reconstruction is different when one has only a single orthographic projection image from which to derive a 3D model, as opposed to having many projections at different angles. The Fourier slice approach cannot be applied when there is only a single image for reference.

In this research, we explore another option for deriving a 3D reconstruction of a particle from a single projection image: using a hidden Markov model (HMM) to find the most likely 3D structure that could have produced the observed image. A hidden Markov model describes a discrete finite automaton in which the next state of the system depends only on the current state, and in which each state emits an observable output. The states that cause the outputs cannot themselves be observed.
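As a minimal sketch of this definition, the code below samples from a small HMM: the next state is drawn using only the current state, and each state emits one output. The two states, two output symbols, and all probabilities are made-up illustration values, and `sample_hmm` is a hypothetical helper, not code from the project.

```python
import random

# Made-up model (assumption): 2 hidden states, 2 observable output symbols.
initial = [0.6, 0.4]                       # P(first state)
transition = [[0.7, 0.3],                  # P(next state | current state)
              [0.2, 0.8]]
emission = [[0.9, 0.1],                    # P(output | state)
            [0.3, 0.7]]

def sample_hmm(length, seed=0):
    """Draw a hidden state sequence and the outputs it emits."""
    rng = random.Random(seed)
    n_states, n_outputs = len(initial), len(emission[0])
    hidden, observed = [], []
    state = rng.choices(range(n_states), weights=initial)[0]
    for _ in range(length):
        hidden.append(state)
        # The output depends only on the current hidden state.
        observed.append(rng.choices(range(n_outputs), weights=emission[state])[0])
        # The next state depends only on the current hidden state.
        state = rng.choices(range(n_states), weights=transition[state])[0]
    return hidden, observed

hidden, observed = sample_hmm(10)
```

An analyst would see only `observed`; recovering `hidden` from it is the inference problem the HMM algorithms address.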

Given a matrix of the probabilities that the system transitions from one state to another, a matrix of the probabilities that each state emits each output, and a vector of the probabilities of the initial state of the system, we can answer questions about the system such as: 1. What is the most likely sequence of states that generated the observed outputs? 2. What is the most likely state at a given position in the sequence? Both problems can be solved efficiently with dynamic programming: the Viterbi algorithm solves problem 1, and the forward-backward algorithm solves problem 2. In addition, the model itself can be refined by adjusting the state transition matrix and the state-to-observation matrix based on new data, a process called “training,” which can be performed efficiently using the Baum-Welch algorithm.
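The two inference problems can be sketched as follows. This is a plain textbook formulation with made-up probabilities, not the project's implementation; `viterbi`, `forward_backward`, and the example model are hypothetical. Probabilities are multiplied directly, which is fine for short sequences, whereas long sequences would need log-probabilities or scaling to avoid underflow.

```python
def viterbi(obs, initial, transition, emission):
    """Problem 1: most likely state sequence and its joint probability."""
    n = len(initial)
    best = [[initial[s] * emission[s][obs[0]] for s in range(n)]]
    back = []
    for o in obs[1:]:
        prev, row, ptr = best[-1], [], []
        for s in range(n):
            p = max(range(n), key=lambda r: prev[r] * transition[r][s])
            ptr.append(p)                      # best predecessor of state s
            row.append(prev[p] * transition[p][s] * emission[s][o])
        best.append(row)
        back.append(ptr)
    state = max(range(n), key=lambda s: best[-1][s])
    path = [state]
    for ptr in reversed(back):                 # follow back-pointers
        state = ptr[state]
        path.append(state)
    return path[::-1], max(best[-1])

def forward_backward(obs, initial, transition, emission):
    """Problem 2: posterior probability of each state at each position."""
    n = len(initial)
    fwd = [[initial[s] * emission[s][obs[0]] for s in range(n)]]
    for o in obs[1:]:
        prev = fwd[-1]
        fwd.append([sum(prev[r] * transition[r][s] for r in range(n)) * emission[s][o]
                    for s in range(n)])
    bwd = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = bwd[0]
        bwd.insert(0, [sum(transition[s][r] * emission[r][o] * nxt[r] for r in range(n))
                       for s in range(n)])
    likelihood = sum(fwd[-1])
    return [[f * b / likelihood for f, b in zip(fr, br)] for fr, br in zip(fwd, bwd)]

# Made-up model (assumption): 2 states, 3 output symbols.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
obs = [0, 1, 2]

path, path_prob = viterbi(obs, initial, transition, emission)
posteriors = forward_backward(obs, initial, transition, emission)
```

For this toy model the Viterbi path is `[0, 0, 1]`, and each row of `posteriors` sums to 1, giving the per-position state probabilities.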

In this research, we consider each voxel of a 3D volume, representing an LDL particle in this case, to be in one of three states:

1) Membrane

2) Lipid

3) Buffer

We consider a state transition to be the step from one 3D voxel to the next in one direction. An observation is the density value of the orthographic projection image. We calculate the state transition and state-to-observation matrices from a 3D model representing the particle, train the matrices with projection images generated from the model, and use the resulting statistics to compute the most likely series of voxels in the genuine 3D representation of the particle, using the methods described above.
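One plausible way to obtain a state transition matrix from a labeled 3D model is to count label pairs between consecutive voxels along one direction and normalize each row. The sketch below assumes this counting approach; the abstract does not describe the actual estimation procedure, and the state-to-observation matrix is left out because its construction is not specified there. The tiny `volume` and the function `transition_matrix` are hypothetical.

```python
MEMBRANE, LIPID, BUFFER = 0, 1, 2

def transition_matrix(volume, n_states=3):
    """Estimate P(next voxel state | current voxel state) along the z direction.

    `volume[x][y]` is the list of voxel labels along z for that (x, y) column.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for plane in volume:
        for column in plane:
            # Each pair of neighbouring voxels along z is one observed transition.
            for current, following in zip(column, column[1:]):
                counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Made-up labeled model (assumption): two columns through a particle,
# buffer outside, membrane shell, lipid core.
M, L, B = MEMBRANE, LIPID, BUFFER
volume = [[[B, M, L, L, M, B],
           [B, B, M, M, B, B]]]

A = transition_matrix(volume)
```

Each row of `A` is the estimated distribution of the next voxel's state given the current one, in the order membrane, lipid, buffer. In this toy model a lipid voxel is never followed directly by buffer, which reflects the membrane shell around the core.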

The HMM 3D reconstruction method used in this project is applicable not only to lipoproteins but to any particle. Applying its self-learning property to the protein structure database will facilitate 3D structure analysis and allow for greater precision in image reconstruction.

Link to Source Code

https://github.com/martinlenore/A3DSRINP/

A3DSRINP-FinalReport.pdf (214 kB)
Written software design documentation

This document is currently not available here.
