Protein Structure Elastic Network Models and the Rank 3 Positive Semidefinite Matrix Manifold

Li, Xiao-Bo

Files

Li_Xiao-Bo.pdf (6.38 MB)

Date

2019-01-09

Authors

Li, Xiao-Bo

Advisor

Burkowski, Forbes

Publisher

University of Waterloo

Abstract

This thesis contributes to the study of protein dynamics using elastic network models (ENMs). An ENM is an abstraction of a protein structure in which inter-atomic interactions are modelled by a Hookean potential energy that is a function of inter-atomic distances. This model has been studied by various authors and, despite its simplicity, can provide a realistic understanding of protein dynamics. For example, Tirion showed that the Hookean potential energy can reproduce the normal mode fluctuations of the more complicated semi-empirical potential. In addition, Tekpinar and Zheng showed that an ENM can correctly model the order in which local conformational changes precede global conformational changes during ATP-driven conformational changes.

The purpose of this thesis is to provide a second mathematical formulation for modelling ENMs. It suggests removing the square root in the Hookean potential, which leads to a positive semidefinite (PSD) potential that is a function of quadrances rather than distances. There are many similarities between the two approaches, but also many differences. One main difference is that PSD matrices are linearly related to quadrance, the square of distance, which opens the way to modelling the PSD potential using perceptrons whose weight matrix is a rank 3 PSD matrix. This interesting consequence is left as a topic of future research.

The PSD potential is just as appropriate for modelling ENMs as the Hookean potential, as shown by the following two agreements:

The PSD potential produces normal mode fluctuations that agree with those of the Hookean potential introduced by Tirion. This agreement suggests that both potentials provide the same information about a protein structure's flexibility.
The generalization of the Hookean iENM potential (introduced by Tekpinar and Zheng) to the PSD iENM potential also interpolates the local conformational changes before the global conformational changes, in agreement with the original Hookean observations.

Recall that the equations of motion in classical mechanics are formulated on an abstract Riemannian manifold. This abstraction gives modellers the flexibility to choose a Riemannian manifold appropriate to the problem. Even after the introduction of the Hookean potential, the study of protein dynamics has continued to use 3n-dimensional Euclidean space as the Riemannian manifold, the same manifold used by the semi-empirical potential. This is because both the semi-empirical potential and the Hookean potential assume that the atomic coordinates of a protein structure are represented by a 3n by 1 vector. With the PSD potential, however, the protein structure's atomic coordinates are represented as a point on the manifold of rank 3 n by n PSD matrices. Consequently, a new Riemannian manifold for modelling protein dynamics is proposed.

To model protein dynamics on the rank 3 PSD matrix manifold, the equations of motion need to be defined. This thesis presents the geometric objects (horizontal projection, gradient, Hessian, and retraction) required to formulate the equations of motion for protein structures as an optimization problem on the rank 3 PSD matrix manifold. These formulas modify the original formulas introduced by Journée et al. so that constraints relevant to a protein structure can be described. Rosen's correction to the constraint manifold was introduced in 1961 and was reintroduced by Goldenthal et al. in 2007 under the name "the fast projection algorithm". The works of Rosen, Goldenthal et al., and Journée et al. are all closely related but were developed independently. This thesis makes their relationship more apparent.
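The relationship between coordinates, quadrances, and a rank 3 PSD matrix can be illustrated with a small sketch. It is not taken from the thesis. It assumes the PSD matrix is the Gram matrix G = Y Yᵀ of an n by 3 coordinate array Y (which is PSD with rank at most 3). It also assumes that "removing the square root" turns the Tirion-style energy ½·k·Σ(d_ij − d⁰_ij)² into ½·k·Σ(q_ij − q⁰_ij)², where q_ij is the quadrance. The function names, the uniform spring constant `k`, and the four-atom toy coordinates are all hypothetical.

```python
import math

def gram_matrix(coords):
    """Gram matrix G = Y Y^T of an n x 3 coordinate list Y (PSD, rank <= 3)."""
    return [[sum(a * b for a, b in zip(p, q)) for q in coords] for p in coords]

def quadrance_from_gram(G, i, j):
    """Squared distance between atoms i and j; linear in the entries of G."""
    return G[i][i] + G[j][j] - 2.0 * G[i][j]

def hookean_energy(coords, ref_coords, pairs, k=1.0):
    """Tirion-style energy on distances: 0.5 * k * sum (d_ij - d0_ij)^2."""
    return 0.5 * k * sum(
        (math.dist(coords[i], coords[j])
         - math.dist(ref_coords[i], ref_coords[j])) ** 2
        for i, j in pairs
    )

def quadrance_energy(G, G_ref, pairs, k=1.0):
    """Assumed square-root-free analogue: 0.5 * k * sum (q_ij - q0_ij)^2."""
    return 0.5 * k * sum(
        (quadrance_from_gram(G, i, j) - quadrance_from_gram(G_ref, i, j)) ** 2
        for i, j in pairs
    )

# Hypothetical toy structure: four "atoms", every pair joined by a spring.
ref = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
moved = [(0, 0, 0), (2, 0, 0), (0, 1, 0), (0, 0, 1)]  # atom 1 displaced along x
pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]

G_ref, G_moved = gram_matrix(ref), gram_matrix(moved)
e_hooke = hookean_energy(moved, ref, pairs)
e_quad = quadrance_energy(G_moved, G_ref, pairs)

# Rotating the structure about the origin leaves the Gram matrix unchanged.
rotated = [(-y, x, z) for x, y, z in moved]  # 90 degree rotation about z
G_rotated = gram_matrix(rotated)

print(e_hooke, e_quad, G_rotated == G_moved)
```

Both energies vanish at the reference structure and are positive for the displaced one, although their numerical values differ. Because each quadrance is a linear function of the entries of G, the assumed quadrance energy is a quadratic function of the matrix itself, while the distance-based energy needs a square root for every pair.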