dc.description.abstract | Protein crystals populate diverse conformational ensembles. Despite much evidence that
there is widespread conformational polymorphism in protein side chains, most of the x-ray
crystallography data are modelled by single conformations in the Protein Data Bank.
The ability to extract or to predict these conformational polymorphisms is of crucial importance,
as it facilitates deeper understanding of protein dynamics and functionality.
This dissertation describes a computational strategy capable of predicting side-chain polymorphisms.
The applied approach extends a particular class of algorithms for side-chain
prediction by modelling the side-chain dihedral angles more appropriately as continuous
rather than discrete variables. Employing a new inferential technique known as particle
belief propagation (PBP), we predict residue-specific distributions that encode information
about side-chain polymorphisms. The predicted polymorphisms are in relatively close
agreement with results from a state-of-the-art approach based on x-ray crystallography
data. This approach characterizes the conformational polymorphisms of side chains using
electron density information, and has successfully discovered previously unmodelled
conformations.
Furthermore, it is known that coupled fluctuations and concerted motions of residues
can reveal pathways of communication used for information propagation in a molecule
and hence can help in understanding the "allostery" phenomenon in proteins. In order
to characterize the coupled motions, most existing methods infer structural dependencies
among a protein's residues. However, recent studies have highlighted the role of coupled
side-chain fluctuations alone in the allosteric behaviour of proteins, in contrast to a
common belief that the backbone motions play the main role in allostery. These studies
and the aforementioned recent discoveries about prevalent alternate side-chain conformations
(conformational polymorphism) accentuate the need to devise new computational
approaches that acknowledge side chains' roles. As well, these approaches must consider
the polymorphic nature of the side chains, and incorporate effects of this phenomenon
(polymorphism) in the study of information transmission and functional interactions of
residues in a molecule. Such frameworks can provide a more accurate understanding of the
allosteric behaviour.
Hence, as a topic related to the conformational polymorphism, this dissertation addresses
the problem of inferring directly coupled side chains, as well. First, we present a
novel approach to generate an ensemble of conformations and an efficient computational
method to extract direct couplings of side chains in allosteric proteins. These direct couplings
are used to provide sparse network representations of the coupled side chains. The
framework is based on a fairly new statistical method, named graphical lasso (GLASSO),
devised for sparse graph estimation. In the proposed GLASSO-based framework, the side-chain
conformational polymorphism is taken into account. It is shown that by studying
the intrinsic dynamics of an inactive structure alone, we are able to construct a network of
functionally crucial residues. Second, we show that the proposed method is capable of providing
a magnified view of the coupled and conformationally polymorphic side chains. This
model reveals couplings between the alternate conformations of a coupled residue pair. To
the best of our knowledge, this is the first computational method for extracting networks
of side chains' alternate conformations. Such networks help in providing a detailed image
of side-chain dynamics in functionally important and conformationally polymorphic sites,
such as binding and/or allosteric sites. This information may assist in new drug-design
alternatives.
Side-chain conformations are commonly represented by multivariate angular variables.
However, the GLASSO and other existing methods that can be applied to the aforementioned
inference task are not capable of handling multivariate angular data. This dissertation
further proposes a novel method to infer direct couplings from this type of data, and
shows that this method is useful for identifying functional regions and their interactions in
allosteric proteins. The proposed framework is a novel extension of canonical correlation
analysis (CCA), which we call "kernelized partial CCA" (or simply KPCCA). Using the
conformational information and fluctuations of the inactive structure alone for allosteric
proteins in the Ras and other Ras-like families, the KPCCA method identified allosterically
important residues not only as strongly coupled ones but also in densely connected
regions of the interaction graph formed by the inferred couplings. The results were in good
agreement with other empirical findings and outperformed those obtained by the GLASSO-based framework. By studying distinct members of the Ras, Rho, and Rab sub-families,
we show further that KPCCA is capable of inferring common allosteric characteristics in
the small G protein super-family. | en |