Kernel Methods in Computer-Aided Constructive Drug Design
Date
2009-05-14T15:55:35Z
Authors
Wong, William Wai Lun
Publisher
University of Waterloo
Abstract
A drug is typically a small molecule that interacts with the binding site of some
target protein. Drug design involves the optimization of this interaction so that the
drug effectively binds with the target protein while not binding with other proteins
(an event that could produce dangerous side effects). Computational drug design
involves the geometric modeling of drug molecules, with the goal of generating
similar molecules that will be more effective drug candidates. Such algorithms must
measure molecular similarity by comparing molecular descriptors that may involve
dozens to hundreds of attributes. We use kernel-based methods to define these
measures of similarity. Kernels are general functions that can be used to formulate
similarity comparisons.
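As a rough illustration of a kernel acting as a similarity measure, the sketch below evaluates a Gaussian (RBF) kernel on a few made-up descriptor vectors. The choice of the RBF kernel, the value of gamma, and the descriptor values are assumptions made for this example only; they are not taken from the thesis.

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def gram_matrix(X, gamma=0.5):
    """Pairwise RBF similarities between all rows of X."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Squared distances via ||a||^2 + ||b||^2 - 2 a.b, clipped at zero
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

# Hypothetical 3-attribute descriptors for three molecules (illustrative only)
descriptors = np.array([
    [1.0, 0.0, 2.0],
    [1.0, 0.5, 2.0],   # close to the first molecule
    [4.0, 3.0, 0.0],   # far from the first molecule
])

K = gram_matrix(descriptors)
print(K)
```

Entries near 1 indicate descriptors that are close in the input space, and entries near 0 indicate dissimilar ones. The resulting matrix is symmetric with ones on the diagonal.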
The overall goal of this thesis is to develop effective and efficient computational
methods that rely on transparent mathematical descriptors of molecules, with
applications to affinity prediction, detection of multiple binding modes, and generation
of new drug leads. While in this thesis we derive computational strategies for
the discovery of new drug leads, our approach differs from the traditional ligand-based
approach. We have developed novel procedures to calculate inverse mappings
and subsequently recover the structure of a potential drug lead. The contributions
of this thesis are the following:
1. We propose a vector space model molecular descriptor (VSMMD) that is
suitable for kernel studies in QSAR modeling. Our experiments have
provided convincing comparative empirical evidence that our descriptor
formulation, in conjunction with kernel-based regression algorithms, can
provide sufficient discrimination to predict various biological
activities of a molecule with reasonable accuracy.
2. We present a new component selection algorithm, KACS (Kernel Alignment
Component Selection), based on kernel alignment for QSAR studies. Kernel
alignment has been developed as a measure of similarity between two kernel
functions. In our algorithm, we refine kernel alignment as an evaluation tool,
using recursive component elimination to eventually select the most important
components for classification. We have demonstrated empirically and proven
theoretically that our algorithm works well for finding the most important
components in different QSAR data sets.
3. We extend the VSMMD in conjunction with a kernel-based clustering algorithm
to the prediction of multiple binding modes, a challenging area of
research that has been previously studied by means of time-consuming docking
simulations. The results reported in this study provide strong empirical
evidence that our strategy has enough resolving power to distinguish multiple
binding modes through the use of a standard k-means algorithm.
4. We develop a set of reverse engineering strategies for QSAR modeling based
on our VSMMD. These strategies include:
(a) The use of a kernel feature space algorithm to design or modify descriptor
image points in a feature space.
(b) The deployment of a pre-image algorithm to map the newly defined
descriptor image points in the feature space back to the input space of
the descriptors.
(c) The design of a probabilistic strategy to convert new descriptors to meaningful
chemical graph templates.
The most important aspect of these contributions is the presentation of strategies
that actually generate the structure of a new drug candidate. While the training
set is still used to generate a new image point in the feature space, the reverse engineering
strategies just described allow us to develop a new drug candidate that is
independent of issues related to probability distribution constraints placed on test
set molecules.
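Kernel alignment, mentioned in contribution 2, is commonly defined as the normalized Frobenius inner product of two kernel matrices. The sketch below computes that standard quantity and uses it in a simple backward elimination loop over descriptor components. It is not the KACS algorithm from the thesis, whose refinements are not described in this abstract. The linear kernel, the target kernel built from class labels, the helper names, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def alignment(K1, K2):
    """Standard kernel alignment: <K1, K2>_F / sqrt(<K1, K1>_F * <K2, K2>_F)."""
    num = np.sum(K1 * K2)
    den = np.sqrt(np.sum(K1 * K1) * np.sum(K2 * K2))
    return float(num / den)

def linear_kernel(X):
    """Gram matrix of a linear kernel over the rows of X."""
    return X @ X.T

def backward_eliminate(X, y, keep):
    """Hypothetical helper: repeatedly drop the component whose removal
    leaves the highest alignment with the label kernel y y^T."""
    target = np.outer(y, y)
    active = list(range(X.shape[1]))
    while len(active) > keep:
        scores = []
        for j in active:
            rest = [c for c in active if c != j]
            scores.append((alignment(linear_kernel(X[:, rest]), target), j))
        _, drop = max(scores)
        active.remove(drop)
    return active

# Toy data: component 0 matches the labels, components 1 and 2 carry no label signal
y = np.array([1.0, 1.0, -1.0, -1.0])
X = np.array([
    [ 1.0,  1.0, 1.0],
    [ 1.0, -1.0, 1.0],
    [-1.0,  1.0, 1.0],
    [-1.0, -1.0, 1.0],
])

selected = backward_eliminate(X, y, keep=1)
print(selected)
```

On this toy data the loop retains only the component that reproduces the labels, since removing it drives the alignment with the label kernel to zero.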
Keywords
drug design, kernel methods, QSAR, inverse-QSAR, feature selection