
Probabilistic Graphical Models and Algorithms for

Date

2008-05-26T16:24:08Z

Authors

Jiao, Feng

Publisher

University of Waterloo

Abstract

In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein structure prediction and protein structure alignment. In the machine learning work, I focus on a special kind of graphical model: conditional random fields (CRFs). I present a new semi-supervised training procedure for CRFs that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Such learning algorithms can be applied to protein and gene named entity recognition problems. This work provides one of the first semi-supervised discriminative training methods for structured classification. In the computational biology work, I focus mainly on protein problems. I first propose a tree decomposition method for solving the protein structure prediction and protein structure alignment problems, and in so doing I show why tree decomposition is a good method for many protein problems. Then I propose a computational framework for detecting structures similar to that of a target protein from sparse NMR data, which can help to predict protein structure using experimental data. Finally, I propose a new machine learning approach, LS_Boost, to solve the protein fold recognition problem, which is one of the key steps in protein structure prediction. In a thorough comparison, the algorithm is shown to be both more accurate and more efficient than the traditional z-score method and other machine learning methods.

Keywords

machine learning, computational biology