dc.contributor.author: Wu, Ying
dc.date.accessioned: 2016-07-21 17:49:16 (GMT)
dc.date.available: 2016-07-21 17:49:16 (GMT)
dc.date.issued: 2016-07-21
dc.date.submitted: 2016
dc.identifier.uri: http://hdl.handle.net/10012/10592
dc.description.abstract: This thesis is concerned with statistical modeling and prediction of disease processes subject to intermittent observation. Times of disease progression are interval-censored when progression status is only known at a series of assessment times. This situation arises routinely in clinical trials and cohort studies when events of interest are only detectable upon imaging, based on blood tests, or upon careful clinical examination. The work that follows is motivated by the study of demographic, genetic and clinical data available from the University of Toronto Psoriasis Registry and the University of Toronto Psoriatic Arthritis Registry, each involving cohorts of several hundred patients with the respective diseases. Chapter 2 deals with the problem of selecting important prognostic biomarkers from a large set of candidate biomarkers when the status with respect to an event of interest (e.g. disease progression) is only known at irregularly spaced and individual-specific assessment times. Penalized regression techniques (e.g. LASSO, adaptive LASSO and SCAD) are adapted to deal with the interval-censored event times arising from this observation scheme. An expectation-maximization algorithm is developed which is demonstrated to perform well in extensive simulation studies involving independent and correlated continuous and binary covariates. Application to the motivating study of the development of arthritis mutilans in patients with psoriatic arthritis is given and several important human leukocyte antigen (HLA) variables are identified for further investigation. Extensions of this algorithm are developed for settings in which data from different sources with distinct disease-related entry conditions are to be synthesized. The extended Turnbull-type expectation-maximization algorithm is based on a complete data likelihood which incorporates missing information from individuals not meeting the entry criteria of the respective registries. 
Simulation studies demonstrate good empirical performance and an application to the motivating study identifies HLA markers associated with the onset of psoriatic arthritis among individuals with psoriasis. This analysis is carried out using data from a psoriasis registry in which the times to psoriatic arthritis are left-truncated, and a psoriatic arthritis registry in which the onset times are right-truncated. Chapter 3 deals with the challenge of assessing the accuracy of a predictive model when response times are interval-censored. Inverse probability weighted (IPW) and augmented inverse probability weighted (AIPW) estimators of predictive accuracy are developed and evaluated based on the mean prediction error and the area under the receiver operating characteristic curve. The weights are estimated from a multistate model which jointly considers the event process, the inspection process, and the right-censoring processes. We investigate the performance of the proposed methods by simulation and illustrate their application in the context of a motivating rheumatology study in which HLA markers are used for predicting disease progression in psoriatic arthritis. A two-phase model is developed in Chapter 4 for chronic diseases which feature an indolent phase followed by a phase with more active disease resulting in progression and damage. The time-scales for the intensity functions for the active phase are more naturally based on the time since the start of the active phase, corresponding to a semi-Markov formulation. In cohort studies for which the disease status is only known at a series of clinical assessment times, transition times are interval-censored, which means the time origin for phase II is interval-censored. Weakly parametric models with piecewise constant baseline hazard and rate functions are specified and an expectation-maximization algorithm is described for model fitting. 
A computationally faster two-stage estimation procedure is also developed and the asymptotic variances of the resulting estimators are derived. Simulation studies show that the proposed model performs well under both maximum likelihood and two-stage estimation. An application to data from the motivating study of disease progression in psoriatic arthritis illustrates the procedure and identifies new human leukocyte antigens associated with the duration of the indolent phase, and others associated with disease progression in the active phase. Open problems and topics for ongoing and future research are discussed in Chapter 5. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.title: Modeling and Prediction of Disease Processes Subject to Intermittent Observation [en]
dc.type: Doctoral Thesis [en]
dc.pending: false
uws-etd.degree.department: Statistics and Actuarial Science [en]
uws-etd.degree.discipline: Statistics (Biostatistics) [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Doctor of Philosophy [en]
uws.contributor.advisor: Cook, Richard
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]



UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
