Measurement Error and Misclassification in Interval-Censored Life History Data
White, Bethany Joy Giddings
In practice, data are frequently incomplete in one way or another. It can be a significant challenge to make valid inferences about the parameters of interest in this situation. In this thesis, three problems involving such data are addressed. The first two problems involve interval-censored life history data with mismeasured covariates. Data of this type are incomplete in two ways. First, the exact event times are unknown due to censoring. Second, the true covariate is missing for most, if not all, individuals. This work focuses primarily on the impact of covariate measurement error in progressive multi-state models with data arising from panel (i.e., interval-censored) observation. Problems of this type arise frequently in clinical settings (e.g., when disease progression is of interest and patient information is collected during irregularly spaced clinic visits). Two- and three-state models are considered in this thesis. This work is motivated by a research program on psoriatic arthritis (PsA) where the effects of error-prone covariates on rates of disease progression are of interest and patient information is collected at clinic visits (Gladman et al. 1995; Bond et al. 2006). Information regarding the error distributions was available from a separate study conducted to evaluate the reliability of clinical measurements that are used in PsA treatment and follow-up (Gladman et al. 2004). The asymptotic bias of covariate effect estimates obtained by ignoring error in covariates is investigated and shown to be substantial in some settings. In a series of simulation studies, corrected likelihood methods and methods based on a simulation-extrapolation (SIMEX) algorithm (Cook & Stefanski 1994) were investigated as ways to address covariate measurement error. The methods implemented were shown to yield much smaller empirical biases, and empirical coverage probabilities closer to the nominal levels.
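As a rough illustration of the simulation-extrapolation idea only, the sketch below applies SIMEX to a simple linear regression slope rather than to the multi-state models studied in the thesis. The function names, the simulated data, the assumption that the error standard deviation `sigma_u` is known, and the choice of a quadratic extrapolant through three noise levels are all assumptions made for this example.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simex_slope(w, y, sigma_u, n_rep=50, rng=None):
    """SIMEX-corrected slope (illustrative sketch, hypothetical helper).

    Simulation step: add extra error with variance zeta * sigma_u**2 to the
    mismeasured covariate w and average the naive slope over n_rep replicates.
    Extrapolation step: pass a quadratic through zeta = 0, 1, 2 and evaluate
    it at zeta = -1, which corresponds to no measurement error.
    """
    rng = rng or random.Random(0)

    def mean_slope(zeta):
        if zeta == 0:
            return ols_slope(w, y)
        sd = (zeta ** 0.5) * sigma_u
        total = 0.0
        for _ in range(n_rep):
            w_star = [wi + rng.gauss(0.0, sd) for wi in w]
            total += ols_slope(w_star, y)
        return total / n_rep

    f0, f1, f2 = mean_slope(0), mean_slope(1), mean_slope(2)
    # Lagrange form of the quadratic through (0, f0), (1, f1), (2, f2) at -1.
    return 3 * f0 - 3 * f1 + f2


# Simulated example: true slope 1, classical additive error in the covariate.
rng = random.Random(1)
n, beta, sigma_u = 2000, 1.0, 0.5
x = [rng.gauss(0.0, 1.0) for _ in range(n)]          # true covariate (unobserved)
w = [xi + rng.gauss(0.0, sigma_u) for xi in x]        # error-prone measurement
y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x]     # response

naive = ols_slope(w, y)
corrected = simex_slope(w, y, sigma_u, rng=rng)
print("naive slope:", round(naive, 3), "SIMEX slope:", round(corrected, 3))
```

In this toy setting the naive slope is attenuated toward zero, and the extrapolated value moves back toward the true slope. A quadratic extrapolant generally reduces the bias without removing it entirely.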
The third problem considered involves an extreme case of interval censoring known as current status data. Current status data arise when individuals are observed only at a single point in time and it is then determined whether they have experienced the event of interest. To complicate matters, in the problem considered here, an unknown proportion of the population will never experience the event of interest. Again, data of this type are incomplete in two ways. First, because only one assessment is made on each individual to determine whether or not an event has occurred, the exact event times are unknown for those who will eventually experience the event. Second, whether or not an individual will ever experience the event is unknown for those who have not experienced it by the assessment time. This problem was motivated by a series of orthopedic trials examining the effect of blood thinners in hip and knee replacement surgeries. These blood thinners can cause a negative serological response in some patients. This response was the outcome of interest, and the only available information regarding it was the seroconversion time under current status observation. In this thesis, latent class models with parametric, nonparametric, and piecewise constant forms of the seroconversion time distribution are described. They account for the fact that only a proportion of the population will experience the event of interest. Estimators based on an EM algorithm were evaluated via simulation, and the orthopedic surgery data were analyzed using this methodology.
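To make the latent class structure concrete, the following is a minimal sketch of an EM algorithm for current status data in which only a proportion `p` of the population is susceptible. It assumes the simplest parametric case, an exponential event time with rate `lam` among susceptible individuals, and is not the thesis's implementation. The function name, starting values, and simulated data are hypothetical.

```python
import math
import random


def em_cure_current_status(c, delta, p=0.5, lam=0.5, n_iter=500):
    """EM for a two-class (susceptible / never-event) current status model.

    c[i] is the single assessment time; delta[i] is 1 if the event had
    occurred by c[i] and 0 otherwise. Assumes T ~ Exponential(lam) among
    susceptible individuals (an assumption made for this sketch).
    Returns (p, lam, history), where history holds the observed-data
    log-likelihood at the start of each iteration.
    """
    n = len(c)
    history = []
    for _ in range(n_iter):
        sum_z = 0.0    # expected number of susceptible individuals
        sum_zt = 0.0   # expected total event time among susceptibles
        loglik = 0.0
        for ci, di in zip(c, delta):
            surv = math.exp(-lam * ci)
            if di == 1:
                # Event seen: certainly susceptible, with T <= ci.
                z = 1.0
                t = 1.0 / lam - ci * surv / (1.0 - surv)   # E[T | T <= ci]
                loglik += math.log(p * (1.0 - surv))
            else:
                # No event yet: never-event class, or susceptible with T > ci.
                denom = (1.0 - p) + p * surv
                z = p * surv / denom                       # P(susceptible | no event)
                t = ci + 1.0 / lam                         # E[T | T > ci]
                loglik += math.log(denom)
            sum_z += z
            sum_zt += z * t
        history.append(loglik)
        # M-step: closed-form updates for the complete-data likelihood.
        p = sum_z / n
        lam = sum_z / sum_zt
    return p, lam, history


# Simulated example: 60% susceptible, exponential rate 1 among susceptibles.
rng = random.Random(2)
true_p, true_lam, n = 0.6, 1.0, 4000
c, delta = [], []
for _ in range(n):
    ci = rng.uniform(0.1, 5.0)
    susceptible = rng.random() < true_p
    ti = rng.expovariate(true_lam) if susceptible else math.inf
    c.append(ci)
    delta.append(1 if ti <= ci else 0)

p_hat, lam_hat, history = em_cure_current_status(c, delta)
print("p_hat:", round(p_hat, 3), "lam_hat:", round(lam_hat, 3))
```

The E-step treats both class membership and the unobserved event time as missing data, which is what gives the M-step its closed form under the exponential assumption. With a nonparametric or piecewise constant event time distribution, the M-step would take a different form.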