Statistical Methods on Survival Data with Measurement Error
In survival data analysis, covariates are often subject to measurement error. A naive analysis that ignores measurement error commonly leads to biased estimation of the parameters of survival models. Measurement error also reduces efficiency in detecting possible associations between risk factors and time to event. Furthermore, it complicates model building and model checking, because the presence of measurement error frequently masks the true underlying patterns in the data. Although a large body of literature on handling error-prone survival data has accumulated since the paper by Prentice (1982), many important issues remain unexplored. This thesis focuses on several of these issues in survival analysis with covariate measurement error.
One problem that has received little attention is misspecification of measurement error models. In this thesis, we investigate this problem, paying particular attention to error-contaminated survival data under the Cox model. Specifically, we conduct a bias analysis that offers a way to unify many existing methods for survival data with measurement error, and we study the impact of misspecifying the error model. A simple expression is obtained to quantify the bias of "working" estimators derived under misspecified error models. Based on this expression, consistent estimators under general error models are derived. Furthermore, we study hypothesis testing when both model misspecification and measurement error are present.
A second problem of interest is the validity of survival model assumptions when measurement error is involved. A large number of methods have been developed in the literature to correct for measurement error effects, and these methods generally assume that the survival model is the Cox model. When the Cox model or the error model assumptions fail to hold, existing methods break down. In this thesis, we address the issue of checking the Cox model assumptions in the presence of measurement error. We propose valid goodness-of-fit tests for survival data with covariate measurement error. This research is an important addition to the literature on survival data with measurement error.
Our third topic concerns survival data analysis under additive hazards models with covariate measurement error. The additive hazards model is a useful and important alternative to the Cox model. However, this model has been studied relatively little in situations where covariates are measured with error. In this thesis, we make important contributions to this topic. Specifically, we explore the asymptotic bias induced by ignoring measurement error. A number of inference methods are developed to correct for error effects. The validity of the proposed methods is justified both theoretically and empirically. We also investigate issues of model checking and model misspecification.
In many studies, the collected data include a large number of variables, many of which are unimportant in explaining the survival of an individual. An important task is thus to identify the relevant risk factors that are linked to the hazards of subjects. Although there is work on variable selection for survival data analysis, the available methods typically require that all variables be precisely measured. This requirement is, however, often infeasible. More challenging still, in some studies the dimension of the risk factors can be quite large, or even much larger than the number of subjects. Our fourth topic concerns estimation and variable selection for survival data with high-dimensional mismeasured covariates. We propose corrected penalized methods. Our methods can adjust for measurement error effects and perform estimation and variable selection simultaneously. Our research on this topic closes multiple gaps among the areas of survival analysis, measurement error, and variable selection.
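The bias of a naive analysis, noted at the start of this abstract, can be illustrated with a small simulation. The sketch below is not one of the methods developed in the thesis. It assumes a Cox model with a single standard normal covariate, a unit baseline hazard, independent exponential censoring, and a classical additive error model W = X + e. All function names, parameter values, and the hand-written Newton-Raphson fit are hypothetical choices made for illustration only.

```python
import numpy as np


def fit_cox_1d(time, event, z, n_iter=50):
    """Maximize the Cox partial likelihood for one covariate (assumes no tied times)."""
    order = np.argsort(-time)              # descending time: risk set of i is order[:i+1]
    z = z[order]
    event = event[order].astype(bool)
    beta = 0.0
    for _ in range(n_iter):
        w = np.exp(beta * z)
        s0 = np.cumsum(w)                  # sum of exp(beta*z) over each risk set
        s1 = np.cumsum(w * z)
        s2 = np.cumsum(w * z * z)
        zbar = s1 / s0                     # risk-set weighted mean of z
        score = np.sum((z - zbar)[event])
        info = np.sum((s2 / s0 - zbar ** 2)[event])
        step = np.clip(score / info, -1.0, 1.0)   # damped Newton step
        beta += step
        if abs(step) < 1e-8:
            break
    return beta


def simulate(n=2000, beta=1.0, error_sd=1.0, seed=0):
    """Simulate survival data with a true covariate x and an error-prone surrogate w."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                           # true covariate (assumed N(0, 1))
    w = x + rng.normal(scale=error_sd, size=n)       # classical additive error (assumed)
    t = rng.exponential(size=n) / np.exp(beta * x)   # hazard exp(beta*x), baseline hazard 1
    c = rng.exponential(scale=2.0, size=n)           # independent censoring (assumed)
    return np.minimum(t, c), t <= c, x, w


time, event, x, w = simulate()
beta_full = fit_cox_1d(time, event, x)    # fit using the true covariate
beta_naive = fit_cox_1d(time, event, w)   # naive fit using the surrogate
print(f"true covariate: {beta_full:.3f}, naive surrogate: {beta_naive:.3f}")
```

Under these assumptions, the estimate based on the error-prone surrogate should be pulled toward zero relative to the estimate based on the true covariate. Raising `error_sd` makes the attenuation more pronounced.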