Matrix-Variate Regression with Measurement Error
Abstract
Matrix-variate regression models are useful for modeling data with a matrix structure, such as brain imaging data. However, these models do not apply directly to data subject to measurement error or misclassification. Although mismeasurement is often unavoidable in data collection, little research has addressed matrix-variate regression with mismeasured data. In this thesis, we explore several important problems concerning matrix-variate regression with error-contaminated data.
In Chapter 1, we give a brief introduction to matrix-variate data and review relevant topics, including logistic regression analysis, measurement error and misclassification mechanisms, regularization methods, and Bayesian inference procedures.
In Chapter 2, we discuss matrix-variate logistic regression for handling error-contaminated data. Measurement error in covariates has been extensively studied in many conventional regression settings where covariate information is typically expressed in vector form. However, there has been little work on error-prone matrix-variate data, which commonly arise in studies involving imaging or spatio-temporal structures. We focus in particular on matrix-variate logistic measurement error models. We examine the biases induced by the naive analysis, which ignores measurement error, and develop two correction methods to adjust for measurement error effects. The proposed methods are justified both theoretically and empirically. We apply them to a data set from a study examining electroencephalography (EEG) correlates of genetic predisposition to alcoholism.
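As a rough illustration of what a matrix-variate logistic model can look like, the sketch below uses a bilinear linear predictor, logit P(Y = 1 | X) = gamma + alpha^T X beta, which is one common formulation in the literature. The abstract does not state the exact model used in the thesis, so this form, and the names `alpha`, `beta`, `gamma`, and the function names, are assumptions made for illustration only.

```python
import math


def bilinear_predictor(alpha, X, beta, gamma=0.0):
    """Return gamma + alpha^T X beta for a p x q matrix covariate X.

    alpha has length p (row effects) and beta has length q (column effects).
    This bilinear form is an assumed model, not taken from the thesis.
    """
    total = gamma
    for i, a in enumerate(alpha):
        for j, b in enumerate(beta):
            total += a * X[i][j] * b
    return total


def success_prob(alpha, X, beta, gamma=0.0):
    """Logistic success probability P(Y = 1 | X) under the assumed model."""
    eta = bilinear_predictor(alpha, X, beta, gamma)
    return 1.0 / (1.0 + math.exp(-eta))
```

Under this form, a p x q covariate needs only p + q + 1 parameters rather than pq + 1. If X is replaced by an error-contaminated surrogate, the linear predictor is evaluated at the wrong value, which is the source of the bias in a naive analysis.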
In Chapter 3, we consider a problem complementary to that in Chapter 2. Instead of examining mismeasurement in covariates, we study mismeasurement in binary responses. In particular, we investigate the effects of response misclassification on the matrix-variate logistic regression model. Matrix-variate logistic regression is useful for modeling the relationship between a binary response and matrix-variate covariates, which commonly arise in medical imaging research. However, inference under such a model is compromised by response misclassification, so it is imperative to account for misclassification effects when using matrix-variate logistic regression with such data. In this chapter, we develop two inferential methods that account for these effects. The first is an imputation method, which replaces the response variable in the estimation procedure with an unbiased pseudo-response computed from the observed data. The second is derived from the likelihood function for the observed surrogate response. Our development covers two settings, in which the misclassification rates are either known or estimated from validation data. The proposed methods are justified both theoretically and empirically. We apply them to the Wisconsin prognostic breast cancer data.
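The two ideas in Chapter 3 can be sketched with a standard construction for a misclassified binary response. Here `gamma01` = P(Y* = 1 | Y = 0) and `gamma10` = P(Y* = 0 | Y = 1) denote known misclassification rates, and Y* is the observed surrogate; these symbols and function names are assumptions for illustration, and the thesis's own definitions may differ.

```python
def pseudo_response(y_star, gamma01, gamma10):
    """Pseudo-response built from the observed surrogate label y_star.

    Defined as (y_star - gamma01) / (1 - gamma01 - gamma10), so that its
    conditional mean given the true response Y equals Y.
    """
    denom = 1.0 - gamma01 - gamma10
    if denom <= 0:
        raise ValueError("requires gamma01 + gamma10 < 1")
    return (y_star - gamma01) / denom


def observed_success_prob(pi, gamma01, gamma10):
    """P(Y* = 1 | X) when the true success probability P(Y = 1 | X) is pi.

    This is the probability that enters a likelihood for the surrogate Y*.
    """
    return gamma01 + (1.0 - gamma01 - gamma10) * pi


def expected_pseudo_response(y, gamma01, gamma10):
    """E[pseudo-response | Y = y], by enumerating both values of Y*."""
    p_obs_one = (1.0 - gamma10) if y == 1 else gamma01
    return (p_obs_one * pseudo_response(1, gamma01, gamma10)
            + (1.0 - p_obs_one) * pseudo_response(0, gamma01, gamma10))
```

The helper `expected_pseudo_response` is only a check on unbiasedness: it returns 1 when the true response is 1 and 0 when it is 0, up to floating-point error, for any rates with `gamma01 + gamma10 < 1`.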
Chapter 4 continues and extends Chapter 3. We consider regularized matrix-variate logistic regression with response misclassification, where the matrix-variate data may have a sparse structure. With a limited sample size, the presence of a large number of redundant parameters makes estimation difficult. In this chapter, we develop inferential methods that account for misclassification effects while incorporating penalty functions to handle the sparsity of matrix-variate data. We examine the biases induced by the naive analysis, which ignores response misclassification. Our development covers two settings, in which the misclassification rates are either known or estimated from validation data. The proposed methods are justified both theoretically and empirically. We apply them to the Wisconsin prognostic breast cancer data.
In Chapter 5, we shift our attention to the Bayesian framework and apply Bayesian analysis to matrix-variate logistic regression. We propose a Bayesian algorithm that estimates the matrix-variate parameters element-wise, in combination with a horseshoe shrinkage prior. We investigate how ignoring response misclassification affects parameter estimation and propose an algorithm that accommodates the effects of response misclassification. The performance of the proposed method is evaluated through numerical studies. We apply the method to the Lee Silverman Voice Treatment (LSVT) Companion data.
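To give a sense of how a horseshoe prior acts element-wise on a coefficient matrix, the sketch below draws each entry from a normal distribution whose scale is the product of a global scale `tau` and a local half-Cauchy scale. This is the textbook form of the horseshoe prior, not the thesis's estimation algorithm, which the abstract does not describe; the function names and the fixed `tau` are assumptions.

```python
import math
import random


def half_cauchy(rng, scale=1.0):
    """Draw from a half-Cauchy(0, scale) distribution via its inverse CDF."""
    u = rng.random()  # uniform on [0, 1)
    return scale * math.tan(math.pi * u / 2.0)


def sample_horseshoe(n_rows, n_cols, tau=1.0, seed=0):
    """Draw an n_rows x n_cols coefficient matrix from a horseshoe prior.

    Each entry is N(0, (lam * tau)^2) with its own local scale
    lam ~ half-Cauchy(0, 1); tau is a global scale, held fixed here.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_rows):
        row = []
        for _ in range(n_cols):
            lam = half_cauchy(rng)
            row.append(rng.gauss(0.0, lam * tau))
        draws.append(row)
    return draws
```

Because the half-Cauchy has heavy tails and substantial mass near zero, most sampled entries are shrunk close to zero while a few can be large, which suits sparse matrix-variate coefficients.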
Finally, Chapter 6 summarizes the thesis and outlines directions for future work.
Cite this version of the work
Junhan Fang (2020). Matrix-Variate Regression with Measurement Error. UWSpace. http://hdl.handle.net/10012/16391