Information Matrices in Estimating Function Approach: Tests for Model Misspecification and Model Selection

Zhou, Qian

Files

Qian_Zhou_thesis.pdf (1.11 MB)

Date

2009-08-26T20:41:50Z

Authors

Zhou, Qian

Publisher

University of Waterloo

Abstract

Estimating functions are widely used for parameter estimation in statistical problems. Regular estimating functions produce parameter estimators with desirable properties such as consistency and asymptotic normality. In quasi-likelihood inference, an important example of the estimating function approach, correct specification of the first two moments of the underlying distribution leads to information unbiasedness: the two forms of the information matrix, namely the negative sensitivity matrix (the negative expectation of the first-order derivative of an estimating function) and the variability matrix (the variance of an estimating function), are equal. In other words, the analogue of the Fisher information is equivalent to the Godambe information. Consequently, information unbiasedness implies that the model-based covariance matrix estimator and the sandwich covariance matrix estimator are equivalent. By comparing the model-based and sandwich variance estimators, we propose information ratio (IR) statistics for testing misspecification of the variance/covariance structure under a correctly specified mean structure, in the context of linear regression models, generalized linear regression models, and generalized estimating equations. Asymptotic properties of the IR statistics are discussed. In addition, through intensive simulation studies, we show that the IR statistics are powerful in various applications: testing for heteroscedasticity in linear regression models, testing for overdispersion in count data, and testing for a misspecified variance function and/or a misspecified working correlation structure. Moreover, the IR statistics appear more powerful than the classical information matrix test proposed by White (1982). Model selection criteria have been discussed intensively in the literature, but almost all of them target the choice of the optimal mean structure.
In this thesis, two model selection procedures are proposed for selecting the optimal variance/covariance structure among a collection of candidate structures. One is based on a sequence of IR tests applied to all the competing variance/covariance structures. The other is based on an "information discrepancy criterion" (IDC), which measures the discrepancy between the negative sensitivity matrix and the variability matrix. In fact, the IDC characterizes the relative efficiency loss incurred by using a given candidate variance/covariance structure instead of the true but unknown structure. Simulation studies and analyses of two data sets show that both proposed model selection methods detect the true/optimal variance/covariance structure at a high rate. In particular, because the IDC magnifies the differences among the competing structures, it is highly sensitive in detecting the most appropriate variance/covariance structure.
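
The comparison of model-based and sandwich covariance estimators described in the abstract can be illustrated with a minimal sketch for ordinary least squares. The abstract does not give the exact form of the IR statistic, so the trace ratio below is an assumption chosen for illustration and is not necessarily the statistic defined in the thesis. The function name `information_ratio` and the simulated data are likewise hypothetical. Under a constant error variance the ratio should be close to 1, and it moves away from 1 when the variance depends on the covariate.

```python
import numpy as np


def information_ratio(X, y):
    """Illustrative trace ratio of sandwich to model-based OLS covariance.

    Assumption: this trace form is one natural way to compare the two
    estimators; it is not taken from the thesis.
    """
    n, p = X.shape
    XtX = X.T @ X
    beta = np.linalg.solve(XtX, X.T @ y)       # OLS estimate
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)           # constant-variance estimate

    XtX_inv = np.linalg.inv(XtX)
    model_cov = sigma2 * XtX_inv               # model-based covariance

    # "Meat" of the sandwich: sum_i resid_i^2 * x_i x_i^T
    meat = (X * resid[:, None] ** 2).T @ X
    sandwich_cov = XtX_inv @ meat @ XtX_inv    # sandwich covariance

    # tr(model_cov^{-1} sandwich_cov) / p
    return np.trace(np.linalg.solve(model_cov, sandwich_cov)) / p


# Hypothetical simulated data: same mean structure, two error structures.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
mean = 1.0 + 2.0 * x

y_homo = mean + rng.normal(size=n)                 # constant variance
y_hetero = mean + np.abs(x) * rng.normal(size=n)   # variance grows with |x|

ir_homo = information_ratio(X, y_homo)
ir_hetero = information_ratio(X, y_hetero)
print("IR, homoscedastic errors:  ", ir_homo)
print("IR, heteroscedastic errors:", ir_hetero)
```

This sketch only computes the ratio. A formal test would also need the asymptotic distribution of the statistic, which the thesis discusses but the abstract does not state.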