Statistical inference for heterogeneous event history data

Ng, Edmund Tze-Man

Files

nq22223.pdf (8.74 MB)

Date

1997

Authors

Ng, Edmund Tze-Man

Publisher

University of Waterloo

Abstract

There has recently been considerable interest in the development of statistical methodology for the analysis of event history data. Most of the existing methods are directed to single-event time data or to transitional data based on Markov and semi-Markov assumptions. In many longitudinal studies, however, extensive subject-to-subject variability is present. Although the literature on statistical methods for the analysis of heterogeneous failure time data is vast, there remains a need to further investigate a number of issues pertaining to frailty models for failure time and more general event history data. The goal of this thesis is to develop and investigate statistical methods for modeling heterogeneous event history data. Specifically, we focus on three areas: (i) tests of homogeneity; (ii) estimation with multiplicative random effects for intensity models; and (iii) marginal models based on cumulative mean functions for point processes. A strategy used throughout this thesis is to adopt piecewise constant baseline functions as a compromise between standard parametric and semi-parametric models. Score tests are often used to test for homogeneity. We provide empirical evidence that score tests tend to perform poorly in the context of point processes with small to moderately large samples. Adjustments for the bias of the score statistics, induced by the substitution of parameter estimates, are derived for Poisson processes with parametric and semi-parametric specifications. Simulation studies suggest that the adjustment to the score tests leads to much better performance in small samples. The tests based on piecewise constant intensities prove to be particularly attractive in terms of the type I error rate. Methods of parameter estimation for mixed point processes are investigated by simulation, based on Gauss-Hermite integration and the EM algorithm for log-normal and non-parametric random effects distributions, respectively. 
Mixed Poisson and mixed renewal processes are considered. We find that the parameters of the intensity function can be estimated with negligible bias and with quite efficient variance estimates by these methods, regardless of the true underlying mixing distribution. However, the estimate of the dispersion parameter tends to be positively biased under Gauss-Hermite integration when the true mixing distribution is highly discrete. In contrast, the EM-based variance estimates for the estimated masses and mass points are inflated if the true dispersion parameter is large. These methods of estimation are also investigated in the context of mixed two-state processes. Models which accommodate multiple time scales are also examined. Finally, when interest lies in relating the number of events of a point process to covariates, an alternative approach based on mean functions and estimating functions may be employed. We develop and investigate such a model in the context of bivariate point processes. The model formulation only requires correct specification of the mean functions, and full probabilistic specification of the processes is avoided. An optimal criterion is proposed for the estimating function of the mean function parameters. Estimating functions arising from mixed bivariate Poisson processes are introduced as a working model when the covariance structure is unknown. Simulation studies indicate that the mixed Poisson estimating function performs satisfactorily. Data from a recently completed asthma clinical trial are used to illustrate this approach. Although only univariate and bivariate processes are studied here, the methods developed provide insight and lay the groundwork for methodology directed at the analysis of higher dimensional processes.
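
As a rough illustration of the piecewise constant baseline strategy mentioned in the abstract, the sketch below fits a piecewise constant intensity to recurrent event data. It assumes a homogeneous Poisson process within each interval, in which case the maximum likelihood estimate of each rate is the number of events in the interval divided by the total time at risk in it. The function name `piecewise_rate_mle`, the input layout, and the toy data are hypothetical and are not taken from the thesis; covariates and random effects are left out.

```python
from bisect import bisect_left


def piecewise_rate_mle(cuts, subjects):
    """Estimate a piecewise constant event rate (illustrative sketch).

    cuts     : increasing right endpoints, e.g. [1.0, 2.0, 3.0] defines the
               intervals (0, 1], (1, 2], (2, 3].
    subjects : list of (followup_time, [event_times]) pairs.

    Returns one rate per interval: events observed / time at risk.
    """
    k = len(cuts)
    edges = [0.0] + list(cuts)
    events = [0] * k
    exposure = [0.0] * k

    for followup, times in subjects:
        # Time at risk that this subject contributes to each interval.
        for j in range(k):
            exposure[j] += max(0.0, min(followup, edges[j + 1]) - edges[j])
        # Count events that fall within follow-up and within the grid.
        for t in times:
            if 0.0 < t <= min(followup, edges[-1]):
                events[bisect_left(cuts, t)] += 1

    # An interval with no time at risk has no defined estimate.
    return [e / x if x > 0 else float("nan") for e, x in zip(events, exposure)]


# Toy data (hypothetical): three subjects with different follow-up times.
toy_subjects = [
    (3.0, [0.5, 1.5, 2.5]),
    (2.0, [0.2, 0.4]),
    (1.0, []),
]
rates = piecewise_rate_mle([1.0, 2.0, 3.0], toy_subjects)
print(rates)
```

Finer grids move such a model toward a semi-parametric fit, while a single interval reduces it to a constant-rate parametric model.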