Applications of Geometry in Optimization and Statistical Estimation

Maroufy, Vahed

Files

Maroufy_Vahed.pdf (2.75 MB)

Date

2016-01-25

Authors

Maroufy, Vahed

Advisor

Marriott, Paul

Publisher

University of Waterloo

Abstract

Geometric properties of statistical models and their influence on statistical inference and asymptotic theory reveal the profound relationship between geometry and statistics. This thesis studies applications of convex and differential geometry to statistical inference, optimization and modelling. In particular, we investigate how geometric understanding assists statisticians in dealing with non-standard inferential problems by developing novel theory and designing efficient computational algorithms. The thesis is organized in six chapters as follows. Chapter 1 provides an abstract overview of a wide range of geometric tools, including affine, convex and differential geometry. It also provides the reader with a short literature review on the applications of geometry in statistical inference and exposes the geometric structure of commonly used statistical models. The contributions of this thesis are organized in the following four chapters, each of which is the focus of a submitted paper that is either accepted or under revision. Chapter 2 introduces a new parametrization of the general family of mixture models of the exponential family. Despite the flexibility and popularity of mixture models, their associated parameter spaces are often difficult to represent due to fundamental identification problems. Other related problems include the difficulty of estimating the number of components, possible unboundedness and non-concavity of the log-likelihood function, non-finite Fisher information, and boundary problems giving rise to non-standard analysis. For instance, the order of a finite mixture is not well defined and often cannot be estimated from a finite sample when components are not well separated or when some are not observed in the sample.
We introduce a novel family of models, called the discrete mixture of local mixture models, which reparametrizes the space of general mixtures of the exponential family in such a way that the parameters are identifiable and interpretable and, due to a tractable geometric structure, the space allows fast computational algorithms. This family also gives a well-defined characterization of the number-of-components problem. The component densities are flexible enough for fitting mixture models with unidentifiable components, and our proposed algorithm only includes the components for which there is enough information in the sample. Chapter 3 uses geometric concepts to characterize the parameter space of local mixture models (LMMs), introduced in \cite{Marriott2002} as a local approximation to continuous mixture models. Although LMMs are shown to have desirable inferential properties, their parameter space is restricted by two types of boundaries, called the hard boundary and the soft boundary. The hard boundary guarantees that an LMM is a density function, while the soft boundary ensures that it behaves locally in a similar way to a mixture model. The boundaries are shown to have particular geometric structures that can be characterized by the geometry of polytopes, ruled surfaces and developable surfaces. As working examples, the LMM of a normal model and the LMM of a Poisson distribution are considered. The boundaries described in this chapter have both discrete aspects (i.e. the ability to be approximated by polytopes) and smooth aspects (i.e. regions where the boundaries are exactly or approximately smooth). Chapter 4 uses the model space introduced in Chapter 2 to extend a prior model and to define a perturbation space for Bayesian sensitivity analysis. This perturbation space is well defined, tractable, and consistent with the elicited prior knowledge, three properties that improve on the methodology in \cite{Gustafson1996}.
We study both local and global sensitivity in conjugate Bayesian models. In the local analysis, the worst direction of sensitivity is obtained by maximizing the directional derivative of a functional between the perturbation space and the space of posterior expectations. For finding the maximum global sensitivity, however, two criteria are used: the divergence between posterior predictive distributions and the difference between posterior expectations. Both local and global analyses lead to optimization problems with a smooth boundary restriction. Chapter 5 studies Cox's proportional hazards model with an unobserved frailty for which no specific distribution is assumed. The likelihood function, which has a mixture structure with an unknown mixing distribution, is approximated by the model introduced in Chapter 2, which is always identifiable and estimable. The nuisance parameters in the approximating model, which represent the frailty distribution through its moments, lie in a convex space with a smooth boundary, characterized as a smooth manifold. Using differential geometric tools, a new algorithm is proposed for maximizing the likelihood function restricted by the smooth yet non-trivial boundary. The regression coefficients, the parameters of interest, are estimated in a two-step optimization process, unlike the existing methodology in \cite{Klein1992}, which assumes a gamma frailty distribution and uses an Expectation-Maximization (EM) approach. Simulation studies and data examples are also included, illustrating that the new methodology is promising, as it returns small estimation bias; however, it produces a larger standard deviation than the EM method. The larger standard deviation may result from using no information about the shape of the frailty model, whereas the EM method assumes the gamma model in advance; however, there are still ways to improve this methodology.
Also, the simulation section and data analysis in this chapter are rather incomplete, and more work needs to be done. Chapter 6 outlines a few topics as future directions and possible extensions of the methodologies developed in this thesis.
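
The hard boundary described for Chapter 3 can be made concrete with a small numerical sketch. The abstract does not spell out the form of a local mixture model, so the code below assumes the commonly used order-4 form for a unit-variance normal, g(x; mu, lambda) = phi(x - mu) * (1 + lambda_2 He_2(z) + lambda_3 He_3(z) + lambda_4 He_4(z)) with z = x - mu and He_j the probabilists' Hermite polynomials. The function names, the unit variance, the order and the grid-based check are all illustrative assumptions, not the thesis's algorithm.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval


def lmm_normal_density(x, mu, lam):
    """Assumed order-4 local mixture of N(mu, 1).

    Returns phi(z) * (1 + lam[0]*He_2(z) + lam[1]*He_3(z) + lam[2]*He_4(z)),
    where z = x - mu and phi is the standard normal density.
    """
    z = np.asarray(x, dtype=float) - mu
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Coefficients on He_0..He_4; He_1 carries no weight in this assumed form.
    coeffs = [1.0, 0.0, lam[0], lam[1], lam[2]]
    return phi * hermeval(z, coeffs)


def inside_hard_boundary(lam, half_width=10.0, n_points=4001):
    """Approximate check that the polynomial factor is non-negative.

    This only inspects a finite grid on [-half_width, half_width], so it is
    a numerical approximation of the hard-boundary condition, not a proof.
    """
    z = np.linspace(-half_width, half_width, n_points)
    factor = hermeval(z, [1.0, 0.0, lam[0], lam[1], lam[2]])
    return bool(np.all(factor >= 0.0))


if __name__ == "__main__":
    for lam in [(0.0, 0.0, 0.0), (0.1, 0.0, 0.05), (0.0, 0.5, 0.0)]:
        print(lam, inside_hard_boundary(lam))
```

Because each He_j with j >= 1 integrates to zero against the normal density, this assumed form integrates to one for any lambda, so non-negativity is the only condition left to check. A grid check of this kind only approximates the boundary pointwise; it says nothing about the polytope and ruled-surface structure that the chapter establishes.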