Structured Mixture Models

Hou-Liu, Jason

Structured Mixture Models

dc.contributor.advisor	Browne, Ryan P.
dc.contributor.author	Hou-Liu, Jason
dc.date.accessioned	2023-08-11T13:27:46Z
dc.date.available	2023-08-11T13:27:46Z
dc.date.issued	2023-08-11
dc.date.submitted	2023-08-08
dc.description.abstract	Finite mixture models are a staple of model-based clustering approaches for distinguishing subgroups. A common mixture model is the finite Gaussian mixture model, whose degrees of freedom scales quadratically with increasing data dimension. Methods in the literature often tackle the degrees of freedom of the Gaussian mixture model by sharing parameters between the eigendecomposition of covariance matrices across all mixture components. We posit finite Gaussian mixture models with alternate forms of parameter sharing by imposing additional structure on the parameters, such as sharing parameters with other components as a convex combination of the corresponding parent components or by imposing a sequence of hierarchical clustering structure in orthogonal subspaces with common parameters across levels. Estimation procedures using the Expectation-Maximization (EM) algorithm are derived throughout, with application to simulated and real-world datasets. As well, the proposed model structures have an interpretable meaning that can shed light on clustering analyses performed by practitioners in the context of their data. The EM algorithm is a popular estimation method for tackling issues of latent data, such as in finite mixture models where component memberships are often latent. One aspect of the EM algorithm that hampers estimation is a slow rate of convergence, which affects the estimation of finite Gaussian mixture models. To explore avenues of improvement, we explore the extrapolation of the sequence of conditional expectations admitting general EM procedures, with minimal modifications for many common models. With the same mindset of accelerating iterative algorithms, we also examine the use of approximate sketching methods in estimating generalized linear models via iteratively re-weighted least squares, with emphasis on practical data infrastructure constraints. We propose a sketching method that controls for both data transfer and computation costs, the former of which is often overlooked in asymptotic complexity analyses, and are able to achieve an approximate result in much faster wall-clock time compared to the exact solution on real-world hardware, and can estimate standard errors in addition to point estimates.	en
dc.identifier.uri	http://hdl.handle.net/10012/19676
dc.language.iso	en	en
dc.pending	false
dc.publisher	University of Waterloo	en
dc.subject	mixture models	en
dc.subject	expectation-maximization	en
dc.subject	sketching	en
dc.subject	model-based clustering	en
dc.subject	generalized linear models	en
dc.title	Structured Mixture Models	en
dc.type	Doctoral Thesis	en
uws-etd.degree	Doctor of Philosophy	en
uws-etd.degree.department	Statistics and Actuarial Science	en
uws-etd.degree.discipline	Statistics	en
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.embargo.terms	0	en
uws.contributor.advisor	Browne, Ryan P.
uws.contributor.affiliation1	Faculty of Mathematics	en
uws.peerReviewStatus	Unreviewed	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en

Files

Original bundle

Now showing 1 - 1 of 1

Name:: Hou-Liu_Jason.pdf
Size:: 8.48 MB
Format:: Adobe Portable Document Format
Description:: Manuscript

Download

License bundle

Now showing 1 - 1 of 1

Name:: license.txt
Size:: 6.4 KB
Format:: Item-specific license agreed upon to submission
Description:

Download

Collections

Theses
Statistics and Actuarial Science