
dc.contributor.author Jaini, Priyank
dc.description.abstract Multivariate density estimation is a central problem in unsupervised machine learning that has been studied extensively in both statistics and machine learning. Several methods have been proposed for density estimation, including classical techniques such as histograms, kernel density estimation, and mixture models, and, more recently, neural density estimation, which leverages advances in deep learning and neural networks to tractably represent a density function. Today, when large amounts of data are generated in almost every field, it is of paramount importance to develop density estimation methods that are cheap both computationally and in memory cost. The main contribution of this thesis is a principled study of parametric density estimation methods using mixture models and triangular maps for neural density estimation. The first part of the thesis focuses on the compact representation of mixture models using deep architectures such as latent tree models, hidden Markov models, tensorial mixture models, hierarchical tensor formats and sum-product networks. It provides a unifying view of possible representations of mixture models using such deep architectures. The unifying view allows us to prove exponential separation between deep mixture models and mixture models represented using shallow architectures, demonstrating the benefits of depth in their representation. In a surprising result thereafter, we prove that a deep mixture model can be approximated using the conditional gradient algorithm by a shallow architecture of polynomial size w.r.t. the inverse of the approximation accuracy. Next, we address the more practical problem of density estimation of mixture models for streaming data by proposing an online Bayesian Moment Matching algorithm for Gaussian mixture models that can be distributed over several processors for fast computation.
Exact Bayesian learning of mixture models is intractable because the number of terms in the posterior grows exponentially w.r.t. the number of observations. We circumvent this problem by projecting the exact posterior onto a simple family of densities by matching a set of sufficient moments. Subsequently, we extend this algorithm to sequential data modeling using transfer learning by learning a hidden Markov model over the observations with Gaussian mixtures. We apply this algorithm to three diverse applications: activity recognition based on smartphone sensors, sleep stage classification for predicting neurological disorders using electroencephalography data, and network size prediction for telecommunication networks. In the second part, we focus on neural density estimation methods, where we provide a unified framework for estimating densities using monotone and bijective triangular maps represented using deep neural networks. Using this unified framework, we study the limitations and representation power of recent flow-based and autoregressive methods. Based on this framework, we subsequently propose a novel Sum-of-Squares polynomial flow that is interpretable, universal and easy to train. en
dc.publisher University of Waterloo en
dc.subject machine learning en
dc.subject unsupervised learning en
dc.subject deep learning en
dc.subject probabilistic graphical models en
dc.title Likelihood-based Density Estimation using Deep Architectures en
dc.type Doctoral Thesis en
dc.pending false
R. Cheriton School of Computer Science en Science en of Waterloo en
uws-etd.degree Doctor of Philosophy en
uws.contributor.advisor Poupart, Pascal
uws.contributor.advisor Yu, Yaoliang
uws.contributor.affiliation1 Faculty of Mathematics en




All items in UWSpace are protected by copyright, with all rights reserved.
