Online Bayesian Learning in Probabilistic Graphical Models using Moment Matching with Applications

Omar, Farheen

Files

Omar_Farheen.pdf (1.55 MB)

Date

2016-05-18

Authors

Omar, Farheen

Advisor

Poupart, Pascal

Publisher

University of Waterloo

Abstract

Probabilistic Graphical Models are often used to efficiently encode uncertainty in real-world problems as probability distributions. Bayesian learning allows us to compute a posterior distribution over the parameters of these distributions based on observed data. One of the main challenges in Bayesian learning is that the posterior distribution can become exponentially complex as new data becomes available. In addition, many algorithms require all the data to be present in memory before the parameters can be learned, and may require retraining when new data becomes available. This is problematic for big data and expensive for streaming applications where new data arrives constantly.

In this work I propose an online moment matching algorithm for Bayesian learning called Bayesian Moment Matching (BMM). This algorithm is based on Assumed Density Filtering (ADF) and allows us to update the posterior in a constant amount of time as new data arrives. In BMM, after new data is received, the exact posterior is projected onto a family of distributions indexed by a set of parameters. This projection is accomplished by matching the moments of the approximate posterior with those of the exact one.

The effectiveness of this technique has been demonstrated on two real-world problems.

- Topic Modelling: Latent Dirichlet Allocation (LDA) is a statistical topic model that examines a set of documents and, based on the statistics of the words in each document, discovers the distribution over topics for each document.
- Activity Recognition: Tung et al. have developed an instrumented rolling walker with sensors and cameras to autonomously monitor the user outside the laboratory setting. I have developed automated techniques to identify the activities performed by users with respect to the walker (e.g., walking, standing, turning) using a Bayesian network called a Hidden Markov Model. This problem is significant for applied health scientists who are studying the effectiveness of walkers in preventing falls.

My main contributions in this work are:

- I give a novel interpretation of moment matching by showing that there exists a set of initial distributions (different from the prior) for which exact Bayesian learning yields the same first and second order moments in the posterior as moment matching. Hence the Bayesian Moment Matching algorithm is exact with respect to an implicit posterior.
- Label switching is a problem that arises in unsupervised learning because labels can be assigned to the hidden variables in a Hidden Markov Model in all possible permutations without changing the model. I show that even though the exact posterior has n! components, each corresponding to a permutation of the hidden states, moment matching for a slightly different distribution allows us to compute the moments without enumerating all the permutations.
- In traditional ADF, the approximate posterior at every time step is constructed by minimizing the KL divergence between the approximate and exact posteriors. When the prior is from the exponential family, this boils down to matching the "natural" moments. This can lead to a time complexity at every time step that is on the order of the number of variables in the problem. This becomes problematic particularly in LDA, where the number of variables is on the order of the dictionary size, which can be very large. I have derived an algorithm for moment matching called Linear Moment Matching, which updates all the moments in O(n), where n is the number of hidden states.
- I have derived a Bayesian Moment Matching (BMM) algorithm for LDA and compared the performance of BMM against existing techniques for topic modelling using multiple real-world data sets.
- I have developed a model for activity recognition using Hidden Markov Models (HMMs). I also analyse existing parameter learning techniques for HMMs in terms of accuracy. The accuracy of the generative HMM is also compared to that of a discriminative CRF model.
- I have also derived a Bayesian Moment Matching algorithm for activity recognition. The effectiveness of this algorithm in learning model parameters is analysed using two experiments conducted with real patients and a control group of walker users.
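
The projection step described in the abstract can be illustrated with a toy model. The following is a minimal sketch under assumptions of my own choosing, not the derivation from the thesis: a two-component mixture with known components, an unknown mixing weight w, and a Beta(a, b) prior on w. After one observation the exact posterior is a mixture of two Beta distributions; the sketch collapses it back to a single Beta by matching the first and second moments. The names `bmm_update`, `beta_moments`, and `gaussian_pdf`, and the choice of Gaussian components, are hypothetical.

```python
import math


def beta_moments(a, b):
    """Return the first and second raw moments of Beta(a, b)."""
    s = a + b
    return a / s, a * (a + 1) / (s * (s + 1))


def bmm_update(a, b, lik1, lik2):
    """One moment matching update for the toy model (illustrative only).

    Prior: w ~ Beta(a, b). Likelihood of the observation x:
    w * lik1 + (1 - w) * lik2, where lik1 and lik2 are the densities of x
    under the two known components.
    The exact posterior is c1 * Beta(a + 1, b) + c2 * Beta(a, b + 1).
    """
    s = a + b
    # Mixture weights of the exact posterior, then normalised.
    c1 = lik1 * a / s
    c2 = lik2 * b / s
    z = c1 + c2
    c1, c2 = c1 / z, c2 / z

    # Moments of the exact two-component posterior.
    m1_a, m2_a = beta_moments(a + 1, b)
    m1_b, m2_b = beta_moments(a, b + 1)
    m1 = c1 * m1_a + c2 * m1_b
    m2 = c1 * m2_a + c2 * m2_b

    # Solve for the single Beta(a_new, b_new) with the same two moments.
    s_new = (m1 - m2) / (m2 - m1 * m1)
    return m1 * s_new, (1.0 - m1) * s_new


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, used as an assumed component."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Process a small stream one observation at a time; the state stays (a, b).
a, b = 1.0, 1.0
for x in [0.1, -0.3, 0.2, 5.1, 0.0]:
    a, b = bmm_update(a, b, gaussian_pdf(x, 0.0, 1.0), gaussian_pdf(x, 5.0, 1.0))
```

In this sketch the state carried between observations is only the pair (a, b), so each update costs the same regardless of how much data has already been seen, whereas the exact posterior would gain mixture components with every observation.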